Compare commits: release_up...master (122 commits)
| SHA1 |
|---|
| f1aff553c4 |
| d195a673e2 |
| 8dce0fa6fe |
| 4e8a240a71 |
| 5b019e28de |
| bf4cb1e86c |
| eaa7c56590 |
| 8fc0fa040d |
| 56e13b0b9a |
| 785fe3c962 |
| 3c341f6e9a |
| aae88e87ea |
| 2352e458c7 |
| 8373b93887 |
| f0442166cd |
| 33ca8c7933 |
| 3fd1ce8993 |
| aa93588190 |
| 12520400e5 |
| 35614e83fa |
| ff22ac01cc |
| e7dd826f34 |
| fcc882174b |
| 6872d8a3bb |
| a2cb4c3589 |
| 15008962b2 |
| 9414b51fac |
| 80ac414582 |
| cbc151660b |
| 0024abc6e3 |
| fa13385860 |
| 0c5f6daf52 |
| c11e9fc1da |
| 280150713e |
| bb11c80b1b |
| d0961b98bf |
| 302589b7f9 |
| cc85949d6d |
| 3a1824e3ad |
| 579643326d |
| 14f76f227e |
| 25baf5203a |
| 1178fcb0ba |
| e4d84c8e45 |
| 7a3ab1e44c |
| 598a293dfa |
| 40b3068462 |
| 0ecbbbce75 |
| 9b1e130d18 |
| 0e17b33d2a |
| 34d80abd26 |
| 249278ab77 |
| 25fdb17f80 |
| 3a02a27f1e |
| 4eed9d529f |
| f344d410a2 |
| 9dc1228063 |
| 4404e62f58 |
| 38d5743bbb |
| 0814eee151 |
| f45b815221 |
| bd629ae454 |
| 41de75a584 |
| 96a426dc36 |
| 824dd40f7e |
| fa2e649fe8 |
| e25e8e3a41 |
| aa3670a902 |
| ef1f9205ac |
| 3228bbfc63 |
| f18a0dfc4d |
| badb620261 |
| acf46100ae |
| cf2e3804d5 |
| b7be42357f |
| 3ac82c07ae |
| 9743c0a1fa |
| ba4dac530e |
| 7f7f0040fd |
| 9ca567cd9c |
| ae7b234ba0 |
| 9788d1965f |
| 387e43a423 |
| 25f407fc81 |
| dcb2c4638f |
| 7fb5dd3ef9 |
| 6a38f4bec3 |
| aed078aeab |
| f999f41ed3 |
| 07e43ee7e4 |
| aac706c3f0 |
| 4ccb278051 |
| 64a733480b |
| dd0976f678 |
| 15a3ca649d |
| 3c4770cfe5 |
| 8d7de05908 |
| 863faae57f |
| 8d3f5adcdb |
| cd3394e129 |
| ee5d0239a3 |
| 388111cedc |
| b86191ed7f |
| 22753486de |
| cf1d1dbf01 |
| 2e45d9800d |
| a9a8de02ec |
| e0c9376aab |
| dd8339e650 |
| 1594ee64a1 |
| 83ed8222d2 |
| b0aa91acce |
| 5928ba83bb |
| ffa3a43979 |
| 7ce79a43f1 |
| edcc50ab0c |
| 4a391522d0 |
| 1903f78285 |
| a4dfcc4693 |
| faffb3fef7 |
| 6c6227c403 |
| e3be364e7a |
@@ -1,6 +1,6 @@
 # Azure Machine Learning Python SDK notebooks
-> a community-driven repository of examples using mlflow for tracking can be found at https://github.com/Azure/azureml-examples
+### **With the introduction of AzureML SDK v2, this samples repository for the v1 SDK is now deprecated and will not be monitored or updated. Users are encouraged to visit the [v2 SDK samples repository](https://github.com/Azure/azureml-examples) instead for up-to-date and enhanced examples of how to build, train, and deploy machine learning models with AzureML's newest features.**
 Welcome to the Azure Machine Learning Python SDK notebooks repository!
SECURITY.md (new file, 41 lines)
@@ -0,0 +1,41 @@
+<!-- BEGIN MICROSOFT SECURITY.MD V0.0.7 BLOCK -->
+
+## Security
+
+Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).
+
+If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://aka.ms/opensource/security/definition), please report it to us as described below.
+
+## Reporting Security Issues
+
+**Please do not report security vulnerabilities through public GitHub issues.**
+
+Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://aka.ms/opensource/security/create-report).
+
+If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://aka.ms/opensource/security/pgpkey).
+
+You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://aka.ms/opensource/security/msrc).
+
+Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:
+
+* Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
+* Full paths of source file(s) related to the manifestation of the issue
+* The location of the affected source code (tag/branch/commit or direct URL)
+* Any special configuration required to reproduce the issue
+* Step-by-step instructions to reproduce the issue
+* Proof-of-concept or exploit code (if possible)
+* Impact of the issue, including how an attacker might exploit the issue
+
+This information will help us triage your report more quickly.
+
+If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://aka.ms/opensource/security/bounty) page for more details about our active programs.
+
+## Preferred Languages
+
+We prefer all communications to be in English.
+
+## Policy
+
+Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://aka.ms/opensource/security/cvd).
+
+<!-- END MICROSOFT SECURITY.MD BLOCK -->
@@ -103,7 +103,7 @@
 "source": [
 "import azureml.core\n",
 "\n",
-"print(\"This notebook was created using version 1.40.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.59.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },
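The hunk above only bumps the version string that the notebook prints; it does not enforce anything. If a fail-fast check is wanted instead, a minimal guard along these lines would work (a sketch: the 1.59.0 threshold is taken from the new line in the hunk, and `packaging` is an assumed extra dependency):

```python
# Sketch of a fail-fast SDK version guard; 1.59.0 comes from the hunk above.
# Assumes azureml-core and packaging are installed.
import azureml.core
from packaging.version import Version

if Version(azureml.core.VERSION) < Version("1.59.0"):
    raise RuntimeError(
        "Azure ML SDK {} is older than 1.59.0; "
        "upgrade with: pip install --upgrade azureml-sdk".format(azureml.core.VERSION)
    )
```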
@@ -329,7 +329,7 @@
 " print(\"Creating new gpu-cluster\")\n",
 " \n",
 " # Specify the configuration for the new cluster\n",
-" compute_config = AmlCompute.provisioning_configuration(vm_size=\"STANDARD_NC6\",\n",
+" compute_config = AmlCompute.provisioning_configuration(vm_size=\"Standard_NC6s_v3\",\n",
 " min_nodes=0,\n",
 " max_nodes=4)\n",
 " # Create the cluster with the specified name and configuration\n",
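For context, the cell this hunk edits follows the standard AzureML v1 provisioning pattern. A self-contained sketch of that pattern is below (the cluster name is illustrative; the vm_size matches the new value in the hunk):

```python
# Sketch of the AzureML v1 cluster-provisioning pattern the hunk edits.
# "gpu-cluster" is an illustrative name; vm_size matches the new value above.
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException

ws = Workspace.from_config()
cluster_name = "gpu-cluster"

try:
    # Reuse the cluster if it already exists in the workspace
    gpu_cluster = ComputeTarget(workspace=ws, name=cluster_name)
    print("Found existing gpu-cluster")
except ComputeTargetException:
    print("Creating new gpu-cluster")
    # Specify the configuration for the new cluster
    compute_config = AmlCompute.provisioning_configuration(vm_size="Standard_NC6s_v3",
                                                           min_nodes=0,
                                                           max_nodes=4)
    # Create the cluster and block until provisioning finishes
    gpu_cluster = ComputeTarget.create(ws, cluster_name, compute_config)
    gpu_cluster.wait_for_completion(show_output=True)
```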
@@ -367,9 +367,9 @@
 }
 ],
 "kernelspec": {
-"display_name": "Python 3.6",
+"display_name": "Python 3.8 - AzureML",
 "language": "python",
-"name": "python36"
+"name": "python38-azureml"
 },
 "language_info": {
 "codemirror_mode": {
@@ -174,7 +174,7 @@
 "else:\n",
 " print(\"creating new cluster\")\n",
 " # vm_size parameter below could be modified to one of the RAPIDS-supported VM types\n",
-" provisioning_config = AmlCompute.provisioning_configuration(vm_size = \"Standard_NC6s_v2\", min_nodes=1, max_nodes = 1)\n",
+" provisioning_config = AmlCompute.provisioning_configuration(vm_size = \"Standard_NC6s_v3\", min_nodes=1, max_nodes = 1)\n",
 "\n",
 " # create the cluster\n",
 " gpu_cluster = ComputeTarget.create(ws, gpu_cluster_name, provisioning_config)\n",
@@ -398,7 +398,7 @@
 "# run_config.target = gpu_cluster_name\n",
 "# run_config.environment.docker.enabled = True\n",
 "# run_config.environment.docker.gpu_support = True\n",
-"# run_config.environment.docker.base_image = \"rapidsai/rapidsai:cuda9.2-runtime-ubuntu18.04\"\n",
+"# run_config.environment.docker.base_image = \"rapidsai/rapidsai:cuda9.2-runtime-ubuntu20.04\"\n",
 "# # run_config.environment.docker.base_image_registry.address = '<registry_url>' # not required if the base_image is in Docker hub\n",
 "# # run_config.environment.docker.base_image_registry.username = '<user_name>' # needed only for private images\n",
 "# # run_config.environment.docker.base_image_registry.password = '<password>' # needed only for private images\n",
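The commented-out lines above use the v1 `RunConfiguration.environment.docker` settings. The same custom base image can also be set on an `Environment` object, which is the more common v1 idiom; a minimal sketch, with an illustrative environment name:

```python
# Sketch: setting a custom Docker base image on an AzureML v1 Environment,
# equivalent in intent to the run_config.environment.docker lines above.
# "rapids" is an illustrative environment name.
from azureml.core import Environment

env = Environment(name="rapids")
env.docker.base_image = "rapidsai/rapidsai:cuda9.2-runtime-ubuntu20.04"
env.python.user_managed_dependencies = True  # use the Python already in the image
```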
@@ -525,9 +525,9 @@
 }
 ],
 "kernelspec": {
-"display_name": "Python 3.6",
+"display_name": "Python 3.8 - AzureML",
 "language": "python",
-"name": "python36"
+"name": "python38-azureml"
 },
 "language_info": {
 "codemirror_mode": {
@@ -1,621 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved. \n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Unfairness Mitigation with Fairlearn and Azure Machine Learning\n",
"**This notebook shows how to upload results from Fairlearn's GridSearch mitigation algorithm into a dashboard in Azure Machine Learning Studio**\n",
"\n",
"## Table of Contents\n",
"\n",
"1. [Introduction](#Introduction)\n",
"1. [Loading the Data](#LoadingData)\n",
"1. [Training an Unmitigated Model](#UnmitigatedModel)\n",
"1. [Mitigation with GridSearch](#Mitigation)\n",
"1. [Uploading a Fairness Dashboard to Azure](#AzureUpload)\n",
"    1. Registering models\n",
"    1. Computing Fairness Metrics\n",
"    1. Uploading to Azure\n",
"1. [Conclusion](#Conclusion)\n",
"\n",
"<a id=\"Introduction\"></a>\n",
"## Introduction\n",
"This notebook shows how to use [Fairlearn (an open source fairness assessment and unfairness mitigation package)](http://fairlearn.org) and Azure Machine Learning Studio for a binary classification problem. This example uses the well-known adult census dataset. For the purposes of this notebook, we shall treat this as a loan decision problem. We will pretend that the label indicates whether or not each individual repaid a loan in the past. We will use the data to train a predictor to predict whether previously unseen individuals will repay a loan or not. The assumption is that the model predictions are used to decide whether an individual should be offered a loan. Its purpose is purely illustrative of a workflow including a fairness dashboard - in particular, we do **not** include a full discussion of the detailed issues which arise when considering fairness in machine learning. For such discussions, please [refer to the Fairlearn website](http://fairlearn.org/).\n",
"\n",
"We will apply the [grid search algorithm](https://fairlearn.org/v0.4.6/api_reference/fairlearn.reductions.html#fairlearn.reductions.GridSearch) from the Fairlearn package using a specific notion of fairness called Demographic Parity. This produces a set of models, and we will view these in a dashboard both locally and in the Azure Machine Learning Studio.\n",
"\n",
"### Setup\n",
"\n",
"To use this notebook, an Azure Machine Learning workspace is required.\n",
"Please see the [configuration notebook](../../configuration.ipynb) for information about creating one, if required.\n",
"This notebook also requires the following packages:\n",
"* `azureml-contrib-fairness`\n",
"* `fairlearn>=0.6.2` (pre-v0.5.0 will work with minor modifications)\n",
"* `joblib`\n",
"* `liac-arff`\n",
"* `raiwidgets`\n",
"\n",
"Fairlearn relies on features introduced in v0.22.1 of `scikit-learn`. If you have an older version already installed, please uncomment and run the following cell:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# !pip install --upgrade scikit-learn>=0.22.1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, please ensure that when you downloaded this notebook, you also downloaded the `fairness_nb_utils.py` file from the same location, and placed it in the same directory as this notebook."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"LoadingData\"></a>\n",
"## Loading the Data\n",
"We use the well-known `adult` census dataset, which we will fetch from the OpenML website. We start with a fairly unremarkable set of imports:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from fairlearn.reductions import GridSearch, DemographicParity, ErrorRate\n",
"from raiwidgets import FairnessDashboard\n",
"\n",
"from sklearn.compose import ColumnTransformer\n",
"from sklearn.impute import SimpleImputer\n",
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.preprocessing import StandardScaler, OneHotEncoder\n",
"from sklearn.compose import make_column_selector as selector\n",
"from sklearn.pipeline import Pipeline\n",
"\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now load and inspect the data:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from fairness_nb_utils import fetch_census_dataset\n",
"\n",
"data = fetch_census_dataset()\n",
"\n",
"# Extract the items we want\n",
"X_raw = data.data\n",
"y = (data.target == '>50K') * 1\n",
"\n",
"X_raw[\"race\"].value_counts().to_dict()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are going to treat the sex and race of each individual as protected attributes, and in this particular case we are going to remove these attributes from the main data (this is not always the best option - see the [Fairlearn website](http://fairlearn.github.io/) for further discussion). Protected attributes are often denoted by 'A' in the literature, and we follow that convention here:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"A = X_raw[['sex','race']]\n",
"X_raw = X_raw.drop(labels=['sex', 'race'], axis = 1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now preprocess our data. To avoid the problem of data leakage, we split our data into training and test sets before performing any other transformations. Subsequent transformations (such as scalings) will be fit to the training data set, and then applied to the test dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"(X_train, X_test, y_train, y_test, A_train, A_test) = train_test_split(\n",
"    X_raw, y, A, test_size=0.3, random_state=12345, stratify=y\n",
")\n",
"\n",
"# Ensure indices are aligned between X, y and A,\n",
"# after all the slicing and splitting of DataFrames\n",
"# and Series\n",
"\n",
"X_train = X_train.reset_index(drop=True)\n",
"X_test = X_test.reset_index(drop=True)\n",
"y_train = y_train.reset_index(drop=True)\n",
"y_test = y_test.reset_index(drop=True)\n",
"A_train = A_train.reset_index(drop=True)\n",
"A_test = A_test.reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have two types of column in the dataset - categorical columns which will need to be one-hot encoded, and numeric ones which will need to be rescaled. We also need to take care of missing values. We use a simple approach here, but please bear in mind that this is another way that bias could be introduced (especially if one subgroup tends to have more missing values).\n",
"\n",
"For this preprocessing, we make use of `Pipeline` objects from `sklearn`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"numeric_transformer = Pipeline(\n",
"    steps=[\n",
"        (\"impute\", SimpleImputer()),\n",
"        (\"scaler\", StandardScaler()),\n",
"    ]\n",
")\n",
"\n",
"categorical_transformer = Pipeline(\n",
"    [\n",
"        (\"impute\", SimpleImputer(strategy=\"most_frequent\")),\n",
"        (\"ohe\", OneHotEncoder(handle_unknown=\"ignore\", sparse=False)),\n",
"    ]\n",
")\n",
"\n",
"preprocessor = ColumnTransformer(\n",
"    transformers=[\n",
"        (\"num\", numeric_transformer, selector(dtype_exclude=\"category\")),\n",
"        (\"cat\", categorical_transformer, selector(dtype_include=\"category\")),\n",
"    ]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that the preprocessing pipeline is defined, we can run it on our training data, and apply the generated transform to our test data:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"X_train = preprocessor.fit_transform(X_train)\n",
"X_test = preprocessor.transform(X_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"UnmitigatedModel\"></a>\n",
"## Training an Unmitigated Model\n",
"\n",
"So that we have a point of comparison, we first train a model (specifically, logistic regression from scikit-learn) on the raw data, without applying any mitigation algorithm:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"unmitigated_predictor = LogisticRegression(solver='liblinear', fit_intercept=True)\n",
"\n",
"unmitigated_predictor.fit(X_train, y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can view this model in the fairness dashboard, and see the disparities which appear:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"FairnessDashboard(sensitive_features=A_test,\n",
"                  y_true=y_test,\n",
"                  y_pred={\"unmitigated\": unmitigated_predictor.predict(X_test)})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Looking at the disparity in accuracy when we select 'Sex' as the sensitive feature, we see that males have an error rate about three times greater than females. More interesting is the disparity in opportunity - males are offered loans at three times the rate of females.\n",
"\n",
"Despite the fact that we removed the feature from the training data, our predictor still discriminates based on sex. This demonstrates that simply ignoring a protected attribute when fitting a predictor rarely eliminates unfairness. There will generally be enough other features correlated with the removed attribute to lead to disparate impact."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"Mitigation\"></a>\n",
"## Mitigation with GridSearch\n",
"\n",
"The `GridSearch` class in `Fairlearn` implements a simplified version of the exponentiated gradient reduction of [Agarwal et al. 2018](https://arxiv.org/abs/1803.02453). The user supplies a standard ML estimator, which is treated as a blackbox - for this simple example, we shall use the logistic regression estimator from scikit-learn. `GridSearch` works by generating a sequence of relabellings and reweightings, and trains a predictor for each.\n",
"\n",
"For this example, we specify demographic parity (on the protected attribute of sex) as the fairness metric. Demographic parity requires that individuals are offered the opportunity (a loan in this example) independent of membership in the protected class (i.e., females and males should be offered loans at the same rate). *We are using this metric for the sake of simplicity* in this example; the appropriate fairness metric can only be selected after *careful examination of the broader context* in which the model is to be used."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sweep = GridSearch(LogisticRegression(solver='liblinear', fit_intercept=True),\n",
"                   constraints=DemographicParity(),\n",
"                   grid_size=71)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With our estimator created, we can fit it to the data. After `fit()` completes, we extract the full set of predictors from the `GridSearch` object.\n",
"\n",
"The following cell trains many copies of the underlying estimator, and may take a minute or two to run:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sweep.fit(X_train, y_train,\n",
"          sensitive_features=A_train.sex)\n",
"\n",
"# For Fairlearn pre-v0.5.0, need sweep._predictors\n",
"predictors = sweep.predictors_"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We could load these predictors into the Fairness dashboard now. However, the plot would be somewhat confusing due to their number. In this case, we are going to remove the predictors which are dominated in the error-disparity space by others from the sweep (note that the disparity will only be calculated for the protected attribute; other potentially protected attributes will *not* be mitigated). In general, one might not want to do this, since there may be other considerations beyond the strict optimisation of error and disparity (of the given protected attribute)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"errors, disparities = [], []\n",
"for predictor in predictors:\n",
"    error = ErrorRate()\n",
"    error.load_data(X_train, pd.Series(y_train), sensitive_features=A_train.sex)\n",
"    disparity = DemographicParity()\n",
"    disparity.load_data(X_train, pd.Series(y_train), sensitive_features=A_train.sex)\n",
"    \n",
"    errors.append(error.gamma(predictor.predict)[0])\n",
"    disparities.append(disparity.gamma(predictor.predict).max())\n",
"    \n",
"all_results = pd.DataFrame( {\"predictor\": predictors, \"error\": errors, \"disparity\": disparities})\n",
"\n",
"dominant_models_dict = dict()\n",
"base_name_format = \"census_gs_model_{0}\"\n",
"row_id = 0\n",
"for row in all_results.itertuples():\n",
"    model_name = base_name_format.format(row_id)\n",
"    errors_for_lower_or_eq_disparity = all_results[\"error\"][all_results[\"disparity\"]<=row.disparity]\n",
"    if row.error <= errors_for_lower_or_eq_disparity.min():\n",
"        dominant_models_dict[model_name] = row.predictor\n",
"    row_id = row_id + 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can construct predictions for the dominant models (we include the unmitigated predictor as well, for comparison):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"predictions_dominant = {\"census_unmitigated\": unmitigated_predictor.predict(X_test)}\n",
"models_dominant = {\"census_unmitigated\": unmitigated_predictor}\n",
"for name, predictor in dominant_models_dict.items():\n",
"    value = predictor.predict(X_test)\n",
"    predictions_dominant[name] = value\n",
"    models_dominant[name] = predictor"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These predictions may then be viewed in the fairness dashboard. We include the race column from the dataset, as an alternative basis for assessing the models. However, since we have not based our mitigation on it, the variation in the models with respect to race can be large."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"FairnessDashboard(sensitive_features=A_test,\n",
"                  y_true=y_test.tolist(),\n",
"                  y_pred=predictions_dominant)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When using sex as the sensitive feature and accuracy as the metric, we see a Pareto front forming - the set of predictors which represent optimal tradeoffs between accuracy and disparity in predictions. In the ideal case, we would have a predictor at (1,0) - perfectly accurate and without any unfairness under demographic parity (with respect to the protected attribute \"sex\"). The Pareto front represents the closest we can come to this ideal based on our data and choice of estimator. Note the range of the axes - the disparity axis covers more values than the accuracy, so we can reduce disparity substantially for a small loss in accuracy. Finally, we also see that the unmitigated model is towards the top right of the plot, with high accuracy, but worst disparity.\n",
"\n",
"By clicking on individual models on the plot, we can inspect their metrics for disparity and accuracy in greater detail. In a real example, we would then pick the model which represented the best trade-off between accuracy and disparity given the relevant business constraints."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"AzureUpload\"></a>\n",
"## Uploading a Fairness Dashboard to Azure\n",
"\n",
"Uploading a fairness dashboard to Azure is a two stage process. The `FairnessDashboard` invoked in the previous section relies on the underlying Python kernel to compute metrics on demand. This is obviously not available when the fairness dashboard is rendered in AzureML Studio. By default, the dashboard in Azure Machine Learning Studio also requires the models to be registered. The required stages are therefore:\n",
"1. Register the dominant models\n",
"1. Precompute all the required metrics\n",
"1. Upload to Azure\n",
"\n",
"Before that, we need to connect to Azure Machine Learning Studio:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Workspace, Experiment, Model\n",
"\n",
"ws = Workspace.from_config()\n",
"ws.get_details()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"RegisterModels\"></a>\n",
"### Registering Models\n",
"\n",
"The fairness dashboard is designed to integrate with registered models, so we need to do this for the models we want in the Studio portal. The assumption is that the names of the models specified in the dashboard dictionary correspond to the `id`s (i.e. `<name>:<version>` pairs) of registered models in the workspace. We register each of the models in the `models_dominant` dictionary into the workspace. For this, we have to save each model to a file, and then register that file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import joblib\n",
"import os\n",
"\n",
"os.makedirs('models', exist_ok=True)\n",
"def register_model(name, model):\n",
"    print(\"Registering \", name)\n",
"    model_path = \"models/{0}.pkl\".format(name)\n",
"    joblib.dump(value=model, filename=model_path)\n",
"    registered_model = Model.register(model_path=model_path,\n",
"                                      model_name=name,\n",
"                                      workspace=ws)\n",
"    print(\"Registered \", registered_model.id)\n",
"    return registered_model.id\n",
"\n",
"model_name_id_mapping = dict()\n",
"for name, model in models_dominant.items():\n",
"    m_id = register_model(name, model)\n",
"    model_name_id_mapping[name] = m_id"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, produce new predictions dictionaries, with the updated names:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"predictions_dominant_ids = dict()\n",
"for name, y_pred in predictions_dominant.items():\n",
"    predictions_dominant_ids[model_name_id_mapping[name]] = y_pred"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"PrecomputeMetrics\"></a>\n",
"### Precomputing Metrics\n",
"\n",
"We create a _dashboard dictionary_ using Fairlearn's `metrics` package. The `_create_group_metric_set` method has arguments similar to the Dashboard constructor, except that the sensitive features are passed as a dictionary (to ensure that names are available), and we must specify the type of prediction. Note that we use the `predictions_dominant_ids` dictionary we just created:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sf = { 'sex': A_test.sex, 'race': A_test.race }\n",
"\n",
"from fairlearn.metrics._group_metric_set import _create_group_metric_set\n",
"\n",
"\n",
"dash_dict = _create_group_metric_set(y_true=y_test,\n",
"                                     predictions=predictions_dominant_ids,\n",
"                                     sensitive_features=sf,\n",
"                                     prediction_type='binary_classification')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"DashboardUpload\"></a>\n",
"### Uploading the Dashboard\n",
"\n",
"Now, we import our `contrib` package which contains the routine to perform the upload:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.contrib.fairness import upload_dashboard_dictionary, download_dashboard_by_upload_id"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can create an Experiment, then a Run, and upload our dashboard to it:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"exp = Experiment(ws, \"Test_Fairlearn_GridSearch_Census_Demo\")\n",
"print(exp)\n",
"\n",
"run = exp.start_logging()\n",
"try:\n",
"    dashboard_title = \"Dominant Models from GridSearch\"\n",
"    upload_id = upload_dashboard_dictionary(run,\n",
"                                            dash_dict,\n",
"                                            dashboard_name=dashboard_title)\n",
"    print(\"\\nUploaded to id: {0}\\n\".format(upload_id))\n",
"\n",
"    downloaded_dict = download_dashboard_by_upload_id(run, upload_id)\n",
"finally:\n",
"    run.complete()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The dashboard can be viewed in the Run Details page.\n",
"\n",
"Finally, we can verify that the dashboard dictionary which we downloaded matches our upload:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(dash_dict == downloaded_dict)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"Conclusion\"></a>\n",
"## Conclusion\n",
"\n",
"In this notebook we have demonstrated how to use the `GridSearch` algorithm from Fairlearn to generate a collection of models, and then present them in the fairness dashboard in Azure Machine Learning Studio. Please remember that this notebook has not attempted to discuss the many considerations which should be part of any approach to unfairness mitigation. The [Fairlearn website](http://fairlearn.org/) provides that discussion."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"authors": [
{
"name": "riedgar"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.10"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
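The deleted notebook above describes demographic parity only in words ("females and males should be offered loans at the same rate"). For reference, the criterion it applies can be stated formally: a binary predictor $h$ satisfies demographic parity with respect to a sensitive attribute $A$ when

$$P(h(X) = 1 \mid A = a) = P(h(X) = 1) \quad \text{for every value } a \text{ of } A.$$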
@@ -1,11 +0,0 @@
name: fairlearn-azureml-mitigation
dependencies:
  - pip:
    - azureml-sdk
    - azureml-contrib-fairness
    - fairlearn>=0.6.2
    - joblib
    - liac-arff
    - raiwidgets~=0.17.0
    - itsdangerous==2.0.1
    - markupsafe<2.1.0
@@ -1,111 +0,0 @@
# ---------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# ---------------------------------------------------------

"""Utilities for azureml-contrib-fairness notebooks."""

import arff
from collections import OrderedDict
from contextlib import closing
import gzip
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.utils import Bunch
import time


def fetch_openml_with_retries(data_id, max_retries=4, retry_delay=60):
    """Fetch a given dataset from OpenML with retries as specified."""
    for i in range(max_retries):
        try:
            print("Download attempt {0} of {1}".format(i + 1, max_retries))
            data = fetch_openml(data_id=data_id, as_frame=True)
            break
        except Exception as e:  # noqa: B902
            print("Download attempt failed with exception:")
            print(e)
            if i + 1 != max_retries:
                print("Will retry after {0} seconds".format(retry_delay))
                time.sleep(retry_delay)
                retry_delay = retry_delay * 2
    else:
        raise RuntimeError("Unable to download dataset from OpenML")

    return data


_categorical_columns = [
    'workclass',
    'education',
    'marital-status',
    'occupation',
    'relationship',
    'race',
    'sex',
    'native-country'
]


def fetch_census_dataset():
    """Fetch the Adult Census Dataset.

    This uses a particular URL for the Adult Census dataset. The code
    is a simplified version of fetch_openml() in sklearn.

    The data are copied from:
    https://openml.org/data/v1/download/1595261.gz
    (as of 2021-03-31)
    """
    try:
        from urllib import urlretrieve
    except ImportError:
        from urllib.request import urlretrieve

    filename = "1595261.gz"
    data_url = "https://rainotebookscdn.blob.core.windows.net/datasets/"

    remaining_attempts = 5
    sleep_duration = 10
    while remaining_attempts > 0:
        try:
            urlretrieve(data_url + filename, filename)

            http_stream = gzip.GzipFile(filename=filename, mode='rb')

            with closing(http_stream):
                def _stream_generator(response):
                    for line in response:
                        yield line.decode('utf-8')

                stream = _stream_generator(http_stream)
                data = arff.load(stream)
        except Exception as exc:  # noqa: B902
            remaining_attempts -= 1
            print("Error downloading dataset from {} ({} attempt(s) remaining)"
                  .format(data_url, remaining_attempts))
            print(exc)
            time.sleep(sleep_duration)
            sleep_duration *= 2
            continue
        else:
            # dataset successfully downloaded
            break
    else:
        raise Exception("Could not retrieve dataset from {}.".format(data_url))

    attributes = OrderedDict(data['attributes'])
    arff_columns = list(attributes)

    raw_df = pd.DataFrame(data=data['data'], columns=arff_columns)

    target_column_name = 'class'
    target = raw_df.pop(target_column_name)
    for col_name in _categorical_columns:
        dtype = pd.api.types.CategoricalDtype(attributes[col_name])
        raw_df[col_name] = raw_df[col_name].astype(dtype, copy=False)

    result = Bunch()
    result.data = raw_df
    result.target = target

    return result
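Both deleted notebooks consume this helper with the same pattern, reproduced here for reference:

```python
# Usage pattern taken from the two deleted notebooks.
from fairness_nb_utils import fetch_census_dataset

data = fetch_census_dataset()
X_raw = data.data                 # feature DataFrame
y = (data.target == '>50K') * 1   # binarized income label
```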
@@ -1,545 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved. \n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Upload a Fairness Dashboard to Azure Machine Learning Studio\n",
"**This notebook shows how to generate and upload a fairness assessment dashboard from Fairlearn to AzureML Studio**\n",
"\n",
"## Table of Contents\n",
"\n",
"1. [Introduction](#Introduction)\n",
"1. [Loading the Data](#LoadingData)\n",
"1. [Processing the Data](#ProcessingData)\n",
"1. [Training Models](#TrainingModels)\n",
"1. [Logging in to AzureML](#LoginAzureML)\n",
"1. [Registering the Models](#RegisterModels)\n",
"1. [Using the Fairness Dashboard](#LocalDashboard)\n",
"1. [Uploading a Fairness Dashboard to Azure](#AzureUpload)\n",
"    1. Computing Fairness Metrics\n",
"    1. Uploading to Azure\n",
"1. [Conclusion](#Conclusion)\n",
"\n",
"<a id=\"Introduction\"></a>\n",
"## Introduction\n",
"\n",
"In this notebook, we walk through a simple example of using the `azureml-contrib-fairness` package to upload a collection of fairness statistics for a fairness dashboard. It is an example of integrating the [open source Fairlearn package](https://www.github.com/fairlearn/fairlearn) with Azure Machine Learning. This is not an example of fairness analysis or mitigation - this notebook simply shows how to get a fairness dashboard into the Azure Machine Learning portal. We will load the data and train a couple of simple models. We will then use Fairlearn to generate data for a Fairness dashboard, which we can upload to Azure Machine Learning portal and view there.\n",
"\n",
"### Setup\n",
"\n",
"To use this notebook, an Azure Machine Learning workspace is required.\n",
"Please see the [configuration notebook](../../configuration.ipynb) for information about creating one, if required.\n",
"This notebook also requires the following packages:\n",
"* `azureml-contrib-fairness`\n",
"* `fairlearn>=0.6.2` (also works for pre-v0.5.0 with slight modifications)\n",
"* `joblib`\n",
"* `liac-arff`\n",
"* `raiwidgets`\n",
"\n",
"Fairlearn relies on features introduced in v0.22.1 of `scikit-learn`. If you have an older version already installed, please uncomment and run the following cell:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# !pip install --upgrade scikit-learn>=0.22.1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, please ensure that when you downloaded this notebook, you also downloaded the `fairness_nb_utils.py` file from the same location, and placed it in the same directory as this notebook."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"LoadingData\"></a>\n",
"## Loading the Data\n",
"We use the well-known `adult` census dataset, which we fetch from the OpenML website. We start with a fairly unremarkable set of imports:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn import svm\n",
"from sklearn.compose import ColumnTransformer\n",
"from sklearn.impute import SimpleImputer\n",
"from sklearn.linear_model import LogisticRegression\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.preprocessing import StandardScaler, OneHotEncoder\n",
"from sklearn.compose import make_column_selector as selector\n",
"from sklearn.pipeline import Pipeline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can load the data:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from fairness_nb_utils import fetch_census_dataset\n",
"\n",
"data = fetch_census_dataset()\n",
"\n",
"# Extract the items we want\n",
"X_raw = data.data\n",
"y = (data.target == '>50K') * 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can take a look at some of the data. For example, the next cell shows the counts of the different races identified in the dataset:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(X_raw[\"race\"].value_counts().to_dict())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"ProcessingData\"></a>\n",
"## Processing the Data\n",
"\n",
"With the data loaded, we process it for our needs. First, we extract the sensitive features of interest into `A` (conventionally used in the literature) and leave the rest of the feature data in `X_raw`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"A = X_raw[['sex','race']]\n",
"X_raw = X_raw.drop(labels=['sex', 'race'], axis = 1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now preprocess our data. To avoid the problem of data leakage, we split our data into training and test sets before performing any other transformations. Subsequent transformations (such as scalings) will be fit to the training data set, and then applied to the test dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"(X_train, X_test, y_train, y_test, A_train, A_test) = train_test_split(\n",
"    X_raw, y, A, test_size=0.3, random_state=12345, stratify=y\n",
")\n",
"\n",
"# Ensure indices are aligned between X, y and A,\n",
"# after all the slicing and splitting of DataFrames\n",
"# and Series\n",
"\n",
"X_train = X_train.reset_index(drop=True)\n",
"X_test = X_test.reset_index(drop=True)\n",
"y_train = y_train.reset_index(drop=True)\n",
"y_test = y_test.reset_index(drop=True)\n",
"A_train = A_train.reset_index(drop=True)\n",
"A_test = A_test.reset_index(drop=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have two types of column in the dataset - categorical columns which will need to be one-hot encoded, and numeric ones which will need to be rescaled. We also need to take care of missing values. We use a simple approach here, but please bear in mind that this is another way that bias could be introduced (especially if one subgroup tends to have more missing values).\n",
"\n",
"For this preprocessing, we make use of `Pipeline` objects from `sklearn`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"numeric_transformer = Pipeline(\n",
"    steps=[\n",
"        (\"impute\", SimpleImputer()),\n",
"        (\"scaler\", StandardScaler()),\n",
"    ]\n",
")\n",
"\n",
"categorical_transformer = Pipeline(\n",
"    [\n",
"        (\"impute\", SimpleImputer(strategy=\"most_frequent\")),\n",
"        (\"ohe\", OneHotEncoder(handle_unknown=\"ignore\", sparse=False)),\n",
"    ]\n",
")\n",
"\n",
"preprocessor = ColumnTransformer(\n",
"    transformers=[\n",
"        (\"num\", numeric_transformer, selector(dtype_exclude=\"category\")),\n",
"        (\"cat\", categorical_transformer, selector(dtype_include=\"category\")),\n",
"    ]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that the preprocessing pipeline is defined, we can run it on our training data, and apply the generated transform to our test data:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"X_train = preprocessor.fit_transform(X_train)\n",
"X_test = preprocessor.transform(X_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"TrainingModels\"></a>\n",
"## Training Models\n",
"\n",
"We now train a couple of different models on our data. The `adult` census dataset is a classification problem - the goal is to predict whether a particular individual exceeds an income threshold. For the purpose of generating a dashboard to upload, it is sufficient to train two basic classifiers. First, a logistic regression classifier:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"lr_predictor = LogisticRegression(solver='liblinear', fit_intercept=True)\n",
"\n",
"lr_predictor.fit(X_train, y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And for comparison, a support vector classifier:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"svm_predictor = svm.SVC()\n",
"\n",
"svm_predictor.fit(X_train, y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"LoginAzureML\"></a>\n",
"## Logging in to AzureML\n",
"\n",
"With our two classifiers trained, we can log into our AzureML workspace:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Workspace, Experiment, Model\n",
"\n",
"ws = Workspace.from_config()\n",
"ws.get_details()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"RegisterModels\"></a>\n",
"## Registering the Models\n",
"\n",
"Next, we register our models. By default, the subroutine which uploads the models checks that the names provided correspond to registered models in the workspace. We define a utility routine to do the registering:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import joblib\n",
"import os\n",
"\n",
"os.makedirs('models', exist_ok=True)\n",
"def register_model(name, model):\n",
"    print(\"Registering \", name)\n",
"    model_path = \"models/{0}.pkl\".format(name)\n",
"    joblib.dump(value=model, filename=model_path)\n",
"    registered_model = Model.register(model_path=model_path,\n",
"                                      model_name=name,\n",
"                                      workspace=ws)\n",
"    print(\"Registered \", registered_model.id)\n",
"    return registered_model.id"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we register the models. For convenience in subsequent method calls, we store the results in a dictionary, which maps the `id` of the registered model (a string in `name:version` format) to the predictor itself:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model_dict = {}\n",
"\n",
"lr_reg_id = register_model(\"fairness_linear_regression\", lr_predictor)\n",
"model_dict[lr_reg_id] = lr_predictor\n",
"svm_reg_id = register_model(\"fairness_svm\", svm_predictor)\n",
"model_dict[svm_reg_id] = svm_predictor"
]
},
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"LocalDashboard\"></a>\n",
|
|
||||||
"## Using the Fairlearn Dashboard\n",
|
|
||||||
"\n",
|
|
||||||
"We can now examine the fairness of the two models we have training, both as a function of race and (binary) sex. Before uploading the dashboard to the AzureML portal, we will first instantiate a local instance of the Fairlearn dashboard.\n",
|
|
||||||
"\n",
|
|
||||||
"Regardless of the viewing location, the dashboard is based on three things - the true values, the model predictions and the sensitive feature values. The dashboard can use predictions from multiple models and multiple sensitive features if desired (as we are doing here).\n",
|
|
||||||
"\n",
|
|
||||||
"Our first step is to generate a dictionary mapping the `id` of the registered model to the corresponding array of predictions:"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"ys_pred = {}\n",
|
|
||||||
"for n, p in model_dict.items():\n",
|
|
||||||
" ys_pred[n] = p.predict(X_test)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"We can examine these predictions in a locally invoked Fairlearn dashboard. This can be compared to the dashboard uploaded to the portal (in the next section):"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from raiwidgets import FairnessDashboard\n",
|
|
||||||
"\n",
|
|
||||||
"FairnessDashboard(sensitive_features=A_test, \n",
|
|
||||||
" y_true=y_test.tolist(),\n",
|
|
||||||
" y_pred=ys_pred)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"AzureUpload\"></a>\n",
|
|
||||||
"## Uploading a Fairness Dashboard to Azure\n",
|
|
||||||
"\n",
|
|
||||||
"Uploading a fairness dashboard to Azure is a two stage process. The `FairnessDashboard` invoked in the previous section relies on the underlying Python kernel to compute metrics on demand. This is obviously not available when the fairness dashboard is rendered in AzureML Studio. The required stages are therefore:\n",
|
|
||||||
"1. Precompute all the required metrics\n",
|
|
||||||
"1. Upload to Azure\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"### Computing Fairness Metrics\n",
|
|
||||||
"We use Fairlearn to create a dictionary which contains all the data required to display a dashboard. This includes both the raw data (true values, predicted values and sensitive features), and also the fairness metrics. The API is similar to that used to invoke the Dashboard locally. However, there are a few minor changes to the API, and the type of problem being examined (binary classification, regression etc.) needs to be specified explicitly:"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"sf = { 'Race': A_test.race, 'Sex': A_test.sex }\n",
|
|
||||||
"\n",
|
|
||||||
"from fairlearn.metrics._group_metric_set import _create_group_metric_set\n",
|
|
||||||
"\n",
|
|
||||||
"dash_dict = _create_group_metric_set(y_true=y_test,\n",
|
|
||||||
" predictions=ys_pred,\n",
|
|
||||||
" sensitive_features=sf,\n",
|
|
||||||
" prediction_type='binary_classification')"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"The `_create_group_metric_set()` method is currently underscored since its exact design is not yet final in Fairlearn."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
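As a quick sanity check on the precomputed metric set, a sketch (the top-level key names are an implementation detail of Fairlearn and may change between releases):

    # Peek at the top-level structure of the dashboard dictionary.
    print(sorted(dash_dict.keys()))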
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Uploading to Azure\n",
|
|
||||||
"\n",
|
|
||||||
"We can now import the `azureml.contrib.fairness` package itself. We will round-trip the data, so there are two required subroutines:"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.contrib.fairness import upload_dashboard_dictionary, download_dashboard_by_upload_id"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Finally, we can upload the generated dictionary to AzureML. The upload method requires a run, so we first create an experiment and a run. The uploaded dashboard can be seen on the corresponding Run Details page in AzureML Studio. For completeness, we also download the dashboard dictionary which we uploaded."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"exp = Experiment(ws, \"notebook-01\")\n",
|
|
||||||
"print(exp)\n",
|
|
||||||
"\n",
|
|
||||||
"run = exp.start_logging()\n",
|
|
||||||
"try:\n",
|
|
||||||
" dashboard_title = \"Sample notebook upload\"\n",
|
|
||||||
" upload_id = upload_dashboard_dictionary(run,\n",
|
|
||||||
" dash_dict,\n",
|
|
||||||
" dashboard_name=dashboard_title)\n",
|
|
||||||
" print(\"\\nUploaded to id: {0}\\n\".format(upload_id))\n",
|
|
||||||
"\n",
|
|
||||||
" downloaded_dict = download_dashboard_by_upload_id(run, upload_id)\n",
|
|
||||||
"finally:\n",
|
|
||||||
" run.complete()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Finally, we can verify that the dashboard dictionary which we downloaded matches our upload:"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"print(dash_dict == downloaded_dict)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
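A plain `==` on the dictionaries is usually sufficient; for a check at the JSON level, one option (a sketch, assuming the dictionary is JSON-serializable, which the upload already requires) is to compare canonical serializations, since the round-trip may change container types (for example, tuples become lists):

    import json

    # Compare canonical JSON serializations; this checks equality at the
    # JSON level rather than on exact Python container types.
    assert json.dumps(dash_dict, sort_keys=True) == json.dumps(downloaded_dict, sort_keys=True)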
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"Conclusion\"></a>\n",
|
|
||||||
"## Conclusion\n",
|
|
||||||
"\n",
|
|
||||||
"In this notebook we have demonstrated how to generate and upload a fairness dashboard to AzureML Studio. We have not discussed how to analyse the results and apply mitigations. Those topics will be covered elsewhere."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": []
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "riedgar"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "python",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"codemirror_mode": {
|
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.6.10"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 4
|
|
||||||
}
|
|
||||||
@@ -1,11 +0,0 @@
|
|||||||
name: upload-fairness-dashboard
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
- azureml-contrib-fairness
|
|
||||||
- fairlearn>=0.6.2
|
|
||||||
- joblib
|
|
||||||
- liac-arff
|
|
||||||
- raiwidgets~=0.17.0
|
|
||||||
- itsdangerous==2.0.1
|
|
||||||
- markupsafe<2.1.0
|
|
||||||
@@ -9,7 +9,6 @@ As a pre-requisite, run the [configuration Notebook](../configuration.ipynb) not
|
|||||||
* [train-on-amlcompute](./training/train-on-amlcompute): Use a 1-n node Azure ML managed compute cluster for remote runs on Azure CPU or GPU infrastructure.
|
* [train-on-amlcompute](./training/train-on-amlcompute): Use a 1-n node Azure ML managed compute cluster for remote runs on Azure CPU or GPU infrastructure.
|
||||||
* [train-on-remote-vm](./training/train-on-remote-vm): Use Data Science Virtual Machine as a target for remote runs.
|
* [train-on-remote-vm](./training/train-on-remote-vm): Use Data Science Virtual Machine as a target for remote runs.
|
||||||
* [logging-api](./track-and-monitor-experiments/logging-api): Learn about the details of logging metrics to run history.
|
* [logging-api](./track-and-monitor-experiments/logging-api): Learn about the details of logging metrics to run history.
|
||||||
* [production-deploy-to-aks](./deployment/production-deploy-to-aks) Deploy a model to production at scale on Azure Kubernetes Service.
|
|
||||||
* [enable-app-insights-in-production-service](./deployment/enable-app-insights-in-production-service) Learn how to use App Insights with production web service.
|
* [enable-app-insights-in-production-service](./deployment/enable-app-insights-in-production-service) Learn how to use App Insights with production web service.
|
||||||
|
|
||||||
Find quickstarts, end-to-end tutorials, and how-tos on the [official documentation site for Azure Machine Learning service](https://docs.microsoft.com/en-us/azure/machine-learning/service/).
|
Find quickstarts, end-to-end tutorials, and how-tos on the [official documentation site for Azure Machine Learning service](https://docs.microsoft.com/en-us/azure/machine-learning/service/).
|
||||||
|
|||||||
@@ -5,26 +5,22 @@ channels:
|
|||||||
- main
|
- main
|
||||||
dependencies:
|
dependencies:
|
||||||
# The python interpreter version.
|
# The python interpreter version.
|
||||||
# Currently Azure ML only supports 3.6.0 and later.
|
# Azure ML only supports 3.8 and later.
|
||||||
- pip==20.2.4
|
- pip==22.3.1
|
||||||
- python>=3.6,<3.9
|
- python>=3.10,<3.11
|
||||||
- matplotlib==3.3.4
|
- holidays==0.29
|
||||||
- py-xgboost==1.3.3
|
- scipy==1.10.1
|
||||||
- pytorch::pytorch=1.4.0
|
- tqdm==4.66.1
|
||||||
- conda-forge::fbprophet==0.7.1
|
|
||||||
- cudatoolkit=10.1.243
|
|
||||||
- tqdm==4.63.1
|
|
||||||
- notebook
|
|
||||||
- pywin32==225
|
|
||||||
- PySocks==1.7.1
|
|
||||||
- conda-forge::pyqt==5.12.3
|
|
||||||
|
|
||||||
- pip:
|
- pip:
|
||||||
# Required packages for AzureML execution, history, and data preparation.
|
# Required packages for AzureML execution, history, and data preparation.
|
||||||
- azureml-widgets~=1.40.0
|
- azureml-widgets~=1.59.0
|
||||||
- pytorch-transformers==1.0.0
|
- azureml-defaults~=1.59.0
|
||||||
- spacy==2.2.4
|
- -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.59.0/validated_win32_requirements.txt [--no-deps]
|
||||||
- pystan==2.19.1.1
|
- matplotlib==3.7.1
|
||||||
- https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
|
- xgboost==1.5.2
|
||||||
- -r https://automlsdkdataresources.blob.core.windows.net/validated-requirements/1.40.0/validated_win32_requirements.txt [--no-deps]
|
- prophet==1.1.4
|
||||||
- arch==4.14
|
- onnx==1.16.1
|
||||||
|
- setuptools-git==1.2
|
||||||
|
- spacy==3.7.4
|
||||||
|
- https://aka.ms/automl-resources/packages/en_core_web_sm-3.7.1.tar.gz
|
||||||
|
|||||||
@@ -5,29 +5,26 @@ channels:
|
|||||||
- main
|
- main
|
||||||
dependencies:
|
dependencies:
|
||||||
# The python interpreter version.
|
# The python interpreter version.
|
||||||
# Currently Azure ML only supports 3.6.0 and later.
|
# Azure ML only supports 3.7 and later.
|
||||||
- pip==20.2.4
|
- pip==22.3.1
|
||||||
- python>=3.6,<3.9
|
- python>=3.10,<3.11
|
||||||
- boto3==1.20.19
|
- matplotlib==3.7.1
|
||||||
- botocore<=1.23.19
|
- numpy>=1.21.6,<=1.23.5
|
||||||
- matplotlib==3.3.4
|
|
||||||
- numpy==1.19.5
|
|
||||||
- cython==0.29.14
|
|
||||||
- urllib3==1.26.7
|
- urllib3==1.26.7
|
||||||
- scipy>=1.4.1,<=1.5.2
|
- scipy==1.10.1
|
||||||
- scikit-learn==0.22.1
|
- scikit-learn==1.5.1
|
||||||
- py-xgboost<=1.3.3
|
- holidays==0.29
|
||||||
- holidays==0.10.3
|
- pytorch::pytorch=1.11.0
|
||||||
- conda-forge::fbprophet==0.7.1
|
|
||||||
- pytorch::pytorch=1.4.0
|
|
||||||
- cudatoolkit=10.1.243
|
- cudatoolkit=10.1.243
|
||||||
|
- notebook
|
||||||
|
|
||||||
- pip:
|
- pip:
|
||||||
# Required packages for AzureML execution, history, and data preparation.
|
# Required packages for AzureML execution, history, and data preparation.
|
||||||
- azureml-widgets~=1.40.0
|
- azureml-widgets~=1.59.0
|
||||||
|
- azureml-defaults~=1.59.0
|
||||||
- pytorch-transformers==1.0.0
|
- pytorch-transformers==1.0.0
|
||||||
- spacy==2.2.4
|
- spacy==3.7.4
|
||||||
- pystan==2.19.1.1
|
- xgboost==1.5.2
|
||||||
- https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
|
- prophet==1.1.4
|
||||||
- -r https://automlsdkdataresources.blob.core.windows.net/validated-requirements/1.40.0/validated_linux_requirements.txt [--no-deps]
|
- https://aka.ms/automl-resources/packages/en_core_web_sm-3.7.1.tar.gz
|
||||||
- arch==4.14
|
- -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.59.0/validated_linux_requirements.txt [--no-deps]
|
||||||
|
|||||||
@@ -5,30 +5,22 @@ channels:
|
|||||||
- main
|
- main
|
||||||
dependencies:
|
dependencies:
|
||||||
# The python interpreter version.
|
# The python interpreter version.
|
||||||
# Currently Azure ML only supports 3.6.0 and later.
|
# Currently Azure ML only supports 3.7 and later.
|
||||||
- pip==20.2.4
|
- pip==22.3.1
|
||||||
- nomkl
|
- python>=3.10,<3.11
|
||||||
- python>=3.6,<3.9
|
- numpy>=1.21.6,<=1.23.5
|
||||||
- boto3==1.20.19
|
- scipy==1.10.1
|
||||||
- botocore<=1.23.19
|
- scikit-learn==1.5.1
|
||||||
- matplotlib==3.3.4
|
- holidays==0.29
|
||||||
- numpy==1.19.5
|
|
||||||
- cython==0.29.14
|
|
||||||
- urllib3==1.26.7
|
|
||||||
- scipy>=1.4.1,<=1.5.2
|
|
||||||
- scikit-learn==0.22.1
|
|
||||||
- py-xgboost<=1.3.3
|
|
||||||
- holidays==0.10.3
|
|
||||||
- conda-forge::fbprophet==0.7.1
|
|
||||||
- pytorch::pytorch=1.4.0
|
|
||||||
- cudatoolkit=9.0
|
|
||||||
|
|
||||||
- pip:
|
- pip:
|
||||||
# Required packages for AzureML execution, history, and data preparation.
|
# Required packages for AzureML execution, history, and data preparation.
|
||||||
- azureml-widgets~=1.40.0
|
- azureml-widgets~=1.59.0
|
||||||
|
- azureml-defaults~=1.59.0
|
||||||
- pytorch-transformers==1.0.0
|
- pytorch-transformers==1.0.0
|
||||||
- spacy==2.2.4
|
- prophet==1.1.4
|
||||||
- pystan==2.19.1.1
|
- xgboost==1.5.2
|
||||||
- https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
|
- spacy==3.7.4
|
||||||
- -r https://automlsdkdataresources.blob.core.windows.net/validated-requirements/1.40.0/validated_darwin_requirements.txt [--no-deps]
|
- matplotlib==3.7.1
|
||||||
- arch==4.14
|
- https://aka.ms/automl-resources/packages/en_core_web_sm-3.7.1.tar.gz
|
||||||
|
- -r https://automlcesdkdataresources.blob.core.windows.net/validated-requirements/1.59.0/validated_darwin_requirements.txt [--no-deps]
|
||||||
|
|||||||
@@ -33,6 +33,8 @@ if not errorlevel 1 (
|
|||||||
call conda env create -f %automl_env_file% -n %conda_env_name%
|
call conda env create -f %automl_env_file% -n %conda_env_name%
|
||||||
)
|
)
|
||||||
|
|
||||||
|
python "%conda_prefix%\scripts\pywin32_postinstall.py" -install
|
||||||
|
|
||||||
call conda activate %conda_env_name% 2>nul:
|
call conda activate %conda_env_name% 2>nul:
|
||||||
if errorlevel 1 goto ErrorExit
|
if errorlevel 1 goto ErrorExit
|
||||||
|
|
||||||
|
|||||||
@@ -1,4 +1,4 @@
|
|||||||
from distutils.version import LooseVersion
|
from setuptools._vendor.packaging import version
|
||||||
import platform
|
import platform
|
||||||
|
|
||||||
try:
|
try:
|
||||||
@@ -17,7 +17,7 @@ if architecture != "64bit":
|
|||||||
|
|
||||||
minimumVersion = "4.7.8"
|
minimumVersion = "4.7.8"
|
||||||
|
|
||||||
versionInvalid = (LooseVersion(conda.__version__) < LooseVersion(minimumVersion))
|
versionInvalid = (version.parse(conda.__version__) < version.parse(minimumVersion))
|
||||||
|
|
||||||
if versionInvalid:
|
if versionInvalid:
|
||||||
print('Setup requires conda version ' + minimumVersion + ' or higher.')
|
print('Setup requires conda version ' + minimumVersion + ' or higher.')
|
||||||
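The replacement reaches into setuptools' vendored copy of `packaging`. If the `packaging` distribution is importable on its own (an assumption; it ships with recent pip installs), a sketch of the same check without the vendored path:

    from packaging import version

    # Same comparison, using the standalone packaging distribution.
    versionInvalid = version.parse(conda.__version__) < version.parse(minimumVersion)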
|
|||||||
@@ -1,5 +1,21 @@
|
|||||||
{
|
{
|
||||||
"cells": [
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
||||||
|
"\n",
|
||||||
|
"Licensed under the MIT License."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -77,7 +93,8 @@
|
|||||||
"from azureml.core.workspace import Workspace\n",
|
"from azureml.core.workspace import Workspace\n",
|
||||||
"from azureml.core.dataset import Dataset\n",
|
"from azureml.core.dataset import Dataset\n",
|
||||||
"from azureml.train.automl import AutoMLConfig\n",
|
"from azureml.train.automl import AutoMLConfig\n",
|
||||||
"from azureml.interpret import ExplanationClient"
|
"from azureml.interpret import ExplanationClient\n",
|
||||||
|
"from azureml.data.datapath import DataPath"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -134,6 +151,7 @@
|
|||||||
"output[\"Resource Group\"] = ws.resource_group\n",
|
"output[\"Resource Group\"] = ws.resource_group\n",
|
||||||
"output[\"Location\"] = ws.location\n",
|
"output[\"Location\"] = ws.location\n",
|
||||||
"output[\"Experiment Name\"] = experiment.name\n",
|
"output[\"Experiment Name\"] = experiment.name\n",
|
||||||
|
"output[\"SDK Version\"] = azureml.core.VERSION\n",
|
||||||
"pd.set_option(\"display.max_colwidth\", None)\n",
|
"pd.set_option(\"display.max_colwidth\", None)\n",
|
||||||
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
||||||
"outputDf.T"
|
"outputDf.T"
|
||||||
@@ -227,8 +245,8 @@
|
|||||||
"n_missing_samples = int(np.floor(data.shape[0] * missing_rate))\n",
|
"n_missing_samples = int(np.floor(data.shape[0] * missing_rate))\n",
|
||||||
"missing_samples = np.hstack(\n",
|
"missing_samples = np.hstack(\n",
|
||||||
" (\n",
|
" (\n",
|
||||||
" np.zeros(data.shape[0] - n_missing_samples, dtype=np.bool),\n",
|
" np.zeros(data.shape[0] - n_missing_samples, dtype=bool),\n",
|
||||||
" np.ones(n_missing_samples, dtype=np.bool),\n",
|
" np.ones(n_missing_samples, dtype=bool),\n",
|
||||||
" )\n",
|
" )\n",
|
||||||
")\n",
|
")\n",
|
||||||
"rng = np.random.RandomState(0)\n",
|
"rng = np.random.RandomState(0)\n",
|
||||||
@@ -249,10 +267,12 @@
|
|||||||
"pd.DataFrame(data).to_csv(\"data/train_data.csv\", index=False)\n",
|
"pd.DataFrame(data).to_csv(\"data/train_data.csv\", index=False)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"ds = ws.get_default_datastore()\n",
|
"ds = ws.get_default_datastore()\n",
|
||||||
"ds.upload(\n",
|
"target = DataPath(\n",
|
||||||
" src_dir=\"./data\", target_path=\"bankmarketing\", overwrite=True, show_progress=True\n",
|
" datastore=ds, path_on_datastore=\"bankmarketing/train_data.csv\", name=\"bankmarketing\"\n",
|
||||||
|
")\n",
|
||||||
|
"Dataset.File.upload_directory(\n",
|
||||||
|
" src_dir=\"./data\", target=target, overwrite=True, show_progress=True\n",
|
||||||
")\n",
|
")\n",
|
||||||
"\n",
|
|
||||||
"\n",
|
"\n",
|
||||||
"# Upload the training data as a tabular dataset for access during training on remote compute\n",
|
"# Upload the training data as a tabular dataset for access during training on remote compute\n",
|
||||||
"train_data = Dataset.Tabular.from_delimited_files(\n",
|
"train_data = Dataset.Tabular.from_delimited_files(\n",
|
||||||
@@ -711,7 +731,9 @@
|
|||||||
"from azureml.core.model import Model\n",
|
"from azureml.core.model import Model\n",
|
||||||
"from azureml.core.environment import Environment\n",
|
"from azureml.core.environment import Environment\n",
|
||||||
"\n",
|
"\n",
|
||||||
"inference_config = InferenceConfig(entry_script=script_file_name)\n",
|
"inference_config = InferenceConfig(\n",
|
||||||
|
" environment=best_run.get_environment(), entry_script=script_file_name\n",
|
||||||
|
")\n",
|
||||||
"\n",
|
"\n",
|
||||||
"aciconfig = AciWebservice.deploy_configuration(\n",
|
"aciconfig = AciWebservice.deploy_configuration(\n",
|
||||||
" cpu_cores=2,\n",
|
" cpu_cores=2,\n",
|
||||||
@@ -827,9 +849,7 @@
|
|||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {
|
"metadata": {},
|
||||||
"scrolled": true
|
|
||||||
},
|
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%matplotlib notebook\n",
|
"%matplotlib notebook\n",
|
||||||
@@ -1059,9 +1079,9 @@
|
|||||||
"name": "python3-azureml"
|
"name": "python3-azureml"
|
||||||
},
|
},
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
@@ -1073,7 +1093,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.9"
|
"version": "3.10.14"
|
||||||
},
|
},
|
||||||
"nteract": {
|
"nteract": {
|
||||||
"version": "nteract-front-end@1.0.0"
|
"version": "nteract-front-end@1.0.0"
|
||||||
@@ -1087,5 +1107,5 @@
|
|||||||
"task": "Classification"
|
"task": "Classification"
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
"nbformat_minor": 1
|
"nbformat_minor": 4
|
||||||
}
|
}
|
||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-classification-bank-marketing-all-features
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -1,5 +1,21 @@
|
|||||||
{
|
{
|
||||||
"cells": [
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
||||||
|
"\n",
|
||||||
|
"Licensed under the MIT License."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -90,6 +106,7 @@
|
|||||||
"output[\"Resource Group\"] = ws.resource_group\n",
|
"output[\"Resource Group\"] = ws.resource_group\n",
|
||||||
"output[\"Location\"] = ws.location\n",
|
"output[\"Location\"] = ws.location\n",
|
||||||
"output[\"Experiment Name\"] = experiment.name\n",
|
"output[\"Experiment Name\"] = experiment.name\n",
|
||||||
|
"output[\"SDK Version\"] = azureml.core.VERSION\n",
|
||||||
"pd.set_option(\"display.max_colwidth\", None)\n",
|
"pd.set_option(\"display.max_colwidth\", None)\n",
|
||||||
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
||||||
"outputDf.T"
|
"outputDf.T"
|
||||||
@@ -455,9 +472,9 @@
|
|||||||
"friendly_name": "Classification of credit card fraudulent transactions using Automated ML",
|
"friendly_name": "Classification of credit card fraudulent transactions using Automated ML",
|
||||||
"index_order": 5,
|
"index_order": 5,
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
|
|||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-classification-credit-card-fraud
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -1,592 +0,0 @@
|
|||||||
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Automated Machine Learning\n",
|
|
||||||
"_**Text Classification Using Deep Learning**_\n",
|
|
||||||
"\n",
|
|
||||||
"## Contents\n",
|
|
||||||
"1. [Introduction](#Introduction)\n",
|
|
||||||
"1. [Setup](#Setup)\n",
|
|
||||||
"1. [Data](#Data)\n",
|
|
||||||
"1. [Train](#Train)\n",
|
|
||||||
"1. [Evaluate](#Evaluate)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Introduction\n",
|
|
||||||
"This notebook demonstrates classification with text data using deep learning in AutoML.\n",
|
|
||||||
"\n",
|
|
||||||
"AutoML highlights here include using deep neural networks (DNNs) to create embedded features from text data. Depending on the compute cluster the user provides, AutoML tried out Bidirectional Encoder Representations from Transformers (BERT) when a GPU compute is used, and Bidirectional Long-Short Term neural network (BiLSTM) when a CPU compute is used, thereby optimizing the choice of DNN for the uesr's setup.\n",
|
|
||||||
"\n",
|
|
||||||
"Make sure you have executed the [configuration](../../../configuration.ipynb) before running this notebook.\n",
|
|
||||||
"\n",
|
|
||||||
"Notebook synopsis:\n",
|
|
||||||
"\n",
|
|
||||||
"1. Creating an Experiment in an existing Workspace\n",
|
|
||||||
"2. Configuration and remote run of AutoML for a text dataset (20 Newsgroups dataset from scikit-learn) for classification\n",
|
|
||||||
"3. Registering the best model for future use\n",
|
|
||||||
"4. Evaluating the final model on a test set"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Setup"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import json\n",
|
|
||||||
"import logging\n",
|
|
||||||
"import os\n",
|
|
||||||
"import shutil\n",
|
|
||||||
"\n",
|
|
||||||
"import pandas as pd\n",
|
|
||||||
"\n",
|
|
||||||
"import azureml.core\n",
|
|
||||||
"from azureml.core.experiment import Experiment\n",
|
|
||||||
"from azureml.core.workspace import Workspace\n",
|
|
||||||
"from azureml.core.dataset import Dataset\n",
|
|
||||||
"from azureml.core.compute import AmlCompute\n",
|
|
||||||
"from azureml.core.compute import ComputeTarget\n",
|
|
||||||
"from azureml.core.run import Run\n",
|
|
||||||
"from azureml.widgets import RunDetails\n",
|
|
||||||
"from azureml.core.model import Model\n",
|
|
||||||
"from helper import run_inference, get_result_df\n",
|
|
||||||
"from azureml.train.automl import AutoMLConfig\n",
|
|
||||||
"from sklearn.datasets import fetch_20newsgroups"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"This sample notebook may use features that are not available in previous versions of the Azure ML SDK."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"As part of the setup you have already created a <b>Workspace</b>. To run AutoML, you also need to create an <b>Experiment</b>. An Experiment corresponds to a prediction problem you are trying to solve, while a Run corresponds to a specific approach to the problem."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"ws = Workspace.from_config()\n",
|
|
||||||
"\n",
|
|
||||||
"# Choose an experiment name.\n",
|
|
||||||
"experiment_name = \"automl-classification-text-dnn\"\n",
|
|
||||||
"\n",
|
|
||||||
"experiment = Experiment(ws, experiment_name)\n",
|
|
||||||
"\n",
|
|
||||||
"output = {}\n",
|
|
||||||
"output[\"Subscription ID\"] = ws.subscription_id\n",
|
|
||||||
"output[\"Workspace Name\"] = ws.name\n",
|
|
||||||
"output[\"Resource Group\"] = ws.resource_group\n",
|
|
||||||
"output[\"Location\"] = ws.location\n",
|
|
||||||
"output[\"Experiment Name\"] = experiment.name\n",
|
|
||||||
"pd.set_option(\"display.max_colwidth\", None)\n",
|
|
||||||
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
|
||||||
"outputDf.T"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Set up a compute cluster\n",
|
|
||||||
"This section uses a user-provided compute cluster (named \"dnntext-cluster\" in this example). If a cluster with this name does not exist in the user's workspace, the below code will create a new cluster. You can choose the parameters of the cluster as mentioned in the comments.\n",
|
|
||||||
"\n",
|
|
||||||
"> Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.\n",
|
|
||||||
"\n",
|
|
||||||
"Whether you provide/select a CPU or GPU cluster, AutoML will choose the appropriate DNN for that setup - BiLSTM or BERT text featurizer will be included in the candidate featurizers on CPU and GPU respectively. If your goal is to obtain the most accurate model, we recommend you use GPU clusters since BERT featurizers usually outperform BiLSTM featurizers."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
|
||||||
"from azureml.core.compute_target import ComputeTargetException\n",
|
|
||||||
"\n",
|
|
||||||
"num_nodes = 2\n",
|
|
||||||
"\n",
|
|
||||||
"# Choose a name for your cluster.\n",
|
|
||||||
"amlcompute_cluster_name = \"dnntext-cluster\"\n",
|
|
||||||
"\n",
|
|
||||||
"# Verify that cluster does not exist already\n",
|
|
||||||
"try:\n",
|
|
||||||
" compute_target = ComputeTarget(workspace=ws, name=amlcompute_cluster_name)\n",
|
|
||||||
" print(\"Found existing cluster, use it.\")\n",
|
|
||||||
"except ComputeTargetException:\n",
|
|
||||||
" compute_config = AmlCompute.provisioning_configuration(\n",
|
|
||||||
" vm_size=\"STANDARD_NC6\", # CPU for BiLSTM, such as \"STANDARD_D2_V2\"\n",
|
|
||||||
" # To use BERT (this is recommended for best performance), select a GPU such as \"STANDARD_NC6\"\n",
|
|
||||||
" # or similar GPU option\n",
|
|
||||||
" # available in your workspace\n",
|
|
||||||
" idle_seconds_before_scaledown=60,\n",
|
|
||||||
" max_nodes=num_nodes,\n",
|
|
||||||
" )\n",
|
|
||||||
" compute_target = ComputeTarget.create(ws, amlcompute_cluster_name, compute_config)\n",
|
|
||||||
"\n",
|
|
||||||
"compute_target.wait_for_completion(show_output=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Get data\n",
|
|
||||||
"For this notebook we will use 20 Newsgroups data from scikit-learn. We filter the data to contain four classes and take a sample as training data. Please note that for accuracy improvement, more data is needed. For this notebook we provide a small-data example so that you can use this template to use with your larger sized data."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"data_dir = \"text-dnn-data\" # Local directory to store data\n",
|
|
||||||
"blobstore_datadir = data_dir # Blob store directory to store data in\n",
|
|
||||||
"target_column_name = \"y\"\n",
|
|
||||||
"feature_column_name = \"X\"\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"def get_20newsgroups_data():\n",
|
|
||||||
" \"\"\"Fetches 20 Newsgroups data from scikit-learn\n",
|
|
||||||
" Returns them in form of pandas dataframes\n",
|
|
||||||
" \"\"\"\n",
|
|
||||||
" remove = (\"headers\", \"footers\", \"quotes\")\n",
|
|
||||||
" categories = [\n",
|
|
||||||
" \"rec.sport.baseball\",\n",
|
|
||||||
" \"rec.sport.hockey\",\n",
|
|
||||||
" \"comp.graphics\",\n",
|
|
||||||
" \"sci.space\",\n",
|
|
||||||
" ]\n",
|
|
||||||
"\n",
|
|
||||||
" data = fetch_20newsgroups(\n",
|
|
||||||
" subset=\"train\",\n",
|
|
||||||
" categories=categories,\n",
|
|
||||||
" shuffle=True,\n",
|
|
||||||
" random_state=42,\n",
|
|
||||||
" remove=remove,\n",
|
|
||||||
" )\n",
|
|
||||||
" data = pd.DataFrame(\n",
|
|
||||||
" {feature_column_name: data.data, target_column_name: data.target}\n",
|
|
||||||
" )\n",
|
|
||||||
"\n",
|
|
||||||
" data_train = data[:200]\n",
|
|
||||||
" data_test = data[200:300]\n",
|
|
||||||
"\n",
|
|
||||||
" data_train = remove_blanks_20news(\n",
|
|
||||||
" data_train, feature_column_name, target_column_name\n",
|
|
||||||
" )\n",
|
|
||||||
" data_test = remove_blanks_20news(data_test, feature_column_name, target_column_name)\n",
|
|
||||||
"\n",
|
|
||||||
" return data_train, data_test\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"def remove_blanks_20news(data, feature_column_name, target_column_name):\n",
|
|
||||||
"\n",
|
|
||||||
" data[feature_column_name] = (\n",
|
|
||||||
" data[feature_column_name]\n",
|
|
||||||
" .replace(r\"\\n\", \" \", regex=True)\n",
|
|
||||||
" .apply(lambda x: x.strip())\n",
|
|
||||||
" )\n",
|
|
||||||
" data = data[data[feature_column_name] != \"\"]\n",
|
|
||||||
"\n",
|
|
||||||
" return data"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Fetch data and upload to datastore for use in training"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"data_train, data_test = get_20newsgroups_data()\n",
|
|
||||||
"\n",
|
|
||||||
"if not os.path.isdir(data_dir):\n",
|
|
||||||
" os.mkdir(data_dir)\n",
|
|
||||||
"\n",
|
|
||||||
"train_data_fname = data_dir + \"/train_data.csv\"\n",
|
|
||||||
"test_data_fname = data_dir + \"/test_data.csv\"\n",
|
|
||||||
"\n",
|
|
||||||
"data_train.to_csv(train_data_fname, index=False)\n",
|
|
||||||
"data_test.to_csv(test_data_fname, index=False)\n",
|
|
||||||
"\n",
|
|
||||||
"datastore = ws.get_default_datastore()\n",
|
|
||||||
"datastore.upload(src_dir=data_dir, target_path=blobstore_datadir, overwrite=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"train_dataset = Dataset.Tabular.from_delimited_files(\n",
|
|
||||||
" path=[(datastore, blobstore_datadir + \"/train_data.csv\")]\n",
|
|
||||||
")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Prepare AutoML run"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"This notebook uses the blocked_models parameter to exclude some models that can take a longer time to train on some text datasets. You can choose to remove models from the blocked_models list but you may need to increase the experiment_timeout_hours parameter value to get results."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"automl_settings = {\n",
|
|
||||||
" \"experiment_timeout_minutes\": 30,\n",
|
|
||||||
" \"primary_metric\": \"accuracy\",\n",
|
|
||||||
" \"max_concurrent_iterations\": num_nodes,\n",
|
|
||||||
" \"max_cores_per_iteration\": -1,\n",
|
|
||||||
" \"enable_dnn\": True,\n",
|
|
||||||
" \"enable_early_stopping\": True,\n",
|
|
||||||
" \"validation_size\": 0.3,\n",
|
|
||||||
" \"verbosity\": logging.INFO,\n",
|
|
||||||
" \"enable_voting_ensemble\": False,\n",
|
|
||||||
" \"enable_stack_ensemble\": False,\n",
|
|
||||||
"}\n",
|
|
||||||
"\n",
|
|
||||||
"automl_config = AutoMLConfig(\n",
|
|
||||||
" task=\"classification\",\n",
|
|
||||||
" debug_log=\"automl_errors.log\",\n",
|
|
||||||
" compute_target=compute_target,\n",
|
|
||||||
" training_data=train_dataset,\n",
|
|
||||||
" label_column_name=target_column_name,\n",
|
|
||||||
" blocked_models=[\"LightGBM\", \"XGBoostClassifier\"],\n",
|
|
||||||
" **automl_settings,\n",
|
|
||||||
")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Submit AutoML Run"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"automl_run = experiment.submit(automl_config, show_output=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Displaying the run objects gives you links to the visual tools in the Azure Portal. Go try them!"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Retrieve the Best Model\n",
|
|
||||||
"Below we select the best model pipeline from our iterations, use it to test on test data on the same compute cluster."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"For local inferencing, you can load the model locally via. the method `remote_run.get_output()`. For more information on the arguments expected by this method, you can run `remote_run.get_output??`.\n",
|
|
||||||
"Note that when the model contains BERT, this step will require pytorch and pytorch-transformers installed in your local environment. The exact versions of these packages can be found in the **automl_env.yml** file located in the local copy of your azureml-examples folder here: \"azureml-examples/python-sdk/tutorials/automl-with-azureml\""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
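A minimal sketch of that local path (hypothetical usage: it assumes the local environment carries the model's dependencies, for example pytorch and pytorch-transformers when BERT was selected):

    # Load the best fitted pipeline into the local session and score locally.
    best_local_run, fitted_model = automl_run.get_output()
    local_predictions = fitted_model.predict(data_test[[feature_column_name]])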
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Retrieve the best Run object\n",
|
|
||||||
"best_run = automl_run.get_best_child()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"You can now see what text transformations are used to convert text data to features for this dataset, including deep learning transformations based on BiLSTM or Transformer (BERT is one implementation of a Transformer) models."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Download the featurization summary JSON file locally\n",
|
|
||||||
"best_run.download_file(\n",
|
|
||||||
" \"outputs/featurization_summary.json\", \"featurization_summary.json\"\n",
|
|
||||||
")\n",
|
|
||||||
"\n",
|
|
||||||
"# Render the JSON as a pandas DataFrame\n",
|
|
||||||
"with open(\"featurization_summary.json\", \"r\") as f:\n",
|
|
||||||
" records = json.load(f)\n",
|
|
||||||
"\n",
|
|
||||||
"featurization_summary = pd.DataFrame.from_records(records)\n",
|
|
||||||
"featurization_summary[\"Transformations\"].tolist()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Registering the best model\n",
|
|
||||||
"We now register the best fitted model from the AutoML Run for use in future deployments. "
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Get results stats, extract the best model from AutoML run, download and register the resultant best model"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"summary_df = get_result_df(automl_run)\n",
|
|
||||||
"best_dnn_run_id = summary_df[\"run_id\"].iloc[0]\n",
|
|
||||||
"best_dnn_run = Run(experiment, best_dnn_run_id)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"model_dir = \"Model\" # Local folder where the model will be stored temporarily\n",
|
|
||||||
"if not os.path.isdir(model_dir):\n",
|
|
||||||
" os.mkdir(model_dir)\n",
|
|
||||||
"\n",
|
|
||||||
"best_dnn_run.download_file(\"outputs/model.pkl\", model_dir + \"/model.pkl\")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Register the model in your Azure Machine Learning Workspace. If you previously registered a model, please make sure to delete it so as to replace it with this new model."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Register the model\n",
|
|
||||||
"model_name = \"textDNN-20News\"\n",
|
|
||||||
"model = Model.register(\n",
|
|
||||||
" model_path=model_dir + \"/model.pkl\", model_name=model_name, tags=None, workspace=ws\n",
|
|
||||||
")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Evaluate on Test Data"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"We now use the best fitted model from the AutoML Run to make predictions on the test set. \n",
|
|
||||||
"\n",
|
|
||||||
"Test set schema should match that of the training set."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"test_dataset = Dataset.Tabular.from_delimited_files(\n",
|
|
||||||
" path=[(datastore, blobstore_datadir + \"/test_data.csv\")]\n",
|
|
||||||
")\n",
|
|
||||||
"\n",
|
|
||||||
"# preview the first 3 rows of the dataset\n",
|
|
||||||
"test_dataset.take(3).to_pandas_dataframe()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"test_experiment = Experiment(ws, experiment_name + \"_test\")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"script_folder = os.path.join(os.getcwd(), \"inference\")\n",
|
|
||||||
"os.makedirs(script_folder, exist_ok=True)\n",
|
|
||||||
"shutil.copy(\"infer.py\", script_folder)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"test_run = run_inference(\n",
|
|
||||||
" test_experiment,\n",
|
|
||||||
" compute_target,\n",
|
|
||||||
" script_folder,\n",
|
|
||||||
" best_dnn_run,\n",
|
|
||||||
" test_dataset,\n",
|
|
||||||
" target_column_name,\n",
|
|
||||||
" model_name,\n",
|
|
||||||
")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Display computed metrics"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"test_run"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"RunDetails(test_run).show()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"test_run.wait_for_completion()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"pd.Series(test_run.get_metrics())"
|
|
||||||
]
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "anshirga"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"compute": [
|
|
||||||
"AML Compute"
|
|
||||||
],
|
|
||||||
"datasets": [
|
|
||||||
"None"
|
|
||||||
],
|
|
||||||
"deployment": [
|
|
||||||
"None"
|
|
||||||
],
|
|
||||||
"exclude_from_index": false,
|
|
||||||
"framework": [
|
|
||||||
"None"
|
|
||||||
],
|
|
||||||
"friendly_name": "DNN Text Featurization",
|
|
||||||
"index_order": 2,
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "python",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"codemirror_mode": {
|
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.6.7"
|
|
||||||
},
|
|
||||||
"tags": [
|
|
||||||
"None"
|
|
||||||
],
|
|
||||||
"task": "Text featurization using DNNs for classification"
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 2
|
|
||||||
}
|
|
||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-classification-text-dnn
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -1,68 +0,0 @@
|
|||||||
import pandas as pd
|
|
||||||
from azureml.core import Environment
|
|
||||||
from azureml.train.estimator import Estimator
|
|
||||||
from azureml.core.run import Run
|
|
||||||
|
|
||||||
|
|
||||||
def run_inference(
|
|
||||||
test_experiment,
|
|
||||||
compute_target,
|
|
||||||
script_folder,
|
|
||||||
train_run,
|
|
||||||
test_dataset,
|
|
||||||
target_column_name,
|
|
||||||
model_name,
|
|
||||||
):
|
|
||||||
|
|
||||||
inference_env = train_run.get_environment()
|
|
||||||
|
|
||||||
est = Estimator(
|
|
||||||
source_directory=script_folder,
|
|
||||||
entry_script="infer.py",
|
|
||||||
script_params={
|
|
||||||
"--target_column_name": target_column_name,
|
|
||||||
"--model_name": model_name,
|
|
||||||
},
|
|
||||||
inputs=[test_dataset.as_named_input("test_data")],
|
|
||||||
compute_target=compute_target,
|
|
||||||
environment_definition=inference_env,
|
|
||||||
)
|
|
||||||
|
|
||||||
run = test_experiment.submit(
|
|
||||||
est,
|
|
||||||
tags={
|
|
||||||
"training_run_id": train_run.id,
|
|
||||||
"run_algorithm": train_run.properties["run_algorithm"],
|
|
||||||
"valid_score": train_run.properties["score"],
|
|
||||||
"primary_metric": train_run.properties["primary_metric"],
|
|
||||||
},
|
|
||||||
)
|
|
||||||
|
|
||||||
run.log("run_algorithm", run.tags["run_algorithm"])
|
|
||||||
return run
|
|
||||||
|
|
||||||
|
|
||||||
def get_result_df(remote_run):
|
|
||||||
|
|
||||||
children = list(remote_run.get_children(recursive=True))
|
|
||||||
summary_df = pd.DataFrame(
|
|
||||||
index=["run_id", "run_algorithm", "primary_metric", "Score"]
|
|
||||||
)
|
|
||||||
goal_minimize = False
|
|
||||||
for run in children:
|
|
||||||
if "run_algorithm" in run.properties and "score" in run.properties:
|
|
||||||
summary_df[run.id] = [
|
|
||||||
run.id,
|
|
||||||
run.properties["run_algorithm"],
|
|
||||||
run.properties["primary_metric"],
|
|
||||||
float(run.properties["score"]),
|
|
||||||
]
|
|
||||||
if "goal" in run.properties:
|
|
||||||
goal_minimize = run.properties["goal"].split("_")[-1] == "min"
|
|
||||||
|
|
||||||
summary_df = summary_df.T.sort_values(
|
|
||||||
"Score", ascending=goal_minimize
|
|
||||||
).drop_duplicates(["run_algorithm"])
|
|
||||||
summary_df = summary_df.set_index("run_algorithm")
|
|
||||||
|
|
||||||
return summary_df
|
|
||||||
@@ -1,68 +0,0 @@
|
|||||||
import argparse
|
|
||||||
|
|
||||||
import pandas as pd
|
|
||||||
import numpy as np
|
|
||||||
|
|
||||||
from sklearn.externals import joblib
|
|
||||||
|
|
||||||
from azureml.automl.runtime.shared.score import scoring, constants
|
|
||||||
from azureml.core import Run
|
|
||||||
from azureml.core.model import Model
|
|
||||||
|
|
||||||
|
|
||||||
parser = argparse.ArgumentParser()
|
|
||||||
parser.add_argument(
|
|
||||||
"--target_column_name",
|
|
||||||
type=str,
|
|
||||||
dest="target_column_name",
|
|
||||||
help="Target Column Name",
|
|
||||||
)
|
|
||||||
parser.add_argument(
|
|
||||||
"--model_name", type=str, dest="model_name", help="Name of registered model"
|
|
||||||
)
|
|
||||||
|
|
||||||
args = parser.parse_args()
|
|
||||||
target_column_name = args.target_column_name
|
|
||||||
model_name = args.model_name
|
|
||||||
|
|
||||||
print("args passed are: ")
|
|
||||||
print("Target column name: ", target_column_name)
|
|
||||||
print("Name of registered model: ", model_name)
|
|
||||||
|
|
||||||
model_path = Model.get_model_path(model_name)
|
|
||||||
# deserialize the model file back into a sklearn model
|
|
||||||
model = joblib.load(model_path)
|
|
||||||
|
|
||||||
run = Run.get_context()
|
|
||||||
# get input dataset by name
|
|
||||||
test_dataset = run.input_datasets["test_data"]
|
|
||||||
|
|
||||||
X_test_df = test_dataset.drop_columns(
|
|
||||||
columns=[target_column_name]
|
|
||||||
).to_pandas_dataframe()
|
|
||||||
y_test_df = (
|
|
||||||
test_dataset.with_timestamp_columns(None)
|
|
||||||
.keep_columns(columns=[target_column_name])
|
|
||||||
.to_pandas_dataframe()
|
|
||||||
)
|
|
||||||
|
|
||||||
predicted = model.predict_proba(X_test_df)
|
|
||||||
|
|
||||||
if isinstance(predicted, pd.DataFrame):
|
|
||||||
predicted = predicted.values
|
|
||||||
|
|
||||||
# Use the AutoML scoring module
|
|
||||||
train_labels = model.classes_
|
|
||||||
class_labels = np.unique(
|
|
||||||
np.concatenate((y_test_df.values, np.reshape(train_labels, (-1, 1))))
|
|
||||||
)
|
|
||||||
classification_metrics = list(constants.CLASSIFICATION_SCALAR_SET)
|
|
||||||
scores = scoring.score_classification(
|
|
||||||
y_test_df.values, predicted, classification_metrics, class_labels, train_labels
|
|
||||||
)
|
|
||||||
|
|
||||||
print("scores:")
|
|
||||||
print(scores)
|
|
||||||
|
|
||||||
for key, value in scores.items():
|
|
||||||
run.log(key, value)
|
|
||||||
@@ -1,5 +1,21 @@
|
|||||||
{
|
{
|
||||||
"cells": [
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
||||||
|
"\n",
|
||||||
|
"Licensed under the MIT License."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -102,6 +118,7 @@
|
|||||||
"output[\"Resource Group\"] = ws.resource_group\n",
|
"output[\"Resource Group\"] = ws.resource_group\n",
|
||||||
"output[\"Location\"] = ws.location\n",
|
"output[\"Location\"] = ws.location\n",
|
||||||
"output[\"Run History Name\"] = experiment_name\n",
|
"output[\"Run History Name\"] = experiment_name\n",
|
||||||
|
"output[\"SDK Version\"] = azureml.core.VERSION\n",
|
||||||
"pd.set_option(\"display.max_colwidth\", None)\n",
|
"pd.set_option(\"display.max_colwidth\", None)\n",
|
||||||
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
||||||
"outputDf.T"
|
"outputDf.T"
|
||||||
@@ -178,7 +195,7 @@
|
|||||||
" \"azureml-opendatasets\",\n",
|
" \"azureml-opendatasets\",\n",
|
||||||
" \"azureml-defaults\",\n",
|
" \"azureml-defaults\",\n",
|
||||||
" ],\n",
|
" ],\n",
|
||||||
" conda_packages=[\"numpy==1.16.2\"],\n",
|
" conda_packages=[\"numpy==1.19.5\"],\n",
|
||||||
" pin_sdk_version=False,\n",
|
" pin_sdk_version=False,\n",
|
||||||
")\n",
|
")\n",
|
||||||
"conda_run_config.environment.python.conda_dependencies = cd\n",
|
"conda_run_config.environment.python.conda_dependencies = cd\n",
|
||||||
@@ -563,9 +580,9 @@
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
|
|||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-continuous-retraining
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -31,12 +31,15 @@ try:
|
|||||||
model = Model(ws, args.model_name)
|
model = Model(ws, args.model_name)
|
||||||
last_train_time = model.created_time
|
last_train_time = model.created_time
|
||||||
print("Model was last trained on {0}.".format(last_train_time))
|
print("Model was last trained on {0}.".format(last_train_time))
|
||||||
except Exception as e:
|
except Exception:
|
||||||
print("Could not get last model train time.")
|
print("Could not get last model train time.")
|
||||||
last_train_time = datetime.min.replace(tzinfo=pytz.UTC)
|
last_train_time = datetime.min.replace(tzinfo=pytz.UTC)
|
||||||
|
|
||||||
train_ds = Dataset.get_by_name(ws, args.ds_name)
|
train_ds = Dataset.get_by_name(ws, args.ds_name)
|
||||||
dataset_changed_time = train_ds.data_changed_time
|
dataset_changed_time = train_ds.data_changed_time.replace(tzinfo=pytz.UTC)
|
||||||
|
|
||||||
|
print("dataset_changed_time=" + str(dataset_changed_time))
|
||||||
|
print("last_train_time=" + str(last_train_time))
|
||||||
|
|
||||||
if not dataset_changed_time > last_train_time:
|
if not dataset_changed_time > last_train_time:
|
||||||
print("Cancelling run since there is no new data.")
|
print("Cancelling run since there is no new data.")
|
||||||
|
|||||||
@@ -120,9 +120,13 @@ except Exception:
 end_time = datetime(2021, 5, 1, 0, 0)
 end_time_last_slice = end_time - relativedelta(weeks=2)

-train_df = get_noaa_data(end_time_last_slice, end_time)
+try:
+    train_df = get_noaa_data(end_time_last_slice, end_time)
+except Exception as ex:
+    print("get_noaa_data failed:", ex)
+    train_df = None

-if train_df.size > 0:
+if train_df is not None and train_df.size > 0:
     print(
         "Received {0} rows of new data after {1}.".format(
             train_df.shape[0], end_time_last_slice
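The reordered condition works because `and` short-circuits: now that the `except` branch can leave `train_df` as `None`, the `None` check has to run before any attribute access. A tiny illustration:

```python
train_df = None  # what the new except-branch produces when the fetch fails

# `train_df.size` alone would raise AttributeError on None; the guard
# short-circuits before the attribute access is ever attempted.
if train_df is not None and train_df.size > 0:
    print("new data available")
else:
    print("no new data, skipping retraining")
```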
@@ -9,7 +9,7 @@ To run these notebook on your own notebook server, use these installation instru
 The instructions below will install everything you need and then start a Jupyter notebook.
 If you would like to use a lighter-weight version of the client that does not install all of the machine learning libraries locally, you can leverage the [experimental notebooks.](experimental/README.md)

-### 1. Install mini-conda from [here](https://conda.io/miniconda.html), choose 64-bit Python 3.7 or higher.
+### 1. Install mini-conda from [here](https://conda.io/miniconda.html), choose 64-bit Python 3.8 or higher.
 - **Note**: if you already have conda installed, you can keep using it but it should be version 4.4.10 or later (as shown by: conda -V). If you have a previous version installed, you can update it using the command: conda update conda.
 There's no need to install mini-conda specifically.

@@ -0,0 +1,346 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Copyright (c) Microsoft Corporation. All rights reserved.\n",
+    "\n",
+    "Licensed under the MIT License."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    ""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Automated Machine Learning - Codegen for AutoFeaturization \n",
+    "_**Autofeaturization of credit card fraudulent transactions dataset on remote compute and codegen functionality**_\n",
+    "\n",
+    "## Contents\n",
+    "1. [Introduction](#Introduction)\n",
+    "1. [Setup](#Setup)\n",
+    "1. [Data](#Data)\n",
+    "1. [Autofeaturization](#Autofeaturization)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='Introduction'></a>\n",
+    "## Introduction"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Autofeaturization** lets you run an AutoML experiment to only featurize the datasets. These datasets along with the transformer are stored in AML Storage and linked to the run which can later be retrieved and used to train models. \n",
+    "\n",
+    "**To run Autofeaturization, set the number of iterations to zero and featurization as auto.**\n",
+    "\n",
+    "Please refer to [Autofeaturization and custom model training](../autofeaturization-custom-model-training/custom-model-training-from-autofeaturization-run.ipynb) for more details on the same.\n",
+    "\n",
+    "[Codegen](https://github.com/Azure/automl-codegen-preview) is a feature, which when enabled, provides a user with the script of the underlying functionality and a notebook to tweak inputs or code and rerun the same.\n",
+    "\n",
+    "In this example we use the credit card fraudulent transactions dataset to showcase how you can use AutoML for autofeaturization and further how you can enable the `Codegen` feature.\n",
+    "\n",
+    "This notebook is using remote compute to complete the featurization.\n",
+    "\n",
+    "If you are using an Azure Machine Learning Compute Instance, you are all set. Otherwise, go through the [configuration](../../configuration.ipynb) notebook first if you haven't already, to establish your connection to the AzureML Workspace. \n",
+    "\n",
+    "Here you will learn how to create an autofeaturization experiment using an existing workspace with codegen feature enabled."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='Setup'></a>\n",
+    "## Setup\n",
+    "\n",
+    "As part of the setup you have already created an Azure ML `Workspace` object. For Automated ML you will need to create an `Experiment` object, which is a named object in a `Workspace` used to run experiments."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import logging\n",
+    "import pandas as pd\n",
+    "import azureml.core\n",
+    "from azureml.core.experiment import Experiment\n",
+    "from azureml.core.workspace import Workspace\n",
+    "from azureml.core.dataset import Dataset\n",
+    "from azureml.train.automl import AutoMLConfig"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This sample notebook may use features that are not available in previous versions of the Azure ML SDK."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(\"This notebook was created using version 1.59.0 of the Azure ML SDK\")\n",
+    "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ws = Workspace.from_config()\n",
+    "\n",
+    "# choose a name for experiment\n",
+    "experiment_name = 'automl-autofeaturization-ccard-codegen-remote'\n",
+    "\n",
+    "experiment=Experiment(ws, experiment_name)\n",
+    "\n",
+    "output = {}\n",
+    "output['Subscription ID'] = ws.subscription_id\n",
+    "output['Workspace'] = ws.name\n",
+    "output['Resource Group'] = ws.resource_group\n",
+    "output['Location'] = ws.location\n",
+    "output['Experiment Name'] = experiment.name\n",
+    "outputDf = pd.DataFrame(data = output, index = [''])\n",
+    "outputDf.T"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Create or Attach existing AmlCompute\n",
+    "A compute target is required to execute the Automated ML run. In this tutorial, you create AmlCompute as your training compute resource.\n",
+    "\n",
+    "> Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.\n",
+    "\n",
+    "#### Creation of AmlCompute takes approximately 5 minutes. \n",
+    "If the AmlCompute with that name is already in your workspace this code will skip the creation process.\n",
+    "As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from azureml.core.compute import ComputeTarget, AmlCompute\n",
+    "from azureml.core.compute_target import ComputeTargetException\n",
+    "\n",
+    "# Choose a name for your CPU cluster\n",
+    "cpu_cluster_name = \"cpu-codegen\"\n",
+    "\n",
+    "# Verify that cluster does not exist already\n",
+    "try:\n",
+    "    compute_target = ComputeTarget(workspace=ws, name=cpu_cluster_name)\n",
+    "    print('Found existing cluster, use it.')\n",
+    "except ComputeTargetException:\n",
+    "    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS12_V2',\n",
+    "                                                           max_nodes=6)\n",
+    "    compute_target = ComputeTarget.create(ws, cpu_cluster_name, compute_config)\n",
+    "\n",
+    "compute_target.wait_for_completion(show_output=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='Data'></a>\n",
+    "## Data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Load Data\n",
+    "\n",
+    "Load the credit card fraudulent transactions dataset from a CSV file, containing both training features and labels. The features are inputs to the model, while the training labels represent the expected output of the model. \n",
+    "\n",
+    "Here the autofeaturization run will featurize the training data passed in."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### Training Dataset"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "training_data = \"https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/creditcard_train.csv\"\n",
+    "training_dataset = Dataset.Tabular.from_delimited_files(training_data) # Tabular dataset\n",
+    "\n",
+    "label_column_name = 'Class' # output label"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='Autofeaturization'></a>\n",
+    "## AutoFeaturization\n",
+    "\n",
+    "Instantiate an AutoMLConfig object. This defines the settings and data used to run the autofeaturization experiment.\n",
+    "\n",
+    "|Property|Description|\n",
+    "|-|-|\n",
+    "|**task**|classification or regression or forecasting|\n",
+    "|**training_data**|Input training dataset, containing both features and label column.|\n",
+    "|**iterations**|For an autofeaturization run, iterations will be 0.|\n",
+    "|**featurization**|For an autofeaturization run, featurization can be 'auto' or 'custom'.|\n",
+    "|**label_column_name**|The name of the label column.|\n",
+    "|**enable_code_generation**|For enabling codegen for the run, value would be True|\n",
+    "\n",
+    "**_You can find more information about primary metrics_** [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-auto-train#primary-metric)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "automl_config = AutoMLConfig(task = 'classification',\n",
+    "                             debug_log = 'automl_errors.log',\n",
+    "                             iterations = 0, # autofeaturization run can be triggered by setting iterations to 0\n",
+    "                             compute_target = compute_target,\n",
+    "                             training_data = training_dataset,\n",
+    "                             label_column_name = label_column_name,\n",
+    "                             featurization = 'auto',\n",
+    "                             verbosity = logging.INFO,\n",
+    "                             enable_code_generation = True # enable codegen\n",
+    "                             )"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Call the `submit` method on the experiment object and pass the run configuration. Depending on the data this can run for a while. Validation errors and current status will be shown when setting `show_output=True` and the execution will be synchronous."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "remote_run = experiment.submit(automl_config, show_output = False)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Results"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Widget for Monitoring Runs\n",
+    "\n",
+    "The widget will first report a \"loading\" status while running the first iteration. After completing the first iteration, an auto-updating graph and table will be shown. The widget will refresh once per minute, so you should see the graph update as child runs complete.\n",
+    "\n",
+    "**Note:** The widget displays a link at the bottom. Use this link to open a web interface to explore the individual run details"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from azureml.widgets import RunDetails\n",
+    "RunDetails(remote_run).show()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "remote_run.wait_for_completion(show_output=False)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Codegen Script and Notebook"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Codegen script and notebook can be found under the `Outputs + logs` section from the details page of the remote run. Please check for the `autofeaturization_notebook.ipynb` under `/outputs/generated_code`. To modify the featurization code, open `script.py` and make changes. The codegen notebook can be run with the same environment configuration as the above AutoML run."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Experiment Complete!"
+   ]
+  }
+ ],
+ "metadata": {
+  "authors": [
+   {
+    "name": "bhavanatumma"
+   }
+  ],
+  "interpreter": {
+   "hash": "adb464b67752e4577e3dc163235ced27038d19b7d88def00d75d1975bde5d9ab"
+  },
+  "kernelspec": {
+   "display_name": "Python 3.8 - AzureML",
+   "language": "python",
+   "name": "python38-azureml"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
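For readers who prefer to script the retrieval instead of browsing `Outputs + logs`, here is a minimal sketch of pulling the codegen artifacts down with the run's download API; the `outputs/generated_code` prefix follows the location named in the notebook above, and the local folder name is an arbitrary choice:

```python
# Sketch: download the generated script and notebook from the completed run.
# Assumes `remote_run` is the finished AutoML parent run from the cells above.
generated_code_prefix = "outputs/generated_code"  # path named in the notebook
local_dir = "./generated_code"                    # arbitrary local target

remote_run.download_files(prefix=generated_code_prefix,
                          output_directory=local_dir,
                          batch_size=500)

# script.py holds the featurization logic to edit;
# autofeaturization_notebook.ipynb re-runs it end to end.
```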
@@ -0,0 +1,729 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Copyright (c) Microsoft Corporation. All rights reserved.\n",
+    "\n",
+    "Licensed under the MIT License."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    ""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Automated Machine Learning - AutoFeaturization (Part 1)\n",
+    "_**Autofeaturization of credit card fraudulent transactions dataset on remote compute**_\n",
+    "\n",
+    "## Contents\n",
+    "1. [Introduction](#Introduction)\n",
+    "1. [Setup](#Setup)\n",
+    "1. [Data](#Data)\n",
+    "1. [Autofeaturization](#Autofeaturization)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='Introduction'></a>\n",
+    "## Introduction"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Autofeaturization is a new feature to let you as the user run an AutoML experiment to only featurize the datasets. These datasets along with the transformer will be stored in the experiment which can later be retrieved and used to train models, either via AutoML or custom training. \n",
+    "\n",
+    "**To run Autofeaturization, pass in zero iterations and featurization as auto. This will featurize the datasets and terminate the experiment. Training will not occur.**\n",
+    "\n",
+    "*Limitations - Sparse data cannot be supported at the moment. Any dataset that has extensive categorical data might be featurized into sparse data which will not be allowed as input to AutoML. Efforts are underway to support sparse data and will be updated soon.* \n",
+    "\n",
+    "In this example we use the credit card fraudulent transactions dataset to showcase how you can use AutoML for autofeaturization. The goal is to clean and featurize the training dataset.\n",
+    "\n",
+    "This notebook is using remote compute to complete the featurization.\n",
+    "\n",
+    "If you are using an Azure Machine Learning Compute Instance, you are all set. Otherwise, go through the [configuration](../../configuration.ipynb) notebook first if you haven't already, to establish your connection to the AzureML Workspace. \n",
+    "\n",
+    "In the below steps, you will learn how to:\n",
+    "1. Create an autofeaturization experiment using an existing workspace.\n",
+    "2. View the featurized datasets and transformer"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='Setup'></a>\n",
+    "## Setup\n",
+    "\n",
+    "As part of the setup you have already created an Azure ML `Workspace` object. For Automated ML you will need to create an `Experiment` object, which is a named object in a `Workspace` used to run experiments."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import logging\n",
+    "import pandas as pd\n",
+    "import azureml.core\n",
+    "from azureml.core.experiment import Experiment\n",
+    "from azureml.core.workspace import Workspace\n",
+    "from azureml.core.dataset import Dataset\n",
+    "from azureml.train.automl import AutoMLConfig"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This sample notebook may use features that are not available in previous versions of the Azure ML SDK."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(\"This notebook was created using version 1.59.0 of the Azure ML SDK\")\n",
+    "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ws = Workspace.from_config()\n",
+    "\n",
+    "# choose a name for experiment\n",
+    "experiment_name = 'automl-autofeaturization-ccard-remote'\n",
+    "\n",
+    "experiment=Experiment(ws, experiment_name)\n",
+    "\n",
+    "output = {}\n",
+    "output['Subscription ID'] = ws.subscription_id\n",
+    "output['Workspace'] = ws.name\n",
+    "output['Resource Group'] = ws.resource_group\n",
+    "output['Location'] = ws.location\n",
+    "output['Experiment Name'] = experiment.name\n",
+    "outputDf = pd.DataFrame(data = output, index = [''])\n",
+    "outputDf.T"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Create or Attach existing AmlCompute\n",
+    "A compute target is required to execute the Automated ML run. In this tutorial, you create AmlCompute as your training compute resource.\n",
+    "\n",
+    "> Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.\n",
+    "\n",
+    "#### Creation of AmlCompute takes approximately 5 minutes. \n",
+    "If the AmlCompute with that name is already in your workspace this code will skip the creation process.\n",
+    "As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from azureml.core.compute import ComputeTarget, AmlCompute\n",
+    "from azureml.core.compute_target import ComputeTargetException\n",
+    "\n",
+    "# Choose a name for your CPU cluster\n",
+    "cpu_cluster_name = \"cpu-cluster\"\n",
+    "\n",
+    "# Verify that cluster does not exist already\n",
+    "try:\n",
+    "    compute_target = ComputeTarget(workspace=ws, name=cpu_cluster_name)\n",
+    "    print('Found existing cluster, use it.')\n",
+    "except ComputeTargetException:\n",
+    "    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS12_V2',\n",
+    "                                                           max_nodes=6)\n",
+    "    compute_target = ComputeTarget.create(ws, cpu_cluster_name, compute_config)\n",
+    "\n",
+    "compute_target.wait_for_completion(show_output=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='Data'></a>\n",
+    "## Data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Load Data\n",
+    "\n",
+    "Load the credit card fraudulent transactions dataset from a CSV file, containing both training features and labels. The features are inputs to the model, while the training labels represent the expected output of the model. \n",
+    "\n",
+    "Here the autofeaturization run will featurize the training data passed in."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "##### Training Dataset"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "training_data = \"https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/creditcard_train.csv\"\n",
+    "training_dataset = Dataset.Tabular.from_delimited_files(training_data) # Tabular dataset\n",
+    "\n",
+    "label_column_name = 'Class' # output label"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='Autofeaturization'></a>\n",
+    "## AutoFeaturization\n",
+    "\n",
+    "Instantiate an AutoMLConfig object. This defines the settings and data used to run the autofeaturization experiment.\n",
+    "\n",
+    "|Property|Description|\n",
+    "|-|-|\n",
+    "|**task**|classification or regression|\n",
+    "|**training_data**|Input training dataset, containing both features and label column.|\n",
+    "|**iterations**|For an autofeaturization run, iterations will be 0.|\n",
+    "|**featurization**|For an autofeaturization run, featurization will be 'auto'.|\n",
+    "|**label_column_name**|The name of the label column.|\n",
+    "\n",
+    "**_You can find more information about primary metrics_** [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-auto-train#primary-metric)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "automl_config = AutoMLConfig(task = 'classification',\n",
+    "                             debug_log = 'automl_errors.log',\n",
+    "                             iterations = 0, # autofeaturization run can be triggered by setting iterations to 0\n",
+    "                             compute_target = compute_target,\n",
+    "                             training_data = training_dataset,\n",
+    "                             label_column_name = label_column_name,\n",
+    "                             featurization = 'auto',\n",
+    "                             verbosity = logging.INFO\n",
+    "                             )"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Call the `submit` method on the experiment object and pass the run configuration. Depending on the data this can run for a while. Validation errors and current status will be shown when setting `show_output=True` and the execution will be synchronous."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "remote_run = experiment.submit(automl_config, show_output = False)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Transformer and Featurized Datasets\n",
+    "The given datasets have been featurized and stored under `Outputs + logs` from the details page of the remote run. The structure is shown below. The featurized dataset is stored under `/outputs/featurization/data` and the transformer is saved under `/outputs/featurization/pipeline` \n",
+    "\n",
+    "Below you will learn how to refer to the data saved in your run and retrieve the same."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    ""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Results"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Widget for Monitoring Runs\n",
+    "\n",
+    "The widget will first report a \"loading\" status while running the first iteration. After completing the first iteration, an auto-updating graph and table will be shown. The widget will refresh once per minute, so you should see the graph update as child runs complete.\n",
+    "\n",
+    "**Note:** The widget displays a link at the bottom. Use this link to open a web interface to explore the individual run details"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from azureml.widgets import RunDetails\n",
+    "RunDetails(remote_run).show()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "remote_run.wait_for_completion(show_output=False)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Automated Machine Learning - AutoFeaturization (Part 2)\n",
+    "_**Training using a custom model with the featurized data from Autofeaturization run of credit card fraudulent transactions dataset**_\n",
+    "\n",
+    "## Contents\n",
+    "1. [Introduction](#Introduction)\n",
+    "1. [Data Setup](#DataSetup)\n",
+    "1. [Autofeaturization Data](#AutofeaturizationData)\n",
+    "1. [Train](#Train)\n",
+    "1. [Results](#Results)\n",
+    "1. [Test](#Test)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='Introduction'></a>\n",
+    "## Introduction\n",
+    "\n",
+    "Here we use the featurized dataset saved in the above run to showcase how you can perform custom training by using the transformer from an autofeaturization run to transform validation / test datasets. \n",
+    "\n",
+    "The goal is to use autofeaturized run data and transformer to transform and run a custom training experiment independently\n",
+    "\n",
+    "In the below steps, you will learn how to:\n",
+    "1. Read transformer from a completed autofeaturization run and transform data\n",
+    "2. Pull featurized data from a completed autofeaturization run\n",
+    "3. Run a custom training experiment with the above data\n",
+    "4. Check results"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='DataSetup'></a>\n",
+    "## Data Setup"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We will load the featurized training data and also load the transformer from the above autofeaturized run. This transformer can then be used to transform the test data to check the accuracy of the custom model after training."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Load Test Data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "load test dataset from CSV and split into X and y columns to featurize with the transformer going forward."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "test_data = \"https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/creditcard_test.csv\"\n",
+    "\n",
+    "test_dataset = pd.read_csv(test_data)\n",
+    "label_column_name = 'Class'\n",
+    "\n",
+    "X_test_data = test_dataset[test_dataset.columns.difference([label_column_name])]\n",
+    "y_test_data = test_dataset[label_column_name].values\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Load data_transformer from the above remote run artifact"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### (Method 1)\n",
+    "\n",
+    "Method 1 allows you to read the transformer from the remote storage."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import mlflow\n",
+    "mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())\n",
+    "\n",
+    "# Set uri to fetch data transformer from remote parent run.\n",
+    "artifact_path = \"/outputs/featurization/pipeline/\"\n",
+    "uri = \"runs:/\" + remote_run.id + artifact_path\n",
+    "\n",
+    "print(uri)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### (Method 2)\n",
+    "\n",
+    "Method 2 downloads the transformer to the local directory and then can be used to transform the data. Uncomment to use."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "''' import pathlib\n",
+    "\n",
+    "# Download the transformer to the local directory\n",
+    "transformers_file_path = \"/outputs/featurization/pipeline/\"\n",
+    "local_path = \"./transformer\"\n",
+    "remote_run.download_files(prefix=transformers_file_path, output_directory=local_path, batch_size=500)\n",
+    "\n",
+    "path = pathlib.Path(\"transformer\") \n",
+    "path = str(path.absolute()) + transformers_file_path\n",
+    "str_uri = \"file:///\" + path\n",
+    "\n",
+    "print(str_uri) '''"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Transform Data"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Note:** Not all datasets produce a y_transformer. The dataset used in the current notebook requires a transformer as the y column data is categorical. \n",
+    "\n",
+    "We will go ahead and download the mlflow transformer model and use it to transform test data that can be used for further experimentation below. To run the commented code, make sure the environment requirement is satisfied. You can go ahead and create the environment from the `conda.yaml` file under `/outputs/featurization/pipeline/` and run the given code in it."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "''' from azureml.automl.core.shared.constants import Transformers\n",
+    "\n",
+    "transformers = mlflow.sklearn.load_model(uri) # Using method 1\n",
+    "data_transformers = transformers.get_transformers()\n",
+    "x_transformer = data_transformers[Transformers.X_TRANSFORMER]\n",
+    "y_transformer = data_transformers[Transformers.Y_TRANSFORMER]\n",
+    "\n",
+    "X_test = x_transformer.transform(X_test_data)\n",
+    "y_test = y_transformer.transform(y_test_data) '''"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Run the following cell to see the featurization summary of X and y transformers. Uncomment to use. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "''' X_data_summary = x_transformer.get_featurization_summary(is_user_friendly=False)\n",
+    "\n",
+    "summary_df = pd.DataFrame.from_records(X_data_summary)\n",
+    "summary_df '''"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Load Datastore\n",
+    "\n",
+    "The below data store holds the featurized datasets, hence we load and access the data. Check the path and file names according to the saved structure in your experiment `Outputs + logs` as seen in <i>Autofeaturization Part 1</i>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from azureml.core.datastore import Datastore\n",
+    "\n",
+    "ds = Datastore.get(ws, \"workspaceartifactstore\")\n",
+    "experiment_loc = \"ExperimentRun/dcid.\" + remote_run.id\n",
+    "\n",
+    "remote_data_path = \"/outputs/featurization/data/\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='AutofeaturizationData'></a>\n",
+    "## Autofeaturization Data\n",
+    "\n",
+    "We will load the training data from the previously completed Autofeaturization experiment. The resulting featurized dataframe can be passed into the custom model for training. Here we are saving the file to local from the experiment storage and reading the data."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "train_data_file_path = \"full_training_dataset.df.parquet\"\n",
+    "local_data_path = \"./data/\" + train_data_file_path\n",
+    "\n",
+    "remote_run.download_file(remote_data_path + train_data_file_path, local_data_path)\n",
+    "\n",
+    "full_training_data = pd.read_parquet(local_data_path)"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Another way to load the data is to go to the above autofeaturization experiment and check for the featurized dataset ids under `Output datasets`. Uncomment and replace them accordingly below, to use."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# train_data = Dataset.get_by_id(ws, 'cb4418ee-bac4-45ac-b055-600653bdf83a') # replace the featurized full_training_dataset id\n",
+    "# full_training_data = train_data.to_pandas_dataframe()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Training Data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We are dropping the y column and weights column from the featurized training dataset."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "Y_COLUMN = \"automl_y\"\n",
+    "SW_COLUMN = \"automl_weights\"\n",
+    "\n",
+    "X_train = full_training_data[full_training_data.columns.difference([Y_COLUMN, SW_COLUMN])]\n",
+    "y_train = full_training_data[Y_COLUMN].values\n",
+    "sample_weight = full_training_data[SW_COLUMN].values"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='Train'></a>\n",
+    "## Train"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Here we are passing our training data to the lightgbm classifier, any custom model can be used with your data. Let us first install lightgbm."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "! pip install lightgbm"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import lightgbm as lgb\n",
+    "\n",
+    "model = lgb.LGBMClassifier(learning_rate=0.08,max_depth=-5,random_state=42)\n",
+    "model.fit(X_train, y_train, sample_weight=sample_weight)"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Once training is done, the test data obtained after transforming from the above downloaded transformer can be used to calculate the accuracy "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print('Training accuracy {:.4f}'.format(model.score(X_train, y_train)))\n",
+    "\n",
+    "# Uncomment below to test the model on test data \n",
+    "# print('Testing accuracy {:.4f}'.format(model.score(X_test, y_test)))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='Results'></a>\n",
+    "## Analyze results\n",
+    "\n",
+    "### Retrieve the Model"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "model"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a id='Test'></a>\n",
+    "## Test the fitted model\n",
+    "\n",
+    "Now that the model is trained, split the data in the same way the data was split for training (The difference here is the data is being split locally) and then run the test data through the trained model to get the predicted values."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Uncomment below to test the model on test data\n",
+    "# y_pred = model.predict(X_test)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Experiment Complete!"
+   ]
+  }
+ ],
+ "metadata": {
+  "authors": [
+   {
+    "name": "bhavanatumma"
+   }
+  ],
+  "interpreter": {
+   "hash": "adb464b67752e4577e3dc163235ced27038d19b7d88def00d75d1975bde5d9ab"
+  },
+  "kernelspec": {
+   "display_name": "Python 3.8 - AzureML",
+   "language": "python",
+   "name": "python38-azureml"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
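Once the commented transform cells above are enabled, evaluating the custom model on the held-out data is straightforward. A hedged sketch, assuming `x_transformer`, `y_transformer`, `X_test_data`, `y_test_data`, and `model` as defined in the notebook and scikit-learn-style transformer behavior:

```python
from sklearn.metrics import accuracy_score

# Apply the same transforms AutoML applied to the training data, so the
# lightgbm model sees a consistent feature space at test time.
X_test = x_transformer.transform(X_test_data)
y_test = y_transformer.transform(y_test_data)

y_pred = model.predict(X_test)
print("Testing accuracy {:.4f}".format(accuracy_score(y_test, y_pred)))

# The y column was categorical; LabelEncoder-style transformers can map
# encoded predictions back to the original labels (an assumption here):
original_labels = y_transformer.inverse_transform(y_pred)
```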
@@ -1,17 +1,15 @@
 name: azure_automl_experimental
 dependencies:
   # The python interpreter version.
-  # Currently Azure ML only supports 3.6.0 and later.
-- pip<=20.2.4
-- python>=3.6.0,<3.9
-- cython==0.29.14
-- urllib3==1.26.7
-- PyJWT < 2.0.0
-- numpy==1.18.5
+  # Currently Azure ML only supports 3.7.0 and later.
+- pip<=22.3.1
+- python>=3.7.0,<3.11

 - pip:
   # Required packages for AzureML execution, history, and data preparation.
   - azureml-defaults
   - azureml-sdk
   - azureml-widgets
+  - azureml-mlflow
   - pandas
+  - mlflow
@@ -4,17 +4,21 @@ channels:
 - main
 dependencies:
   # The python interpreter version.
-  # Currently Azure ML only supports 3.6.0 and later.
+  # Currently Azure ML only supports 3.7.0 and later.
 - pip<=20.2.4
 - nomkl
-- python>=3.6.0,<3.9
+- python>=3.7.0,<3.11
 - urllib3==1.26.7
 - PyJWT < 2.0.0
-- numpy==1.19.5
+- numpy>=1.21.6,<=1.22.3

 - pip:
   # Required packages for AzureML execution, history, and data preparation.
+  - azure-core==1.24.1
+  - azure-identity==1.7.0
   - azureml-defaults
   - azureml-sdk
   - azureml-widgets
+  - azureml-mlflow
   - pandas
+  - mlflow
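Both environment files now pin the interpreter to the 3.7-3.10 range; checking the running interpreter before installing the pip dependencies can save a failed setup. A minimal sketch:

```python
import sys

# Mirrors the python>=3.7.0,<3.11 constraint in the environment files above.
if not ((3, 7) <= sys.version_info[:2] < (3, 11)):
    raise RuntimeError(
        "These Azure ML environments expect Python 3.7-3.10, found "
        + ".".join(map(str, sys.version_info[:3]))
    )
print("Python version OK:", sys.version.split()[0])
```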
@@ -1,420 +0,0 @@
|
|||||||
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
|
||||||
"\n",
|
|
||||||
"Licensed under the MIT License."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Automated Machine Learning\n",
|
|
||||||
"_**Classification of credit card fraudulent transactions on local managed compute **_\n",
|
|
||||||
"\n",
|
|
||||||
"## Contents\n",
|
|
||||||
"1. [Introduction](#Introduction)\n",
|
|
||||||
"1. [Setup](#Setup)\n",
|
|
||||||
"1. [Train](#Train)\n",
|
|
||||||
"1. [Results](#Results)\n",
|
|
||||||
"1. [Test](#Test)\n",
|
|
||||||
"1. [Acknowledgements](#Acknowledgements)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Introduction\n",
|
|
||||||
"\n",
|
|
||||||
"In this example we use the associated credit card dataset to showcase how you can use AutoML for a simple classification problem. The goal is to predict if a credit card transaction is considered a fraudulent charge.\n",
|
|
||||||
"\n",
|
|
||||||
"This notebook is using local managed compute to train the model.\n",
|
|
||||||
"\n",
|
|
||||||
"If you are using an Azure Machine Learning Compute Instance, you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. \n",
|
|
||||||
"\n",
|
|
||||||
"In this notebook you will learn how to:\n",
|
|
||||||
"1. Create an experiment using an existing workspace.\n",
|
|
||||||
"2. Configure AutoML using `AutoMLConfig`.\n",
|
|
||||||
"3. Train the model using local managed compute.\n",
|
|
||||||
"4. Explore the results.\n",
|
|
||||||
"5. Test the fitted model."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Setup\n",
|
|
||||||
"\n",
|
|
||||||
"As part of the setup you have already created an Azure ML `Workspace` object. For Automated ML you will need to create an `Experiment` object, which is a named object in a `Workspace` used to run experiments."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import logging\n",
|
|
||||||
"\n",
|
|
||||||
"import pandas as pd\n",
|
|
||||||
"\n",
|
|
||||||
"import azureml.core\n",
|
|
||||||
"from azureml.core.compute_target import LocalTarget\n",
|
|
||||||
"from azureml.core.experiment import Experiment\n",
|
|
||||||
"from azureml.core.workspace import Workspace\n",
|
|
||||||
"from azureml.core.dataset import Dataset\n",
|
|
||||||
"from azureml.train.automl import AutoMLConfig"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"This sample notebook may use features that are not available in previous versions of the Azure ML SDK."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"print(\"This notebook was created using version 1.40.0 of the Azure ML SDK\")\n",
|
|
||||||
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"ws = Workspace.from_config()\n",
|
|
||||||
"\n",
|
|
||||||
"# choose a name for experiment\n",
|
|
||||||
"experiment_name = 'automl-local-managed'\n",
|
|
||||||
"\n",
|
|
||||||
"experiment=Experiment(ws, experiment_name)\n",
|
|
||||||
"\n",
|
|
||||||
"output = {}\n",
|
|
||||||
"output['Subscription ID'] = ws.subscription_id\n",
|
|
||||||
"output['Workspace'] = ws.name\n",
|
|
||||||
"output['Resource Group'] = ws.resource_group\n",
|
|
||||||
"output['Location'] = ws.location\n",
|
|
||||||
"output['Experiment Name'] = experiment.name\n",
|
|
||||||
"pd.set_option('display.max_colwidth', None)\n",
|
|
||||||
"outputDf = pd.DataFrame(data = output, index = [''])\n",
|
|
||||||
"outputDf.T"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Determine if local docker is configured for Linux images\n",
|
|
||||||
"\n",
|
|
||||||
"Local managed runs will leverage a Linux docker container to submit the run to. Due to this, the docker needs to be configured to use Linux containers."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Check if Docker is installed and Linux containers are enabled\n",
|
|
||||||
"import subprocess\n",
|
|
||||||
"from subprocess import CalledProcessError\n",
|
|
||||||
"try:\n",
|
|
||||||
" assert subprocess.run(\"docker -v\", shell=True).returncode == 0, 'Local Managed runs require docker to be installed.'\n",
|
|
||||||
" out = subprocess.check_output(\"docker system info\", shell=True).decode('ascii')\n",
|
|
||||||
" assert \"OSType: linux\" in out, 'Docker engine needs to be configured to use Linux containers.' \\\n",
|
|
||||||
" 'https://docs.docker.com/docker-for-windows/#switch-between-windows-and-linux-containers'\n",
|
|
||||||
"except CalledProcessError as ex:\n",
|
|
||||||
" raise Exception('Local Managed runs require docker to be installed.') from ex"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Data"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Load Data\n",
|
|
||||||
"\n",
|
|
||||||
"Load the credit card dataset from a csv file containing both training features and labels. The features are inputs to the model, while the training labels represent the expected output of the model. Next, we'll split the data using random_split and extract the training data for the model."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"data = \"https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/creditcard.csv\"\n",
|
|
||||||
"dataset = Dataset.Tabular.from_delimited_files(data)\n",
|
|
||||||
"training_data, validation_data = dataset.random_split(percentage=0.8, seed=223)\n",
|
|
||||||
"label_column_name = 'Class'"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Train\n",
|
|
||||||
"\n",
|
|
||||||
"Instantiate a AutoMLConfig object. This defines the settings and data used to run the experiment.\n",
|
|
||||||
"\n",
|
|
||||||
"|Property|Description|\n",
|
|
||||||
"|-|-|\n",
|
|
||||||
"|**task**|classification or regression|\n",
|
|
||||||
"|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics: <br><i>accuracy</i><br><i>AUC_weighted</i><br><i>average_precision_score_weighted</i><br><i>norm_macro_recall</i><br><i>precision_score_weighted</i>|\n",
|
|
||||||
"|**enable_early_stopping**|Stop the run if the metric score is not showing improvement.|\n",
|
|
||||||
"|**n_cross_validations**|Number of cross validation splits.|\n",
|
|
||||||
"|**training_data**|Input dataset, containing both features and label column.|\n",
|
|
||||||
"|**label_column_name**|The name of the label column.|\n",
|
|
||||||
"|**enable_local_managed**|Enable the experimental local-managed scenario.|\n",
|
|
||||||
"\n",
|
|
||||||
"**_You can find more information about primary metrics_** [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-auto-train#primary-metric)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"automl_settings = {\n",
|
|
||||||
" \"n_cross_validations\": 3,\n",
|
|
||||||
" \"primary_metric\": 'average_precision_score_weighted',\n",
|
|
||||||
" \"enable_early_stopping\": True,\n",
|
|
||||||
" \"experiment_timeout_hours\": 0.3, #for real scenarios we recommend a timeout of at least one hour \n",
|
|
||||||
" \"verbosity\": logging.INFO,\n",
|
|
||||||
"}\n",
|
|
||||||
"\n",
|
|
||||||
"automl_config = AutoMLConfig(task = 'classification',\n",
|
|
||||||
" debug_log = 'automl_errors.log',\n",
|
|
||||||
" compute_target = LocalTarget(),\n",
|
|
||||||
" enable_local_managed = True,\n",
|
|
||||||
" training_data = training_data,\n",
|
|
||||||
" label_column_name = label_column_name,\n",
|
|
||||||
" **automl_settings\n",
|
|
||||||
" )"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Call the `submit` method on the experiment object and pass the run configuration. Depending on the data and the number of iterations this can run for a while. Validation errors and current status will be shown when setting `show_output=True` and the execution will be synchronous."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"parent_run = experiment.submit(automl_config, show_output = True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# If you need to retrieve a run that already started, use the following code\n",
|
|
||||||
"#from azureml.train.automl.run import AutoMLRun\n",
|
|
||||||
"#parent_run = AutoMLRun(experiment = experiment, run_id = '<replace with your run id>')"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"parent_run"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Results"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Explain model\n",
|
|
||||||
"\n",
|
|
||||||
"Automated ML models can be explained and visualized using the SDK Explainability library. "
|
|
||||||
]
|
|
||||||
},
|
|
||||||
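{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is a minimal sketch of downloading the feature importances for the best run. It assumes the `azureml-interpret` package is installed and that automated ML uploaded a model explanation for the best child run; treat it as illustrative rather than authoritative."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch only: assumes azureml-interpret is installed and an explanation\n",
"# was uploaded for the best child run of this experiment.\n",
"from azureml.interpret import ExplanationClient\n",
"\n",
"client = ExplanationClient.from_run(parent_run.get_best_child())\n",
"explanation = client.download_model_explanation(raw=False)\n",
"print(explanation.get_feature_importance_dict())"
]
},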
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Analyze results\n",
|
|
||||||
"\n",
|
|
||||||
"### Retrieve the Best Child Run\n",
|
|
||||||
"\n",
|
|
||||||
"Below we select the best pipeline from our iterations. The `get_best_child` method returns the best run. Overloads on `get_best_child` allow you to retrieve the best run for *any* logged metric."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"best_run = parent_run.get_best_child()\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
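{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, a sketch of retrieving the best run for a specific logged metric via the `metric` argument (the metric name below is an assumption; use one that was actually logged for your task):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Retrieve the best run for a specific logged metric instead of the primary metric.\n",
"best_run_by_auc = parent_run.get_best_child(metric=\"AUC_weighted\")\n",
"print(best_run_by_auc.id)"
]
},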
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Test the fitted model\n",
|
|
||||||
"\n",
|
|
||||||
"Now that the model is trained, split the data in the same way the data was split for training (The difference here is the data is being split locally) and then run the test data through the trained model to get the predicted values."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"X_test_df = validation_data.drop_columns(columns=[label_column_name])\n",
|
|
||||||
"y_test_df = validation_data.keep_columns(columns=[label_column_name], validate=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Creating ModelProxy for submitting prediction runs to the training environment.\n",
|
|
||||||
"We will create a ModelProxy for the best child run, which will allow us to submit a run that does the prediction in the training environment. Unlike the local client, which can have different versions of some libraries, the training environment will have all the compatible libraries for the model already."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.train.automl.model_proxy import ModelProxy\n",
|
|
||||||
"best_model_proxy = ModelProxy(best_run)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# call the predict functions on the model proxy\n",
|
|
||||||
"y_pred = best_model_proxy.predict(X_test_df).to_pandas_dataframe()\n",
|
|
||||||
"y_pred"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
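{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick way to sanity-check the predictions locally is a confusion matrix. The sketch below assumes scikit-learn is available in the local environment and that `y_pred` holds a single column of predicted class labels:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: compare the predicted labels against the held-out labels.\n",
"from sklearn.metrics import confusion_matrix\n",
"\n",
"y_true = y_test_df.to_pandas_dataframe()[label_column_name].values\n",
"print(confusion_matrix(y_true, y_pred.values.ravel()))"
]
},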
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Acknowledgements"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"This Credit Card fraud Detection dataset is made available under the Open Database License: http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/ and is available at: https://www.kaggle.com/mlg-ulb/creditcardfraud\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"The dataset has been collected and analysed during a research collaboration of Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of ULB (Universit\u00c3\u0192\u00c2\u00a9 Libre de Bruxelles) on big data mining and fraud detection. More details on current and past projects on related topics are available on https://www.researchgate.net/project/Fraud-detection-5 and the page of the DefeatFraud project\n",
|
|
||||||
"Please cite the following works: \n",
|
|
||||||
"\u00c3\u00a2\u00e2\u201a\u00ac\u00c2\u00a2\tAndrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson and Gianluca Bontempi. Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015\n",
|
|
||||||
"\u00c3\u00a2\u00e2\u201a\u00ac\u00c2\u00a2\tDal Pozzolo, Andrea; Caelen, Olivier; Le Borgne, Yann-Ael; Waterschoot, Serge; Bontempi, Gianluca. Learned lessons in credit card fraud detection from a practitioner perspective, Expert systems with applications,41,10,4915-4928,2014, Pergamon\n",
|
|
||||||
"\u00c3\u00a2\u00e2\u201a\u00ac\u00c2\u00a2\tDal Pozzolo, Andrea; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca. Credit card fraud detection: a realistic modeling and a novel learning strategy, IEEE transactions on neural networks and learning systems,29,8,3784-3797,2018,IEEE\n",
|
|
||||||
"o\tDal Pozzolo, Andrea Adaptive Machine learning for credit card fraud detection ULB MLG PhD thesis (supervised by G. Bontempi)\n",
|
|
||||||
"\u00c3\u00a2\u00e2\u201a\u00ac\u00c2\u00a2\tCarcillo, Fabrizio; Dal Pozzolo, Andrea; Le Borgne, Yann-A\u00c3\u0192\u00c2\u00abl; Caelen, Olivier; Mazzer, Yannis; Bontempi, Gianluca. Scarff: a scalable framework for streaming credit card fraud detection with Spark, Information fusion,41, 182-194,2018,Elsevier\n",
|
|
||||||
"\u00c3\u00a2\u00e2\u201a\u00ac\u00c2\u00a2\tCarcillo, Fabrizio; Le Borgne, Yann-A\u00c3\u0192\u00c2\u00abl; Caelen, Olivier; Bontempi, Gianluca. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization, International Journal of Data Science and Analytics, 5,4,285-300,2018,Springer International Publishing"
|
|
||||||
]
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "sekrupa"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"category": "tutorial",
|
|
||||||
"compute": [
|
|
||||||
"AML Compute"
|
|
||||||
],
|
|
||||||
"datasets": [
|
|
||||||
"Creditcard"
|
|
||||||
],
|
|
||||||
"deployment": [
|
|
||||||
"None"
|
|
||||||
],
|
|
||||||
"exclude_from_index": false,
|
|
||||||
"file_extension": ".py",
|
|
||||||
"framework": [
|
|
||||||
"None"
|
|
||||||
],
|
|
||||||
"friendly_name": "Classification of credit card fraudulent transactions using Automated ML",
|
|
||||||
"index_order": 5,
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "python",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"codemirror_mode": {
|
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.6.7"
|
|
||||||
},
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"tags": [
|
|
||||||
"AutomatedML"
|
|
||||||
],
|
|
||||||
"task": "Classification",
|
|
||||||
"version": "3.6.7"
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 2
|
|
||||||
}
|
|
||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-classification-credit-card-fraud-local-managed
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -75,7 +75,6 @@
|
|||||||
"from azureml.core.experiment import Experiment\n",
|
"from azureml.core.experiment import Experiment\n",
|
||||||
"from azureml.core.workspace import Workspace\n",
|
"from azureml.core.workspace import Workspace\n",
|
||||||
"from azureml.core.dataset import Dataset\n",
|
"from azureml.core.dataset import Dataset\n",
|
||||||
"from azureml.data.dataset_factory import TabularDatasetFactory\n",
|
|
||||||
"from azureml.train.automl import AutoMLConfig"
|
"from azureml.train.automl import AutoMLConfig"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -92,7 +91,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"print(\"This notebook was created using version 1.40.0 of the Azure ML SDK\")\n",
|
"print(\"This notebook was created using version 1.59.0 of the Azure ML SDK\")\n",
|
||||||
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
|
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -197,10 +196,10 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"ds = ws.get_default_datastore()\n",
|
"ds = ws.get_default_datastore()\n",
|
||||||
"\n",
|
"\n",
|
||||||
"train_data = TabularDatasetFactory.register_pandas_dataframe(\n",
|
"train_data = Dataset.Tabular.register_pandas_dataframe(\n",
|
||||||
" train_data.to_pandas_dataframe(), target=(ds, \"machineTrainData\"), name=\"train_data\")\n",
|
" train_data.to_pandas_dataframe(), target=(ds, \"machineTrainData\"), name=\"train_data\")\n",
|
||||||
"\n",
|
"\n",
|
||||||
"test_data = TabularDatasetFactory.register_pandas_dataframe(\n",
|
"test_data = Dataset.Tabular.register_pandas_dataframe(\n",
|
||||||
" test_data.to_pandas_dataframe(), target=(ds, \"machineTestData\"), name=\"test_data\")"
|
" test_data.to_pandas_dataframe(), target=(ds, \"machineTestData\"), name=\"test_data\")"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -328,7 +327,8 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"#### Show hyperparameters\n",
|
"#### Show hyperparameters\n",
|
||||||
"Show the model pipeline used for the best run with its hyperparameters."
|
"Show the model pipeline used for the best run with its hyperparameters.\n",
|
||||||
|
"For ensemble pipelines it shows the iterations and algorithms that are ensembled."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -337,8 +337,19 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"run_properties = json.loads(best_run.get_details()['properties']['pipeline_script'])\n",
|
"run_properties = best_run.get_details()['properties']\n",
|
||||||
"print(json.dumps(run_properties, indent = 1)) "
|
"pipeline_script = json.loads(run_properties['pipeline_script'])\n",
|
||||||
|
"print(json.dumps(pipeline_script, indent = 1)) \n",
|
||||||
|
"\n",
|
||||||
|
"if 'ensembled_iterations' in run_properties:\n",
|
||||||
|
" print(\"\")\n",
|
||||||
|
" print(\"Ensembled Iterations\")\n",
|
||||||
|
" print(run_properties['ensembled_iterations'])\n",
|
||||||
|
" \n",
|
||||||
|
"if 'ensembled_algorithms' in run_properties:\n",
|
||||||
|
" print(\"\")\n",
|
||||||
|
" print(\"Ensembled Algorithms\")\n",
|
||||||
|
" print(run_properties['ensembled_algorithms'])"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -437,9 +448,9 @@
|
|||||||
"automated-machine-learning"
|
"automated-machine-learning"
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
|
|||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-regression-model-proxy
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -5,6 +5,7 @@ import json
|
|||||||
import os
|
import os
|
||||||
import re
|
import re
|
||||||
|
|
||||||
|
import numpy as np
|
||||||
import pandas as pd
|
import pandas as pd
|
||||||
|
|
||||||
from matplotlib import pyplot as plt
|
from matplotlib import pyplot as plt
|
||||||
@@ -121,7 +122,10 @@ def calculate_scores_and_build_plots(
|
|||||||
input_dir: str, output_dir: str, automl_settings: Dict[str, Any]
|
input_dir: str, output_dir: str, automl_settings: Dict[str, Any]
|
||||||
):
|
):
|
||||||
os.makedirs(output_dir, exist_ok=True)
|
os.makedirs(output_dir, exist_ok=True)
|
||||||
grains = automl_settings.get(constants.TimeSeries.GRAIN_COLUMN_NAMES)
|
grains = automl_settings.get(
|
||||||
|
constants.TimeSeries.TIME_SERIES_ID_COLUMN_NAMES,
|
||||||
|
automl_settings.get(constants.TimeSeries.GRAIN_COLUMN_NAMES, None),
|
||||||
|
)
|
||||||
time_column_name = automl_settings.get(constants.TimeSeries.TIME_COLUMN_NAME)
|
time_column_name = automl_settings.get(constants.TimeSeries.TIME_COLUMN_NAME)
|
||||||
if grains is None:
|
if grains is None:
|
||||||
grains = []
|
grains = []
|
||||||
@@ -146,6 +150,9 @@ def calculate_scores_and_build_plots(
|
|||||||
_draw_one_plot(one_forecast, time_column_name, grains, pdf)
|
_draw_one_plot(one_forecast, time_column_name, grains, pdf)
|
||||||
pdf.close()
|
pdf.close()
|
||||||
forecast_df.to_csv(os.path.join(output_dir, FORECASTS_FILE), index=False)
|
forecast_df.to_csv(os.path.join(output_dir, FORECASTS_FILE), index=False)
|
||||||
|
# Remove np.NaN and np.inf from the prediction and actuals data.
|
||||||
|
forecast_df.replace([np.inf, -np.inf], np.nan, inplace=True)
|
||||||
|
forecast_df.dropna(subset=[ACTUALS, PREDICTIONS], inplace=True)
|
||||||
metrics = compute_all_metrics(forecast_df, grains + [BACKTEST_ITER])
|
metrics = compute_all_metrics(forecast_df, grains + [BACKTEST_ITER])
|
||||||
metrics.to_csv(os.path.join(output_dir, SCORES_FILE), index=False)
|
metrics.to_csv(os.path.join(output_dir, SCORES_FILE), index=False)
|
||||||
|
|
||||||
|
|||||||
@@ -13,7 +13,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
""
|
""
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -33,6 +33,7 @@
|
|||||||
"For this notebook we are using a synthetic dataset to demonstrate the back testing in many model scenario. This allows us to check historical performance of AutoML on a historical data. To do that we step back on the backtesting period by the data set several times and split the data to train and test sets. Then these data sets are used for training and evaluation of model.<br>\n",
|
"For this notebook we are using a synthetic dataset to demonstrate the back testing in many model scenario. This allows us to check historical performance of AutoML on a historical data. To do that we step back on the backtesting period by the data set several times and split the data to train and test sets. Then these data sets are used for training and evaluation of model.<br>\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Thus, it is a quick way of evaluating AutoML as if it was in production. Here, we do not test historical performance of a particular model, for this see the [notebook](../forecasting-backtest-single-model/auto-ml-forecasting-backtest-single-model.ipynb). Instead, the best model for every backtest iteration can be different since AutoML chooses the best model for a given training set.\n",
|
"Thus, it is a quick way of evaluating AutoML as if it was in production. Here, we do not test historical performance of a particular model, for this see the [notebook](../forecasting-backtest-single-model/auto-ml-forecasting-backtest-single-model.ipynb). Instead, the best model for every backtest iteration can be different since AutoML chooses the best model for a given training set.\n",
|
||||||
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"**NOTE: There are limits on how many runs we can do in parallel per workspace, and we currently recommend to set the parallelism to maximum of 320 runs per experiment per workspace. If users want to have more parallelism and increase this limit they might encounter Too Many Requests errors (HTTP 429).**"
|
"**NOTE: There are limits on how many runs we can do in parallel per workspace, and we currently recommend to set the parallelism to maximum of 320 runs per experiment per workspace. If users want to have more parallelism and increase this limit they might encounter Too Many Requests errors (HTTP 429).**"
|
||||||
@@ -43,7 +44,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"### Prerequisites\n",
|
"### Prerequisites\n",
|
||||||
"You'll need to create a compute Instance by following the instructions in the [EnvironmentSetup.md](../Setup_Resources/EnvironmentSetup.md)."
|
"You'll need to create a compute Instance by following [these](https://learn.microsoft.com/en-us/azure/machine-learning/v1/how-to-create-manage-compute-instance?tabs=python) instructions."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -86,6 +87,7 @@
|
|||||||
"output[\"Resource Group\"] = ws.resource_group\n",
|
"output[\"Resource Group\"] = ws.resource_group\n",
|
||||||
"output[\"Location\"] = ws.location\n",
|
"output[\"Location\"] = ws.location\n",
|
||||||
"output[\"Default datastore name\"] = dstore.name\n",
|
"output[\"Default datastore name\"] = dstore.name\n",
|
||||||
|
"output[\"SDK Version\"] = azureml.core.VERSION\n",
|
||||||
"pd.set_option(\"display.max_colwidth\", None)\n",
|
"pd.set_option(\"display.max_colwidth\", None)\n",
|
||||||
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
||||||
"outputDf.T"
|
"outputDf.T"
|
||||||
@@ -312,21 +314,37 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"### Set up training parameters\n",
|
"### Set up training parameters\n",
|
||||||
"\n",
|
"\n",
|
||||||
"This dictionary defines the AutoML and many models settings. For this forecasting task we need to define several settings including the name of the time column, the maximum forecast horizon, and the partition column name definition. Please note, that in this case we are setting grain_column_names to be the time series ID column plus iteration, because we want to train a separate model for each time series and iteration.\n",
|
"We need to provide ``ForecastingParameters``, ``AutoMLConfig`` and ``ManyModelsTrainParameters`` objects. For the forecasting task we also need to define several settings including the name of the time column, the maximum forecast horizon, and the partition column name(s) definition.\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"#### ``ForecastingParameters`` arguments\n",
|
||||||
|
"| Property | Description|\n",
|
||||||
|
"| :--------------- | :------------------- |\n",
|
||||||
|
"| **forecast_horizon** | The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly). Periods are inferred from your data. |\n",
|
||||||
|
"| **time_column_name** | The name of your time column. |\n",
|
||||||
|
"| **time_series_id_column_names** | The column names used to uniquely identify timeseries in data that has multiple rows with the same timestamp. |\n",
|
||||||
|
"| **cv_step_size** | Number of periods between two consecutive cross-validation folds. The default value is \\\"auto\\\", in which case AutoMl determines the cross-validation step size automatically, if a validation set is not provided. Or users could specify an integer value. |\n",
|
||||||
|
"\n",
|
||||||
|
"#### ``AutoMLConfig`` arguments\n",
|
||||||
"| Property | Description|\n",
|
"| Property | Description|\n",
|
||||||
"| :--------------- | :------------------- |\n",
|
"| :--------------- | :------------------- |\n",
|
||||||
"| **task** | forecasting |\n",
|
"| **task** | forecasting |\n",
|
||||||
"| **primary_metric** | This is the metric that you want to optimize.<br> Forecasting supports the following primary metrics <br><i>normalized_root_mean_squared_error</i><br><i>normalized_mean_absolute_error</i> |\n",
|
"| **primary_metric** | This is the metric that you want to optimize.<br> Forecasting supports the following primary metrics <br><i>spearman_correlation</i><br><i>normalized_root_mean_squared_error</i><br><i>r2_score</i><br><i>normalized_mean_absolute_error</i> |\n",
|
||||||
|
"| **blocked_models** | Blocked models won't be used by AutoML. |\n",
|
||||||
"| **iteration_timeout_minutes** | Maximum amount of time in minutes that the model can train. This is optional but provides customers with greater control on exit criteria. |\n",
|
"| **iteration_timeout_minutes** | Maximum amount of time in minutes that the model can train. This is optional but provides customers with greater control on exit criteria. |\n",
|
||||||
"| **iterations** | Number of models to train. This is optional but provides customers with greater control on exit criteria. |\n",
|
"| **iterations** | Number of models to train. This is optional but provides customers with greater control on exit criteria. |\n",
|
||||||
"| **experiment_timeout_hours** | Maximum amount of time in hours that the experiment can take before it terminates. This is optional but provides customers with greater control on exit criteria. |\n",
|
"| **experiment_timeout_hours** | Maximum amount of time in hours that each experiment can take before it terminates. This is optional but provides customers with greater control on exit criteria. **It does not control the overall timeout for the pipeline run, instead controls the timeout for each training run per partitioned time series.** |\n",
|
||||||
"| **label_column_name** | The name of the label column. |\n",
|
"| **label_column_name** | The name of the label column. |\n",
|
||||||
"| **max_horizon** | The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly). Periods are inferred from your data. |\n",
|
"| **n_cross_validations** | Number of cross validation splits. The default value is \\\"auto\\\", in which case AutoMl determines the number of cross-validations automatically, if a validation set is not provided. Or users could specify an integer value. Rolling Origin Validation is used to split time-series in a temporally consistent way. |\n",
|
||||||
"| **n_cross_validations** | Number of cross validation splits. Rolling Origin Validation is used to split time-series in a temporally consistent way. |\n",
|
"| **enable_early_stopping** | Flag to enable early termination if the primary metric is no longer improving. |\n",
|
||||||
"| **time_column_name** | The name of your time column. |\n",
|
"| **enable_engineered_explanations** | Engineered feature explanations will be downloaded if enable_engineered_explanations flag is set to True. By default it is set to False to save storage space. |\n",
|
||||||
"| **grain_column_names** | The column names used to uniquely identify timeseries in data that has multiple rows with the same timestamp. |\n",
|
|
||||||
"| **track_child_runs** | Flag to disable tracking of child runs. Only best run is tracked if the flag is set to False (this includes the model and metrics of the run). |\n",
|
"| **track_child_runs** | Flag to disable tracking of child runs. Only best run is tracked if the flag is set to False (this includes the model and metrics of the run). |\n",
|
||||||
|
"| **pipeline_fetch_max_batch_size** | Determines how many pipelines (training algorithms) to fetch at a time for training, this helps reduce throttling when training at large scale. |\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"#### ``ManyModelsTrainParameters`` arguments\n",
|
||||||
|
"| Property | Description|\n",
|
||||||
|
"| :--------------- | :------------------- |\n",
|
||||||
|
"| **automl_settings** | The ``AutoMLConfig`` object defined above. |\n",
|
||||||
"| **partition_column_names** | The names of columns used to group your models. For timeseries, the groups must not split up individual time-series. That is, each group must contain one or more whole time-series. |"
|
"| **partition_column_names** | The names of columns used to group your models. For timeseries, the groups must not split up individual time-series. That is, each group must contain one or more whole time-series. |"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -343,21 +361,30 @@
|
|||||||
"from azureml.train.automl.runtime._many_models.many_models_parameters import (\n",
|
"from azureml.train.automl.runtime._many_models.many_models_parameters import (\n",
|
||||||
" ManyModelsTrainParameters,\n",
|
" ManyModelsTrainParameters,\n",
|
||||||
")\n",
|
")\n",
|
||||||
|
"from azureml.automl.core.forecasting_parameters import ForecastingParameters\n",
|
||||||
|
"from azureml.train.automl.automlconfig import AutoMLConfig\n",
|
||||||
"\n",
|
"\n",
|
||||||
"partition_column_names = [TIME_SERIES_ID_COLNAME, \"backtest_iteration\"]\n",
|
"partition_column_names = [TIME_SERIES_ID_COLNAME, \"backtest_iteration\"]\n",
|
||||||
"automl_settings = {\n",
|
"\n",
|
||||||
" \"task\": \"forecasting\",\n",
|
"forecasting_parameters = ForecastingParameters(\n",
|
||||||
" \"primary_metric\": \"normalized_root_mean_squared_error\",\n",
|
" time_column_name=TIME_COLNAME,\n",
|
||||||
" \"iteration_timeout_minutes\": 10, # This needs to be changed based on the dataset. We ask customer to explore how long training is taking before settings this value\n",
|
" forecast_horizon=6,\n",
|
||||||
" \"iterations\": 15,\n",
|
" time_series_id_column_names=partition_column_names,\n",
|
||||||
" \"experiment_timeout_hours\": 0.25, # This also needs to be changed based on the dataset. For larger data set this number needs to be bigger.\n",
|
" cv_step_size=\"auto\",\n",
|
||||||
" \"label_column_name\": TARGET_COLNAME,\n",
|
")\n",
|
||||||
" \"n_cross_validations\": 3,\n",
|
"\n",
|
||||||
" \"time_column_name\": TIME_COLNAME,\n",
|
"automl_settings = AutoMLConfig(\n",
|
||||||
" \"max_horizon\": 6,\n",
|
" task=\"forecasting\",\n",
|
||||||
" \"grain_column_names\": partition_column_names,\n",
|
" primary_metric=\"normalized_root_mean_squared_error\",\n",
|
||||||
" \"track_child_runs\": False,\n",
|
" iteration_timeout_minutes=10,\n",
|
||||||
"}\n",
|
" iterations=15,\n",
|
||||||
|
" experiment_timeout_hours=0.25,\n",
|
||||||
|
" label_column_name=TARGET_COLNAME,\n",
|
||||||
|
" n_cross_validations=\"auto\", # Feel free to set to a small integer (>=2) if runtime is an issue.\n",
|
||||||
|
" track_child_runs=False,\n",
|
||||||
|
" forecasting_parameters=forecasting_parameters,\n",
|
||||||
|
")\n",
|
||||||
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"mm_paramters = ManyModelsTrainParameters(\n",
|
"mm_paramters = ManyModelsTrainParameters(\n",
|
||||||
" automl_settings=automl_settings, partition_column_names=partition_column_names\n",
|
" automl_settings=automl_settings, partition_column_names=partition_column_names\n",
|
||||||
@@ -384,8 +411,16 @@
|
|||||||
"| **node_count** | The number of compute nodes to be used for running the user script. We recommend to start with 3 and increase the node_count if the training time is taking too long. |\n",
|
"| **node_count** | The number of compute nodes to be used for running the user script. We recommend to start with 3 and increase the node_count if the training time is taking too long. |\n",
|
||||||
"| **process_count_per_node** | Process count per node, we recommend 2:1 ratio for number of cores: number of processes per node. eg. If node has 16 cores then configure 8 or less process count per node or optimal performance. |\n",
|
"| **process_count_per_node** | Process count per node, we recommend 2:1 ratio for number of cores: number of processes per node. eg. If node has 16 cores then configure 8 or less process count per node or optimal performance. |\n",
|
||||||
"| **train_pipeline_parameters** | The set of configuration parameters defined in the previous section. |\n",
|
"| **train_pipeline_parameters** | The set of configuration parameters defined in the previous section. |\n",
|
||||||
|
"| **run_invocation_timeout** | Maximum amount of time in seconds that the ``ParallelRunStep`` class is allowed. This is optional but provides customers with greater control on exit criteria. This must be greater than ``experiment_timeout_hours`` by at least 300 seconds. |\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Calling this method will create a new aggregated dataset which is generated dynamically on pipeline execution."
|
"Calling this method will create a new aggregated dataset which is generated dynamically on pipeline execution.\n",
|
||||||
|
"\n",
|
||||||
|
"**Note**: Total time taken for the **training step** in the pipeline to complete = $ \\frac{t}{ p \\times n } \\times ts $\n",
|
||||||
|
"where,\n",
|
||||||
|
"- $ t $ is time taken for training one partition (can be viewed in the training logs)\n",
|
||||||
|
"- $ p $ is ``process_count_per_node``\n",
|
||||||
|
"- $ n $ is ``node_count``\n",
|
||||||
|
"- $ ts $ is total number of partitions in time series based on ``partition_column_names``"
|
||||||
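|
"\n",
|
||||||
|
"As an illustrative calculation (not a measured result): with $ t = 120 $ seconds per partition, $ p = 2 $, $ n = 2 $ and $ ts = 20 $ partitions, the training step would take roughly $ \\frac{120}{2 \\times 2} \\times 20 = 600 $ seconds, i.e. about 10 minutes.\n",
|
||||||
|
"Relatedly, since ``run_invocation_timeout`` must exceed ``experiment_timeout_hours`` by at least 300 seconds, the ``experiment_timeout_hours=0.25`` (900 seconds) used in this notebook requires a ``run_invocation_timeout`` of at least 1200 seconds, which is the value set in the next code cell."
|
||||||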
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -403,7 +438,7 @@
|
|||||||
" compute_target=compute_target,\n",
|
" compute_target=compute_target,\n",
|
||||||
" node_count=2,\n",
|
" node_count=2,\n",
|
||||||
" process_count_per_node=2,\n",
|
" process_count_per_node=2,\n",
|
||||||
" run_invocation_timeout=920,\n",
|
" run_invocation_timeout=1200,\n",
|
||||||
" train_pipeline_parameters=mm_paramters,\n",
|
" train_pipeline_parameters=mm_paramters,\n",
|
||||||
")"
|
")"
|
||||||
]
|
]
|
||||||
@@ -488,25 +523,31 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"For many models we need to provide the ManyModelsInferenceParameters object.\n",
|
"For many models we need to provide the ManyModelsInferenceParameters object.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"#### ManyModelsInferenceParameters arguments\n",
|
"#### ``ManyModelsInferenceParameters`` arguments\n",
|
||||||
"| Property | Description|\n",
|
"| Property | Description|\n",
|
||||||
"| :--------------- | :------------------- |\n",
|
"| :--------------- | :------------------- |\n",
|
||||||
"| **partition_column_names** | List of column names that identifies groups. |\n",
|
"| **partition_column_names** | List of column names that identifies groups. |\n",
|
||||||
"| **target_column_name** | \\[Optional\\] Column name only if the inference dataset has the target. |\n",
|
"| **target_column_name** | \\[Optional] Column name only if the inference dataset has the target. |\n",
|
||||||
"| **time_column_name** | Column name only if it is timeseries. |\n",
|
"| **time_column_name** | \\[Optional] Time column name only if it is timeseries. |\n",
|
||||||
"| **many_models_run_id** | \\[Optional\\] Many models pipeline run id where models were trained. |\n",
|
"| **inference_type** | \\[Optional] Which inference method to use on the model. Possible values are 'forecast', 'predict_proba', and 'predict'. |\n",
|
||||||
|
"| **forecast_mode** | \\[Optional] The type of forecast to be used, either 'rolling' or 'recursive'; defaults to 'recursive'. |\n",
|
||||||
|
"| **step** | \\[Optional] Number of periods to advance the forecasting window in each iteration **(for rolling forecast only)**; defaults to 1. |\n",
|
||||||
"\n",
|
"\n",
|
||||||
"#### get_many_models_batch_inference_steps arguments\n",
|
"#### ``get_many_models_batch_inference_steps`` arguments\n",
|
||||||
"| Property | Description|\n",
|
"| Property | Description|\n",
|
||||||
"| :--------------- | :------------------- |\n",
|
"| :--------------- | :------------------- |\n",
|
||||||
"| **experiment** | The experiment used for inference run. |\n",
|
"| **experiment** | The experiment used for inference run. |\n",
|
||||||
"| **inference_data** | The data to use for inferencing. It should be the same schema as used for training.\n",
|
"| **inference_data** | The data to use for inferencing. It should be the same schema as used for training.\n",
|
||||||
"| **compute_target** | The compute target that runs the inference pipeline.|\n",
|
"| **compute_target** | The compute target that runs the inference pipeline. |\n",
|
||||||
"| **node_count** | The number of compute nodes to be used for running the user script. We recommend to start with the number of cores per node (varies by compute sku). |\n",
|
"| **node_count** | The number of compute nodes to be used for running the user script. We recommend to start with the number of cores per node (varies by compute sku). |\n",
|
||||||
"| **process_count_per_node** | The number of processes per node.\n",
|
"| **process_count_per_node** | \\[Optional] The number of processes per node. By default it's 2 (should be at most half of the number of cores in a single node of the compute cluster that will be used for the experiment).\n",
|
||||||
"| **train_run_id** | \\[Optional\\] The run id of the hierarchy training, by default it is the latest successful training many model run in the experiment. |\n",
|
"| **inference_pipeline_parameters** | \\[Optional] The ``ManyModelsInferenceParameters`` object defined above. |\n",
|
||||||
"| **train_experiment_name** | \\[Optional\\] The train experiment that contains the train pipeline. This one is only needed when the train pipeline is not in the same experiement as the inference pipeline. |\n",
|
"| **append_row_file_name** | \\[Optional] The name of the output file (optional, default value is 'parallel_run_step.txt'). Supports 'txt' and 'csv' file extension. A 'txt' file extension generates the output in 'txt' format with space as separator without column names. A 'csv' file extension generates the output in 'csv' format with comma as separator and with column names. |\n",
|
||||||
"| **process_count_per_node** | \\[Optional\\] The number of processes per node, by default it's 4. |"
|
"| **train_run_id** | \\[Optional] The run id of the **training pipeline**. By default it is the latest successful training pipeline run in the experiment. |\n",
|
||||||
|
"| **train_experiment_name** | \\[Optional] The train experiment that contains the train pipeline. This one is only needed when the train pipeline is not in the same experiement as the inference pipeline. |\n",
|
||||||
|
"| **run_invocation_timeout** | \\[Optional] Maximum amount of time in seconds that the ``ParallelRunStep`` class is allowed. This is optional but provides customers with greater control on exit criteria. |\n",
|
||||||
|
"| **output_datastore** | \\[Optional] The ``Datastore`` or ``OutputDatasetConfig`` to be used for output. If specified any pipeline output will be written to that location. If unspecified the default datastore will be used. |\n",
|
||||||
|
"| **arguments** | \\[Optional] Arguments to be passed to inference script. Possible argument is '--forecast_quantiles' followed by quantile values. |"
|
||||||
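|
"\n",
|
||||||
|
"For example, a hypothetical call could pass ``arguments=[\"--forecast_quantiles\", 0.025, 0.975]`` to request the 2.5% and 97.5% forecast quantiles alongside the default point forecast."
|
||||||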
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -526,6 +567,8 @@
|
|||||||
" target_column_name=TARGET_COLNAME,\n",
|
" target_column_name=TARGET_COLNAME,\n",
|
||||||
")\n",
|
")\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"output_file_name = \"parallel_run_step.csv\"\n",
|
||||||
|
"\n",
|
||||||
"inference_steps = AutoMLPipelineBuilder.get_many_models_batch_inference_steps(\n",
|
"inference_steps = AutoMLPipelineBuilder.get_many_models_batch_inference_steps(\n",
|
||||||
" experiment=experiment,\n",
|
" experiment=experiment,\n",
|
||||||
" inference_data=test_data,\n",
|
" inference_data=test_data,\n",
|
||||||
@@ -537,6 +580,7 @@
|
|||||||
" train_run_id=training_run.id,\n",
|
" train_run_id=training_run.id,\n",
|
||||||
" train_experiment_name=training_run.experiment.name,\n",
|
" train_experiment_name=training_run.experiment.name,\n",
|
||||||
" inference_pipeline_parameters=mm_parameters,\n",
|
" inference_pipeline_parameters=mm_parameters,\n",
|
||||||
|
" append_row_file_name=output_file_name,\n",
|
||||||
")"
|
")"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -584,18 +628,21 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"from azureml.contrib.automl.pipeline.steps.utilities import get_output_from_mm_pipeline\n",
|
"from azureml.contrib.automl.pipeline.steps.utilities import get_output_from_mm_pipeline\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"PREDICTION_COLNAME = \"Predictions\"\n",
|
||||||
"forecasting_results_name = \"forecasting_results\"\n",
|
"forecasting_results_name = \"forecasting_results\"\n",
|
||||||
"forecasting_output_name = \"many_models_inference_output\"\n",
|
"forecasting_output_name = \"many_models_inference_output\"\n",
|
||||||
"forecast_file = get_output_from_mm_pipeline(\n",
|
"forecast_file = get_output_from_mm_pipeline(\n",
|
||||||
" inference_run, forecasting_results_name, forecasting_output_name\n",
|
" inference_run, forecasting_results_name, forecasting_output_name, output_file_name\n",
|
||||||
")\n",
|
")\n",
|
||||||
"df = pd.read_csv(forecast_file, delimiter=\" \", header=None, parse_dates=[0])\n",
|
"df = pd.read_csv(forecast_file, parse_dates=[0])\n",
|
||||||
"df.columns = list(X_train.columns) + [\"predicted_level\"]\n",
|
|
||||||
"print(\n",
|
"print(\n",
|
||||||
" \"Prediction has \", df.shape[0], \" rows. Here the first 10 rows are being displayed.\"\n",
|
" \"Prediction has \", df.shape[0], \" rows. Here the first 10 rows are being displayed.\"\n",
|
||||||
")\n",
|
")\n",
|
||||||
"# Save the scv file with header to read it in the next step.\n",
|
"# Save the csv file to read it in the next step.\n",
|
||||||
"df.rename(columns={TARGET_COLNAME: \"actual_level\"}, inplace=True)\n",
|
"df.rename(\n",
|
||||||
|
" columns={TARGET_COLNAME: \"actual_level\", PREDICTION_COLNAME: \"predicted_level\"},\n",
|
||||||
|
" inplace=True,\n",
|
||||||
|
")\n",
|
||||||
"df.to_csv(os.path.join(forecasting_results_name, \"forecast.csv\"), index=False)\n",
|
"df.to_csv(os.path.join(forecasting_results_name, \"forecast.csv\"), index=False)\n",
|
||||||
"df.head(10)"
|
"df.head(10)"
|
||||||
]
|
]
|
||||||
@@ -619,7 +666,9 @@
|
|||||||
"backtesting_results = \"backtesting_mm_results\"\n",
|
"backtesting_results = \"backtesting_mm_results\"\n",
|
||||||
"os.makedirs(backtesting_results, exist_ok=True)\n",
|
"os.makedirs(backtesting_results, exist_ok=True)\n",
|
||||||
"calculate_scores_and_build_plots(\n",
|
"calculate_scores_and_build_plots(\n",
|
||||||
" forecasting_results_name, backtesting_results, automl_settings\n",
|
" forecasting_results_name,\n",
|
||||||
|
" backtesting_results,\n",
|
||||||
|
" automl_settings.as_serializable_dict(),\n",
|
||||||
")\n",
|
")\n",
|
||||||
"pd.DataFrame({\"File\": os.listdir(backtesting_results)})"
|
"pd.DataFrame({\"File\": os.listdir(backtesting_results)})"
|
||||||
]
|
]
|
||||||
@@ -703,9 +752,9 @@
|
|||||||
"automated-machine-learning"
|
"automated-machine-learning"
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
@@ -717,7 +766,12 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.9"
|
"version": "3.8.5"
|
||||||
|
},
|
||||||
|
"vscode": {
|
||||||
|
"interpreter": {
|
||||||
|
"hash": "6bd77c88278e012ef31757c15997a7bea8c943977c43d6909403c00ae11d43ca"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
|
|||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-forecasting-backtest-many-models
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -43,11 +43,20 @@ def init():
|
|||||||
global output_dir
|
global output_dir
|
||||||
global automl_settings
|
global automl_settings
|
||||||
global model_uid
|
global model_uid
|
||||||
|
global forecast_quantiles
|
||||||
|
|
||||||
logger.info("Initialization of the run.")
|
logger.info("Initialization of the run.")
|
||||||
parser = argparse.ArgumentParser("Parsing input arguments.")
|
parser = argparse.ArgumentParser("Parsing input arguments.")
|
||||||
parser.add_argument("--output-dir", dest="out", required=True)
|
parser.add_argument("--output-dir", dest="out", required=True)
|
||||||
parser.add_argument("--model-name", dest="model", default=None)
|
parser.add_argument("--model-name", dest="model", default=None)
|
||||||
parser.add_argument("--model-uid", dest="model_uid", default=None)
|
parser.add_argument("--model-uid", dest="model_uid", default=None)
|
||||||
|
parser.add_argument(
|
||||||
|
"--forecast_quantiles",
|
||||||
|
nargs="*",
|
||||||
|
type=float,
|
||||||
|
help="forecast quantiles list",
|
||||||
|
default=None,
|
||||||
|
)
|
||||||
|
|
||||||
parsed_args, _ = parser.parse_known_args()
|
parsed_args, _ = parser.parse_known_args()
|
||||||
model_name = parsed_args.model
|
model_name = parsed_args.model
|
||||||
@@ -55,6 +64,7 @@ def init():
|
|||||||
target_column_name = automl_settings.get("label_column_name")
|
target_column_name = automl_settings.get("label_column_name")
|
||||||
output_dir = parsed_args.out
|
output_dir = parsed_args.out
|
||||||
model_uid = parsed_args.model_uid
|
model_uid = parsed_args.model_uid
|
||||||
|
forecast_quantiles = parsed_args.forecast_quantiles
|
||||||
os.makedirs(output_dir, exist_ok=True)
|
os.makedirs(output_dir, exist_ok=True)
|
||||||
os.environ["AUTOML_IGNORE_PACKAGE_VERSION_INCOMPATIBILITIES".lower()] = "True"
|
os.environ["AUTOML_IGNORE_PACKAGE_VERSION_INCOMPATIBILITIES".lower()] = "True"
|
||||||
|
|
||||||
@@ -126,23 +136,18 @@ def run_backtest(data_input_name: str, file_name: str, experiment: Experiment):
|
|||||||
)
|
)
|
||||||
print(f"The model {best_run.properties['model_name']} was registered.")
|
print(f"The model {best_run.properties['model_name']} was registered.")
|
||||||
|
|
||||||
_, x_pred = fitted_model.forecast(X_test)
|
# Ensure the 0.5 quantile is always present; it serves as the point forecast for our target.
|
||||||
x_pred.reset_index(inplace=True, drop=False)
|
if forecast_quantiles:
|
||||||
columns = [automl_settings[constants.TimeSeries.TIME_COLUMN_NAME]]
|
if 0.5 not in forecast_quantiles:
|
||||||
if automl_settings.get(constants.TimeSeries.GRAIN_COLUMN_NAMES):
|
forecast_quantiles.append(0.5)
|
||||||
# We know that fitted_model.grain_column_names is a list.
|
fitted_model.quantiles = forecast_quantiles
|
||||||
columns.extend(fitted_model.grain_column_names)
|
|
||||||
columns.append(constants.TimeSeriesInternal.DUMMY_TARGET_COLUMN)
|
x_pred = fitted_model.forecast_quantiles(X_test)
|
||||||
# Remove featurized columns.
|
|
||||||
x_pred = x_pred[columns]
|
|
||||||
x_pred.rename(
|
|
||||||
{constants.TimeSeriesInternal.DUMMY_TARGET_COLUMN: "predicted_level"},
|
|
||||||
axis=1,
|
|
||||||
inplace=True,
|
|
||||||
)
|
|
||||||
x_pred["actual_level"] = y_test
|
x_pred["actual_level"] = y_test
|
||||||
x_pred["backtest_iteration"] = f"iteration_{last_training_date}"
|
x_pred["backtest_iteration"] = f"iteration_{last_training_date}"
|
||||||
|
x_pred.rename({0.5: "predicted_level"}, axis=1, inplace=True)
|
||||||
date_safe = RE_INVALID_SYMBOLS.sub("_", last_training_date)
|
date_safe = RE_INVALID_SYMBOLS.sub("_", last_training_date)
|
||||||
|
|
||||||
x_pred.to_csv(os.path.join(output_dir, f"iteration_{date_safe}.csv"), index=False)
|
x_pred.to_csv(os.path.join(output_dir, f"iteration_{date_safe}.csv"), index=False)
|
||||||
return x_pred
|
return x_pred
|
||||||
|
|
||||||
|
|||||||
@@ -5,6 +5,7 @@ import json
|
|||||||
import os
|
import os
|
||||||
import re
|
import re
|
||||||
|
|
||||||
|
import numpy as np
|
||||||
import pandas as pd
|
import pandas as pd
|
||||||
|
|
||||||
from matplotlib import pyplot as plt
|
from matplotlib import pyplot as plt
|
||||||
@@ -146,6 +147,9 @@ def calculate_scores_and_build_plots(
|
|||||||
_draw_one_plot(one_forecast, time_column_name, grains, pdf)
|
_draw_one_plot(one_forecast, time_column_name, grains, pdf)
|
||||||
pdf.close()
|
pdf.close()
|
||||||
forecast_df.to_csv(os.path.join(output_dir, FORECASTS_FILE), index=False)
|
forecast_df.to_csv(os.path.join(output_dir, FORECASTS_FILE), index=False)
|
||||||
|
# Remove np.NaN and np.inf from the prediction and actuals data.
|
||||||
|
forecast_df.replace([np.inf, -np.inf], np.nan, inplace=True)
|
||||||
|
forecast_df.dropna(subset=[ACTUALS, PREDICTIONS], inplace=True)
|
||||||
metrics = compute_all_metrics(forecast_df, grains + [BACKTEST_ITER])
|
metrics = compute_all_metrics(forecast_df, grains + [BACKTEST_ITER])
|
||||||
metrics.to_csv(os.path.join(output_dir, SCORES_FILE), index=False)
|
metrics.to_csv(os.path.join(output_dir, SCORES_FILE), index=False)
|
||||||
|
|
||||||
|
|||||||
@@ -7,7 +7,7 @@
|
|||||||
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Licensed under the MIT License.\n",
|
"Licensed under the MIT License.\n",
|
||||||
""
|
""
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -100,6 +100,7 @@
|
|||||||
"output[\"SKU\"] = ws.sku\n",
|
"output[\"SKU\"] = ws.sku\n",
|
||||||
"output[\"Resource Group\"] = ws.resource_group\n",
|
"output[\"Resource Group\"] = ws.resource_group\n",
|
||||||
"output[\"Location\"] = ws.location\n",
|
"output[\"Location\"] = ws.location\n",
|
||||||
|
"output[\"SDK Version\"] = azureml.core.VERSION\n",
|
||||||
"pd.set_option(\"display.max_colwidth\", None)\n",
|
"pd.set_option(\"display.max_colwidth\", None)\n",
|
||||||
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
||||||
"outputDf.T"
|
"outputDf.T"
|
||||||
@@ -282,7 +283,8 @@
|
|||||||
"| **experiment_timeout_hours** | Maximum amount of time in hours that the experiment can take before it terminates. This is optional but provides customers with greater control on exit criteria. |\n",
|
"| **experiment_timeout_hours** | Maximum amount of time in hours that the experiment can take before it terminates. This is optional but provides customers with greater control on exit criteria. |\n",
|
||||||
"| **label_column_name** | The name of the label column. |\n",
|
"| **label_column_name** | The name of the label column. |\n",
|
||||||
"| **max_horizon** | The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly). Periods are inferred from your data. |\n",
|
"| **max_horizon** | The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly). Periods are inferred from your data. |\n",
|
||||||
"| **n_cross_validations** | Number of cross validation splits. Rolling Origin Validation is used to split time-series in a temporally consistent way. |\n",
|
"| **n_cross_validations** | Number of cross validation splits. The default value is \"auto\", in which case AutoMl determines the number of cross-validations automatically, if a validation set is not provided. Or users could specify an integer value. Rolling Origin Validation is used to split time-series in a temporally consistent way. |\n",
|
||||||
|
"|**cv_step_size**|Number of periods between two consecutive cross-validation folds. The default value is \"auto\", in which case AutoMl determines the cross-validation step size automatically, if a validation set is not provided. Or users could specify an integer value.\n",
|
||||||
"| **time_column_name** | The name of your time column. |\n",
|
"| **time_column_name** | The name of your time column. |\n",
|
||||||
"| **grain_column_names** | The column names used to uniquely identify timeseries in data that has multiple rows with the same timestamp. |"
|
"| **grain_column_names** | The column names used to uniquely identify timeseries in data that has multiple rows with the same timestamp. |"
|
||||||
]
|
]
|
||||||
@@ -300,7 +302,8 @@
|
|||||||
" \"iterations\": 15,\n",
|
" \"iterations\": 15,\n",
|
||||||
" \"experiment_timeout_hours\": 1, # This also needs to be changed based on the dataset. For larger data set this number needs to be bigger.\n",
|
" \"experiment_timeout_hours\": 1, # This also needs to be changed based on the dataset. For larger data set this number needs to be bigger.\n",
|
||||||
" \"label_column_name\": LABEL_COLUMN_NAME,\n",
|
" \"label_column_name\": LABEL_COLUMN_NAME,\n",
|
||||||
" \"n_cross_validations\": 3,\n",
|
" \"n_cross_validations\": \"auto\", # Feel free to set to a small integer (>=2) if runtime is an issue.\n",
|
||||||
|
" \"cv_step_size\": \"auto\",\n",
|
||||||
" \"time_column_name\": TIME_COLUMN_NAME,\n",
|
" \"time_column_name\": TIME_COLUMN_NAME,\n",
|
||||||
" \"max_horizon\": FORECAST_HORIZON,\n",
|
" \"max_horizon\": FORECAST_HORIZON,\n",
|
||||||
" \"track_child_runs\": False,\n",
|
" \"track_child_runs\": False,\n",
|
||||||
@@ -362,6 +365,7 @@
|
|||||||
" step_size=BACKTESTING_PERIOD,\n",
|
" step_size=BACKTESTING_PERIOD,\n",
|
||||||
" step_number=NUMBER_OF_BACKTESTS,\n",
|
" step_number=NUMBER_OF_BACKTESTS,\n",
|
||||||
" model_uid=model_uid,\n",
|
" model_uid=model_uid,\n",
|
||||||
|
" forecast_quantiles=[0.025, 0.975], # Optional\n",
|
||||||
")"
|
")"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -523,7 +527,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"model_list = Model.list(ws, tags={\"experiment\": \"automl-backtesting\"})\n",
|
"model_list = Model.list(ws, tags=[[\"experiment\", \"automl-backtesting\"]])\n",
|
||||||
"model_data = {\"name\": [], \"last_training_date\": []}\n",
|
"model_data = {\"name\": [], \"last_training_date\": []}\n",
|
||||||
"for model in model_list:\n",
|
"for model in model_list:\n",
|
||||||
" if (\n",
|
" if (\n",
|
||||||
@@ -587,6 +591,7 @@
|
|||||||
" step_size=BACKTESTING_PERIOD,\n",
|
" step_size=BACKTESTING_PERIOD,\n",
|
||||||
" step_number=NUMBER_OF_BACKTESTS,\n",
|
" step_number=NUMBER_OF_BACKTESTS,\n",
|
||||||
" model_name=model_name,\n",
|
" model_name=model_name,\n",
|
||||||
|
" forecast_quantiles=[0.025, 0.975],\n",
|
||||||
")"
|
")"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -697,9 +702,9 @@
|
|||||||
"Azure ML AutoML"
|
"Azure ML AutoML"
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
@@ -711,7 +716,12 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.9"
|
"version": "3.8.5"
|
||||||
|
},
|
||||||
|
"vscode": {
|
||||||
|
"interpreter": {
|
||||||
|
"hash": "6bd77c88278e012ef31757c15997a7bea8c943977c43d6909403c00ae11d43ca"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
|
|||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-forecasting-backtest-single-model
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -31,6 +31,7 @@ def get_backtest_pipeline(
|
|||||||
step_number: int,
|
step_number: int,
|
||||||
model_name: Optional[str] = None,
|
model_name: Optional[str] = None,
|
||||||
model_uid: Optional[str] = None,
|
model_uid: Optional[str] = None,
|
||||||
|
forecast_quantiles: Optional[list] = None,
|
||||||
) -> Pipeline:
|
) -> Pipeline:
|
||||||
"""
|
"""
|
||||||
:param experiment: The experiment used to run the pipeline.
|
:param experiment: The experiment used to run the pipeline.
|
||||||
@@ -44,6 +45,7 @@ def get_backtest_pipeline(
|
|||||||
:param step_size: The number of periods to step back in backtesting.
|
:param step_size: The number of periods to step back in backtesting.
|
||||||
:param step_number: The number of backtesting iterations.
|
:param step_number: The number of backtesting iterations.
|
||||||
:param model_uid: The uid to mark models from this run of the experiment.
|
:param model_uid: The uid to mark models from this run of the experiment.
|
||||||
|
:param forecast_quantiles: The forecast quantiles that are required in the inference.
|
||||||
:return: The pipeline to be used for model retraining.
|
:return: The pipeline to be used for model retraining.
|
||||||
**Note:** The output will be uploaded in the pipeline output
|
**Note:** The output will be uploaded in the pipeline output
|
||||||
called 'score'.
|
called 'score'.
|
||||||
@@ -72,6 +74,8 @@ def get_backtest_pipeline(
|
|||||||
run_config.docker.use_docker = True
|
run_config.docker.use_docker = True
|
||||||
run_config.environment = env
|
run_config.environment = env
|
||||||
|
|
||||||
|
utilities.set_environment_variables_for_run(run_config)
|
||||||
|
|
||||||
split_data = PipelineData(name="split_data_output", datastore=None).as_dataset()
|
split_data = PipelineData(name="split_data_output", datastore=None).as_dataset()
|
||||||
split_step = PythonScriptStep(
|
split_step = PythonScriptStep(
|
||||||
name="split_data_for_backtest",
|
name="split_data_for_backtest",
|
||||||
@@ -114,6 +118,7 @@ def get_backtest_pipeline(
|
|||||||
run_invocation_timeout=3600,
|
run_invocation_timeout=3600,
|
||||||
node_count=node_count,
|
node_count=node_count,
|
||||||
)
|
)
|
||||||
|
utilities.set_environment_variables_for_run(back_test_config)
|
||||||
forecasts = PipelineData(name="forecasts", datastore=None)
|
forecasts = PipelineData(name="forecasts", datastore=None)
|
||||||
if model_name:
|
if model_name:
|
||||||
parallel_step_name = "{}-backtest".format(model_name.replace("_", "-"))
|
parallel_step_name = "{}-backtest".format(model_name.replace("_", "-"))
|
||||||
@@ -132,6 +137,9 @@ def get_backtest_pipeline(
|
|||||||
if model_uid is not None:
|
if model_uid is not None:
|
||||||
prs_args.append("--model-uid")
|
prs_args.append("--model-uid")
|
||||||
prs_args.append(model_uid)
|
prs_args.append(model_uid)
|
||||||
|
if forecast_quantiles:
|
||||||
|
prs_args.append("--forecast_quantiles")
|
||||||
|
prs_args.extend(forecast_quantiles)
|
||||||
backtest_prs = ParallelRunStep(
|
backtest_prs = ParallelRunStep(
|
||||||
name=parallel_step_name,
|
name=parallel_step_name,
|
||||||
parallel_run_config=back_test_config,
|
parallel_run_config=back_test_config,
|
||||||
@@ -149,12 +157,7 @@ def get_backtest_pipeline(
|
|||||||
inputs=[forecasts.as_mount()],
|
inputs=[forecasts.as_mount()],
|
||||||
outputs=[data_results],
|
outputs=[data_results],
|
||||||
source_directory=PROJECT_FOLDER,
|
source_directory=PROJECT_FOLDER,
|
||||||
arguments=[
|
arguments=["--forecasts", forecasts, "--output-dir", data_results],
|
||||||
"--forecasts",
|
|
||||||
forecasts,
|
|
||||||
"--output-dir",
|
|
||||||
data_results,
|
|
||||||
],
|
|
||||||
runconfig=run_config,
|
runconfig=run_config,
|
||||||
compute_target=compute_target,
|
compute_target=compute_target,
|
||||||
allow_reuse=False,
|
allow_reuse=False,
|
||||||
|
|||||||
@@ -16,6 +16,13 @@
     ""
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font color=\"red\" size=\"5\"><strong>!Important!</strong> </br>This notebook is outdated and is not supported by the AutoML Team. Please use the supported version ([link](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/automl-standalone-jobs/automl-forecasting-task-bike-share)).</font>"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -42,7 +49,7 @@
     "\n",
     "AutoML highlights here include built-in holiday featurization, accessing engineered feature names, and working with the `forecast` function. Please also look at the additional forecasting notebooks, which document lagging, rolling windows, forecast quantiles, other ways to use the forecast function, and forecaster deployment.\n",
     "\n",
-    "Make sure you have executed the [configuration notebook](../../../configuration.ipynb) before running this notebook.\n",
+    "Make sure you have executed the [configuration notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) before running this notebook.\n",
     "\n",
     "Notebook synopsis:\n",
     "1. Creating an Experiment in an existing Workspace\n",
@@ -61,7 +68,11 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "gather": {
+     "logged": 1680248038565
+    }
+   },
    "outputs": [],
    "source": [
     "import json\n",
@@ -119,6 +130,7 @@
     "output[\"Resource Group\"] = ws.resource_group\n",
     "output[\"Location\"] = ws.location\n",
     "output[\"Run History Name\"] = experiment_name\n",
+    "output[\"SDK Version\"] = azureml.core.VERSION\n",
     "pd.set_option(\"display.max_colwidth\", None)\n",
     "outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
     "outputDf.T"
@@ -169,25 +181,6 @@
    "source": [
     "## Data\n",
     "\n",
-    "The [Machine Learning service workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-workspace) is paired with the storage account, which contains the default data store. We will use it to upload the bike share data and create [tabular dataset](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py) for training. A tabular dataset defines a series of lazily-evaluated, immutable operations to load data from the data source into tabular representation."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "datastore = ws.get_default_datastore()\n",
-    "datastore.upload_files(\n",
-    "    files=[\"./bike-no.csv\"], target_path=\"dataset/\", overwrite=True, show_progress=True\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
     "Let's set up what we know about the dataset. \n",
     "\n",
     "**Target column** is what we want to forecast.\n",
@@ -205,25 +198,50 @@
|
|||||||
"time_column_name = \"date\""
|
"time_column_name = \"date\""
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"nteract": {
|
||||||
|
"transient": {
|
||||||
|
"deleting": false
|
||||||
|
}
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"You are now ready to load the historical bike share data. We will load the CSV file into a plain pandas DataFrame."
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {
|
||||||
|
"jupyter": {
|
||||||
|
"outputs_hidden": false,
|
||||||
|
"source_hidden": false
|
||||||
|
},
|
||||||
|
"nteract": {
|
||||||
|
"transient": {
|
||||||
|
"deleting": false
|
||||||
|
}
|
||||||
|
}
|
||||||
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"dataset = Dataset.Tabular.from_delimited_files(\n",
|
"all_data = pd.read_csv(\"bike-no.csv\", parse_dates=[time_column_name])\n",
|
||||||
" path=[(datastore, \"dataset/bike-no.csv\")]\n",
|
|
||||||
").with_timestamp_columns(fine_grain_timestamp=time_column_name)\n",
|
|
||||||
"\n",
|
"\n",
|
||||||
"# Drop the columns 'casual' and 'registered' as these columns are a breakdown of the total and therefore a leak.\n",
|
"# Drop the columns 'casual' and 'registered' as these columns are a breakdown of the total and therefore a leak.\n",
|
||||||
"dataset = dataset.drop_columns(columns=[\"casual\", \"registered\"])\n",
|
"all_data.drop([\"casual\", \"registered\"], axis=1, inplace=True)"
|
||||||
"\n",
|
|
||||||
"dataset.take(5).to_pandas_dataframe().reset_index(drop=True)"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {
|
||||||
|
"nteract": {
|
||||||
|
"transient": {
|
||||||
|
"deleting": false
|
||||||
|
}
|
||||||
|
}
|
||||||
|
},
|
||||||
"source": [
|
"source": [
|
||||||
"### Split the data\n",
|
"### Split the data\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -233,22 +251,63 @@
  {
   "cell_type": "code",
   "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "gather": {
+     "logged": 1680247376789
+    },
+    "jupyter": {
+     "outputs_hidden": false,
+     "source_hidden": false
+    },
+    "nteract": {
+     "transient": {
+      "deleting": false
+     }
+    }
+   },
   "outputs": [],
   "source": [
    "# select data that occurs before a specified date\n",
-    "train = dataset.time_before(datetime(2012, 8, 31), include_boundary=True)\n",
+    "train = all_data[all_data[time_column_name] <= pd.Timestamp(\"2012-08-31\")].copy()\n",
-    "train.to_pandas_dataframe().tail(5).reset_index(drop=True)"
+    "test = all_data[all_data[time_column_name] >= pd.Timestamp(\"2012-09-01\")].copy()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Upload data to datastore\n",
+    "\n",
+    "The [Machine Learning service workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-workspace) is paired with the storage account, which contains the default data store. We will use it to upload the bike share data and create [tabular dataset](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py) for training. A tabular dataset defines a series of lazily-evaluated, immutable operations to load data from the data source into tabular representation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "jupyter": {
+     "outputs_hidden": false,
+     "source_hidden": false
+    },
+    "nteract": {
+     "transient": {
+      "deleting": false
+     }
+    }
+   },
   "outputs": [],
   "source": [
-    "test = dataset.time_after(datetime(2012, 9, 1), include_boundary=True)\n",
+    "from azureml.data.dataset_factory import TabularDatasetFactory\n",
-    "test.to_pandas_dataframe().head(5).reset_index(drop=True)"
+    "\n",
+    "datastore = ws.get_default_datastore()\n",
+    "\n",
+    "train_dataset = TabularDatasetFactory.register_pandas_dataframe(\n",
+    "    train, target=(datastore, \"dataset/\"), name=\"bike_no_train\"\n",
+    ")\n",
+    "\n",
+    "test_dataset = TabularDatasetFactory.register_pandas_dataframe(\n",
+    "    test, target=(datastore, \"dataset/\"), name=\"bike_no_test\"\n",
+    ")"
   ]
  },
  {
@@ -264,7 +323,8 @@
    "|**forecast_horizon**|The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly).|\n",
    "|**country_or_region_for_holidays**|The country/region used to generate holiday features. These should be ISO 3166 two-letter country/region codes (e.g. 'US', 'GB').|\n",
    "|**target_lags**|The target_lags specifies how far back we will construct the lags of the target variable.|\n",
-    "|**freq**|Forecast frequency. This optional parameter represents the period with which the forecast is desired, for example, daily, weekly, yearly, etc. Use this parameter for the correction of time series containing irregular data points or for padding of short time series. The frequency needs to be a pandas offset alias. Please refer to [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects) for more information."
+    "|**freq**|Forecast frequency. This optional parameter represents the period with which the forecast is desired, for example, daily, weekly, yearly, etc. Use this parameter for the correction of time series containing irregular data points or for padding of short time series. The frequency needs to be a pandas offset alias. Please refer to [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects) for more information.\n",
+    "|**cv_step_size**|Number of periods between two consecutive cross-validation folds. The default value is \"auto\", in which case AutoML determines the cross-validation step size automatically, if a validation set is not provided. Alternatively, users can specify an integer value.|"
   ]
  },
  {
@@ -284,7 +344,7 @@
    "|**training_data**|Input dataset, containing both features and label column.|\n",
    "|**label_column_name**|The name of the label column.|\n",
    "|**compute_target**|The remote compute for training.|\n",
-    "|**n_cross_validations**|Number of cross validation splits.|\n",
+    "|**n_cross_validations**|Number of cross-validation folds to use for model/pipeline selection. The default value is \"auto\", in which case AutoML determines the number of cross-validations automatically, if a validation set is not provided. Alternatively, users can specify an integer value.|\n",
    "|**enable_early_stopping**|If early stopping is on, training will stop when the primary metric is no longer improving.|\n",
    "|**forecasting_parameters**|A class that holds all the forecasting related parameters.|\n",
    "\n",
@@ -349,6 +409,7 @@
    "    country_or_region_for_holidays=\"US\",  # set country_or_region will trigger holiday featurizer\n",
    "    target_lags=\"auto\",  # use heuristic based lag setting\n",
    "    freq=\"D\",  # Set the forecast frequency to be daily\n",
+    "    cv_step_size=\"auto\",\n",
    ")\n",
    "\n",
    "automl_config = AutoMLConfig(\n",
@@ -357,11 +418,11 @@
    "    featurization=featurization_config,\n",
    "    blocked_models=[\"ExtremeRandomTrees\"],\n",
    "    experiment_timeout_hours=0.3,\n",
-    "    training_data=train,\n",
+    "    training_data=train_dataset,\n",
    "    label_column_name=target_column_name,\n",
    "    compute_target=compute_target,\n",
    "    enable_early_stopping=True,\n",
-    "    n_cross_validations=3,\n",
+    "    n_cross_validations=\"auto\",  # Feel free to set to a small integer (>=2) if runtime is an issue.\n",
    "    max_concurrent_iterations=4,\n",
    "    max_cores_per_iteration=-1,\n",
    "    verbosity=logging.INFO,\n",
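Pulled out of the notebook-JSON diff above, the net effect of these hunks is to hand both cross-validation knobs to AutoML's heuristics. A condensed sketch of the resulting configuration, assuming the v1 SDK imports; the `task` and `primary_metric` values are assumptions, as they are not shown in these hunks:

```python
import logging
from azureml.automl.core.forecasting_parameters import ForecastingParameters
from azureml.train.automl import AutoMLConfig

# Mirrors the configuration assembled in the bike-share notebook after this change.
forecasting_parameters = ForecastingParameters(
    time_column_name=time_column_name,    # "date"
    forecast_horizon=14,                  # assumed from the 14-day horizon discussed later
    country_or_region_for_holidays="US",  # triggers the holiday featurizer
    target_lags="auto",
    freq="D",
    cv_step_size="auto",                  # new: AutoML picks the spacing between CV folds
)

automl_config = AutoMLConfig(
    task="forecasting",                   # assumption: implied by context
    experiment_timeout_hours=0.3,
    training_data=train_dataset,          # the registered TabularDataset, not a raw DataFrame
    label_column_name=target_column_name,
    compute_target=compute_target,
    enable_early_stopping=True,
    n_cross_validations="auto",           # new: number of folds chosen heuristically
    max_concurrent_iterations=4,
    max_cores_per_iteration=-1,
    verbosity=logging.INFO,
    forecasting_parameters=forecasting_parameters,
)
```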
@@ -543,7 +604,7 @@
    "from run_forecast import run_rolling_forecast\n",
    "\n",
    "remote_run = run_rolling_forecast(\n",
-    "    test_experiment, compute_target, best_run, test, target_column_name\n",
+    "    test_experiment, compute_target, best_run, test_dataset, target_column_name\n",
    ")\n",
    "remote_run"
   ]
@@ -572,7 +633,32 @@
   "outputs": [],
   "source": [
    "remote_run.download_file(\"outputs/predictions.csv\", \"predictions.csv\")\n",
-    "df_all = pd.read_csv(\"predictions.csv\")"
+    "fcst_df = pd.read_csv(\"predictions.csv\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Note that the rolling forecast can contain multiple predictions for each date, each from a different forecast origin. For example, consider 2012-09-05:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fcst_df[fcst_df.date == \"2012-09-05\"]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Here, the forecast origin refers to the latest date of actuals available for a given forecast. The earliest origin in the rolling forecast, 2012-08-31, is the last day in the training data. For origin date 2012-09-01, the forecasts use actual recorded counts from the training data *and* the actual count recorded on 2012-09-01. Note that the model is not retrained for origin dates later than 2012-08-31, but the values for model features, such as lagged values of daily count, are updated.\n",
+    "\n",
+    "Let's calculate the metrics over all rolling forecasts:"
   ]
  },
  {
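Because every date can carry several predictions, a quick way to see the structure is to pivot on the origin instead of the date. A small sketch, assuming the `forecast_origin` and `predicted` column names written by the scoring script shown later in this diff:

```python
# Count forecasts per origin, then pull out one full forecast window.
# The origin date "2012-08-31" is chosen for illustration; it is the last
# training day per the notebook text.
print(fcst_df.groupby("forecast_origin").size().head())
window = fcst_df[fcst_df["forecast_origin"] == "2012-08-31"]
print(window[["date", "predicted", target_column_name]].head(14))
```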
@@ -584,29 +670,17 @@
    "from azureml.automl.core.shared import constants\n",
    "from azureml.automl.runtime.shared.score import scoring\n",
    "from sklearn.metrics import mean_absolute_error, mean_squared_error\n",
-    "from matplotlib import pyplot as plt\n",
    "\n",
    "# use automl metrics module\n",
    "scores = scoring.score_regression(\n",
-    "    y_test=df_all[target_column_name],\n",
+    "    y_test=fcst_df[target_column_name],\n",
-    "    y_pred=df_all[\"predicted\"],\n",
+    "    y_pred=fcst_df[\"predicted\"],\n",
    "    metrics=list(constants.Metric.SCALAR_REGRESSION_SET),\n",
    ")\n",
    "\n",
    "print(\"[Test data scores]\\n\")\n",
    "for key, value in scores.items():\n",
-    "    print(\"{}: {:.3f}\".format(key, value))\n",
-    "\n",
-    "# Plot outputs\n",
-    "%matplotlib inline\n",
-    "test_pred = plt.scatter(df_all[target_column_name], df_all[\"predicted\"], color=\"b\")\n",
-    "test_test = plt.scatter(\n",
-    "    df_all[target_column_name], df_all[target_column_name], color=\"g\"\n",
-    ")\n",
-    "plt.legend(\n",
-    "    (test_pred, test_test), (\"prediction\", \"truth\"), loc=\"upper left\", fontsize=8\n",
-    ")\n",
-    "plt.show()"
+    "    print(\"{}: {:.3f}\".format(key, value))"
   ]
  },
  {
@@ -615,36 +689,15 @@
   "source": [
    "For more details on what metrics are included and how they are calculated, please refer to [supported metrics](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-understand-automated-ml#regressionforecasting-metrics). You could also calculate residuals, like described [here](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-understand-automated-ml#residuals).\n",
    "\n",
-    "\n",
-    "Since we did a rolling evaluation on the test set, we can analyze the predictions by their forecast horizon relative to the rolling origin. The model was initially trained at a forecast horizon of 14, so each prediction from the model is associated with a horizon value from 1 to 14. The horizon values are in a column named, \"horizon_origin,\" in the prediction set. For example, we can calculate some of the error metrics grouped by the horizon:"
+    "The rolling forecast metric values are very high in comparison to the validation metrics reported by the AutoML job. What's going on here? We will investigate in the following cells!"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from metrics_helper import MAPE, APE\n",
-    "\n",
-    "df_all.groupby(\"horizon_origin\").apply(\n",
-    "    lambda df: pd.Series(\n",
-    "        {\n",
-    "            \"MAPE\": MAPE(df[target_column_name], df[\"predicted\"]),\n",
-    "            \"RMSE\": np.sqrt(\n",
-    "                mean_squared_error(df[target_column_name], df[\"predicted\"])\n",
-    "            ),\n",
-    "            \"MAE\": mean_absolute_error(df[target_column_name], df[\"predicted\"]),\n",
-    "        }\n",
-    "    )\n",
-    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "To drill down more, we can look at the distributions of APE (absolute percentage error) by horizon. From the chart, it is clear that the overall MAPE is being skewed by one particular point where the actual value is of small absolute value."
+    "### Forecast versus actuals plot\n",
+    "We will plot predictions and actuals on a time series plot. Since there are many forecasts for each date, we select the 14-day-ahead forecast from each forecast origin for our comparison."
   ]
  },
  {
@@ -653,21 +706,55 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "df_all_APE = df_all.assign(APE=APE(df_all[target_column_name], df_all[\"predicted\"]))\n",
+    "from matplotlib import pyplot as plt\n",
-    "APEs = [\n",
-    "    df_all_APE[df_all[\"horizon_origin\"] == h].APE.values\n",
-    "    for h in range(1, forecast_horizon + 1)\n",
-    "]\n",
    "\n",
    "%matplotlib inline\n",
-    "plt.boxplot(APEs)\n",
-    "plt.yscale(\"log\")\n",
-    "plt.xlabel(\"horizon\")\n",
-    "plt.ylabel(\"APE (%)\")\n",
-    "plt.title(\"Absolute Percentage Errors by Forecast Horizon\")\n",
    "\n",
+    "fcst_df_h14 = (\n",
+    "    fcst_df.groupby(\"forecast_origin\", as_index=False)\n",
+    "    .last()\n",
+    "    .drop(columns=[\"forecast_origin\"])\n",
+    ")\n",
+    "fcst_df_h14.set_index(time_column_name, inplace=True)\n",
+    "plt.plot(fcst_df_h14[[target_column_name, \"predicted\"]])\n",
+    "plt.xticks(rotation=45)\n",
+    "plt.title(f\"Predicted vs. Actuals\")\n",
+    "plt.legend([\"actual\", \"14-day-ahead forecast\"])\n",
    "plt.show()"
   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Looking at the plot, there are two clear issues:\n",
+    "1. An anomalously low count value on October 29th, 2012.\n",
+    "2. End-of-year holidays (Thanksgiving and Christmas) in late November and late December.\n",
+    "\n",
+    "What happened on Oct. 29th, 2012? That day, Hurricane Sandy brought severe storm surge flooding to the east coast of the United States, particularly around New York City. This is certainly an anomalous event that the model did not account for!\n",
+    "\n",
+    "As for the late year holidays, the model apparently did not learn to account for the full reduction of bike share rentals on these major holidays. The training data covers 2011 and early 2012, so the model fit only had access to a single occurrence of these holidays. This makes it challenging to resolve holiday effects; however, a larger AutoML model search may result in a better model that is more holiday-aware.\n",
+    "\n",
+    "If we filter the predictions prior to the Thanksgiving holiday and remove the anomalous day of 2012-10-29, the metrics are closer to validation levels:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "date_filter = (fcst_df.date != \"2012-10-29\") & (fcst_df.date < \"2012-11-22\")\n",
+    "scores = scoring.score_regression(\n",
+    "    y_test=fcst_df[date_filter][target_column_name],\n",
+    "    y_pred=fcst_df[date_filter][\"predicted\"],\n",
+    "    metrics=list(constants.Metric.SCALAR_REGRESSION_SET),\n",
+    ")\n",
+    "\n",
+    "print(\"[Test data scores (filtered)]\\n\")\n",
+    "for key, value in scores.items():\n",
+    "    print(\"{}: {:.3f}\".format(key, value))"
+   ]
  }
 ],
 "metadata": {
@@ -693,10 +780,13 @@
  ],
  "friendly_name": "Forecasting BikeShare Demand",
  "index_order": 1,
+  "kernel_info": {
+   "name": "python38-azureml"
+  },
  "kernelspec": {
-   "display_name": "Python 3.6",
+   "display_name": "Python 3.8 - AzureML",
   "language": "python",
-   "name": "python36"
+   "name": "python38-azureml"
  },
  "language_info": {
   "codemirror_mode": {
@@ -708,17 +798,30 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.6.7"
+   "version": "3.8.10"
+  },
+  "microsoft": {
+   "ms_spell_check": {
+    "ms_spell_check_language": "en"
+   }
  },
  "mimetype": "text/x-python",
  "name": "python",
  "npconvert_exporter": "python",
+  "nteract": {
+   "version": "nteract-front-end@1.0.0"
+  },
  "pygments_lexer": "ipython3",
  "tags": [
   "Forecasting"
  ],
  "task": "Forecasting",
-  "version": 3
+  "version": 3,
+  "vscode": {
+   "interpreter": {
+    "hash": "6bd77c88278e012ef31757c15997a7bea8c943977c43d6909403c00ae11d43ca"
+   }
+  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
@@ -1,4 +0,0 @@
-name: auto-ml-forecasting-bike-share
-dependencies:
-- pip:
-  - azureml-sdk
@@ -1,6 +1,6 @@
 import argparse
 from azureml.core import Dataset, Run
-from sklearn.externals import joblib
+import joblib
 
 parser = argparse.ArgumentParser()
 parser.add_argument(
@@ -36,18 +36,18 @@ y_test_df = (
 
 fitted_model = joblib.load("model.pkl")
 
-y_pred, X_trans = fitted_model.rolling_evaluation(X_test_df, y_test_df.values)
+X_rf = fitted_model.rolling_forecast(X_test_df, y_test_df.values, step=1)
 
 # Add predictions, actuals, and horizon relative to rolling origin to the test feature data
 assign_dict = {
-    "horizon_origin": X_trans["horizon_origin"].values,
+    fitted_model.forecast_origin_column_name: "forecast_origin",
-    "predicted": y_pred,
+    fitted_model.forecast_column_name: "predicted",
-    target_column_name: y_test_df[target_column_name].values,
+    fitted_model.actual_column_name: target_column_name,
 }
-df_all = X_test_df.assign(**assign_dict)
+X_rf.rename(columns=assign_dict, inplace=True)
 
 file_name = "outputs/predictions.csv"
-export_csv = df_all.to_csv(file_name, header=True)
+export_csv = X_rf.to_csv(file_name, header=True)
 
 # Upload the predictions into artifacts
 run.upload_file(name=file_name, path_or_stream=file_name)
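The switch from `rolling_evaluation` to `rolling_forecast` also changes the return shape: instead of a `(y_pred, X_trans)` pair, a single DataFrame comes back, and the model exposes its generated column names as properties so the script can map them to stable names. A minimal sketch of the same consumption pattern in isolation, under the script's assumptions:

```python
# rolling_forecast rolls the origin through the test set one period at a time
# (step=1) and returns everything in one frame; the default generated columns
# are _automl_forecast_origin / _automl_forecast_<target> / _automl_actual_<target>,
# per the notebook text later in this diff.
X_rf = fitted_model.rolling_forecast(X_test_df, y_test_df.values, step=1)

# Rename via the model's column-name properties rather than hard-coded strings.
X_rf = X_rf.rename(
    columns={
        fitted_model.forecast_origin_column_name: "forecast_origin",
        fitted_model.forecast_column_name: "predicted",
        fitted_model.actual_column_name: target_column_name,
    }
)
X_rf.to_csv("outputs/predictions.csv", header=True)  # consumed by the metrics cells above
```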
@@ -16,6 +16,13 @@
     ""
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font color=\"red\" size=\"5\"><strong>!Important!</strong> </br>This notebook is outdated and is not supported by the AutoML Team. Please use the supported version ([link](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/automl-standalone-jobs/automl-forecasting-task-energy-demand/automl-forecasting-task-energy-demand-advanced-mlflow.ipynb)).</font>"
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -43,7 +50,7 @@
    "\n",
    "In this example we use the associated New York City energy demand dataset to showcase how you can use AutoML for a simple forecasting problem and explore the results. The goal is to predict the energy demand for the next 48 hours based on historic time-series data.\n",
    "\n",
-    "If you are using an Azure Machine Learning Compute Instance, you are all set. Otherwise, go through the [configuration notebook](../../../configuration.ipynb) first, if you haven't already, to establish your connection to the AzureML Workspace.\n",
+    "If you are using an Azure Machine Learning Compute Instance, you are all set. Otherwise, go through the [configuration notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) first, if you haven't already, to establish your connection to the AzureML Workspace.\n",
    "\n",
    "In this notebook you will learn how to:\n",
    "1. Create an Experiment using an existing Workspace\n",
@@ -132,6 +139,7 @@
    "output[\"Resource Group\"] = ws.resource_group\n",
    "output[\"Location\"] = ws.location\n",
    "output[\"Run History Name\"] = experiment_name\n",
+   "output[\"SDK Version\"] = azureml.core.VERSION\n",
    "pd.set_option(\"display.max_colwidth\", None)\n",
    "outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
    "outputDf.T"
@@ -259,8 +267,12 @@
   "outputs": [],
   "source": [
    "# split into train based on time\n",
-    "train = dataset.time_before(datetime(2017, 8, 8, 5), include_boundary=True)\n",
+    "train = (\n",
-    "train.to_pandas_dataframe().reset_index(drop=True).sort_values(time_column_name).tail(5)"
+    "    dataset.time_before(datetime(2017, 8, 8, 5), include_boundary=True)\n",
+    "    .to_pandas_dataframe()\n",
+    "    .reset_index(drop=True)\n",
+    ")\n",
+    "train.sort_values(time_column_name).tail(5)"
   ]
  },
  {
@@ -270,8 +282,39 @@
   "outputs": [],
   "source": [
    "# split into test based on time\n",
-    "test = dataset.time_between(datetime(2017, 8, 8, 6), datetime(2017, 8, 10, 5))\n",
+    "test = (\n",
-    "test.to_pandas_dataframe().reset_index(drop=True).head(5)"
+    "    dataset.time_between(datetime(2017, 8, 8, 6), datetime(2017, 8, 10, 5))\n",
+    "    .to_pandas_dataframe()\n",
+    "    .reset_index(drop=True)\n",
+    ")\n",
+    "test.head(5)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "jupyter": {
+     "outputs_hidden": false
+    },
+    "nteract": {
+     "transient": {
+      "deleting": false
+     }
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# register the split train and test data in workspace storage\n",
+    "from azureml.data.dataset_factory import TabularDatasetFactory\n",
+    "\n",
+    "datastore = ws.get_default_datastore()\n",
+    "train_dataset = TabularDatasetFactory.register_pandas_dataframe(\n",
+    "    train, target=(datastore, \"dataset/\"), name=\"nyc_energy_train\"\n",
+    ")\n",
+    "test_dataset = TabularDatasetFactory.register_pandas_dataframe(\n",
+    "    test, target=(datastore, \"dataset/\"), name=\"nyc_energy_test\"\n",
+    ")"
   ]
  },
  {
@@ -307,7 +350,8 @@
    "|-|-|\n",
    "|**time_column_name**|The name of your time column.|\n",
    "|**forecast_horizon**|The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly).|\n",
-    "|**freq**|Forecast frequency. This optional parameter represents the period with which the forecast is desired, for example, daily, weekly, yearly, etc. Use this parameter for the correction of time series containing irregular data points or for padding of short time series. The frequency needs to be a pandas offset alias. Please refer to [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects) for more information."
+    "|**freq**|Forecast frequency. This optional parameter represents the period with which the forecast is desired, for example, daily, weekly, yearly, etc. Use this parameter for the correction of time series containing irregular data points or for padding of short time series. The frequency needs to be a pandas offset alias. Please refer to [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects) for more information.\n",
+    "|**cv_step_size**|Number of periods between two consecutive cross-validation folds. The default value is \"auto\", in which case AutoML determines the cross-validation step size automatically, if a validation set is not provided. Alternatively, users can specify an integer value.|"
   ]
  },
  {
@@ -327,7 +371,7 @@
    "|**training_data**|The training data to be used within the experiment.|\n",
    "|**label_column_name**|The name of the label column.|\n",
    "|**compute_target**|The remote compute for training.|\n",
-    "|**n_cross_validations**|Number of cross validation splits. Rolling Origin Validation is used to split time-series in a temporally consistent way.|\n",
+    "|**n_cross_validations**|Number of cross-validation folds to use for model/pipeline selection. The default value is \"auto\", in which case AutoML determines the number of cross-validations automatically, if a validation set is not provided. Alternatively, users can specify an integer value.|\n",
    "|**enable_early_stopping**|Flag to enable early termination if the score is not improving in the short term.|\n",
    "|**forecasting_parameters**|A class that holds all the forecasting related parameters.|\n"
@@ -351,6 +395,7 @@
    "    time_column_name=time_column_name,\n",
    "    forecast_horizon=forecast_horizon,\n",
    "    freq=\"H\",  # Set the forecast frequency to be hourly\n",
+    "    cv_step_size=\"auto\",\n",
    ")\n",
    "\n",
    "automl_config = AutoMLConfig(\n",
@@ -358,11 +403,11 @@
    "    primary_metric=\"normalized_root_mean_squared_error\",\n",
    "    blocked_models=[\"ExtremeRandomTrees\", \"AutoArima\", \"Prophet\"],\n",
    "    experiment_timeout_hours=0.3,\n",
-    "    training_data=train,\n",
+    "    training_data=train_dataset,\n",
    "    label_column_name=target_column_name,\n",
    "    compute_target=compute_target,\n",
    "    enable_early_stopping=True,\n",
-    "    n_cross_validations=3,\n",
+    "    n_cross_validations=\"auto\",  # Feel free to set to a small integer (>=2) if runtime is an issue.\n",
    "    verbosity=logging.INFO,\n",
    "    forecasting_parameters=forecasting_parameters,\n",
    ")"
@@ -518,7 +563,7 @@
    "    test_experiment=test_experiment,\n",
    "    compute_target=compute_target,\n",
    "    train_run=best_run,\n",
-    "    test_dataset=test,\n",
+    "    test_dataset=test_dataset,\n",
    "    target_column_name=target_column_name,\n",
    ")\n",
    "remote_run_infer.wait_for_completion(show_output=False)\n",
@@ -608,6 +653,7 @@
    "    forecast_horizon=forecast_horizon,\n",
    "    target_lags=12,\n",
    "    target_rolling_window_size=4,\n",
+    "    cv_step_size=\"auto\",\n",
    ")\n",
    "\n",
    "automl_config = AutoMLConfig(\n",
@@ -623,11 +669,11 @@
    "        \"Prophet\",\n",
    "    ],  # These models are blocked for tutorial purposes, remove this for real use cases.\n",
    "    experiment_timeout_hours=0.3,\n",
-    "    training_data=train,\n",
+    "    training_data=train_dataset,\n",
    "    label_column_name=target_column_name,\n",
    "    compute_target=compute_target,\n",
    "    enable_early_stopping=True,\n",
-    "    n_cross_validations=3,\n",
+    "    n_cross_validations=\"auto\",  # Feel free to set to a small integer (>=2) if runtime is an issue.\n",
    "    verbosity=logging.INFO,\n",
    "    forecasting_parameters=advanced_forecasting_parameters,\n",
    ")"
@@ -694,7 +740,7 @@
    "    test_experiment=test_experiment_advanced,\n",
    "    compute_target=compute_target,\n",
    "    train_run=best_run_lags,\n",
-    "    test_dataset=test,\n",
+    "    test_dataset=test_dataset,\n",
    "    target_column_name=target_column_name,\n",
    "    inference_folder=\"./forecast_advanced\",\n",
    ")\n",
@@ -762,10 +808,13 @@
   "how-to-use-azureml",
   "automated-machine-learning"
  ],
+  "kernel_info": {
+   "name": "python3"
+  },
  "kernelspec": {
-   "display_name": "Python 3.6",
+   "display_name": "Python 3.8 - AzureML",
   "language": "python",
-   "name": "python36"
+   "name": "python38-azureml"
  },
  "language_info": {
   "codemirror_mode": {
@@ -777,9 +826,22 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.6.9"
+   "version": "3.8.10"
+  },
+  "microsoft": {
+   "ms_spell_check": {
+    "ms_spell_check_language": "en"
+   }
+  },
+  "nteract": {
+   "version": "nteract-front-end@1.0.0"
+  },
+  "vscode": {
+   "interpreter": {
+    "hash": "6bd77c88278e012ef31757c15997a7bea8c943977c43d6909403c00ae11d43ca"
+   }
+  }
 },
 "nbformat": 4,
-"nbformat_minor": 2
+"nbformat_minor": 4
 }
@@ -1,4 +0,0 @@
-name: auto-ml-forecasting-energy-demand
-dependencies:
-- pip:
-  - azureml-sdk
@@ -6,7 +6,7 @@ compute instance.
 
 import argparse
 from azureml.core import Dataset, Run
-from sklearn.externals import joblib
+import joblib
 from pandas.tseries.frequencies import to_offset
 
 parser = argparse.ArgumentParser()
@@ -52,7 +52,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Please make sure you have followed the `configuration.ipynb` notebook so that your ML workspace information is saved in the config file."
+    "Please make sure you have followed the [configuration notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) so that your ML workspace information is saved in the config file."
   ]
  },
  {
@@ -121,6 +121,7 @@
    "output[\"Resource Group\"] = ws.resource_group\n",
    "output[\"Location\"] = ws.location\n",
    "output[\"Run History Name\"] = experiment_name\n",
+   "output[\"SDK Version\"] = azureml.core.VERSION\n",
    "pd.set_option(\"display.max_colwidth\", None)\n",
    "outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
    "outputDf.T"
@@ -334,7 +335,8 @@
    "    forecast_horizon=forecast_horizon,\n",
    "    time_series_id_column_names=[TIME_SERIES_ID_COLUMN_NAME],\n",
    "    target_lags=lags,\n",
-    "    freq=\"H\",  # Set the forecast frequency to be hourly\n",
+    "    freq=\"H\",  # Set the forecast frequency to be hourly,\n",
+    "    cv_step_size=\"auto\",\n",
    ")"
   ]
  },
@@ -364,7 +366,7 @@
    "    enable_early_stopping=True,\n",
    "    training_data=train_data,\n",
    "    compute_target=compute_target,\n",
-    "    n_cross_validations=3,\n",
+    "    n_cross_validations=\"auto\",  # Feel free to set to a small integer (>=2) if runtime is an issue.\n",
    "    verbosity=logging.INFO,\n",
    "    max_concurrent_iterations=4,\n",
    "    max_cores_per_iteration=-1,\n",
@@ -646,13 +648,11 @@
    "        & (fulldata[time_column_name] <= forecast_origin + horizon)\n",
    "    ]\n",
    "\n",
-    "    y_past = X_past.pop(target_column_name).values.astype(np.float)\n",
+    "    y_past = X_past.pop(target_column_name).values.astype(float)\n",
-    "    y_future = X_future.pop(target_column_name).values.astype(np.float)\n",
+    "    y_future = X_future.pop(target_column_name).values.astype(float)\n",
    "\n",
    "    # Now take y_future and turn it into question marks\n",
-    "    y_query = y_future.copy().astype(\n",
-    "        np.float\n",
-    "    )  # because sometimes life hands you an int\n",
+    "    y_query = y_future.copy().astype(float)  # because sometimes life hands you an int\n",
    "    y_query.fill(np.NaN)\n",
    "\n",
    "    print(\"X_past is \" + str(X_past.shape) + \" - shaped\")\n",
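The `np.float` → `float` edits in this hunk track NumPy's removal of the deprecated scalar aliases (deprecated in 1.20, removed in 1.24); the builtin `float` is the drop-in replacement. A minimal self-contained sketch of the pattern:

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3])        # integer data, as sometimes handed to you
y = s.values.astype(float)      # was .astype(np.float); that alias is gone in NumPy >= 1.24
y_query = y.copy()
y_query.fill(np.nan)            # np.NaN still works here, but np.nan is the canonical spelling
```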
@@ -758,7 +758,15 @@
   "metadata": {},
   "source": [
    "## Forecasting farther than the forecast horizon <a id=\"recursive forecasting\"></a>\n",
-    "When the forecast destination, or the latest date in the prediction data frame, is farther into the future than the specified forecast horizon, the `forecast()` function will still make point predictions out to the later date using a recursive operation mode. Internally, the method recursively applies the regular forecaster to generate context so that we can forecast further into the future. \n",
+    "When the forecast destination, or the latest date in the prediction data frame, is farther into the future than the specified forecast horizon, the forecaster must be iteratively applied. Here, we advance the forecast origin on each iteration over the prediction window, predicting `max_horizon` periods ahead on each iteration. There are two choices for the context data to use as the forecaster advances into the prediction window:\n",
+    "\n",
+    "1. We can use forecasted values from previous iterations (recursive forecast),\n",
+    "2. We can use known, actual values of the target if they are available (rolling forecast).\n",
+    "\n",
+    "The first method is useful in a true forecasting scenario, when we do not yet know the actual target values, while the second is useful in an evaluation scenario, where we want to compute accuracy metrics for the `max_horizon`-period-ahead forecaster over a long test set. We refer to the first as a **recursive forecast**, since we apply the forecaster recursively over the prediction window, and the second as a **rolling forecast**, since we roll forward over known actuals.\n",
+    "\n",
+    "### Recursive forecasting\n",
+    "By default, the `forecast()` function will make point predictions out to the later date using a recursive operation mode. Internally, the method recursively applies the regular forecaster to generate context so that we can forecast further into the future. \n",
    "\n",
    "To illustrate the use-case and operation of recursive forecasting, we'll consider an example with a single time-series where the forecasting period directly follows the training period and is twice as long as the forecasting horizon given at training time.\n",
    "\n",
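A compact sketch contrasting the two modes just described, assuming a fitted AutoML forecasting model and a test window longer than the training horizon (`X_test_long`, `y_test_long` as in this notebook):

```python
# Recursive forecast: forecast() fills the span beyond max_horizon using its
# own earlier predictions as context (a true forecasting scenario).
y_pred_long, X_trans_long = fitted_model.forecast(X_test_long)

# Rolling forecast: known actuals provide the context instead, and the origin
# advances `step` periods per iteration (an evaluation scenario).
X_rf = fitted_model.rolling_forecast(X_test_long, y_test_long, step=1)
```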
@@ -818,6 +826,35 @@
    "np.array_equal(y_pred_all, y_pred_long)"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Rolling forecasts\n",
+    "A rolling forecast is a similar concept to the recursive forecasts described above, except that we use known actual values of the target for our context data. We have provided a different, public method for this called `rolling_forecast`. In addition to test data and actuals (`X_test` and `y_test`), `rolling_forecast` also accepts an optional `step` parameter that controls how far the origin advances on each iteration. The recursive forecast mode uses a fixed step of `max_horizon`, while `rolling_forecast` defaults to a step size of 1 but can be set to any integer from 1 to `max_horizon`, inclusive.\n",
+    "\n",
+    "Let's see what the rolling forecast looks like on the long test set with the step set to 1:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "X_rf = fitted_model.rolling_forecast(X_test_long, y_test_long, step=1)\n",
+    "X_rf.head(n=12)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Notice that `rolling_forecast` has returned a single DataFrame containing all results and has generated some new columns: `_automl_forecast_origin`, `_automl_forecast_y`, and `_automl_actual_y`. These are the origin date for each forecast, the forecasted value, and the actual value, respectively. Note that \"y\" in the forecast and actual column names will generally be replaced by the target column name supplied to AutoML.\n",
+    "\n",
+    "The output above shows forecasts for two prediction windows, the first with origin at the end of the training set and the second including the first observation in the test set (2000-01-01 06:00:00). Since the forecast windows overlap, there are multiple forecasts for most dates, which are associated with different origin dates."
+   ]
+  },
  {
   "cell_type": "markdown",
   "metadata": {},
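Given the default column names called out above, computing one accuracy metric per forecast origin from the rolling output is a short groupby. A sketch, assuming the target column was named `y` so the defaults `_automl_forecast_y` and `_automl_actual_y` apply:

```python
from sklearn.metrics import mean_absolute_error

# One MAE per forecast origin; each origin contributes up to a
# max_horizon-long window of (actual, forecast) pairs.
mae_by_origin = X_rf.groupby("_automl_forecast_origin").apply(
    lambda g: mean_absolute_error(g["_automl_actual_y"], g["_automl_forecast_y"])
)
print(mae_by_origin)
```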
@@ -866,9 +903,9 @@
  "friendly_name": "Forecasting away from training data",
  "index_order": 3,
  "kernelspec": {
-   "display_name": "Python 3.6",
+   "display_name": "Python 3.8 - AzureML",
   "language": "python",
-   "name": "python36"
+   "name": "python38-azureml"
  },
  "language_info": {
   "codemirror_mode": {
@@ -880,14 +917,19 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.6.8"
+   "version": "3.7.13"
  },
  "tags": [
   "Forecasting",
   "Confidence Intervals"
  ],
-  "task": "Forecasting"
+  "task": "Forecasting",
+  "vscode": {
+   "interpreter": {
+    "hash": "6bd77c88278e012ef31757c15997a7bea8c943977c43d6909403c00ae11d43ca"
+   }
+  }
 },
 "nbformat": 4,
-"nbformat_minor": 2
+"nbformat_minor": 4
 }
@@ -1,4 +0,0 @@
-name: auto-ml-forecasting-function
-dependencies:
-- pip:
-  - azureml-sdk
@@ -19,7 +19,14 @@
    "hidePrompt": false
   },
   "source": [
    ""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<font color=\"red\" size=\"5\"><strong>!Important!</strong> </br>This notebook is outdated and is not supported by the AutoML Team. Please use the supported version ([link](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/automl-standalone-jobs/automl-forecasting-github-dau)).</font>"
   ]
  },
  {
@@ -52,12 +59,12 @@
    "\n",
    "AutoML highlights here include using Deep Learning forecasts, Arima, Prophet, Remote Execution and Remote Inferencing, and working with the `forecast` function. Please also look at the additional forecasting notebooks, which document lagging, rolling windows, forecast quantiles, other ways to use the forecast function, and forecaster deployment.\n",
    "\n",
-    "Make sure you have executed the [configuration](../../../configuration.ipynb) before running this notebook.\n",
+    "Make sure you have executed the [configuration](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) before running this notebook.\n",
    "\n",
    "Notebook synopsis:\n",
    "\n",
    "1. Creating an Experiment in an existing Workspace\n",
-    "2. Configuration and remote run of AutoML for a time-series model exploring Regression learners, Arima, Prophet and DNNs\n",
+    "2. Configuration and remote run of AutoML for a time-series model exploring DNNs\n",
    "3. Evaluating the fitted model using a rolling test "
@@ -92,8 +99,7 @@
    "# Squash warning messages for cleaner output in the notebook\n",
    "warnings.showwarning = lambda *args, **kwargs: None\n",
    "\n",
-    "from azureml.core.workspace import Workspace\n",
-    "from azureml.core.experiment import Experiment\n",
+    "from azureml.core import Workspace, Experiment, Dataset\n",
    "from azureml.train.automl import AutoMLConfig\n",
    "from matplotlib import pyplot as plt\n",
    "from sklearn.metrics import mean_absolute_error, mean_squared_error\n",
@@ -148,6 +154,7 @@
    "output[\"Resource Group\"] = ws.resource_group\n",
    "output[\"Location\"] = ws.location\n",
    "output[\"Run History Name\"] = experiment_name\n",
+   "output[\"SDK Version\"] = azureml.core.VERSION\n",
    "pd.set_option(\"display.max_colwidth\", None)\n",
    "outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
    "outputDf.T"
@@ -298,40 +305,21 @@
 "from helper import split_full_for_forecasting\n",
 "\n",
 "train, valid = split_full_for_forecasting(df, time_column_name)\n",
-"train.to_csv(\"train.csv\")\n",
-"valid.to_csv(\"valid.csv\")\n",
-"test_df.to_csv(\"test.csv\")\n",
+"\n",
+"# Reset index to create a Tabular Dataset.\n",
+"train.reset_index(inplace=True)\n",
+"valid.reset_index(inplace=True)\n",
+"test_df.reset_index(inplace=True)\n",
 "\n",
 "datastore = ws.get_default_datastore()\n",
-"datastore.upload_files(\n",
-"    files=[\"./train.csv\"],\n",
-"    target_path=\"github-dataset/tabular/\",\n",
-"    overwrite=True,\n",
-"    show_progress=True,\n",
+"train_dataset = Dataset.Tabular.register_pandas_dataframe(\n",
+"    train, target=(datastore, \"dataset/\"), name=\"Github_DAU_train\"\n",
 ")\n",
-"datastore.upload_files(\n",
-"    files=[\"./valid.csv\"],\n",
-"    target_path=\"github-dataset/tabular/\",\n",
-"    overwrite=True,\n",
-"    show_progress=True,\n",
+"valid_dataset = Dataset.Tabular.register_pandas_dataframe(\n",
+"    valid, target=(datastore, \"dataset/\"), name=\"Github_DAU_valid\"\n",
 ")\n",
-"datastore.upload_files(\n",
-"    files=[\"./test.csv\"],\n",
-"    target_path=\"github-dataset/tabular/\",\n",
-"    overwrite=True,\n",
-"    show_progress=True,\n",
-")\n",
-"\n",
-"from azureml.core import Dataset\n",
-"\n",
-"train_dataset = Dataset.Tabular.from_delimited_files(\n",
-"    path=[(datastore, \"github-dataset/tabular/train.csv\")]\n",
-")\n",
-"valid_dataset = Dataset.Tabular.from_delimited_files(\n",
-"    path=[(datastore, \"github-dataset/tabular/valid.csv\")]\n",
-")\n",
-"test_dataset = Dataset.Tabular.from_delimited_files(\n",
-"    path=[(datastore, \"github-dataset/tabular/test.csv\")]\n",
+"test_dataset = Dataset.Tabular.register_pandas_dataframe(\n",
+"    test_df, target=(datastore, \"dataset/\"), name=\"Github_DAU_test\"\n",
 ")"
 ]
 },
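The new cell above collapses a three-step round trip (to_csv, upload_files, from_delimited_files) into one registration call. A minimal self-contained sketch of the same pattern, assuming a workspace `config.json` is available; the toy frame stands in for the `train` split:

```python
import pandas as pd
from azureml.core import Dataset, Workspace

ws = Workspace.from_config()  # assumes a config.json is present
datastore = ws.get_default_datastore()

# Toy frame standing in for the `train` split above. The time column must be
# a regular column, not the index, hence the reset_index calls in the diff.
train = pd.DataFrame(
    {"date": pd.date_range("2023-01-01", periods=3), "count": [1, 2, 3]}
)

# Upload and register in one call, with no intermediate CSVs on local disk.
train_dataset = Dataset.Tabular.register_pandas_dataframe(
    train, target=(datastore, "dataset/"), name="Github_DAU_train"
)
```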
@@ -344,7 +332,7 @@
 "source": [
 "### Setting forecaster maximum horizon \n",
 "\n",
-"The forecast horizon is the number of periods into the future that the model should predict. Here, we set the horizon to 12 periods (i.e. 12 months). Notice that this is much shorter than the number of months in the test set; we will need to use a rolling test to evaluate the performance on the whole test set. For more discussion of forecast horizons and guiding principles for setting them, please see the [energy demand notebook](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand). "
+"The forecast horizon is the number of periods into the future that the model should predict. Here, we set the horizon to 14 periods (i.e. 14 days). Notice that this is much shorter than the number of days in the test set; we will need to use a rolling test to evaluate the performance on the whole test set. For more discussion of forecast horizons and guiding principles for setting them, please see the [energy demand notebook](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand). "
 ]
 },
 {
@@ -356,7 +344,7 @@
 },
 "outputs": [],
 "source": [
-"forecast_horizon = 12"
+"forecast_horizon = 14"
 ]
 },
 {
@@ -397,11 +385,11 @@
 "    freq=\"D\",  # Set the forecast frequency to be daily\n",
 ")\n",
 "\n",
-"# We will disable the enable_early_stopping flag to ensure the DNN model is recommended for demonstration purpose.\n",
+"# To allow only the TCNForecaster, we set the allowed_models parameter accordingly.\n",
 "automl_config = AutoMLConfig(\n",
 "    task=\"forecasting\",\n",
 "    primary_metric=\"normalized_root_mean_squared_error\",\n",
-"    experiment_timeout_hours=1,\n",
+"    experiment_timeout_hours=1.5,\n",
 "    training_data=train_dataset,\n",
 "    label_column_name=target_column_name,\n",
 "    validation_data=valid_dataset,\n",
@@ -410,7 +398,7 @@
 "    max_concurrent_iterations=4,\n",
 "    max_cores_per_iteration=-1,\n",
 "    enable_dnn=True,\n",
-"    enable_early_stopping=False,\n",
+"    allowed_models=[\"TCNForecaster\"],\n",
 "    forecasting_parameters=forecasting_parameters,\n",
 ")"
 ]
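The hunk above pins the experiment to the DNN by allow-listing it, instead of the old trick of disabling early stopping and hoping the DNN is reached. For contrast, a hedged sketch of the opposite configuration, reusing the dataset and parameter names from the surrounding cells; `blocked_models` is the complementary knob:

```python
from azureml.train.automl import AutoMLConfig

# Same experiment, but excluding the DNN instead of requiring it.
automl_config_no_dnn = AutoMLConfig(
    task="forecasting",
    primary_metric="normalized_root_mean_squared_error",
    experiment_timeout_hours=1.5,
    training_data=train_dataset,
    label_column_name=target_column_name,
    validation_data=valid_dataset,
    blocked_models=["TCNForecaster"],  # train everything except the DNN
    forecasting_parameters=forecasting_parameters,
)
```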
@@ -503,7 +491,9 @@
 "if not forecast_model in summary_df[\"run_id\"]:\n",
 "    forecast_model = \"ForecastTCN\"\n",
 "\n",
-"best_dnn_run_id = summary_df[\"run_id\"][forecast_model]\n",
+"best_dnn_run_id = summary_df[summary_df[\"Score\"] == summary_df[\"Score\"].min()][\n",
+"    \"run_id\"\n",
+"][forecast_model]\n",
 "best_dnn_run = Run(experiment, best_dnn_run_id)"
 ]
 },
@@ -564,11 +554,6 @@
 },
 "outputs": [],
 "source": [
-"from azureml.core import Dataset\n",
-"\n",
-"test_dataset = Dataset.Tabular.from_delimited_files(\n",
-"    path=[(datastore, \"github-dataset/tabular/test.csv\")]\n",
-")\n",
 "# preview the first 5 rows of the dataset\n",
 "test_dataset.take(5).to_pandas_dataframe()"
 ]
@@ -703,9 +688,9 @@
 ],
 "hide_code_all_hidden": false,
 "kernelspec": {
-"display_name": "Python 3.6",
+"display_name": "Python 3.8 - AzureML",
 "language": "python",
-"name": "python36"
+"name": "python38-azureml"
 },
 "language_info": {
 "codemirror_mode": {
@@ -717,9 +702,9 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.9"
+"version": "3.8.10"
 }
 },
 "nbformat": 4,
-"nbformat_minor": 2
+"nbformat_minor": 4
 }
@@ -1,4 +0,0 @@
-name: auto-ml-forecasting-github-dau
-dependencies:
-- pip:
-  - azureml-sdk
@@ -79,9 +79,7 @@ def get_result_df(remote_run):
     if "goal" in run.properties:
         goal_minimize = run.properties["goal"].split("_")[-1] == "min"

-    summary_df = summary_df.T.sort_values(
-        "Score", ascending=goal_minimize
-    ).drop_duplicates(["run_algorithm"])
+    summary_df = summary_df.T.sort_values("Score", ascending=goal_minimize)
     summary_df = summary_df.set_index("run_algorithm")
     return summary_df

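Since only a slice of the refactored helper is visible in the hunk, here is a self-contained pandas sketch of the pattern it now uses; the run names and scores are hypothetical, only the pandas mechanics are real:

```python
import pandas as pd

# Per-run summary as built elsewhere in helper.py: one column per run.
summary_df = pd.DataFrame(
    {
        "AutoML_run_1": ["ForecastTCN", 0.12, "AutoML_run_1"],
        "AutoML_run_2": ["Prophet", 0.19, "AutoML_run_2"],
    },
    index=["run_algorithm", "Score", "run_id"],
)
goal_minimize = True  # the run goal ended with "_min", so lower is better

result = summary_df.T.sort_values("Score", ascending=goal_minimize)
result = result.set_index("run_algorithm")
print(result)
# Note: without the old drop_duplicates(["run_algorithm"]) call, repeated
# algorithms now keep every run rather than only their best-scoring one.
```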
@@ -4,8 +4,7 @@ import os
 import numpy as np
 import pandas as pd

-from pandas.tseries.frequencies import to_offset
-from sklearn.externals import joblib
+import joblib
 from sklearn.metrics import mean_absolute_error, mean_squared_error

 from azureml.automl.runtime.shared.score import scoring, constants
@@ -19,219 +18,8 @@ except ImportError:
     _torch_present = False


-def align_outputs(
-    y_predicted,
-    X_trans,
-    X_test,
-    y_test,
-    predicted_column_name="predicted",
-    horizon_colname="horizon_origin",
-):
-    """
-    Demonstrates how to get the output aligned to the inputs
-    using pandas indexes. Helps understand what happened if
-    the output's shape differs from the input shape, or if
-    the data got re-sorted by time and grain during forecasting.
-
-    Typical causes of misalignment are:
-    * we predicted some periods that were missing in actuals -> drop from eval
-    * model was asked to predict past max_horizon -> increase max horizon
-    * data at start of X_test was needed for lags -> provide previous periods
-    """
-    if horizon_colname in X_trans:
-        df_fcst = pd.DataFrame(
-            {
-                predicted_column_name: y_predicted,
-                horizon_colname: X_trans[horizon_colname],
-            }
-        )
-    else:
-        df_fcst = pd.DataFrame({predicted_column_name: y_predicted})
-
-    # y and X outputs are aligned by forecast() function contract
-    df_fcst.index = X_trans.index
-
-    # align original X_test to y_test
-    X_test_full = X_test.copy()
-    X_test_full[target_column_name] = y_test
-
-    # X_test_full's index does not include origin, so reset for merge
-    df_fcst.reset_index(inplace=True)
-    X_test_full = X_test_full.reset_index().drop(columns="index")
-    together = df_fcst.merge(X_test_full, how="right")
-
-    # drop rows where prediction or actuals are nan
-    # happens because of missing actuals
-    # or at edges of time due to lags/rolling windows
-    clean = together[
-        together[[target_column_name, predicted_column_name]].notnull().all(axis=1)
-    ]
-    return clean
-
-
-def do_rolling_forecast_with_lookback(
-    fitted_model, X_test, y_test, max_horizon, X_lookback, y_lookback, freq="D"
-):
-    """
-    Produce forecasts on a rolling origin over the given test set.
-
-    Each iteration makes a forecast for the next 'max_horizon' periods
-    with respect to the current origin, then advances the origin by the
-    horizon time duration. The prediction context for each forecast is set so
-    that the forecaster uses the actual target values prior to the current
-    origin time for constructing lag features.
-
-    This function returns a concatenated DataFrame of rolling forecasts.
-    """
-    print("Using lookback of size: ", y_lookback.size)
-    df_list = []
-    origin_time = X_test[time_column_name].min()
-    X = X_lookback.append(X_test)
-    y = np.concatenate((y_lookback, y_test), axis=0)
-    while origin_time <= X_test[time_column_name].max():
-        # Set the horizon time - end date of the forecast
-        horizon_time = origin_time + max_horizon * to_offset(freq)
-
-        # Extract test data from an expanding window up-to the horizon
-        expand_wind = X[time_column_name] < horizon_time
-        X_test_expand = X[expand_wind]
-        y_query_expand = np.zeros(len(X_test_expand)).astype(np.float)
-        y_query_expand.fill(np.NaN)
-
-        if origin_time != X[time_column_name].min():
-            # Set the context by including actuals up-to the origin time
-            test_context_expand_wind = X[time_column_name] < origin_time
-            context_expand_wind = X_test_expand[time_column_name] < origin_time
-            y_query_expand[context_expand_wind] = y[test_context_expand_wind]
-
-        # Print some debug info
-        print(
-            "Horizon_time:",
-            horizon_time,
-            " origin_time: ",
-            origin_time,
-            " max_horizon: ",
-            max_horizon,
-            " freq: ",
-            freq,
-        )
-        print("expand_wind: ", expand_wind)
-        print("y_query_expand")
-        print(y_query_expand)
-        print("X_test")
-        print(X)
-        print("X_test_expand")
-        print(X_test_expand)
-        print("Type of X_test_expand: ", type(X_test_expand))
-        print("Type of y_query_expand: ", type(y_query_expand))
-
-        print("y_query_expand")
-        print(y_query_expand)
-
-        # Make a forecast out to the maximum horizon
-        # y_fcst, X_trans = y_query_expand, X_test_expand
-        y_fcst, X_trans = fitted_model.forecast(X_test_expand, y_query_expand)
-
-        print("y_fcst")
-        print(y_fcst)
-
-        # Align forecast with test set for dates within
-        # the current rolling window
-        trans_tindex = X_trans.index.get_level_values(time_column_name)
-        trans_roll_wind = (trans_tindex >= origin_time) & (trans_tindex < horizon_time)
-        test_roll_wind = expand_wind & (X[time_column_name] >= origin_time)
-        df_list.append(
-            align_outputs(
-                y_fcst[trans_roll_wind],
-                X_trans[trans_roll_wind],
-                X[test_roll_wind],
-                y[test_roll_wind],
-            )
-        )
-
-        # Advance the origin time
-        origin_time = horizon_time
-
-    return pd.concat(df_list, ignore_index=True)
-
-
-def do_rolling_forecast(fitted_model, X_test, y_test, max_horizon, freq="D"):
-    """
-    Produce forecasts on a rolling origin over the given test set.
-
-    Each iteration makes a forecast for the next 'max_horizon' periods
-    with respect to the current origin, then advances the origin by the
-    horizon time duration. The prediction context for each forecast is set so
-    that the forecaster uses the actual target values prior to the current
-    origin time for constructing lag features.
-
-    This function returns a concatenated DataFrame of rolling forecasts.
-    """
-    df_list = []
-    origin_time = X_test[time_column_name].min()
-    while origin_time <= X_test[time_column_name].max():
-        # Set the horizon time - end date of the forecast
-        horizon_time = origin_time + max_horizon * to_offset(freq)
-
-        # Extract test data from an expanding window up-to the horizon
-        expand_wind = X_test[time_column_name] < horizon_time
-        X_test_expand = X_test[expand_wind]
-        y_query_expand = np.zeros(len(X_test_expand)).astype(np.float)
-        y_query_expand.fill(np.NaN)
-
-        if origin_time != X_test[time_column_name].min():
-            # Set the context by including actuals up-to the origin time
-            test_context_expand_wind = X_test[time_column_name] < origin_time
-            context_expand_wind = X_test_expand[time_column_name] < origin_time
-            y_query_expand[context_expand_wind] = y_test[test_context_expand_wind]
-
-        # Print some debug info
-        print(
-            "Horizon_time:",
-            horizon_time,
-            " origin_time: ",
-            origin_time,
-            " max_horizon: ",
-            max_horizon,
-            " freq: ",
-            freq,
-        )
-        print("expand_wind: ", expand_wind)
-        print("y_query_expand")
-        print(y_query_expand)
-        print("X_test")
-        print(X_test)
-        print("X_test_expand")
-        print(X_test_expand)
-        print("Type of X_test_expand: ", type(X_test_expand))
-        print("Type of y_query_expand: ", type(y_query_expand))
-        print("y_query_expand")
-        print(y_query_expand)
-
-        # Make a forecast out to the maximum horizon
-        y_fcst, X_trans = fitted_model.forecast(X_test_expand, y_query_expand)
-
-        print("y_fcst")
-        print(y_fcst)
-
-        # Align forecast with test set for dates within the
-        # current rolling window
-        trans_tindex = X_trans.index.get_level_values(time_column_name)
-        trans_roll_wind = (trans_tindex >= origin_time) & (trans_tindex < horizon_time)
-        test_roll_wind = expand_wind & (X_test[time_column_name] >= origin_time)
-        df_list.append(
-            align_outputs(
-                y_fcst[trans_roll_wind],
-                X_trans[trans_roll_wind],
-                X_test[test_roll_wind],
-                y_test[test_roll_wind],
-            )
-        )
-
-        # Advance the origin time
-        origin_time = horizon_time
-
-    return pd.concat(df_list, ignore_index=True)
+def map_location_cuda(storage, loc):
+    return storage.cuda()


 def APE(actual, pred):
@@ -254,10 +42,6 @@ def MAPE(actual, pred):
     return np.mean(APE(actual_safe, pred_safe))


-def map_location_cuda(storage, loc):
-    return storage.cuda()
-
-
 parser = argparse.ArgumentParser()
 parser.add_argument(
     "--max_horizon",
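The full bodies of the metric helpers fall outside the hunks shown here, so the following is a hedged reconstruction: APE is the elementwise absolute percentage error, and MAPE averages it after guarding against zero actuals, which is what the `actual_safe` / `pred_safe` names visible above suggest.

```python
import numpy as np

def APE(actual, pred):
    """Absolute Percentage Error, elementwise, in percent."""
    return 100 * np.abs((actual - pred) / actual)

def MAPE(actual, pred):
    """Mean APE over points whose actual value is nonzero (assumed guard)."""
    actual = np.asarray(actual, dtype=float)
    pred = np.asarray(pred, dtype=float)
    not_zero = actual != 0
    actual_safe, pred_safe = actual[not_zero], pred[not_zero]
    return np.mean(APE(actual_safe, pred_safe))
```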
@@ -303,7 +87,6 @@ print(model_path)
 run = Run.get_context()
 # get input dataset by name
 test_dataset = run.input_datasets["test_data"]
-lookback_dataset = run.input_datasets["lookback_data"]

 grain_column_names = []

@@ -312,15 +95,8 @@ df = test_dataset.to_pandas_dataframe()
 print("Read df")
 print(df)

-X_test_df = test_dataset.drop_columns(columns=[target_column_name])
-y_test_df = test_dataset.with_timestamp_columns(None).keep_columns(
-    columns=[target_column_name]
-)
-
-X_lookback_df = lookback_dataset.drop_columns(columns=[target_column_name])
-y_lookback_df = lookback_dataset.with_timestamp_columns(None).keep_columns(
-    columns=[target_column_name]
-)
+X_test_df = df
+y_test = df.pop(target_column_name).to_numpy()

 _, ext = os.path.splitext(model_path)
 if ext == ".pt":
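A tiny sketch of the new feature/target split above: `DataFrame.pop` removes the column in place and returns it, so `X_test_df` ends up without the target while `y_test` holds its values.

```python
import pandas as pd

df = pd.DataFrame({"date": ["2023-01-01", "2023-01-02"], "count": [10, 12]})
y_test = df.pop("count").to_numpy()   # array([10, 12])
X_test_df = df                        # only the "date" column remains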
@@ -336,37 +112,20 @@ else:
     # Load the sklearn pipeline.
     fitted_model = joblib.load(model_path)

-if hasattr(fitted_model, "get_lookback"):
-    lookback = fitted_model.get_lookback()
-    df_all = do_rolling_forecast_with_lookback(
-        fitted_model,
-        X_test_df.to_pandas_dataframe(),
-        y_test_df.to_pandas_dataframe().values.T[0],
-        max_horizon,
-        X_lookback_df.to_pandas_dataframe()[-lookback:],
-        y_lookback_df.to_pandas_dataframe().values.T[0][-lookback:],
-        freq,
-    )
-else:
-    df_all = do_rolling_forecast(
-        fitted_model,
-        X_test_df.to_pandas_dataframe(),
-        y_test_df.to_pandas_dataframe().values.T[0],
-        max_horizon,
-        freq,
-    )
+X_rf = fitted_model.rolling_forecast(X_test_df, y_test, step=1)
+assign_dict = {
+    fitted_model.forecast_origin_column_name: "forecast_origin",
+    fitted_model.forecast_column_name: "predicted",
+    fitted_model.actual_column_name: target_column_name,
+}
+X_rf.rename(columns=assign_dict, inplace=True)

-print(df_all)
-
-print("target values:::")
-print(df_all[target_column_name])
-print("predicted values:::")
-print(df_all["predicted"])
+print(X_rf.head())

 # Use the AutoML scoring module
 regression_metrics = list(constants.REGRESSION_SCALAR_SET)
-y_test = np.array(df_all[target_column_name])
-y_pred = np.array(df_all["predicted"])
+y_test = np.array(X_rf[target_column_name])
+y_pred = np.array(X_rf["predicted"])
 scores = scoring.score_regression(y_test, y_pred, regression_metrics)

 print("scores:")
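The `rolling_forecast` call above replaces both hand-written rolling loops that this hunk deletes, advancing the forecast origin by `step` periods and using known actuals before each origin as context. One thing the resulting frame makes easy, sketched here under the column names established by the rename above, is a per-origin error breakdown:

```python
# Per-origin MAE from the renamed X_rf frame; every prediction carries its
# forecast_origin alongside the actual and predicted values.
mae_by_origin = (
    X_rf.assign(abs_err=(X_rf[target_column_name] - X_rf["predicted"]).abs())
    .groupby("forecast_origin")["abs_err"]
    .mean()
)
print(mae_by_origin)
```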
@@ -376,11 +135,11 @@ for key, value in scores.items():
     run.log(key, value)

 print("Simple forecasting model")
-rmse = np.sqrt(mean_squared_error(df_all[target_column_name], df_all["predicted"]))
+rmse = np.sqrt(mean_squared_error(X_rf[target_column_name], X_rf["predicted"]))
 print("[Test Data] \nRoot Mean squared error: %.2f" % rmse)
-mae = mean_absolute_error(df_all[target_column_name], df_all["predicted"])
+mae = mean_absolute_error(X_rf[target_column_name], X_rf["predicted"])
 print("mean_absolute_error score: %.2f" % mae)
-print("MAPE: %.2f" % MAPE(df_all[target_column_name], df_all["predicted"]))
+print("MAPE: %.2f" % MAPE(X_rf[target_column_name], X_rf["predicted"]))

 run.log("rmse", rmse)
 run.log("mae", mae)
@@ -16,6 +16,13 @@
 ""
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"<font color=\"red\" size=\"5\"><strong>!Important!</strong> </br>This notebook is outdated and is not supported by the AutoML Team. Please use the supported version ([link](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines/1k_demand_forecasting_with_pipeline_components/automl-forecasting-demand-hierarchical-timeseries-in-pipeline)).</font>"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -40,7 +47,7 @@
 "metadata": {},
 "source": [
 "### Prerequisites\n",
-"You'll need to create a compute Instance by following the instructions in the [EnvironmentSetup.md](../Setup_Resources/EnvironmentSetup.md)."
+"You'll need to create a compute instance by following [these instructions](https://learn.microsoft.com/en-us/azure/machine-learning/v1/how-to-create-manage-compute-instance?tabs=python)."
 ]
 },
 {
@@ -78,6 +85,7 @@
 "output[\"Resource Group\"] = ws.resource_group\n",
 "output[\"Location\"] = ws.location\n",
 "output[\"Default datastore name\"] = dstore.name\n",
+"output[\"SDK Version\"] = azureml.core.VERSION\n",
 "pd.set_option(\"display.max_colwidth\", None)\n",
 "outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
 "outputDf.T"
@@ -250,8 +258,17 @@
 "source": [
 "### Set up training parameters\n",
 "\n",
-"This dictionary defines the AutoML and hierarchy settings. For this forecasting task we need to define several settings inncluding the name of the time column, the maximum forecast horizon, the hierarchy definition, and the level of the hierarchy at which to train.\n",
+"We need to provide ``ForecastingParameters``, ``AutoMLConfig`` and ``HTSTrainParameters`` objects. For the forecasting task we need to define several settings including the name of the time column, the maximum forecast horizon, the hierarchy definition, and the level of the hierarchy at which to train.\n",
 "\n",
+"#### ``ForecastingParameters`` arguments\n",
+"| Property | Description|\n",
+"| :--------------- | :------------------- |\n",
+"| **forecast_horizon** | The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly). Periods are inferred from your data. |\n",
+"| **time_column_name** | The name of your time column. |\n",
+"| **time_series_id_column_names** | The column names used to uniquely identify timeseries in data that has multiple rows with the same timestamp. |\n",
+"| **cv_step_size** | Number of periods between two consecutive cross-validation folds. The default value is \\\"auto\\\", in which case AutoML determines the cross-validation step size automatically, if a validation set is not provided. Or users could specify an integer value. |\n",
+"\n",
+"#### ``AutoMLConfig`` arguments\n",
 "| Property | Description|\n",
 "| :--------------- | :------------------- |\n",
 "| **task** | forecasting |\n",
@@ -259,19 +276,22 @@
 "| **blocked_models** | Blocked models won't be used by AutoML. |\n",
 "| **iteration_timeout_minutes** | Maximum amount of time in minutes that the model can train. This is optional but provides customers with greater control on exit criteria. |\n",
 "| **iterations** | Number of models to train. This is optional but provides customers with greater control on exit criteria. |\n",
-"| **experiment_timeout_hours** | Maximum amount of time in hours that the experiment can take before it terminates. This is optional but provides customers with greater control on exit criteria. |\n",
+"| **experiment_timeout_hours** | Maximum amount of time in hours that each experiment can take before it terminates. This is optional but provides customers with greater control on exit criteria. **It does not control the overall timeout for the pipeline run; instead it controls the timeout for each training run per partitioned time series.** |\n",
 "| **label_column_name** | The name of the label column. |\n",
-"| **forecast_horizon** | The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly). Periods are inferred from your data. |\n",
-"| **n_cross_validations** | Number of cross validation splits. Rolling Origin Validation is used to split time-series in a temporally consistent way. |\n",
-"| **enable_early_stopping** | Flag to enable early termination if the score is not improving in the short term. |\n",
-"| **time_column_name** | The name of your time column. |\n",
-"| **hierarchy_column_names** | The names of columns that define the hierarchical structure of the data from highest level to most granular. |\n",
-"| **training_level** | The level of the hierarchy to be used for training models. |\n",
+"| **n_cross_validations** | Number of cross validation splits. The default value is \\\"auto\\\", in which case AutoML determines the number of cross-validations automatically, if a validation set is not provided. Or users could specify an integer value. Rolling Origin Validation is used to split time-series in a temporally consistent way. |\n",
+"| **enable_early_stopping** | Flag to enable early termination if the primary metric is no longer improving. |\n",
 "| **enable_engineered_explanations** | Engineered feature explanations will be downloaded if enable_engineered_explanations flag is set to True. By default it is set to False to save storage space. |\n",
-"| **time_series_id_column_name** | The column names used to uniquely identify timeseries in data that has multiple rows with the same timestamp. |\n",
 "| **track_child_runs** | Flag to disable tracking of child runs. Only best run is tracked if the flag is set to False (this includes the model and metrics of the run). |\n",
 "| **pipeline_fetch_max_batch_size** | Determines how many pipelines (training algorithms) to fetch at a time for training, this helps reduce throttling when training at large scale. |\n",
-"| **model_explainability** | Flag to disable explaining the best automated ML model at the end of all training iterations. The default is True and will block non-explainable models which may impact the forecast accuracy. For more information, see [Interpretability: model explanations in automated machine learning](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-machine-learning-interpretability-automl). |"
+"| **model_explainability** | Flag to disable explaining the best automated ML model at the end of all training iterations. The default is True and will block non-explainable models which may impact the forecast accuracy. For more information, see [Interpretability: model explanations in automated machine learning](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-machine-learning-interpretability-automl). |\n",
+"\n",
+"#### ``HTSTrainParameters`` arguments\n",
+"| Property | Description|\n",
+"| :--------------- | :------------------- |\n",
+"| **automl_settings** | The ``AutoMLConfig`` object defined above. |\n",
+"| **hierarchy_column_names** | The names of columns that define the hierarchical structure of the data from highest level to most granular. |\n",
+"| **training_level** | The level of the hierarchy to be used for training models. |\n",
+"| **enable_engineered_explanations** | The switch controls engineered explanations. |"
 ]
 },
 {
@@ -285,6 +305,9 @@
 "outputs": [],
 "source": [
 "from azureml.train.automl.runtime._hts.hts_parameters import HTSTrainParameters\n",
+"from azureml.automl.core.forecasting_parameters import ForecastingParameters\n",
+"from azureml.train.automl.automlconfig import AutoMLConfig\n",
+"\n",
 "\n",
 "model_explainability = True\n",
 "\n",
@@ -298,23 +321,26 @@
 "label_column_name = \"quantity\"\n",
 "forecast_horizon = 7\n",
 "\n",
+"forecasting_parameters = ForecastingParameters(\n",
+"    time_column_name=time_column_name,\n",
+"    forecast_horizon=forecast_horizon,\n",
+")\n",
 "\n",
-"automl_settings = {\n",
-"    \"task\": \"forecasting\",\n",
-"    \"primary_metric\": \"normalized_root_mean_squared_error\",\n",
-"    \"label_column_name\": label_column_name,\n",
-"    \"time_column_name\": time_column_name,\n",
-"    \"forecast_horizon\": forecast_horizon,\n",
-"    \"hierarchy_column_names\": hierarchy,\n",
-"    \"hierarchy_training_level\": training_level,\n",
-"    \"track_child_runs\": False,\n",
-"    \"pipeline_fetch_max_batch_size\": 15,\n",
-"    \"model_explainability\": model_explainability,\n",
+"automl_settings = AutoMLConfig(\n",
+"    task=\"forecasting\",\n",
+"    primary_metric=\"normalized_root_mean_squared_error\",\n",
+"    experiment_timeout_hours=1,\n",
+"    label_column_name=label_column_name,\n",
+"    track_child_runs=False,\n",
+"    forecasting_parameters=forecasting_parameters,\n",
+"    pipeline_fetch_max_batch_size=15,\n",
+"    model_explainability=model_explainability,\n",
+"    n_cross_validations=\"auto\",  # Feel free to set to a small integer (>=2) if runtime is an issue.\n",
+"    cv_step_size=\"auto\",\n",
 "    # The following settings are specific to this sample and should be adjusted according to your own needs.\n",
-"    \"iteration_timeout_minutes\": 10,\n",
-"    \"iterations\": 10,\n",
-"    \"n_cross_validations\": 2,\n",
-"}\n",
+"    iteration_timeout_minutes=10,\n",
+"    iterations=15,\n",
+")\n",
 "\n",
 "hts_parameters = HTSTrainParameters(\n",
 "    automl_settings=automl_settings,\n",
@@ -335,15 +361,25 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Parallel run step is leveraged to train the hierarchy. To configure the ParallelRunConfig you will need to determine the appropriate number of workers and nodes for your use case. The `process_count_per_node` is based off the number of cores of the compute VM. The node_count will determine the number of master nodes to use, increasing the node count will speed up the training process.\n",
+"Parallel run step is leveraged to train multiple models at once. To configure the ParallelRunConfig you will need to determine the appropriate number of workers and nodes for your use case. The ``process_count_per_node`` is based on the number of cores of the compute VM. The ``node_count`` determines the number of nodes to use; increasing the node count will speed up the training process.\n",
 "\n",
-"* **experiment:** The experiment used for training.\n",
-"* **train_data:** The tabular dataset to be used as input to the training run.\n",
-"* **node_count:** The number of compute nodes to be used for running the user script. We recommend to start with 3 and increase the node_count if the training time is taking too long.\n",
-"* **process_count_per_node:** Process count per node, we recommend 2:1 ratio for number of cores: number of processes per node. eg. If node has 16 cores then configure 8 or less process count per node or optimal performance.\n",
-"* **train_pipeline_parameters:** The set of configuration parameters defined in the previous section. \n",
+"| Property | Description|\n",
+"| :--------------- | :------------------- |\n",
+"| **experiment** | The experiment used for training. |\n",
+"| **train_data** | The file dataset to be used as input to the training run. |\n",
+"| **node_count** | The number of compute nodes to be used for running the user script. We recommend to start with 3 and increase the node_count if the training time is taking too long. |\n",
+"| **process_count_per_node** | Process count per node; we recommend a 2:1 ratio of cores to processes per node, e.g. if a node has 16 cores, configure a process count per node of 8 or less for optimal performance. |\n",
+"| **train_pipeline_parameters** | The set of configuration parameters defined in the previous section. |\n",
+"| **run_invocation_timeout** | Maximum amount of time in seconds allowed for each ``ParallelRunStep`` invocation. This is optional but provides customers with greater control on exit criteria. This must be greater than ``experiment_timeout_hours`` by at least 300 seconds. |\n",
 "\n",
-"Calling this method will create a new aggregated dataset which is generated dynamically on pipeline execution."
+"Calling this method will create a new aggregated dataset which is generated dynamically on pipeline execution.\n",
+"\n",
+"**Note**: Total time taken for the **training step** in the pipeline to complete = $ \\frac{t}{ p \\times n } \\times ts $\n",
+"where,\n",
+"- $ t $ is time taken for training one partition (can be viewed in the training logs)\n",
+"- $ p $ is ``process_count_per_node``\n",
+"- $ n $ is ``node_count``\n",
+"- $ ts $ is total number of partitions in time series based on ``partition_column_names`` (a small worked example follows this cell)"
 ]
 },
 {
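To make the timing note concrete, a small worked instance of the formula with made-up but plausible numbers:

```python
# t / (p * n) * ts, with illustrative values:
t = 3.0    # minutes to train one partition (read from the training logs)
p = 8      # process_count_per_node
n = 2      # node_count
ts = 496   # total number of partitions

print(f"Estimated training-step time: {t / (p * n) * ts:.0f} minutes")  # ~93
```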
@@ -362,6 +398,7 @@
 "    node_count=2,\n",
 "    process_count_per_node=8,\n",
 "    train_pipeline_parameters=hts_parameters,\n",
+"    run_invocation_timeout=3900,\n",
 ")"
 ]
 },
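A quick sanity check on ``run_invocation_timeout=3900`` in the cell above: the parameter table earlier requires it to exceed ``experiment_timeout_hours`` by at least 300 seconds, and with ``experiment_timeout_hours=1`` (3600 seconds) the minimum admissible value is 3600 + 300 = 3900 seconds, which is exactly what is passed here.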
@@ -506,19 +543,24 @@
 "source": [
 "## 5.0 Forecasting\n",
 "For hierarchical forecasting we need to provide the HTSInferenceParameters object.\n",
-"#### HTSInferenceParameters arguments\n",
-"* **hierarchy_forecast_level:** The default level of the hierarchy to produce prediction/forecast on.\n",
-"* **allocation_method:** \\[Optional] The disaggregation method to use if the hierarchy forecast level specified is below the define hierarchy training level. <br><i>(average historical proportions) 'average_historical_proportions'</i><br><i>(proportions of the historical averages) 'proportions_of_historical_average'</i>\n",
+"#### ``HTSInferenceParameters`` arguments\n",
+"| Property | Description|\n",
+"| :--------------- | :------------------- |\n",
+"| **hierarchy_forecast_level** | The default level of the hierarchy to produce predictions/forecasts on. |\n",
+"| **allocation_method** | \\[Optional] The disaggregation method to use if the hierarchy forecast level specified is below the defined hierarchy training level. <br><i>(average historical proportions) 'average_historical_proportions'</i><br><i>(proportions of the historical averages) 'proportions_of_historical_average'</i> |\n",
 "\n",
-"#### get_many_models_batch_inference_steps arguments\n",
-"* **experiment:** The experiment used for inference run.\n",
-"* **inference_data:** The data to use for inferencing. It should be the same schema as used for training.\n",
-"* **compute_target:** The compute target that runs the inference pipeline.\n",
-"* **node_count:** The number of compute nodes to be used for running the user script. We recommend to start with the number of cores per node (varies by compute sku).\n",
-"* **process_count_per_node:** The number of processes per node.\n",
-"* **train_run_id:** \\[Optional] The run id of the hierarchy training, by default it is the latest successful training hts run in the experiment.\n",
-"* **train_experiment_name:** \\[Optional] The train experiment that contains the train pipeline. This one is only needed when the train pipeline is not in the same experiement as the inference pipeline.\n",
-"* **process_count_per_node:** \\[Optional] The number of processes per node, by default it's 4."
+"#### ``get_many_models_batch_inference_steps`` arguments\n",
+"| Property | Description|\n",
+"| :--------------- | :------------------- |\n",
+"| **experiment** | The experiment used for the inference run. |\n",
+"| **inference_data** | The data to use for inferencing. It should have the same schema as used for training. |\n",
+"| **compute_target** | The compute target that runs the inference pipeline. |\n",
+"| **node_count** | The number of compute nodes to be used for running the user script. We recommend to start with the number of cores per node (varies by compute sku). |\n",
+"| **process_count_per_node** | \\[Optional] The number of processes per node. By default it's 2 (should be at most half of the number of cores in a single node of the compute cluster that will be used for the experiment). |\n",
+"| **inference_pipeline_parameters** | \\[Optional] The ``HTSInferenceParameters`` object defined above. |\n",
+"| **train_run_id** | \\[Optional] The run id of the **training pipeline**. By default it is the latest successful training pipeline run in the experiment. |\n",
+"| **train_experiment_name** | \\[Optional] The train experiment that contains the train pipeline. This one is only needed when the train pipeline is not in the same experiment as the inference pipeline. |\n",
+"| **run_invocation_timeout** | \\[Optional] Maximum amount of time in seconds allowed for each ``ParallelRunStep`` invocation. This is optional but provides customers with greater control on exit criteria. |"
 ]
 },
 {
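A minimal sketch of the inference-parameters object the table describes, assuming `HTSInferenceParameters` is importable from the same `hts_parameters` module as `HTSTrainParameters` earlier in this notebook; the forecast level value is a hypothetical column from the hierarchy:

```python
from azureml.train.automl.runtime._hts.hts_parameters import HTSInferenceParameters

inference_parameters = HTSInferenceParameters(
    hierarchy_forecast_level="store",  # hypothetical hierarchy column
    allocation_method="average_historical_proportions",
)
```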
@@ -617,9 +659,9 @@
 "automated-machine-learning"
 ],
 "kernelspec": {
-"display_name": "Python 3.6",
+"display_name": "Python 3.8 - AzureML",
 "language": "python",
-"name": "python36"
+"name": "python38-azureml"
 },
 "language_info": {
 "codemirror_mode": {
@@ -631,7 +673,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.8"
+"version": "3.8.10"
 }
 },
 "nbformat": 4,
@@ -1,4 +0,0 @@
-name: auto-ml-forecasting-hierarchical-timeseries
-dependencies:
-- pip:
-  - azureml-sdk
@@ -0,0 +1,122 @@
+---
+page_type: sample
+languages:
+- python
+products:
+- azure-machine-learning
+description: Tutorial showing how to solve complex machine learning time series forecasting problems at scale by using Azure Automated ML and the Many Models Solution Accelerator.
+---
+
+# Many Models Solution Accelerator
+
+<!--
+Guidelines on README format: https://review.docs.microsoft.com/help/onboard/admin/samples/concepts/readme-template?branch=master
+
+Guidance on onboarding samples to docs.microsoft.com/samples: https://review.docs.microsoft.com/help/onboard/admin/samples/process/onboarding?branch=master
+
+Taxonomies for products and languages: https://review.docs.microsoft.com/new-hope/information-architecture/metadata/taxonomies?branch=master
+-->
+
+In the real world, many problems can be too complex to be solved by a single machine learning model. Whether that means predicting sales for each individual store, building a predictive maintenance model for hundreds of oil wells, or tailoring an experience to individual users, building a model for each instance can lead to improved results on many machine learning problems.
+
+This pattern is very common across a wide variety of industries and applicable to many real world use cases. Below are some examples we have seen where this pattern is being used.
+
+- Energy and utility companies building predictive maintenance models for thousands of oil wells, hundreds of wind turbines or hundreds of smart meters
+
+- Retail organizations building workforce optimization models for thousands of stores, campaign promotion propensity models, and price optimization models for the hundreds of thousands of products they sell
+
+- Restaurant chains building demand forecasting models across thousands of restaurants
+
+- Banks and financial institutes building cash replenishment models for hundreds of ATMs, or building personalized models for individuals
+
+- Enterprises building revenue forecasting models at each division level
+
+- Document management companies building text analytics and legal document search models per each state
+
+Azure Machine Learning (AML) makes it easy to train, operate, and manage hundreds or even thousands of models. This repo will walk you through the end to end process of creating a many models solution from training to scoring to monitoring.
+
+## Prerequisites
+
+To use this solution accelerator, all you need is access to an [Azure subscription](https://azure.microsoft.com/free/) and an [Azure Machine Learning Workspace](https://docs.microsoft.com/azure/machine-learning/how-to-manage-workspace) that you'll create below.
+
+While it's not required, a basic understanding of Azure Machine Learning will be helpful for understanding the solution. The following resources can help introduce you to AML:
+
+1. [Azure Machine Learning Overview](https://azure.microsoft.com/services/machine-learning/)
+2. [Azure Machine Learning Tutorials](https://docs.microsoft.com/azure/machine-learning/tutorial-1st-experiment-sdk-setup)
+3. [Azure Machine Learning Sample Notebooks on Github](https://github.com/Azure/azureml-examples)
+
+## Getting started
+
+### 1. Deploy Resources
+
+Start by deploying the resources to Azure. The button below will deploy Azure Machine Learning and its related resources:
+
+<a href="https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Fmicrosoft%2Fsolution-accelerator-many-models%2Fmaster%2Fazuredeploy.json" target="_blank">
+    <img src="http://azuredeploy.net/deploybutton.png"/>
+</a>
+
+### 2. Configure Development Environment
+
+Next you'll need to configure your [development environment](https://docs.microsoft.com/azure/machine-learning/how-to-configure-environment) for Azure Machine Learning. We recommend using a [Compute Instance](https://docs.microsoft.com/azure/machine-learning/how-to-configure-environment#compute-instance) as it's the fastest way to get up and running.
+
+### 3. Run Notebooks
+
+Once your development environment is set up, run through the Jupyter Notebooks sequentially following the steps outlined. By the end, you'll know how to train, score, and make predictions using the many models pattern on Azure Machine Learning.
+
+## Contents
+
+In this repo, you'll train and score a forecasting model for each orange juice brand and for each store at a (simulated) grocery chain. By the end, you'll have forecasted sales by using up to 11,973 models to predict sales for the next few weeks.
+
+The data used in this sample is simulated based on the [Dominick's Orange Juice Dataset](http://www.cs.unitn.it/~taufer/QMMA/L10-OJ-Data.html#(1)), sales data from a Chicago area grocery store.
+
+<img src="images/Flow_map.png" width="1000">
+
+### Using Automated ML to train the models:
+
+The [`auto-ml-forecasting-many-models.ipynb`](./auto-ml-forecasting-many-models.ipynb) notebook is a guided solution accelerator that demonstrates the steps from data preparation to model training and forecasting with the trained models, as well as operationalizing the solution.
+
+## How-to-videos
+
+Watch these how-to-videos for a step by step walkthrough of the Many Models Solution Accelerator to learn how to set up your models using Automated ML.
+
+### Automated ML
+
+[](https://channel9.msdn.com/Shows/Docs-AI/Building-Large-Scale-Machine-Learning-Forecasting-Models-using-Azure-Machine-Learnings-Automated-ML)
+
+## Key concepts
+
+### ParallelRunStep
+
+[ParallelRunStep](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-steps/azureml.pipeline.steps.parallel_run_step.parallelrunstep?view=azure-ml-py) enables the parallel training of models and is commonly used for batch inferencing. This [document](https://docs.microsoft.com/azure/machine-learning/how-to-use-parallel-run-step) walks through some of the key concepts around ParallelRunStep.
+
+### Pipelines
+
+[Pipelines](https://docs.microsoft.com/azure/machine-learning/concept-ml-pipelines) allow you to create workflows in your machine learning projects. These workflows have a number of benefits including speed, simplicity, repeatability, and modularity.
+
+### Automated Machine Learning
+
+[Automated Machine Learning](https://docs.microsoft.com/azure/machine-learning/concept-automated-ml), also referred to as automated ML or AutoML, is the process of automating the time consuming, iterative tasks of machine learning model development. It allows data scientists, analysts, and developers to build ML models with high scale, efficiency, and productivity all while sustaining model quality.
+
+### Other Concepts
+
+In addition to ParallelRunStep, Pipelines and Automated Machine Learning, you'll also be working with the following concepts: [workspace](https://docs.microsoft.com/azure/machine-learning/concept-workspace), [datasets](https://docs.microsoft.com/azure/machine-learning/concept-data#datasets), [compute targets](https://docs.microsoft.com/azure/machine-learning/concept-compute-target#train), [python script steps](https://docs.microsoft.com/python/api/azureml-pipeline-steps/azureml.pipeline.steps.python_script_step.pythonscriptstep?view=azure-ml-py), and [Azure Open Datasets](https://azure.microsoft.com/services/open-datasets/).
+
+## Contributing
+
+This project welcomes contributions and suggestions. To learn more visit the [contributing](../../../CONTRIBUTING.md) section.
+
+Most contributions require you to agree to a Contributor License Agreement (CLA)
+declaring that you have the right to, and actually do, grant us
+the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
+
+When you submit a pull request, a CLA bot will automatically determine whether you need to provide
+a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
+provided by the bot. You will only need to do this once across all repos using our CLA.
+
+This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
+For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
+contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
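As a companion to the ParallelRunStep section in the README above, a hedged sketch of how a parallel step is typically assembled with the v1 SDK; the dataset name, scripts, environment choice, and compute name are placeholders, not part of this repo:

```python
from azureml.core import Dataset, Environment, Workspace
from azureml.pipeline.core import Pipeline, PipelineData
from azureml.pipeline.steps import ParallelRunConfig, ParallelRunStep

ws = Workspace.from_config()
dataset = Dataset.get_by_name(ws, "oj_sales_data")  # hypothetical dataset name
output = PipelineData(name="scores", datastore=ws.get_default_datastore())
env = Environment.get(ws, "AzureML-sklearn-0.24-ubuntu18.04-py37-cpu")  # example curated env

parallel_run_config = ParallelRunConfig(
    source_directory="scripts",    # placeholder folder
    entry_script="score.py",       # must implement init() and run(mini_batch)
    mini_batch_size="1MB",         # size-based batching for tabular input
    error_threshold=10,
    output_action="append_row",
    environment=env,
    compute_target="mm-compute",   # placeholder cluster name
    node_count=2,
    process_count_per_node=8,
)

batch_step = ParallelRunStep(
    name="many-models-batch",
    parallel_run_config=parallel_run_config,
    inputs=[dataset.as_named_input("oj_sales")],
    output=output,
)
pipeline = Pipeline(workspace=ws, steps=[batch_step])
```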
@@ -16,6 +16,13 @@
 ""
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"<font color=\"red\" size=\"5\"><strong>!Important!</strong> </br>This notebook is outdated and is not supported by the AutoML Team. Please use the supported version ([link](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines/1k_demand_forecasting_with_pipeline_components/automl-forecasting-demand-many-models-in-pipeline)).</font>"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -40,7 +47,7 @@
 "metadata": {},
 "source": [
 "### Prerequisites\n",
-"You'll need to create a compute Instance by following the instructions in the [EnvironmentSetup.md](../Setup_Resources/EnvironmentSetup.md)."
+"You'll need to create a compute instance by following [these instructions](https://learn.microsoft.com/en-us/azure/machine-learning/v1/how-to-create-manage-compute-instance?tabs=python)."
 ]
 },
 {
@@ -78,6 +85,7 @@
 "output[\"Resource Group\"] = ws.resource_group\n",
 "output[\"Location\"] = ws.location\n",
 "output[\"Default datastore name\"] = dstore.name\n",
+"output[\"SDK Version\"] = azureml.core.VERSION\n",
 "pd.set_option(\"display.max_colwidth\", None)\n",
 "outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
 "outputDf.T"
@@ -241,6 +249,34 @@
|
|||||||
")"
|
")"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"#### 2.4 Configure data with ``OutputFileDatasetConfig`` objects\n",
|
||||||
|
"This step shows how to configure output data from a pipeline step. One of the use cases for this step is when you want to do some preprocessing before feeding the data to training step. Intermediate data (or output of a step) is represented by an ``OutputFileDatasetConfig`` object. ``output_data`` is produced as the output of a step. Optionally, this data can be registered as a dataset by calling the ``register_on_complete`` method. If you create an ``OutputFileDatasetConfig`` in one step and use it as an input to another step, that data dependency between steps creates an implicit execution order in the pipeline.\n",
|
||||||
|
"\n",
|
||||||
|
"``OutputFileDatasetConfig`` objects return a directory, and by default write output to the default datastore of the workspace.\n",
|
||||||
|
"\n",
|
||||||
|
"Since instance creation for class ``OutputTabularDatasetConfig`` is not allowed, we first create an instance of this class. Then we use the ``read_parquet_files`` method to read the parquet file into ``OutputTabularDatasetConfig``."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.data.output_dataset_config import OutputFileDatasetConfig\n",
|
||||||
|
"\n",
|
||||||
|
"output_data = OutputFileDatasetConfig(\n",
|
||||||
|
" name=\"processed_data\", destination=(dstore, \"outputdataset/{run-id}/{output-name}\")\n",
|
||||||
|
").as_upload()\n",
|
||||||
|
"# output_data_dataset = output_data.register_on_complete(\n",
|
||||||
|
"# name='processed_data', description = 'files from prev step')\n",
|
||||||
|
"output_data = output_data.read_parquet_files()"
|
||||||
|
]
|
||||||
|
},
|
||||||
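As an aside on the ``destination`` argument above: the ``{run-id}`` and ``{output-name}`` tokens are placeholders that the service resolves at run time, so every pipeline run writes to its own folder on the datastore. A minimal sketch of how such a path resolves (the run id shown is hypothetical):

```python
# Hypothetical illustration of how the destination placeholders resolve;
# the actual run id is generated by the service when the pipeline is submitted.
destination_template = "outputdataset/{run-id}/{output-name}"
resolved = destination_template.replace("{run-id}", "a1b2c3d4").replace(
    "{output-name}", "processed_data"
)
print(resolved)  # outputdataset/a1b2c3d4/processed_data
```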
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -277,7 +313,7 @@
|
|||||||
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Name your cluster\n",
|
"# Name your cluster\n",
|
||||||
"compute_name = \"mm-compute\"\n",
|
"compute_name = \"mm-compute-v1\"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"if compute_name in ws.compute_targets:\n",
|
"if compute_name in ws.compute_targets:\n",
|
||||||
@@ -287,7 +323,7 @@
|
|||||||
"else:\n",
|
"else:\n",
|
||||||
" print(\"Creating a new compute target...\")\n",
|
" print(\"Creating a new compute target...\")\n",
|
||||||
" provisioning_config = AmlCompute.provisioning_configuration(\n",
|
" provisioning_config = AmlCompute.provisioning_configuration(\n",
|
||||||
" vm_size=\"STANDARD_D16S_V3\", max_nodes=20\n",
|
" vm_size=\"STANDARD_D14_V2\", max_nodes=20\n",
|
||||||
" )\n",
|
" )\n",
|
||||||
" # Create the compute target\n",
|
" # Create the compute target\n",
|
||||||
" compute_target = ComputeTarget.create(ws, compute_name, provisioning_config)\n",
|
" compute_target = ComputeTarget.create(ws, compute_name, provisioning_config)\n",
|
||||||
@@ -302,14 +338,65 @@
|
|||||||
" print(compute_target.status.serialize())"
|
" print(compute_target.status.serialize())"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Configure the training run's environment\n",
|
||||||
|
"The next step is making sure that the remote training run has all the dependencies needed by the training steps. Dependencies and the runtime context are set by creating and configuring a RunConfiguration object.\n",
|
||||||
|
"\n",
|
||||||
|
"The code below shows two options for handling dependencies. As presented, with ``USE_CURATED_ENV = True``, the configuration is based on a [curated environment](https://docs.microsoft.com/en-us/azure/machine-learning/resource-curated-environments). Curated environments have prebuilt Docker images in the [Microsoft Container Registry](https://hub.docker.com/publishers/microsoftowner). For more information, see [Azure Machine Learning curated environments](https://docs.microsoft.com/en-us/azure/machine-learning/resource-curated-environments).\n",
|
||||||
|
"\n",
|
||||||
|
"The path taken if you change ``USE_CURATED_ENV`` to False shows the pattern for explicitly setting your dependencies. In that scenario, a new custom Docker image will be created and registered in an Azure Container Registry within your resource group (see [Introduction to private Docker container registries in Azure](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-intro)). Building and registering this image can take quite a few minutes."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core.runconfig import RunConfiguration\n",
|
||||||
|
"from azureml.core.conda_dependencies import CondaDependencies\n",
|
||||||
|
"from azureml.core import Environment\n",
|
||||||
|
"\n",
|
||||||
|
"aml_run_config = RunConfiguration()\n",
|
||||||
|
"aml_run_config.target = compute_target\n",
|
||||||
|
"\n",
|
||||||
|
"USE_CURATED_ENV = True\n",
|
||||||
|
"if USE_CURATED_ENV:\n",
|
||||||
|
" curated_environment = Environment.get(\n",
|
||||||
|
" workspace=ws, name=\"AzureML-sklearn-1.5\"\n",
|
||||||
|
" )\n",
|
||||||
|
" aml_run_config.environment = curated_environment\n",
|
||||||
|
"else:\n",
|
||||||
|
" aml_run_config.environment.python.user_managed_dependencies = False\n",
|
||||||
|
"\n",
|
||||||
|
" # Add some packages relied on by data prep step\n",
|
||||||
|
" aml_run_config.environment.python.conda_dependencies = CondaDependencies.create(\n",
|
||||||
|
" conda_packages=[\"pandas\", \"scikit-learn\"],\n",
|
||||||
|
" pip_packages=[\"azureml-sdk\", \"azureml-dataset-runtime[fuse,pandas]\"],\n",
|
||||||
|
" pin_sdk_version=False,\n",
|
||||||
|
" )"
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"### Set up training parameters\n",
|
"### Set up training parameters\n",
|
||||||
"\n",
|
"\n",
|
||||||
"This dictionary defines the AutoML and many models settings. For this forecasting task we need to define several settings inncluding the name of the time column, the maximum forecast horizon, and the partition column name definition.\n",
|
"We need to provide ``ForecastingParameters``, ``AutoMLConfig`` and ``ManyModelsTrainParameters`` objects. For the forecasting task we also need to define several settings including the name of the time column, the maximum forecast horizon, and the partition column name(s) definition.\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"#### ``ForecastingParameters`` arguments\n",
|
||||||
|
"| Property | Description|\n",
|
||||||
|
"| :--------------- | :------------------- |\n",
|
||||||
|
"| **forecast_horizon** | The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly). Periods are inferred from your data. |\n",
|
||||||
|
"| **time_column_name** | The name of your time column. |\n",
|
||||||
|
"| **time_series_id_column_names** | The column names used to uniquely identify timeseries in data that has multiple rows with the same timestamp. |\n",
|
||||||
|
"| **cv_step_size** | Number of periods between two consecutive cross-validation folds. The default value is \\\"auto\\\", in which case AutoMl determines the cross-validation step size automatically, if a validation set is not provided. Or users could specify an integer value. |\n",
|
||||||
|
"\n",
|
||||||
|
"#### ``AutoMLConfig`` arguments\n",
|
||||||
"| Property | Description|\n",
|
"| Property | Description|\n",
|
||||||
"| :--------------- | :------------------- |\n",
|
"| :--------------- | :------------------- |\n",
|
||||||
"| **task** | forecasting |\n",
|
"| **task** | forecasting |\n",
|
||||||
@@ -317,16 +404,19 @@
|
|||||||
"| **blocked_models** | Blocked models won't be used by AutoML. |\n",
|
"| **blocked_models** | Blocked models won't be used by AutoML. |\n",
|
||||||
"| **iteration_timeout_minutes** | Maximum amount of time in minutes that the model can train. This is optional but provides customers with greater control on exit criteria. |\n",
|
"| **iteration_timeout_minutes** | Maximum amount of time in minutes that the model can train. This is optional but provides customers with greater control on exit criteria. |\n",
|
||||||
"| **iterations** | Number of models to train. This is optional but provides customers with greater control on exit criteria. |\n",
|
"| **iterations** | Number of models to train. This is optional but provides customers with greater control on exit criteria. |\n",
|
||||||
"| **experiment_timeout_hours** | Maximum amount of time in hours that the experiment can take before it terminates. This is optional but provides customers with greater control on exit criteria. |\n",
|
"| **experiment_timeout_hours** | Maximum amount of time in hours that each experiment can take before it terminates. This is optional but provides customers with greater control on exit criteria. **It does not control the overall timeout for the pipeline run, instead controls the timeout for each training run per partitioned time series.** |\n",
|
||||||
"| **label_column_name** | The name of the label column. |\n",
|
"| **label_column_name** | The name of the label column. |\n",
|
||||||
"| **forecast_horizon** | The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly). Periods are inferred from your data. |\n",
|
"| **n_cross_validations** | Number of cross validation splits. The default value is \\\"auto\\\", in which case AutoMl determines the number of cross-validations automatically, if a validation set is not provided. Or users could specify an integer value. Rolling Origin Validation is used to split time-series in a temporally consistent way. |\n",
|
||||||
"| **n_cross_validations** | Number of cross validation splits. Rolling Origin Validation is used to split time-series in a temporally consistent way. |\n",
|
"| **enable_early_stopping** | Flag to enable early termination if the primary metric is no longer improving. |\n",
|
||||||
"| **enable_early_stopping** | Flag to enable early termination if the score is not improving in the short term. |\n",
|
|
||||||
"| **time_column_name** | The name of your time column. |\n",
|
|
||||||
"| **enable_engineered_explanations** | Engineered feature explanations will be downloaded if enable_engineered_explanations flag is set to True. By default it is set to False to save storage space. |\n",
|
"| **enable_engineered_explanations** | Engineered feature explanations will be downloaded if enable_engineered_explanations flag is set to True. By default it is set to False to save storage space. |\n",
|
||||||
"| **time_series_id_column_name** | The column names used to uniquely identify timeseries in data that has multiple rows with the same timestamp. |\n",
|
|
||||||
"| **track_child_runs** | Flag to disable tracking of child runs. Only best run is tracked if the flag is set to False (this includes the model and metrics of the run). |\n",
|
"| **track_child_runs** | Flag to disable tracking of child runs. Only best run is tracked if the flag is set to False (this includes the model and metrics of the run). |\n",
|
||||||
"| **pipeline_fetch_max_batch_size** | Determines how many pipelines (training algorithms) to fetch at a time for training, this helps reduce throttling when training at large scale. |\n",
|
"| **pipeline_fetch_max_batch_size** | Determines how many pipelines (training algorithms) to fetch at a time for training, this helps reduce throttling when training at large scale. |\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"#### ``ManyModelsTrainParameters`` arguments\n",
|
||||||
|
"| Property | Description|\n",
|
||||||
|
"| :--------------- | :------------------- |\n",
|
||||||
|
"| **automl_settings** | The ``AutoMLConfig`` object defined above. |\n",
|
||||||
"| **partition_column_names** | The names of columns used to group your models. For timeseries, the groups must not split up individual time-series. That is, each group must contain one or more whole time-series. |"
|
"| **partition_column_names** | The names of columns used to group your models. For timeseries, the groups must not split up individual time-series. That is, each group must contain one or more whole time-series. |"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -343,28 +433,77 @@
|
|||||||
"from azureml.train.automl.runtime._many_models.many_models_parameters import (\n",
|
"from azureml.train.automl.runtime._many_models.many_models_parameters import (\n",
|
||||||
" ManyModelsTrainParameters,\n",
|
" ManyModelsTrainParameters,\n",
|
||||||
")\n",
|
")\n",
|
||||||
|
"from azureml.automl.core.forecasting_parameters import ForecastingParameters\n",
|
||||||
|
"from azureml.train.automl.automlconfig import AutoMLConfig\n",
|
||||||
"\n",
|
"\n",
|
||||||
"partition_column_names = [\"Store\", \"Brand\"]\n",
|
"partition_column_names = [\"Store\", \"Brand\"]\n",
|
||||||
"automl_settings = {\n",
|
"\n",
|
||||||
" \"task\": \"forecasting\",\n",
|
"forecasting_parameters = ForecastingParameters(\n",
|
||||||
" \"primary_metric\": \"normalized_root_mean_squared_error\",\n",
|
" time_column_name=\"WeekStarting\",\n",
|
||||||
" \"iteration_timeout_minutes\": 10, # This needs to be changed based on the dataset. We ask customer to explore how long training is taking before settings this value\n",
|
" forecast_horizon=6,\n",
|
||||||
" \"iterations\": 15,\n",
|
" time_series_id_column_names=partition_column_names,\n",
|
||||||
" \"experiment_timeout_hours\": 0.25,\n",
|
" cv_step_size=\"auto\",\n",
|
||||||
" \"label_column_name\": \"Quantity\",\n",
|
")\n",
|
||||||
" \"n_cross_validations\": 3,\n",
|
"\n",
|
||||||
" \"time_column_name\": \"WeekStarting\",\n",
|
"automl_settings = AutoMLConfig(\n",
|
||||||
" \"drop_column_names\": \"Revenue\",\n",
|
" task=\"forecasting\",\n",
|
||||||
" \"max_horizon\": 6,\n",
|
" primary_metric=\"normalized_root_mean_squared_error\",\n",
|
||||||
" \"grain_column_names\": partition_column_names,\n",
|
" iteration_timeout_minutes=10,\n",
|
||||||
" \"track_child_runs\": False,\n",
|
" iterations=15,\n",
|
||||||
"}\n",
|
" experiment_timeout_hours=0.25,\n",
|
||||||
|
" label_column_name=\"Quantity\",\n",
|
||||||
|
" n_cross_validations=\"auto\", # Feel free to set to a small integer (>=2) if runtime is an issue.\n",
|
||||||
|
" track_child_runs=False,\n",
|
||||||
|
" forecasting_parameters=forecasting_parameters,\n",
|
||||||
|
")\n",
|
||||||
"\n",
|
"\n",
|
||||||
"mm_paramters = ManyModelsTrainParameters(\n",
|
"mm_paramters = ManyModelsTrainParameters(\n",
|
||||||
" automl_settings=automl_settings, partition_column_names=partition_column_names\n",
|
" automl_settings=automl_settings, partition_column_names=partition_column_names\n",
|
||||||
")"
|
")"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Construct your pipeline steps\n",
|
||||||
|
"Once you have the compute resource and environment created, you're ready to define your pipeline's steps. There are many built-in steps available via the Azure Machine Learning SDK, as you can see on the [reference documentation for the azureml.pipeline.steps package](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-steps/azureml.pipeline.steps?view=azure-ml-py). The most flexible class is [PythonScriptStep](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-steps/azureml.pipeline.steps.python_script_step.pythonscriptstep?view=azure-ml-py), which runs a Python script.\n",
|
||||||
|
"\n",
|
||||||
|
"Your data preparation code is in a subdirectory (in this example, \"data_preprocessing_tabular.py\" in the directory \"./scripts\"). As part of the pipeline creation process, this directory is zipped and uploaded to the compute_target and the step runs the script specified as the value for ``script_name``.\n",
|
||||||
|
"\n",
|
||||||
|
"The ``arguments`` values specify the inputs and outputs of the step. In the example below, the baseline data is the ``input_ds_small`` dataset. The script data_preprocessing_tabular.py does whatever data-transformation tasks are appropriate to the task at hand and outputs the data to ``output_data``, of type ``OutputFileDatasetConfig``. For more information, see [Moving data into and between ML pipeline steps (Python)](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-move-data-in-out-of-pipelines). The step will run on the machine defined by ``compute_target``, using the configuration ``aml_run_config``.\n",
|
||||||
|
"\n",
|
||||||
|
"Reuse of previous results (``allow_reuse``) is key when using pipelines in a collaborative environment since eliminating unnecessary reruns offers agility. Reuse is the default behavior when the ``script_name``, ``inputs``, and the parameters of a step remain the same. When reuse is allowed, results from the previous run are immediately sent to the next step. If ``allow_reuse`` is set to False, a new run will always be generated for this step during pipeline execution.\n",
|
||||||
|
"\n",
|
||||||
|
"> Note that we only support partitioned FileDataset and TabularDataset without partition when using such output as input.\n",
|
||||||
|
"\n",
|
||||||
|
"> Note that we **drop column** \"Revenue\" from the dataset in this step to avoid information leak as \"Quantity\" = \"Revenue\" / \"Price\". **Please modify the logic based on your data**."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.pipeline.steps import PythonScriptStep\n",
|
||||||
|
"\n",
|
||||||
|
"dataprep_source_dir = \"./scripts\"\n",
|
||||||
|
"entry_point = \"data_preprocessing_tabular.py\"\n",
|
||||||
|
"ds_input = input_ds_small.as_named_input(\"train_10_models\")\n",
|
||||||
|
"\n",
|
||||||
|
"data_prep_step = PythonScriptStep(\n",
|
||||||
|
" script_name=entry_point,\n",
|
||||||
|
" source_directory=dataprep_source_dir,\n",
|
||||||
|
" arguments=[\"--input\", ds_input, \"--output\", output_data],\n",
|
||||||
|
" compute_target=compute_target,\n",
|
||||||
|
" runconfig=aml_run_config,\n",
|
||||||
|
" allow_reuse=False,\n",
|
||||||
|
")\n",
|
||||||
|
"\n",
|
||||||
|
"input_ds_small = output_data"
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -376,17 +515,25 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Parallel run step is leveraged to train multiple models at once. To configure the ParallelRunConfig you will need to determine the appropriate number of workers and nodes for your use case. The process_count_per_node is based off the number of cores of the compute VM. The node_count will determine the number of master nodes to use, increasing the node count will speed up the training process.\n",
|
"Parallel run step is leveraged to train multiple models at once. To configure the ParallelRunConfig you will need to determine the appropriate number of workers and nodes for your use case. The ``process_count_per_node`` is based off the number of cores of the compute VM. The node_count will determine the number of master nodes to use, increasing the node count will speed up the training process.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"| Property | Description|\n",
|
"| Property | Description|\n",
|
||||||
"| :--------------- | :------------------- |\n",
|
"| :--------------- | :------------------- |\n",
|
||||||
"| **experiment** | The experiment used for training. |\n",
|
"| **experiment** | The experiment used for training. |\n",
|
||||||
"| **train_data** | The file dataset to be used as input to the training run. |\n",
|
"| **train_data** | The file dataset to be used as input to the training run. |\n",
|
||||||
"| **node_count** | The number of compute nodes to be used for running the user script. We recommend to start with 3 and increase the node_count if the training time is taking too long. |\n",
|
"| **node_count** | The number of compute nodes to be used for running the user script. We recommend to start with 3 and increase the node_count if the training time is taking too long. |\n",
|
||||||
"| **process_count_per_node** | Process count per node, we recommend 2:1 ratio for number of cores: number of processes per node. eg. If node has 16 cores then configure 8 or less process count per node or optimal performance. |\n",
|
"| **process_count_per_node** | Process count per node, we recommend 2:1 ratio for number of cores: number of processes per node. eg. If node has 16 cores then configure 8 or less process count per node for optimal performance. |\n",
|
||||||
"| **train_pipeline_parameters** | The set of configuration parameters defined in the previous section. |\n",
|
"| **train_pipeline_parameters** | The set of configuration parameters defined in the previous section. |\n",
|
||||||
|
"| **run_invocation_timeout** | Maximum amount of time in seconds that the ``ParallelRunStep`` class is allowed. This is optional but provides customers with greater control on exit criteria. This must be greater than ``experiment_timeout_hours`` by at least 300 seconds. |\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Calling this method will create a new aggregated dataset which is generated dynamically on pipeline execution."
|
"Calling this method will create a new aggregated dataset which is generated dynamically on pipeline execution.\n",
|
||||||
|
"\n",
|
||||||
|
"**Note**: Total time taken for the **training step** in the pipeline to complete = $ \\frac{t}{ p \\times n } \\times ts $\n",
|
||||||
|
"where,\n",
|
||||||
|
"- $ t $ is time taken for training one partition (can be viewed in the training logs)\n",
|
||||||
|
"- $ p $ is ``process_count_per_node``\n",
|
||||||
|
"- $ n $ is ``node_count``\n",
|
||||||
|
"- $ ts $ is total number of partitions in time series based on ``partition_column_names``"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
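As a quick sanity check of the formula above, here is a minimal sketch with hypothetical numbers; the per-partition time $t$ must be read from your own training logs:

```python
# Hypothetical numbers for illustration only.
t = 600.0  # seconds to train one partition (read from the training logs)
p = 8      # process_count_per_node
n = 2      # node_count
ts = 10    # total number of partitions (e.g. Store x Brand combinations)

estimated_training_seconds = t / (p * n) * ts
print(f"Estimated training step duration: {estimated_training_seconds:.0f} s")

# The run_invocation_timeout rule from the table above: with
# experiment_timeout_hours = 0.25 (i.e. 900 s) plus the required 300 s margin,
# 1200 s is the minimum valid value, matching the setting used below.
print(0.25 * 3600 + 300)  # 1200.0
```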
{
|
{
|
||||||
@@ -404,7 +551,7 @@
|
|||||||
" compute_target=compute_target,\n",
|
" compute_target=compute_target,\n",
|
||||||
" node_count=2,\n",
|
" node_count=2,\n",
|
||||||
" process_count_per_node=8,\n",
|
" process_count_per_node=8,\n",
|
||||||
" run_invocation_timeout=920,\n",
|
" run_invocation_timeout=1200,\n",
|
||||||
" train_pipeline_parameters=mm_paramters,\n",
|
" train_pipeline_parameters=mm_paramters,\n",
|
||||||
")"
|
")"
|
||||||
]
|
]
|
||||||
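The hunk above shows only part of the training-steps call; a minimal sketch of the full call, assuming the ``AutoMLPipelineBuilder`` class from the ``azureml-contrib-automl-pipeline-steps`` package added to this notebook's environment (parameter names follow the tables earlier in the notebook), might look like this:

```python
from azureml.contrib.automl.pipeline.steps import AutoMLPipelineBuilder

# Illustrative sketch only; the surrounding code is elided in this diff.
train_steps = AutoMLPipelineBuilder.get_many_models_train_steps(
    experiment=experiment,
    train_data=input_ds_small,  # the preprocessed tabular dataset
    compute_target=compute_target,
    node_count=2,
    process_count_per_node=8,
    run_invocation_timeout=1200,  # seconds; exceeds the 900 s experiment timeout by 300 s
    train_pipeline_parameters=mm_paramters,  # variable name as defined in this notebook
)
```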
@@ -485,7 +632,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"### 7.2 Schedule the pipeline\n",
|
"### 5.2 Schedule the pipeline\n",
|
||||||
"You can also [schedule the pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-schedule-pipelines) to run on a time-based or change-based schedule. This could be used to automatically retrain models every month or based on another trigger such as data drift."
|
"You can also [schedule the pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-schedule-pipelines) to run on a time-based or change-based schedule. This could be used to automatically retrain models every month or based on another trigger such as data drift."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
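A minimal sketch of a monthly retraining schedule, assuming the training pipeline has already been published as ``published_pipeline`` (a hypothetical name; the ``publish`` call itself is not shown in this diff):

```python
from azureml.pipeline.core import Schedule, ScheduleRecurrence

# Retrain every month; "published_pipeline" is assumed to come from a prior
# training_pipeline.publish(...) call that is not shown here.
recurrence = ScheduleRecurrence(frequency="Month", interval=1)
schedule = Schedule.create(
    ws,
    name="many-models-monthly-retraining",
    pipeline_id=published_pipeline.id,
    experiment_name=experiment.name,
    recurrence=recurrence,
)
```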
@@ -541,25 +688,31 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"For many models we need to provide the ManyModelsInferenceParameters object.\n",
|
"For many models we need to provide the ManyModelsInferenceParameters object.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"#### ManyModelsInferenceParameters arguments\n",
|
"#### ``ManyModelsInferenceParameters`` arguments\n",
|
||||||
"| Property | Description|\n",
|
"| Property | Description|\n",
|
||||||
"| :--------------- | :------------------- |\n",
|
"| :--------------- | :------------------- |\n",
|
||||||
"| **partition_column_names** | List of column names that identifies groups. |\n",
|
"| **partition_column_names** | List of column names that identifies groups. |\n",
|
||||||
"| **target_column_name** | \\[Optional] Column name only if the inference dataset has the target. |\n",
|
"| **target_column_name** | \\[Optional] Column name only if the inference dataset has the target. |\n",
|
||||||
"| **time_column_name** | \\[Optional] Column name only if it is timeseries. |\n",
|
"| **time_column_name** | \\[Optional] Time column name only if it is timeseries. |\n",
|
||||||
"| **many_models_run_id** | \\[Optional] Many models run id where models were trained. |\n",
|
"| **inference_type** | \\[Optional] Which inference method to use on the model. Possible values are 'forecast', 'predict_proba', and 'predict'. |\n",
|
||||||
|
"| **forecast_mode** | \\[Optional] The type of forecast to be used, either 'rolling' or 'recursive'; defaults to 'recursive'. |\n",
|
||||||
|
"| **step** | \\[Optional] Number of periods to advance the forecasting window in each iteration **(for rolling forecast only)**; defaults to 1. |\n",
|
||||||
"\n",
|
"\n",
|
||||||
"#### get_many_models_batch_inference_steps arguments\n",
|
"#### ``get_many_models_batch_inference_steps`` arguments\n",
|
||||||
"| Property | Description|\n",
|
"| Property | Description|\n",
|
||||||
"| :--------------- | :------------------- |\n",
|
"| :--------------- | :------------------- |\n",
|
||||||
"| **experiment** | The experiment used for inference run. |\n",
|
"| **experiment** | The experiment used for inference run. |\n",
|
||||||
"| **inference_data** | The data to use for inferencing. It should be the same schema as used for training.\n",
|
"| **inference_data** | The data to use for inferencing. It should be the same schema as used for training.\n",
|
||||||
"| **compute_target** The compute target that runs the inference pipeline.|\n",
|
"| **compute_target** | The compute target that runs the inference pipeline. |\n",
|
||||||
"| **node_count** | The number of compute nodes to be used for running the user script. We recommend to start with the number of cores per node (varies by compute sku). |\n",
|
"| **node_count** | The number of compute nodes to be used for running the user script. We recommend to start with the number of cores per node (varies by compute sku). |\n",
|
||||||
"| **process_count_per_node** The number of processes per node.\n",
|
"| **process_count_per_node** | \\[Optional] The number of processes per node. By default it's 2 (should be at most half of the number of cores in a single node of the compute cluster that will be used for the experiment).\n",
|
||||||
"| **train_run_id** | \\[Optional] The run id of the hierarchy training, by default it is the latest successful training many model run in the experiment. |\n",
|
"| **inference_pipeline_parameters** | \\[Optional] The ``ManyModelsInferenceParameters`` object defined above. |\n",
|
||||||
|
"| **append_row_file_name** | \\[Optional] The name of the output file (optional, default value is 'parallel_run_step.txt'). Supports 'txt' and 'csv' file extension. A 'txt' file extension generates the output in 'txt' format with space as separator without column names. A 'csv' file extension generates the output in 'csv' format with comma as separator and with column names. |\n",
|
||||||
|
"| **train_run_id** | \\[Optional] The run id of the **training pipeline**. By default it is the latest successful training pipeline run in the experiment. |\n",
|
||||||
"| **train_experiment_name** | \\[Optional] The train experiment that contains the train pipeline. This one is only needed when the train pipeline is not in the same experiement as the inference pipeline. |\n",
|
"| **train_experiment_name** | \\[Optional] The train experiment that contains the train pipeline. This one is only needed when the train pipeline is not in the same experiement as the inference pipeline. |\n",
|
||||||
"| **process_count_per_node** | \\[Optional] The number of processes per node, by default it's 4. |"
|
"| **run_invocation_timeout** | \\[Optional] Maximum amount of time in seconds that the ``ParallelRunStep`` class is allowed. This is optional but provides customers with greater control on exit criteria. |\n",
|
||||||
|
"| **output_datastore** | \\[Optional] The ``Datastore`` or ``OutputDatasetConfig`` to be used for output. If specified any pipeline output will be written to that location. If unspecified the default datastore will be used. |\n",
|
||||||
|
"| **arguments** | \\[Optional] Arguments to be passed to inference script. Possible argument is '--forecast_quantiles' followed by quantile values. |"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
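To illustrate the rolling-forecast options described above, here is a hedged sketch of an alternative parameter set; it assumes ``ManyModelsInferenceParameters`` lives in the same module as the ``ManyModelsTrainParameters`` class imported earlier, and reuses this notebook's column names:

```python
from azureml.train.automl.runtime._many_models.many_models_parameters import (
    ManyModelsInferenceParameters,
)

# Sketch of a rolling-forecast configuration; forecast_mode and step are the
# optional arguments listed in the table above.
mm_parameters_rolling = ManyModelsInferenceParameters(
    partition_column_names=["Store", "Brand"],
    time_column_name="WeekStarting",
    target_column_name="Quantity",
    inference_type="forecast",
    forecast_mode="rolling",
    step=1,  # advance the forecast window one period per iteration
)
```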
{
|
{
|
||||||
@@ -579,6 +732,8 @@
|
|||||||
" target_column_name=\"Quantity\",\n",
|
" target_column_name=\"Quantity\",\n",
|
||||||
")\n",
|
")\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"output_file_name = \"parallel_run_step.csv\"\n",
|
||||||
|
"\n",
|
||||||
"inference_steps = AutoMLPipelineBuilder.get_many_models_batch_inference_steps(\n",
|
"inference_steps = AutoMLPipelineBuilder.get_many_models_batch_inference_steps(\n",
|
||||||
" experiment=experiment,\n",
|
" experiment=experiment,\n",
|
||||||
" inference_data=inference_ds_small,\n",
|
" inference_data=inference_ds_small,\n",
|
||||||
@@ -590,6 +745,8 @@
|
|||||||
" train_run_id=training_run.id,\n",
|
" train_run_id=training_run.id,\n",
|
||||||
" train_experiment_name=training_run.experiment.name,\n",
|
" train_experiment_name=training_run.experiment.name,\n",
|
||||||
" inference_pipeline_parameters=mm_parameters,\n",
|
" inference_pipeline_parameters=mm_parameters,\n",
|
||||||
|
" append_row_file_name=output_file_name,\n",
|
||||||
|
" arguments=[\"--forecast_quantiles\", 0.1, 0.9],\n",
|
||||||
")"
|
")"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -624,7 +781,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"The following code snippet:\n",
|
"The following code snippet:\n",
|
||||||
"1. Downloads the contents of the output folder that is passed in the parallel run step \n",
|
"1. Downloads the contents of the output folder that is passed in the parallel run step \n",
|
||||||
"2. Reads the parallel_run_step.txt file that has the predictions as pandas dataframe and \n",
|
"2. Reads the output file that has the predictions as pandas dataframe and \n",
|
||||||
"3. Displays the top 10 rows of the predictions"
|
"3. Displays the top 10 rows of the predictions"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -639,19 +796,9 @@
|
|||||||
"forecasting_results_name = \"forecasting_results\"\n",
|
"forecasting_results_name = \"forecasting_results\"\n",
|
||||||
"forecasting_output_name = \"many_models_inference_output\"\n",
|
"forecasting_output_name = \"many_models_inference_output\"\n",
|
||||||
"forecast_file = get_output_from_mm_pipeline(\n",
|
"forecast_file = get_output_from_mm_pipeline(\n",
|
||||||
" inference_run, forecasting_results_name, forecasting_output_name\n",
|
" inference_run, forecasting_results_name, forecasting_output_name, output_file_name\n",
|
||||||
")\n",
|
")\n",
|
||||||
"df = pd.read_csv(forecast_file, delimiter=\" \", header=None)\n",
|
"df = pd.read_csv(forecast_file)\n",
|
||||||
"df.columns = [\n",
|
|
||||||
" \"Week Starting\",\n",
|
|
||||||
" \"Store\",\n",
|
|
||||||
" \"Brand\",\n",
|
|
||||||
" \"Quantity\",\n",
|
|
||||||
" \"Advert\",\n",
|
|
||||||
" \"Price\",\n",
|
|
||||||
" \"Revenue\",\n",
|
|
||||||
" \"Predicted\",\n",
|
|
||||||
"]\n",
|
|
||||||
"print(\n",
|
"print(\n",
|
||||||
" \"Prediction has \", df.shape[0], \" rows. Here the first 10 rows are being displayed.\"\n",
|
" \"Prediction has \", df.shape[0], \" rows. Here the first 10 rows are being displayed.\"\n",
|
||||||
")\n",
|
")\n",
|
||||||
@@ -724,9 +871,9 @@
|
|||||||
"automated-machine-learning"
|
"automated-machine-learning"
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
@@ -738,7 +885,12 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.8"
|
"version": "3.8.10"
|
||||||
|
},
|
||||||
|
"vscode": {
|
||||||
|
"interpreter": {
|
||||||
|
"hash": "6bd77c88278e012ef31757c15997a7bea8c943977c43d6909403c00ae11d43ca"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
|
|||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-forecasting-many-models
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
|
After Width: | Height: | Size: 32 KiB |
|
After Width: | Height: | Size: 306 KiB |
|
After Width: | Height: | Size: 2.6 MiB |
|
After Width: | Height: | Size: 106 KiB |
|
After Width: | Height: | Size: 158 KiB |
|
After Width: | Height: | Size: 80 KiB |
|
After Width: | Height: | Size: 68 KiB |
|
After Width: | Height: | Size: 631 KiB |
@@ -0,0 +1,39 @@
|
|||||||
|
from pathlib import Path
|
||||||
|
from azureml.core import Run
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import os
|
||||||
|
|
||||||
|
|
||||||
|
def main(args):
|
||||||
|
output = Path(args.output)
|
||||||
|
output.mkdir(parents=True, exist_ok=True)
|
||||||
|
|
||||||
|
run_context = Run.get_context()
|
||||||
|
input_path = run_context.input_datasets["train_10_models"]
|
||||||
|
|
||||||
|
for file_name in os.listdir(input_path):
|
||||||
|
input_file = os.path.join(input_path, file_name)
|
||||||
|
with open(input_file, "r") as f:
|
||||||
|
content = f.read()
|
||||||
|
|
||||||
|
# Apply any data pre-processing techniques here
|
||||||
|
|
||||||
|
output_file = os.path.join(output, file_name)
|
||||||
|
with open(output_file, "w") as f:
|
||||||
|
f.write(content)
|
||||||
|
|
||||||
|
|
||||||
|
def my_parse_args():
|
||||||
|
parser = argparse.ArgumentParser("Test")
|
||||||
|
|
||||||
|
parser.add_argument("--input", type=str)
|
||||||
|
parser.add_argument("--output", type=str)
|
||||||
|
|
||||||
|
args = parser.parse_args()
|
||||||
|
return args
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
args = my_parse_args()
|
||||||
|
main(args)
|
||||||
@@ -0,0 +1,37 @@
|
|||||||
|
from pathlib import Path
|
||||||
|
from azureml.core import Run
|
||||||
|
import argparse
|
||||||
|
|
||||||
|
|
||||||
|
def main(args):
|
||||||
|
output = Path(args.output)
|
||||||
|
output.mkdir(parents=True, exist_ok=True)
|
||||||
|
|
||||||
|
run_context = Run.get_context()
|
||||||
|
dataset = run_context.input_datasets["train_10_models"]
|
||||||
|
df = dataset.to_pandas_dataframe()
|
||||||
|
|
||||||
|
# Drop the column "Revenue" from the dataset to avoid information leak as
|
||||||
|
# "Quantity" = "Revenue" / "Price". Please modify the logic based on your data.
|
||||||
|
drop_column_name = "Revenue"
|
||||||
|
if drop_column_name in df.columns:
|
||||||
|
df.drop(drop_column_name, axis=1, inplace=True)
|
||||||
|
|
||||||
|
# Apply any data pre-processing techniques here
|
||||||
|
|
||||||
|
df.to_parquet(output / "data_prepared_result.parquet", compression=None)
|
||||||
|
|
||||||
|
|
||||||
|
def my_parse_args():
|
||||||
|
parser = argparse.ArgumentParser("Test")
|
||||||
|
|
||||||
|
parser.add_argument("--input", type=str)
|
||||||
|
parser.add_argument("--output", type=str)
|
||||||
|
|
||||||
|
args = parser.parse_args()
|
||||||
|
return args
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
args = my_parse_args()
|
||||||
|
main(args)
|
||||||
@@ -0,0 +1,3 @@
|
|||||||
|
dependencies:
|
||||||
|
- pip:
|
||||||
|
- azureml-contrib-automl-pipeline-steps
|
||||||
@@ -16,6 +16,13 @@
|
|||||||
""
|
""
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"<font color=\"red\" size=\"5\"><strong>!Important!</strong> </br>This notebook is outdated and is not supported by the AutoML Team. Please use the supported version ([link](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/automl-standalone-jobs/automl-forecasting-orange-juice-sales)).</font>"
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -40,7 +47,7 @@
|
|||||||
"## Introduction<a id=\"introduction\"></a>\n",
|
"## Introduction<a id=\"introduction\"></a>\n",
|
||||||
"In this example, we use AutoML to train, select, and operationalize a time-series forecasting model for multiple time-series.\n",
|
"In this example, we use AutoML to train, select, and operationalize a time-series forecasting model for multiple time-series.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Make sure you have executed the [configuration notebook](../../../configuration.ipynb) before running this notebook.\n",
|
"Make sure you have executed the [configuration notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) before running this notebook.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"The examples in the follow code samples use the University of Chicago's Dominick's Finer Foods dataset to forecast orange juice sales. Dominick's was a grocery chain in the Chicago metropolitan area."
|
"The examples in the follow code samples use the University of Chicago's Dominick's Finer Foods dataset to forecast orange juice sales. Dominick's was a grocery chain in the Chicago metropolitan area."
|
||||||
]
|
]
|
||||||
@@ -112,6 +119,7 @@
|
|||||||
"output[\"Resource Group\"] = ws.resource_group\n",
|
"output[\"Resource Group\"] = ws.resource_group\n",
|
||||||
"output[\"Location\"] = ws.location\n",
|
"output[\"Location\"] = ws.location\n",
|
||||||
"output[\"Run History Name\"] = experiment_name\n",
|
"output[\"Run History Name\"] = experiment_name\n",
|
||||||
|
"output[\"SDK Version\"] = azureml.core.VERSION\n",
|
||||||
"pd.set_option(\"display.max_colwidth\", None)\n",
|
"pd.set_option(\"display.max_colwidth\", None)\n",
|
||||||
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
||||||
"outputDf.T"
|
"outputDf.T"
|
||||||
@@ -241,7 +249,9 @@
|
|||||||
" time_series_id_column_names, group_keys=False\n",
|
" time_series_id_column_names, group_keys=False\n",
|
||||||
" )\n",
|
" )\n",
|
||||||
" df_head = df_grouped.apply(lambda dfg: dfg.iloc[:-n])\n",
|
" df_head = df_grouped.apply(lambda dfg: dfg.iloc[:-n])\n",
|
||||||
|
" df_head.reset_index(inplace=True, drop=True)\n",
|
||||||
" df_tail = df_grouped.apply(lambda dfg: dfg.iloc[-n:])\n",
|
" df_tail = df_grouped.apply(lambda dfg: dfg.iloc[-n:])\n",
|
||||||
|
" df_tail.reset_index(inplace=True, drop=True)\n",
|
||||||
" return df_head, df_tail\n",
|
" return df_head, df_tail\n",
|
||||||
"\n",
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -367,7 +377,8 @@
|
|||||||
"|**time_column_name**|The name of your time column.|\n",
|
"|**time_column_name**|The name of your time column.|\n",
|
||||||
"|**forecast_horizon**|The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly).|\n",
|
"|**forecast_horizon**|The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly).|\n",
|
||||||
"|**time_series_id_column_names**|The column names used to uniquely identify the time series in data that has multiple rows with the same timestamp. If the time series identifiers are not defined, the data set is assumed to be one time series.|\n",
|
"|**time_series_id_column_names**|The column names used to uniquely identify the time series in data that has multiple rows with the same timestamp. If the time series identifiers are not defined, the data set is assumed to be one time series.|\n",
|
||||||
"|**freq**|Forecast frequency. This optional parameter represents the period with which the forecast is desired, for example, daily, weekly, yearly, etc. Use this parameter for the correction of time series containing irregular data points or for padding of short time series. The frequency needs to be a pandas offset alias. Please refer to [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects) for more information."
|
"|**freq**|Forecast frequency. This optional parameter represents the period with which the forecast is desired, for example, daily, weekly, yearly, etc. Use this parameter for the correction of time series containing irregular data points or for padding of short time series. The frequency needs to be a pandas offset alias. Please refer to [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects) for more information.\n",
|
||||||
|
"|**cv_step_size**|Number of periods between two consecutive cross-validation folds. The default value is \"auto\", in which case AutoMl determines the cross-validation step size automatically, if a validation set is not provided. Or users could specify an integer value."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
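Since ``freq`` must be a pandas offset alias, a quick way to validate one before passing it to ``ForecastingParameters`` (this snippet only uses pandas, which this notebook already imports):

```python
from pandas.tseries.frequencies import to_offset

# "W-THU" denotes weekly periods anchored on Thursdays, matching the weekly
# orange juice sales data used later in this notebook.
print(to_offset("W-THU"))  # <Week: weekday=3>
```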
{
|
{
|
||||||
@@ -389,7 +400,7 @@
|
|||||||
"In the first case, AutoML loops over all time-series in your dataset and trains one model (e.g. AutoArima or Prophet, as the case may be) for each series. This can result in long runtimes to train these models if there are a lot of series in the data. One way to mitigate this problem is to fit models for different series in parallel if you have multiple compute cores available. To enable this behavior, set the `max_cores_per_iteration` parameter in your AutoMLConfig as shown in the example in the next cell. \n",
|
"In the first case, AutoML loops over all time-series in your dataset and trains one model (e.g. AutoArima or Prophet, as the case may be) for each series. This can result in long runtimes to train these models if there are a lot of series in the data. One way to mitigate this problem is to fit models for different series in parallel if you have multiple compute cores available. To enable this behavior, set the `max_cores_per_iteration` parameter in your AutoMLConfig as shown in the example in the next cell. \n",
|
||||||
"\n",
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Finally, a note about the cross-validation (CV) procedure for time-series data. AutoML uses out-of-sample error estimates to select a best pipeline/model, so it is important that the CV fold splitting is done correctly. Time-series can violate the basic statistical assumptions of the canonical K-Fold CV strategy, so AutoML implements a [rolling origin validation](https://robjhyndman.com/hyndsight/tscv/) procedure to create CV folds for time-series data. To use this procedure, you just need to specify the desired number of CV folds in the AutoMLConfig object. It is also possible to bypass CV and use your own validation set by setting the *validation_data* parameter of AutoMLConfig.\n",
|
"Finally, a note about the cross-validation (CV) procedure for time-series data. AutoML uses out-of-sample error estimates to select a best pipeline/model, so it is important that the CV fold splitting is done correctly. Time-series can violate the basic statistical assumptions of the canonical K-Fold CV strategy, so AutoML implements a [rolling origin validation](https://robjhyndman.com/hyndsight/tscv/) procedure to create CV folds for time-series data. To use this procedure, you could specify the desired number of CV folds and the number of periods between two consecutive folds in the AutoMLConfig object, or AutoMl could set them automatically if you don't specify them. It is also possible to bypass CV and use your own validation set by setting the *validation_data* parameter of AutoMLConfig.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Here is a summary of AutoMLConfig parameters used for training the OJ model:\n",
|
"Here is a summary of AutoMLConfig parameters used for training the OJ model:\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -402,7 +413,7 @@
|
|||||||
"|**training_data**|Input dataset, containing both features and label column.|\n",
|
"|**training_data**|Input dataset, containing both features and label column.|\n",
|
||||||
"|**label_column_name**|The name of the label column.|\n",
|
"|**label_column_name**|The name of the label column.|\n",
|
||||||
"|**compute_target**|The remote compute for training.|\n",
|
"|**compute_target**|The remote compute for training.|\n",
|
||||||
"|**n_cross_validations**|Number of cross-validation folds to use for model/pipeline selection|\n",
|
"|**n_cross_validations**|Number of cross-validation folds to use for model/pipeline selection. The default value is \"auto\", in which case AutoMl determines the number of cross-validations automatically, if a validation set is not provided. Or users could specify an integer value.\n",
|
||||||
"|**enable_voting_ensemble**|Allow AutoML to create a Voting ensemble of the best performing models|\n",
|
"|**enable_voting_ensemble**|Allow AutoML to create a Voting ensemble of the best performing models|\n",
|
||||||
"|**enable_stack_ensemble**|Allow AutoML to create a Stack ensemble of the best performing models|\n",
|
"|**enable_stack_ensemble**|Allow AutoML to create a Stack ensemble of the best performing models|\n",
|
||||||
"|**debug_log**|Log file path for writing debugging information|\n",
|
"|**debug_log**|Log file path for writing debugging information|\n",
|
||||||
@@ -423,6 +434,7 @@
|
|||||||
" forecast_horizon=n_test_periods,\n",
|
" forecast_horizon=n_test_periods,\n",
|
||||||
" time_series_id_column_names=time_series_id_column_names,\n",
|
" time_series_id_column_names=time_series_id_column_names,\n",
|
||||||
" freq=\"W-THU\", # Set the forecast frequency to be weekly (start on each Thursday)\n",
|
" freq=\"W-THU\", # Set the forecast frequency to be weekly (start on each Thursday)\n",
|
||||||
|
" cv_step_size=\"auto\",\n",
|
||||||
")\n",
|
")\n",
|
||||||
"\n",
|
"\n",
|
||||||
"automl_config = AutoMLConfig(\n",
|
"automl_config = AutoMLConfig(\n",
|
||||||
@@ -435,7 +447,7 @@
|
|||||||
" compute_target=compute_target,\n",
|
" compute_target=compute_target,\n",
|
||||||
" enable_early_stopping=True,\n",
|
" enable_early_stopping=True,\n",
|
||||||
" featurization=featurization_config,\n",
|
" featurization=featurization_config,\n",
|
||||||
" n_cross_validations=3,\n",
|
" n_cross_validations=\"auto\", # Feel free to set to a small integer (>=2) if runtime is an issue.\n",
|
||||||
" verbosity=logging.INFO,\n",
|
" verbosity=logging.INFO,\n",
|
||||||
" max_cores_per_iteration=-1,\n",
|
" max_cores_per_iteration=-1,\n",
|
||||||
" forecasting_parameters=forecasting_parameters,\n",
|
" forecasting_parameters=forecasting_parameters,\n",
|
||||||
@@ -712,7 +724,7 @@
|
|||||||
" description=\"Automl forecasting sample service\",\n",
|
" description=\"Automl forecasting sample service\",\n",
|
||||||
")\n",
|
")\n",
|
||||||
"\n",
|
"\n",
|
||||||
"aci_service_name = \"automl-oj-forecast-01\"\n",
|
"aci_service_name = \"automl-oj-forecast-03\"\n",
|
||||||
"print(aci_service_name)\n",
|
"print(aci_service_name)\n",
|
||||||
"aci_service = Model.deploy(ws, aci_service_name, [model], inference_config, aciconfig)\n",
|
"aci_service = Model.deploy(ws, aci_service_name, [model], inference_config, aciconfig)\n",
|
||||||
"aci_service.wait_for_deployment(True)\n",
|
"aci_service.wait_for_deployment(True)\n",
|
||||||
@@ -789,7 +801,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"serv = Webservice(ws, \"automl-oj-forecast-01\")\n",
|
"serv = Webservice(ws, \"automl-oj-forecast-03\")\n",
|
||||||
"serv.delete() # don't do it accidentally"
|
"serv.delete() # don't do it accidentally"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
@@ -818,9 +830,9 @@
|
|||||||
"friendly_name": "Forecasting orange juice sales with deployment",
|
"friendly_name": "Forecasting orange juice sales with deployment",
|
||||||
"index_order": 1,
|
"index_order": 1,
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
@@ -832,12 +844,17 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.9"
|
"version": "3.8.10"
|
||||||
},
|
},
|
||||||
"tags": [
|
"tags": [
|
||||||
"None"
|
"None"
|
||||||
],
|
],
|
||||||
"task": "Forecasting"
|
"task": "Forecasting",
|
||||||
|
"vscode": {
|
||||||
|
"interpreter": {
|
||||||
|
"hash": "6bd77c88278e012ef31757c15997a7bea8c943977c43d6909403c00ae11d43ca"
|
||||||
|
}
|
||||||
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
"nbformat_minor": 4
|
"nbformat_minor": 4
|
||||||
|
|||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-forecasting-orange-juice-sales
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -6,7 +6,7 @@ compute instance.
|
|||||||
|
|
||||||
import argparse
|
import argparse
|
||||||
from azureml.core import Dataset, Run
|
from azureml.core import Dataset, Run
|
||||||
from sklearn.externals import joblib
|
import joblib
|
||||||
from pandas.tseries.frequencies import to_offset
|
from pandas.tseries.frequencies import to_offset
|
||||||
|
|
||||||
parser = argparse.ArgumentParser()
|
parser = argparse.ArgumentParser()
|
||||||
|
|||||||
@@ -0,0 +1,834 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
||||||
|
"\n",
|
||||||
|
"Licensed under the MIT License."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"<font color=\"red\" size=\"5\"><strong>!Important!</strong> </br>This notebook is outdated and is not supported by the AutoML Team. Please use the supported version ([link](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines/1h_automl_in_pipeline/automl-forecasting-in-pipeline)).</font>\n",
|
||||||
|
"</br>\n",
|
||||||
|
"</br>\n",
|
||||||
|
"<font color=\"red\" size=\"5\">\n",
|
||||||
|
"For examples illustrating how to build pipelines with components, please use the following links:</font>\n",
|
||||||
|
"<ul>\n",
|
||||||
|
" <li><a href=\"https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines/1k_demand_forecasting_with_pipeline_components/automl-forecasting-demand-many-models-in-pipeline\">Many Models</a></li>\n",
|
||||||
|
" <li><a href=\"https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/pipelines/1k_demand_forecasting_with_pipeline_components/automl-forecasting-demand-hierarchical-timeseries-in-pipeline\">Hierarchical Time Series</a></li>\n",
|
||||||
|
" <li><a href=\"https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/automl-standalone-jobs/automl-forecasting-distributed-tcn\">Distributed TCN</a></li>\n",
|
||||||
|
"</ul>"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Training and Inferencing AutoML Forecasting Model Using Pipelines"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Introduction\n",
|
||||||
|
"\n",
|
||||||
|
"In this notebook, we demonstrate how to use piplines to train and inference on AutoML Forecasting model. Two pipelines will be created: one for training AutoML model, and the other is for inference on AutoML model. We'll also demonstrate how to schedule the inference pipeline so you can get inference results periodically (with refreshed test dataset). Make sure you have executed the [configuration notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) before running this notebook. In this notebook you will learn how to:\n",
|
||||||
|
"\n",
|
||||||
|
"- Configure AutoML using AutoMLConfig for forecasting tasks using pipeline AutoMLSteps.\n",
|
||||||
|
"- Create and register an AutoML model using AzureML pipeline.\n",
|
||||||
|
"- Inference and schdelue the pipeline using registered model."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Setup\n",
|
||||||
|
"\n",
|
||||||
|
"As part of the setup you have already created an Azure ML `Workspace` object. For AutoML you will need to create an `Experiment` object, which is a named object in a `Workspace` used to run experiments."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import json\n",
|
||||||
|
"import logging\n",
|
||||||
|
"import os\n",
|
||||||
|
"\n",
|
||||||
|
"import pandas as pd\n",
|
||||||
|
"\n",
|
||||||
|
"import azureml.core\n",
|
||||||
|
"from azureml.core.experiment import Experiment\n",
|
||||||
|
"from azureml.core.workspace import Workspace\n",
|
||||||
|
"from azureml.train.automl import AutoMLConfig"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"This sample notebook may use features that are not available in previous versions of the Azure ML SDK."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"print(\"This notebook was created using version 1.38.0 of the Azure ML SDK\")\n",
|
||||||
|
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Accessing the Azure ML workspace requires authentication with Azure.\n",
|
||||||
|
"\n",
|
||||||
|
"The default authentication is interactive authentication using the default tenant. Executing the ws = Workspace.from_config() line in the cell below will prompt for authentication the first time that it is run.\n",
|
||||||
|
"\n",
|
||||||
|
"If you have multiple Azure tenants, you can specify the tenant by replacing the ws = Workspace.from_config() line in the cell below with the following:\n",
|
||||||
|
"```\n",
|
||||||
|
"from azureml.core.authentication import InteractiveLoginAuthentication\n",
|
||||||
|
"auth = InteractiveLoginAuthentication(tenant_id = 'mytenantid')\n",
|
||||||
|
"ws = Workspace.from_config(auth = auth)\n",
|
||||||
|
"```\n",
|
||||||
|
"If you need to run in an environment where interactive login is not possible, you can use Service Principal authentication by replacing the ws = Workspace.from_config() line in the cell below with the following:\n",
|
||||||
|
"```\n",
|
||||||
|
"from azureml.core.authentication import ServicePrincipalAuthentication\n",
|
||||||
|
"auth = ServicePrincipalAuthentication('mytenantid', 'myappid', 'mypassword')\n",
|
||||||
|
"ws = Workspace.from_config(auth = auth)\n",
|
||||||
|
"```\n",
|
||||||
|
"For more details, see aka.ms/aml-notebook-auth"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"ws = Workspace.from_config()\n",
|
||||||
|
"dstor = ws.get_default_datastore()\n",
|
||||||
|
"\n",
|
||||||
|
"# Choose a name for the run history container in the workspace.\n",
|
||||||
|
"experiment_name = \"forecasting-pipeline\"\n",
|
||||||
|
"experiment = Experiment(ws, experiment_name)\n",
|
||||||
|
"\n",
|
||||||
|
"output = {}\n",
|
||||||
|
"output[\"Subscription ID\"] = ws.subscription_id\n",
|
||||||
|
"output[\"Workspace\"] = ws.name\n",
|
||||||
|
"output[\"Resource Group\"] = ws.resource_group\n",
|
||||||
|
"output[\"Location\"] = ws.location\n",
|
||||||
|
"output[\"Run History Name\"] = experiment_name\n",
|
||||||
|
"pd.set_option(\"display.max_colwidth\", None)\n",
|
||||||
|
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
||||||
|
"outputDf.T"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Compute"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Compute \n",
|
||||||
|
"\n",
|
||||||
|
"#### Create or Attach existing AmlCompute\n",
|
||||||
|
"\n",
|
||||||
|
"You will need to create a compute target for your AutoML run. In this tutorial, you create AmlCompute as your training compute resource.\n",
|
||||||
|
"\n",
|
||||||
|
"> Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.\n",
|
||||||
|
"\n",
|
||||||
|
"#### Creation of AmlCompute takes approximately 5 minutes. \n",
|
||||||
|
"If the AmlCompute with that name is already in your workspace this code will skip the creation process.\n",
|
||||||
|
"As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
||||||
|
"from azureml.core.compute_target import ComputeTargetException\n",
|
||||||
|
"\n",
|
||||||
|
"# Choose a name for your CPU cluster\n",
|
||||||
|
"amlcompute_cluster_name = \"forecast-step-cluster\"\n",
|
||||||
|
"\n",
|
||||||
|
"# Verify that cluster does not exist already\n",
|
||||||
|
"try:\n",
|
||||||
|
" compute_target = ComputeTarget(workspace=ws, name=amlcompute_cluster_name)\n",
|
||||||
|
" print(\"Found existing cluster, use it.\")\n",
|
||||||
|
"except ComputeTargetException:\n",
|
||||||
|
" compute_config = AmlCompute.provisioning_configuration(\n",
|
||||||
|
" vm_size=\"STANDARD_DS12_V2\", max_nodes=4\n",
|
||||||
|
" )\n",
|
||||||
|
" compute_target = ComputeTarget.create(ws, amlcompute_cluster_name, compute_config)\n",
|
||||||
|
"compute_target.wait_for_completion(show_output=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Data\n",
|
||||||
|
"You are now ready to load the historical orange juice sales data. For demonstration purposes, we extract sales time-series for just a few of the stores. We will load the CSV file into a plain pandas DataFrame; the time column in the CSV is called _WeekStarting_, so it will be specially parsed into the datetime type."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"time_column_name = \"WeekStarting\"\n",
|
||||||
|
"train = pd.read_csv(\"oj-train.csv\", parse_dates=[time_column_name])\n",
|
||||||
|
"\n",
|
||||||
|
"train.head()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Each row in the DataFrame holds a quantity of weekly sales for an OJ brand at a single store. The data also includes the sales price, a flag indicating if the OJ brand was advertised in the store that week, and some customer demographic information based on the store location. For historical reasons, the data also include the logarithm of the sales quantity. The Dominick's grocery data is commonly used to illustrate econometric modeling techniques where logarithms of quantities are generally preferred. \n",
|
||||||
|
"\n",
|
||||||
|
"The task is now to build a time-series model for the _Quantity_ column. It is important to note that this dataset is comprised of many individual time-series - one for each unique combination of _Store_ and _Brand_. To distinguish the individual time-series, we define the **time_series_id_column_names** - the columns whose values determine the boundaries between time-series: "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"time_series_id_column_names = [\"Store\", \"Brand\"]\n",
|
||||||
|
"nseries = train.groupby(time_series_id_column_names).ngroups\n",
|
||||||
|
"print(\"Data contains {0} individual time-series.\".format(nseries))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Test Splitting\n",
|
||||||
|
"We now split the data into a training and a testing set for later forecast prediction. The test set will contain the final 4 weeks of observed sales for each time-series. The splits should be stratified by series, so we use a group-by statement on the time series identifier columns."
|
||||||
|
]
|
||||||
|
},
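{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, below is a minimal sketch of how such a stratified split could be produced with a group-by, assuming `train` holds the full history (the pre-made `oj-test.csv` split used above was generated in a similar fashion):\n",
"```\n",
"def split_last_n_by_series_id(df, n):\n",
"    \"\"\"Group df by series identifiers and split off the last n rows of each group.\"\"\"\n",
"    df_grouped = df.sort_values(time_column_name).groupby(\n",
"        time_series_id_column_names, group_keys=False\n",
"    )\n",
"    df_head = df_grouped.apply(lambda dfg: dfg.iloc[:-n])\n",
"    df_tail = df_grouped.apply(lambda dfg: dfg.iloc[-n:])\n",
"    return df_head, df_tail\n",
"\n",
"# train_part, test_part = split_last_n_by_series_id(train, n_test_periods)\n",
"```"
]
},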
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"n_test_periods = 4\n",
|
||||||
|
"\n",
|
||||||
|
"test = pd.read_csv(\"oj-test.csv\", parse_dates=[time_column_name])"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Upload data to datastore\n",
|
||||||
|
"The [Machine Learning service workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-workspace), is paired with the storage account, which contains the default data store. We will use it to upload the train and test data and create [tabular datasets](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py) for training and testing. A tabular dataset defines a series of lazily-evaluated, immutable operations to load data from the data source into tabular representation."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.data.dataset_factory import TabularDatasetFactory\n",
|
||||||
|
"\n",
|
||||||
|
"datastore = ws.get_default_datastore()\n",
|
||||||
|
"train_dataset = TabularDatasetFactory.register_pandas_dataframe(\n",
|
||||||
|
" train, target=(datastore, \"dataset/\"), name=\"dominicks_OJ_train_pipeline\"\n",
|
||||||
|
")\n",
|
||||||
|
"\n",
|
||||||
|
"test_dataset = TabularDatasetFactory.register_pandas_dataframe(\n",
|
||||||
|
" test, target=(datastore, \"dataset/\"), name=\"dominicks_OJ_test_pipeline\"\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Training"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Modeling\n",
|
||||||
|
"\n",
|
||||||
|
"For forecasting tasks, AutoML uses pre-processing and estimation steps that are specific to time-series. AutoML will undertake the following pre-processing steps:\n",
|
||||||
|
"* Detect time-series sample frequency (e.g. hourly, daily, weekly) and create new records for absent time points to make the series regular. A regular time series has a well-defined frequency and has a value at every sample point in a contiguous time span \n",
|
||||||
|
"* Impute missing values in the target (via forward-fill) and feature columns (using median column values) \n",
|
||||||
|
"* Create features based on time series identifiers to enable fixed effects across different series\n",
|
||||||
|
"* Create time-based features to assist in learning seasonal patterns\n",
|
||||||
|
"* Encode categorical variables to numeric quantities\n",
|
||||||
|
"\n",
|
||||||
|
"In this notebook, AutoML will train a single, regression-type model across **all** time-series in a given training set. This allows the model to generalize across related series. If you're looking for training multiple models for different time-series, please see the many-models notebook.\n",
|
||||||
|
"\n",
|
||||||
|
"You are almost ready to start an AutoML training job. First, we need to define the target column."
|
||||||
|
]
|
||||||
|
},
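{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of the imputation step listed above (AutoML performs this internally; the pandas code here is only a hedged sketch using this dataset's column names):\n",
"```\n",
"df = train.sort_values(\"WeekStarting\").copy()\n",
"# Forward-fill missing target values within each series\n",
"df[\"Quantity\"] = df.groupby([\"Store\", \"Brand\"])[\"Quantity\"].ffill()\n",
"# Impute a missing numeric feature with its median value\n",
"df[\"Price\"] = df[\"Price\"].fillna(df[\"Price\"].median())\n",
"```"
]
},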
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"target_column_name = \"Quantity\""
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Forecasting Parameters\n",
|
||||||
|
"To define forecasting parameters for your experiment training, you can leverage the ForecastingParameters class. The table below details the forecasting parameter we will be passing into our experiment.\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"|Property|Description|\n",
|
||||||
|
"|-|-|\n",
|
||||||
|
"|**time_column_name**|The name of your time column.|\n",
|
||||||
|
"|**forecast_horizon**|The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly).|\n",
|
||||||
|
"|**time_series_id_column_names**|The column names used to uniquely identify the time series in data that has multiple rows with the same timestamp. If the time series identifiers are not defined, the data set is assumed to be one time series.|\n",
|
||||||
|
"|**freq**|Forecast frequency. This optional parameter represents the period with which the forecast is desired, for example, daily, weekly, yearly, etc. Use this parameter for the correction of time series containing irregular data points or for padding of short time series. The frequency needs to be a pandas offset alias. Please refer to [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects) for more information.\n",
|
||||||
|
"|**cv_step_size**|Number of periods between two consecutive cross-validation folds. The default value is \"auto\", in which case AutoMl determines the cross-validation step size automatically, if a validation set is not provided. Or users could specify an integer value."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.automl.core.forecasting_parameters import ForecastingParameters\n",
|
||||||
|
"\n",
|
||||||
|
"forecasting_parameters = ForecastingParameters(\n",
|
||||||
|
" time_column_name=time_column_name,\n",
|
||||||
|
" forecast_horizon=n_test_periods,\n",
|
||||||
|
" time_series_id_column_names=time_series_id_column_names,\n",
|
||||||
|
" freq=\"W-THU\", # Set the forecast frequency to be weekly (start on each Thursday),\n",
|
||||||
|
" cv_step_size=\"auto\",\n",
|
||||||
|
")\n",
|
||||||
|
"\n",
|
||||||
|
"automl_config = AutoMLConfig(\n",
|
||||||
|
" task=\"forecasting\",\n",
|
||||||
|
" debug_log=\"automl_oj_sales_errors.log\",\n",
|
||||||
|
" primary_metric=\"normalized_mean_absolute_error\",\n",
|
||||||
|
" experiment_timeout_hours=0.25,\n",
|
||||||
|
" training_data=train_dataset,\n",
|
||||||
|
" label_column_name=target_column_name,\n",
|
||||||
|
" compute_target=compute_target,\n",
|
||||||
|
" enable_early_stopping=True,\n",
|
||||||
|
" n_cross_validations=\"auto\", # Feel free to set to a small integer (>=2) if runtime is an issue.\n",
|
||||||
|
" verbosity=logging.INFO,\n",
|
||||||
|
" max_cores_per_iteration=-1,\n",
|
||||||
|
" forecasting_parameters=forecasting_parameters,\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.pipeline.core import PipelineData, TrainingOutput\n",
|
||||||
|
"from azureml.pipeline.steps import AutoMLStep\n",
|
||||||
|
"from azureml.pipeline.core import Pipeline, PipelineParameter\n",
|
||||||
|
"from azureml.pipeline.steps import PythonScriptStep\n",
|
||||||
|
"\n",
|
||||||
|
"metrics_output_name = \"metrics_output\"\n",
|
||||||
|
"best_model_output_name = \"best_model_output\"\n",
|
||||||
|
"model_file_name = \"model_file\"\n",
|
||||||
|
"metrics_data_name = \"metrics_data\"\n",
|
||||||
|
"\n",
|
||||||
|
"metrics_data = PipelineData(\n",
|
||||||
|
" name=metrics_data_name,\n",
|
||||||
|
" datastore=datastore,\n",
|
||||||
|
" pipeline_output_name=metrics_output_name,\n",
|
||||||
|
" training_output=TrainingOutput(type=\"Metrics\"),\n",
|
||||||
|
")\n",
|
||||||
|
"model_data = PipelineData(\n",
|
||||||
|
" name=model_file_name,\n",
|
||||||
|
" datastore=datastore,\n",
|
||||||
|
" pipeline_output_name=best_model_output_name,\n",
|
||||||
|
" training_output=TrainingOutput(type=\"Model\"),\n",
|
||||||
|
")\n",
|
||||||
|
"\n",
|
||||||
|
"automl_step = AutoMLStep(\n",
|
||||||
|
" name=\"automl_module\",\n",
|
||||||
|
" automl_config=automl_config,\n",
|
||||||
|
" outputs=[metrics_data, model_data],\n",
|
||||||
|
" allow_reuse=False,\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Register Model Step"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"#### Run Configuration and Environment\n",
|
||||||
|
"To have a pipeline step run, we first need an environment to run the jobs. The environment can be build using the following code."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core.runconfig import CondaDependencies, RunConfiguration\n",
|
||||||
|
"\n",
|
||||||
|
"# create a new RunConfig object\n",
|
||||||
|
"conda_run_config = RunConfiguration(framework=\"python\")\n",
|
||||||
|
"\n",
|
||||||
|
"# Set compute target to AmlCompute\n",
|
||||||
|
"conda_run_config.target = compute_target\n",
|
||||||
|
"\n",
|
||||||
|
"conda_run_config.docker.use_docker = True\n",
|
||||||
|
"\n",
|
||||||
|
"cd = CondaDependencies.create(\n",
|
||||||
|
" pip_packages=[\n",
|
||||||
|
" \"azureml-sdk[automl]\",\n",
|
||||||
|
" \"applicationinsights\",\n",
|
||||||
|
" \"azureml-opendatasets\",\n",
|
||||||
|
" \"azureml-defaults\",\n",
|
||||||
|
" ],\n",
|
||||||
|
" conda_packages=[\"numpy==1.19.5\"],\n",
|
||||||
|
" pin_sdk_version=False,\n",
|
||||||
|
")\n",
|
||||||
|
"conda_run_config.environment.python.conda_dependencies = cd\n",
|
||||||
|
"\n",
|
||||||
|
"print(\"run config is ready\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"#### Step to register the model.\n",
|
||||||
|
"The following code generates a step to register the model to the workspace from previous step. "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# The model name with which to register the trained model in the workspace.\n",
|
||||||
|
"model_name_str = \"ojmodel\"\n",
|
||||||
|
"model_name = PipelineParameter(\"model_name\", default_value=model_name_str)\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"register_model_step = PythonScriptStep(\n",
|
||||||
|
" script_name=\"register_model.py\",\n",
|
||||||
|
" name=\"register_model\",\n",
|
||||||
|
" source_directory=\"scripts\",\n",
|
||||||
|
" allow_reuse=False,\n",
|
||||||
|
" arguments=[\n",
|
||||||
|
" \"--model_name\",\n",
|
||||||
|
" model_name,\n",
|
||||||
|
" \"--model_path\",\n",
|
||||||
|
" model_data,\n",
|
||||||
|
" \"--ds_name\",\n",
|
||||||
|
" \"dominicks_OJ_train\",\n",
|
||||||
|
" ],\n",
|
||||||
|
" inputs=[model_data],\n",
|
||||||
|
" compute_target=compute_target,\n",
|
||||||
|
" runconfig=conda_run_config,\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
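{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `register_model.py` script itself lives in the `scripts` folder and is not shown in this notebook. A minimal, hypothetical sketch of what such a script might look like is below (the actual script may differ, e.g. in how it uses the `--ds_name` argument):\n",
"```\n",
"# scripts/register_model.py (hypothetical sketch)\n",
"import argparse\n",
"\n",
"from azureml.core import Run\n",
"from azureml.core.model import Model\n",
"\n",
"parser = argparse.ArgumentParser()\n",
"parser.add_argument(\"--model_name\")\n",
"parser.add_argument(\"--model_path\")\n",
"parser.add_argument(\"--ds_name\")\n",
"args = parser.parse_args()\n",
"\n",
"run = Run.get_context()\n",
"ws = run.experiment.workspace\n",
"\n",
"# Register the model file produced by the AutoML step\n",
"model = Model.register(workspace=ws, model_name=args.model_name, model_path=args.model_path)\n",
"print(\"Registered model:\", model.name, \"version:\", model.version)\n",
"```"
]
},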
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Build the Pipeline"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"training_pipeline = Pipeline(\n",
|
||||||
|
" description=\"training_pipeline\",\n",
|
||||||
|
" workspace=ws,\n",
|
||||||
|
" steps=[automl_step, register_model_step],\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Submit Pipeline Run"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"training_pipeline_run = experiment.submit(training_pipeline)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"training_pipeline_run.wait_for_completion(show_output=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Get metrics for each runs"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"output_dir = \"train_output\"\n",
|
||||||
|
"pipeline_output = training_pipeline_run.get_pipeline_output(\"metrics_output\")\n",
|
||||||
|
"pipeline_output.download(output_dir)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"file_path = os.path.join(output_dir, pipeline_output.path_on_datastore)\n",
|
||||||
|
"with open(file_path) as f:\n",
|
||||||
|
" metrics = json.load(f)\n",
|
||||||
|
"for run_id, metrics in metrics.items():\n",
|
||||||
|
" print(\"{}: {}\".format(run_id, metrics[\"normalized_root_mean_squared_error\"][0]))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Inference"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"There are several ways to do the inference, for here we will demonstrate how to use the registered model and pipeline to do the inference. (how to register a model https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py)."
|
||||||
|
]
|
||||||
|
},
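{
"cell_type": "markdown",
"metadata": {},
"source": [
"The inference script `scripts/infer.py` is not shown here; a hypothetical sketch of its core logic might look like the following (the file handling and the commented `forecast()` call are assumptions, not the exact script):\n",
"```\n",
"# Hypothetical core of scripts/infer.py\n",
"import joblib\n",
"\n",
"from azureml.core import Run\n",
"from azureml.core.model import Model\n",
"\n",
"run = Run.get_context()\n",
"ws = run.experiment.workspace\n",
"\n",
"# Download the registered model and load the fitted pipeline\n",
"model_path = Model(ws, name=\"ojmodel\").download(exist_ok=True)\n",
"fitted_model = joblib.load(model_path)\n",
"\n",
"# For forecasting models, predictions come from forecast():\n",
"# y_pred, X_trans = fitted_model.forecast(test_df)\n",
"```"
]
},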
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Get Inference Pipeline Environment\n",
|
||||||
|
"To trigger an inference pipeline run, we first need a running environment for run that contains all the appropriate packages for the model unpickling. This environment can be either assess from the training run or using the `yml` file that comes with the model."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.train.automl.run import AutoMLRun\n",
|
||||||
|
"\n",
|
||||||
|
"for step in training_pipeline_run.get_steps():\n",
|
||||||
|
" if step.properties.get(\"StepType\") == \"AutoMLStep\":\n",
|
||||||
|
" automl_run = AutoMLRun(experiment, step.id)\n",
|
||||||
|
" break\n",
|
||||||
|
"\n",
|
||||||
|
"best_run = automl_run.get_best_child()\n",
|
||||||
|
"inference_env = best_run.get_environment()"
|
||||||
|
]
|
||||||
|
},
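{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatively, the environment can be built from the conda specification file that is saved with the model. A hedged sketch (the exact file name under the run's outputs is an assumption and may differ):\n",
"```\n",
"from azureml.core import Environment\n",
"\n",
"# Download the conda specification saved alongside the model (assumed file name)\n",
"best_run.download_file(\"outputs/conda_env_v_1_0_0.yml\", \"conda_env.yml\")\n",
"inference_env = Environment.from_conda_specification(\n",
"    name=\"oj-inference-env\", file_path=\"conda_env.yml\"\n",
")\n",
"```"
]
},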
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"After we have the environment for the inference, we could build run config based on this environment."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"run_config = RunConfiguration()\n",
|
||||||
|
"run_config.environment = inference_env"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Build and submit the inference pipeline"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"The inference pipeline will create two different format of outputs, 1) a tabular dataset that contains the prediction and 2) an `OutputFileDatasetConfig` that can be used for the sequential pipeline steps."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.data import OutputFileDatasetConfig\n",
|
||||||
|
"\n",
|
||||||
|
"output_data = OutputFileDatasetConfig(name=\"prediction_result\")\n",
|
||||||
|
"\n",
|
||||||
|
"output_ds_name = \"oj-output\"\n",
|
||||||
|
"\n",
|
||||||
|
"inference_step = PythonScriptStep(\n",
|
||||||
|
" name=\"infer-results\",\n",
|
||||||
|
" source_directory=\"scripts\",\n",
|
||||||
|
" script_name=\"infer.py\",\n",
|
||||||
|
" arguments=[\n",
|
||||||
|
" \"--model_name\",\n",
|
||||||
|
" model_name_str,\n",
|
||||||
|
" \"--ouput_dataset_name\",\n",
|
||||||
|
" output_ds_name,\n",
|
||||||
|
" \"--test_dataset_name\",\n",
|
||||||
|
" test_dataset.name,\n",
|
||||||
|
" \"--target_column_name\",\n",
|
||||||
|
" target_column_name,\n",
|
||||||
|
" \"--output_path\",\n",
|
||||||
|
" output_data,\n",
|
||||||
|
" ],\n",
|
||||||
|
" compute_target=compute_target,\n",
|
||||||
|
" allow_reuse=False,\n",
|
||||||
|
" runconfig=run_config,\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
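{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch of how a subsequent step could consume the `OutputFileDatasetConfig` output, the hypothetical step below reads the prediction files written by the inference step (the `postprocess.py` script is an assumption for illustration):\n",
"```\n",
"post_step = PythonScriptStep(\n",
"    name=\"postprocess-results\",\n",
"    source_directory=\"scripts\",\n",
"    script_name=\"postprocess.py\",  # hypothetical script\n",
"    arguments=[\"--input_path\", output_data.as_input(name=\"prediction_result\")],\n",
"    compute_target=compute_target,\n",
"    runconfig=run_config,\n",
")\n",
"```"
]
},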
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inference_pipeline = Pipeline(ws, [inference_step])\n",
|
||||||
|
"inference_run = experiment.submit(inference_pipeline)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inference_run.wait_for_completion(show_output=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Get the predicted data"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core import Dataset\n",
|
||||||
|
"\n",
|
||||||
|
"inference_ds = Dataset.get_by_name(ws, output_ds_name)\n",
|
||||||
|
"inference_df = inference_ds.to_pandas_dataframe()\n",
|
||||||
|
"inference_df.tail(5)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Schedule Pipeline"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"This section is about how to schedule a pipeline for periodically predictions. For more info about pipeline schedule and pipeline endpoint, please follow this [notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-setup-schedule-for-a-published-pipeline.ipynb)."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"inference_published_pipeline = inference_pipeline.publish(\n",
|
||||||
|
" name=\"OJ Inference Test\", description=\"OJ Inference Test\"\n",
|
||||||
|
")\n",
|
||||||
|
"print(\"Newly published pipeline id: {}\".format(inference_published_pipeline.id))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"If `test_dataset` is going to refresh every 4 weeks before Friday 16:00 and we want to predict every 4 weeks (forecast_horizon), we can schedule our pipeline to run every 4 weeks at 16:00 to get daily inference results. You can refresh your test dataset (a newer version will be created) periodically when new data is available (i.e. target column in test dataset would have values in the beginning as context data, and followed by NaNs to be predicted). The inference pipeline will pick up context to further improve the forecast accuracy."
|
||||||
|
]
|
||||||
|
},
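{
"cell_type": "markdown",
"metadata": {},
"source": [
"A rough sketch of such a refresh, assuming `new_rows` holds the next horizon of feature data with no actuals yet (the frame construction here is illustrative only):\n",
"```\n",
"import numpy as np\n",
"\n",
"new_rows = test.copy()  # placeholder for freshly arrived feature rows\n",
"new_rows[target_column_name] = np.nan  # future values to be predicted\n",
"refreshed = pd.concat([test, new_rows], ignore_index=True)\n",
"\n",
"# Re-registering under the same name creates a newer version of the dataset\n",
"TabularDatasetFactory.register_pandas_dataframe(\n",
"    refreshed, target=(datastore, \"dataset/\"), name=\"dominicks_OJ_test_pipeline\"\n",
")\n",
"```"
]
},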
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# schedule\n",
|
||||||
|
"\n",
|
||||||
|
"from azureml.pipeline.core.schedule import ScheduleRecurrence, Schedule\n",
|
||||||
|
"\n",
|
||||||
|
"recurrence = ScheduleRecurrence(\n",
|
||||||
|
" frequency=\"Week\", interval=4, week_days=[\"Friday\"], hours=[16], minutes=[0]\n",
|
||||||
|
")\n",
|
||||||
|
"\n",
|
||||||
|
"schedule = Schedule.create(\n",
|
||||||
|
" workspace=ws,\n",
|
||||||
|
" name=\"OJ_Inference_schedule\",\n",
|
||||||
|
" pipeline_id=inference_published_pipeline.id,\n",
|
||||||
|
" experiment_name=\"Schedule-run-OJ\",\n",
|
||||||
|
" recurrence=recurrence,\n",
|
||||||
|
" wait_for_provisioning=True,\n",
|
||||||
|
" description=\"Schedule Run\",\n",
|
||||||
|
")\n",
|
||||||
|
"\n",
|
||||||
|
"# You may want to make sure that the schedule is provisioned properly\n",
|
||||||
|
"# before making any further changes to the schedule\n",
|
||||||
|
"\n",
|
||||||
|
"print(\"Created schedule with id: {}\".format(schedule.id))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### [Optional] Disable schedule"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"schedule.disable()"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"authors": [
|
||||||
|
{
|
||||||
|
"name": "jialiu"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"category": "tutorial",
|
||||||
|
"celltoolbar": "Raw Cell Format",
|
||||||
|
"compute": [
|
||||||
|
"Remote"
|
||||||
|
],
|
||||||
|
"datasets": [
|
||||||
|
"Orange Juice Sales"
|
||||||
|
],
|
||||||
|
"deployment": [
|
||||||
|
"Azure Container Instance"
|
||||||
|
],
|
||||||
|
"exclude_from_index": false,
|
||||||
|
"framework": [
|
||||||
|
"Azure ML AutoML"
|
||||||
|
],
|
||||||
|
"friendly_name": "Forecasting orange juice sales with deployment",
|
||||||
|
"index_order": 1,
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3.8 - AzureML",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python38-azureml"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.8.5"
|
||||||
|
},
|
||||||
|
"tags": [
|
||||||
|
"None"
|
||||||
|
],
|
||||||
|
"task": "Forecasting",
|
||||||
|
"vscode": {
|
||||||
|
"interpreter": {
|
||||||
|
"hash": "6bd77c88278e012ef31757c15997a7bea8c943977c43d6909403c00ae11d43ca"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 4
|
||||||
|
}
|
||||||
@@ -0,0 +1,37 @@
|
|||||||
|
WeekStarting,Store,Brand,Advert,Price,Age60,COLLEGE,INCOME,Hincome150,Large HH,Minorities,WorkingWoman,SSTRDIST,SSTRVOL,CPDIST5,CPWVOL5
|
||||||
|
1992-09-10,2,dominicks,0,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-09-10,2,minute.maid,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-09-10,2,tropicana,0,2.64,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-09-10,5,dominicks,0,1.85,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-09-10,5,minute.maid,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-09-10,5,tropicana,0,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-09-10,8,dominicks,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-09-10,8,minute.maid,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-09-10,8,tropicana,0,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-09-17,2,dominicks,0,1.77,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-09-17,2,minute.maid,0,2.83,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-09-17,2,tropicana,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-09-17,5,dominicks,0,1.85,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-09-17,5,minute.maid,0,2.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-09-17,5,tropicana,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-09-17,8,dominicks,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-09-17,8,minute.maid,0,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-09-17,8,tropicana,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-09-24,2,dominicks,0,1.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-09-24,2,minute.maid,0,2.67,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-09-24,2,tropicana,1,2.79,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-09-24,5,dominicks,0,1.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-09-24,5,minute.maid,0,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-09-24,5,tropicana,1,2.78,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-09-24,8,dominicks,0,1.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-09-24,8,minute.maid,0,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-09-24,8,tropicana,1,2.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-10-01,2,dominicks,0,1.82,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-10-01,2,minute.maid,1,2.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-10-01,2,tropicana,0,2.97,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-10-01,5,dominicks,0,1.85,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-10-01,5,minute.maid,1,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-10-01,5,tropicana,0,2.78,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-10-01,8,dominicks,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-10-01,8,minute.maid,1,2.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-10-01,8,tropicana,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
@@ -0,0 +1,997 @@
|
|||||||
|
WeekStarting,Store,Brand,Quantity,Advert,Price,Age60,COLLEGE,INCOME,Hincome150,Large HH,Minorities,WorkingWoman,SSTRDIST,SSTRVOL,CPDIST5,CPWVOL5
|
||||||
|
1990-06-14,2,dominicks,10560,1,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-06-14,2,minute.maid,4480,0,3.17,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-06-14,2,tropicana,8256,0,3.87,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-06-14,5,dominicks,1792,1,1.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-06-14,5,minute.maid,4224,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-06-14,5,tropicana,5888,0,3.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-06-14,8,dominicks,14336,1,1.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-06-14,8,minute.maid,6080,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-06-14,8,tropicana,8896,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-06-21,8,dominicks,6400,0,2.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-06-21,8,minute.maid,51968,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-06-21,8,tropicana,7296,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-06-28,5,dominicks,2496,0,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-06-28,5,minute.maid,4352,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-06-28,5,tropicana,6976,0,3.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-06-28,8,dominicks,3968,0,2.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-06-28,8,minute.maid,4928,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-06-28,8,tropicana,10368,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-07-05,5,dominicks,2944,0,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-07-05,5,minute.maid,4928,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-07-05,5,tropicana,6528,0,3.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-07-05,8,dominicks,4352,0,2.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-07-05,8,minute.maid,5312,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-07-05,8,tropicana,6976,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-07-12,5,dominicks,1024,0,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-07-12,5,minute.maid,31168,1,2.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-07-12,5,tropicana,4928,0,3.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-07-12,8,dominicks,3520,0,2.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-07-12,8,minute.maid,39424,1,2.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-07-12,8,tropicana,6464,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-07-19,8,dominicks,6464,0,2.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-07-19,8,minute.maid,5568,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-07-19,8,tropicana,8192,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-07-26,2,dominicks,8000,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-07-26,2,minute.maid,4672,0,3.17,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-07-26,2,tropicana,6144,0,3.87,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-07-26,5,dominicks,4224,0,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-07-26,5,minute.maid,10048,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-07-26,5,tropicana,5312,0,3.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-07-26,8,dominicks,5952,0,2.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-07-26,8,minute.maid,14592,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-07-26,8,tropicana,7936,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-02,2,dominicks,6848,1,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-08-02,2,minute.maid,20160,1,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-08-02,2,tropicana,3840,0,3.87,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-08-02,5,dominicks,4544,1,2.09,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-02,5,minute.maid,21760,1,2.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-02,5,tropicana,5120,0,3.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-02,8,dominicks,8832,1,2.09,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-02,8,minute.maid,22208,1,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-02,8,tropicana,6656,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-09,2,dominicks,2880,0,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-08-09,2,minute.maid,2688,0,3.17,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-08-09,2,tropicana,8000,0,3.87,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-08-09,5,dominicks,1728,0,2.09,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-09,5,minute.maid,4544,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-09,5,tropicana,7936,0,3.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-09,8,dominicks,7232,0,2.09,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-09,8,minute.maid,5760,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-09,8,tropicana,8256,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-16,5,dominicks,1216,0,2.09,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-16,5,minute.maid,52224,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-16,5,tropicana,6080,0,3.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-16,8,dominicks,5504,0,2.09,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-16,8,minute.maid,54016,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-16,8,tropicana,5568,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-23,2,dominicks,1600,0,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-08-23,2,minute.maid,3008,0,3.17,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-08-23,2,tropicana,8896,0,3.87,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-08-23,5,dominicks,1152,0,2.09,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-23,5,minute.maid,3584,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-23,5,tropicana,4160,0,3.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-23,8,dominicks,4800,0,2.09,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-23,8,minute.maid,5824,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-23,8,tropicana,7488,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-30,2,dominicks,25344,1,1.89,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-08-30,2,minute.maid,4672,0,3.17,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-08-30,2,tropicana,7168,0,3.87,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-08-30,5,dominicks,30144,1,1.89,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-30,5,minute.maid,5120,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-30,5,tropicana,5888,0,3.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-08-30,8,dominicks,52672,1,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-30,8,minute.maid,6528,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-08-30,8,tropicana,6144,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-09-06,2,dominicks,10752,0,1.89,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-09-06,2,minute.maid,2752,0,3.17,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-09-06,2,tropicana,10880,0,3.29,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-09-06,5,dominicks,8960,0,1.89,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-09-06,5,minute.maid,4416,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-09-06,5,tropicana,9536,0,3.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-09-06,8,dominicks,16448,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-09-06,8,minute.maid,5440,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-09-06,8,tropicana,11008,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-09-13,2,dominicks,6656,0,1.89,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-09-13,2,minute.maid,26176,1,2.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-09-13,2,tropicana,7744,0,3.29,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-09-13,5,dominicks,8192,0,1.89,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-09-13,5,minute.maid,30208,1,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-09-13,5,tropicana,8320,0,3.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-09-13,8,dominicks,19072,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-09-13,8,minute.maid,36544,1,2.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-09-13,8,tropicana,5760,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-09-20,2,dominicks,6592,0,1.79,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-09-20,2,minute.maid,3712,0,3.17,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-09-20,2,tropicana,8512,0,3.29,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1990-09-20,5,dominicks,6528,0,1.79,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-09-20,5,minute.maid,4160,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-09-20,5,tropicana,8000,0,3.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-09-20,8,dominicks,13376,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-09-20,8,minute.maid,3776,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-09-20,8,tropicana,10112,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-09-27,5,dominicks,34688,1,1.79,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-09-27,5,minute.maid,4992,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-09-27,5,tropicana,5824,0,3.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-09-27,8,dominicks,61440,1,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-09-27,8,minute.maid,5504,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-09-27,8,tropicana,8448,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1990-10-04,5,dominicks,4672,0,1.79,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-10-04,5,minute.maid,13952,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-10-04,5,tropicana,10624,1,3.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1990-10-04,8,dominicks,13760,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-10-04,8,minute.maid,12416,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-10-04,8,tropicana,8448,1,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-10-11,2,dominicks,1728,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-10-11,2,minute.maid,30656,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-10-11,2,tropicana,5504,0,3.29,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-10-11,5,dominicks,1088,0,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-10-11,5,minute.maid,47680,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-10-11,5,tropicana,6656,0,3.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-10-11,8,dominicks,3136,0,2.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-10-11,8,minute.maid,53696,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-10-11,8,tropicana,7424,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-10-18,2,dominicks,33792,1,1.24,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-10-18,2,minute.maid,3840,0,2.98,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-10-18,2,tropicana,5888,0,3.56,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-10-18,5,dominicks,69440,1,1.24,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-10-18,5,minute.maid,7616,0,2.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-10-18,5,tropicana,5184,0,3.51,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-10-18,8,dominicks,186176,1,1.14,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-10-18,8,minute.maid,5696,0,2.51,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-10-18,8,tropicana,5824,0,3.04,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-10-25,2,dominicks,1920,0,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-10-25,2,minute.maid,2816,0,3.17,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-10-25,2,tropicana,8384,0,3.56,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-10-25,5,dominicks,1280,0,1.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-10-25,5,minute.maid,8896,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-10-25,5,tropicana,4928,0,3.51,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-10-25,8,dominicks,3712,0,1.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-10-25,8,minute.maid,4864,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-10-25,8,tropicana,6656,0,3.04,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-01,2,dominicks,8960,1,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-01,2,minute.maid,23104,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-01,2,tropicana,5952,0,3.56,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-01,5,dominicks,35456,1,1.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-01,5,minute.maid,28544,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-01,5,tropicana,5888,0,3.51,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-01,8,dominicks,35776,1,1.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-01,8,minute.maid,37184,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-01,8,tropicana,6272,0,3.04,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-08,2,dominicks,11392,0,1.29,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-08,2,minute.maid,3392,0,3.17,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-08,2,tropicana,6848,0,3.56,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-08,5,dominicks,13824,0,1.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-08,5,minute.maid,5440,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-08,5,tropicana,5312,0,3.51,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-08,8,dominicks,26880,0,1.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-08,8,minute.maid,5504,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-08,8,tropicana,6912,0,3.04,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-15,2,dominicks,28416,0,0.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-15,2,minute.maid,26304,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-15,2,tropicana,9216,0,3.87,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-15,5,dominicks,14208,0,0.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-15,5,minute.maid,52416,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-15,5,tropicana,9984,0,3.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-15,8,dominicks,71680,0,0.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-15,8,minute.maid,51008,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-15,8,tropicana,10496,0,3.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-22,2,dominicks,17152,1,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-22,2,minute.maid,6336,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-22,2,tropicana,12160,0,2.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-22,5,dominicks,29312,1,1.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-22,5,minute.maid,11712,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-22,5,tropicana,8448,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-22,8,dominicks,25088,1,1.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-22,8,minute.maid,11072,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-22,8,tropicana,11840,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-29,2,dominicks,26560,1,2.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-29,2,minute.maid,9920,0,3.17,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-29,2,tropicana,12672,0,2.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-11-29,5,dominicks,52992,1,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-29,5,minute.maid,13952,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-29,5,tropicana,10880,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-11-29,8,dominicks,91456,1,2.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-29,8,minute.maid,12160,0,2.62,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-11-29,8,tropicana,9664,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-12-06,2,dominicks,6336,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-12-06,2,minute.maid,25280,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-12-06,2,tropicana,6528,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-12-06,5,dominicks,15680,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-12-06,5,minute.maid,36160,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-12-06,5,tropicana,5696,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-12-06,8,dominicks,23808,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-12-06,8,minute.maid,30528,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-12-06,8,tropicana,6272,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-12-13,2,dominicks,26368,1,1.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-12-13,2,minute.maid,14848,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-12-13,2,tropicana,6144,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-12-13,5,dominicks,43520,1,1.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-12-13,5,minute.maid,12864,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-12-13,5,tropicana,5696,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-12-13,8,dominicks,89856,1,1.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-12-13,8,minute.maid,12096,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-12-13,8,tropicana,7168,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-12-20,2,dominicks,896,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-12-20,2,minute.maid,12288,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-12-20,2,tropicana,21120,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-12-20,5,dominicks,3904,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-12-20,5,minute.maid,22208,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-12-20,5,tropicana,32384,0,2.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-12-20,8,dominicks,12224,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-12-20,8,minute.maid,16448,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-12-20,8,tropicana,29504,0,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-12-27,2,dominicks,1472,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-12-27,2,minute.maid,6272,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-12-27,2,tropicana,12416,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1990-12-27,5,dominicks,896,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-12-27,5,minute.maid,9984,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-12-27,5,tropicana,10752,0,2.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1990-12-27,8,dominicks,3776,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-12-27,8,minute.maid,9344,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1990-12-27,8,tropicana,8704,0,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-03,2,dominicks,1344,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-03,2,minute.maid,9152,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-03,2,tropicana,9472,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-03,5,dominicks,2240,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-03,5,minute.maid,14016,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-03,5,tropicana,6912,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-03,8,dominicks,13824,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-03,8,minute.maid,16128,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-03,8,tropicana,9280,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-10,2,dominicks,111680,1,0.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-10,2,minute.maid,4160,0,2.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-10,2,tropicana,17920,0,2.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-10,5,dominicks,125760,1,0.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-10,5,minute.maid,6080,0,2.46,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-10,5,tropicana,13440,0,2.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-10,8,dominicks,251072,1,0.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-10,8,minute.maid,5376,0,2.17,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-10,8,tropicana,12224,0,2.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-17,2,dominicks,1856,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-17,2,minute.maid,10176,0,2.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-17,2,tropicana,9408,0,2.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-17,5,dominicks,1408,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-17,5,minute.maid,7808,0,2.46,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-17,5,tropicana,7808,0,2.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-17,8,dominicks,4864,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-17,8,minute.maid,6656,0,2.17,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-17,8,tropicana,10368,0,2.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-24,2,dominicks,5568,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-24,2,minute.maid,29056,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-24,2,tropicana,6272,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-24,5,dominicks,7232,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-24,5,minute.maid,40896,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-24,5,tropicana,5248,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-24,8,dominicks,10176,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-24,8,minute.maid,59712,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-24,8,tropicana,8128,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-31,2,dominicks,32064,1,1.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-31,2,minute.maid,7104,0,2.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-31,2,tropicana,6912,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-01-31,5,dominicks,41216,1,1.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-31,5,minute.maid,6272,0,2.46,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-31,5,tropicana,6208,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-01-31,8,dominicks,105344,1,1.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-31,8,minute.maid,9856,0,2.17,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-01-31,8,tropicana,5952,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-02-07,2,dominicks,4352,0,1.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-02-07,2,minute.maid,7488,0,2.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-02-07,2,tropicana,16768,0,2.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-02-07,5,dominicks,9024,0,1.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-02-07,5,minute.maid,7872,0,2.41,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-02-07,5,tropicana,21440,0,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-02-07,8,dominicks,33600,0,1.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-02-07,8,minute.maid,6720,0,2.12,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-02-07,8,tropicana,21696,0,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-02-14,2,dominicks,704,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-02-14,2,minute.maid,4224,0,2.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-02-14,2,tropicana,6272,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-02-14,5,dominicks,1600,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-02-14,5,minute.maid,6144,0,2.41,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-02-14,5,tropicana,7360,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-02-14,8,dominicks,4736,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-02-14,8,minute.maid,4224,0,2.12,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-02-14,8,tropicana,7808,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-02-21,2,dominicks,13760,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-02-21,2,minute.maid,8960,0,2.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-02-21,2,tropicana,7936,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-02-21,5,dominicks,2496,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-02-21,5,minute.maid,8448,0,2.41,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-02-21,5,tropicana,6720,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-02-21,8,dominicks,10304,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-02-21,8,minute.maid,9728,0,2.12,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-02-21,8,tropicana,8128,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-02-28,2,dominicks,43328,1,1.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-02-28,2,minute.maid,22464,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-02-28,2,tropicana,6144,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-02-28,5,dominicks,6336,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-02-28,5,minute.maid,18688,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-02-28,5,tropicana,6656,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-02-28,8,dominicks,5056,1,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-02-28,8,minute.maid,40320,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-02-28,8,tropicana,7424,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-03-07,2,dominicks,57600,1,1.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-03-07,2,minute.maid,3840,0,2.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-03-07,2,tropicana,7936,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-03-07,5,dominicks,56384,1,1.09,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-03-07,5,minute.maid,6272,0,2.46,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-03-07,5,tropicana,6016,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-03-07,8,dominicks,179968,1,0.94,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-03-07,8,minute.maid,5120,0,2.17,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-03-07,8,tropicana,5952,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-03-14,2,dominicks,704,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-03-14,2,minute.maid,12992,0,2.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-03-14,2,tropicana,7808,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-03-14,5,dominicks,1600,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-03-14,5,minute.maid,12096,0,2.46,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-03-14,5,tropicana,6144,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-03-14,8,dominicks,4992,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-03-14,8,minute.maid,19264,0,2.17,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-03-14,8,tropicana,7616,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-03-21,2,dominicks,6016,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-03-21,2,minute.maid,70144,1,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-03-21,2,tropicana,6080,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-03-21,5,dominicks,2944,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-03-21,5,minute.maid,73216,1,1.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-03-21,5,tropicana,4928,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-03-21,8,dominicks,6400,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-03-21,8,minute.maid,170432,1,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-03-21,8,tropicana,5312,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-03-28,2,dominicks,10368,1,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-03-28,2,minute.maid,21248,0,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-03-28,2,tropicana,42176,1,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-03-28,5,dominicks,13504,1,1.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-03-28,5,minute.maid,18944,0,1.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-03-28,5,tropicana,67712,1,1.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-03-28,8,dominicks,14912,1,1.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-03-28,8,minute.maid,39680,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-03-28,8,tropicana,161792,1,1.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-04-04,2,dominicks,12608,0,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-04-04,2,minute.maid,5696,1,2.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-04-04,2,tropicana,4928,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-04-04,5,dominicks,5376,0,1.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-04-04,5,minute.maid,6400,1,2.46,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-04-04,5,tropicana,8640,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-04-04,8,dominicks,34624,0,1.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-04-04,8,minute.maid,8128,1,2.17,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-04-04,8,tropicana,17280,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-04-11,2,dominicks,6336,0,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-04-11,2,minute.maid,7680,0,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-04-11,2,tropicana,29504,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-04-11,5,dominicks,6656,0,1.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-04-11,5,minute.maid,8640,0,2.09,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-04-11,5,tropicana,35520,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-04-11,8,dominicks,10368,0,1.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-04-11,8,minute.maid,9088,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-04-11,8,tropicana,47040,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-04-18,2,dominicks,140736,1,0.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-04-18,2,minute.maid,6336,0,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-04-18,2,tropicana,9984,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-04-18,5,dominicks,95680,1,0.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-04-18,5,minute.maid,7296,0,2.09,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-04-18,5,tropicana,9664,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-04-18,8,dominicks,194880,1,0.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-04-18,8,minute.maid,6720,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-04-18,8,tropicana,14464,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-04-25,2,dominicks,960,1,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-04-25,2,minute.maid,8576,0,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-04-25,2,tropicana,35200,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-04-25,5,dominicks,896,1,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-04-25,5,minute.maid,12480,0,2.09,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-04-25,5,tropicana,49088,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-04-25,8,dominicks,5696,1,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-04-25,8,minute.maid,7552,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-04-25,8,tropicana,52928,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-05-02,2,dominicks,1216,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-02,2,minute.maid,15104,0,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-02,2,tropicana,23936,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-02,5,dominicks,1728,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-02,5,minute.maid,14144,0,2.09,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-02,5,tropicana,14912,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-02,8,dominicks,7168,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-05-02,8,minute.maid,24768,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-05-02,8,tropicana,21184,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-05-09,2,dominicks,1664,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-09,2,minute.maid,76480,1,1.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-09,2,tropicana,7104,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-09,5,dominicks,1280,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-09,5,minute.maid,88256,1,1.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-09,5,tropicana,6464,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-09,8,dominicks,2880,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-05-09,8,minute.maid,183296,1,1.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-05-09,8,tropicana,7360,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-05-16,2,dominicks,4992,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-16,2,minute.maid,5056,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-16,2,tropicana,24512,1,2.29,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-16,5,dominicks,5696,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-16,5,minute.maid,6848,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-16,5,tropicana,25024,1,2.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-16,8,dominicks,12288,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-05-16,8,minute.maid,8896,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-05-16,8,tropicana,15744,1,2.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-05-23,2,dominicks,27968,1,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-23,2,minute.maid,4736,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-23,2,tropicana,6336,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-23,5,dominicks,28288,1,1.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-23,5,minute.maid,7808,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-23,5,tropicana,6272,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-30,2,dominicks,12160,0,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-30,2,minute.maid,4480,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-30,2,tropicana,6080,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-05-30,5,dominicks,4864,0,1.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-30,5,minute.maid,6272,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-05-30,5,tropicana,5056,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-06,2,dominicks,2240,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-06-06,2,minute.maid,4032,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-06-06,2,tropicana,33536,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-06-06,5,dominicks,2880,0,2.09,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-06,5,minute.maid,6144,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-06,5,tropicana,47616,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-06,8,dominicks,9280,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-06-06,8,minute.maid,6656,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-06-06,8,tropicana,46912,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-06-13,2,dominicks,5504,1,1.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-06-13,2,minute.maid,14784,1,1.79,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-06-13,2,tropicana,13248,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-06-13,5,dominicks,5760,1,1.41,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-13,5,minute.maid,27776,1,1.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-13,5,tropicana,13888,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-13,8,dominicks,25856,1,1.26,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-06-13,8,minute.maid,35456,1,1.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-06-13,8,tropicana,18240,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-06-20,2,dominicks,8832,0,1.29,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-06-20,2,minute.maid,12096,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-06-20,2,tropicana,6208,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-06-20,5,dominicks,15040,0,1.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-20,5,minute.maid,20800,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-20,5,tropicana,6144,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-20,8,dominicks,19264,0,1.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-06-20,8,minute.maid,17408,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-06-20,8,tropicana,6464,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-06-27,2,dominicks,2624,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-06-27,2,minute.maid,41792,1,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-06-27,2,tropicana,10624,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-06-27,5,dominicks,5120,0,1.89,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-27,5,minute.maid,45696,1,1.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-27,5,tropicana,9344,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-06-27,8,dominicks,6848,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-06-27,8,minute.maid,75520,1,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-06-27,8,tropicana,8512,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-07-04,2,dominicks,10432,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-07-04,2,minute.maid,10560,0,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-07-04,2,tropicana,44672,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-07-04,5,dominicks,3264,0,1.89,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-07-04,5,minute.maid,14336,0,1.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-07-04,5,tropicana,32896,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-07-04,8,dominicks,12928,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-07-04,8,minute.maid,21632,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-07-04,8,tropicana,28416,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-07-11,5,dominicks,9536,1,1.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-07-11,5,minute.maid,4928,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-07-11,5,tropicana,21056,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-07-11,8,dominicks,44032,1,1.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-07-11,8,minute.maid,8384,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-07-11,8,tropicana,16960,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-07-18,2,dominicks,8320,0,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-07-18,2,minute.maid,4224,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-07-18,2,tropicana,20096,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-07-18,5,dominicks,6208,0,1.59,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-07-18,5,minute.maid,4608,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-07-18,5,tropicana,15360,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-07-18,8,dominicks,25408,0,1.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-07-18,8,minute.maid,9920,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-07-18,8,tropicana,8320,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-07-25,2,dominicks,6784,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-07-25,2,minute.maid,2880,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-07-25,2,tropicana,9152,1,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-07-25,5,dominicks,6592,0,1.89,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-07-25,5,minute.maid,5248,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-07-25,5,tropicana,8000,1,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-07-25,8,dominicks,38336,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-07-25,8,minute.maid,6592,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-07-25,8,tropicana,11136,1,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-01,2,dominicks,60544,1,0.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-08-01,2,minute.maid,3968,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-08-01,2,tropicana,21952,0,2.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-08-01,5,dominicks,63552,1,0.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-01,5,minute.maid,4224,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-01,5,tropicana,21120,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-01,8,dominicks,152384,1,0.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-01,8,minute.maid,7168,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-01,8,tropicana,27712,0,2.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-08,2,dominicks,20608,0,0.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-08-08,2,minute.maid,3712,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-08-08,2,tropicana,13568,0,2.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-08-08,5,dominicks,27968,0,0.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-08,5,minute.maid,4288,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-08,5,tropicana,11904,0,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-08,8,dominicks,54464,0,0.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-08,8,minute.maid,6208,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-08,8,tropicana,7744,0,2.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-15,5,dominicks,21760,1,1.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-15,5,minute.maid,16896,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-15,5,tropicana,5056,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-15,8,dominicks,47680,1,1.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-15,8,minute.maid,30528,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-15,8,tropicana,5184,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-22,5,dominicks,2688,0,1.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-22,5,minute.maid,77184,1,1.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-22,5,tropicana,4608,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-22,8,dominicks,14720,0,1.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-22,8,minute.maid,155840,1,1.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-22,8,tropicana,6272,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-29,2,dominicks,16064,0,1.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-08-29,2,minute.maid,2816,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-08-29,2,tropicana,4160,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-08-29,5,dominicks,10432,0,1.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-29,5,minute.maid,5184,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-29,5,tropicana,6016,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-08-29,8,dominicks,53248,0,1.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-29,8,minute.maid,10752,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-08-29,8,tropicana,7744,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-09-05,2,dominicks,12480,0,1.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-09-05,2,minute.maid,4288,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-09-05,2,tropicana,39424,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-09-05,5,dominicks,9792,0,1.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-09-05,5,minute.maid,5248,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-09-05,5,tropicana,50752,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-09-05,8,dominicks,40576,0,1.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-09-05,8,minute.maid,6976,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-09-05,8,tropicana,53184,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-09-12,2,dominicks,17024,0,1.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-09-12,2,minute.maid,18240,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-09-12,2,tropicana,5632,0,3.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-09-12,5,dominicks,8448,0,1.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-09-12,5,minute.maid,20672,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-09-12,5,tropicana,5632,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-09-12,8,dominicks,25856,0,1.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-09-12,8,minute.maid,31872,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-09-12,8,tropicana,6784,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-09-19,2,dominicks,13440,1,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-09-19,2,minute.maid,7360,0,1.95,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-09-19,2,tropicana,9024,1,2.68,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-09-19,8,dominicks,24064,1,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-09-19,8,minute.maid,5312,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-09-19,8,tropicana,8000,1,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-09-26,2,dominicks,10112,0,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-09-26,2,minute.maid,7808,0,1.83,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-09-26,2,tropicana,6016,0,3.44,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-09-26,5,dominicks,6912,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-09-26,5,minute.maid,12352,0,1.89,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-09-26,5,tropicana,6400,0,3.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-09-26,8,dominicks,15680,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-09-26,8,minute.maid,33344,0,1.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-09-26,8,tropicana,6592,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-03,2,dominicks,9088,0,1.56,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-03,2,minute.maid,13504,0,1.79,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-03,2,tropicana,7744,0,3.14,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-03,5,dominicks,8256,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-10-03,5,minute.maid,12032,0,1.79,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-10-03,5,tropicana,5440,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-10-03,8,dominicks,16576,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-03,8,minute.maid,13504,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-03,8,tropicana,5248,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-10,2,dominicks,22848,1,1.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-10,2,minute.maid,10048,0,1.91,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-10,2,tropicana,6784,0,3.07,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-10,5,dominicks,28672,1,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-10-10,5,minute.maid,13440,0,1.79,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-10-10,5,tropicana,8128,0,2.94,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-10-10,8,dominicks,49664,1,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-10,8,minute.maid,13504,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-10,8,tropicana,6592,0,2.94,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-17,2,dominicks,6976,0,1.65,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-17,2,minute.maid,135936,1,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-17,2,tropicana,6784,0,3.07,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-17,8,dominicks,10752,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-17,8,minute.maid,335808,1,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-17,8,tropicana,5888,0,2.94,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-24,2,dominicks,4160,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-24,2,minute.maid,5056,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-24,2,tropicana,6272,0,3.07,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-24,5,dominicks,4416,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-10-24,5,minute.maid,5824,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-10-24,5,tropicana,7232,0,2.94,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-10-24,8,dominicks,9792,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-24,8,minute.maid,13120,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-24,8,tropicana,6336,0,2.94,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-31,2,dominicks,3328,0,1.83,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-31,2,minute.maid,27968,0,1.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-31,2,tropicana,5312,0,3.07,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-10-31,5,dominicks,1856,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-10-31,5,minute.maid,50112,0,1.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-10-31,5,tropicana,7168,0,2.94,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-10-31,8,dominicks,7104,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-31,8,minute.maid,49664,0,1.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-10-31,8,tropicana,5888,0,2.94,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-11-07,2,dominicks,12096,1,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-11-07,2,minute.maid,4736,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-11-07,2,tropicana,9216,0,3.11,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-11-07,5,dominicks,6528,1,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-11-07,5,minute.maid,5184,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-11-07,5,tropicana,7872,0,2.94,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-11-07,8,dominicks,9216,1,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-11-07,8,minute.maid,10880,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-11-07,8,tropicana,6080,0,2.94,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-11-14,2,dominicks,6208,0,1.76,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-11-14,2,minute.maid,7808,0,2.14,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-11-14,2,tropicana,7296,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-11-14,5,dominicks,6080,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-11-14,5,minute.maid,8384,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-11-14,5,tropicana,7552,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-11-14,8,dominicks,12608,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-11-14,8,minute.maid,9984,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-11-14,8,tropicana,6848,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-11-21,2,dominicks,3008,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-11-21,2,minute.maid,12480,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-11-21,2,tropicana,34240,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-11-21,5,dominicks,3456,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-11-21,5,minute.maid,10112,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-11-21,5,tropicana,69504,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-11-21,8,dominicks,16448,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-11-21,8,minute.maid,9216,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-11-21,8,tropicana,54016,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-11-28,2,dominicks,19456,1,1.5,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-11-28,2,minute.maid,9664,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-11-28,2,tropicana,7168,0,2.64,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-11-28,5,dominicks,25856,1,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-11-28,5,minute.maid,8384,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-11-28,5,tropicana,8960,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-11-28,8,dominicks,27968,1,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-11-28,8,minute.maid,7680,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-11-28,8,tropicana,10368,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-12-05,2,dominicks,16768,0,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-12-05,2,minute.maid,7168,0,2.06,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-12-05,2,tropicana,6080,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-12-05,5,dominicks,25728,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-12-05,5,minute.maid,11456,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-12-05,5,tropicana,6912,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-12-05,8,dominicks,37824,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-12-05,8,minute.maid,7296,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-12-05,8,tropicana,5568,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-12-12,2,dominicks,13568,1,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-12-12,2,minute.maid,4480,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-12-12,2,tropicana,5120,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-12-12,5,dominicks,23552,1,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-12-12,5,minute.maid,5952,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-12-12,5,tropicana,6656,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-12-12,8,dominicks,33664,1,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-12-12,8,minute.maid,8192,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-12-12,8,tropicana,4864,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-12-19,2,dominicks,6080,0,1.61,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-12-19,2,minute.maid,5952,0,2.22,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-12-19,2,tropicana,8320,0,2.74,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-12-19,5,dominicks,2944,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-12-19,5,minute.maid,8512,0,2.26,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-12-19,5,tropicana,8192,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-12-19,8,dominicks,17728,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-12-19,8,minute.maid,6080,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-12-19,8,tropicana,7232,0,2.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-12-26,2,dominicks,10432,1,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-12-26,2,minute.maid,21696,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-12-26,2,tropicana,17728,0,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1991-12-26,5,dominicks,5888,1,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-12-26,5,minute.maid,27968,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-12-26,5,tropicana,13440,0,2.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1991-12-26,8,dominicks,25088,1,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-12-26,8,minute.maid,15040,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1991-12-26,8,tropicana,15232,0,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-02,2,dominicks,11712,0,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-02,2,minute.maid,12032,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-02,2,tropicana,13120,0,2.35,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-02,5,dominicks,6848,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-02,5,minute.maid,24000,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-02,5,tropicana,12160,0,2.39,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-02,8,dominicks,13184,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-02,8,minute.maid,9472,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-02,8,tropicana,47040,0,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-09,2,dominicks,4032,0,1.76,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-09,2,minute.maid,7040,0,2.12,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-09,2,tropicana,13120,0,2.29,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-09,5,dominicks,1792,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-09,5,minute.maid,6848,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-09,5,tropicana,11840,0,2.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-09,8,dominicks,3136,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-09,8,minute.maid,5888,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-09,8,tropicana,9280,0,2.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-16,2,dominicks,6336,0,1.82,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-16,2,minute.maid,10240,1,2.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-16,2,tropicana,9792,0,2.43,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-16,5,dominicks,5248,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-16,5,minute.maid,15104,1,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-16,5,tropicana,8640,0,2.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-16,8,dominicks,5696,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-16,8,minute.maid,14336,1,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-16,8,tropicana,6720,0,2.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-23,2,dominicks,13632,0,1.47,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-23,2,minute.maid,6848,1,2.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-23,2,tropicana,3520,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-23,5,dominicks,16768,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-23,5,minute.maid,11392,1,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-23,5,tropicana,5888,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-23,8,dominicks,19008,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-23,8,minute.maid,11712,1,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-23,8,tropicana,5056,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-30,2,dominicks,45120,0,1.29,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-30,2,minute.maid,3968,0,2.61,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-30,2,tropicana,5504,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-01-30,5,dominicks,52160,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-30,5,minute.maid,5824,0,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-30,5,tropicana,7424,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-01-30,8,dominicks,121664,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-30,8,minute.maid,7936,0,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-01-30,8,tropicana,6080,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-02-06,2,dominicks,9984,0,1.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-02-06,2,minute.maid,5888,0,2.26,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-02-06,2,tropicana,6720,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-02-06,5,dominicks,16640,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-02-06,5,minute.maid,7488,0,2.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-02-06,5,tropicana,5632,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-02-06,8,dominicks,38848,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-02-06,8,minute.maid,5184,0,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-02-06,8,tropicana,10496,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-02-13,2,dominicks,4800,0,1.82,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-02-13,2,minute.maid,6208,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-02-13,2,tropicana,20224,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-02-13,5,dominicks,1344,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-02-13,5,minute.maid,8320,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-02-13,5,tropicana,33600,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-02-13,8,dominicks,6144,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-02-13,8,minute.maid,7168,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-02-13,8,tropicana,39040,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-02-20,2,dominicks,11776,0,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-02-20,2,minute.maid,72256,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-02-20,2,tropicana,5056,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-02-20,5,dominicks,4608,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-02-20,5,minute.maid,99904,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-02-20,5,tropicana,5376,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-02-20,8,dominicks,13632,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-02-20,8,minute.maid,216064,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-02-20,8,tropicana,4480,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-02-27,2,dominicks,11584,0,1.54,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-02-27,2,minute.maid,11520,0,2.11,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-02-27,2,tropicana,43584,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
1992-02-27,5,dominicks,12672,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-02-27,5,minute.maid,6976,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-02-27,5,tropicana,54272,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
1992-02-27,8,dominicks,9792,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-02-27,8,minute.maid,15040,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-02-27,8,tropicana,61760,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-03-05,2,dominicks,51264,1,1.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-03-05,2,minute.maid,5824,0,2.35,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-03-05,2,tropicana,25728,0,1.79,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-03-05,5,dominicks,48640,1,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-03-05,5,minute.maid,9984,0,2.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-03-05,5,tropicana,33600,0,1.79,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-03-05,8,dominicks,86912,1,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-03-05,8,minute.maid,11840,0,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-03-05,8,tropicana,15360,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-03-12,2,dominicks,14976,0,1.44,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-03-12,2,minute.maid,19392,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-03-12,2,tropicana,31808,0,1.79,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-03-12,5,dominicks,13248,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-03-12,5,minute.maid,32832,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-03-12,5,tropicana,24448,0,1.79,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-03-12,8,dominicks,24512,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-03-12,8,minute.maid,25472,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-03-12,8,tropicana,54976,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-03-19,2,dominicks,30784,0,1.59,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-03-19,2,minute.maid,9536,0,2.1,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-03-19,2,tropicana,20736,0,1.91,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-03-19,5,dominicks,29248,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-03-19,5,minute.maid,8128,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-03-19,5,tropicana,22784,0,1.79,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-03-19,8,dominicks,58048,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-03-19,8,minute.maid,16384,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-03-19,8,tropicana,34368,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-03-26,2,dominicks,12480,0,1.6,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-03-26,2,minute.maid,5312,0,2.28,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-03-26,2,tropicana,15168,0,2.81,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-03-26,5,dominicks,4608,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-03-26,5,minute.maid,6464,0,2.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-03-26,5,tropicana,19008,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-03-26,8,dominicks,13952,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-03-26,8,minute.maid,20480,0,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-03-26,8,tropicana,10752,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-02,2,dominicks,3264,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-02,2,minute.maid,14528,1,1.9,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-02,2,tropicana,28096,1,2.5,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-02,5,dominicks,3136,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-02,5,minute.maid,36800,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-02,5,tropicana,15808,1,2.5,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-02,8,dominicks,15168,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-02,8,minute.maid,34688,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-02,8,tropicana,20096,1,2.5,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-09,2,dominicks,8768,0,1.48,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-09,2,minute.maid,12416,0,2.12,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-09,2,tropicana,12416,0,2.58,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-09,5,dominicks,13184,0,1.58,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-09,5,minute.maid,12928,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-09,5,tropicana,14144,0,2.5,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-09,8,dominicks,14592,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-09,8,minute.maid,22400,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-09,8,tropicana,16192,0,2.5,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-16,2,dominicks,70848,1,1.29,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-16,2,minute.maid,5376,0,2.79,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-16,2,tropicana,5376,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-16,5,dominicks,67712,1,1.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-16,5,minute.maid,7424,0,2.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-16,5,tropicana,9600,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-16,8,dominicks,145088,1,1.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-16,8,minute.maid,7808,0,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-16,8,tropicana,6528,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-23,2,dominicks,18560,0,1.42,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-23,2,minute.maid,19008,1,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-23,2,tropicana,9792,0,2.67,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-23,5,dominicks,18880,0,1.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-23,5,minute.maid,34176,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-23,5,tropicana,10112,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-23,8,dominicks,43712,0,1.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-23,8,minute.maid,48064,1,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-23,8,tropicana,8320,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-30,2,dominicks,9152,0,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-30,2,minute.maid,3904,0,2.79,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-30,2,tropicana,16960,1,2.39,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-04-30,5,dominicks,6208,0,1.89,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-30,5,minute.maid,4160,0,2.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-30,5,tropicana,31872,1,2.24,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-04-30,8,dominicks,20608,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-30,8,minute.maid,7360,0,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-04-30,8,tropicana,30784,1,2.16,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-05-07,2,dominicks,9600,0,2.0,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-05-07,2,minute.maid,6336,0,2.79,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-05-07,2,tropicana,8320,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-05-07,5,dominicks,5952,0,1.89,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-05-07,5,minute.maid,5952,0,2.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-05-07,5,tropicana,9280,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-05-07,8,dominicks,18752,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-05-07,8,minute.maid,6272,0,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-05-07,8,tropicana,18048,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-05-14,2,dominicks,4800,0,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-05-14,2,minute.maid,5440,0,2.79,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-05-14,2,tropicana,6912,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-05-14,5,dominicks,4160,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-05-14,5,minute.maid,6528,0,2.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-05-14,5,tropicana,7680,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-05-14,8,dominicks,20160,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-05-14,8,minute.maid,6400,0,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-05-14,8,tropicana,12864,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-05-21,2,dominicks,9664,0,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-05-21,2,minute.maid,22400,1,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-05-21,2,tropicana,6976,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-05-21,5,dominicks,23488,0,1.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-05-21,5,minute.maid,30656,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-05-21,5,tropicana,8704,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-05-21,8,dominicks,18688,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-05-21,8,minute.maid,54592,1,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-05-21,8,tropicana,7168,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-05-28,2,dominicks,45568,0,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-05-28,2,minute.maid,3968,0,2.84,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-05-28,2,tropicana,7232,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-05-28,5,dominicks,60480,0,1.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-05-28,5,minute.maid,6656,0,2.66,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-05-28,5,tropicana,9920,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-05-28,8,dominicks,133824,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-05-28,8,minute.maid,8128,0,2.39,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-05-28,8,tropicana,9024,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-06-04,2,dominicks,20992,0,1.74,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-06-04,2,minute.maid,3264,0,2.89,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-06-04,2,tropicana,51520,1,2.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-06-04,5,dominicks,20416,0,1.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-06-04,5,minute.maid,4416,0,2.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-06-04,5,tropicana,91968,1,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-06-04,8,dominicks,63488,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-06-04,8,minute.maid,4928,0,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-06-04,8,tropicana,84992,1,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-06-11,2,dominicks,6592,0,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-06-11,2,minute.maid,4352,0,2.89,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-06-11,2,tropicana,22272,0,2.21,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-06-11,5,dominicks,6336,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-06-11,5,minute.maid,5696,0,2.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-06-11,5,tropicana,44096,0,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-06-11,8,dominicks,71040,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-06-11,8,minute.maid,5440,0,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-06-11,8,tropicana,14144,0,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-06-18,2,dominicks,4992,0,2.05,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-06-18,2,minute.maid,4480,0,2.89,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-06-18,2,tropicana,46144,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-06-25,2,dominicks,8064,0,1.24,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-06-25,2,minute.maid,3840,0,2.52,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-06-25,2,tropicana,4352,1,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-06-25,5,dominicks,1408,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-06-25,5,minute.maid,5696,0,2.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-06-25,5,tropicana,7296,1,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-06-25,8,dominicks,15360,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-06-25,8,minute.maid,5888,0,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-06-25,8,tropicana,7488,1,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-02,2,dominicks,7360,0,1.61,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-02,2,minute.maid,13312,1,2.0,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-02,2,tropicana,17280,0,2.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-02,5,dominicks,4672,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-02,5,minute.maid,39680,1,2.01,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-02,5,tropicana,12928,0,2.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-02,8,dominicks,17728,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-02,8,minute.maid,23872,1,2.02,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-02,8,tropicana,12352,0,2.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-09,2,dominicks,10048,0,1.4,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-09,2,minute.maid,3776,1,2.33,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-09,2,tropicana,5696,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-09,5,dominicks,19520,0,1.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-09,5,minute.maid,6208,1,2.19,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-09,5,tropicana,6848,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-09,8,dominicks,24256,0,1.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-09,8,minute.maid,6848,1,2.19,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-09,8,tropicana,5696,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-16,2,dominicks,10112,0,1.91,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-16,2,minute.maid,4800,0,2.89,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-16,2,tropicana,6848,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-16,5,dominicks,7872,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-16,5,minute.maid,7872,0,2.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-16,5,tropicana,8064,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-16,8,dominicks,19968,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-16,8,minute.maid,8192,0,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-16,8,tropicana,7680,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-23,2,dominicks,9152,0,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-23,2,minute.maid,24960,1,2.29,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-23,2,tropicana,4416,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-23,5,dominicks,5184,0,1.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-23,5,minute.maid,54528,1,2.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-23,5,tropicana,4992,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-23,8,dominicks,15936,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-23,8,minute.maid,55040,1,2.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-23,8,tropicana,5440,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-30,2,dominicks,36288,1,1.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-30,2,minute.maid,4544,0,2.86,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-30,2,tropicana,4672,0,3.16,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-07-30,5,dominicks,42240,1,1.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-30,5,minute.maid,6400,0,2.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-30,5,tropicana,7360,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-07-30,8,dominicks,76352,1,1.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-30,8,minute.maid,6528,0,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-07-30,8,tropicana,5632,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-08-06,2,dominicks,3776,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-08-06,2,minute.maid,3968,1,2.81,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-08-06,2,tropicana,7168,1,3.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-08-06,5,dominicks,6592,1,1.89,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-08-06,5,minute.maid,5888,1,2.65,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-08-06,5,tropicana,8384,1,2.89,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-08-06,8,dominicks,17408,1,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-08-06,8,minute.maid,6208,1,2.45,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-08-06,8,tropicana,8960,1,2.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-08-13,2,dominicks,3328,0,1.97,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-08-13,2,minute.maid,49600,1,1.99,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-08-13,2,tropicana,5056,0,3.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-08-13,5,dominicks,2112,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-08-13,5,minute.maid,56384,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-08-13,5,tropicana,8832,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-08-13,8,dominicks,17536,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-08-13,8,minute.maid,94720,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-08-13,8,tropicana,6080,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-08-20,2,dominicks,13824,0,1.36,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-08-20,2,minute.maid,23488,1,1.94,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-08-20,2,tropicana,13376,1,2.79,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-08-20,5,dominicks,21248,0,1.79,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-08-20,5,minute.maid,27072,1,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-08-20,5,tropicana,17728,1,2.79,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-08-20,8,dominicks,31232,0,1.59,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-08-20,8,minute.maid,55552,1,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-08-20,8,tropicana,8576,1,2.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-08-27,2,dominicks,9024,0,1.19,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-08-27,2,minute.maid,19008,0,1.69,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-08-27,2,tropicana,8128,0,2.75,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-08-27,5,dominicks,1856,0,1.29,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-08-27,5,minute.maid,3840,0,1.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-08-27,5,tropicana,9600,0,2.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-08-27,8,dominicks,19200,0,1.29,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-08-27,8,minute.maid,18688,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-08-27,8,tropicana,8000,0,2.89,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-09-03,2,dominicks,2048,0,2.09,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-09-03,2,minute.maid,11584,0,1.81,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-09-03,2,tropicana,19456,1,2.49,0.232864734,0.248934934,10.55320518,0.463887065,0.103953406,0.114279949,0.303585347,2.110122129,1.142857143,1.927279669,0.37692661299999997
|
||||||
|
1992-09-03,5,dominicks,3712,0,1.99,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-09-03,5,minute.maid,6144,0,1.69,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-09-03,5,tropicana,25664,1,2.49,0.117368032,0.32122573,10.92237097,0.535883355,0.103091585,0.053875277,0.410568032,3.801997814,0.681818182,1.600573425,0.736306837
|
||||||
|
1992-09-03,8,dominicks,12800,0,1.79,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-09-03,8,minute.maid,14656,0,1.69,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
1992-09-03,8,tropicana,21760,1,2.49,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
|
||||||
|
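For orientation, the rows above follow the orange-juice sales schema used throughout the AutoML forecasting samples: week start, store, brand, quantity sold, an advert flag, price, and a block of store-level demographic features. The header row falls outside this excerpt, so the column names in the sketch below are assumptions rather than something taken from the diff; it only illustrates how rows of this shape can be loaded and grouped into per-series panels.

```python
import io

import pandas as pd

# Assumed column names; the real header line is not part of this excerpt.
csv_text = """WeekStarting,Store,Brand,Quantity,Advert,Price,Age60,COLLEGE,INCOME,Hincome150,Large_HH,Minorities,WorkingWoman,SSTRDIST,SSTRVOL,CPDIST5,CPWVOL5
1992-02-13,8,dominicks,6144,0,1.58,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
1992-02-13,8,minute.maid,7168,0,1.99,0.252394035,0.095173274,10.59700966,0.054227156,0.131749698,0.035243328,0.283074736,2.636332801,1.5,2.905384316,0.641015947
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["WeekStarting"])

# Each (Store, Brand) pair forms one weekly time series in the panel.
print(df.groupby(["Store", "Brand"]).size())
```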
@@ -0,0 +1,155 @@
import argparse
from datetime import datetime
import os
import uuid

import joblib
import numpy as np

from azureml.core import Run, Dataset, Model
from azureml.data.dataset_factory import TabularDatasetFactory

try:
    import torch

    _torch_present = True
except ImportError:
    _torch_present = False


def infer_forecasting_dataset_tcn(
    X_test, y_test, model, output_path, output_dataset_name="results"
):
    # Forecast over the test set; df_all carries the inputs plus the predictions.
    y_pred, df_all = model.forecast(X_test, y_test)

    run = Run.get_context()

    # Register the forecast table as a dataset on the default datastore and
    # also write it out as a CSV under the output path.
    TabularDatasetFactory.register_pandas_dataframe(
        df_all,
        target=(
            run.experiment.workspace.get_default_datastore(),
            datetime.now().strftime("%Y-%m-%d-") + str(uuid.uuid4())[:6],
        ),
        name=output_dataset_name,
    )
    df_all.to_csv(os.path.join(output_path, output_dataset_name + ".csv"), index=False)


def map_location_cuda(storage, loc):
    return storage.cuda()


def get_model(model_path, model_file_name):
    model_full_path = os.path.join(model_path, model_file_name)
    print(model_full_path)
    if model_file_name.endswith("pt"):
        # Load the forecasting TCN torch model.
        assert _torch_present, "Loading DNN models requires torch to be present."
        if torch.cuda.is_available():
            map_location = map_location_cuda
        else:
            map_location = "cpu"
        with open(model_full_path, "rb") as fh:
            fitted_model = torch.load(fh, map_location=map_location)
    else:
        # Load the sklearn pipeline.
        fitted_model = joblib.load(model_full_path)
    return fitted_model


def get_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model_name", type=str, dest="model_name", help="Model to be loaded"
    )
    parser.add_argument(
        "--output_dataset_name",
        type=str,
        dest="output_dataset_name",
        default="results",
        help="Dataset name of the final output",
    )
    parser.add_argument(
        "--target_column_name",
        type=str,
        dest="target_column_name",
        help="The target column name.",
    )
    parser.add_argument(
        "--test_dataset_name",
        type=str,
        dest="test_dataset_name",
        default="results",
        help="Dataset name of the test dataset",
    )
    parser.add_argument(
        "--output_path",
        type=str,
        dest="output_path",
        default="results",
        help="The output path",
    )
    return parser.parse_args()


def get_data(run, fitted_model, target_column_name, test_dataset_name):
    # Get the input dataset by name.
    test_dataset = Dataset.get_by_name(run.experiment.workspace, test_dataset_name)
    test_df = test_dataset.to_pandas_dataframe()
    if target_column_name in test_df:
        y_test = test_df.pop(target_column_name).values
    else:
        # No actuals available: forecast against unknown target values.
        y_test = np.full(test_df.shape[0], np.nan)

    return test_df, y_test


def get_model_filename(run, model_name, model_path):
    model = Model(run.experiment.workspace, model_name)
    if "model_file_name" in model.tags:
        return model.tags["model_file_name"]
    # Fall back to the algorithm tag or the artifact layout:
    # TCNForecaster models ship as model.pt, everything else as model.pkl.
    is_pkl = True
    if model.tags.get("algorithm") == "TCNForecaster" or os.path.exists(
        os.path.join(model_path, "model.pt")
    ):
        is_pkl = False
    return "model.pkl" if is_pkl else "model.pt"


if __name__ == "__main__":
    run = Run.get_context()

    args = get_args()
    model_name = args.model_name
    output_dataset_name = args.output_dataset_name
    test_dataset_name = args.test_dataset_name
    target_column_name = args.target_column_name
    print("args passed are: ")
    print(model_name)
    print(test_dataset_name)
    print(output_dataset_name)
    print(target_column_name)

    model_path = Model.get_model_path(model_name)
    model_file_name = get_model_filename(run, model_name, model_path)
    print(model_file_name)
    fitted_model = get_model(model_path, model_file_name)

    X_test_df, y_test = get_data(
        run, fitted_model, target_column_name, test_dataset_name
    )

    infer_forecasting_dataset_tcn(
        X_test_df, y_test, fitted_model, args.output_path, output_dataset_name
    )
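The script above is written to run as a pipeline step: it loads a registered model, scores a named test dataset, and registers the predictions as an output dataset. As a hedged sketch of how it might be wired into an Azure ML pipeline, assuming it is saved as `infer.py` and that the compute name, environment, model name, and dataset name (all placeholders here) exist in the workspace:

```python
from azureml.core import Environment, RunConfiguration, Workspace
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()  # assumes a local config.json for the workspace

run_config = RunConfiguration()
run_config.environment = Environment.get(ws, "AzureML-AutoML")  # assumed curated environment

infer_step = PythonScriptStep(
    name="infer-forecast",
    script_name="infer.py",  # the script above, saved under source_directory
    source_directory=".",
    arguments=[
        "--model_name", "oj-forecast-model",    # placeholder model name
        "--test_dataset_name", "oj-test-data",  # placeholder dataset name
        "--target_column_name", "Quantity",
        "--output_dataset_name", "results",
        "--output_path", "./outputs",
    ],
    compute_target=ws.compute_targets["cpu-cluster"],  # placeholder compute name
    runconfig=run_config,
    allow_reuse=False,
)

pipeline = Pipeline(workspace=ws, steps=[infer_step])
```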
@@ -0,0 +1,64 @@
import argparse
import os
import uuid

from azureml.core import Dataset, Workspace
from azureml.core.run import Run, _OfflineRun
import azureml.automl.core.shared.constants as constants
from azureml.train.automl.run import AutoMLRun


def get_best_automl_run(pipeline_run):
    # Find the AutoML step among the pipeline children and return its best child run.
    all_children = [c for c in pipeline_run.get_children()]
    automl_step = [
        c for c in all_children if c.properties.get("runTemplate") == "AutoML"
    ]
    for c in all_children:
        print(c, c.properties)
    automl_run = AutoMLRun(pipeline_run.experiment, automl_step[0].id)
    return automl_run.get_best_child()


def get_model_path(model_artifact_path):
    # The artifact path looks like "<dir>/<file>"; return the file name part.
    return model_artifact_path.split("/")[1]


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name")
    parser.add_argument("--model_path")
    parser.add_argument("--ds_name")
    args = parser.parse_args()

    print("Argument 1(model_name): %s" % args.model_name)
    print("Argument 2(model_path): %s" % args.model_path)
    print("Argument 3(ds_name): %s" % args.ds_name)

    run = Run.get_context()
    if type(run) == _OfflineRun:
        ws = Workspace.from_config()
    else:
        ws = run.experiment.workspace

    train_ds = Dataset.get_by_name(ws, args.ds_name)
    datasets = [(Dataset.Scenario.TRAINING, train_ds)]
    new_dir = str(uuid.uuid4())
    os.makedirs(new_dir)

    # Register the model from the best AutoML child run, together with the training dataset.
    best_run = get_best_automl_run(run.parent)
    model_artifact_path = best_run.properties[constants.PROPERTY_KEY_OF_MODEL_PATH]
    algo = best_run.properties.get("run_algorithm")
    model_artifact_dir = model_artifact_path.split("/")[0]
    model_file_name = get_model_path(model_artifact_path)
    model = best_run.register_model(
        args.model_name,
        model_path=model_artifact_dir,
        datasets=datasets,
        tags={"algorithm": algo, "model_file_name": model_file_name},
    )

    print("Registered version {0} of model {1}".format(model.version, model.name))
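Once this step has run, the tags it writes (`algorithm` and `model_file_name`) are exactly what the inference script's `get_model_filename` looks up first. A minimal check of a registered model, assuming the hypothetical model name `oj-forecast-model`:

```python
from azureml.core import Model, Workspace

ws = Workspace.from_config()  # assumes a local config.json for the workspace

model = Model(ws, name="oj-forecast-model")  # placeholder model name
print(model.name, model.version)
print(model.tags.get("algorithm"))        # e.g. "TCNForecaster" for the DNN model
print(model.tags.get("model_file_name"))  # "model.pt" for a TCN, "model.pkl" otherwise
```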
@@ -20,7 +20,14 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "In this notebook we will explore the univaraite time-series data to determine the settings for an automated ML experiment. We will follow the thought process depicted in the following diagram:<br/>\n",
+    "<font color=\"red\" size=\"5\"><strong>!Important!</strong> </br>This notebook is outdated and is not supported by the AutoML Team. Please use the supported version ([link](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/automl-standalone-jobs/automl-forecasting-recipes-univariate)).</font>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this notebook we will explore the univariate time-series data to determine the settings for an automated ML experiment. We will follow the thought process depicted in the following diagram:<br/>\n",
    "\n",
    "\n",
    "The objective is to answer the following questions:\n",
@@ -32,11 +39,11 @@
    " </ul>\n",
    " <li>Is the data stationary? </li>\n",
    " <ul style=\"margin-top:-1px; list-style-type:none\"> \n",
-    " <li> Importance: In the absense of features that capture trend behavior, ML models (regression and tree based) are not well equiped to predict stochastic trends. Working with stationary data solves this problem. </li>\n",
+    " <li> Importance: In the absence of features that capture trend behavior, ML models (regression and tree based) are not well equipped to predict stochastic trends. Working with stationary data solves this problem. </li>\n",
    " </ul>\n",
    " <li>Is there a detectable auto-regressive pattern in the stationary data? </li>\n",
    " <ul style=\"margin-top:-1px; list-style-type:none\"> \n",
-    " <li> Importance: The accuracy of ML models can be improved if serial correlation is modeled by including lags of the dependent/target varaible as features. Including target lags in every experiment by default will result in a regression in accuracy scores if such setting is not warranted. </li>\n",
+    " <li> Importance: The accuracy of ML models can be improved if serial correlation is modeled by including lags of the dependent/target variable as features. Including target lags in every experiment by default will result in a regression in accuracy scores if such setting is not warranted. </li>\n",
    " </ul>\n",
    "</ol>\n",
    "\n",
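The hunk above argues that, without trend-capturing features, regression and tree-based models handle stochastic trends poorly, which is why the notebook works with stationary data. As an illustrative sketch of a stationarity check, here is the augmented Dickey-Fuller test from statsmodels applied to synthetic data (not to the notebook's own series):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
trending = np.cumsum(rng.normal(0.5, 1.0, 300))  # random walk with drift: unit root
differenced = np.diff(trending)                  # first differences: stationary

for name, series in [("level", trending), ("first difference", differenced)]:
    stat, pvalue = adfuller(series)[:2]
    # Expect a high p-value (cannot reject a unit root) for the level
    # and a low p-value for the differenced series.
    print(f"{name}: ADF statistic={stat:.2f}, p-value={pvalue:.3f}")
```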
@@ -109,7 +116,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The graph plots the alcohol sales in the United States. Because the data is trending, it can be difficult to see cycles, seasonality or other interestng behaviors due to the scaling issues. For example, if there is a seasonal pattern, which we will discuss later, we cannot see them on the trending data. In such case, it is worth plotting the same data in first differences."
+    "The graph plots the alcohol sales in the United States. Because the data is trending, it can be difficult to see cycles, seasonality or other interesting behaviors due to the scaling issues. For example, if there is a seasonal pattern, which we will discuss later, we cannot see them on the trending data. In such case, it is worth plotting the same data in first differences."
    ]
   },
   {
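The corrected cell suggests re-plotting trending data in first differences so that cycles and seasonality become visible. A minimal pandas/matplotlib sketch of that idea on a synthetic trending series (the alcohol-sales data itself is not reproduced here):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic trending series standing in for the alcohol sales data.
rng = np.random.default_rng(0)
y = pd.Series(np.cumsum(rng.normal(1.0, 5.0, 200)))

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(8, 5))
y.plot(ax=ax1, title="Level: the trend hides cycles and seasonality")
y.diff().dropna().plot(ax=ax2, title="First differences")
plt.tight_layout()
plt.show()
```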
@@ -335,8 +342,8 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# 3 Check if there is a clear autoregressive pattern\n",
-    "We need to determine if we should include lags of the target variable as features in order to improve forecast accuracy. To do this, we will examine the ACF and partial ACF (PACF) plots of the stationary series. In our case, it is a series in first diffrences.\n",
+    "# 3 Check if there is a clear auto-regressive pattern\n",
+    "We need to determine if we should include lags of the target variable as features in order to improve forecast accuracy. To do this, we will examine the ACF and partial ACF (PACF) plots of the stationary series. In our case, it is a series in first differences.\n",
    "\n",
    "<ul>\n",
    " <li> Question: What is an Auto-regressive pattern? What are we looking for? </li>\n",
@@ -418,11 +425,11 @@
|
|||||||
" </li>\n",
|
" </li>\n",
|
||||||
" where $\\sigma_{xzy}$ is the covariance between two random variables $X$ and $Z$; $\\sigma_x$ and $\\sigma_z$ is the variance for $X$ and $Z$, respectively. The correlation coefficient measures the strength of linear relationship between two random variables. This metric can take any value from -1 to 1. <li/>\n",
|
" where $\\sigma_{xzy}$ is the covariance between two random variables $X$ and $Z$; $\\sigma_x$ and $\\sigma_z$ is the variance for $X$ and $Z$, respectively. The correlation coefficient measures the strength of linear relationship between two random variables. This metric can take any value from -1 to 1. <li/>\n",
|
||||||
" <br/>\n",
|
" <br/>\n",
|
||||||
" <li> The auto-correlation coefficient $\\rho_{Y_{t} Y_{t-k}}$ is the time series equivalent of the correlation coefficient, except instead of measuring linear association between two random variables $X$ and $Z$, it measures the strength of a linear relationship between a random variable $Y_t$ and its lag $Y_{t-k}$ for any positive interger value of $k$. </li> \n",
|
" <li> The auto-correlation coefficient $\\rho_{Y_{t} Y_{t-k}}$ is the time series equivalent of the correlation coefficient, except instead of measuring linear association between two random variables $X$ and $Z$, it measures the strength of a linear relationship between a random variable $Y_t$ and its lag $Y_{t-k}$ for any positive integer value of $k$. </li> \n",
|
||||||
" <br />\n",
|
" <br />\n",
|
||||||
" <li> To visualize the ACF for a particular lag, say lag 2, plot the second lag of a series $y_{t-2}$ on the x-axis, and plot the series itself $y_t$ on the y-axis. The autocorrelation coefficient is the slope of the best fitted regression line and can be interpreted as follows. A one unit increase in the lag of a variable one period ago leads to a $\\rho_{Y_{t} Y_{t-2}}$ units change in the variable in the current period. This interpreation can be applied to any lag. </li> \n",
|
" <li> To visualize the ACF for a particular lag, say lag 2, plot the second lag of a series $y_{t-2}$ on the x-axis, and plot the series itself $y_t$ on the y-axis. The autocorrelation coefficient is the slope of the best fitted regression line and can be interpreted as follows. A one unit increase in the lag of a variable one period ago leads to a $\\rho_{Y_{t} Y_{t-2}}$ units change in the variable in the current period. This interpretation can be applied to any lag. </li> \n",
|
||||||
" <br />\n",
|
" <br />\n",
|
||||||
" <li> In the interpretation posted above we need to be careful not to confuse the word \"leads\" with \"causes\" since these are not the same thing. We do not know the lagged value of the varaible causes it to change. Afterall, there are probably many other features that may explain the movement in $Y_t$. All we are trying to do in this section is to identify situations when the variable contains the strong auto-regressive components that needs to be included in the model to improve forecast accuracy. </li>\n",
|
" <li> In the interpretation posted above we need to be careful not to confuse the word \"leads\" with \"causes\" since these are not the same thing. We do not know the lagged value of the variable causes it to change. After all, there are probably many other features that may explain the movement in $Y_t$. All we are trying to do in this section is to identify situations when the variable contains the strong auto-regressive components that needs to be included in the model to improve forecast accuracy. </li>\n",
|
||||||
" </ul>\n",
|
" </ul>\n",
|
||||||
"</ul>"
|
"</ul>"
|
||||||
]
|
]
|
||||||
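A minimal sketch of how the ACF described above can be computed and plotted with statsmodels, assuming `series` is a hypothetical pandas Series holding the stationary (first-differenced) data; lags whose bars fall outside the shaded confidence band are statistically significant:

```python
# Minimal ACF sketch (assumes `series` holds the first-differenced data).
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

fig, ax = plt.subplots(figsize=(8, 4))
plot_acf(series.dropna(), lags=24, ax=ax)  # bars outside the band are significant
plt.show()
```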
@@ -434,7 +441,7 @@
|
|||||||
"<ul>\n",
|
"<ul>\n",
|
||||||
" <li> Question: What is the PACF? </li>\n",
|
" <li> Question: What is the PACF? </li>\n",
|
||||||
" <ul style=\"list-style-type:none;\">\n",
|
" <ul style=\"list-style-type:none;\">\n",
|
||||||
" <li> When describing the ACF we essentially running a regression between a partigular lag of a series, say, lag 4, and the series itself. What this implies is the regression coefficient for lag 4 captures the impact of everything that happens in lags 1, 2 and 3. In other words, if lag 1 is the most important lag and we exclude it from the regression, naturally, the regression model will assign the importance of the 1st lag to the 4th one. Partial auto-correlation function fixes this problem since it measures the contribution of each lag accounting for the information added by the intermediary lags. If we were to illustrate ACF and PACF for the fourth lag using the regression analogy, the difference is a follows: \n",
|
" <li> When describing the ACF we essentially running a regression between a particular lag of a series, say, lag 4, and the series itself. What this implies is the regression coefficient for lag 4 captures the impact of everything that happens in lags 1, 2 and 3. In other words, if lag 1 is the most important lag and we exclude it from the regression, naturally, the regression model will assign the importance of the 1st lag to the 4th one. Partial auto-correlation function fixes this problem since it measures the contribution of each lag accounting for the information added by the intermediary lags. If we were to illustrate ACF and PACF for the fourth lag using the regression analogy, the difference is a follows: \n",
|
||||||
" \\begin{align}\n",
|
" \\begin{align}\n",
|
||||||
" Y_{t} &= a_{0} + a_{4} Y_{t-4} + e_{t} \\\\\n",
|
" Y_{t} &= a_{0} + a_{4} Y_{t-4} + e_{t} \\\\\n",
|
||||||
" Y_{t} &= b_{0} + b_{1} Y_{t-1} + b_{2} Y_{t-2} + b_{3} Y_{t-3} + b_{4} Y_{t-4} + \\varepsilon_{t} \\\\\n",
|
" Y_{t} &= b_{0} + b_{1} Y_{t-1} + b_{2} Y_{t-2} + b_{3} Y_{t-3} + b_{4} Y_{t-4} + \\varepsilon_{t} \\\\\n",
|
||||||
@@ -442,7 +449,7 @@
|
|||||||
" </li>\n",
|
" </li>\n",
|
||||||
" <br/>\n",
|
" <br/>\n",
|
||||||
" <li>\n",
|
" <li>\n",
|
||||||
" Here, you can think of $a_4$ and $b_{4}$ as the auto- and partial auto-correlation coefficients for lag 4. Notice, in the second equation we explicitely accounting for the intermediate lags by adding them as regrerssors.\n",
|
" Here, you can think of $a_4$ and $b_{4}$ as the auto- and partial auto-correlation coefficients for lag 4. Notice, in the second equation we explicitly accounting for the intermediate lags by adding them as regressors.\n",
|
||||||
" </li>\n",
|
" </li>\n",
|
||||||
" </ul>\n",
|
" </ul>\n",
|
||||||
"</ul>"
|
"</ul>"
|
||||||
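A hedged sketch of the regression analogy above, using statsmodels OLS on a hypothetical stationary `series`: the lag-4 coefficient of the first regression approximates the ACF at lag 4 ($a_4$), while the lag-4 coefficient of the second approximates the PACF at lag 4 ($b_4$).

```python
# Sketch of the two regressions above; `series` is a hypothetical pandas Series.
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({"y": series})
for k in range(1, 5):
    df[f"lag{k}"] = df["y"].shift(k)  # build lagged regressors
df = df.dropna()

acf_like = sm.OLS(df["y"], sm.add_constant(df[["lag4"]])).fit()  # a_4
pacf_like = sm.OLS(
    df["y"], sm.add_constant(df[["lag1", "lag2", "lag3", "lag4"]])
).fit()  # b_4
print(acf_like.params["lag4"], pacf_like.params["lag4"])
```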
@@ -455,11 +462,11 @@
|
|||||||
"<ul>\n",
|
"<ul>\n",
|
||||||
" <li> Question: Auto-regressive pattern? What are we looking for? </li>\n",
|
" <li> Question: Auto-regressive pattern? What are we looking for? </li>\n",
|
||||||
" <ul style=\"list-style-type:none;\">\n",
|
" <ul style=\"list-style-type:none;\">\n",
|
||||||
" <li> We are looking for a classical profiles for an AR(p) process such as an exponential decay of an ACF and a the first $p$ significant lags of the PACF. Let's examine the ACF/PACF profiles of the same simulated AR(2) shown in Section 3, and check if the ACF/PACF explanation are refelcted in these plots. <li/>\n",
|
" <li> We are looking for a classical profiles for an AR(p) process such as an exponential decay of an ACF and a the first $p$ significant lags of the PACF. Let's examine the ACF/PACF profiles of the same simulated AR(2) shown in Section 3, and check if the ACF/PACF explanation are reflected in these plots. <li/>\n",
|
||||||
" <li><img src=\"figures/ACF_PACF_for_AR2.png\" class=\"img_class\">\n",
|
" <li><img src=\"figures/ACF_PACF_for_AR2.png\" class=\"img_class\">\n",
|
||||||
" <li> The autocorrelation coefficient for the 3rd lag is 0.6, which can be interpreted that a one unit increase in the value of the target varaible three periods ago leads to 0.6 units increase in the current period. However, the PACF plot shows that the partial autocorrealtion coefficient is zero (from a statistical point of view since it lies within the shaded region). This is happening because the 1st and 2nd lags are good predictors of the target variable. Ommiting these two lags from the regression results in the misleading conclusion that the third lag is a good prediciton. <li/>\n",
|
" <li> The autocorrelation coefficient for the 3rd lag is 0.6, which can be interpreted that a one unit increase in the value of the target variable three periods ago leads to 0.6 units increase in the current period. However, the PACF plot shows that the partial autocorrelation coefficient is zero (from a statistical point of view since it lies within the shaded region). This is happening because the 1st and 2nd lags are good predictors of the target variable. Omitting these two lags from the regression results in the misleading conclusion that the third lag is a good prediction. <li/>\n",
|
||||||
" <br/>\n",
|
" <br/>\n",
|
||||||
" <li> This is why it is important to examine both the ACF and the PACF plots when tring to determine the auto regressive order for the variable in question. <li/>\n",
|
" <li> This is why it is important to examine both the ACF and the PACF plots when trying to determine the auto regressive order for the variable in question. <li/>\n",
|
||||||
" </ul>\n",
|
" </ul>\n",
|
||||||
"</ul> "
|
"</ul> "
|
||||||
]
|
]
|
||||||
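As an illustration (the coefficients below are chosen for the example, not taken from the notebook's figure), one way to reproduce such ACF/PACF profiles is to simulate an AR(2) process with statsmodels:

```python
# Simulate an AR(2) and plot its ACF/PACF profiles (illustrative coefficients).
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

np.random.seed(42)
ar = np.array([1, -0.6, -0.3])  # lag polynomial of y_t = 0.6*y_{t-1} + 0.3*y_{t-2} + e_t
y = ArmaProcess(ar, np.array([1])).generate_sample(nsample=500)

fig, axes = plt.subplots(1, 2, figsize=(12, 4))
plot_acf(y, lags=12, ax=axes[0])   # expect gradual, exponential-style decay
plot_pacf(y, lags=12, ax=axes[1])  # expect 2 significant spikes, then cutoff
plt.show()
```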
@@ -471,10 +478,13 @@
|
|||||||
"name": "vlbejan"
|
"name": "vlbejan"
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
|
"kernel_info": {
|
||||||
|
"name": "python38-azureml"
|
||||||
|
},
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
@@ -486,7 +496,15 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.9"
|
"version": "3.8.10"
|
||||||
|
},
|
||||||
|
"microsoft": {
|
||||||
|
"ms_spell_check": {
|
||||||
|
"ms_spell_check_language": "en"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nteract": {
|
||||||
|
"version": "nteract-front-end@1.0.0"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
|
|||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-forecasting-univariate-recipe-experiment-settings
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -16,6 +16,13 @@
|
|||||||
""
|
""
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"<font color=\"red\" size=\"5\"><strong>!Important!</strong> </br>This notebook is outdated and is not supported by the AutoML Team. Please use the supported version ([link](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/automl-standalone-jobs/automl-forecasting-recipes-univariate)).</font>"
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -300,27 +307,14 @@
|
|||||||
"df_train.to_csv(\"train.csv\", index=False)\n",
|
"df_train.to_csv(\"train.csv\", index=False)\n",
|
||||||
"df_test.to_csv(\"test.csv\", index=False)\n",
|
"df_test.to_csv(\"test.csv\", index=False)\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"from azureml.data.dataset_factory import TabularDatasetFactory\n",
|
||||||
|
"\n",
|
||||||
"datastore = ws.get_default_datastore()\n",
|
"datastore = ws.get_default_datastore()\n",
|
||||||
"datastore.upload_files(\n",
|
"train_dataset = TabularDatasetFactory.register_pandas_dataframe(\n",
|
||||||
" files=[\"./train.csv\"],\n",
|
" df_train, target=(datastore, \"dataset/\"), name=\"train\"\n",
|
||||||
" target_path=\"uni-recipe-dataset/tabular/\",\n",
|
|
||||||
" overwrite=True,\n",
|
|
||||||
" show_progress=True,\n",
|
|
||||||
")\n",
|
")\n",
|
||||||
"datastore.upload_files(\n",
|
"test_dataset = TabularDatasetFactory.register_pandas_dataframe(\n",
|
||||||
" files=[\"./test.csv\"],\n",
|
" df_test, target=(datastore, \"dataset/\"), name=\"test\"\n",
|
||||||
" target_path=\"uni-recipe-dataset/tabular/\",\n",
|
|
||||||
" overwrite=True,\n",
|
|
||||||
" show_progress=True,\n",
|
|
||||||
")\n",
|
|
||||||
"\n",
|
|
||||||
"from azureml.core import Dataset\n",
|
|
||||||
"\n",
|
|
||||||
"train_dataset = Dataset.Tabular.from_delimited_files(\n",
|
|
||||||
" path=[(datastore, \"uni-recipe-dataset/tabular/train.csv\")]\n",
|
|
||||||
")\n",
|
|
||||||
"test_dataset = Dataset.Tabular.from_delimited_files(\n",
|
|
||||||
" path=[(datastore, \"uni-recipe-dataset/tabular/test.csv\")]\n",
|
|
||||||
")\n",
|
")\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# print the first 5 rows of the Dataset\n",
|
"# print the first 5 rows of the Dataset\n",
|
||||||
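The hunk above replaces the older `upload_files` + `Dataset.Tabular.from_delimited_files` pattern with `TabularDatasetFactory.register_pandas_dataframe`, which uploads and registers the data in one step. As a hedged follow-up sketch (using the names "train"/"test" registered above), the datasets can later be retrieved by name:

```python
# Hedged sketch: retrieving the registered datasets in a later session.
from azureml.core import Dataset

train_dataset = Dataset.get_by_name(ws, name="train")
test_dataset = Dataset.get_by_name(ws, name="test")
print(train_dataset.to_pandas_dataframe().head())  # preview the first rows
```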
@@ -358,7 +352,8 @@
|
|||||||
" enable_early_stopping=True,\n",
|
" enable_early_stopping=True,\n",
|
||||||
" training_data=train_dataset,\n",
|
" training_data=train_dataset,\n",
|
||||||
" label_column_name=TARGET_COLNAME,\n",
|
" label_column_name=TARGET_COLNAME,\n",
|
||||||
" n_cross_validations=5,\n",
|
" n_cross_validations=\"auto\", # Feel free to set to a small integer (>=2) if runtime is an issue.\n",
|
||||||
|
" cv_step_size=\"auto\",\n",
|
||||||
" verbosity=logging.INFO,\n",
|
" verbosity=logging.INFO,\n",
|
||||||
" max_cores_per_iteration=-1,\n",
|
" max_cores_per_iteration=-1,\n",
|
||||||
" compute_target=compute_target,\n",
|
" compute_target=compute_target,\n",
|
||||||
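The change above switches `n_cross_validations` to `"auto"` and adds `cv_step_size="auto"`. Conceptually, these control rolling-origin cross-validation folds; the toy sketch below (plain Python, not the AzureML implementation) illustrates how a fold count and step size carve up a series of `n_obs` points given a forecast horizon:

```python
# Toy illustration of rolling-origin CV folds (not the AzureML implementation).
n_folds, step, horizon, n_obs = 5, 2, 3, 30  # illustrative values
for k in range(n_folds):
    train_end = n_obs - horizon - k * step
    print(f"fold {k}: train[0:{train_end}) -> validate[{train_end}:{train_end + horizon})")
```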
@@ -570,10 +565,13 @@
|
|||||||
"name": "vlbejan"
|
"name": "vlbejan"
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
|
"kernel_info": {
|
||||||
|
"name": "python3"
|
||||||
|
},
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
@@ -585,7 +583,20 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.9"
|
"version": "3.8.10"
|
||||||
|
},
|
||||||
|
"microsoft": {
|
||||||
|
"ms_spell_check": {
|
||||||
|
"ms_spell_check_language": "en"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"nteract": {
|
||||||
|
"version": "nteract-front-end@1.0.0"
|
||||||
|
},
|
||||||
|
"vscode": {
|
||||||
|
"interpreter": {
|
||||||
|
"hash": "6bd77c88278e012ef31757c15997a7bea8c943977c43d6909403c00ae11d43ca"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
|
|||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-forecasting-univariate-recipe-run-experiment
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -7,7 +7,7 @@ compute instance.
|
|||||||
import argparse
|
import argparse
|
||||||
from azureml.core import Dataset, Run
|
from azureml.core import Dataset, Run
|
||||||
from azureml.automl.core.shared.constants import TimeSeriesInternal
|
from azureml.automl.core.shared.constants import TimeSeriesInternal
|
||||||
from sklearn.externals import joblib
|
import joblib
|
||||||
|
|
||||||
parser = argparse.ArgumentParser()
|
parser = argparse.ArgumentParser()
|
||||||
parser.add_argument(
|
parser.add_argument(
|
||||||
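The import change above reflects that the vendored `sklearn.externals.joblib` was removed from scikit-learn (deprecated in 0.21, removed in 0.23); the standalone `joblib` package is the drop-in replacement. A minimal hedged sketch, with a hypothetical model object and file name:

```python
# Hedged sketch: persisting and loading a model with standalone joblib.
import joblib

joblib.dump(model, "model.pkl")            # `model` and the path are hypothetical
restored_model = joblib.load("model.pkl")  # same call the scoring script uses
```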
|
|||||||
@@ -1,5 +1,21 @@
|
|||||||
{
|
{
|
||||||
"cells": [
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
||||||
|
"\n",
|
||||||
|
"Licensed under the MIT License."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -93,6 +109,7 @@
|
|||||||
"output[\"Resource Group\"] = ws.resource_group\n",
|
"output[\"Resource Group\"] = ws.resource_group\n",
|
||||||
"output[\"Location\"] = ws.location\n",
|
"output[\"Location\"] = ws.location\n",
|
||||||
"output[\"Experiment Name\"] = experiment.name\n",
|
"output[\"Experiment Name\"] = experiment.name\n",
|
||||||
|
"output[\"SDK Version\"] = azureml.core.VERSION\n",
|
||||||
"pd.set_option(\"display.max_colwidth\", None)\n",
|
"pd.set_option(\"display.max_colwidth\", None)\n",
|
||||||
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
||||||
"outputDf.T"
|
"outputDf.T"
|
||||||
@@ -553,273 +570,6 @@
|
|||||||
"automl_run.upload_file(\"outputs/scoring_explainer.pkl\", scoring_explainer_file_name)"
|
"automl_run.upload_file(\"outputs/scoring_explainer.pkl\", scoring_explainer_file_name)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Deploying the scoring and explainer models to a web service to Azure Kubernetes Service (AKS)\n",
|
|
||||||
"\n",
|
|
||||||
"We use the TreeScoringExplainer from azureml.interpret package to create the scoring explainer which will be used to compute the raw and engineered feature importances at the inference time. In the cell below, we register the AutoML model and the scoring explainer with the Model Management Service."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Register trained automl model present in the 'outputs' folder in the artifacts\n",
|
|
||||||
"original_model = automl_run.register_model(\n",
|
|
||||||
" model_name=\"automl_model\", model_path=\"outputs/model.pkl\"\n",
|
|
||||||
")\n",
|
|
||||||
"scoring_explainer_model = automl_run.register_model(\n",
|
|
||||||
" model_name=\"scoring_explainer\", model_path=\"outputs/scoring_explainer.pkl\"\n",
|
|
||||||
")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Create the conda dependencies for setting up the service\n",
|
|
||||||
"\n",
|
|
||||||
"We need to download the conda dependencies using the automl_run object."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.automl.core.shared import constants\n",
|
|
||||||
"from azureml.core.environment import Environment\n",
|
|
||||||
"\n",
|
|
||||||
"automl_run.download_file(constants.CONDA_ENV_FILE_PATH, \"myenv.yml\")\n",
|
|
||||||
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
|
|
||||||
"myenv"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Write the Entry Script\n",
|
|
||||||
"Write the script that will be used to predict on your model"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"%%writefile score.py\n",
|
|
||||||
"import joblib\n",
|
|
||||||
"import pandas as pd\n",
|
|
||||||
"from azureml.core.model import Model\n",
|
|
||||||
"from azureml.train.automl.runtime.automl_explain_utilities import (\n",
|
|
||||||
" automl_setup_model_explanations,\n",
|
|
||||||
")\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"def init():\n",
|
|
||||||
" global automl_model\n",
|
|
||||||
" global scoring_explainer\n",
|
|
||||||
"\n",
|
|
||||||
" # Retrieve the path to the model file using the model name\n",
|
|
||||||
" # Assume original model is named original_prediction_model\n",
|
|
||||||
" automl_model_path = Model.get_model_path(\"automl_model\")\n",
|
|
||||||
" scoring_explainer_path = Model.get_model_path(\"scoring_explainer\")\n",
|
|
||||||
"\n",
|
|
||||||
" automl_model = joblib.load(automl_model_path)\n",
|
|
||||||
" scoring_explainer = joblib.load(scoring_explainer_path)\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"def run(raw_data):\n",
|
|
||||||
" data = pd.read_json(raw_data, orient=\"records\")\n",
|
|
||||||
" # Make prediction\n",
|
|
||||||
" predictions = automl_model.predict(data)\n",
|
|
||||||
" # Setup for inferencing explanations\n",
|
|
||||||
" automl_explainer_setup_obj = automl_setup_model_explanations(\n",
|
|
||||||
" automl_model, X_test=data, task=\"classification\"\n",
|
|
||||||
" )\n",
|
|
||||||
" # Retrieve model explanations for engineered explanations\n",
|
|
||||||
" engineered_local_importance_values = scoring_explainer.explain(\n",
|
|
||||||
" automl_explainer_setup_obj.X_test_transform\n",
|
|
||||||
" )\n",
|
|
||||||
" # Retrieve model explanations for raw explanations\n",
|
|
||||||
" raw_local_importance_values = scoring_explainer.explain(\n",
|
|
||||||
" automl_explainer_setup_obj.X_test_transform, get_raw=True\n",
|
|
||||||
" )\n",
|
|
||||||
" # You can return any data type as long as it is JSON-serializable\n",
|
|
||||||
" return {\n",
|
|
||||||
" \"predictions\": predictions.tolist(),\n",
|
|
||||||
" \"engineered_local_importance_values\": engineered_local_importance_values,\n",
|
|
||||||
" \"raw_local_importance_values\": raw_local_importance_values,\n",
|
|
||||||
" }"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Create the InferenceConfig \n",
|
|
||||||
"Create the inference config that will be used when deploying the model"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.model import InferenceConfig\n",
|
|
||||||
"\n",
|
|
||||||
"inf_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Provision the AKS Cluster\n",
|
|
||||||
"This is a one time setup. You can reuse this cluster for multiple deployments after it has been created. If you delete the cluster or the resource group that contains it, then you would have to recreate it."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.compute import ComputeTarget, AksCompute\n",
|
|
||||||
"from azureml.core.compute_target import ComputeTargetException\n",
|
|
||||||
"\n",
|
|
||||||
"# Choose a name for your cluster.\n",
|
|
||||||
"aks_name = \"scoring-explain\"\n",
|
|
||||||
"\n",
|
|
||||||
"# Verify that cluster does not exist already\n",
|
|
||||||
"try:\n",
|
|
||||||
" aks_target = ComputeTarget(workspace=ws, name=aks_name)\n",
|
|
||||||
" print(\"Found existing cluster, use it.\")\n",
|
|
||||||
"except ComputeTargetException:\n",
|
|
||||||
" prov_config = AksCompute.provisioning_configuration(vm_size=\"STANDARD_D3_V2\")\n",
|
|
||||||
" aks_target = ComputeTarget.create(\n",
|
|
||||||
" workspace=ws, name=aks_name, provisioning_configuration=prov_config\n",
|
|
||||||
" )\n",
|
|
||||||
"aks_target.wait_for_completion(show_output=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Deploy web service to AKS"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Set the web service configuration (using default here)\n",
|
|
||||||
"from azureml.core.webservice import AksWebservice\n",
|
|
||||||
"from azureml.core.model import Model\n",
|
|
||||||
"\n",
|
|
||||||
"aks_config = AksWebservice.deploy_configuration()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"aks_service_name = \"model-scoring-local-aks\"\n",
|
|
||||||
"\n",
|
|
||||||
"aks_service = Model.deploy(\n",
|
|
||||||
" workspace=ws,\n",
|
|
||||||
" name=aks_service_name,\n",
|
|
||||||
" models=[scoring_explainer_model, original_model],\n",
|
|
||||||
" inference_config=inf_config,\n",
|
|
||||||
" deployment_config=aks_config,\n",
|
|
||||||
" deployment_target=aks_target,\n",
|
|
||||||
")\n",
|
|
||||||
"\n",
|
|
||||||
"aks_service.wait_for_deployment(show_output=True)\n",
|
|
||||||
"print(aks_service.state)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### View the service logs"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"aks_service.get_logs()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Consume the web service using run method to do the scoring and explanation of scoring.\n",
|
|
||||||
"We test the web sevice by passing data. Run() method retrieves API keys behind the scenes to make sure that call is authenticated."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Serialize the first row of the test data into json\n",
|
|
||||||
"X_test_json = X_test_df[:1].to_json(orient=\"records\")\n",
|
|
||||||
"print(X_test_json)\n",
|
|
||||||
"\n",
|
|
||||||
"# Call the service to get the predictions and the engineered and raw explanations\n",
|
|
||||||
"output = aks_service.run(X_test_json)\n",
|
|
||||||
"\n",
|
|
||||||
"# Print the predicted value\n",
|
|
||||||
"print(\"predictions:\\n{}\\n\".format(output[\"predictions\"]))\n",
|
|
||||||
"# Print the engineered feature importances for the predicted value\n",
|
|
||||||
"print(\n",
|
|
||||||
" \"engineered_local_importance_values:\\n{}\\n\".format(\n",
|
|
||||||
" output[\"engineered_local_importance_values\"]\n",
|
|
||||||
" )\n",
|
|
||||||
")\n",
|
|
||||||
"# Print the raw feature importances for the predicted value\n",
|
|
||||||
"print(\n",
|
|
||||||
" \"raw_local_importance_values:\\n{}\\n\".format(output[\"raw_local_importance_values\"])\n",
|
|
||||||
")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Clean up\n",
|
|
||||||
"Delete the service."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"aks_service.delete()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -869,9 +619,9 @@
|
|||||||
"friendly_name": "Classification of credit card fraudulent transactions using Automated ML",
|
"friendly_name": "Classification of credit card fraudulent transactions using Automated ML",
|
||||||
"index_order": 5,
|
"index_order": 5,
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
|
|||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-classification-credit-card-fraud-local
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -1,5 +1,27 @@
|
|||||||
{
|
{
|
||||||
"cells": [
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"hideCode": false,
|
||||||
|
"hidePrompt": false
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
||||||
|
"\n",
|
||||||
|
"Licensed under the MIT License."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {
|
||||||
|
"hideCode": false,
|
||||||
|
"hidePrompt": false
|
||||||
|
},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -93,6 +115,7 @@
|
|||||||
"output[\"Resource Group\"] = ws.resource_group\n",
|
"output[\"Resource Group\"] = ws.resource_group\n",
|
||||||
"output[\"Location\"] = ws.location\n",
|
"output[\"Location\"] = ws.location\n",
|
||||||
"output[\"Experiment Name\"] = experiment.name\n",
|
"output[\"Experiment Name\"] = experiment.name\n",
|
||||||
|
"output[\"SDK Version\"] = azureml.core.VERSION\n",
|
||||||
"pd.set_option(\"display.max_colwidth\", None)\n",
|
"pd.set_option(\"display.max_colwidth\", None)\n",
|
||||||
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
||||||
"outputDf.T"
|
"outputDf.T"
|
||||||
@@ -494,6 +517,30 @@
|
|||||||
"#### Create conda configuration for model explanations experiment from automl_run object"
|
"#### Create conda configuration for model explanations experiment from automl_run object"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import json\n",
|
||||||
|
"from azureml.core import Environment\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"def get_environment_safe(parent_run):\n",
|
||||||
|
" \"\"\"Get the environment from parent run\"\"\"\n",
|
||||||
|
" try:\n",
|
||||||
|
" return parent_run.get_environment()\n",
|
||||||
|
" except BaseException:\n",
|
||||||
|
" run_details = parent_run.get_details()\n",
|
||||||
|
" run_def = run_details.get(\"runDefinition\")\n",
|
||||||
|
" env = run_def.get(\"environment\")\n",
|
||||||
|
" if env is None:\n",
|
||||||
|
" raise\n",
|
||||||
|
" json.dump(env, open(\"azureml_environment.json\", \"w\"))\n",
|
||||||
|
" return Environment.load_from_directory(\".\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
@@ -501,8 +548,6 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"from azureml.core.runconfig import RunConfiguration\n",
|
"from azureml.core.runconfig import RunConfiguration\n",
|
||||||
"from azureml.core.conda_dependencies import CondaDependencies\n",
|
|
||||||
"import pkg_resources\n",
|
|
||||||
"\n",
|
"\n",
|
||||||
"# create a new RunConfig object\n",
|
"# create a new RunConfig object\n",
|
||||||
"conda_run_config = RunConfiguration(framework=\"python\")\n",
|
"conda_run_config = RunConfiguration(framework=\"python\")\n",
|
||||||
@@ -512,9 +557,7 @@
|
|||||||
"conda_run_config.environment.docker.enabled = True\n",
|
"conda_run_config.environment.docker.enabled = True\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# specify CondaDependencies obj\n",
|
"# specify CondaDependencies obj\n",
|
||||||
"conda_run_config.environment.python.conda_dependencies = (\n",
|
"conda_run_config.environment = get_environment_safe(automl_run)"
|
||||||
" automl_run.get_environment().python.conda_dependencies\n",
|
|
||||||
")"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
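The `get_environment_safe` fallback above serializes the run's environment definition to `azureml_environment.json` and reloads it with `Environment.load_from_directory`. A hedged sketch of the same round-trip in isolation (the environment name is illustrative):

```python
# Hedged sketch: saving an Environment to disk and loading it back,
# the same mechanism get_environment_safe() falls back on.
from azureml.core import Environment

env = Environment.get(workspace=ws, name="AzureML-Minimal")  # illustrative name
env.save_to_directory(".", overwrite=True)   # writes azureml_environment.json
restored = Environment.load_from_directory(".")
```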
@@ -643,28 +686,6 @@
|
|||||||
")"
|
")"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Create the conda dependencies for setting up the service\n",
|
|
||||||
"We need to create the conda dependencies comprising of the *azureml* packages using the training environment from the *automl_run*."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"conda_dep = automl_run.get_environment().python.conda_dependencies\n",
|
|
||||||
"\n",
|
|
||||||
"with open(\"myenv.yml\", \"w\") as f:\n",
|
|
||||||
" f.write(conda_dep.serialize_to_string())\n",
|
|
||||||
"with open(\"myenv.yml\", \"r\") as f:\n",
|
|
||||||
" print(f.read())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -687,7 +708,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"### Deploy the service\n",
|
"### Deploy the service\n",
|
||||||
"In the cell below, we deploy the service using the conda file and the scoring file from the previous steps. "
|
"In the cell below, we deploy the service using the automl training environment and the scoring file from the previous steps. "
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -709,7 +730,7 @@
|
|||||||
" description=\"Get local explanations for Machine test data\",\n",
|
" description=\"Get local explanations for Machine test data\",\n",
|
||||||
")\n",
|
")\n",
|
||||||
"\n",
|
"\n",
|
||||||
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
|
"myenv = get_environment_safe(automl_run)\n",
|
||||||
"inference_config = InferenceConfig(entry_script=\"score_explain.py\", environment=myenv)\n",
|
"inference_config = InferenceConfig(entry_script=\"score_explain.py\", environment=myenv)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Use configs and models generated above\n",
|
"# Use configs and models generated above\n",
|
||||||
@@ -882,8 +903,8 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%matplotlib inline\n",
|
"%matplotlib inline\n",
|
||||||
"test_pred = plt.scatter(y_test, y_pred_test, color=\"\")\n",
|
"test_pred = plt.scatter(y_test, y_pred_test, c=[\"b\"])\n",
|
||||||
"test_test = plt.scatter(y_test, y_test, color=\"g\")\n",
|
"test_test = plt.scatter(y_test, y_test, c=[\"g\"])\n",
|
||||||
"plt.legend(\n",
|
"plt.legend(\n",
|
||||||
" (test_pred, test_test), (\"prediction\", \"truth\"), loc=\"upper left\", fontsize=8\n",
|
" (test_pred, test_test), (\"prediction\", \"truth\"), loc=\"upper left\", fontsize=8\n",
|
||||||
")\n",
|
")\n",
|
||||||
@@ -918,9 +939,9 @@
|
|||||||
"friendly_name": "Automated ML run with featurization and model explainability.",
|
"friendly_name": "Automated ML run with featurization and model explainability.",
|
||||||
"index_order": 5,
|
"index_order": 5,
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
@@ -932,7 +953,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.7"
|
"version": "3.8.7"
|
||||||
},
|
},
|
||||||
"tags": [
|
"tags": [
|
||||||
"featurization",
|
"featurization",
|
||||||
|
|||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-regression-explanation-featurization
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -1,5 +1,21 @@
|
|||||||
{
|
{
|
||||||
"cells": [
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
||||||
|
"\n",
|
||||||
|
"Licensed under the MIT License."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -89,6 +105,7 @@
|
|||||||
"output[\"Resource Group\"] = ws.resource_group\n",
|
"output[\"Resource Group\"] = ws.resource_group\n",
|
||||||
"output[\"Location\"] = ws.location\n",
|
"output[\"Location\"] = ws.location\n",
|
||||||
"output[\"Run History Name\"] = experiment_name\n",
|
"output[\"Run History Name\"] = experiment_name\n",
|
||||||
|
"output[\"SDK Version\"] = azureml.core.VERSION\n",
|
||||||
"pd.set_option(\"display.max_colwidth\", None)\n",
|
"pd.set_option(\"display.max_colwidth\", None)\n",
|
||||||
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
"outputDf = pd.DataFrame(data=output, index=[\"\"])\n",
|
||||||
"outputDf.T"
|
"outputDf.T"
|
||||||
@@ -421,8 +438,8 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%matplotlib inline\n",
|
"%matplotlib inline\n",
|
||||||
"test_pred = plt.scatter(y_test, y_pred_test, color=\"\")\n",
|
"test_pred = plt.scatter(y_test, y_pred_test, c=[\"b\"])\n",
|
||||||
"test_test = plt.scatter(y_test, y_test, color=\"g\")\n",
|
"test_test = plt.scatter(y_test, y_test, c=[\"g\"])\n",
|
||||||
"plt.legend(\n",
|
"plt.legend(\n",
|
||||||
" (test_pred, test_test), (\"prediction\", \"truth\"), loc=\"upper left\", fontsize=8\n",
|
" (test_pred, test_test), (\"prediction\", \"truth\"), loc=\"upper left\", fontsize=8\n",
|
||||||
")\n",
|
")\n",
|
||||||
@@ -448,9 +465,9 @@
|
|||||||
"automated-machine-learning"
|
"automated-machine-learning"
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
|
|||||||
@@ -1,4 +0,0 @@
|
|||||||
name: auto-ml-regression
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
@@ -429,9 +429,9 @@
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
|
|||||||
@@ -557,9 +557,9 @@
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
|
|||||||
@@ -161,9 +161,9 @@
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
|
|||||||
@@ -215,9 +215,9 @@
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
|
|||||||
@@ -482,9 +482,9 @@
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
|
|||||||
@@ -302,9 +302,9 @@
|
|||||||
}
|
}
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
"display_name": "Python 3.6",
|
"display_name": "Python 3.8 - AzureML",
|
||||||
"language": "python",
|
"language": "python",
|
||||||
"name": "python36"
|
"name": "python38-azureml"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"language_info": {
|
||||||
"codemirror_mode": {
|
"codemirror_mode": {
|
||||||
|
|||||||
@@ -1,217 +0,0 @@
|
|||||||
|
|
||||||
NOTICES AND INFORMATION
|
|
||||||
Do Not Translate or Localize
|
|
||||||
|
|
||||||
This Azure Machine Learning service example notebooks repository includes material from the projects listed below.
|
|
||||||
|
|
||||||
|
|
||||||
1. SSD-Tensorflow (https://github.com/balancap/ssd-tensorflow)
|
|
||||||
|
|
||||||
|
|
||||||
%% SSD-Tensorflow NOTICES AND INFORMATION BEGIN HERE
|
|
||||||
=========================================
|
|
||||||
|
|
||||||
Apache License
|
|
||||||
Version 2.0, January 2004
|
|
||||||
http://www.apache.org/licenses/
|
|
||||||
|
|
||||||
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
|
|
||||||
|
|
||||||
1. Definitions.
|
|
||||||
|
|
||||||
"License" shall mean the terms and conditions for use, reproduction,
|
|
||||||
and distribution as defined by Sections 1 through 9 of this document.
|
|
||||||
|
|
||||||
"Licensor" shall mean the copyright owner or entity authorized by
|
|
||||||
the copyright owner that is granting the License.
|
|
||||||
|
|
||||||
"Legal Entity" shall mean the union of the acting entity and all
|
|
||||||
other entities that control, are controlled by, or are under common
|
|
||||||
control with that entity. For the purposes of this definition,
|
|
||||||
"control" means (i) the power, direct or indirect, to cause the
|
|
||||||
direction or management of such entity, whether by contract or
|
|
||||||
otherwise, or (ii) ownership of fifty percent (50%) or more of the
|
|
||||||
outstanding shares, or (iii) beneficial ownership of such entity.
|
|
||||||
|
|
||||||
"You" (or "Your") shall mean an individual or Legal Entity
|
|
||||||
exercising permissions granted by this License.
|
|
||||||
|
|
||||||
"Source" form shall mean the preferred form for making modifications,
|
|
||||||
including but not limited to software source code, documentation
|
|
||||||
source, and configuration files.
|
|
||||||
|
|
||||||
"Object" form shall mean any form resulting from mechanical
|
|
||||||
transformation or translation of a Source form, including but
|
|
||||||
not limited to compiled object code, generated documentation,
|
|
||||||
and conversions to other media types.
|
|
||||||
|
|
||||||
"Work" shall mean the work of authorship, whether in Source or
|
|
||||||
Object form, made available under the License, as indicated by a
|
|
||||||
copyright notice that is included in or attached to the work
|
|
||||||
(an example is provided in the Appendix below).
|
|
||||||
|
|
||||||
"Derivative Works" shall mean any work, whether in Source or Object
|
|
||||||
form, that is based on (or derived from) the Work and for which the
|
|
||||||
editorial revisions, annotations, elaborations, or other modifications
|
|
||||||
represent, as a whole, an original work of authorship. For the purposes
|
|
||||||
of this License, Derivative Works shall not include works that remain
|
|
||||||
separable from, or merely link (or bind by name) to the interfaces of,
|
|
||||||
the Work and Derivative Works thereof.
|
|
||||||
|
|
||||||
"Contribution" shall mean any work of authorship, including
|
|
||||||
the original version of the Work and any modifications or additions
|
|
||||||
to that Work or Derivative Works thereof, that is intentionally
|
|
||||||
submitted to Licensor for inclusion in the Work by the copyright owner
|
|
||||||
or by an individual or Legal Entity authorized to submit on behalf of
|
|
||||||
the copyright owner. For the purposes of this definition, "submitted"
|
|
||||||
means any form of electronic, verbal, or written communication sent
|
|
||||||
to the Licensor or its representatives, including but not limited to
|
|
||||||
communication on electronic mailing lists, source code control systems,
|
|
||||||
and issue tracking systems that are managed by, or on behalf of, the
|
|
||||||
Licensor for the purpose of discussing and improving the Work, but
|
|
||||||
excluding communication that is conspicuously marked or otherwise
|
|
||||||
designated in writing by the copyright owner as "Not a Contribution."
|
|
||||||
|
|
||||||
"Contributor" shall mean Licensor and any individual or Legal Entity
|
|
||||||
on behalf of whom a Contribution has been received by Licensor and
|
|
||||||
subsequently incorporated within the Work.
|
|
||||||
|
|
||||||
2. Grant of Copyright License. Subject to the terms and conditions of
|
|
||||||
this License, each Contributor hereby grants to You a perpetual,
|
|
||||||
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
|
||||||
copyright license to reproduce, prepare Derivative Works of,
|
|
||||||
publicly display, publicly perform, sublicense, and distribute the
|
|
||||||
Work and such Derivative Works in Source or Object form.
|
|
||||||
|
|
||||||
3. Grant of Patent License. Subject to the terms and conditions of
|
|
||||||
this License, each Contributor hereby grants to You a perpetual,
|
|
||||||
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
|
||||||
(except as stated in this section) patent license to make, have made,
|
|
||||||
use, offer to sell, sell, import, and otherwise transfer the Work,
|
|
||||||
where such license applies only to those patent claims licensable
|
|
||||||
by such Contributor that are necessarily infringed by their
|
|
||||||
Contribution(s) alone or by combination of their Contribution(s)
|
|
||||||
with the Work to which such Contribution(s) was submitted. If You
|
|
||||||
institute patent litigation against any entity (including a
|
|
||||||
cross-claim or counterclaim in a lawsuit) alleging that the Work
|
|
||||||
or a Contribution incorporated within the Work constitutes direct
|
|
||||||
or contributory patent infringement, then any patent licenses
|
|
||||||
granted to You under this License for that Work shall terminate
|
|
||||||
as of the date such litigation is filed.
|
|
||||||
|
|
||||||
4. Redistribution. You may reproduce and distribute copies of the
|
|
||||||
Work or Derivative Works thereof in any medium, with or without
|
|
||||||
modifications, and in Source or Object form, provided that You
|
|
||||||
meet the following conditions:
|
|
||||||
|
|
||||||
(a) You must give any other recipients of the Work or
|
|
||||||
Derivative Works a copy of this License; and
|
|
||||||
|
|
||||||
(b) You must cause any modified files to carry prominent notices
|
|
||||||
stating that You changed the files; and
|
|
||||||
|
|
||||||
(c) You must retain, in the Source form of any Derivative Works
|
|
||||||
that You distribute, all copyright, patent, trademark, and
|
|
||||||
attribution notices from the Source form of the Work,
|
|
||||||
excluding those notices that do not pertain to any part of
|
|
||||||
the Derivative Works; and
|
|
||||||
|
|
||||||
(d) If the Work includes a "NOTICE" text file as part of its
|
|
||||||
distribution, then any Derivative Works that You distribute must
|
|
||||||
include a readable copy of the attribution notices contained
|
|
||||||
within such NOTICE file, excluding those notices that do not
|
|
||||||
pertain to any part of the Derivative Works, in at least one
|
|
||||||
of the following places: within a NOTICE text file distributed
|
|
||||||
as part of the Derivative Works; within the Source form or
|
|
||||||
documentation, if provided along with the Derivative Works; or,
|
|
||||||
within a display generated by the Derivative Works, if and
|
|
||||||
wherever such third-party notices normally appear. The contents
|
|
||||||
of the NOTICE file are for informational purposes only and
|
|
||||||
do not modify the License. You may add Your own attribution
|
|
||||||
notices within Derivative Works that You distribute, alongside
|
|
||||||
or as an addendum to the NOTICE text from the Work, provided
|
|
||||||
that such additional attribution notices cannot be construed
|
|
||||||
as modifying the License.
|
|
||||||
|
|
||||||
You may add Your own copyright statement to Your modifications and
|
|
||||||
may provide additional or different license terms and conditions
|
|
||||||
for use, reproduction, or distribution of Your modifications, or
|
|
||||||
for any such Derivative Works as a whole, provided Your use,
|
|
||||||
reproduction, and distribution of the Work otherwise complies with
|
|
||||||
the conditions stated in this License.
|
|
||||||
|
|
||||||
5. Submission of Contributions. Unless You explicitly state otherwise,
|
|
||||||
any Contribution intentionally submitted for inclusion in the Work
|
|
||||||
by You to the Licensor shall be under the terms and conditions of
|
|
||||||
this License, without any additional terms or conditions.
|
|
||||||
Notwithstanding the above, nothing herein shall supersede or modify
|
|
||||||
the terms of any separate license agreement you may have executed
|
|
||||||
with Licensor regarding such Contributions.
|
|
||||||
|
|
||||||
6. Trademarks. This License does not grant permission to use the trade
|
|
||||||
names, trademarks, service marks, or product names of the Licensor,
|
|
||||||
except as required for reasonable and customary use in describing the
|
|
||||||
origin of the Work and reproducing the content of the NOTICE file.
|
|
||||||
|
|
||||||
7. Disclaimer of Warranty. Unless required by applicable law or
|
|
||||||
agreed to in writing, Licensor provides the Work (and each
|
|
||||||
Contributor provides its Contributions) on an "AS IS" BASIS,
|
|
||||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
|
||||||
implied, including, without limitation, any warranties or conditions
|
|
||||||
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
|
|
||||||
PARTICULAR PURPOSE. You are solely responsible for determining the
|
|
||||||
appropriateness of using or redistributing the Work and assume any
|
|
||||||
risks associated with Your exercise of permissions under this License.
|
|
||||||
|
|
||||||
8. Limitation of Liability. In no event and under no legal theory,
|
|
||||||
whether in tort (including negligence), contract, or otherwise,
|
|
||||||
unless required by applicable law (such as deliberate and grossly
|
|
||||||
negligent acts) or agreed to in writing, shall any Contributor be
|
|
||||||
liable to You for damages, including any direct, indirect, special,
|
|
||||||
incidental, or consequential damages of any character arising as a
|
|
||||||
result of this License or out of the use or inability to use the
|
|
||||||
Work (including but not limited to damages for loss of goodwill,
|
|
||||||
work stoppage, computer failure or malfunction, or any and all
|
|
||||||
other commercial damages or losses), even if such Contributor
|
|
||||||
has been advised of the possibility of such damages.
|
|
||||||
|
|
||||||
9. Accepting Warranty or Additional Liability. While redistributing
|
|
||||||
the Work or Derivative Works thereof, You may choose to offer,
|
|
||||||
and charge a fee for, acceptance of support, warranty, indemnity,
|
|
||||||
or other liability obligations and/or rights consistent with this
|
|
||||||
License. However, in accepting such obligations, You may act only
|
|
||||||
on Your own behalf and on Your sole responsibility, not on behalf
|
|
||||||
of any other Contributor, and only if You agree to indemnify,
|
|
||||||
defend, and hold each Contributor harmless for any liability
|
|
||||||
incurred by, or claims asserted against, such Contributor by reason
|
|
||||||
of your accepting any such warranty or additional liability.
|
|
||||||
|
|
||||||
END OF TERMS AND CONDITIONS
|
|
||||||
|
|
||||||
APPENDIX: How to apply the Apache License to your work.
|
|
||||||
|
|
||||||
To apply the Apache License to your work, attach the following
|
|
||||||
boilerplate notice, with the fields enclosed by brackets "[]"
|
|
||||||
replaced with your own identifying information. (Don't include
|
|
||||||
the brackets!) The text should be enclosed in the appropriate
|
|
||||||
comment syntax for the file format. We also recommend that a
|
|
||||||
file or class name and description of purpose be included on the
|
|
||||||
same "printed page" as the copyright notice for easier
|
|
||||||
identification within third-party archives.
|
|
||||||
|
|
||||||
Copyright [yyyy] [name of copyright owner]
|
|
||||||
|
|
||||||
Licensed under the Apache License, Version 2.0 (the "License");
|
|
||||||
you may not use this file except in compliance with the License.
|
|
||||||
You may obtain a copy of the License at
|
|
||||||
|
|
||||||
http://www.apache.org/licenses/LICENSE-2.0
|
|
||||||
|
|
||||||
Unless required by applicable law or agreed to in writing, software
|
|
||||||
distributed under the License is distributed on an "AS IS" BASIS,
|
|
||||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
||||||
See the License for the specific language governing permissions and
|
|
||||||
limitations under the License.
|
|
||||||
|
|
||||||
=========================================
|
|
||||||
END OF SSD-Tensorflow NOTICES AND INFORMATION
|
|
||||||
@@ -1,104 +0,0 @@
|
|||||||
|
|
||||||
# Notebooks for Microsoft Azure Machine Learning Hardware Accelerated Models SDK
|
|
||||||
|
|
||||||
Easily create and train a model using various deep neural networks (DNNs) as a featurizer for deployment to Azure or a Data Box Edge device for ultra-low-latency inferencing using FPGAs. These models are currently available:
|
|
||||||
|
|
||||||
* ResNet 50
|
|
||||||
* ResNet 152
|
|
||||||
* DenseNet-121
|
|
||||||
* VGG-16
|
|
||||||
* SSD-VGG
|
|
||||||
|
|
||||||
To learn more about the azureml-accel-model classes, see the section [Model Classes](#model-classes) below or the [Azure ML Accel Models SDK documentation](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel?view=azure-ml-py).
|
|
||||||
|
|
||||||
### Step 1: Create an Azure ML workspace
|
|
||||||
Follow [these instructions](https://docs.microsoft.com/en-us/azure/machine-learning/service/setup-create-workspace) to install the Azure ML SDK on your local machine, create an Azure ML workspace, and set up your notebook environment, which is required for the next step.
|
|
||||||
|
|
||||||
### Step 2: Check your FPGA quota
|
|
||||||
Use the Azure CLI to check whether you have quota.
|
|
||||||
|
|
||||||
```shell
|
|
||||||
az vm list-usage --location "eastus" -o table
|
|
||||||
```
|
|
||||||
|
|
||||||
The other locations are ``southeastasia``, ``westeurope``, and ``westus2``.
|
|
||||||
|
|
||||||
Under the "Name" column, look for "Standard PBS Family vCPUs" and ensure you have at least 6 vCPUs under "CurrentValue."
|
|
||||||
|
|
||||||
If you do not have quota, then submit a request form [here](https://aka.ms/accelerateAI).
|
|
||||||
|
|
||||||
### Step 3: Install the Azure ML Accelerated Models SDK
|
|
||||||
Once you have set up your environment, install the Azure ML Accel Models SDK. This package requires tensorflow >= 1.6,<2.0 to be installed.
|
|
||||||
|
|
||||||
If you already have tensorflow >= 1.6,<2.0 installed in your development environment, you can install the SDK package using:
|
|
||||||
|
|
||||||
```
|
|
||||||
pip install azureml-accel-models
|
|
||||||
```
|
|
||||||
|
|
||||||
If you do not have tensorflow >= 1.6,<2.0 and are using a CPU-only development environment, our SDK with tensorflow can be installed using:
|
|
||||||
|
|
||||||
```
|
|
||||||
pip install azureml-accel-models[cpu]
|
|
||||||
```
|
|
||||||
|
|
||||||
If your machine supports GPU (for example, on an [Azure DSVM](https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview)), then you can leverage the tensorflow-gpu functionality using:
|
|
||||||
|
|
||||||
```
|
|
||||||
pip install azureml-accel-models[gpu]
|
|
||||||
```
|
|
||||||
|
|
||||||
### Step 4: Follow our notebooks
|
|
||||||
|
|
||||||
We provide notebooks to walk through the following scenarios, linked below:
|
|
||||||
* [Quickstart](https://github.com/Azure/MachineLearningNotebooks/blob/33d6def8c30d3dd3a5bfbea50b9c727788185faf/how-to-use-azureml/deployment/accelerated-models/accelerated-models-quickstart.ipynb), deploy and inference a ResNet50 model trained on ImageNet
|
|
||||||
* [Object Detection](https://github.com/Azure/MachineLearningNotebooks/blob/33d6def8c30d3dd3a5bfbea50b9c727788185faf/how-to-use-azureml/deployment/accelerated-models/accelerated-models-object-detection.ipynb), deploy and inference an SSD-VGG model that can do object detection
|
|
||||||
* [Training models](https://github.com/Azure/MachineLearningNotebooks/blob/33d6def8c30d3dd3a5bfbea50b9c727788185faf/how-to-use-azureml/deployment/accelerated-models/accelerated-models-training.ipynb), train one of our accelerated models on the Kaggle Cats and Dogs dataset to see how to improve accuracy on custom datasets
|
|
||||||
|
|
||||||
**Note**: the above notebooks work only for tensorflow >= 1.6,<2.0.
|
|
||||||
|
|
||||||
<a name="model-classes"></a>
|
|
||||||
## Model Classes
|
|
||||||
As stated above, we support 5 Accelerated Models. Here's more information on their input and output tensors.
|
|
||||||
|
|
||||||
**Available models and output tensors**
|
|
||||||
|
|
||||||
The available models and the corresponding default classifier output tensors are below. This is the value that you would use during inferencing if you used the default classifier.
|
|
||||||
* Resnet50, QuantizedResnet50
|
|
||||||
``
|
|
||||||
output_tensors = "classifier_1/resnet_v1_50/predictions/Softmax:0"
|
|
||||||
``
|
|
||||||
* Resnet152, QuantizedResnet152
|
|
||||||
``
|
|
||||||
output_tensors = "classifier/resnet_v1_152/predictions/Softmax:0"
|
|
||||||
``
|
|
||||||
* Densenet121, QuantizedDensenet121
|
|
||||||
``
|
|
||||||
output_tensors = "classifier/densenet121/predictions/Softmax:0"
|
|
||||||
``
|
|
||||||
* Vgg16, QuantizedVgg16
|
|
||||||
``
|
|
||||||
output_tensors = "classifier/vgg_16/fc8/squeezed:0"
|
|
||||||
``
|
|
||||||
* SsdVgg, QuantizedSsdVgg
|
|
||||||
``
|
|
||||||
output_tensors = ['ssd_300_vgg/block4_box/Reshape_1:0', 'ssd_300_vgg/block7_box/Reshape_1:0', 'ssd_300_vgg/block8_box/Reshape_1:0', 'ssd_300_vgg/block9_box/Reshape_1:0', 'ssd_300_vgg/block10_box/Reshape_1:0', 'ssd_300_vgg/block11_box/Reshape_1:0', 'ssd_300_vgg/block4_box/Reshape:0', 'ssd_300_vgg/block7_box/Reshape:0', 'ssd_300_vgg/block8_box/Reshape:0', 'ssd_300_vgg/block9_box/Reshape:0', 'ssd_300_vgg/block10_box/Reshape:0', 'ssd_300_vgg/block11_box/Reshape:0']
|
|
||||||
``
|
|
||||||
|
|
||||||
For more information, please reference the azureml.accel.models package in the [Azure ML Python SDK documentation](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel.models?view=azure-ml-py).
|
|
||||||
|
|
||||||
**Input tensors**
|
|
||||||
|
|
||||||
The input_tensors value defaults to "Placeholder:0" and is created in the [Image Preprocessing](#construct-model) step in the line:
|
|
||||||
``
|
|
||||||
in_images = tf.placeholder(tf.string)
|
|
||||||
``
|
|
||||||
|
|
||||||
You can change the input_tensors name by doing this:
|
|
||||||
``
|
|
||||||
in_images = tf.placeholder(tf.string, name="images")
|
|
||||||
``
|
|
||||||
|
|
||||||
|
|
||||||
## Resources
|
|
||||||
* [Read more about FPGAs](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-accelerate-with-fpgas)
|
|
||||||
@@ -1,14 +0,0 @@
|
|||||||
# Model Deployment with Azure ML service
|
|
||||||
You can use Azure Machine Learning to package, debug, validate and deploy inference containers to a variety of compute targets. This process is known as "MLOps" (ML operationalization).
|
|
||||||
For more information please check out this article: https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where
|
|
||||||
|
|
||||||
## Get Started
|
|
||||||
To begin, you will need an ML workspace.
|
|
||||||
For more information please check out this article: https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-workspace
|
|
||||||
|
|
||||||
## Deploy to the cloud
|
|
||||||
You can deploy to the cloud using the Azure ML CLI or the Azure ML SDK.
|
|
||||||
- CLI example: https://aka.ms/azmlcli
|
|
||||||
- Notebook example: [model-register-and-deploy](./model-register-and-deploy.ipynb).
|
|
||||||
|
|
||||||

|
|
||||||