Compare commits

43 Commits

Author SHA1 Message Date
Bruce Leng
df2be5b5b0 fix drift-on-aks failure, will be ready by next release 2019-12-10 16:22:28 -08:00
Shané Winner
8f4efe15eb Update index.md 2019-12-10 09:05:23 -08:00
vizhur
d179080467 Merge pull request #690 from Azure/release_update/Release-163: update samples from Release-163 as a part of 1.0.79 SDK release 2019-12-09 15:41:03 -05:00
vizhur
0040644e7a update samples from Release-163 as a part of 1.0.79 SDK release 2019-12-09 20:09:30 +00:00
Shané Winner
8aa04307fb Update index.md 2019-12-03 10:24:18 -08:00
Shané Winner
a525da4488 Update index.md 2019-11-27 13:08:21 -08:00
Shané Winner
e149565a8a Merge pull request #679 from Azure/release_update/Release-30: update samples - test 2019-11-27 13:05:00 -08:00
vizhur
75610ec31c update samples - test 2019-11-27 21:02:21 +00:00
Shané Winner
0c2c450b6b Update index.md 2019-11-25 14:34:48 -08:00
Shané Winner
0d548eabff Merge pull request #677 from Azure/release_update/Release-29: update samples - test 2019-11-25 14:31:50 -08:00
vizhur
e4029801e6 update samples - test 2019-11-25 22:24:09 +00:00
Shané Winner
156974ee7b Update index.md 2019-11-25 11:42:53 -08:00
Shané Winner
1f05157d24 Merge pull request #676 from Azure/release_update/Release-160: update samples from Release-160 as a part of 1.0.76 SDK release 2019-11-25 11:39:27 -08:00
vizhur
2214ea8616 update samples from Release-160 as a part of 1.0.76 SDK release 2019-11-25 19:28:19 +00:00
Sheri Gilley
b54b2566de Merge pull request #667 from Azure/sdk-codetest: remove deprecated auto_prepare_environment 2019-11-21 09:25:15 -06:00
Sheri Gilley
57b0f701f8 remove deprecated auto_prepare_environment 2019-11-20 17:28:44 -06:00
Shané Winner
d658c85208 Update index.md 2019-11-12 14:59:15 -08:00
vizhur
a5f627a9b6 Merge pull request #655 from Azure/release_update/Release-28: update samples - test 2019-11-12 17:11:45 -05:00
Sheri Gilley
7db93bcb1d update comments 2019-01-22 17:18:19 -06:00
Sheri Gilley
fcbe925640 Merge branch 'sdk-codetest' of https://github.com/Azure/MachineLearningNotebooks into sdk-codetest 2019-01-07 13:06:12 -06:00
Sheri Gilley
bedfbd649e fix files 2019-01-07 13:06:02 -06:00
Sheri Gilley
fb760f648d Delete temp.py 2019-01-07 12:58:32 -06:00
Sheri Gilley
a9a0713d2f Delete donotupload.py 2019-01-07 12:57:58 -06:00
Sheri Gilley
c9d018b52c remove prepare environment 2019-01-07 12:56:54 -06:00
Sheri Gilley
53dbd0afcf hdi run config code 2019-01-07 11:29:40 -06:00
Sheri Gilley
e3a64b1f16 code for remote vm 2019-01-04 12:51:11 -06:00
Sheri Gilley
732eecfc7c update names 2019-01-04 12:45:28 -06:00
Sheri Gilley
6995c086ff change snippet names 2019-01-03 22:39:06 -06:00
Sheri Gilley
80bba4c7ae code for amlcompute section 2019-01-03 18:55:31 -06:00
Sheri Gilley
3c581b533f for local computer 2019-01-03 18:07:12 -06:00
Sheri Gilley
cc688caa4e change names 2019-01-03 08:53:49 -06:00
Sheri Gilley
da225e116e new code 2019-01-03 08:02:35 -06:00
Sheri Gilley
73c5d02880 Update quickstart.py 2018-12-17 12:23:03 -06:00
Sheri Gilley
e472b54f1b Update quickstart.py 2018-12-17 12:22:40 -06:00
Sheri Gilley
716c6d8bb1 add quickstart code 2018-11-06 11:27:58 -06:00
Sheri Gilley
23189c6f40 move folder 2018-10-17 16:24:46 -05:00
Sheri Gilley
361b57ed29 change all names to camelCase 2018-10-17 11:47:09 -05:00
Sheri Gilley
3f531fd211 try camelCase 2018-10-17 11:09:46 -05:00
Sheri Gilley
111f5e8d73 playing around 2018-10-17 10:46:33 -05:00
Sheri Gilley
96c59d5c2b testing 2018-10-17 09:56:04 -05:00
Sheri Gilley
ce3214b7c6 fix name 2018-10-16 17:33:24 -05:00
Sheri Gilley
53199d17de add delete 2018-10-16 16:54:08 -05:00
Sheri Gilley
54c883412c add test service 2018-10-16 16:49:41 -05:00
102 changed files with 3140 additions and 3213 deletions

View File

@@ -103,7 +103,7 @@
"source": [ "source": [
"import azureml.core\n", "import azureml.core\n",
"\n", "\n",
"print(\"This notebook was created using version 1.0.74.1 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.0.79 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },
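
Note on the hunk above: the only change is the version string the notebook prints (1.0.74.1 becomes 1.0.79). As a minimal sketch, assuming purely numeric dot-separated version strings, a notebook could compare the two values rather than only printing them. `parse_version` and `is_older` are hypothetical helpers, not part of the Azure ML SDK; the installed version is passed in as a plain string so the sketch runs without `azureml.core`.

```python
def parse_version(text):
    """Turn a numeric dotted version such as '1.0.74.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_older(installed, required):
    """Return True when `installed` is an older version than `required`."""
    return parse_version(installed) < parse_version(required)


# Versions taken from the hunk above. In the notebook the installed version
# would come from azureml.core.VERSION (assumed here to be purely numeric).
NOTEBOOK_VERSION = "1.0.79"
print(is_older("1.0.74.1", NOTEBOOK_VERSION))
print(is_older("1.0.79", NOTEBOOK_VERSION))
```

Comparing integer tuples rather than raw strings matters for cases like "1.0.8" against "1.0.79", where plain string comparison would order them the wrong way round.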

View File

@@ -8,6 +8,13 @@
"Licensed under the MIT License." "Licensed under the MIT License."
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/contrib/batch_inferencing/file-dataset-image-inference-mnist.png)"
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
@@ -29,8 +36,6 @@
"- Register the pretrained MNIST model into the model registry. \n", "- Register the pretrained MNIST model into the model registry. \n",
"- Use the registered model to do batch inference on the images in the data blob container.\n", "- Use the registered model to do batch inference on the images in the data blob container.\n",
"\n", "\n",
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/contrib/batch_inferencing/file-dataset-image-inference-mnist.png)\n",
"\n",
"## Prerequisites\n", "## Prerequisites\n",
"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the configuration Notebook located at https://github.com/Azure/MachineLearningNotebooks first. This sets you up with a working config file that has information on your workspace, subscription id, etc. " "If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the configuration Notebook located at https://github.com/Azure/MachineLearningNotebooks first. This sets you up with a working config file that has information on your workspace, subscription id, etc. "
] ]
@@ -485,7 +490,7 @@
"source": [ "source": [
"## Cleanup Compute resources\n", "## Cleanup Compute resources\n",
"\n", "\n",
"For re-occuring jobs, it may be wise to keep compute the compute resources and allow compute nodes to scale down to 0. However, since this is just a single-run job, we are free to release the allocated compute resources." "For re-occurring jobs, it may be wise to keep compute the compute resources and allow compute nodes to scale down to 0. However, since this is just a single-run job, we are free to release the allocated compute resources."
] ]
}, },
{ {
@@ -514,6 +519,27 @@
"name": "tracych" "name": "tracych"
} }
], ],
"friendly_name": "MNIST data inferencing using ParallelRunStep",
"exclude_from_index": false,
"index_order": 1,
"category": "Other notebooks",
"compute": [
"AML Compute"
],
"datasets": [
"MNIST"
],
"deployment": [
"None"
],
"framework": [
"None"
],
"tags": [
"Batch Inferencing",
"Pipeline"
],
"task": "Digit identification",
"kernelspec": { "kernelspec": {
"display_name": "Python 3.6", "display_name": "Python 3.6",
"language": "python", "language": "python",

View File

@@ -8,6 +8,13 @@
"Licensed under the MIT License." "Licensed under the MIT License."
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/contrib/batch_inferencing/tabular-dataset-inference-iris.png)"
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
@@ -28,9 +35,7 @@
"- Use the registered model to do batch inference on the CSV files in the data blob container.\n", "- Use the registered model to do batch inference on the CSV files in the data blob container.\n",
"\n", "\n",
"## Prerequisites\n", "## Prerequisites\n",
"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the configuration Notebook located at https://github.com/Azure/MachineLearningNotebooks first. This sets you up with a working config file that has information on your workspace, subscription id, etc. \n", "If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the configuration Notebook located at https://github.com/Azure/MachineLearningNotebooks first. This sets you up with a working config file that has information on your workspace, subscription id, etc. \n"
"\n",
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/contrib/batch_inferencing/tabular-dataset-inference-iris.png)"
] ]
}, },
{ {
@@ -460,7 +465,7 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Cleanup compute resources\n", "## Cleanup compute resources\n",
"For re-occuring jobs, it may be wise to keep compute the compute resources and allow compute nodes to scale down to 0. However, since this is just a single run job, we are free to release the allocated compute resources." "For re-occurring jobs, it may be wise to keep compute the compute resources and allow compute nodes to scale down to 0. However, since this is just a single run job, we are free to release the allocated compute resources."
] ]
}, },
{ {
@@ -489,6 +494,27 @@
"name": "tracych" "name": "tracych"
} }
], ],
"friendly_name": "IRIS data inferencing using ParallelRunStep",
"exclude_from_index": false,
"index_order": 1,
"category": "Other notebooks",
"compute": [
"AML Compute"
],
"datasets": [
"IRIS"
],
"deployment": [
"None"
],
"framework": [
"None"
],
"tags": [
"Batch Inferencing",
"Pipeline"
],
"task": "Recognize flower type",
"kernelspec": { "kernelspec": {
"display_name": "Python 3.6", "display_name": "Python 3.6",
"language": "python", "language": "python",
@@ -505,8 +531,7 @@
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.2" "version": "3.6.2"
}, }
"notice": "Copyright (c) Microsoft Corporation. All rights reserved.\u00e2\u20ac\u00afLicensed under the MIT License."
}, },
"nbformat": 4, "nbformat": 4,
"nbformat_minor": 2 "nbformat_minor": 2

View File

@@ -27,10 +27,10 @@ dependencies:
- azureml-explain-model - azureml-explain-model
- azureml-pipeline - azureml-pipeline
- azureml-contrib-interpret - azureml-contrib-interpret
- pandas_ml
- pytorch-transformers==1.0.0 - pytorch-transformers==1.0.0
- spacy==2.1.8 - spacy==2.1.8
- joblib - joblib
- onnxruntime==0.4.0
- https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz - https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
channels: channels:

View File

@@ -28,10 +28,10 @@ dependencies:
- azureml-explain-model - azureml-explain-model
- azureml-pipeline - azureml-pipeline
- azureml-contrib-interpret - azureml-contrib-interpret
- pandas_ml
- pytorch-transformers==1.0.0 - pytorch-transformers==1.0.0
- spacy==2.1.8 - spacy==2.1.8
- joblib - joblib
- onnxruntime==0.4.0
- https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz - https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
channels: channels:

View File

@@ -14,8 +14,9 @@ IF "%CONDA_EXE%"=="" GOTO CondaMissing
call conda activate %conda_env_name% 2>nul: call conda activate %conda_env_name% 2>nul:
if not errorlevel 1 ( if not errorlevel 1 (
echo Upgrading azureml-sdk[automl,notebooks,explain] in existing conda environment %conda_env_name% echo Upgrading existing conda environment %conda_env_name%
call pip install --upgrade azureml-sdk[automl,notebooks,explain] call pip uninstall azureml-train-automl -y -q
call conda env update --name %conda_env_name% --file %automl_env_file%
if errorlevel 1 goto ErrorExit if errorlevel 1 goto ErrorExit
) else ( ) else (
call conda env create -f %automl_env_file% -n %conda_env_name% call conda env create -f %automl_env_file% -n %conda_env_name%

View File

@@ -22,8 +22,9 @@ fi
if source activate $CONDA_ENV_NAME 2> /dev/null if source activate $CONDA_ENV_NAME 2> /dev/null
then then
echo "Upgrading azureml-sdk[automl,notebooks,explain] in existing conda environment" $CONDA_ENV_NAME echo "Upgrading existing conda environment" $CONDA_ENV_NAME
pip install --upgrade azureml-sdk[automl,notebooks,explain] && pip uninstall azureml-train-automl -y -q
conda env update --name $CONDA_ENV_NAME --file $AUTOML_ENV_FILE &&
jupyter nbextension uninstall --user --py azureml.widgets jupyter nbextension uninstall --user --py azureml.widgets
else else
conda env create -f $AUTOML_ENV_FILE -n $CONDA_ENV_NAME && conda env create -f $AUTOML_ENV_FILE -n $CONDA_ENV_NAME &&

View File

@@ -22,8 +22,9 @@ fi
if source activate $CONDA_ENV_NAME 2> /dev/null if source activate $CONDA_ENV_NAME 2> /dev/null
then then
echo "Upgrading azureml-sdk[automl,notebooks,explain] in existing conda environment" $CONDA_ENV_NAME echo "Upgrading existing conda environment" $CONDA_ENV_NAME
pip install --upgrade azureml-sdk[automl,notebooks,explain] && pip uninstall azureml-train-automl -y -q
conda env update --name $CONDA_ENV_NAME --file $AUTOML_ENV_FILE &&
jupyter nbextension uninstall --user --py azureml.widgets jupyter nbextension uninstall --user --py azureml.widgets
else else
conda env create -f $AUTOML_ENV_FILE -n $CONDA_ENV_NAME && conda env create -f $AUTOML_ENV_FILE -n $CONDA_ENV_NAME &&

View File

@@ -285,7 +285,8 @@
"|**task**|classification or regression or forecasting|\n", "|**task**|classification or regression or forecasting|\n",
"|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics: <br><i>accuracy</i><br><i>AUC_weighted</i><br><i>average_precision_score_weighted</i><br><i>norm_macro_recall</i><br><i>precision_score_weighted</i>|\n", "|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics: <br><i>accuracy</i><br><i>AUC_weighted</i><br><i>average_precision_score_weighted</i><br><i>norm_macro_recall</i><br><i>precision_score_weighted</i>|\n",
"|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n", "|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n",
"|**blacklist_models** or **whitelist_models** |*List* of *strings* indicating machine learning algorithms for AutoML to avoid in this run.<br><br> Allowed values for **Classification**<br><i>LogisticRegression</i><br><i>SGD</i><br><i>MultinomialNaiveBayes</i><br><i>BernoulliNaiveBayes</i><br><i>SVM</i><br><i>LinearSVM</i><br><i>KNN</i><br><i>DecisionTree</i><br><i>RandomForest</i><br><i>ExtremeRandomTrees</i><br><i>LightGBM</i><br><i>GradientBoosting</i><br><i>TensorFlowDNN</i><br><i>TensorFlowLinearClassifier</i><br><br>Allowed values for **Regression**<br><i>ElasticNet</i><br><i>GradientBoosting</i><br><i>DecisionTree</i><br><i>KNN</i><br><i>LassoLars</i><br><i>SGD</i><br><i>RandomForest</i><br><i>ExtremeRandomTrees</i><br><i>LightGBM</i><br><i>TensorFlowLinearRegressor</i><br><i>TensorFlowDNN</i><br><br>Allowed values for **Forecasting**<br><i>ElasticNet</i><br><i>GradientBoosting</i><br><i>DecisionTree</i><br><i>KNN</i><br><i>LassoLars</i><br><i>SGD</i><br><i>RandomForest</i><br><i>ExtremeRandomTrees</i><br><i>LightGBM</i><br><i>TensorFlowLinearRegressor</i><br><i>TensorFlowDNN</i><br><i>Arima</i><br><i>Prophet</i>|\n", "|**blacklist_models** | *List* of *strings* indicating machine learning algorithms for AutoML to avoid in this run. 
<br><br> Allowed values for **Classification**<br><i>LogisticRegression</i><br><i>SGD</i><br><i>MultinomialNaiveBayes</i><br><i>BernoulliNaiveBayes</i><br><i>SVM</i><br><i>LinearSVM</i><br><i>KNN</i><br><i>DecisionTree</i><br><i>RandomForest</i><br><i>ExtremeRandomTrees</i><br><i>LightGBM</i><br><i>GradientBoosting</i><br><i>TensorFlowDNN</i><br><i>TensorFlowLinearClassifier</i><br><br>Allowed values for **Regression**<br><i>ElasticNet</i><br><i>GradientBoosting</i><br><i>DecisionTree</i><br><i>KNN</i><br><i>LassoLars</i><br><i>SGD</i><br><i>RandomForest</i><br><i>ExtremeRandomTrees</i><br><i>LightGBM</i><br><i>TensorFlowLinearRegressor</i><br><i>TensorFlowDNN</i><br><br>Allowed values for **Forecasting**<br><i>ElasticNet</i><br><i>GradientBoosting</i><br><i>DecisionTree</i><br><i>KNN</i><br><i>LassoLars</i><br><i>SGD</i><br><i>RandomForest</i><br><i>ExtremeRandomTrees</i><br><i>LightGBM</i><br><i>TensorFlowLinearRegressor</i><br><i>TensorFlowDNN</i><br><i>Arima</i><br><i>Prophet</i>|\n",
"| **whitelist_models** | *List* of *strings* indicating machine learning algorithms for AutoML to use in this run. Same values listed above for **blacklist_models** allowed for **whitelist_models**.|\n",
"|**experiment_exit_score**| Value indicating the target for *primary_metric*. <br>Once the target is surpassed the run terminates.|\n", "|**experiment_exit_score**| Value indicating the target for *primary_metric*. <br>Once the target is surpassed the run terminates.|\n",
"|**experiment_timeout_minutes**| Maximum amount of time in minutes that all iterations combined can take before the experiment terminates.|\n", "|**experiment_timeout_minutes**| Maximum amount of time in minutes that all iterations combined can take before the experiment terminates.|\n",
"|**enable_early_stopping**| Flag to enble early termination if the score is not improving in the short term.|\n", "|**enable_early_stopping**| Flag to enble early termination if the score is not improving in the short term.|\n",
@@ -557,7 +558,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.automl.core.onnx_convert import OnnxConverter\n", "from azureml.automl.runtime.onnx_convert import OnnxConverter\n",
"onnx_fl_path = \"./best_model.onnx\"\n", "onnx_fl_path = \"./best_model.onnx\"\n",
"OnnxConverter.save_onnx_model(onnx_mdl, onnx_fl_path)" "OnnxConverter.save_onnx_model(onnx_mdl, onnx_fl_path)"
] ]
@@ -566,17 +567,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### Predict with the ONNX model, using onnxruntime package\n", "### Predict with the ONNX model, using onnxruntime package"
"#### Note: The code will install the onnxruntime==0.4.0 if not installed. Newer versions of the onnxruntime have compatibility issues."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"test_df = test_dataset.to_pandas_dataframe()"
] ]
}, },
{ {
@@ -595,21 +586,8 @@
"else:\n", "else:\n",
" python_version_compatible = False\n", " python_version_compatible = False\n",
"\n", "\n",
"onnxrt_present = False\n", "import onnxruntime\n",
"try:\n", "from azureml.automl.runtime.onnx_convert import OnnxInferenceHelper\n",
" import onnxruntime\n",
" from azureml.automl.core.onnx_convert import OnnxInferenceHelper \n",
" from onnxruntime import __version__ as ORT_VER\n",
" if ORT_VER == '0.4.0':\n",
" onnxrt_present = True\n",
"except ImportError:\n",
" onnxrt_present = False\n",
" \n",
"# Install the onnxruntime if the version 0.4.0 is not installed.\n",
"if not onnxrt_present:\n",
" print(\"Installing the onnxruntime version 0.4.0.\")\n",
" !{sys.executable} -m pip install --user --force-reinstall onnxruntime==0.4.0\n",
" onnxrt_present = True\n",
"\n", "\n",
"def get_onnx_res(run):\n", "def get_onnx_res(run):\n",
" res_path = 'onnx_resource.json'\n", " res_path = 'onnx_resource.json'\n",
@@ -618,7 +596,8 @@
" onnx_res = json.load(f)\n", " onnx_res = json.load(f)\n",
" return onnx_res\n", " return onnx_res\n",
"\n", "\n",
"if onnxrt_present and python_version_compatible: \n", "if python_version_compatible:\n",
" test_df = test_dataset.to_pandas_dataframe()\n",
" mdl_bytes = onnx_mdl.SerializeToString()\n", " mdl_bytes = onnx_mdl.SerializeToString()\n",
" onnx_res = get_onnx_res(best_run)\n", " onnx_res = get_onnx_res(best_run)\n",
"\n", "\n",
@@ -628,10 +607,7 @@
" print(pred_onnx)\n", " print(pred_onnx)\n",
" print(pred_prob_onnx)\n", " print(pred_prob_onnx)\n",
"else:\n", "else:\n",
" if not python_version_compatible:\n", " print('Please use Python version 3.6 or 3.7 to run the inference helper.')"
" print('Please use Python version 3.6 or 3.7 to run the inference helper.') \n",
" if not onnxrt_present:\n",
" print('Please install the onnxruntime package to do the prediction with ONNX model.')"
] ]
}, },
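
Note on the hunk above: the cell now depends only on `python_version_compatible`, and its message names Python 3.6 and 3.7 as the supported versions. The code that sets the flag is not part of the diff, so the following is a sketch under that assumption only. `is_supported_python` is a hypothetical helper, and the supported set is taken from the printed message.

```python
import sys

# Supported (major, minor) pairs, taken from the message in the hunk above.
SUPPORTED_PYTHON = ((3, 6), (3, 7))


def is_supported_python(version_info, supported=SUPPORTED_PYTHON):
    """Return True when the (major, minor) pair is in the supported set."""
    return (version_info[0], version_info[1]) in supported


python_version_compatible = is_supported_python(sys.version_info)
if not python_version_compatible:
    print('Please use Python version 3.6 or 3.7 to run the inference helper.')
```

Passing `version_info` as an argument keeps the check testable without changing the running interpreter.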
{ {
@@ -665,20 +641,6 @@
"best_run, fitted_model = remote_run.get_output()" "best_run, fitted_model = remote_run.get_output()"
] ]
}, },
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import shutil\n",
"\n",
"sript_folder = os.path.join(os.getcwd(), 'inference')\n",
"project_folder = '/inference'\n",
"os.makedirs(project_folder, exist_ok=True)"
]
},
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
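
Note on the import changes above: `OnnxConverter` and `OnnxInferenceHelper` move from `azureml.automl.core.onnx_convert` to `azureml.automl.runtime.onnx_convert`. A minimal sketch of how code that has to run against both SDK layouts could try the new path first and fall back to the old one. `import_first_available` is a hypothetical helper, not an SDK function; only the two module paths come from the diff.

```python
import importlib


def import_first_available(module_names):
    """Import and return the first module in `module_names` that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate path
    raise ImportError(
        "none of these modules could be imported: %s" % ", ".join(module_names)
    )


# Module paths as they appear in the diff: new location first, old one second.
ONNX_CONVERT_PATHS = [
    "azureml.automl.runtime.onnx_convert",
    "azureml.automl.core.onnx_convert",
]

try:
    onnx_convert = import_first_available(ONNX_CONVERT_PATHS)
    print("using", onnx_convert.__name__)
except ImportError as err:
    # Reached when neither path is importable, e.g. the SDK is not installed.
    print(err)
```

The notebooks in this release simply switch to the new path; a fallback like this only makes sense for code that must support SDK versions on both sides of the move.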

View File

@@ -451,7 +451,7 @@
"AML Compute" "AML Compute"
], ],
"datasets": [ "datasets": [
"creditcard" "Creditcard"
], ],
"deployment": [ "deployment": [
"None" "None"

View File

@@ -522,6 +522,9 @@
"datasets": [ "datasets": [
"None" "None"
], ],
"compute": [
"AML Compute"
],
"deployment": [ "deployment": [
"None" "None"
], ],

View File

@@ -323,7 +323,8 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.train.automl import AutoMLStep, AutoMLConfig\n", "from azureml.train.automl import AutoMLConfig\n",
"from azureml.train.automl.runtime import AutoMLStep\n",
"\n", "\n",
"automl_settings = {\n", "automl_settings = {\n",
" \"iteration_timeout_minutes\": 20,\n", " \"iteration_timeout_minutes\": 20,\n",
@@ -440,7 +441,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"training_pipeline_run.wait_for_completion()" "training_pipeline_run.wait_for_completion(show_output=False)"
] ]
}, },
{ {

View File

@@ -301,7 +301,7 @@
"source": [ "source": [
"### Setting forecaster maximum horizon \n", "### Setting forecaster maximum horizon \n",
"\n", "\n",
"The forecast horizon is the number of periods into the future that the model should predict. Here, we set the horizon to 4 periods (i.e. 4 months). Notice that this is much shorter than the number of days in the test set; we will need to use a rolling test to evaluate the performance on the whole test set. For more discussion of forecast horizons and guiding principles for setting them, please see the [energy demand notebook](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand). " "The forecast horizon is the number of periods into the future that the model should predict. Here, we set the horizon to 12 periods (i.e. 12 months). Notice that this is much shorter than the number of months in the test set; we will need to use a rolling test to evaluate the performance on the whole test set. For more discussion of forecast horizons and guiding principles for setting them, please see the [energy demand notebook](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand). "
] ]
}, },
{ {
@@ -363,7 +363,7 @@
" label_column_name=target_column_name,\n", " label_column_name=target_column_name,\n",
" validation_data=valid_dataset, \n", " validation_data=valid_dataset, \n",
" verbosity=logging.INFO,\n", " verbosity=logging.INFO,\n",
" compute_target = compute_target,\n", " compute_target=compute_target,\n",
" max_concurrent_iterations=4,\n", " max_concurrent_iterations=4,\n",
" max_cores_per_iteration=-1,\n", " max_cores_per_iteration=-1,\n",
" **automl_settings)" " **automl_settings)"

View File

@@ -42,7 +42,7 @@
"\n", "\n",
"AutoML highlights here include built-in holiday featurization, accessing engineered feature names, and working with the `forecast` function. Please also look at the additional forecasting notebooks, which document lagging, rolling windows, forecast quantiles, other ways to use the forecast function, and forecaster deployment.\n", "AutoML highlights here include built-in holiday featurization, accessing engineered feature names, and working with the `forecast` function. Please also look at the additional forecasting notebooks, which document lagging, rolling windows, forecast quantiles, other ways to use the forecast function, and forecaster deployment.\n",
"\n", "\n",
"Make sure you have executed the [configuration](../configuration.ipynb) before running this notebook.\n", "Make sure you have executed the [configuration notebook](../../../configuration.ipynb) before running this notebook.\n",
"\n", "\n",
"Notebook synopsis:\n", "Notebook synopsis:\n",
"1. Creating an Experiment in an existing Workspace\n", "1. Creating an Experiment in an existing Workspace\n",
@@ -161,7 +161,7 @@
"source": [ "source": [
"## Data\n", "## Data\n",
"\n", "\n",
"The [Machine Learning service workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-workspace), is paired with the storage account, which contains the default data store. We will use it to upload the bike share data and create [tabular dataset](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py) for training. A tabular dataset defines a series of lazily-evaluated, immutable operations to load data from the data source into tabular representation." "The [Machine Learning service workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-workspace) is paired with the storage account, which contains the default data store. We will use it to upload the bike share data and create [tabular dataset](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py) for training. A tabular dataset defines a series of lazily-evaluated, immutable operations to load data from the data source into tabular representation."
] ]
}, },
{ {
@@ -202,7 +202,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"dataset = Dataset.Tabular.from_delimited_files(path = [(datastore, 'dataset/bike-no.csv')]).with_timestamp_columns(fine_grain_timestamp=time_column_name) \n", "dataset = Dataset.Tabular.from_delimited_files(path = [(datastore, 'dataset/bike-no.csv')]).with_timestamp_columns(fine_grain_timestamp=time_column_name) \n",
"dataset.take(5).to_pandas_dataframe()" "dataset.take(5).to_pandas_dataframe().reset_index(drop=True)"
] ]
}, },
{ {
@@ -221,8 +221,8 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# select data that occurs before a specified date\n", "# select data that occurs before a specified date\n",
"train = dataset.time_before(datetime(2012, 9, 1))\n", "train = dataset.time_before(datetime(2012, 8, 31), include_boundary=True)\n",
"train.to_pandas_dataframe().tail(5)" "train.to_pandas_dataframe().tail(5).reset_index(drop=True)"
] ]
}, },
{ {
@@ -231,8 +231,8 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"test = dataset.time_after(datetime(2012, 8, 31))\n", "test = dataset.time_after(datetime(2012, 9, 1), include_boundary=True)\n",
"test.to_pandas_dataframe().head(5)" "test.to_pandas_dataframe().head(5).reset_index(drop=True)"
] ]
}, },
{ {
@@ -247,7 +247,7 @@
"|-|-|\n", "|-|-|\n",
"|**task**|forecasting|\n", "|**task**|forecasting|\n",
"|**primary_metric**|This is the metric that you want to optimize.<br> Forecasting supports the following primary metrics <br><i>spearman_correlation</i><br><i>normalized_root_mean_squared_error</i><br><i>r2_score</i><br><i>normalized_mean_absolute_error</i>\n", "|**primary_metric**|This is the metric that you want to optimize.<br> Forecasting supports the following primary metrics <br><i>spearman_correlation</i><br><i>normalized_root_mean_squared_error</i><br><i>r2_score</i><br><i>normalized_mean_absolute_error</i>\n",
"|**blacklist_models**|Models in blacklist won't be used by AutoML. All supported models can be found at [here](https://docs.microsoft.com/en-us/python/api/azureml-train-automl/azureml.train.automl.constants.supportedmodels.regression?view=azure-ml-py).|\n", "|**blacklist_models**|Models in blacklist won't be used by AutoML. All supported models can be found at [here](https://docs.microsoft.com/en-us/python/api/azureml-train-automl-client/azureml.train.automl.constants.supportedmodels.forecasting?view=azure-ml-py).|\n",
"|**experiment_timeout_minutes**|Experimentation timeout in minutes.|\n", "|**experiment_timeout_minutes**|Experimentation timeout in minutes.|\n",
"|**training_data**|Input dataset, containing both features and label column.|\n", "|**training_data**|Input dataset, containing both features and label column.|\n",
"|**label_column_name**|The name of the label column.|\n", "|**label_column_name**|The name of the label column.|\n",
@@ -309,7 +309,7 @@
" training_data=train,\n", " training_data=train,\n",
" label_column_name=target_column_name,\n", " label_column_name=target_column_name,\n",
" compute_target=compute_target,\n", " compute_target=compute_target,\n",
" enable_early_stopping = True,\n", " enable_early_stopping=True,\n",
" n_cross_validations=3, \n", " n_cross_validations=3, \n",
" max_concurrent_iterations=4,\n", " max_concurrent_iterations=4,\n",
" max_cores_per_iteration=-1,\n", " max_cores_per_iteration=-1,\n",
@@ -586,7 +586,7 @@
], ],
"category": "tutorial", "category": "tutorial",
"compute": [ "compute": [
"remote" "Remote"
], ],
"datasets": [ "datasets": [
"BikeShare" "BikeShare"
@@ -625,7 +625,7 @@
"tags": [ "tags": [
"Forecasting" "Forecasting"
], ],
"task": "forecasting", "task": "Forecasting",
"version": 3 "version": 3
}, },
"nbformat": 4, "nbformat": 4,
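
Note on the train/test split hunks above: the cut-offs change to 2012-08-31 for training and 2012-09-01 for testing, and both calls now pass `include_boundary=True`. A minimal sketch in plain Python of what that combination produces, assuming daily timestamps. The `time_before` and `time_after` functions below are hypothetical stand-ins written for illustration, not the `TabularDataset` methods.

```python
from datetime import datetime, timedelta


def time_before(rows, end, include_boundary):
    """Keep timestamps before `end`, optionally including `end` itself."""
    return [t for t in rows if t < end or (include_boundary and t == end)]


def time_after(rows, start, include_boundary):
    """Keep timestamps after `start`, optionally including `start` itself."""
    return [t for t in rows if t > start or (include_boundary and t == start)]


# Five daily timestamps around the cut-off used in the notebook.
days = [datetime(2012, 8, 30) + timedelta(days=i) for i in range(5)]

# Cut-offs from the new side of the diff: adjacent days, both inclusive.
train = time_before(days, datetime(2012, 8, 31), include_boundary=True)
test = time_after(days, datetime(2012, 9, 1), include_boundary=True)

overlap = set(train) & set(test)
print(len(train), len(test), len(overlap))
```

With adjacent, inclusive cut-offs every day lands in exactly one set. Under the same inclusive helpers, the old cut-offs (before 2012-09-01, after 2012-08-31) would place both boundary days in both sets; whether the old notebook actually overlapped depends on the SDK default for `include_boundary`, which the diff does not show.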

View File

@@ -1,6 +1,6 @@
import argparse import argparse
import azureml.train.automl import azureml.train.automl
from azureml.automl.core._vendor.automl.client.core.runtime import forecasting_models from azureml.automl.runtime._vendor.automl.client.core.runtime import forecasting_models
from azureml.core import Run from azureml.core import Run
from sklearn.externals import joblib from sklearn.externals import joblib
import forecasting_helper import forecasting_helper
@@ -32,18 +32,17 @@ test_dataset = run.input_datasets['test_data']
grain_column_names = [] grain_column_names = []
df = test_dataset.to_pandas_dataframe() df = test_dataset.to_pandas_dataframe().reset_index(drop=True)
X_test_df = test_dataset.drop_columns(columns=[target_column_name]) X_test_df = test_dataset.drop_columns(columns=[target_column_name]).to_pandas_dataframe().reset_index(drop=True)
y_test_df = test_dataset.with_timestamp_columns( y_test_df = test_dataset.with_timestamp_columns(None).keep_columns(columns=[target_column_name]).to_pandas_dataframe()
None).keep_columns(columns=[target_column_name])
fitted_model = joblib.load('model.pkl') fitted_model = joblib.load('model.pkl')
df_all = forecasting_helper.do_rolling_forecast( df_all = forecasting_helper.do_rolling_forecast(
fitted_model, fitted_model,
X_test_df.to_pandas_dataframe(), X_test_df,
y_test_df.to_pandas_dataframe().values.T[0], y_test_df.values.T[0],
target_column_name, target_column_name,
time_column_name, time_column_name,
max_horizon, max_horizon,

View File

@@ -31,8 +31,8 @@
"1. [Results](#Results)\n", "1. [Results](#Results)\n",
"\n", "\n",
"Advanced Forecasting\n", "Advanced Forecasting\n",
"1. [Advanced Training](#Advanced Training)\n", "1. [Advanced Training](#advanced_training)\n",
"1. [Advanced Results](#Advanced Results)" "1. [Advanced Results](#advanced_results)"
] ]
}, },
{ {
@@ -211,7 +211,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"dataset = Dataset.Tabular.from_delimited_files(path = \"https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/nyc_energy.csv\").with_timestamp_columns(fine_grain_timestamp=time_column_name) \n", "dataset = Dataset.Tabular.from_delimited_files(path = \"https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/nyc_energy.csv\").with_timestamp_columns(fine_grain_timestamp=time_column_name) \n",
"dataset.take(5).to_pandas_dataframe()" "dataset.take(5).to_pandas_dataframe().reset_index(drop=True)"
] ]
}, },
{ {
@@ -253,7 +253,7 @@
"source": [ "source": [
"# split into train based on time\n", "# split into train based on time\n",
"train = dataset.time_before(datetime(2017, 8, 8, 5), include_boundary=True)\n", "train = dataset.time_before(datetime(2017, 8, 8, 5), include_boundary=True)\n",
"train.to_pandas_dataframe().sort_values(time_column_name).tail(5)" "train.to_pandas_dataframe().sort_values(time_column_name).tail(5).reset_index(drop=True)"
] ]
}, },
{ {
@@ -263,8 +263,8 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# split into test based on time\n", "# split into test based on time\n",
"test = dataset.time_between(datetime(2017, 8, 8, 5), datetime(2017, 8, 10, 5))\n", "test = dataset.time_between(datetime(2017, 8, 8, 6), datetime(2017, 8, 10, 5))\n",
"test.to_pandas_dataframe().head(5)" "test.to_pandas_dataframe().head(5).reset_index(drop=True)"
] ]
}, },
{ {
@@ -301,7 +301,7 @@
"|-|-|\n", "|-|-|\n",
"|**task**|forecasting|\n", "|**task**|forecasting|\n",
"|**primary_metric**|This is the metric that you want to optimize.<br> Forecasting supports the following primary metrics <br><i>spearman_correlation</i><br><i>normalized_root_mean_squared_error</i><br><i>r2_score</i><br><i>normalized_mean_absolute_error</i>|\n", "|**primary_metric**|This is the metric that you want to optimize.<br> Forecasting supports the following primary metrics <br><i>spearman_correlation</i><br><i>normalized_root_mean_squared_error</i><br><i>r2_score</i><br><i>normalized_mean_absolute_error</i>|\n",
"|**blacklist_models**|Models in blacklist won't be used by AutoML. All supported models can be found at [here](https://docs.microsoft.com/en-us/python/api/azureml-train-automl/azureml.train.automl.constants.supportedmodels.regression?view=azure-ml-py).|\n", "|**blacklist_models**|Models in blacklist won't be used by AutoML. All supported models can be found at [here](https://docs.microsoft.com/en-us/python/api/azureml-train-automl-client/azureml.train.automl.constants.supportedmodels.forecasting?view=azure-ml-py).|\n",
"|**experiment_timeout_minutes**|Maximum amount of time in minutes that the experiment take before it terminates.|\n", "|**experiment_timeout_minutes**|Maximum amount of time in minutes that the experiment take before it terminates.|\n",
"|**training_data**|The training data to be used within the experiment.|\n", "|**training_data**|The training data to be used within the experiment.|\n",
"|**label_column_name**|The name of the label column.|\n", "|**label_column_name**|The name of the label column.|\n",
@@ -337,7 +337,7 @@
" training_data=train,\n", " training_data=train,\n",
" label_column_name=target_column_name,\n", " label_column_name=target_column_name,\n",
" compute_target=compute_target,\n", " compute_target=compute_target,\n",
" enable_early_stopping = True,\n", " enable_early_stopping=True,\n",
" n_cross_validations=3, \n", " n_cross_validations=3, \n",
" verbosity=logging.INFO,\n", " verbosity=logging.INFO,\n",
" **automl_settings)" " **automl_settings)"
@@ -454,7 +454,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"X_test = test.to_pandas_dataframe()\n", "X_test = test.to_pandas_dataframe().reset_index(drop=True)\n",
"y_test = X_test.pop(target_column_name).values" "y_test = X_test.pop(target_column_name).values"
] ]
}, },
@@ -463,11 +463,7 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"### Forecast Function\n", "### Forecast Function\n",
"For forecasting, we will use the forecast function instead of the predict function. There are two reasons for this.\n", "For forecasting, we will use the forecast function instead of the predict function. Using the predict method would result in getting predictions for EVERY horizon the forecaster can predict at. This is useful when training and evaluating the performance of the forecaster at various horizons, but the level of detail is excessive for normal use. Forecast function also can handle more complicated scenarios, see notebook on [high frequency forecasting](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/forecasting-high-frequency/automl-forecasting-function.ipynb)."
"\n",
"We need to pass the recent values of the target variable y, whereas the scikit-compatible predict function only takes the non-target variables 'test'. In our case, the test data immediately follows the training data, and we fill the target variable with NaN. The NaN serves as a question mark for the forecaster to fill with the actuals. Using the forecast function will produce forecasts using the shortest possible forecast horizon. The last time at which a definite (non-NaN) value is seen is the forecast origin - the last time when the value of the target is known.\n",
"\n",
"Using the predict method would result in getting predictions for EVERY horizon the forecaster can predict at. This is useful when training and evaluating the performance of the forecaster at various horizons, but the level of detail is excessive for normal use."
] ]
}, },
{ {
@@ -476,15 +472,10 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Replace ALL values in y by NaN.\n",
"# The forecast origin will be at the beginning of the first forecast period.\n",
"# (Which is the same time as the end of the last training period.)\n",
"y_query = y_test.copy().astype(np.float)\n",
"y_query.fill(np.nan)\n",
"# The featurized data, aligned to y, will also be returned.\n", "# The featurized data, aligned to y, will also be returned.\n",
"# This contains the assumptions that were made in the forecast\n", "# This contains the assumptions that were made in the forecast\n",
"# and helps align the forecast to the original data\n", "# and helps align the forecast to the original data\n",
"y_predictions, X_trans = fitted_model.forecast(X_test, y_query)" "y_predictions, X_trans = fitted_model.forecast(X_test)"
] ]
}, },
{ {
@@ -557,7 +548,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Advanced Training\n", "## Advanced Training <a id=\"advanced_training\"></a>\n",
"We did not use lags in the previous model specification. In effect, the prediction was the result of a simple regression on date, grain and any additional features. This is often a very good prediction as common time series patterns like seasonality and trends can be captured in this manner. Such simple regression is horizon-less: it doesn't matter how far into the future we are predicting, because we are not using past data. In the previous example, the horizon was only used to split the data for cross-validation." "We did not use lags in the previous model specification. In effect, the prediction was the result of a simple regression on date, grain and any additional features. This is often a very good prediction as common time series patterns like seasonality and trends can be captured in this manner. Such simple regression is horizon-less: it doesn't matter how far into the future we are predicting, because we are not using past data. In the previous example, the horizon was only used to split the data for cross-validation."
] ]
}, },
@@ -642,7 +633,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Advanced Results\n", "## Advanced Results<a id=\"advanced_results\"></a>\n",
"We did not use lags in the previous model specification. In effect, the prediction was the result of a simple regression on date, grain and any additional features. This is often a very good prediction as common time series patterns like seasonality and trends can be captured in this manner. Such simple regression is horizon-less: it doesn't matter how far into the future we are predicting, because we are not using past data. In the previous example, the horizon was only used to split the data for cross-validation." "We did not use lags in the previous model specification. In effect, the prediction was the result of a simple regression on date, grain and any additional features. This is often a very good prediction as common time series patterns like seasonality and trends can be captured in this manner. Such simple regression is horizon-less: it doesn't matter how far into the future we are predicting, because we are not using past data. In the previous example, the horizon was only used to split the data for cross-validation."
] ]
}, },
@@ -652,15 +643,10 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Replace ALL values in y by NaN.\n",
"# The forecast origin will be at the beginning of the first forecast period.\n",
"# (Which is the same time as the end of the last training period.)\n",
"y_query = y_test.copy().astype(np.float)\n",
"y_query.fill(np.nan)\n",
"# The featurized data, aligned to y, will also be returned.\n", "# The featurized data, aligned to y, will also be returned.\n",
"# This contains the assumptions that were made in the forecast\n", "# This contains the assumptions that were made in the forecast\n",
"# and helps align the forecast to the original data\n", "# and helps align the forecast to the original data\n",
"y_predictions, X_trans = fitted_model_lags.forecast(X_test, y_query)" "y_predictions, X_trans = fitted_model_lags.forecast(X_test)"
] ]
}, },
{ {
@@ -730,14 +716,7 @@
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.8" "version": "3.6.8"
}, }
"star_tag": [
"featured"
],
"tags": [
""
],
"task": "Forecasting"
}, },
"nbformat": 4, "nbformat": 4,
"nbformat_minor": 2 "nbformat_minor": 2

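Review note on the split change in the notebook above: `time_before(..., include_boundary=True)` keeps the 05:00 row in the training set, so a test window that also starts at 05:00 would presumably share that row with training. Moving the test start to 06:00 avoids this. The sketch below reproduces the idea with plain `datetime` values and no Azure ML dependency. The hourly grid and the helpers `rows_before` and `rows_between` are assumptions made for illustration; whether `time_between` includes its boundaries by default should be checked against the SDK documentation.

```python
from datetime import datetime, timedelta

# Hypothetical hourly timestamps standing in for the time column.
start = datetime(2017, 8, 8, 0)
timestamps = [start + timedelta(hours=h) for h in range(72)]


def rows_before(rows, end, include_boundary=True):
    """Keep rows earlier than `end`, optionally including `end` itself."""
    return [t for t in rows if t < end or (include_boundary and t == end)]


def rows_between(rows, begin, end, include_boundary=True):
    """Keep rows between `begin` and `end`, with or without the end points."""
    if include_boundary:
        return [t for t in rows if begin <= t <= end]
    return [t for t in rows if begin < t < end]


split = datetime(2017, 8, 8, 5)
test_end = datetime(2017, 8, 10, 5)

train = rows_before(timestamps, split)
# Old test window: starts at the same hour where training ends.
test_old = rows_between(timestamps, split, test_end)
# New test window: starts one hour later.
test_new = rows_between(timestamps, split + timedelta(hours=1), test_end)

overlap_old = set(train) & set(test_old)
overlap_new = set(train) & set(test_new)
print(sorted(overlap_old))  # the 05:00 row is in both sets
print(sorted(overlap_new))  # no shared rows
```

With inclusive boundaries on both sides, the old window leaks one training row into the test set; the shifted window keeps the two sets disjoint.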
View File

@@ -152,7 +152,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# upload data to your default datastore\n", "# upload training and test data to your default datastore\n",
"ds = ws.get_default_datastore()\n", "ds = ws.get_default_datastore()\n",
"ds.upload(src_dir='./data', target_path='groupdata', overwrite=True, show_progress=True)" "ds.upload(src_dir='./data', target_path='groupdata', overwrite=True, show_progress=True)"
] ]
@@ -178,7 +178,7 @@
"\n", "\n",
"#### Create or Attach existing AmlCompute\n", "#### Create or Attach existing AmlCompute\n",
"\n", "\n",
"You will need to create a compute target for your AutoML run. In this tutorial, you create AmlCompute as your training compute resource.\n", "You will need to create a compute target for your automated ML run. In this tutorial, you create AmlCompute as your training compute resource.\n",
"#### Creation of AmlCompute takes approximately 5 minutes. \n", "#### Creation of AmlCompute takes approximately 5 minutes. \n",
"If the AmlCompute with that name is already in your workspace this code will skip the creation process.\n", "If the AmlCompute with that name is already in your workspace this code will skip the creation process.\n",
"As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read this article on the default limits and how to request more quota." "As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read this article on the default limits and how to request more quota."

View File

@@ -8,10 +8,11 @@ from azureml.core import RunConfiguration
from azureml.core.compute import ComputeTarget from azureml.core.compute import ComputeTarget
from azureml.core.conda_dependencies import CondaDependencies from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.dataset import Dataset from azureml.core.dataset import Dataset
from azureml.data import TabularDataset
from azureml.pipeline.core import PipelineData, PipelineParameter, TrainingOutput, StepSequence from azureml.pipeline.core import PipelineData, PipelineParameter, TrainingOutput, StepSequence
from azureml.pipeline.steps import PythonScriptStep from azureml.pipeline.steps import PythonScriptStep
from azureml.train.automl import AutoMLConfig from azureml.train.automl import AutoMLConfig
from azureml.train.automl import AutoMLStep from azureml.train.automl.runtime import AutoMLStep
def _get_groups(data: Dataset, group_column_names: List[str]) -> pd.DataFrame: def _get_groups(data: Dataset, group_column_names: List[str]) -> pd.DataFrame:
@@ -33,9 +34,10 @@ def _get_configs(automlconfig: AutoMLConfig,
group_name = "#####".join(str(x) for x in group.values) group_name = "#####".join(str(x) for x in group.values)
group_name = valid_chars.sub('', group_name) group_name = valid_chars.sub('', group_name)
for key in group.index: for key in group.index:
single = data._dataflow.filter(data._dataflow[key] == group[key]) single = single._dataflow.filter(data._dataflow[key] == group[key])
t_dataset = TabularDataset._create(single)
group_conf = copy.deepcopy(automlconfig) group_conf = copy.deepcopy(automlconfig)
group_conf.user_settings['training_data'] = single group_conf.user_settings['training_data'] = t_dataset
group_conf.user_settings['label_column_name'] = target_column group_conf.user_settings['label_column_name'] = target_column
group_conf.user_settings['compute_target'] = compute_target group_conf.user_settings['compute_target'] = compute_target
configs[group_name] = group_conf configs[group_name] = group_conf
@@ -106,6 +108,13 @@ def build_pipeline_steps(automlconfig: AutoMLConfig,
final_steps = steps final_steps = steps
if deploy: if deploy:
# modify the conda dependencies to ensure we pick up correct
# versions of azureml-defaults and azureml-train-automl
cd = CondaDependencies.create(pip_packages=['azureml-defaults', 'azureml-train-automl'])
automl_deps = CondaDependencies(conda_dependencies_file_path='deploy/myenv.yml')
cd._merge_dependencies(automl_deps)
cd.save('deploy/myenv.yml')
# add deployment step # add deployment step
pp_group_column_names = PipelineParameter( pp_group_column_names = PipelineParameter(
"group_column_names", "group_column_names",

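Review note on `_get_configs` above: each group gets `copy.deepcopy(automlconfig)` before its `user_settings` are overwritten. A minimal sketch of why the copy matters, using a hypothetical `FakeConfig` stand-in rather than the real `AutoMLConfig` class (the group names are made up as well):

```python
import copy


class FakeConfig:
    """Hypothetical stand-in for a config object with a mutable settings dict."""

    def __init__(self):
        self.user_settings = {"task": "forecasting"}


groups = ["store_1", "store_2"]

# Without a copy, every entry is the same object and the last write wins.
base = FakeConfig()
shared = {}
for name in groups:
    conf = base
    conf.user_settings["training_data"] = name
    shared[name] = conf

# With deepcopy, each group gets its own independent settings.
base = FakeConfig()
separate = {}
for name in groups:
    conf = copy.deepcopy(base)
    conf.user_settings["training_data"] = name
    separate[name] = conf

print(shared["store_1"].user_settings["training_data"])    # store_2
print(separate["store_1"].user_settings["training_data"])  # store_1
```

The deep copy also leaves the base configuration untouched, so it can be reused for further groups.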
View File

@@ -1,10 +1,11 @@
import argparse import argparse
from azureml.core import Run, Model
from azureml.core import Workspace
from azureml.core.webservice import AciWebservice
from azureml.core.model import InferenceConfig
import json import json
from azureml.core import Run, Model, Workspace
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.model import InferenceConfig
from azureml.core.webservice import AciWebservice
script_file_name = 'score.py' script_file_name = 'score.py'
conda_env_file_name = 'myenv.yml' conda_env_file_name = 'myenv.yml'

View File

@@ -1,15 +1,11 @@
name: project_environment name: automl_grouping_env
dependencies: dependencies:
# The python interpreter version. # The python interpreter version.
# Currently Azure ML only supports 3.5.2 and later. # Currently Azure ML only supports 3.5.2 and later.
- python=3.6.2 - python=3.6.2
- numpy>=1.16.0,<=1.16.2
- pip: - scikit-learn>=0.19.0,<=0.20.3
- azureml-defaults
- azureml-train-automl
- numpy
- scikit-learn
- conda-forge::fbprophet==0.5 - conda-forge::fbprophet==0.5

View File

@@ -44,7 +44,7 @@ def run(raw_data):
model_path = Model.get_model_path(cur_group) model_path = Model.get_model_path(cur_group)
model = joblib.load(model_path) model = joblib.load(model_path)
models[cur_group] = model models[cur_group] = model
_, xtrans = models[cur_group].forecast(df_one, np.repeat(np.nan, len(df_one))) _, xtrans = models[cur_group].forecast(df_one)
dfs.append(xtrans) dfs.append(xtrans)
df_ret = pd.concat(dfs) df_ret = pd.concat(dfs)
df_ret.reset_index(drop=False, inplace=True) df_ret.reset_index(drop=False, inplace=True)

View File

@@ -377,9 +377,7 @@
"\n", "\n",
"![Forecasting after training](forecast_function_at_train.png)\n", "![Forecasting after training](forecast_function_at_train.png)\n",
"\n", "\n",
"The `X_test` and `y_query` below, taken together, form the **forecast request**. The two are interpreted as aligned - `y_query` could actally be a column in `X_test`. `NaN`s in `y_query` are the question marks. These will be filled with the forecasts.\n", "We use `X_test` as a **forecast request** to generate the predictions."
"\n",
"When the forecast period immediately follows the training period, the models retain the last few points of data. You can simply fill `y_query` filled with question marks - the model has the data for the lookback already.\n"
] ]
}, },
{ {
@@ -408,8 +406,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"y_query = np.repeat(np.NaN, X_test.shape[0])\n", "y_pred_no_gap, xy_nogap = fitted_model.forecast(X_test)\n",
"y_pred_no_gap, xy_nogap = fitted_model.forecast(X_test, y_query)\n",
"\n", "\n",
"# xy_nogap contains the predictions in the _automl_target_col column.\n", "# xy_nogap contains the predictions in the _automl_target_col column.\n",
"# Those same numbers are output in y_pred_no_gap\n", "# Those same numbers are output in y_pred_no_gap\n",
@@ -437,7 +434,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"quantiles = fitted_model.forecast_quantiles(X_test, y_query)\n", "quantiles = fitted_model.forecast_quantiles(X_test)\n",
"quantiles" "quantiles"
] ]
}, },
@@ -448,7 +445,7 @@
"#### Distribution forecasts\n", "#### Distribution forecasts\n",
"\n", "\n",
"Often the figure of interest is not just the point prediction, but the prediction at some quantile of the distribution. \n", "Often the figure of interest is not just the point prediction, but the prediction at some quantile of the distribution. \n",
"This arises when the forecast is used to control some kind of inventory, for example of grocery items of virtual machines for a cloud service. In such case, the control point is usually something like \"we want the item to be in stock and not run out 99% of the time\". This is called a \"service level\". Here is how you get quantile forecasts." "This arises when the forecast is used to control some kind of inventory, for example of grocery items or virtual machines for a cloud service. In such case, the control point is usually something like \"we want the item to be in stock and not run out 99% of the time\". This is called a \"service level\". Here is how you get quantile forecasts."
] ]
}, },
{ {
@@ -460,10 +457,10 @@
"# specify which quantiles you would like \n", "# specify which quantiles you would like \n",
"fitted_model.quantiles = [0.01, 0.5, 0.95]\n", "fitted_model.quantiles = [0.01, 0.5, 0.95]\n",
"# use forecast_quantiles function, not the forecast() one\n", "# use forecast_quantiles function, not the forecast() one\n",
"y_pred_quantiles = fitted_model.forecast_quantiles(X_test, y_query)\n", "y_pred_quantiles = fitted_model.forecast_quantiles(X_test)\n",
"\n", "\n",
"# it all nicely aligns column-wise\n", "# it all nicely aligns column-wise\n",
"pd.concat([X_test.reset_index(), pd.DataFrame({'query' : y_query}), y_pred_quantiles], axis=1)" "pd.concat([X_test.reset_index(), y_pred_quantiles], axis=1)"
] ]
}, },
{ {
@@ -472,7 +469,7 @@
"source": [ "source": [
"#### Destination-date forecast: \"just do something\"\n", "#### Destination-date forecast: \"just do something\"\n",
"\n", "\n",
"In some scenarios, the X_test is not known. The forecast is likely to be weak, becaus it is missing contemporaneous predictors, which we will need to impute. If you still wish to predict forward under the assumption that the last known values will be carried forward, you can forecast out to \"destination date\". The destination date still needs to fit within the maximum horizon from training." "In some scenarios, the X_test is not known. The forecast is likely to be weak, because it is missing contemporaneous predictors, which we will need to impute. If you still wish to predict forward under the assumption that the last known values will be carried forward, you can forecast out to \"destination date\". The destination date still needs to fit within the maximum horizon from training."
] ]
}, },
{ {
@@ -539,9 +536,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"try: \n", "try: \n",
" y_query = y_away.copy()\n", " y_pred_away, xy_away = fitted_model.forecast(X_away)\n",
" y_query.fill(np.NaN)\n",
" y_pred_away, xy_away = fitted_model.forecast(X_away, y_query)\n",
" xy_away\n", " xy_away\n",
"except Exception as e:\n", "except Exception as e:\n",
" print(e)" " print(e)"
@@ -551,7 +546,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"How should we read that eror message? The forecast origin is at the last time themodel saw an actual values of `y` (the target). That was at the end of the training data! Because the model received all `NaN` (and not an actual target value), it is attempting to forecast from the end of training data. But the requested forecast periods are past the maximum horizon. We need to provide a define `y` value to establish the forecast origin.\n", "How should we read that eror message? The forecast origin is at the last time the model saw an actual value of `y` (the target). That was at the end of the training data! The model is attempting to forecast from the end of training data. But the requested forecast periods are past the maximum horizon. We need to provide a define `y` value to establish the forecast origin.\n",
"\n", "\n",
"We will use this helper function to take the required amount of context from the data preceding the testing data. It's definition is intentionally simplified to keep the idea in the clear." "We will use this helper function to take the required amount of context from the data preceding the testing data. It's definition is intentionally simplified to keep the idea in the clear."
] ]
@@ -711,7 +706,7 @@
], ],
"category": "tutorial", "category": "tutorial",
"compute": [ "compute": [
"remote" "Remote"
], ],
"datasets": [ "datasets": [
"None" "None"
@@ -740,13 +735,13 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.7" "version": "3.6.8"
}, },
"tags": [ "tags": [
"Forecasting", "Forecasting",
"Confidence Intervals" "Confidence Intervals"
], ],
"task": "forecasting" "task": "Forecasting"
}, },
"nbformat": 4, "nbformat": 4,
"nbformat_minor": 2 "nbformat_minor": 2

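Review note on the "Distribution forecasts" cell above: the service-level idea can be illustrated without the SDK. The sketch below assumes a normally distributed demand forecast for a single period; the mean and standard deviation are made up, and this is not a description of how `forecast_quantiles` computes its output. Only the quantile levels are taken from the notebook.

```python
from statistics import NormalDist

# Hypothetical forecast distribution for one period: mean demand 100, std 15.
demand = NormalDist(mu=100.0, sigma=15.0)

# Same quantile levels as in the notebook cell.
levels = [0.01, 0.5, 0.95]
quantile_forecasts = {q: demand.inv_cdf(q) for q in levels}

# Stocking to the 0.95 quantile means demand is expected to stay at or
# below the stock level in 95% of periods.
stock_level = quantile_forecasts[0.95]
service_level = demand.cdf(stock_level)

for q, value in quantile_forecasts.items():
    print(f"quantile {q}: {value:.1f}")
print(f"service level at that stock: {service_level:.2f}")
```

A 99% service level would use the 0.99 quantile in the same way.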
View File

@@ -40,7 +40,7 @@
"## Introduction\n", "## Introduction\n",
"In this example, we use AutoML to train, select, and operationalize a time-series forecasting model for multiple time-series.\n", "In this example, we use AutoML to train, select, and operationalize a time-series forecasting model for multiple time-series.\n",
"\n", "\n",
"Make sure you have executed the [configuration notebook](../configuration.ipynb) before running this notebook.\n", "Make sure you have executed the [configuration notebook](../../../configuration.ipynb) before running this notebook.\n",
"\n", "\n",
"The examples in the follow code samples use the University of Chicago's Dominick's Finer Foods dataset to forecast orange juice sales. Dominick's was a grocery chain in the Chicago metropolitan area." "The examples in the follow code samples use the University of Chicago's Dominick's Finer Foods dataset to forecast orange juice sales. Dominick's was a grocery chain in the Chicago metropolitan area."
] ]
@@ -325,9 +325,9 @@
"\n", "\n",
"For forecasting tasks, there are some additional parameters that can be set: the name of the column holding the date/time, the grain column names, and the maximum forecast horizon. A time column is required for forecasting, while the grain is optional. If a grain is not given, AutoML assumes that the whole dataset is a single time-series. We also pass a list of columns to drop prior to modeling. The _logQuantity_ column is completely correlated with the target quantity, so it must be removed to prevent a target leak.\n", "For forecasting tasks, there are some additional parameters that can be set: the name of the column holding the date/time, the grain column names, and the maximum forecast horizon. A time column is required for forecasting, while the grain is optional. If a grain is not given, AutoML assumes that the whole dataset is a single time-series. We also pass a list of columns to drop prior to modeling. The _logQuantity_ column is completely correlated with the target quantity, so it must be removed to prevent a target leak.\n",
"\n", "\n",
"The forecast horizon is given in units of the time-series frequency; for instance, the OJ series frequency is weekly, so a horizon of 20 means that a trained model will estimate sales up-to 20 weeks beyond the latest date in the training data for each series. In this example, we set the maximum horizon to the number of samples per series in the test set (n_test_periods). Generally, the value of this parameter will be dictated by business needs. For example, a demand planning organizaion that needs to estimate the next month of sales would set the horizon accordingly. Please see the [energy_demand notebook](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand) for more discussion of forecast horizon.\n", "The forecast horizon is given in units of the time-series frequency; for instance, the OJ series frequency is weekly, so a horizon of 20 means that a trained model will estimate sales up to 20 weeks beyond the latest date in the training data for each series. In this example, we set the maximum horizon to the number of samples per series in the test set (n_test_periods). Generally, the value of this parameter will be dictated by business needs. For example, a demand planning organizaion that needs to estimate the next month of sales would set the horizon accordingly. Please see the [energy_demand notebook](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand) for more discussion of forecast horizon.\n",
"\n", "\n",
"Finally, a note about the cross-validation (CV) procedure for time-series data. AutoML uses out-of-sample error estimates to select a best pipeline/model, so it is important that the CV fold splitting is done correctly. Time-series can violate the basic statistical assumptions of the canonical K-Fold CV strategy, so AutoML implements a [rolling origin validation](https://robjhyndman.com/hyndsight/tscv/) procedure to create CV folds for time-series data. To use this procedure, you just need to specify the desired number of CV folds in the AutoMLConfig object. It is also possible to bypass CV and use your own validation set by setting the *X_valid* and *y_valid* parameters of AutoMLConfig.\n", "Finally, a note about the cross-validation (CV) procedure for time-series data. AutoML uses out-of-sample error estimates to select a best pipeline/model, so it is important that the CV fold splitting is done correctly. Time-series can violate the basic statistical assumptions of the canonical K-Fold CV strategy, so AutoML implements a [rolling origin validation](https://robjhyndman.com/hyndsight/tscv/) procedure to create CV folds for time-series data. To use this procedure, you just need to specify the desired number of CV folds in the AutoMLConfig object. It is also possible to bypass CV and use your own validation set by setting the *validation_data* parameter of AutoMLConfig.\n",
"\n", "\n",
"Here is a summary of AutoMLConfig parameters used for training the OJ model:\n", "Here is a summary of AutoMLConfig parameters used for training the OJ model:\n",
"\n", "\n",
@@ -370,7 +370,7 @@
" training_data=train_dataset,\n", " training_data=train_dataset,\n",
" label_column_name=target_column_name,\n", " label_column_name=target_column_name,\n",
" compute_target=compute_target,\n", " compute_target=compute_target,\n",
" enable_early_stopping = True,\n", " enable_early_stopping=True,\n",
" n_cross_validations=3,\n", " n_cross_validations=3,\n",
" verbosity=logging.INFO,\n", " verbosity=logging.INFO,\n",
" **time_series_settings)" " **time_series_settings)"
@@ -454,9 +454,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"To produce predictions on the test set, we need to know the feature values at all dates in the test set. This requirement is somewhat reasonable for the OJ sales data since the features mainly consist of price, which is usually set in advance, and customer demographics which are approximately constant for each store over the 20 week forecast horizon in the testing data. \n", "To produce predictions on the test set, we need to know the feature values at all dates in the test set. This requirement is somewhat reasonable for the OJ sales data since the features mainly consist of price, which is usually set in advance, and customer demographics which are approximately constant for each store over the 20 week forecast horizon in the testing data."
"\n",
"We will first create a query `y_query`, which is aligned index-for-index to `X_test`. This is a vector of target values where each `NaN` serves the function of the question mark to be replaced by forecast. Passing definite values in the `y` argument allows the `forecast` function to make predictions on data that does not immediately follow the train data which contains `y`. In each grain, the last time point where the model sees a definite value of `y` is that grain's _forecast origin_."
] ]
}, },
{ {
@@ -465,15 +463,10 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Replace ALL values in y by NaN.\n",
"# The forecast origin will be at the beginning of the first forecast period.\n",
"# (Which is the same time as the end of the last training period.)\n",
"y_query = y_test.copy().astype(np.float)\n",
"y_query.fill(np.nan)\n",
"# The featurized data, aligned to y, will also be returned.\n", "# The featurized data, aligned to y, will also be returned.\n",
"# This contains the assumptions that were made in the forecast\n", "# This contains the assumptions that were made in the forecast\n",
"# and helps align the forecast to the original data\n", "# and helps align the forecast to the original data\n",
"y_predictions, X_trans = fitted_model.forecast(X_test, y_query)" "y_predictions, X_trans = fitted_model.forecast(X_test)"
] ]
}, },
{ {
@@ -640,7 +633,7 @@
"import json\n", "import json\n",
"# The request data frame needs to have y_query column which corresponds to query.\n", "# The request data frame needs to have y_query column which corresponds to query.\n",
"X_query = X_test.copy()\n", "X_query = X_test.copy()\n",
"X_query['y_query'] = y_query\n", "X_query['y_query'] = np.NaN\n",
"# We have to convert datetime to string, because Timestamps cannot be serialized to JSON.\n", "# We have to convert datetime to string, because Timestamps cannot be serialized to JSON.\n",
"X_query[time_column_name] = X_query[time_column_name].astype(str)\n", "X_query[time_column_name] = X_query[time_column_name].astype(str)\n",
"# The Service object accept the complex dictionary, which is internally converted to JSON string.\n", "# The Service object accept the complex dictionary, which is internally converted to JSON string.\n",
@@ -693,7 +686,7 @@
"category": "tutorial", "category": "tutorial",
"celltoolbar": "Raw Cell Format", "celltoolbar": "Raw Cell Format",
"compute": [ "compute": [
"remote" "Remote"
], ],
"datasets": [ "datasets": [
"Orange Juice Sales" "Orange Juice Sales"
@@ -722,8 +715,11 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.7" "version": "3.6.8"
}, },
"tags": [
"None"
],
"task": "Forecasting" "task": "Forecasting"
}, },
"nbformat": 4, "nbformat": 4,

View File

@@ -634,7 +634,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.train.automl.automl_explain_utilities import AutoMLExplainerSetupClass, automl_setup_model_explanations\n", "from azureml.train.automl.runtime.automl_explain_utilities import AutoMLExplainerSetupClass, automl_setup_model_explanations\n",
"explainer_setup_class = automl_setup_model_explanations(fitted_model, 'regression', X_test=X_test)" "explainer_setup_class = automl_setup_model_explanations(fitted_model, 'regression', X_test=X_test)"
] ]
}, },
@@ -653,11 +653,11 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.explain.model._internal.explanation_client import ExplanationClient\n", "from azureml.explain.model._internal.explanation_client import ExplanationClient\n",
"from azureml.contrib.interpret.visualize import ExplanationDashboard\n", "from interpret_community.widget import ExplanationDashboard\n",
"client = ExplanationClient.from_run(automl_run)\n", "client = ExplanationClient.from_run(automl_run)\n",
"engineered_explanations = client.download_model_explanation(raw=False)\n", "engineered_explanations = client.download_model_explanation(raw=False)\n",
"print(engineered_explanations.get_feature_importance_dict())\n", "print(engineered_explanations.get_feature_importance_dict())\n",
"ExplanationDashboard(engineered_explanations, explainer_setup_class.automl_estimator, explainer_setup_class.X_test_transform)" "ExplanationDashboard(engineered_explanations, explainer_setup_class.automl_estimator, datasetX=explainer_setup_class.X_test_transform)"
] ]
}, },
{ {
@@ -676,7 +676,7 @@
"source": [ "source": [
"raw_explanations = client.download_model_explanation(raw=True)\n", "raw_explanations = client.download_model_explanation(raw=True)\n",
"print(raw_explanations.get_feature_importance_dict())\n", "print(raw_explanations.get_feature_importance_dict())\n",
"ExplanationDashboard(raw_explanations, explainer_setup_class.automl_pipeline, explainer_setup_class.X_test_raw)" "ExplanationDashboard(raw_explanations, explainer_setup_class.automl_pipeline, datasetX=explainer_setup_class.X_test_raw)"
] ]
}, },
{ {

View File

@@ -5,7 +5,8 @@ import os
import pickle import pickle
import azureml.train.automl import azureml.train.automl
import azureml.explain.model import azureml.explain.model
from azureml.train.automl.automl_explain_utilities import AutoMLExplainerSetupClass, automl_setup_model_explanations from azureml.train.automl.runtime.automl_explain_utilities import AutoMLExplainerSetupClass, \
automl_setup_model_explanations
from sklearn.externals import joblib from sklearn.externals import joblib
from azureml.core.model import Model from azureml.core.model import Model

View File

@@ -6,7 +6,8 @@ from azureml.core.run import Run
from azureml.core.experiment import Experiment from azureml.core.experiment import Experiment
from sklearn.externals import joblib from sklearn.externals import joblib
from azureml.core.dataset import Dataset from azureml.core.dataset import Dataset
from azureml.train.automl.automl_explain_utilities import AutoMLExplainerSetupClass, automl_setup_model_explanations from azureml.train.automl.runtime.automl_explain_utilities import AutoMLExplainerSetupClass, \
automl_setup_model_explanations
from azureml.explain.model.mimic.models.lightgbm_model import LGBMExplainableModel from azureml.explain.model.mimic.models.lightgbm_model import LGBMExplainableModel
from azureml.explain.model.mimic_wrapper import MimicWrapper from azureml.explain.model.mimic_wrapper import MimicWrapper
from automl.client.core.common.constants import MODEL_PATH from automl.client.core.common.constants import MODEL_PATH

View File

@@ -140,6 +140,9 @@
"framework": [ "framework": [
"Azure ML AutoML" "Azure ML AutoML"
], ],
"tags": [
""
],
"friendly_name": "Forecasting with automated ML SQL integration", "friendly_name": "Forecasting with automated ML SQL integration",
"index_order": 1, "index_order": 1,
"kernelspec": { "kernelspec": {
@@ -151,9 +154,6 @@
"name": "sql", "name": "sql",
"version": "" "version": ""
}, },
"tags": [
""
],
"task": "Forecasting" "task": "Forecasting"
}, },
"nbformat": 4, "nbformat": 4,

View File

@@ -560,6 +560,9 @@
"framework": [ "framework": [
"Azure ML AutoML" "Azure ML AutoML"
], ],
"tags": [
""
],
"friendly_name": "Setup automated ML SQL integration", "friendly_name": "Setup automated ML SQL integration",
"index_order": 1, "index_order": 1,
"kernelspec": { "kernelspec": {
@@ -571,9 +574,6 @@
"name": "sql", "name": "sql",
"version": "" "version": ""
}, },
"tags": [
""
],
"task": "None" "task": "None"
}, },
"nbformat": 4, "nbformat": 4,

View File

@@ -175,6 +175,7 @@
"source": [ "source": [
"#deploy to ACI\n", "#deploy to ACI\n",
"from azureml.core.webservice import AciWebservice, Webservice\n", "from azureml.core.webservice import AciWebservice, Webservice\n",
"from azureml.exceptions import WebserviceException\n",
"from azureml.core.model import InferenceConfig\n", "from azureml.core.model import InferenceConfig\n",
"\n", "\n",
"myaci_config = AciWebservice.deploy_configuration(cpu_cores = 2, \n", "myaci_config = AciWebservice.deploy_configuration(cpu_cores = 2, \n",
@@ -182,11 +183,19 @@
" tags = {'name':'Databricks Azure ML ACI'}, \n", " tags = {'name':'Databricks Azure ML ACI'}, \n",
" description = 'This is for ADB and AML example.')\n", " description = 'This is for ADB and AML example.')\n",
"\n", "\n",
"service_name = 'aciws'\n",
"\n",
"# Remove any existing service under the same name.\n",
"try:\n",
" Webservice(ws, service_name).delete()\n",
"except WebserviceException:\n",
" pass\n",
"\n",
"inference_config = InferenceConfig(runtime= 'spark-py', \n", "inference_config = InferenceConfig(runtime= 'spark-py', \n",
" entry_script='score_sparkml.py',\n", " entry_script='score_sparkml.py',\n",
" conda_file='mydeployenv.yml')\n", " conda_file='mydeployenv.yml')\n",
"\n", "\n",
"myservice = Model.deploy(ws, 'aciws', [mymodel], inference_config, myaci_config)\n", "myservice = Model.deploy(ws, service_name, [mymodel], inference_config, myaci_config)\n",
"myservice.wait_for_deployment(show_output=True)" "myservice.wait_for_deployment(show_output=True)"
] ]
}, },
@@ -199,18 +208,6 @@
"help(Webservice)" "help(Webservice)"
] ]
}, },
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# List images by ws\n",
"\n",
"for i in ContainerImage.list(workspace = ws):\n",
" print('{}(v.{} [{}]) stored at {} with build log {}'.format(i.name, i.version, i.creation_state, i.image_location, i.image_build_log_uri))"
]
},
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,

View File

@@ -163,14 +163,19 @@
"#it may take 20-25 minutes to create a new cluster\n", "#it may take 20-25 minutes to create a new cluster\n",
"\n", "\n",
"from azureml.core.compute import AksCompute, ComputeTarget\n", "from azureml.core.compute import AksCompute, ComputeTarget\n",
"\n", "from azureml.core.compute_target import ComputeTargetException\n",
"# Use the default configuration (can also provide parameters to customize)\n",
"prov_config = AksCompute.provisioning_configuration()\n",
"\n", "\n",
"aks_name = 'ps-aks-demo2' \n", "aks_name = 'ps-aks-demo2' \n",
"\n", "\n",
"# Create the cluster\n", "try:\n",
"aks_target = ComputeTarget.create(workspace = ws, \n", " aks_target = ComputeTarget(workspace=ws, name=aks_name)\n",
" print('Found existing cluster, use it.')\n",
"except ComputeTargetException:\n",
" # Use the default configuration (can also provide parameters to customize)\n",
" prov_config = AksCompute.provisioning_configuration()\n",
" \n",
" # Create the cluster\n",
" aks_target = ComputeTarget.create(workspace = ws, \n",
" name = aks_name, \n", " name = aks_name, \n",
" provisioning_configuration = prov_config)\n", " provisioning_configuration = prov_config)\n",
"\n", "\n",
@@ -188,15 +193,24 @@
"source": [ "source": [
"#deploy to AKS\n", "#deploy to AKS\n",
"from azureml.core.webservice import AksWebservice, Webservice\n", "from azureml.core.webservice import AksWebservice, Webservice\n",
"from azureml.exceptions import WebserviceException\n",
"from azureml.core.model import InferenceConfig\n", "from azureml.core.model import InferenceConfig\n",
"\n", "\n",
"aks_config = AksWebservice.deploy_configuration(enable_app_insights=True)\n", "aks_config = AksWebservice.deploy_configuration(enable_app_insights=True)\n",
"\n", "\n",
"service_name = 'ps-aks-service'\n",
"\n",
"# Remove any existing service under the same name.\n",
"try:\n",
" Webservice(ws, service_name).delete()\n",
"except WebserviceException:\n",
" pass\n",
"\n",
"inference_config = InferenceConfig(runtime = 'spark-py', \n", "inference_config = InferenceConfig(runtime = 'spark-py', \n",
" entry_script ='score_sparkml.py',\n", " entry_script ='score_sparkml.py',\n",
" conda_file ='mydeployenv.yml')\n", " conda_file ='mydeployenv.yml')\n",
"\n", "\n",
"aks_service = Model.deploy(ws, 'ps-aks-service', [mymodel], inference_config, aks_config, aks_target)\n", "aks_service = Model.deploy(ws, service_name, [mymodel], inference_config, aks_config, aks_target)\n",
"aks_service.wait_for_deployment(show_output=True)" "aks_service.wait_for_deployment(show_output=True)"
] ]
}, },
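Review note: the two hunks above make the cell safe to re-run in two ways. The cluster is looked up first and created only if the lookup fails, and any service with the same name is deleted before deploying. The sketch below shows both patterns in isolation. It does not call the Azure ML SDK: `Registry`, `NotFoundError`, `get_or_create`, and `replace` are hypothetical stand-ins for the workspace, the SDK exceptions, and the two try/except blocks.

```python
class NotFoundError(Exception):
    """Hypothetical stand-in for ComputeTargetException / WebserviceException."""


class Registry:
    """Hypothetical in-memory stand-in for a workspace's named resources."""

    def __init__(self):
        self._items = {}

    def get(self, name):
        if name not in self._items:
            raise NotFoundError(name)
        return self._items[name]

    def create(self, name, value):
        if name in self._items:
            raise ValueError("{} already exists".format(name))
        self._items[name] = value
        return value

    def delete(self, name):
        if name not in self._items:
            raise NotFoundError(name)
        del self._items[name]


def get_or_create(registry, name, factory):
    """Reuse an existing resource; build one only when the lookup fails."""
    try:
        return registry.get(name)
    except NotFoundError:
        return registry.create(name, factory())


def replace(registry, name, value):
    """Delete any resource under the same name, then create a fresh one."""
    try:
        registry.delete(name)
    except NotFoundError:
        pass  # nothing to remove on the first run
    return registry.create(name, value)
```

Catching only the "not found" exception, as the notebook does, lets other failures (for example, authentication errors) surface instead of being silently ignored.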
@@ -288,7 +302,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.6" "version": "3.6.8"
}, },
"name": "deploy-to-aks-existingimage-05", "name": "deploy-to-aks-existingimage-05",
"notebookId": 1030695628045968 "notebookId": 1030695628045968

View File

@@ -661,6 +661,7 @@
"# this will take 10-15 minutes to finish\n", "# this will take 10-15 minutes to finish\n",
"\n", "\n",
"from azureml.core.webservice import AciWebservice, Webservice\n", "from azureml.core.webservice import AciWebservice, Webservice\n",
"from azureml.exceptions import WebserviceException\n",
"from azureml.core.model import InferenceConfig\n", "from azureml.core.model import InferenceConfig\n",
"from azureml.core.model import Model\n", "from azureml.core.model import Model\n",
"import uuid\n", "import uuid\n",
@@ -677,6 +678,13 @@
"\n", "\n",
"guid = str(uuid.uuid4()).split(\"-\")[0]\n", "guid = str(uuid.uuid4()).split(\"-\")[0]\n",
"service_name = \"myservice-{}\".format(guid)\n", "service_name = \"myservice-{}\".format(guid)\n",
"\n",
"# Remove any existing service under the same name.\n",
"try:\n",
" Webservice(ws, service_name).delete()\n",
"except WebserviceException:\n",
" pass\n",
"\n",
"print(\"Creating service with name: {}\".format(service_name))\n", "print(\"Creating service with name: {}\".format(service_name))\n",
"\n", "\n",
"myservice = Model.deploy(ws, service_name, [model], inference_config, myaci_config)\n", "myservice = Model.deploy(ws, service_name, [model], inference_config, myaci_config)\n",
@@ -795,7 +803,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.5" "version": "3.6.8"
}, },
"name": "auto-ml-classification-local-adb", "name": "auto-ml-classification-local-adb",
"notebookId": 2733885892129020 "notebookId": 2733885892129020

View File

@@ -116,7 +116,8 @@
"execution_count": null, "execution_count": null,
"metadata": { "metadata": {
"tags": [ "tags": [
"register model from file" "register model from file",
"sample-model-register"
] ]
}, },
"outputs": [], "outputs": [],
@@ -404,7 +405,7 @@
"\n", "\n",
" - To run a production-ready web service, see the [notebook on deployment to Azure Kubernetes Service](../production-deploy-to-aks/production-deploy-to-aks.ipynb).\n", " - To run a production-ready web service, see the [notebook on deployment to Azure Kubernetes Service](../production-deploy-to-aks/production-deploy-to-aks.ipynb).\n",
" - To run a local web service, see the [notebook on deployment to a local Docker container](../deploy-to-local/register-model-deploy-local.ipynb).\n", " - To run a local web service, see the [notebook on deployment to a local Docker container](../deploy-to-local/register-model-deploy-local.ipynb).\n",
" - For more information on datasets, see the [notebook on training with datasets](../../work-with-data/datasets-tutorial/train-with-datasets.ipynb).\n", " - For more information on datasets, see the [notebook on training with datasets](../../work-with-data/datasets-tutorial/train-with-datasets/train-with-datasets.ipynb).\n",
" - For more information on environments, see the [notebook on using environments](../../training/using-environments/using-environments.ipynb).\n", " - For more information on environments, see the [notebook on using environments](../../training/using-environments/using-environments.ipynb).\n",
" - For information on all the available deployment targets, see [&ldquo;How and where to deploy models&rdquo;](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where#choose-a-compute-target)." " - For information on all the available deployment targets, see [&ldquo;How and where to deploy models&rdquo;](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where#choose-a-compute-target)."
] ]

View File

@@ -96,7 +96,8 @@
"execution_count": null, "execution_count": null,
"metadata": { "metadata": {
"tags": [ "tags": [
"register model from file" "register model from file",
"sample-model-register"
] ]
}, },
"outputs": [], "outputs": [],

View File

@@ -345,9 +345,11 @@
], ],
"category": "tutorial", "category": "tutorial",
"compute": [ "compute": [
"local" "Local"
],
"datasets": [
"None"
], ],
"datasets": [],
"deployment": [ "deployment": [
"Local" "Local"
], ],

View File

@@ -0,0 +1,369 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Deploy models to Azure Kubernetes Service (AKS) using controlled roll out\n",
"This notebook will show you how to deploy multiple AKS webservices with the same scoring endpoint and how to roll out your models in a controlled manner by configuring the percentage of scoring traffic going to each webservice. If you are using a Notebook VM, you are all set. Otherwise, go through the [configuration notebook](../../../configuration.ipynb) to install the Azure Machine Learning Python SDK and create an Azure ML Workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check for latest version\n",
"import azureml.core\n",
"print(azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize workspace\n",
"Create a [Workspace](https://docs.microsoft.com/python/api/azureml-core/azureml.core.workspace%28class%29?view=azure-ml-py) object from your persisted configuration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.workspace import Workspace\n",
"\n",
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Register the model\n",
"Register a file or folder as a model by calling [Model.register()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#register-workspace--model-path--model-name--tags-none--properties-none--description-none--datasets-none--model-framework-none--model-framework-version-none--child-paths-none-).\n",
"In addition to the content of the model file itself, your registered model will also store model metadata -- model description, tags, and framework information -- that will be useful when managing and deploying models in your workspace. Using tags, for instance, you can categorize your models and apply filters when listing models in your workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Model\n",
"\n",
"model = Model.register(workspace=ws,\n",
" model_name='sklearn_regression_model.pkl', # Name of the registered model in your workspace.\n",
" model_path='./sklearn_regression_model.pkl', # Local file to upload and register as a model.\n",
" model_framework=Model.Framework.SCIKITLEARN, # Framework used to create the model.\n",
" model_framework_version='0.19.1', # Version of scikit-learn used to create the model.\n",
" description='Ridge regression model to predict diabetes progression.',\n",
" tags={'area': 'diabetes', 'type': 'regression'})\n",
"\n",
"print('Name:', model.name)\n",
"print('Version:', model.version)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Register an environment (for all models)\n",
"\n",
"If you want control over how your model is run, or if it has special runtime requirements, you can specify your own environment and scoring method.\n",
"\n",
"Specify the model's runtime environment by creating an [Environment](https://docs.microsoft.com/python/api/azureml-core/azureml.core.environment%28class%29?view=azure-ml-py) object and providing the [CondaDependencies](https://docs.microsoft.com/python/api/azureml-core/azureml.core.conda_dependencies.condadependencies?view=azure-ml-py) needed by your model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Environment\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"\n",
"environment=Environment('my-sklearn-environment')\n",
"environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[\n",
" 'azureml-defaults',\n",
" 'inference-schema[numpy-support]',\n",
" 'joblib',\n",
" 'numpy',\n",
" 'scikit-learn'\n",
"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When using a custom environment, you must also provide Python code for initializing and running your model. An example script is included with this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open('score.py') as f:\n",
" print(f.read())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the InferenceConfig\n",
"Create the inference configuration to reference your environment and entry script during deployment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.model import InferenceConfig\n",
"\n",
"inference_config = InferenceConfig(entry_script='score.py', \n",
" source_directory='.',\n",
" environment=environment)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Provision the AKS Cluster\n",
"If you already have an AKS cluster attached to this workspace, skip the step below and provide the name of the cluster."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.compute import AksCompute\n",
"from azureml.core.compute import ComputeTarget\n",
"# Use the default configuration (can also provide parameters to customize)\n",
"prov_config = AksCompute.provisioning_configuration()\n",
"\n",
"aks_name = 'my-aks' \n",
"# Create the cluster\n",
"aks_target = ComputeTarget.create(workspace = ws, \n",
" name = aks_name, \n",
" provisioning_configuration = prov_config) \n",
"aks_target.wait_for_completion(show_output=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create an Endpoint and add a version (AKS service)\n",
"This creates a new endpoint and adds a version behind it. By default the first version added is the default version. You can specify the traffic percentile a version takes behind an endpoint. \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# deploy the model and create a new endpoint\n",
"from azureml.core.webservice import AksEndpoint\n",
"# from azureml.core.compute import ComputeTarget\n",
"\n",
"# select a previously created compute target\n",
"compute = ComputeTarget(ws, 'my-aks')\n",
"namespace_name=\"endpointnamespace\"\n",
"# define the endpoint name\n",
"endpoint_name = \"myendpoint1\"\n",
"# define the version name\n",
"version_name= \"versiona\"\n",
"\n",
"endpoint_deployment_config = AksEndpoint.deploy_configuration(tags = {'modelVersion':'firstversion', 'department':'finance'}, \n",
" description = \"my first version\", namespace = namespace_name, \n",
" version_name = version_name, traffic_percentile = 40)\n",
"\n",
"endpoint = Model.deploy(ws, endpoint_name, [model], inference_config, endpoint_deployment_config, compute)\n",
"endpoint.wait_for_deployment(True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"endpoint.get_logs()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Add another version of the service to an existing endpoint\n",
"This adds another version behind an existing endpoint. You can specify the traffic percentile the new version takes. If no traffic_percentile is specified, it defaults to 0. Any traffic percentile left unallocated across all versions (50 in this example) goes to the default version."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Adding a new version to an existing Endpoint.\n",
"version_name_add=\"versionb\" \n",
"\n",
"endpoint.create_version(version_name = version_name_add, inference_config=inference_config, models=[model], tags = {'modelVersion':'secondversion', 'department':'finance'}, \n",
" description = \"my second version\", traffic_percentile = 10)\n",
"endpoint.wait_for_deployment(True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Update an existing version in an endpoint\n",
"There are two types of versions: control and treatment. An endpoint contains one or more treatment versions but only one control version. This categorization helps compare the different versions against the defined control version."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"endpoint.update_version(version_name=endpoint.versions[version_name_add].name, description=\"my second version update\", traffic_percentile=40, is_default=True, is_control_version_type=True)\n",
"endpoint.wait_for_deployment(True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Test the web service using run method\n",
"Test the web service by passing in data. The run() method retrieves API keys behind the scenes to make sure that the call is authenticated."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Scoring on endpoint\n",
"import json\n",
"test_sample = json.dumps({'data': [\n",
" [1,2,3,4,5,6,7,8,9,10], \n",
" [10,9,8,7,6,5,4,3,2,1]\n",
"]})\n",
"\n",
"test_sample_encoded = bytes(test_sample, encoding='utf8')\n",
"prediction = endpoint.run(input_data=test_sample_encoded)\n",
"print(prediction)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Delete Resources"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# deleting a version in an endpoint\n",
"endpoint.delete_version(version_name=version_name)\n",
"endpoint.wait_for_deployment(True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# deleting an endpoint, this will delete all versions in the endpoint and the endpoint itself\n",
"endpoint.delete()"
]
}
],
"metadata": {
"authors": [
{
"name": "shipatel"
}
],
"category": "deployment",
"compute": [
"None"
],
"datasets": [
"Diabetes"
],
"deployment": [
"Azure Kubernetes Service"
],
"exclude_from_index": false,
"framework": [
"Scikit-learn"
],
"friendly_name": "Deploy models to AKS using controlled roll out",
"index_order": 3,
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.0"
},
"star_tag": [
"featured"
],
"tags": [
"None"
],
"task": "Deploy a model with Azure Machine Learning"
},
"nbformat": 4,
"nbformat_minor": 2
}
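
Review note: the notebook above says that a version without a `traffic_percentile` gets 0 and that any unallocated percentage goes to the default version. The sketch below works through that arithmetic for the values used in the notebook (`versiona` at 40, `versionb` at 10). It is only an illustration of the rule as the notebook text describes it, not the `AksEndpoint` implementation; the helper name `effective_traffic` and the check that the total stays at or below 100 are assumptions.

```python
def effective_traffic(versions, default_version):
    """Return the share of scoring traffic each version ends up with.

    versions maps a version name to its configured traffic_percentile;
    None is treated as 0, matching the default described in the notebook.
    """
    configured = {name: (pct or 0) for name, pct in versions.items()}
    if default_version not in configured:
        raise KeyError(default_version)
    total = sum(configured.values())
    if total > 100:
        # Assumption: an over-allocated endpoint is treated as an error here.
        raise ValueError("traffic percentiles add up to more than 100")
    split = dict(configured)
    # Whatever is left unallocated is routed to the default version.
    split[default_version] += 100 - total
    return split


if __name__ == "__main__":
    # State after create_version: versiona (default) 40, versionb 10.
    print(effective_traffic({"versiona": 40, "versionb": 10}, "versiona"))
```

Under this reading, `versiona` serves 90% of requests until `update_version` makes `versionb` the default, at which point the unallocated share moves with the default flag.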

View File

@@ -0,0 +1,4 @@
name: deploy-aks-with-controlled-rollout
dependencies:
- pip:
  - azureml-sdk

View File

@@ -0,0 +1,28 @@
import pickle
import json
import numpy
from sklearn.externals import joblib
from sklearn.linear_model import Ridge
from azureml.core.model import Model


def init():
    global model
    # note here "sklearn_regression_model.pkl" is the name of the model registered under
    # this is a different behavior than before when the code is run locally, even though the code is the same.
    model_path = Model.get_model_path('sklearn_regression_model.pkl')
    # deserialize the model file back into a sklearn model
    model = joblib.load(model_path)


# note you can pass in multiple rows for scoring
def run(raw_data):
    try:
        data = json.loads(raw_data)['data']
        data = numpy.array(data)
        result = model.predict(data)
        # you can return any data type as long as it is JSON-serializable
        return result.tolist()
    except Exception as e:
        error = str(e)
        return error
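
# Review note: the scoring script above expects a JSON document with a
# top-level 'data' key holding a list of rows, and it returns the error
# message as a string instead of raising. The sketch below exercises that
# contract without numpy, scikit-learn, or azureml. StubModel is a
# hypothetical stand-in for the registered model, so the numbers it
# returns are not real predictions.

```python
import json


class StubModel:
    """Hypothetical stand-in for the registered scikit-learn model."""

    def predict(self, rows):
        # Sum each row so the result is easy to check by hand.
        return [float(sum(row)) for row in rows]


model = StubModel()


def run(raw_data):
    """Mirror the scoring script's input/output contract."""
    try:
        data = json.loads(raw_data)["data"]
        return list(model.predict(data))
    except Exception as e:
        # Same behavior as score.py: report the error as a string.
        return str(e)


if __name__ == "__main__":
    test_sample = json.dumps({"data": [
        [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        [10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
    ]})
    # The notebook sends the payload as UTF-8 bytes; json.loads accepts both.
    print(run(bytes(test_sample, encoding="utf8")))
    # A payload without the 'data' key comes back as an error string.
    print(run("{}"))
```

# Because errors are returned rather than raised, a caller cannot tell a
# failed request from a successful one by status alone and has to inspect
# the returned value.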

View File

@@ -431,7 +431,8 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"aks_service.update(enable_app_insights=False)" "aks_service.update(enable_app_insights=False)\n",
"aks_service.wait_for_deployment(show_output = True)"
] ]
}, },
{ {

View File

@@ -755,7 +755,7 @@
], ],
"category": "deployment", "category": "deployment",
"compute": [ "compute": [
"local" "Local"
], ],
"datasets": [ "datasets": [
"Emotion FER" "Emotion FER"

View File

@@ -763,7 +763,7 @@
], ],
"category": "deployment", "category": "deployment",
"compute": [ "compute": [
"local" "Local"
], ],
"datasets": [ "datasets": [
"MNIST" "MNIST"

View File

@@ -373,7 +373,7 @@
], ],
"category": "deployment", "category": "deployment",
"compute": [ "compute": [
"local" "Local"
], ],
"datasets": [ "datasets": [
"ImageNet" "ImageNet"

View File

@@ -1,11 +1,14 @@
## Using explain model APIs ## Using AzureML Interpret APIs
<a name="samples"></a> <a name="samples"></a>
# Explain Model SDK Sample Notebooks # AzureML Interpret SDK Sample Notebooks
Follow these sample notebooks to learn: You can run the interpret-community SDK to explain models locally without Azure.
For notebooks on the local experience, please see:
https://github.com/interpretml/interpret-community/tree/master/notebooks
1. [Explain tabular data locally](tabular-data): Basic examples of explaining model trained on tabular data. Follow these sample notebooks to learn about the model interpretability integration with Azure:
2. [Explain on remote AMLCompute](azure-integration/remote-explanation): Explain a model on a remote AMLCompute target.
3. [Explain tabular data with Run History](azure-integration/run-history): Explain a model with Run History. 1. [Explain on remote AMLCompute](azure-integration/remote-explanation): Explain a model on a remote AMLCompute target.
4. [Operationalize model explanation](azure-integration/scoring-time): Operationalize model explanation as a web service. 2. [Explain tabular data with Run History](azure-integration/run-history): Explain a model with Run History.
3. [Operationalize model explanation](azure-integration/scoring-time): Operationalize model explanation as a web service.

View File

@@ -669,7 +669,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.contrib.interpret.visualize import ExplanationDashboard" "from interpret_community.widget import ExplanationDashboard"
] ]
}, },
{ {
@@ -678,7 +678,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"ExplanationDashboard(global_explanation, original_model, x_test)" "ExplanationDashboard(global_explanation, original_model, datasetX=x_test)"
] ]
}, },
{ {

View File

@@ -61,4 +61,4 @@ global_explanation = tabular_explainer.explain_global(X_test)
# Uploading model explanation data for storage or visualization in webUX # Uploading model explanation data for storage or visualization in webUX
# The explanation can then be downloaded on any compute # The explanation can then be downloaded on any compute
comment = 'Global explanation on regression model trained on boston dataset' comment = 'Global explanation on regression model trained on boston dataset'
client.upload_model_explanation(global_explanation, comment=comment) client.upload_model_explanation(global_explanation, comment=comment, model_id=original_model.id)

View File

@@ -564,7 +564,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.contrib.interpret.visualize import ExplanationDashboard" "from interpret_community.widget import ExplanationDashboard"
] ]
}, },
{ {
@@ -573,7 +573,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"ExplanationDashboard(downloaded_global_explanation, model, x_test)" "ExplanationDashboard(downloaded_global_explanation, model, datasetX=x_test)"
] ]
}, },
{ {

View File

@@ -290,7 +290,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.contrib.interpret.visualize import ExplanationDashboard" "from interpret_community.widget import ExplanationDashboard"
] ]
}, },
{ {
@@ -299,7 +299,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"ExplanationDashboard(global_explanation, clf, x_test)" "ExplanationDashboard(global_explanation, clf, datasetX=x_test)"
] ]
}, },
{ {

View File

@@ -355,7 +355,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.contrib.interpret.visualize import ExplanationDashboard" "from interpret_community.widget import ExplanationDashboard"
] ]
}, },
{ {
@@ -364,7 +364,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"ExplanationDashboard(global_explanation, original_svm_model, x_test)" "ExplanationDashboard(global_explanation, original_svm_model, datasetX=x_test)"
] ]
}, },
{ {

View File

@@ -116,7 +116,7 @@ global_explanation = tabular_explainer.explain_global(x_test)
# uploading model explanation data for storage or visualization # uploading model explanation data for storage or visualization
comment = 'Global explanation on classification model trained on IBM employee attrition dataset' comment = 'Global explanation on classification model trained on IBM employee attrition dataset'
client.upload_model_explanation(global_explanation, comment=comment) client.upload_model_explanation(global_explanation, comment=comment, model_id=original_model.id)
# also create a lightweight explainer for scoring time # also create a lightweight explainer for scoring time
scoring_explainer = LinearScoringExplainer(tabular_explainer) scoring_explainer = LinearScoringExplainer(tabular_explainer)

View File

@@ -1,509 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/explain-model/tabular-data/advanced-feature-transformations-explain-local.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Explain binary classification model predictions with raw feature transformations\n",
"_**This notebook showcases how to use the Azure Machine Learning Interpretability SDK to explain and visualize a binary classification model that uses advanced many to one or many to many feature transformations.**_\n",
"\n",
"\n",
"\n",
"## Table of Contents\n",
"\n",
"1. [Introduction](#Introduction)\n",
"1. [Setup](#Setup)\n",
"1. [Run model explainer locally at training time](#Explain)\n",
" 1. Apply feature transformations\n",
" 1. Train a binary classification model\n",
" 1. Explain the model on raw features\n",
" 1. Generate global explanations\n",
" 1. Generate local explanations\n",
"1. [Visualize results](#Visualize)\n",
"1. [Next steps](#Next)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This notebook illustrates creating explanations for a binary classification model, Titanic passenger data classification, that uses many to one and many to many feature transformations from raw data to engineered features. For the many to one transformation, we sum 2 features `age` and `fare`. For many to many transformations two features are computed: one that is product of `age` and `fare` and another that is square of this product. Our tabular data explainer is then used to get the explanation object with the flag `allow_all_transformations` passed. The object is then used to get raw feature importances.\n",
"\n",
"\n",
"We will showcase raw feature transformations with three tabular data explainers: TabularExplainer (SHAP), MimicExplainer (global surrogate), and PFIExplainer.\n",
"\n",
"| ![Interpretability Toolkit Architecture](./img/interpretability-architecture.png) |\n",
"|:--:|\n",
"| *Interpretability Toolkit Architecture* |\n",
"\n",
"Problem: Titanic passenger data classification with scikit-learn (run model explainer locally)\n",
"\n",
"1. Transform raw features to engineered features\n",
"2. Train a Logistic Regression model using Scikit-learn\n",
"3. Run 'explain_model' globally and locally with full dataset in local mode, which doesn't contact any Azure services.\n",
"4. Visualize the global and local explanations with the visualization dashboard.\n",
"---\n",
"\n",
"Setup: If you are using Jupyter notebooks, the extensions should be installed automatically with the package.\n",
"If you are using Jupyter Labs run the following command:\n",
"```\n",
"(myenv) $ jupyter labextension install @jupyter-widgets/jupyterlab-manager\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Explain\n",
"\n",
"### Run model explainer locally at training time"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.pipeline import Pipeline\n",
"from sklearn.impute import SimpleImputer\n",
"from sklearn.preprocessing import StandardScaler, OneHotEncoder\n",
"from sklearn.linear_model import LogisticRegression\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"# Explainers:\n",
"# 1. SHAP Tabular Explainer\n",
"from interpret.ext.blackbox import TabularExplainer\n",
"\n",
"# OR\n",
"\n",
"# 2. Mimic Explainer\n",
"from interpret.ext.blackbox import MimicExplainer\n",
"# You can use one of the following four interpretable models as a global surrogate to the black box model\n",
"from interpret.ext.glassbox import LGBMExplainableModel\n",
"from interpret.ext.glassbox import LinearExplainableModel\n",
"from interpret.ext.glassbox import SGDExplainableModel\n",
"from interpret.ext.glassbox import DecisionTreeExplainableModel\n",
"\n",
"# OR\n",
"\n",
"# 3. PFI Explainer\n",
"from interpret.ext.blackbox import PFIExplainer "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load the Titanic passenger data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"titanic_url = ('https://raw.githubusercontent.com/amueller/'\n",
" 'scipy-2017-sklearn/091d371/notebooks/datasets/titanic3.csv')\n",
"data = pd.read_csv(titanic_url)\n",
"# fill missing values\n",
"data = data.fillna(method=\"ffill\")\n",
"data = data.fillna(method=\"bfill\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Similar to example [here](https://scikit-learn.org/stable/auto_examples/compose/plot_column_transformer_mixed_types.html#sphx-glr-auto-examples-compose-plot-column-transformer-mixed-types-py), use a subset of columns"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"\n",
"numeric_features = ['age', 'fare']\n",
"categorical_features = ['embarked', 'sex', 'pclass']\n",
"\n",
"y = data['survived'].values\n",
"X = data[categorical_features + numeric_features]\n",
"\n",
"# Split data into train and test\n",
"x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Transform raw features"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can explain raw features by either using a `sklearn.compose.ColumnTransformer` or a list of fitted transformer tuples. The cell below uses `sklearn.compose.ColumnTransformer`. In case you want to run the example with the list of fitted transformer tuples, comment the cell below and uncomment the cell that follows after. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# We add many to one and many to many transformations for illustration purposes.\n",
"# The support for raw feature explanations with many to one and many to many transformations are only supported \n",
"# When allow_all_transformations is set to True on explainer creation\n",
"from sklearn.preprocessing import FunctionTransformer\n",
"many_to_one_transformer = FunctionTransformer(lambda x: x.sum(axis=1).reshape(-1, 1))\n",
"many_to_many_transformer = FunctionTransformer(lambda x: np.hstack(\n",
" (np.prod(x, axis=1).reshape(-1, 1), (np.prod(x, axis=1)**2).reshape(-1, 1))\n",
"))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.compose import ColumnTransformer\n",
"\n",
"transformations = ColumnTransformer([\n",
" (\"age_fare_1\", Pipeline(steps=[\n",
" ('imputer', SimpleImputer(strategy='median')),\n",
" ('scaler', StandardScaler())\n",
" ]), [\"age\", \"fare\"]),\n",
" (\"age_fare_2\", many_to_one_transformer, [\"age\", \"fare\"]),\n",
" (\"age_fare_3\", many_to_many_transformer, [\"age\", \"fare\"]),\n",
" (\"embarked\", Pipeline(steps=[\n",
" (\"imputer\", SimpleImputer(strategy='constant', fill_value='missing')), \n",
" (\"encoder\", OneHotEncoder(sparse=False))]), [\"embarked\"]),\n",
" (\"sex_pclass\", OneHotEncoder(sparse=False), [\"sex\", \"pclass\"]) \n",
"])\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"'''\n",
"# Uncomment below if sklearn-pandas is not installed\n",
"#!pip install sklearn-pandas\n",
"from sklearn_pandas import DataFrameMapper\n",
"\n",
"# Impute, standardize the numeric features and one-hot encode the categorical features. \n",
"\n",
"transformations = [\n",
" ([\"age\", \"fare\"], Pipeline(steps=[\n",
" ('imputer', SimpleImputer(strategy='median')),\n",
" ('scaler', StandardScaler())\n",
" ])),\n",
" ([\"age\", \"fare\"], many_to_one_transformer),\n",
" ([\"age\", \"fare\"], many_to_many_transformer),\n",
" ([\"embarked\"], Pipeline(steps=[\n",
" (\"imputer\", SimpleImputer(strategy='constant', fill_value='missing')), \n",
" (\"encoder\", OneHotEncoder(sparse=False))])),\n",
" ([\"sex\", \"pclass\"], OneHotEncoder(sparse=False)) \n",
"]\n",
"\n",
"\n",
"# Append classifier to preprocessing pipeline.\n",
"# Now we have a full prediction pipeline.\n",
"clf = Pipeline(steps=[('preprocessor', DataFrameMapper(transformations)),\n",
" ('classifier', LogisticRegression(solver='lbfgs'))])\n",
"'''"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Train a Logistic Regression model, which you want to explain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Append classifier to preprocessing pipeline.\n",
"# Now we have a full prediction pipeline.\n",
"clf = Pipeline(steps=[('preprocessor', transformations),\n",
" ('classifier', LogisticRegression(solver='lbfgs'))])\n",
"model = clf.fit(x_train, y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explain predictions on your local machine"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 1. Using SHAP TabularExplainer\n",
"# When the last parameter allow_all_transformations is passed, we handle many to one and many to many transformations to \n",
"# generate approximations to raw feature importances. When this flag is passed, for transformations not recognized as one to \n",
"# many, we distribute feature importances evenly to raw features generating them.\n",
"# clf.steps[-1][1] returns the trained classification model\n",
"explainer = TabularExplainer(clf.steps[-1][1], \n",
" initialization_examples=x_train, \n",
" features=x_train.columns, \n",
" transformations=transformations, \n",
" allow_all_transformations=True)\n",
"\n",
"\n",
"\n",
"\n",
"# 2. Using MimicExplainer\n",
"# augment_data is optional and if true, oversamples the initialization examples to improve surrogate model accuracy to fit original model. Useful for high-dimensional data where the number of rows is less than the number of columns. \n",
"# max_num_of_augmentations is optional and defines max number of times we can increase the input data size.\n",
"# LGBMExplainableModel can be replaced with LinearExplainableModel, SGDExplainableModel, or DecisionTreeExplainableModel\n",
"# explainer = MimicExplainer(clf.steps[-1][1], \n",
"# x_train, \n",
"# LGBMExplainableModel, \n",
"# augment_data=True, \n",
"# max_num_of_augmentations=10, \n",
"# features=x_train.columns, \n",
"# transformations=transformations, \n",
"# allow_all_transformations=True)\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"# 3. Using PFIExplainer\n",
"\n",
"# Use the parameter \"metric\" to pass a metric name or function to evaluate the permutation. \n",
"# Note that if a metric function is provided a higher value must be better.\n",
"# Otherwise, take the negative of the function or set the parameter \"is_error_metric\" to True.\n",
"# Default metrics: \n",
"# F1 Score for binary classification, F1 Score with micro average for multiclass classification and\n",
"# Mean absolute error for regression\n",
"\n",
"\n",
"# explainer = PFIExplainer(clf.steps[-1][1], \n",
"# features=x_train.columns, \n",
"# transformations=transformations)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate global explanations\n",
"Explain overall model predictions (global explanation)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Passing in test dataset for evaluation examples - note it must be a representative sample of the original data\n",
"# x_train can be passed as well, but with more examples explanations will take longer although they may be more accurate\n",
"\n",
"global_explanation = explainer.explain_global(x_test)\n",
"\n",
"# Note: if you used the PFIExplainer in the previous step, use the next line of code instead\n",
"# global_explanation = explainer.explain_global(x_test, true_labels=y_test)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sorted SHAP values\n",
"print('ranked global importance values: {}'.format(global_explanation.get_ranked_global_values()))\n",
"# Corresponding feature names\n",
"print('ranked global importance names: {}'.format(global_explanation.get_ranked_global_names()))\n",
"# Feature ranks (based on original order of features)\n",
"print('global importance rank: {}'.format(global_explanation.global_importance_rank))\n",
"# Per class feature names\n",
"print('ranked per class feature names: {}'.format(global_explanation.get_ranked_per_class_names()))\n",
"# Per class feature importance values\n",
"print('ranked per class feature values: {}'.format(global_explanation.get_ranked_per_class_values()))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Print out a dictionary that holds the sorted feature importance names and values\n",
"print('global importance rank: {}'.format(global_explanation.get_feature_importance_dict()))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explain overall model predictions as a collection of local (instance-level) explanations"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# feature shap values for all features and all data points in the training data\n",
"print('local importance values: {}'.format(global_explanation.local_importance_values))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate local explanations\n",
"Explain local data points (individual instances)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Note: PFIExplainer does not support local explanations\n",
"# You can pass a specific data point or a group of data points to the explain_local function\n",
"\n",
"# E.g., Explain the first data point in the test set\n",
"instance_num = 1\n",
"local_explanation = explainer.explain_local(x_test[:instance_num])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the prediction for the first member of the test set and explain why model made that prediction\n",
"prediction_value = clf.predict(x_test)[instance_num]\n",
"\n",
"sorted_local_importance_values = local_explanation.get_ranked_local_values()[prediction_value]\n",
"sorted_local_importance_names = local_explanation.get_ranked_local_names()[prediction_value]\n",
"\n",
"print('local importance values: {}'.format(sorted_local_importance_values))\n",
"print('local importance names: {}'.format(sorted_local_importance_names))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualize\n",
"Load the visualization dashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.contrib.interpret.visualize import ExplanationDashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ExplanationDashboard(global_explanation, model, x_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Next\n",
"Learn about other use cases of the explain package on a:\n",
" \n",
"1. [Training time: regression problem](./explain-regression-local.ipynb)\n",
"1. [Training time: binary classification problem](./explain-binary-classification-local.ipynb)\n",
"1. [Training time: multiclass classification problem](./explain-multiclass-classification-local.ipynb)\n",
"1. [Explain models with simple feature transformations](./simple-feature-transformations-explain-local.ipynb)\n",
"1. [Save model explanations via Azure Machine Learning Run History](../azure-integration/run-history/save-retrieve-explanations-run-history.ipynb)\n",
"1. [Run explainers remotely on Azure Machine Learning Compute (AMLCompute)](../azure-integration/remote-explanation/explain-model-on-amlcompute.ipynb)\n",
"1. Inferencing time: deploy a classification model and explainer:\n",
" 1. [Deploy a locally-trained model and explainer](../azure-integration/scoring-time/train-explain-model-locally-and-deploy.ipynb)\n",
" 1. [Deploy a remotely-trained model and explainer](../azure-integration/scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"authors": [
{
"name": "mesameki"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
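The many-to-one and many-to-many transformations used in the notebook above can be sketched with plain NumPy. This is a minimal illustration rather than the notebook's pipeline: the sample `age`/`fare` values and the helper names `many_to_one` and `many_to_many` are assumptions made for the example. The `spread_evenly` helper is also hypothetical; it shows one possible reading of "distribute feature importances evenly to raw features" and is not the SDK's implementation.

```python
import numpy as np

# Hypothetical raw rows: columns are [age, fare].
raw = np.array([[22.0, 7.25],
                [38.0, 71.28],
                [26.0, 7.92]])


def many_to_one(x):
    """Collapse all input columns into one engineered column (row-wise sum)."""
    return np.sum(x, axis=1).reshape(-1, 1)


def many_to_many(x):
    """Map the input columns to two engineered columns: product and product squared."""
    product = np.prod(x, axis=1).reshape(-1, 1)
    return np.hstack((product, product ** 2))


def spread_evenly(engineered_importances, n_raw_features):
    """Split the summed engineered importance equally across the raw features.

    Illustrative assumption only; the SDK may apply a different rule.
    """
    total = float(np.sum(engineered_importances))
    return np.full(n_raw_features, total / n_raw_features)


summed = many_to_one(raw)        # one engineered column per row
expanded = many_to_many(raw)     # two engineered columns per row
raw_importances = spread_evenly(np.array([0.3, 0.1]), n_raw_features=2)

print(summed.shape, expanded.shape, raw_importances)
```

Because neither transformation maps one raw column to its own engineered columns, there is no unique way to trace an engineered importance back to `age` or `fare`, which is why an approximation such as an even split is needed at all.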

@@ -1,9 +0,0 @@
name: advanced-feature-transformations-explain-local
dependencies:
- pip:
- azureml-sdk
- interpret
- azureml-interpret
- azureml-contrib-interpret
- sklearn-pandas
- ipywidgets

@@ -1,390 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/explain-model/tabular-data/explain-binary-classification-local.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Explain binary classification model predictions\n",
"_**This notebook showcases how to use the Azure Machine Learning Interpretability SDK to explain and visualize a binary classification model predictions.**_\n",
"\n",
"\n",
"## Table of Contents\n",
"\n",
"1. [Introduction](#Introduction)\n",
"1. [Setup](#Setup)\n",
"1. [Run model explainer locally at training time](#Explain)\n",
" 1. Train a binary classification model\n",
" 1. Explain the model\n",
" 1. Generate global explanations\n",
" 1. Generate local explanations\n",
"1. [Visualize results](#Visualize)\n",
"1. [Next steps](#Next)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This notebook illustrates how to explain a binary classification model predictions locally at training time without contacting any Azure services.\n",
"It demonstrates the API calls that you need to make to get the global and local explanations and a visualization dashboard that provides an interactive way of discovering patterns in data and explanations.\n",
"\n",
"We will showcase three tabular data explainers: TabularExplainer (SHAP), MimicExplainer (global surrogate), and PFIExplainer.\n",
"\n",
"| ![Interpretability Toolkit Architecture](./img/interpretability-architecture.png) |\n",
"|:--:|\n",
"| *Interpretability Toolkit Architecture* |\n",
"\n",
"Problem: Breast cancer diagnosis classification with scikit-learn (run model explainer locally)\n",
"\n",
"1. Train a SVM classification model using Scikit-learn\n",
"2. Run 'explain_model' globally and locally with full dataset in local mode, which doesn't contact any Azure services.\n",
"3. Visualize the global and local explanations with the visualization dashboard.\n",
"---\n",
"\n",
"Setup: If you are using Jupyter notebooks, the extensions should be installed automatically with the package.\n",
"If you are using Jupyter Labs run the following command:\n",
"```\n",
"(myenv) $ jupyter labextension install @jupyter-widgets/jupyterlab-manager\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Explain\n",
"\n",
"### Run model explainer locally at training time"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.datasets import load_breast_cancer\n",
"from sklearn import svm\n",
"\n",
"# Explainers:\n",
"# 1. SHAP Tabular Explainer\n",
"from interpret.ext.blackbox import TabularExplainer\n",
"\n",
"# OR\n",
"\n",
"# 2. Mimic Explainer\n",
"from interpret.ext.blackbox import MimicExplainer\n",
"# You can use one of the following four interpretable models as a global surrogate to the black box model\n",
"from interpret.ext.glassbox import LGBMExplainableModel\n",
"from interpret.ext.glassbox import LinearExplainableModel\n",
"from interpret.ext.glassbox import SGDExplainableModel\n",
"from interpret.ext.glassbox import DecisionTreeExplainableModel\n",
"\n",
"# OR\n",
"\n",
"# 3. PFI Explainer\n",
"from interpret.ext.blackbox import PFIExplainer "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load the breast cancer diagnosis data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"breast_cancer_data = load_breast_cancer()\n",
"classes = breast_cancer_data.target_names.tolist()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Split data into train and test\n",
"from sklearn.model_selection import train_test_split\n",
"x_train, x_test, y_train, y_test = train_test_split(breast_cancer_data.data, breast_cancer_data.target, test_size=0.2, random_state=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Train a SVM classification model, which you want to explain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"clf = svm.SVC(gamma=0.001, C=100., probability=True)\n",
"model = clf.fit(x_train, y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explain predictions on your local machine"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 1. Using SHAP TabularExplainer\n",
"explainer = TabularExplainer(model, \n",
" x_train, \n",
" features=breast_cancer_data.feature_names, \n",
" classes=classes)\n",
"\n",
"\n",
"\n",
"\n",
"# 2. Using MimicExplainer\n",
"# augment_data is optional and if true, oversamples the initialization examples to improve surrogate model accuracy to fit original model. Useful for high-dimensional data where the number of rows is less than the number of columns. \n",
"# max_num_of_augmentations is optional and defines max number of times we can increase the input data size.\n",
"# LGBMExplainableModel can be replaced with LinearExplainableModel, SGDExplainableModel, or DecisionTreeExplainableModel\n",
"# explainer = MimicExplainer(model, \n",
"# x_train, \n",
"# LGBMExplainableModel, \n",
"# augment_data=True, \n",
"# max_num_of_augmentations=10, \n",
"# features=breast_cancer_data.feature_names, \n",
"# classes=classes)\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"# 3. Using PFIExplainer\n",
"\n",
"# Use the parameter \"metric\" to pass a metric name or function to evaluate the permutation. \n",
"# Note that if a metric function is provided a higher value must be better.\n",
"# Otherwise, take the negative of the function or set the parameter \"is_error_metric\" to True.\n",
"# Default metrics: \n",
"# F1 Score for binary classification, F1 Score with micro average for multiclass classification and\n",
"# Mean absolute error for regression\n",
"\n",
"# explainer = PFIExplainer(model, \n",
"# features=breast_cancer_data.feature_names, \n",
"# classes=classes)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate global explanations\n",
"Explain overall model predictions (global explanation)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Passing in test dataset for evaluation examples - note it must be a representative sample of the original data\n",
"# x_train can be passed as well, but with more examples explanations will take longer although they may be more accurate\n",
"global_explanation = explainer.explain_global(x_test)\n",
"\n",
"# Note: if you used the PFIExplainer in the previous step, use the next line of code instead\n",
"# global_explanation = explainer.explain_global(x_test, true_labels=y_test)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sorted SHAP values\n",
"print('ranked global importance values: {}'.format(global_explanation.get_ranked_global_values()))\n",
"# Corresponding feature names\n",
"print('ranked global importance names: {}'.format(global_explanation.get_ranked_global_names()))\n",
"# Feature ranks (based on original order of features)\n",
"print('global importance rank: {}'.format(global_explanation.global_importance_rank))\n",
"\n",
"# Note: PFIExplainer does not support per class explanations\n",
"# Per class feature names\n",
"print('ranked per class feature names: {}'.format(global_explanation.get_ranked_per_class_names()))\n",
"# Per class feature importance values\n",
"print('ranked per class feature values: {}'.format(global_explanation.get_ranked_per_class_values()))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Print out a dictionary that holds the sorted feature importance names and values\n",
"print('global importance rank: {}'.format(global_explanation.get_feature_importance_dict()))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explain overall model predictions as a collection of local (instance-level) explanations"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# feature shap values for all features and all data points in the training data\n",
"print('local importance values: {}'.format(global_explanation.local_importance_values))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate local explanations\n",
"Explain local data points (individual instances)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Note: PFIExplainer does not support local explanations\n",
"# You can pass a specific data point or a group of data points to the explain_local function\n",
"\n",
"# E.g., Explain the first data point in the test set\n",
"instance_num = 0\n",
"local_explanation = explainer.explain_local(x_test[instance_num,:])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the prediction for the first member of the test set and explain why model made that prediction\n",
"prediction_value = clf.predict(x_test)[instance_num]\n",
"\n",
"sorted_local_importance_values = local_explanation.get_ranked_local_values()[prediction_value]\n",
"sorted_local_importance_names = local_explanation.get_ranked_local_names()[prediction_value]\n",
"\n",
"print('local importance values: {}'.format(sorted_local_importance_values))\n",
"print('local importance names: {}'.format(sorted_local_importance_names))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualize\n",
"Load the visualization dashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.contrib.interpret.visualize import ExplanationDashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ExplanationDashboard(global_explanation, model, x_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Next\n",
"Learn about other use cases of the explain package on a:\n",
" \n",
"1. [Training time: regression problem](./explain-regression-local.ipynb)\n",
"1. [Training time: multiclass classification problem](./explain-multiclass-classification-local.ipynb)\n",
"1. Explain models with engineered features:\n",
" 1. [Simple feature transformations](./simple-feature-transformations-explain-local.ipynb)\n",
" 1. [Advanced feature transformations](./advanced-feature-transformations-explain-local.ipynb)\n",
"1. [Save model explanations via Azure Machine Learning Run History](../azure-integration/run-history/save-retrieve-explanations-run-history.ipynb)\n",
"1. [Run explainers remotely on Azure Machine Learning Compute (AMLCompute)](../azure-integration/remote-explanation/explain-model-on-amlcompute.ipynb)\n",
"1. Inferencing time: deploy a classification model and explainer:\n",
" 1. [Deploy a locally-trained model and explainer](../azure-integration/scoring-time/train-explain-model-locally-and-deploy.ipynb)\n",
" 1. [Deploy a remotely-trained model and explainer](../azure-integration/scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"authors": [
{
"name": "mesameki"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
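The notebook's comments on `PFIExplainer` note that a metric function must return higher values for better results. The idea behind permutation feature importance can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the `PFIExplainer` implementation: the toy data, `toy_predict`, `accuracy`, and `permutation_importance` are all hypothetical names introduced for the example.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of correct predictions; higher is better."""
    return float(np.mean(y_true == y_pred))


def permutation_importance(predict, x, y, metric, seed=0):
    """Return the drop in `metric` after shuffling each column in turn."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(x))
    importances = np.zeros(x.shape[1])
    for col in range(x.shape[1]):
        shuffled = x.copy()
        # Shuffling one column breaks its relationship with the labels.
        shuffled[:, col] = rng.permutation(shuffled[:, col])
        importances[col] = baseline - metric(y, predict(shuffled))
    return importances


# Toy data: the label depends only on column 0; column 1 is noise.
data_rng = np.random.default_rng(42)
x_toy = data_rng.normal(size=(200, 2))
y_toy = (x_toy[:, 0] > 0).astype(int)


def toy_predict(x):
    """Stand-in for a trained model: thresholds column 0."""
    return (x[:, 0] > 0).astype(int)


pfi = permutation_importance(toy_predict, x_toy, y_toy, accuracy)
print(pfi)
```

Because importance is measured as `baseline - permuted score`, an error metric (where lower is better) would produce importances with the wrong sign unless it is negated first, which matches the notebook's advice about metric functions.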

@@ -1,8 +0,0 @@
name: explain-binary-classification-local
dependencies:
- pip:
- azureml-sdk
- interpret
- azureml-interpret
- azureml-contrib-interpret
- ipywidgets

@@ -1,388 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/explain-model/tabular-data/explain-multiclass-classification-local.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Explain multiclass classification model's predictions\n",
"_**This notebook showcases how to use the Azure Machine Learning Interpretability SDK to explain and visualize a multiclass classification model predictions.**_\n",
"\n",
"\n",
"\n",
"## Table of Contents\n",
"\n",
"1. [Introduction](#Introduction)\n",
"1. [Setup](#Setup)\n",
"1. [Run model explainer locally at training time](#Explain)\n",
" 1. Train a multiclass classification model\n",
" 1. Explain the model\n",
" 1. Generate global explanations\n",
" 1. Generate local explanations\n",
"1. [Visualize results](#Visualize)\n",
"1. [Next steps](#Next)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This notebook illustrates how to explain a multiclass classification model predictions locally at training time without contacting any Azure services.\n",
"It demonstrates the API calls that you need to make to get the global and local explanations and a visualization dashboard that provides an interactive way of discovering patterns in data and explanations.\n",
"\n",
"We will showcase three tabular data explainers: TabularExplainer (SHAP), MimicExplainer (global surrogate), and PFIExplainer.\n",
"\n",
"| ![Interpretability Toolkit Architecture](./img/interpretability-architecture.png) |\n",
"|:--:|\n",
"| *Interpretability Toolkit Architecture* |\n",
"\n",
"Problem: Iris flower classification with scikit-learn (run model explainer locally)\n",
"\n",
"1. Train a SVM classification model using Scikit-learn\n",
"2. Run 'explain_model' globally and locally with full dataset in local mode, which doesn't contact any Azure services.\n",
"3. Visualize the global and local explanations with the visualization dashboard.\n",
"---\n",
"\n",
"Setup: If you are using Jupyter notebooks, the extensions should be installed automatically with the package.\n",
"If you are using Jupyter Labs run the following command:\n",
"```\n",
"(myenv) $ jupyter labextension install @jupyter-widgets/jupyterlab-manager\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Explain\n",
"\n",
"### Run model explainer locally at training time"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.datasets import load_iris\n",
"from sklearn import svm\n",
"\n",
"# Explainers:\n",
"# 1. SHAP Tabular Explainer\n",
"from interpret.ext.blackbox import TabularExplainer\n",
"\n",
"# OR\n",
"\n",
"# 2. Mimic Explainer\n",
"from interpret.ext.blackbox import MimicExplainer\n",
"# You can use one of the following four interpretable models as a global surrogate to the black box model\n",
"from interpret.ext.glassbox import LGBMExplainableModel\n",
"from interpret.ext.glassbox import LinearExplainableModel\n",
"from interpret.ext.glassbox import SGDExplainableModel\n",
"from interpret.ext.glassbox import DecisionTreeExplainableModel\n",
"\n",
"# OR\n",
"\n",
"# 3. PFI Explainer\n",
"from interpret.ext.blackbox import PFIExplainer "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load the Iris flower dataset"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"iris = load_iris()\n",
"X = iris['data']\n",
"y = iris['target']\n",
"classes = iris['target_names']\n",
"feature_names = iris['feature_names']"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Split data into train and test\n",
"from sklearn.model_selection import train_test_split\n",
"x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Train a SVM classification model, which you want to explain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"clf = svm.SVC(gamma=0.001, C=100., probability=True)\n",
"model = clf.fit(x_train, y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explain predictions on your local machine"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 1. Using SHAP TabularExplainer\n",
"explainer = TabularExplainer(model, \n",
" x_train, \n",
" features=feature_names, \n",
" classes=classes)\n",
"\n",
"\n",
"\n",
"\n",
"# 2. Using MimicExplainer\n",
"# augment_data is optional and if true, oversamples the initialization examples to improve surrogate model accuracy to fit original model. Useful for high-dimensional data where the number of rows is less than the number of columns. \n",
"# max_num_of_augmentations is optional and defines max number of times we can increase the input data size.\n",
"# LGBMExplainableModel can be replaced with LinearExplainableModel, SGDExplainableModel, or DecisionTreeExplainableModel\n",
"# explainer = MimicExplainer(model, \n",
"# x_train, \n",
"# LGBMExplainableModel, \n",
"# augment_data=True, \n",
"# max_num_of_augmentations=10, \n",
"# features=feature_names, \n",
"# classes=classes)\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"# 3. Using PFIExplainer\n",
"\n",
"# Use the parameter \"metric\" to pass a metric name or function to evaluate the permutation. \n",
"# Note that if a metric function is provided a higher value must be better.\n",
"# Otherwise, take the negative of the function or set the parameter \"is_error_metric\" to True.\n",
"# Default metrics: \n",
"# F1 Score for binary classification, F1 Score with micro average for multiclass classification and\n",
"# Mean absolute error for regression\n",
"\n",
"# explainer = PFIExplainer(model, \n",
"# features=feature_names, \n",
"# classes=classes)"
]
},
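{
"cell_type": "markdown",
"metadata": {},
"source": [
"The PFIExplainer comments above describe scoring a model with a metric where higher is better. The following is a minimal sketch of the general permutation feature importance idea, written with plain NumPy and scikit-learn. It is an illustration only and is not how `PFIExplainer` is implemented; the helper name `permutation_importance_sketch` and its `seed` parameter are hypothetical.\n",
"\n",
"```python\n",
"import numpy as np\n",
"from sklearn.datasets import load_iris\n",
"from sklearn.metrics import f1_score\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.svm import SVC\n",
"\n",
"X, y = load_iris(return_X_y=True)\n",
"x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)\n",
"model = SVC(gamma=0.001, C=100., probability=True).fit(x_train, y_train)\n",
"\n",
"\n",
"def permutation_importance_sketch(model, X, y_true, seed=0):\n",
"    # Hypothetical helper: importance = drop in micro-averaged F1 after shuffling one column\n",
"    rng = np.random.RandomState(seed)\n",
"    baseline = f1_score(y_true, model.predict(X), average='micro')\n",
"    drops = []\n",
"    for col in range(X.shape[1]):\n",
"        X_perm = X.copy()\n",
"        # Shuffling a column breaks its relationship with the target\n",
"        X_perm[:, col] = rng.permutation(X_perm[:, col])\n",
"        permuted = f1_score(y_true, model.predict(X_perm), average='micro')\n",
"        drops.append(baseline - permuted)\n",
"    return np.array(drops)\n",
"\n",
"\n",
"importances = permutation_importance_sketch(model, x_test, y_test)\n",
"print('score drop per feature: {}'.format(importances))\n",
"```\n",
"\n",
"A larger drop suggests the model relies more on that feature. Because the sketch needs the true labels to compute the metric, it also shows why a permutation-based explainer is given `true_labels` when generating global explanations."
]
},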
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate global explanations\n",
"Explain overall model predictions (global explanation)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Passing in test dataset for evaluation examples - note it must be a representative sample of the original data\n",
"# x_train can be passed as well, but with more examples explanations will take longer although they may be more accurate\n",
"global_explanation = explainer.explain_global(x_test)\n",
"\n",
"# Note: if you used the PFIExplainer in the previous step, use the next line of code instead\n",
"# global_explanation = explainer.explain_global(x_test, true_labels=y_test)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sorted SHAP values\n",
"print('ranked global importance values: {}'.format(global_explanation.get_ranked_global_values()))\n",
"# Corresponding feature names\n",
"print('ranked global importance names: {}'.format(global_explanation.get_ranked_global_names()))\n",
"# Feature ranks (based on original order of features)\n",
"print('global importance rank: {}'.format(global_explanation.global_importance_rank))\n",
"\n",
"# Note: PFIExplainer does not support per class explanations\n",
"# Per class feature names\n",
"print('ranked per class feature names: {}'.format(global_explanation.get_ranked_per_class_names()))\n",
"# Per class feature importance values\n",
"print('ranked per class feature values: {}'.format(global_explanation.get_ranked_per_class_values()))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Print out a dictionary that holds the sorted feature importance names and values\n",
"print('global importance rank: {}'.format(global_explanation.get_feature_importance_dict()))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explain overall model predictions as a collection of local (instance-level) explanations"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# feature shap values for all features and all data points in the training data\n",
"print('local importance values: {}'.format(global_explanation.local_importance_values))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate local explanations\n",
"Explain local data points (individual instances)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Note: PFIExplainer does not support local explanations\n",
"# You can pass a specific data point or a group of data points to the explain_local function\n",
"\n",
"# E.g., Explain the first data point in the test set\n",
"instance_num = 0\n",
"local_explanation = explainer.explain_local(x_test[instance_num,:])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the prediction for the first member of the test set and explain why model made that prediction\n",
"prediction_value = clf.predict(x_test)[instance_num]\n",
"\n",
"sorted_local_importance_values = local_explanation.get_ranked_local_values()[prediction_value]\n",
"sorted_local_importance_names = local_explanation.get_ranked_local_names()[prediction_value]\n",
"\n",
"print('local importance values: {}'.format(sorted_local_importance_values))\n",
"print('local importance names: {}'.format(sorted_local_importance_names))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualize\n",
"Load the visualization dashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.contrib.interpret.visualize import ExplanationDashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ExplanationDashboard(global_explanation, model, x_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Next\n",
"Learn about other use cases of the explain package on a:\n",
"\n",
"1. [Training time: regression problem](./explain-regression-local.ipynb) \n",
"1. [Training time: binary classification problem](./explain-binary-classification-local.ipynb)\n",
"1. Explain models with engineered features:\n",
" 1. [Simple feature transformations](./simple-feature-transformations-explain-local.ipynb)\n",
" 1. [Advanced feature transformations](./advanced-feature-transformations-explain-local.ipynb)\n",
"1. [Save model explanations via Azure Machine Learning Run History](../azure-integration/run-history/save-retrieve-explanations-run-history.ipynb)\n",
"1. [Run explainers remotely on Azure Machine Learning Compute (AMLCompute)](../azure-integration/remote-explanation/explain-model-on-amlcompute.ipynb)\n",
"1. Inferencing time: deploy a classification model and explainer:\n",
" 1. [Deploy a locally-trained model and explainer](../azure-integration/scoring-time/train-explain-model-locally-and-deploy.ipynb)\n",
" 1. [Deploy a remotely-trained model and explainer](../azure-integration/scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb)\n",
"\u00e2\u20ac\u2039\n"
]
}
],
"metadata": {
"authors": [
{
"name": "mesameki"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}


@@ -1,8 +0,0 @@
name: explain-multiclass-classification-local
dependencies:
- pip:
- azureml-sdk
- interpret
- azureml-interpret
- azureml-contrib-interpret
- ipywidgets


@@ -1,383 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/explain-model/tabular-data/explain-regression-local.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Explain regression model predictions\n",
"_**This notebook showcases how to use the Azure Machine Learning Interpretability SDK to explain and visualize a regression model predictions.**_\n",
"\n",
"\n",
"## Table of Contents\n",
"\n",
"1. [Introduction](#Introduction)\n",
"1. [Setup](#Setup)\n",
"1. [Run model explainer locally at training time](#Explain)\n",
" 1. Train a regressor model\n",
" 1. Explain the model\n",
" 1. Generate global explanations\n",
" 1. Generate local explanations\n",
"1. [Visualize results](#Visualize)\n",
"1. [Next steps](#Next)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This notebook illustrates how to explain regression model predictions locally at training time without contacting any Azure services.\n",
"It demonstrates the API calls that you need to make to get the global and local explanations and a visualization dashboard that provides an interactive way of discovering patterns in data and explanations.\n",
"\n",
"We will showcase three tabular data explainers: TabularExplainer (SHAP), MimicExplainer (global surrogate), and PFIExplainer.\n",
"\n",
"| ![Interpretability Toolkit Architecture](./img/interpretability-architecture.png) |\n",
"|:--:|\n",
"| *Interpretability Toolkit Architecture* |\n",
"\n",
"Problem: Boston Housing Price Prediction with scikit-learn (run model explainer locally)\n",
"\n",
"1. Train a GradientBoosting regression model using Scikit-learn\n",
"2. Run 'explain_model' globally and locally with full dataset in local mode, which doesn't contact any Azure services.\n",
"3. Visualize the global and local explanations with the visualization dashboard.\n",
"---\n",
"\n",
"Setup: If you are using Jupyter notebooks, the extensions should be installed automatically with the package.\n",
"If you are using Jupyter Labs run the following command:\n",
"```\n",
"(myenv) $ jupyter labextension install @jupyter-widgets/jupyterlab-manager\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Explain\n",
"\n",
"### Run model explainer locally at training time"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn import datasets\n",
"from sklearn.ensemble import GradientBoostingRegressor\n",
"\n",
"# Explainers:\n",
"# 1. SHAP Tabular Explainer\n",
"from interpret.ext.blackbox import TabularExplainer\n",
"\n",
"# OR\n",
"\n",
"# 2. Mimic Explainer\n",
"from interpret.ext.blackbox import MimicExplainer\n",
"# You can use one of the following four interpretable models as a global surrogate to the black box model\n",
"from interpret.ext.glassbox import LGBMExplainableModel\n",
"from interpret.ext.glassbox import LinearExplainableModel\n",
"from interpret.ext.glassbox import SGDExplainableModel\n",
"from interpret.ext.glassbox import DecisionTreeExplainableModel\n",
"\n",
"# OR\n",
"\n",
"# 3. PFI Explainer\n",
"from interpret.ext.blackbox import PFIExplainer "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load the Boston house price data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"boston_data = datasets.load_boston()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Split data into train and test\n",
"from sklearn.model_selection import train_test_split\n",
"x_train, x_test, y_train, y_test = train_test_split(boston_data.data, boston_data.target, test_size=0.2, random_state=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Train a GradientBoosting regression model, which you want to explain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"reg = GradientBoostingRegressor(n_estimators=100, max_depth=4,\n",
" learning_rate=0.1, loss='huber',\n",
" random_state=1)\n",
"model = reg.fit(x_train, y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explain predictions on your local machine"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 1. Using SHAP TabularExplainer\n",
"explainer = TabularExplainer(model, \n",
" x_train, \n",
" features = boston_data.feature_names)\n",
"\n",
"\n",
"\n",
"\n",
"# 2. Using MimicExplainer\n",
"# augment_data is optional and if true, oversamples the initialization examples to improve surrogate model accuracy to fit original model. Useful for high-dimensional data where the number of rows is less than the number of columns. \n",
"# max_num_of_augmentations is optional and defines max number of times we can increase the input data size.\n",
"# LGBMExplainableModel can be replaced with LinearExplainableModel, SGDExplainableModel, or DecisionTreeExplainableModel\n",
"# explainer = MimicExplainer(model, \n",
"# x_train, \n",
"# LGBMExplainableModel, \n",
"# augment_data=True, \n",
"# max_num_of_augmentations=10, \n",
"# features=boston_data.feature_names)\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"# 3. Using PFIExplainer\n",
"\n",
"# Use the parameter \"metric\" to pass a metric name or function to evaluate the permutation. \n",
"# Note that if a metric function is provided a higher value must be better.\n",
"# Otherwise, take the negative of the function or set the parameter \"is_error_metric\" to True.\n",
"# Default metrics: \n",
"# F1 Score for binary classification, F1 Score with micro average for multiclass classification and\n",
"# Mean absolute error for regression\n",
"\n",
"# explainer = PFIExplainer(model, \n",
"# features=boston_data.feature_names)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate global explanations\n",
"Explain overall model predictions (global explanation)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Passing in test dataset for evaluation examples - note it must be a representative sample of the original data\n",
"# x_train can be passed as well, but with more examples explanations will take longer although they may be more accurate\n",
"global_explanation = explainer.explain_global(x_test)\n",
"\n",
"# Note: if you used the PFIExplainer in the previous step, use the next line of code instead\n",
"# global_explanation = explainer.explain_global(x_test, true_labels=y_test)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sorted SHAP values \n",
"print('ranked global importance values: {}'.format(global_explanation.get_ranked_global_values()))\n",
"# Corresponding feature names\n",
"print('ranked global importance names: {}'.format(global_explanation.get_ranked_global_names()))\n",
"# Feature ranks (based on original order of features)\n",
"print('global importance rank: {}'.format(global_explanation.global_importance_rank))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Print out a dictionary that holds the sorted feature importance names and values\n",
"print('global importance rank: {}'.format(global_explanation.get_feature_importance_dict()))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explain overall model predictions as a collection of local (instance-level) explanations"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Note: PFIExplainer does not support local explanations\n",
"# feature shap values for all features and all data points in the training data\n",
"print('local importance values: {}'.format(global_explanation.local_importance_values))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate local explanations\n",
"Explain local data points (individual instances)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Note: PFIExplainer does not support local explanations\n",
"# You can pass a specific data point or a group of data points to the explain_local function\n",
"\n",
"# E.g., Explain the first data point in the test set\n",
"local_explanation = explainer.explain_local(x_test[0,:])\n",
"\n",
"# E.g., Explain the first five data points in the test set\n",
"# local_explanation_group = explainer.explain_local(x_test[0:4,:])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sorted local feature importance information; reflects the original feature order\n",
"sorted_local_importance_names = local_explanation.get_ranked_local_names()\n",
"sorted_local_importance_values = local_explanation.get_ranked_local_values()\n",
"\n",
"print('sorted local importance names: {}'.format(sorted_local_importance_names))\n",
"print('sorted local importance values: {}'.format(sorted_local_importance_values))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualize\n",
"Load the visualization dashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.contrib.interpret.visualize import ExplanationDashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ExplanationDashboard(global_explanation, model, x_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Next\n",
"Learn about other use cases of the explain package on a:\n",
" \n",
"1. [Training time: binary classification problem](./explain-binary-classification-local.ipynb)\n",
"1. [Training time: multiclass classification problem](./explain-multiclass-classification-local.ipynb)\n",
"1. Explain models with engineered features:\n",
" 1. [Simple feature transformations](./simple-feature-transformations-explain-local.ipynb)\n",
" 1. [Advanced feature transformations](./advanced-feature-transformations-explain-local.ipynb)\n",
"1. [Save model explanations via Azure Machine Learning Run History](../azure-integration/run-history/save-retrieve-explanations-run-history.ipynb)\n",
"1. [Run explainers remotely on Azure Machine Learning Compute (AMLCompute)](../azure-integration/remote-explanation/explain-model-on-amlcompute.ipynb)\n",
"1. Inferencing time: deploy a classification model and explainer:\n",
" 1. [Deploy a locally-trained model and explainer](../azure-integration/scoring-time/train-explain-model-locally-and-deploy.ipynb)\n",
" 1. [Deploy a remotely-trained model and explainer](../azure-integration/scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"authors": [
{
"name": "mesameki"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}


@@ -1,8 +0,0 @@
name: explain-regression-local
dependencies:
- pip:
- azureml-sdk
- interpret
- azureml-interpret
- azureml-contrib-interpret
- ipywidgets

Binary file not shown.


@@ -1,517 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/explain-model/tabular-data/simple-feature-transformations-explain-local.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Explain binary classification model predictions with raw feature transformations\n",
"_**This notebook showcases how to use the Azure Machine Learning Interpretability SDK to explain and visualize a binary classification model that uses one to one and one to many feature transformations.**_\n",
"\n",
"\n",
"## Table of Contents\n",
"\n",
"1. [Introduction](#Introduction)\n",
"1. [Setup](#Setup)\n",
"1. [Run model explainer locally at training time](#Explain)\n",
" 1. Apply feature transformations\n",
" 1. Train a binary classification model\n",
" 1. Explain the model on raw features\n",
" 1. Generate global explanations\n",
" 1. Generate local explanations\n",
"1. [Visualize results](#Visualize)\n",
"1. [Next steps](#Next%20steps)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This notebook illustrates creating explanations for a binary classification model, IBM employee attrition classification, that uses one to one and one to many feature transformations from raw data to engineered features. The one to many feature transformations include one hot encoding on categorical features. The one to one feature transformations apply standard scaling on numeric features. Our tabular data explainer is then used to get raw feature importances.\n",
"\n",
"\n",
"We will showcase raw feature transformations with three tabular data explainers: TabularExplainer (SHAP), MimicExplainer (global surrogate), and PFIExplainer.\n",
"\n",
"| ![Interpretability Toolkit Architecture](./img/interpretability-architecture.png) |\n",
"|:--:|\n",
"| *Interpretability Toolkit Architecture* |\n",
"\n",
"Problem: IBM employee attrition classification with scikit-learn (run model explainer locally)\n",
"\n",
"1. Transform raw features to engineered features\n",
"2. Train a SVC classification model using Scikit-learn\n",
"3. Run 'explain_model' globally and locally with full dataset in local mode, which doesn't contact any Azure services.\n",
"4. Visualize the global and local explanations with the visualization dashboard.\n",
"---\n",
"\n",
"Setup: If you are using Jupyter notebooks, the extensions should be installed automatically with the package.\n",
"If you are using Jupyter Labs run the following command:\n",
"```\n",
"(myenv) $ jupyter labextension install @jupyter-widgets/jupyterlab-manager\n",
"```\n"
]
},
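{
"cell_type": "markdown",
"metadata": {},
"source": [
"To illustrate what a one-to-many transformation implies for raw feature importances, here is a minimal sketch under stated assumptions. The importance numbers are made up for illustration, the two-feature layout is hypothetical, and summing engineered importances back to their raw feature is just one simple aggregation choice; it is not claimed to be what the SDK does internally.\n",
"\n",
"```python\n",
"import numpy as np\n",
"from sklearn.preprocessing import OneHotEncoder\n",
"\n",
"# Hypothetical raw features: one categorical, one numeric\n",
"raw_feature_names = ['Department', 'Age']\n",
"departments = np.array([['Sales'], ['R&D'], ['HR'], ['Sales']])\n",
"\n",
"# One-to-many: a single raw column becomes one engineered column per category\n",
"encoder = OneHotEncoder(handle_unknown='ignore').fit(departments)\n",
"n_onehot = len(encoder.categories_[0])\n",
"\n",
"# Made-up engineered importances: the one-hot columns first, then scaled Age (one-to-one)\n",
"engineered_importances = np.array([0.10, 0.05, 0.15, 0.40])\n",
"\n",
"# Map each engineered column index back to the raw feature it came from\n",
"engineered_to_raw = [0] * n_onehot + [1]\n",
"\n",
"raw_importances = np.zeros(len(raw_feature_names))\n",
"for eng_idx, raw_idx in enumerate(engineered_to_raw):\n",
"    raw_importances[raw_idx] += engineered_importances[eng_idx]\n",
"\n",
"for name, value in zip(raw_feature_names, raw_importances):\n",
"    print('{}: {:.2f}'.format(name, value))\n",
"```\n",
"\n",
"In this toy setup the three one-hot columns are folded back into a single `Department` importance, while `Age` maps through unchanged because standard scaling is one-to-one."
]
},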
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Explain\n",
"\n",
"### Run model explainer locally at training time"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.pipeline import Pipeline\n",
"from sklearn.impute import SimpleImputer\n",
"from sklearn.preprocessing import StandardScaler, OneHotEncoder\n",
"from sklearn.svm import SVC\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"# Explainers:\n",
"# 1. SHAP Tabular Explainer\n",
"from interpret.ext.blackbox import TabularExplainer\n",
"\n",
"# OR\n",
"\n",
"# 2. Mimic Explainer\n",
"from interpret.ext.blackbox import MimicExplainer\n",
"# You can use one of the following four interpretable models as a global surrogate to the black box model\n",
"from interpret.ext.glassbox import LGBMExplainableModel\n",
"from interpret.ext.glassbox import LinearExplainableModel\n",
"from interpret.ext.glassbox import SGDExplainableModel\n",
"from interpret.ext.glassbox import DecisionTreeExplainableModel\n",
"\n",
"# OR\n",
"\n",
"# 3. PFI Explainer\n",
"from interpret.ext.blackbox import PFIExplainer "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load the IBM employee attrition data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get the IBM employee attrition dataset\n",
"outdirname = 'dataset.6.21.19'\n",
"try:\n",
" from urllib import urlretrieve\n",
"except ImportError:\n",
" from urllib.request import urlretrieve\n",
"import zipfile\n",
"zipfilename = outdirname + '.zip'\n",
"urlretrieve('https://publictestdatasets.blob.core.windows.net/data/' + zipfilename, zipfilename)\n",
"with zipfile.ZipFile(zipfilename, 'r') as unzip:\n",
" unzip.extractall('.')\n",
"attritionData = pd.read_csv('./WA_Fn-UseC_-HR-Employee-Attrition.csv')\n",
"\n",
"# Dropping Employee count as all values are 1 and hence attrition is independent of this feature\n",
"attritionData = attritionData.drop(['EmployeeCount'], axis=1)\n",
"# Dropping Employee Number since it is merely an identifier\n",
"attritionData = attritionData.drop(['EmployeeNumber'], axis=1)\n",
"\n",
"attritionData = attritionData.drop(['Over18'], axis=1)\n",
"\n",
"# Since all values are 80\n",
"attritionData = attritionData.drop(['StandardHours'], axis=1)\n",
"\n",
"# Converting target variables from string to numerical values\n",
"target_map = {'Yes': 1, 'No': 0}\n",
"attritionData[\"Attrition_numerical\"] = attritionData[\"Attrition\"].apply(lambda x: target_map[x])\n",
"target = attritionData[\"Attrition_numerical\"]\n",
"\n",
"attritionXData = attritionData.drop(['Attrition_numerical', 'Attrition'], axis=1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Split data into train and test\n",
"from sklearn.model_selection import train_test_split\n",
"x_train, x_test, y_train, y_test = train_test_split(attritionXData, \n",
" target, \n",
" test_size = 0.2,\n",
" random_state=0,\n",
" stratify=target)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Creating dummy columns for each categorical feature\n",
"categorical = []\n",
"for col, value in attritionXData.iteritems():\n",
" if value.dtype == 'object':\n",
" categorical.append(col)\n",
" \n",
"# Store the numerical columns in a list numerical\n",
"numerical = attritionXData.columns.difference(categorical) "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Transform raw features"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can explain raw features by either using a `sklearn.compose.ColumnTransformer` or a list of fitted transformer tuples. The cell below uses `sklearn.compose.ColumnTransformer`. In case you want to run the example with the list of fitted transformer tuples, comment the cell below and uncomment the cell that follows after. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.compose import ColumnTransformer\n",
"\n",
"# We create the preprocessing pipelines for both numeric and categorical data.\n",
"numeric_transformer = Pipeline(steps=[\n",
" ('imputer', SimpleImputer(strategy='median')),\n",
" ('scaler', StandardScaler())])\n",
"\n",
"categorical_transformer = Pipeline(steps=[\n",
" ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),\n",
" ('onehot', OneHotEncoder(handle_unknown='ignore'))])\n",
"\n",
"transformations = ColumnTransformer(\n",
" transformers=[\n",
" ('num', numeric_transformer, numerical),\n",
" ('cat', categorical_transformer, categorical)])\n",
"\n",
"# Append classifier to preprocessing pipeline.\n",
"# Now we have a full prediction pipeline.\n",
"clf = Pipeline(steps=[('preprocessor', transformations),\n",
" ('classifier', SVC(C = 1.0, probability=True))])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"'''\n",
"# Uncomment below if sklearn-pandas is not installed\n",
"#!pip install sklearn-pandas\n",
"from sklearn_pandas import DataFrameMapper\n",
"\n",
"# Impute, standardize the numeric features and one-hot encode the categorical features. \n",
"\n",
"\n",
"numeric_transformations = [([f], Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler())])) for f in numerical]\n",
"\n",
"categorical_transformations = [([f], OneHotEncoder(handle_unknown='ignore', sparse=False)) for f in categorical]\n",
"\n",
"transformations = numeric_transformations + categorical_transformations\n",
"\n",
"# Append classifier to preprocessing pipeline.\n",
"# Now we have a full prediction pipeline.\n",
"clf = Pipeline(steps=[('preprocessor', transformations),\n",
" ('classifier', SVC(C = 1.0, probability=True))]) \n",
"\n",
"\n",
"\n",
"'''"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Train a SVM classification model, which you want to explain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model = clf.fit(x_train, y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explain predictions on your local machine"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 1. Using SHAP TabularExplainer\n",
"# clf.steps[-1][1] returns the trained classification model\n",
"explainer = TabularExplainer(clf.steps[-1][1], \n",
" initialization_examples=x_train, \n",
" features=attritionXData.columns, \n",
" classes=[\"Not leaving\", \"leaving\"], \n",
" transformations=transformations)\n",
"\n",
"\n",
"\n",
"\n",
"# 2. Using MimicExplainer\n",
"# augment_data is optional and if true, oversamples the initialization examples to improve surrogate model accuracy to fit original model. Useful for high-dimensional data where the number of rows is less than the number of columns. \n",
"# max_num_of_augmentations is optional and defines max number of times we can increase the input data size.\n",
"# LGBMExplainableModel can be replaced with LinearExplainableModel, SGDExplainableModel, or DecisionTreeExplainableModel\n",
"# explainer = MimicExplainer(clf.steps[-1][1], \n",
"# x_train, \n",
"# LGBMExplainableModel, \n",
"# augment_data=True, \n",
"# max_num_of_augmentations=10, \n",
"# features=attritionXData.columns, \n",
"# classes=[\"Not leaving\", \"leaving\"], \n",
"# transformations=transformations)\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"# 3. Using PFIExplainer\n",
"\n",
"# Use the parameter \"metric\" to pass a metric name or function to evaluate the permutation. \n",
"# Note that if a metric function is provided a higher value must be better.\n",
"# Otherwise, take the negative of the function or set the parameter \"is_error_metric\" to True.\n",
"# Default metrics: \n",
"# F1 Score for binary classification, F1 Score with micro average for multiclass classification and\n",
"# Mean absolute error for regression\n",
"\n",
"# explainer = PFIExplainer(clf.steps[-1][1], \n",
"# features=x_train.columns, \n",
"# transformations=transformations,\n",
"# classes=[\"Not leaving\", \"leaving\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate global explanations\n",
"Explain overall model predictions (global explanation)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Passing in test dataset for evaluation examples - note it must be a representative sample of the original data\n",
"# x_train can be passed as well, but with more examples explanations will take longer although they may be more accurate\n",
"global_explanation = explainer.explain_global(x_test)\n",
"\n",
"# Note: if you used the PFIExplainer in the previous step, use the next line of code instead\n",
"# global_explanation = explainer.explain_global(x_test, true_labels=y_test)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sorted SHAP values\n",
"print('ranked global importance values: {}'.format(global_explanation.get_ranked_global_values()))\n",
"# Corresponding feature names\n",
"print('ranked global importance names: {}'.format(global_explanation.get_ranked_global_names()))\n",
"# Feature ranks (based on original order of features)\n",
"print('global importance rank: {}'.format(global_explanation.global_importance_rank))\n",
"\n",
"# Note: PFIExplainer does not support per class explanations\n",
"# Per class feature names\n",
"print('ranked per class feature names: {}'.format(global_explanation.get_ranked_per_class_names()))\n",
"# Per class feature importance values\n",
"print('ranked per class feature values: {}'.format(global_explanation.get_ranked_per_class_values()))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Print out a dictionary that holds the sorted feature importance names and values\n",
"print('global importance dict: {}'.format(global_explanation.get_feature_importance_dict()))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explain overall model predictions as a collection of local (instance-level) explanations"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# feature SHAP values for all features and all data points in the evaluation data (x_test)\n",
"print('local importance values: {}'.format(global_explanation.local_importance_values))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate local explanations\n",
"Explain local data points (individual instances)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Note: PFIExplainer does not support local explanations\n",
"# You can pass a specific data point or a group of data points to the explain_local function\n",
"\n",
"# E.g., Explain the first data point in the test set\n",
"instance_num = 1\n",
"local_explanation = explainer.explain_local(x_test[:instance_num])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get the prediction for the first member of the test set and explain why the model made that prediction\n",
"prediction_value = clf.predict(x_test[:instance_num])[0]\n",
"\n",
"sorted_local_importance_values = local_explanation.get_ranked_local_values()[prediction_value]\n",
"sorted_local_importance_names = local_explanation.get_ranked_local_names()[prediction_value]\n",
"\n",
"print('local importance values: {}'.format(sorted_local_importance_values))\n",
"print('local importance names: {}'.format(sorted_local_importance_names))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Visualize\n",
"Load the visualization dashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.contrib.interpret.visualize import ExplanationDashboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ExplanationDashboard(global_explanation, model, x_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Next\n",
"Learn about other use cases of the explain package:\n",
" \n",
"1. [Training time: regression problem](./explain-regression-local.ipynb)\n",
"1. [Training time: binary classification problem](./explain-binary-classification-local.ipynb)\n",
"1. [Training time: multiclass classification problem](./explain-multiclass-classification-local.ipynb)\n",
"1. [Explain models with advanced feature transformations](./advanced-feature-transformations-explain-local.ipynb)\n",
"1. [Save model explanations via Azure Machine Learning Run History](../azure-integration/run-history/save-retrieve-explanations-run-history.ipynb)\n",
"1. [Run explainers remotely on Azure Machine Learning Compute (AMLCompute)](../azure-integration/remote-explanation/explain-model-on-amlcompute.ipynb)\n",
"1. Inferencing time: deploy a classification model and explainer:\n",
" 1. [Deploy a locally-trained model and explainer](../azure-integration/scoring-time/train-explain-model-locally-and-deploy.ipynb)\n",
" 1. [Deploy a remotely-trained model and explainer](../azure-integration/scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"authors": [
{
"name": "mesameki"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
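
The PFIExplainer comments in the notebook above state that a custom metric must be higher-is-better unless `is_error_metric` is set. A minimal sketch of permutation feature importance showing why that sign convention matters. This is not the azureml implementation; the function names, the toy model, and the toy data are assumptions for illustration only.

```python
import random


def accuracy(y_true, y_pred):
    """Higher-is-better metric: fraction of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, y_true, metric,
                           is_error_metric=False, seed=0):
    """Return one importance score per feature column.

    Importance is the drop in score after shuffling that column.
    Error metrics are negated first so that "higher is better" always holds.
    """
    rng = random.Random(seed)
    sign = -1.0 if is_error_metric else 1.0
    baseline = sign * metric(y_true, predict(rows))
    importances = []
    for col in range(len(rows[0])):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with only this column replaced by a shuffled value
        permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
        score = sign * metric(y_true, predict(permuted))
        importances.append(baseline - score)
    return importances


def toy_predict(rows):
    """Hypothetical model that only looks at feature 0."""
    return [1 if r[0] > 0.5 else 0 for r in rows]


rows = [[i / 10, ((i * 7) % 10) / 10] for i in range(10)]
labels = toy_predict(rows)
importances = permutation_importance(toy_predict, rows, labels, accuracy)
print(importances)
```

Feature 1 is ignored by the toy model, so shuffling it leaves the score unchanged and its importance is zero. Passing an error metric without negating it would flip the sign of every importance value.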


@@ -1,9 +0,0 @@
name: simple-feature-transformations-explain-local
dependencies:
- pip:
- azureml-sdk
- interpret
- azureml-interpret
- azureml-contrib-interpret
- sklearn-pandas
- ipywidgets


@@ -34,7 +34,8 @@
"| Azure Data Lake Storage Gen 1 | Yes | Yes |\n", "| Azure Data Lake Storage Gen 1 | Yes | Yes |\n",
"| Azure Data Lake Storage Gen 2 | Yes | Yes |\n", "| Azure Data Lake Storage Gen 2 | Yes | Yes |\n",
"| Azure SQL Database | Yes | Yes |\n", "| Azure SQL Database | Yes | Yes |\n",
"| Azure Database for PostgreSQL | Yes | No |" "| Azure Database for PostgreSQL | Yes | Yes |",
"| Azure Database for MySQL | Yes | Yes |"
] ]
}, },
{ {
@@ -342,8 +343,8 @@
"source": [ "source": [
"\n", "\n",
"mysql_datastore_name=\"MySqlDatastore\"\n", "mysql_datastore_name=\"MySqlDatastore\"\n",
"server_name=os.getenv(\"MYSQL_SERVERNAME_62\", \"<my-server-name>\") # Name of PostgreSQL server \n", "server_name=os.getenv(\"MYSQL_SERVERNAME_62\", \"<my-server-name>\") # Name of MySQL server \n",
"database_name=os.getenv(\"MYSQL_DATBASENAME_62\", \"<my-database-name>\") # Name of PostgreSQL database\n", "database_name=os.getenv(\"MYSQL_DATBASENAME_62\", \"<my-database-name>\") # Name of MySQL database\n",
"user_id=os.getenv(\"MYSQL_USERID_62\", \"<my-user-id>\") # user id\n", "user_id=os.getenv(\"MYSQL_USERID_62\", \"<my-user-id>\") # user id\n",
"user_password=os.getenv(\"MYSQL_USERPW_62\", \"<my-user-password>\") # user password\n", "user_password=os.getenv(\"MYSQL_USERPW_62\", \"<my-user-password>\") # user password\n",
"\n", "\n",


@@ -23,9 +23,9 @@
"# How to create Module, ModuleVersion, and use them in a pipeline with ModuleStep.\n", "# How to create Module, ModuleVersion, and use them in a pipeline with ModuleStep.\n",
"In this notebook, we introduce the concept of versioned modules and how to use them in an Azure Machine Learning Pipeline.\n", "In this notebook, we introduce the concept of versioned modules and how to use them in an Azure Machine Learning Pipeline.\n",
"\n", "\n",
"The core idea behind introducing Module, ModuleVersion and ModuleStep is to allow the separation between a reusable executable components and their actual usage. These reusable software components (such as scripts or executables) can be used in different scenarios and by different users. This follows the same idea of separating software frameworks/libraries and their actual usage in applications. Module and ModuleVersion take the role of the reusable executable components where ModuleStep is there to link them to an actual usage.\n", "The core idea behind introducing Module, ModuleVersion and ModuleStep is to allow the separation between reusable executable components and their actual usage. These reusable software components (such as scripts or executables) can be used in different scenarios and by different users. This follows the same idea of separating software frameworks/libraries and their actual usage in applications. Module and ModuleVersion take the role of the reusable executable components where ModuleStep is there to link them to an actual usage.\n",
"\n", "\n",
"A module is an elaborated container of its versions, where each version is the actual computational unit. It is up to users to define the semantics of this hierarchical structure of container and versions. For example, they could be different versions for different use cases, development progress, etc.\n", "A module is an elaborated container of its versions, where each version is the actual computational unit. It is up to users to define the semantics of this hierarchical structure of container and versions. For example, there could be different versions for different use cases, development progress, etc.\n",
"\n", "\n",
"Each ModuleVersion may have inputs, outputs and rely on parameters and its environment configuration to operate.\n", "Each ModuleVersion may have inputs, outputs and rely on parameters and its environment configuration to operate.\n",
"\n", "\n",


@@ -382,10 +382,25 @@
" headers=aad_token, \n", " headers=aad_token, \n",
" json={\"ExperimentName\": \"My_Pipeline1\",\n", " json={\"ExperimentName\": \"My_Pipeline1\",\n",
" \"RunSource\": \"SDK\",\n", " \"RunSource\": \"SDK\",\n",
" \"ParameterAssignments\": {\"pipeline_arg\": 45}})\n", " \"ParameterAssignments\": {\"pipeline_arg\": 45}})"
"run_id = response.json()[\"Id\"]\n", ]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" response.raise_for_status()\n",
"except Exception: \n",
" raise Exception('Received bad response from the endpoint: {}\\n'\n",
" 'Response Code: {}\\n'\n",
" 'Headers: {}\\n'\n",
" 'Content: {}'.format(rest_endpoint, response.status_code, response.headers, response.content))\n",
"\n", "\n",
"print(run_id)" "run_id = response.json().get('Id')\n",
"print('Submitted pipeline run: ', run_id)"
] ]
}, },
{ {
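
The new cell in this hunk checks the HTTP status before reading `Id`, but `.get('Id')` still returns `None` silently when the field is absent. A minimal sketch of a helper that covers both cases. `extract_run_id`, `FakeResponse`, and the example URL are hypothetical names introduced for illustration; the helper only assumes a response object with `raise_for_status()`, `json()`, `status_code`, and `content`, as `requests.Response` provides.

```python
def extract_run_id(response, endpoint):
    """Validate a REST response and return the pipeline run Id."""
    try:
        response.raise_for_status()
    except Exception as err:
        # Chain the original error so the HTTP failure stays in the traceback
        raise RuntimeError(
            'Received bad response from the endpoint: {}\n'
            'Response Code: {}\n'
            'Content: {}'.format(endpoint, response.status_code, response.content)
        ) from err
    run_id = response.json().get('Id')
    if run_id is None:
        raise KeyError("Response from {} did not contain an 'Id' field".format(endpoint))
    return run_id


class FakeResponse:
    """Hypothetical stand-in for requests.Response, for illustration only."""

    def __init__(self, status_code, payload):
        self.status_code = status_code
        self.content = str(payload).encode()
        self._payload = payload

    def raise_for_status(self):
        if self.status_code >= 400:
            raise ValueError('HTTP {}'.format(self.status_code))

    def json(self):
        return self._payload


run_id = extract_run_id(FakeResponse(200, {'Id': 'abc123'}),
                        'https://example.invalid/pipeline')
print('Submitted pipeline run: ', run_id)
```

In a notebook the same function could be called with the real `response` and `rest_endpoint` from the preceding cell.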


@@ -494,10 +494,25 @@
" headers=aad_token, \n", " headers=aad_token, \n",
" json={\"ExperimentName\": \"default_pipeline\",\n", " json={\"ExperimentName\": \"default_pipeline\",\n",
" \"RunSource\": \"SDK\",\n", " \"RunSource\": \"SDK\",\n",
" \"ParameterAssignments\": {\"1\": \"united\", \"2\":\"city\"}})\n", " \"ParameterAssignments\": {\"1\": \"united\", \"2\":\"city\"}})"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" response.raise_for_status()\n",
"except Exception: \n",
" raise Exception('Received bad response from the endpoint: {}\\n'\n",
" 'Response Code: {}\\n'\n",
" 'Headers: {}\\n'\n",
" 'Content: {}'.format(rest_endpoint, response.status_code, response.headers, response.content))\n",
"\n", "\n",
"run_id = response.json()[\"Id\"]\n", "run_id = response.json().get('Id')\n",
"print(run_id)" "print('Submitted pipeline run: ', run_id)"
] ]
}, },
{ {
@@ -522,6 +537,24 @@
"run_id = pipeline_endpoint_by_name.submit(\"NewName\")\n", "run_id = pipeline_endpoint_by_name.submit(\"NewName\")\n",
"print(run_id)" "print(run_id)"
] ]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Use Experiment.Submit() to Submit Pipeline\n",
"Run a specific version of the pipeline endpoint using the Experiment.submit API"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Experiment\n",
"pipeline_run = Experiment(ws, name=\"submit_from_endpoint\").submit(pipeline_endpoint_by_name, tags={'endpoint_tag': \"1\"}, pipeline_version=\"0\")"
]
} }
], ],
"metadata": { "metadata": {
@@ -560,7 +593,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.2" "version": "3.6.7"
}, },
"order_index": 12, "order_index": 12,
"tags": [ "tags": [


@@ -366,8 +366,15 @@
"\n", "\n",
"rest_endpoint = published_pipeline.endpoint\n", "rest_endpoint = published_pipeline.endpoint\n",
"\n", "\n",
"print(\"You can perform HTTP POST on URL {} to trigger this pipeline\".format(rest_endpoint))\n", "print(\"You can perform HTTP POST on URL {} to trigger this pipeline\".format(rest_endpoint))"
"\n", ]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# specify the param when running the pipeline\n", "# specify the param when running the pipeline\n",
"response = requests.post(rest_endpoint, \n", "response = requests.post(rest_endpoint, \n",
" headers=aad_token, \n", " headers=aad_token, \n",
@@ -381,9 +388,24 @@
" },\n", " },\n",
" \"ParameterAssignments\": {\"input_string\": \"sample_string3\"}\n", " \"ParameterAssignments\": {\"input_string\": \"sample_string3\"}\n",
" }\n", " }\n",
" )\n", " )"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" response.raise_for_status()\n",
"except Exception: \n",
" raise Exception('Received bad response from the endpoint: {}\\n'\n",
" 'Response Code: {}\\n'\n",
" 'Headers: {}\\n'\n",
" 'Content: {}'.format(rest_endpoint, response.status_code, response.headers, response.content))\n",
"\n", "\n",
"run_id = response.json()[\"Id\"]\n", "run_id = response.json().get('Id')\n",
"print('Submitted pipeline run: ', run_id)" "print('Submitted pipeline run: ', run_id)"
] ]
}, },


@@ -76,7 +76,7 @@
"from azureml.core.runconfig import RunConfiguration\n", "from azureml.core.runconfig import RunConfiguration\n",
"from azureml.core.conda_dependencies import CondaDependencies\n", "from azureml.core.conda_dependencies import CondaDependencies\n",
"\n", "\n",
"from azureml.train.automl import AutoMLStep\n", "from azureml.train.automl.runtime import AutoMLStep\n",
"\n", "\n",
"# Check core SDK version number\n", "# Check core SDK version number\n",
"print(\"SDK version:\", azureml.core.VERSION)" "print(\"SDK version:\", azureml.core.VERSION)"


@@ -822,7 +822,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.train.automl import AutoMLStep\n", "from azureml.train.automl.runtime import AutoMLStep\n",
"\n", "\n",
"trainWithAutomlStep = AutoMLStep(\n", "trainWithAutomlStep = AutoMLStep(\n",
" name='AutoML_Regression',\n", " name='AutoML_Regression',\n",


@@ -15,6 +15,13 @@
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.png)" "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.png)"
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note**: Azure Machine Learning recently released ParallelRunStep for public preview. It parallelizes your workload across many compute nodes without the difficulty of orchestrating worker pools and queues. See the [batch inference notebooks](../../../contrib/batch_inferencing/) for examples on how to get started."
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
@@ -322,7 +329,6 @@
"# Runconfig\n", "# Runconfig\n",
"amlcompute_run_config = RunConfiguration(conda_dependencies=cd)\n", "amlcompute_run_config = RunConfiguration(conda_dependencies=cd)\n",
"amlcompute_run_config.environment.docker.enabled = True\n", "amlcompute_run_config.environment.docker.enabled = True\n",
"amlcompute_run_config.environment.docker.gpu_support = True\n",
"amlcompute_run_config.environment.docker.base_image = DEFAULT_GPU_IMAGE\n", "amlcompute_run_config.environment.docker.base_image = DEFAULT_GPU_IMAGE\n",
"amlcompute_run_config.environment.spark.precache_packages = False" "amlcompute_run_config.environment.spark.precache_packages = False"
] ]
@@ -554,8 +560,25 @@
"response = requests.post(rest_endpoint, \n", "response = requests.post(rest_endpoint, \n",
" headers=aad_token, \n", " headers=aad_token, \n",
" json={\"ExperimentName\": \"batch_scoring\",\n", " json={\"ExperimentName\": \"batch_scoring\",\n",
" \"ParameterAssignments\": {\"param_batch_size\": 50}})\n", " \"ParameterAssignments\": {\"param_batch_size\": 50}})"
"run_id = response.json()[\"Id\"]" ]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" response.raise_for_status()\n",
"except Exception: \n",
" raise Exception('Received bad response from the endpoint: {}\\n'\n",
" 'Response Code: {}\\n'\n",
" 'Headers: {}\\n'\n",
" 'Content: {}'.format(rest_endpoint, response.status_code, response.headers, response.content))\n",
"\n",
"run_id = response.json().get('Id')\n",
"print('Submitted pipeline run: ', run_id)"
] ]
}, },
{ {


@@ -16,6 +16,13 @@
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/pipeline-style-transfer/pipeline-style-transfer.png)" "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/pipeline-style-transfer/pipeline-style-transfer.png)"
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note**: Azure Machine Learning recently released ParallelRunStep for public preview. It parallelizes your workload across many compute nodes without the difficulty of orchestrating worker pools and queues. See the [batch inference notebooks](../../../contrib/batch_inferencing/) for examples on how to get started."
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
@@ -276,7 +283,6 @@
"# Runconfig\n", "# Runconfig\n",
"amlcompute_run_config = RunConfiguration(conda_dependencies=cd)\n", "amlcompute_run_config = RunConfiguration(conda_dependencies=cd)\n",
"amlcompute_run_config.environment.docker.enabled = True\n", "amlcompute_run_config.environment.docker.enabled = True\n",
"amlcompute_run_config.environment.docker.gpu_support = True\n",
"amlcompute_run_config.environment.docker.base_image = \"pytorch/pytorch\"\n", "amlcompute_run_config.environment.docker.base_image = \"pytorch/pytorch\"\n",
"amlcompute_run_config.environment.spark.precache_packages = False" "amlcompute_run_config.environment.spark.precache_packages = False"
] ]
@@ -538,41 +544,59 @@
"## Send request and monitor" "## Send request and monitor"
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run the pipeline using PipelineParameter values style='candy' and nodecount=2"
]
},
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# run the pipeline using PipelineParameter values style='candy' and nodecount=2\n",
"response = requests.post(rest_endpoint, \n", "response = requests.post(rest_endpoint, \n",
" headers=aad_token,\n", " headers=aad_token,\n",
" json={\"ExperimentName\": \"style_transfer\",\n", " json={\"ExperimentName\": \"style_transfer\",\n",
" \"ParameterAssignments\": {\"style\": \"candy\", \"nodecount\": 2}}) \n", " \"ParameterAssignments\": {\"style\": \"candy\", \"nodecount\": 2}})"
"run_id = response.json()[\"Id\"]\n", ]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" response.raise_for_status()\n",
"except Exception: \n",
" raise Exception('Received bad response from the endpoint: {}\\n'\n",
" 'Response Code: {}\\n'\n",
" 'Headers: {}\\n'\n",
" 'Content: {}'.format(rest_endpoint, response.status_code, response.headers, response.content))\n",
"\n", "\n",
"run_id = response.json().get('Id')\n",
"print('Submitted pipeline run: ', run_id)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.pipeline.core.run import PipelineRun\n", "from azureml.pipeline.core.run import PipelineRun\n",
"published_pipeline_run_candy = PipelineRun(ws.experiments[\"style_transfer\"], run_id)\n", "published_pipeline_run_candy = PipelineRun(ws.experiments[\"style_transfer\"], run_id)\n",
"\n",
"RunDetails(published_pipeline_run_candy).show()" "RunDetails(published_pipeline_run_candy).show()"
] ]
}, },
{ {
"cell_type": "code", "cell_type": "markdown",
"execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [],
"source": [ "source": [
"# run the pipeline using PipelineParameter values style='rain_princess' and nodecount=3\n", "Run the pipeline using PipelineParameter values style='rain_princess' and nodecount=3"
"response = requests.post(rest_endpoint, \n",
" headers=aad_token,\n",
" json={\"ExperimentName\": \"style_transfer\",\n",
" \"ParameterAssignments\": {\"style\": \"rain_princess\", \"nodecount\": 3}}) \n",
"run_id = response.json()[\"Id\"]\n",
"\n",
"published_pipeline_run_rain = PipelineRun(ws.experiments[\"style_transfer\"], run_id)\n",
"\n",
"RunDetails(published_pipeline_run_rain).show()"
] ]
}, },
{ {
@@ -581,15 +605,84 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# run the pipeline using PipelineParameter values style='udnie' and nodecount=4\n",
"response = requests.post(rest_endpoint, \n", "response = requests.post(rest_endpoint, \n",
" headers=aad_token,\n", " headers=aad_token,\n",
" json={\"ExperimentName\": \"style_transfer\",\n", " json={\"ExperimentName\": \"style_transfer\",\n",
" \"ParameterAssignments\": {\"style\": \"udnie\", \"nodecount\": 3}}) \n", " \"ParameterAssignments\": {\"style\": \"rain_princess\", \"nodecount\": 3}})"
"run_id = response.json()[\"Id\"]\n", ]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" response.raise_for_status()\n",
"except Exception: \n",
" raise Exception('Received bad response from the endpoint: {}\\n'\n",
" 'Response Code: {}\\n'\n",
" 'Headers: {}\\n'\n",
" 'Content: {}'.format(rest_endpoint, response.status_code, response.headers, response.content))\n",
"\n", "\n",
"run_id = response.json().get('Id')\n",
"print('Submitted pipeline run: ', run_id)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"published_pipeline_run_rain = PipelineRun(ws.experiments[\"style_transfer\"], run_id)\n",
"RunDetails(published_pipeline_run_rain).show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run the pipeline using PipelineParameter values style='udnie' and nodecount=3"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = requests.post(rest_endpoint, \n",
" headers=aad_token,\n",
" json={\"ExperimentName\": \"style_transfer\",\n",
" \"ParameterAssignments\": {\"style\": \"udnie\", \"nodecount\": 3}})\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" response.raise_for_status()\n",
"except Exception: \n",
" raise Exception('Received bad response from the endpoint: {}\\n'\n",
" 'Response Code: {}\\n'\n",
" 'Headers: {}\\n'\n",
" 'Content: {}'.format(rest_endpoint, response.status_code, response.headers, response.content))\n",
"\n",
"run_id = response.json().get('Id')\n",
"print('Submitted pipeline run: ', run_id)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"published_pipeline_run_udnie = PipelineRun(ws.experiments[\"style_transfer\"], run_id)\n", "published_pipeline_run_udnie = PipelineRun(ws.experiments[\"style_transfer\"], run_id)\n",
"\n",
"RunDetails(published_pipeline_run_udnie).show()" "RunDetails(published_pipeline_run_udnie).show()"
] ]
}, },


@@ -104,7 +104,11 @@
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {}, "metadata": {
"tags": [
"sample-interactiveloginauth-tenantid"
]
},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.core.authentication import InteractiveLoginAuthentication\n", "from azureml.core.authentication import InteractiveLoginAuthentication\n",
@@ -131,7 +135,11 @@
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {}, "metadata": {
"tags": [
"sample-azurecliauth"
]
},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.core.authentication import AzureCliAuthentication\n", "from azureml.core.authentication import AzureCliAuthentication\n",
@@ -168,7 +176,11 @@
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {}, "metadata": {
"tags": [
"sample-msiauth"
]
},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.core.authentication import MsiAuthentication\n", "from azureml.core.authentication import MsiAuthentication\n",
@@ -245,7 +257,11 @@
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {}, "metadata": {
"tags": [
"sample-serviceprincipalauth-tenantid"
]
},
"outputs": [], "outputs": [],
"source": [ "source": [
"import os\n", "import os\n",
@@ -300,7 +316,11 @@
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {}, "metadata": {
"tags": [
"sample-keyvault"
]
},
"outputs": [], "outputs": [],
"source": [ "source": [
"import os, uuid\n", "import os, uuid\n",


@@ -707,7 +707,7 @@
"metadata": { "metadata": {
"authors": [ "authors": [
{ {
"name": "dipeck" "name": "swatig"
} }
], ],
"category": "training", "category": "training",


@@ -166,7 +166,7 @@ def download_data():
from zipfile import ZipFile from zipfile import ZipFile
# download data # download data
data_file = './fowl_data.zip' data_file = './fowl_data.zip'
download_url = 'https://msdocsdatasets.blob.core.windows.net/pytorchfowl/fowl_data.zip' download_url = 'https://azureopendatastorage.blob.core.windows.net/testpublic/temp/fowl_data.zip'
urllib.request.urlretrieve(download_url, filename=data_file) urllib.request.urlretrieve(download_url, filename=data_file)
# extract files # extract files


@@ -174,7 +174,7 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"### Download training data\n", "### Download training data\n",
"The dataset we will use (located on a public blob [here](https://msdocsdatasets.blob.core.windows.net/pytorchfowl/fowl_data.zip) as a zip file) consists of about 120 training images each for turkeys and chickens, with 100 validation images for each class. The images are a subset of the [Open Images v5 Dataset](https://storage.googleapis.com/openimages/web/index.html). We will download and extract the dataset as part of our training script `pytorch_train.py`" "The dataset we will use (located on a public blob [here](https://azureopendatastorage.blob.core.windows.net/testpublic/temp/fowl_data.zip) as a zip file) consists of about 120 training images each for turkeys and chickens, with 100 validation images for each class. The images are a subset of the [Open Images v5 Dataset](https://storage.googleapis.com/openimages/web/index.html). We will download and extract the dataset as part of our training script `pytorch_train.py`"
] ]
}, },
{ {
@@ -698,7 +698,7 @@
"metadata": { "metadata": {
"authors": [ "authors": [
{ {
"name": "ninhu" "name": "swatig"
} }
], ],
"category": "training", "category": "training",


@@ -550,7 +550,7 @@
"metadata": { "metadata": {
"authors": [ "authors": [
{ {
"name": "dipeck" "name": "swatig"
} }
], ],
"category": "training", "category": "training",


@@ -1140,7 +1140,7 @@
"metadata": { "metadata": {
"authors": [ "authors": [
{ {
"name": "ninhu" "name": "swatig"
} }
], ],
"category": "training", "category": "training",


@@ -517,7 +517,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.train.hyperdrive import *\n", "from azureml.train.hyperdrive import RandomParameterSampling, choice, loguniform\n",
"\n", "\n",
"ps = RandomParameterSampling(\n", "ps = RandomParameterSampling(\n",
" {\n", " {\n",
@@ -562,6 +562,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.train.hyperdrive import TruncationSelectionPolicy\n",
"policy = TruncationSelectionPolicy(evaluation_interval=2, truncation_percentage=25)" "policy = TruncationSelectionPolicy(evaluation_interval=2, truncation_percentage=25)"
] ]
}, },
@@ -578,12 +579,13 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.train.hyperdrive import HyperDriveConfig, PrimaryMetricGoal\n",
"htc = HyperDriveConfig(estimator=est, \n", "htc = HyperDriveConfig(estimator=est, \n",
" hyperparameter_sampling=ps, \n", " hyperparameter_sampling=ps, \n",
" policy=policy, \n", " policy=policy, \n",
" primary_metric_name='validation_acc', \n", " primary_metric_name='validation_acc', \n",
" primary_metric_goal=PrimaryMetricGoal.MAXIMIZE, \n", " primary_metric_goal=PrimaryMetricGoal.MAXIMIZE, \n",
" max_total_runs=20,\n", " max_total_runs=15,\n",
" max_concurrent_runs=4)" " max_concurrent_runs=4)"
] ]
}, },
@@ -616,7 +618,6 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.widgets import RunDetails\n",
"RunDetails(htr).show()" "RunDetails(htr).show()"
] ]
}, },
@@ -721,7 +722,6 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.widgets import RunDetails\n",
"RunDetails(warm_start_htr).show()" "RunDetails(warm_start_htr).show()"
] ]
}, },
@@ -820,7 +820,6 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.widgets import RunDetails\n",
"RunDetails(resume_child_runs_htr).show()" "RunDetails(resume_child_runs_htr).show()"
] ]
}, },
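
The hunks above configure `TruncationSelectionPolicy(evaluation_interval=2, truncation_percentage=25)`. A rough sketch of the idea behind truncation selection, assuming the policy cancels the lowest-performing percentage of runs at each evaluation interval. The function below is a hypothetical illustration, not the HyperDrive implementation; the rounding rule and the run names are assumptions.

```python
import math


def truncation_selection(metrics, truncation_percentage, maximize=True):
    """Return (kept, cancelled) run ids after one evaluation interval.

    metrics maps a run id to its latest primary metric value.
    """
    n_cancel = math.floor(len(metrics) * truncation_percentage / 100)
    # Sort worst first: ascending when maximizing, descending when minimizing
    ranked = sorted(metrics, key=metrics.get, reverse=not maximize)
    cancelled = ranked[:n_cancel]
    kept = ranked[n_cancel:]
    return kept, cancelled


# Hypothetical validation accuracies for four concurrent runs
metrics = {'run_a': 0.91, 'run_b': 0.62, 'run_c': 0.85, 'run_d': 0.78}
kept, cancelled = truncation_selection(metrics, truncation_percentage=25)
print('kept:', kept)
print('cancelled:', cancelled)
```

With four concurrent runs and a truncation percentage of 25, one run is cancelled per interval under this rounding assumption, which is consistent with `max_concurrent_runs=4` in the config above.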


@@ -0,0 +1,346 @@
latitude,longitude,temperature,windAngle,windSpeed,elevation
26.536,-81.755,17.8,10.0,2.1,9.0
26.536,-81.755,16.7,360.0,1.5,9.0
26.536,-81.755,16.1,350.0,1.5,9.0
26.536,-81.755,15.0,0.0,0.0,9.0
26.536,-81.755,14.4,350.0,1.5,9.0
26.536,-81.755,0.0,0.0,0.0,9.0
26.536,-81.755,13.9,360.0,2.1,9.0
26.536,-81.755,13.3,350.0,1.5,9.0
26.536,-81.755,13.3,10.0,2.1,9.0
26.536,-81.755,13.3,360.0,1.5,9.0
26.536,-81.755,13.3,0.0,0.0,9.0
26.536,-81.755,12.2,0.0,0.0,9.0
26.536,-81.755,11.7,0.0,0.0,9.0
26.536,-81.755,14.4,0.0,0.0,9.0
26.536,-81.755,17.2,10.0,2.6,9.0
26.536,-81.755,20.0,20.0,2.6,9.0
26.536,-81.755,22.2,10.0,3.6,9.0
26.536,-81.755,23.3,30.0,4.6,9.0
26.536,-81.755,23.3,330.0,2.6,9.0
26.536,-81.755,24.4,0.0,0.0,9.0
26.536,-81.755,25.0,360.0,3.1,9.0
26.536,-81.755,24.4,20.0,4.1,9.0
26.536,-81.755,23.3,10.0,2.6,9.0
26.536,-81.755,21.1,30.0,2.1,9.0
26.536,-81.755,18.3,0.0,0.0,9.0
26.536,-81.755,17.2,30.0,2.1,9.0
26.536,-81.755,15.6,60.0,2.6,9.0
26.536,-81.755,15.6,0.0,0.0,9.0
26.536,-81.755,13.9,60.0,2.6,9.0
26.536,-81.755,12.8,70.0,2.6,9.0
26.536,-81.755,0.0,0.0,0.0,9.0
26.536,-81.755,11.7,70.0,2.1,9.0
26.536,-81.755,12.2,20.0,2.1,9.0
26.536,-81.755,11.7,30.0,1.5,9.0
26.536,-81.755,11.1,40.0,2.1,9.0
26.536,-81.755,12.2,40.0,2.6,9.0
26.536,-81.755,12.2,30.0,2.6,9.0
26.536,-81.755,12.2,0.0,0.0,9.0
26.536,-81.755,15.0,30.0,6.2,9.0
26.536,-81.755,17.2,50.0,3.6,9.0
26.536,-81.755,20.6,60.0,5.1,9.0
26.536,-81.755,22.8,50.0,4.6,9.0
26.536,-81.755,24.4,80.0,6.2,9.0
26.536,-81.755,25.0,100.0,5.7,9.0
26.536,-81.755,25.6,60.0,3.1,9.0
26.536,-81.755,25.6,80.0,4.6,9.0
26.536,-81.755,25.0,90.0,5.1,9.0
26.536,-81.755,24.4,80.0,5.1,9.0
26.536,-81.755,21.1,60.0,2.6,9.0
26.536,-81.755,19.4,70.0,3.6,9.0
26.536,-81.755,18.3,70.0,2.6,9.0
26.536,-81.755,18.3,80.0,2.6,9.0
26.536,-81.755,17.2,60.0,1.5,9.0
26.536,-81.755,16.1,70.0,2.6,9.0
26.536,-81.755,15.6,70.0,2.6,9.0
26.536,-81.755,0.0,0.0,0.0,9.0
26.536,-81.755,16.1,50.0,2.6,9.0
26.536,-81.755,15.6,50.0,2.1,9.0
26.536,-81.755,15.0,50.0,1.5,9.0
26.536,-81.755,15.0,0.0,0.0,9.0
26.536,-81.755,15.0,0.0,0.0,9.0
26.536,-81.755,14.4,0.0,0.0,9.0
26.536,-81.755,14.4,30.0,4.1,9.0
26.536,-81.755,16.1,40.0,1.5,9.0
26.536,-81.755,19.4,0.0,1.5,9.0
26.536,-81.755,22.8,90.0,2.6,9.0
26.536,-81.755,24.4,130.0,3.6,9.0
26.536,-81.755,25.6,100.0,4.6,9.0
26.536,-81.755,26.1,120.0,3.1,9.0
26.536,-81.755,26.7,0.0,2.6,9.0
26.536,-81.755,27.2,0.0,0.0,9.0
26.536,-81.755,27.2,40.0,3.1,9.0
26.536,-81.755,26.1,30.0,1.5,9.0
26.536,-81.755,22.8,310.0,2.1,9.0
26.536,-81.755,23.3,330.0,2.1,9.0
-34.067,-56.238,17.5,30.0,3.1,68.0
-34.067,-56.238,21.2,30.0,5.7,68.0
-34.067,-56.238,24.5,30.0,3.1,68.0
-34.067,-56.238,27.5,330.0,3.6,68.0
-34.067,-56.238,29.2,30.0,4.1,68.0
-34.067,-56.238,31.0,20.0,4.6,68.0
-34.067,-56.238,33.0,360.0,2.6,68.0
-34.067,-56.238,33.6,60.0,3.1,68.0
-34.067,-56.238,33.6,30.0,3.6,68.0
-34.067,-56.238,18.6,40.0,3.1,68.0
-34.067,-56.238,22.0,120.0,1.5,68.0
-34.067,-56.238,25.0,120.0,2.6,68.0
-34.067,-56.238,28.6,50.0,3.1,68.0
-34.067,-56.238,30.6,50.0,4.1,68.0
-34.067,-56.238,31.5,30.0,6.7,68.0
-34.067,-56.238,32.0,40.0,7.2,68.0
-34.067,-56.238,33.0,30.0,5.7,68.0
-34.067,-56.238,33.2,360.0,3.6,68.0
-34.067,-56.238,20.6,30.0,3.1,68.0
-34.067,-56.238,21.2,0.0,0.0,68.0
-34.067,-56.238,22.0,210.0,3.1,68.0
-34.067,-56.238,23.0,210.0,3.6,68.0
-34.067,-56.238,24.0,180.0,6.7,68.0
-34.067,-56.238,24.5,210.0,7.2,68.0
-34.067,-56.238,21.0,180.0,8.2,68.0
-34.067,-56.238,20.0,180.0,6.7,68.0
-34.083,-56.233,20.2,180.0,7.2,68.0
-29.917,-71.2,16.6,290.0,4.1,146.0
-29.916,-71.2,17.0,290.0,4.1,147.0
-29.916,-71.2,16.0,310.0,3.1,147.0
-29.916,-71.2,16.0,300.0,2.1,147.0
-29.917,-71.2,15.1,0.0,0.0,146.0
-29.916,-71.2,15.0,0.0,1.0,147.0
-29.916,-71.2,15.0,160.0,1.0,147.0
-29.916,-71.2,15.0,120.0,1.0,147.0
-29.917,-71.2,14.3,190.0,1.0,146.0
-29.916,-71.2,14.0,190.0,1.0,147.0
-29.916,-71.2,14.0,0.0,0.0,147.0
-29.916,-71.2,14.0,100.0,3.1,147.0
-29.917,-71.2,12.9,0.0,0.0,146.0
-29.916,-71.2,13.0,0.0,1.0,147.0
-29.916,-71.2,14.0,0.0,0.5,147.0
-29.916,-71.2,15.0,0.0,0.5,147.0
-29.917,-71.2,15.9,0.0,0.0,146.0
-29.916,-71.2,16.0,0.0,0.0,147.0
-29.916,-71.2,17.0,270.0,4.6,147.0
-29.916,-71.2,19.0,260.0,4.1,147.0
-29.917,-71.2,18.1,270.0,6.2,146.0
-29.916,-71.2,18.0,270.0,6.2,147.0
-29.916,-71.2,19.0,270.0,6.2,147.0
-29.916,-71.2,20.0,260.0,5.1,147.0
-29.917,-71.2,19.6,280.0,6.2,146.0
-29.916,-71.2,20.0,280.0,6.2,147.0
-29.916,-71.2,20.0,270.0,6.2,147.0
-29.916,-71.2,19.0,280.0,6.7,147.0
-29.917,-71.2,18.3,270.0,5.7,146.0
-29.916,-71.2,18.0,270.0,5.7,147.0
-29.916,-71.2,18.0,0.0,0.0,147.0
-29.916,-71.2,17.0,280.0,4.6,147.0
-29.917,-71.2,15.9,280.0,4.1,146.0
-29.916,-71.2,16.0,280.0,4.1,147.0
-29.916,-71.2,15.0,280.0,3.6,147.0
-29.916,-71.2,15.0,280.0,3.6,147.0
-29.917,-71.2,15.4,280.0,4.1,146.0
-29.916,-71.2,15.0,280.0,4.1,147.0
-29.916,-71.2,16.0,240.0,2.1,147.0
-29.916,-71.2,15.0,0.0,0.5,147.0
-29.917,-71.2,15.8,80.0,3.6,146.0
-29.916,-71.2,16.0,80.0,3.6,147.0
-29.916,-71.2,16.0,10.0,1.5,147.0
-29.916,-71.2,16.0,100.0,1.5,147.0
-29.917,-71.2,15.3,130.0,1.5,146.0
-29.916,-71.2,15.0,130.0,1.5,147.0
-29.916,-71.2,15.0,110.0,1.0,147.0
-29.916,-71.2,16.0,280.0,6.2,147.0
-29.917,-71.2,15.9,240.0,3.6,146.0
-29.916,-71.2,16.0,240.0,3.6,147.0
-29.916,-71.2,16.0,240.0,3.1,147.0
-29.916,-71.2,16.0,220.0,3.1,147.0
-29.917,-71.2,16.4,260.0,3.1,146.0
-29.916,-71.2,16.0,260.0,3.1,147.0
-29.916,-71.2,17.0,230.0,2.6,147.0
-29.916,-71.2,18.0,0.0,1.5,147.0
-29.917,-71.2,20.3,340.0,2.6,146.0
-29.916,-71.2,20.0,340.0,2.6,147.0
-29.916,-71.2,21.0,270.0,5.1,147.0
-29.916,-71.2,20.0,270.0,6.7,147.0
-29.917,-71.2,19.2,280.0,6.7,146.0
-29.916,-71.2,19.0,280.0,6.7,147.0
-29.916,-71.2,19.0,310.0,2.6,147.0
-29.916,-71.2,18.0,270.0,5.1,147.0
-29.917,-71.2,17.0,300.0,4.6,146.0
-29.916,-71.2,17.0,300.0,4.6,147.0
-29.916,-71.2,17.0,300.0,3.6,147.0
-29.916,-71.2,17.0,290.0,3.1,147.0
-29.917,-71.2,16.3,290.0,2.1,146.0
-29.916,-71.2,16.0,290.0,2.1,147.0
-29.916,-71.2,17.0,270.0,1.0,147.0
-29.916,-71.2,17.0,0.0,0.5,147.0
-29.917,-71.2,16.5,160.0,2.1,146.0
-29.916,-71.2,17.0,160.0,2.1,147.0
-29.916,-71.2,15.0,120.0,3.1,147.0
-29.916,-71.2,16.0,180.0,1.5,147.0
-29.917,-71.2,14.7,0.0,0.0,146.0
-29.916,-71.2,15.0,0.0,1.0,147.0
-29.916,-71.2,15.0,300.0,1.0,147.0
-29.916,-71.2,16.0,0.0,0.0,147.0
-29.917,-71.2,18.5,110.0,1.0,146.0
-29.916,-71.2,19.0,110.0,1.0,147.0
-29.916,-71.2,20.0,270.0,3.6,147.0
-29.916,-71.2,20.0,270.0,5.7,147.0
-29.917,-71.2,20.0,280.0,6.2,146.0
-29.916,-71.2,20.0,280.0,6.2,147.0
-29.916,-71.2,21.0,290.0,6.7,147.0
-29.916,-71.2,20.0,270.0,6.2,147.0
-29.917,-71.2,21.0,260.0,6.7,146.0
-29.916,-71.2,21.0,260.0,6.7,147.0
-29.916,-71.2,20.0,270.0,6.2,147.0
-29.916,-71.2,19.0,260.0,5.1,147.0
-29.916,-71.2,18.0,280.0,4.6,147.0
-29.917,-71.2,17.5,280.0,3.1,146.0
-29.916,-71.2,18.0,280.0,3.1,147.0
30.349,-85.788,11.1,0.0,0.0,21.0
30.349,-85.788,11.1,0.0,0.0,21.0
30.349,-85.788,9.4,0.0,0.0,21.0
30.349,-85.788,9.4,0.0,0.0,21.0
30.349,-85.788,8.3,300.0,2.1,21.0
30.349,-85.788,11.1,280.0,1.5,21.0
30.349,-85.788,0.0,0.0,0.0,21.0
30.349,-85.788,10.6,320.0,3.1,21.0
30.349,-85.788,9.4,310.0,3.1,21.0
30.349,-85.788,7.8,320.0,2.6,21.0
30.349,-85.788,6.1,340.0,2.1,21.0
30.349,-85.788,6.7,330.0,2.6,21.0
30.349,-85.788,6.1,310.0,1.5,21.0
30.349,-85.788,7.2,310.0,2.1,21.0
30.349,-85.788,12.8,360.0,3.1,21.0
30.349,-85.788,15.0,0.0,3.1,21.0
30.349,-85.788,16.7,20.0,4.6,21.0
30.349,-85.788,18.9,30.0,5.1,21.0
30.349,-85.788,19.4,10.0,4.1,21.0
30.349,-85.788,21.1,330.0,2.6,21.0
30.349,-85.788,21.1,10.0,4.6,21.0
30.349,-85.788,21.7,360.0,4.1,21.0
30.349,-85.788,21.7,30.0,2.1,21.0
30.349,-85.788,21.7,330.0,2.6,21.0
30.349,-85.788,16.1,350.0,2.1,21.0
30.349,-85.788,11.7,0.0,0.0,21.0
30.349,-85.788,8.9,0.0,0.0,21.0
30.349,-85.788,9.4,0.0,0.0,21.0
30.349,-85.788,7.8,0.0,0.0,21.0
30.349,-85.788,11.1,30.0,3.1,21.0
30.349,-85.788,7.2,0.0,0.0,21.0
30.349,-85.788,7.2,0.0,0.0,21.0
30.349,-85.788,0.0,0.0,0.0,21.0
30.349,-85.788,7.8,30.0,2.1,21.0
30.349,-85.788,8.3,40.0,2.6,21.0
30.349,-85.788,7.2,50.0,1.5,21.0
30.349,-85.788,8.3,60.0,1.5,21.0
30.349,-85.788,5.6,40.0,2.1,21.0
30.349,-85.788,6.7,40.0,2.1,21.0
30.349,-85.788,7.8,50.0,3.1,21.0
30.349,-85.788,11.7,70.0,2.6,21.0
30.349,-85.788,15.6,70.0,3.1,21.0
30.349,-85.788,18.9,100.0,3.6,21.0
30.349,-85.788,20.0,130.0,3.6,21.0
30.349,-85.788,21.1,140.0,4.1,21.0
30.349,-85.788,21.7,150.0,4.1,21.0
30.349,-85.788,21.7,170.0,3.1,21.0
30.349,-85.788,22.2,170.0,3.1,21.0
30.349,-85.788,20.6,0.0,0.0,21.0
30.349,-85.788,17.2,0.0,0.0,21.0
30.349,-85.788,14.4,0.0,0.0,21.0
30.349,-85.788,12.8,100.0,1.5,21.0
30.349,-85.788,13.3,100.0,1.5,21.0
30.349,-85.788,10.6,0.0,0.0,21.0
30.349,-85.788,9.4,0.0,0.0,21.0
30.349,-85.788,7.8,0.0,0.0,21.0
30.358,-85.799,8.3,0.0,0.0,21.0
30.349,-85.788,0.0,0.0,0.0,21.0
30.358,-85.799,6.7,0.0,0.0,21.0
30.358,-85.799,7.2,0.0,0.0,21.0
30.358,-85.799,7.2,0.0,0.0,21.0
30.358,-85.799,8.3,50.0,1.5,21.0
30.358,-85.799,9.4,0.0,0.0,21.0
30.358,-85.799,8.9,0.0,0.0,21.0
30.358,-85.799,10.0,340.0,1.5,21.0
30.358,-85.799,12.8,40.0,1.5,21.0
30.358,-85.799,16.7,100.0,2.1,21.0
30.358,-85.799,21.1,100.0,1.5,21.0
30.358,-85.799,23.3,0.0,0.0,21.0
30.358,-85.799,25.0,180.0,4.6,21.0
30.358,-85.799,24.4,230.0,3.6,21.0
30.358,-85.799,25.0,210.0,4.1,21.0
30.358,-85.799,23.9,170.0,4.1,21.0
30.358,-85.799,22.8,0.0,0.0,21.0
30.358,-85.799,19.4,0.0,0.0,21.0
30.358,-85.799,17.8,140.0,2.1,21.0
60.383,5.333,-0.7,0.0,0.0,36.0
60.383,5.333,0.6,270.0,2.0,36.0
60.383,5.333,-0.9,120.0,1.0,36.0
60.383,5.333,-1.6,130.0,2.0,36.0
60.383,5.333,-1.4,150.0,1.0,36.0
60.383,5.333,-1.7,0.0,0.0,36.0
60.383,5.333,-1.7,140.0,1.0,36.0
60.383,5.333,-1.4,0.0,0.0,36.0
60.383,5.333,-1.0,0.0,0.0,36.0
60.383,5.333,-1.0,150.0,1.0,36.0
60.383,5.333,-0.7,140.0,1.0,36.0
60.383,5.333,0.5,150.0,1.0,36.0
60.383,5.333,1.9,0.0,0.0,36.0
60.383,5.333,1.7,0.0,0.0,36.0
60.383,5.333,2.1,310.0,2.0,36.0
60.383,5.333,1.5,90.0,1.0,36.0
60.383,5.333,1.9,290.0,1.0,36.0
60.383,5.333,2.0,320.0,1.0,36.0
60.383,5.333,1.9,330.0,1.0,36.0
60.383,5.333,1.3,350.0,1.0,36.0
60.383,5.333,1.5,120.0,1.0,36.0
60.383,5.333,1.3,150.0,2.0,36.0
60.383,5.333,0.8,140.0,1.0,36.0
60.383,5.333,0.3,300.0,1.0,36.0
60.383,5.333,0.2,140.0,1.0,36.0
60.383,5.333,0.4,140.0,1.0,36.0
60.383,5.333,0.5,320.0,1.0,36.0
60.383,5.333,1.5,330.0,1.0,36.0
60.383,5.333,1.8,40.0,1.0,36.0
60.383,5.333,2.3,170.0,1.0,36.0
60.383,5.333,2.7,140.0,1.0,36.0
60.383,5.333,3.1,330.0,1.0,36.0
60.383,5.333,3.8,350.0,1.0,36.0
60.383,5.333,3.8,140.0,1.0,36.0
60.383,5.333,4.1,150.0,1.0,36.0
60.383,5.333,4.4,180.0,1.0,36.0
60.383,5.333,4.9,300.0,1.0,36.0
60.383,5.333,5.2,320.0,1.0,36.0
60.383,5.333,6.7,340.0,1.0,36.0
60.383,5.333,6.9,250.0,1.0,36.0
60.383,5.333,7.9,300.0,2.0,36.0
60.383,5.333,5.5,140.0,1.0,36.0
60.383,5.333,7.1,140.0,2.0,36.0
60.383,5.333,7.0,280.0,2.0,36.0
60.383,5.333,4.6,170.0,1.0,36.0
60.383,5.333,4.8,330.0,1.0,36.0
60.383,5.333,6.4,260.0,2.0,36.0
60.383,5.333,6.2,340.0,1.0,36.0
60.383,5.333,5.7,320.0,2.0,36.0
60.383,5.333,5.2,100.0,1.0,36.0
60.383,5.333,5.1,310.0,1.0,36.0
60.383,5.333,4.9,290.0,2.0,36.0
60.383,5.333,4.9,310.0,2.0,36.0
60.383,5.333,6.1,320.0,2.0,36.0
60.383,5.333,7.0,250.0,1.0,36.0
60.383,5.333,5.3,140.0,1.0,36.0
60.383,5.333,6.9,350.0,1.0,36.0
60.383,5.333,9.7,110.0,3.0,36.0
60.383,5.333,10.3,300.0,3.0,36.0
60.383,5.333,8.7,310.0,1.0,36.0
60.383,5.333,9.0,270.0,3.0,36.0
60.383,5.333,11.6,80.0,3.0,36.0
60.383,5.333,11.4,80.0,4.0,36.0
60.383,5.333,9.7,70.0,5.0,36.0
60.383,5.333,9.5,80.0,6.0,36.0
60.383,5.333,8.7,80.0,5.0,36.0
60.383,5.333,7.7,80.0,5.0,36.0
60.383,5.333,8.2,80.0,4.0,36.0
60.383,5.333,7.7,30.0,1.0,36.0
60.383,5.333,7.2,310.0,1.0,36.0
60.383,5.333,6.8,300.0,2.0,36.0
60.383,5.333,6.7,140.0,1.0,36.0
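
Several rows in the sample above report 0.0 for temperature, wind angle, and wind speed at once (for example `26.536,-81.755,0.0,0.0,0.0,9.0`). A minimal sketch of separating such rows with the standard library follows. Treating them as placeholder readings is an assumption, not something the data states, and `split_all_zero_rows` is a hypothetical helper name.

```python
import csv
import io

# Four rows copied from the sample above, plus the header.
SAMPLE = """latitude,longitude,temperature,windAngle,windSpeed,elevation
26.536,-81.755,15.0,0.0,0.0,9.0
26.536,-81.755,14.4,350.0,1.5,9.0
26.536,-81.755,0.0,0.0,0.0,9.0
60.383,5.333,-0.7,0.0,0.0,36.0
"""

MEASURED = ("temperature", "windAngle", "windSpeed")


def split_all_zero_rows(csv_text):
    """Return (kept, flagged); flagged rows have every measured field equal to 0.0.

    Assumption: an all-zero reading is a placeholder rather than a real observation.
    """
    kept, flagged = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        values = {name: float(row[name]) for name in row}
        if all(values[name] == 0.0 for name in MEASURED):
            flagged.append(values)
        else:
            kept.append(values)
    return kept, flagged


kept, flagged = split_all_zero_rows(SAMPLE)
print(len(kept), "rows kept,", len(flagged), "rows flagged")
```

Rows with calm wind but a non-zero temperature are kept, since only the combination of all three zeros is flagged here.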

File diff suppressed because it is too large

View File

@@ -100,7 +100,7 @@
"\n", "\n",
"# Check core SDK version number\n", "# Check core SDK version number\n",
"\n", "\n",
"print(\"This notebook was created using SDK version 1.0.74.1, you are currently running version\", azureml.core.VERSION)" "print(\"This notebook was created using SDK version 1.0.79, you are currently running version\", azureml.core.VERSION)"
] ]
}, },
{ {
@@ -542,7 +542,9 @@
"compute": [ "compute": [
"None" "None"
], ],
"datasets": [], "datasets": [
"None"
],
"deployment": [ "deployment": [
"None" "None"
], ],
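
The version-check cell in the diff above only prints the two SDK versions side by side. A minimal sketch of comparing them follows, assuming purely numeric dotted versions such as `1.0.74.1`; `parse_version` and `is_older` are hypothetical helpers, and the `running` value stands in for `azureml.core.VERSION`.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '1.0.74.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_older(running, created_with):
    """True when the running version sorts before the one the notebook was created with."""
    return parse_version(running) < parse_version(created_with)


created_with = "1.0.79"   # version named in the updated cell above
running = "1.0.74.1"      # stand-in for azureml.core.VERSION

if is_older(running, created_with):
    print("Running SDK", running, "is older than", created_with)
```

Versions with non-numeric parts (pre-release suffixes, for instance) would raise `ValueError` in this sketch, and trailing zero components are not normalized.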

View File

@@ -1131,7 +1131,7 @@
"metadata": { "metadata": {
"authors": [ "authors": [
{ {
"name": "ninhu" "name": "swatig"
} }
], ],
"category": "training", "category": "training",

View File

@@ -63,7 +63,6 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.core import Workspace\n", "from azureml.core import Workspace\n",
"\n",
"ws = Workspace.from_config()\n", "ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\\n')" "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\\n')"
] ]
@@ -258,6 +257,16 @@
"metrics = run.get_metrics()\n", "metrics = run.get_metrics()\n",
"print(metrics)" "print(metrics)"
] ]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# register the generated model\n",
"model = run.register_model(model_name='iris.model', model_path='outputs/iris.model')\n"
]
} }
], ],
"metadata": { "metadata": {
@@ -297,7 +306,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.6.7" "version": "3.6.2"
}, },
"tags": [ "tags": [
"None" "None"

View File

@@ -76,6 +76,8 @@ train, test = data.randomSplit([0.70, 0.30])
lr = pyspark.ml.classification.LogisticRegression(regParam=reg) lr = pyspark.ml.classification.LogisticRegression(regParam=reg)
model = lr.fit(train) model = lr.fit(train)
model.save(os.path.join("outputs", "iris.model"))
# predict on the test set # predict on the test set
prediction = model.transform(test) prediction = model.transform(test)
print("Prediction") print("Prediction")

View File

@@ -685,7 +685,7 @@
"framework": [ "framework": [
"None" "None"
], ],
"friendly_name": "", "friendly_name": "Train and deploy a model using Python SDK",
"index_order": 1, "index_order": 1,
"kernelspec": { "kernelspec": {
"display_name": "Python 3.6", "display_name": "Python 3.6",

View File

@@ -27,6 +27,7 @@
"\n", "\n",
"1. [Introduction](#Introduction)\n", "1. [Introduction](#Introduction)\n",
"1. [Setup](#Setup)\n", "1. [Setup](#Setup)\n",
"1. [Use curated environments](#Use-curated-environments)\n",
"1. [Create environment](#Create-environment)\n", "1. [Create environment](#Create-environment)\n",
" 1. Add Python packages\n", " 1. Add Python packages\n",
" 1. Specify environment variables\n", " 1. Specify environment variables\n",
@@ -36,6 +37,8 @@
"1. [Other ways to create environments](#Other-ways-to-create-environments)\n", "1. [Other ways to create environments](#Other-ways-to-create-environments)\n",
" 1. From existing Conda environment\n", " 1. From existing Conda environment\n",
" 1. From Conda or pip files\n", " 1. From Conda or pip files\n",
"1. [Estimators and environments](#Estimators-and-environments) \n",
"1. [Using environments for inferencing](#Using-environments-for-inferencing)\n",
"1. [Docker settings](#Docker-settings)\n", "1. [Docker settings](#Docker-settings)\n",
"1. [Spark and Azure Databricks settings](#Spark-and-Azure-Databricks-settings)\n", "1. [Spark and Azure Databricks settings](#Spark-and-Azure-Databricks-settings)\n",
"1. [Next steps](#Next-steps)\n", "1. [Next steps](#Next-steps)\n",
@@ -84,7 +87,57 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Create environment\n", "## Use curated environments\n",
"\n",
"Curated environments are provided by Azure Machine Learning and are available in your workspace by default. They contain collections of Python packages and settings to help you get started with different machine learning frameworks. \n",
"\n",
" * The __AzureML-Minimal__ environment contains a minimal set of packages to enable run tracking and asset uploading. You can use it as a starting point for your own environment.\n",
" * The __AzureML-Tutorial__ environment contains common data science packages, such as Scikit-Learn, Pandas and Matplotlib, and a larger set of azureml-sdk packages.\n",
" \n",
"Curated environments are backed by cached Docker images, reducing the run preparation cost.\n",
" \n",
"You can get a curated environment using ```Environment.get```:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Environment\n",
"\n",
"curated_env = Environment.get(workspace=ws, name=\"AzureML-Minimal\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To list curated environments, use the following code.\n",
"\n",
"**Note**: The name prefixes _AzureML_ and _Microsoft_ are reserved for curated environments. Do not use them for your own environments."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"envs = Environment.list(workspace=ws)\n",
"\n",
"for env in envs:\n",
"    if env.startswith(\"AzureML\"):\n",
"        print(\"Name\", env)\n",
"        print(\"packages\", envs[env].python.conda_dependencies.serialize_to_string())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create your own environment\n",
"\n", "\n",
"You can create an environment by instantiating ```Environment``` object and then setting its attributes: set of Python packages, environment variables and others.\n", "You can create an environment by instantiating ```Environment``` object and then setting its attributes: set of Python packages, environment variables and others.\n",
"\n", "\n",
@@ -96,10 +149,13 @@
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {}, "metadata": {
"tags": [
"condadependencies-remarks-sample"
]
},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.core import Environment\n",
"from azureml.core.environment import CondaDependencies\n", "from azureml.core.environment import CondaDependencies\n",
"\n", "\n",
"myenv = Environment(name=\"myenv\")\n", "myenv = Environment(name=\"myenv\")\n",
@@ -117,7 +173,11 @@
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {}, "metadata": {
"tags": [
"condadependencies-remarks-sample2"
]
},
"outputs": [], "outputs": [],
"source": [ "source": [
"conda_dep.add_pip_package(\"pillow==5.4.1\")\n", "conda_dep.add_pip_package(\"pillow==5.4.1\")\n",
@@ -185,6 +245,22 @@
"run.wait_for_completion(show_output=True)" "run.wait_for_completion(show_output=True)"
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To audit the environment used for a run, you can use ```get_environment```."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run.get_environment()"
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
@@ -256,6 +332,48 @@
"```\n" "```\n"
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Estimators and environments\n",
"\n",
"[Estimators](https://docs.microsoft.com/azure/machine-learning/service/how-to-train-ml-models) are backed by environments that define the base images, Python packages and other settings for the training environment. \n",
"\n",
"For example, to see the environment behind PyTorch Estimator, you can create a dummy instance of the Estimator, and look at the ```run_config.environment``` property."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.train.dnn import PyTorch\n",
"\n",
"pt = PyTorch(source_directory=\".\", compute_target=\"local\")\n",
"pt.run_config.environment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using environments for inferencing\n",
"\n",
"You can re-use the training environment when you deploy your model as a web service, by specifying the inferencing stack version and adding the environment to ```InferenceConfig```.\n",
"\n",
"```\n",
"from azureml.core.model import InferenceConfig\n",
"\n",
"myenv.inferencing_stack_version = \"latest\"\n",
"\n",
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)\n",
"```\n",
"\n",
"See [Register Model and deploy as Webservice Notebook](../../deployment/deploy-to-cloud/model-register-and-deploy.ipynb) for an end-to-end example of web service deployment."
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
@@ -299,7 +417,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"You can also specify whether to use GPU or shared volumes, and shm size." "You can also specify shared volumes, and shm size."
] ]
}, },
{ {
@@ -308,7 +426,6 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"myenv.docker.gpu_support\n",
"myenv.docker.shared_volumes\n", "myenv.docker.shared_volumes\n",
"myenv.docker.shm_size" "myenv.docker.shm_size"
] ]
@@ -336,7 +453,7 @@
"\n", "\n",
"Learn more about registering and deploying a model:\n", "Learn more about registering and deploying a model:\n",
"\n", "\n",
"* [Model Register and Deploy](../../deploy-to-cloud/model-register-and-deploy.ipynb)" "* [Register Model and deploy as Webservice](../../deployment/deploy-to-cloud/model-register-and-deploy.ipynb)"
] ]
}, },
{ {

View File

@@ -10,7 +10,7 @@ With Azure Machine Learning datasets, you can:
## Learn how to use Azure Machine Learning datasets ## Learn how to use Azure Machine Learning datasets
* [Create and register datasets](https://aka.ms/azureml/howto/createdatasets) * [Create and register datasets](https://aka.ms/azureml/howto/createdatasets)
* Use [Datasets in training](datasets-tutorial/train-with-datasets.ipynb) * Use [Datasets in training](datasets-tutorial/train-with-datasets/train-with-datasets.ipynb)
* Use TabularDatasets in [automated machine learning training](https://aka.ms/automl-dataset) * Use TabularDatasets in [automated machine learning training](https://aka.ms/automl-dataset)
* Use FileDatasets in [image classification](https://aka.ms/filedataset-samplenotebook) * Use FileDatasets in [image classification](https://aka.ms/filedataset-samplenotebook)
* Use FileDatasets in [deep learning with hyperparameter tuning](https://aka.ms/filedataset-hyperdrive) * Use FileDatasets in [deep learning with hyperparameter tuning](https://aka.ms/filedataset-hyperdrive)

View File

@@ -414,7 +414,7 @@
], ],
"category": "tutorial", "category": "tutorial",
"compute": [ "compute": [
"remote" "Remote"
], ],
"datasets": [ "datasets": [
"NOAA" "NOAA"

View File

@@ -0,0 +1,403 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/work-with-data/datasets-tutorial/labeled-datasets/labeled-datasets.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introduction to labeled datasets\n",
"\n",
"Labeled datasets are output from Azure Machine Learning [labeling projects](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-create-labeling-projects). It captures the reference to the data (e.g. image files) and its labels. \n",
"\n",
"This tutorial introduces the capabilities of labeled datasets and how to use it in training.\n",
"\n",
"Learn how-to:\n",
"\n",
"> * Set up your development environment\n",
"> * Explore labeled datasets\n",
"> * Train a simple deep learning neural network on a remote cluster\n",
"\n",
"## Prerequisite:\n",
"* Understand the [architecture and terms](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture) introduced by Azure Machine Learning\n",
"* Go through Azure Machine Learning [labeling projects](https://docs.microsoft.com/azure/machine-learning/service/how-to-create-labeling-projects) and export the labels as an Azure Machine Learning dataset\n",
"* Go through the [configuration notebook](../../../configuration.ipynb) to:\n",
" * install the latest version of azureml-sdk\n",
" * install the latest version of azureml-contrib-dataset\n",
" * install [PyTorch](https://pytorch.org/)\n",
" * create a workspace and its configuration file (`config.json`)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set up your development environment\n",
"\n",
"All the setup for your development work can be accomplished in a Python notebook. Setup includes:\n",
"\n",
"* Importing Python packages\n",
"* Connecting to a workspace to enable communication between your local computer and remote resources\n",
"* Creating an experiment to track all your runs\n",
"* Creating a remote compute target to use for training\n",
"\n",
"### Import packages\n",
"\n",
"Import Python packages you need in this session. Also display the Azure Machine Learning SDK version."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import azureml.core\n",
"import azureml.contrib.dataset\n",
"from azureml.core import Dataset, Workspace, Experiment\n",
"from azureml.contrib.dataset import FileHandlingOption\n",
"\n",
"# check core SDK version number\n",
"print(\"Azure ML SDK Version: \", azureml.core.VERSION)\n",
"print(\"Azure ML Contrib Version\", azureml.contrib.dataset.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Connect to workspace\n",
"\n",
"Create a workspace object from the existing workspace. `Workspace.from_config()` reads the file **config.json** and loads the details into an object named `workspace`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# load workspace\n",
"workspace = Workspace.from_config()\n",
"print('Workspace name: ' + workspace.name, \n",
" 'Azure region: ' + workspace.location, \n",
" 'Subscription id: ' + workspace.subscription_id, \n",
" 'Resource group: ' + workspace.resource_group, sep='\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create experiment and a directory\n",
"\n",
"Create an experiment to track the runs in your workspace and a directory to deliver the necessary code from your computer to the remote resource."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# create an ML experiment\n",
"exp = Experiment(workspace=workspace, name='labeled-datasets')\n",
"\n",
"# create a directory\n",
"script_folder = './labeled-datasets'\n",
"os.makedirs(script_folder, exist_ok=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create or Attach existing compute resource\n",
"By using Azure Machine Learning Compute, a managed service, data scientists can train machine learning models on clusters of Azure virtual machines. Examples include VMs with GPU support. In this tutorial, you will create Azure Machine Learning Compute as your training environment. The code below creates the compute clusters for you if they don't already exist in your workspace.\n",
"\n",
"**Creation of compute takes approximately 5 minutes.** If the AmlCompute with that name is already in your workspace the code will skip the creation process."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
"from azureml.core.compute_target import ComputeTargetException\n",
"\n",
"# choose a name for your cluster\n",
"cluster_name = \"openhack\"\n",
"\n",
"try:\n",
" compute_target = ComputeTarget(workspace=workspace, name=cluster_name)\n",
" print('Found existing compute target')\n",
"except ComputeTargetException:\n",
" print('Creating a new compute target...')\n",
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', \n",
" max_nodes=4)\n",
"\n",
" # create the cluster\n",
" compute_target = ComputeTarget.create(workspace, cluster_name, compute_config)\n",
"\n",
" # can poll for a minimum number of nodes and for a specific timeout. \n",
" # if no min node count is provided it uses the scale settings for the cluster\n",
" compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
"\n",
"# use get_status() to get a detailed status for the current cluster. \n",
"print(compute_target.get_status().serialize())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Explore labeled datasets\n",
"\n",
"**Note**: How to create labeled datasets is not covered in this tutorial. To create labeled datasets, you can go through [labeling projects](https://docs.microsoft.com/azure/machine-learning/service/how-to-create-labeling-projects) and export the output labels as Azure Machine Lerning datasets. \n",
"\n",
"`animal_labels` used in this tutorial section is the output from a labeling project, with the task type of \"Object Identification\"."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get animal_labels dataset from the workspace\n",
"animal_labels = Dataset.get_by_name(workspace, 'animal_labels')\n",
"animal_labels"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can load labeled datasets into pandas DataFrame. There are 3 file handling option that you can choose to load the data files referenced by the labeled datasets:\n",
"* Streaming: The default option to load data files.\n",
"* Download: Download your data files to a local path.\n",
"* Mount: Mount your data files to a mount point. Mount only works for Linux-based compute, including Azure Machine Learning notebook VM and Azure Machine Learning Compute."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"animal_pd = animal_labels.to_pandas_dataframe(file_handling_option=FileHandlingOption.DOWNLOAD, target_path='./download/', overwrite_download=True)\n",
"animal_pd"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"import matplotlib.image as mpimg\n",
"\n",
"# read images from downloaded path\n",
"img = mpimg.imread(animal_pd.loc[0,'image_url'])\n",
"imgplot = plt.imshow(img)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also load labeled datasets into [torchvision datasets](https://pytorch.org/docs/stable/torchvision/datasets.html), so that you can leverage on the open source libraries provided by PyTorch for image transformation and training."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from torchvision.transforms import functional as F\n",
"\n",
"# load animal_labels dataset into torchvision dataset\n",
"pytorch_dataset = animal_labels.to_torchvision()\n",
"img = pytorch_dataset[0][0]\n",
"print(type(img))\n",
"\n",
"# use methods from torchvision to transform the img into grayscale\n",
"pil_image = F.to_pil_image(img)\n",
"gray_image = F.to_grayscale(pil_image, num_output_channels=3)\n",
"\n",
"imgplot = plt.imshow(gray_image)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Train an image classification model\n",
"\n",
" `crack_labels` dataset used in this tutorial section is the output from a labeling project, with the task type of \"Image Classification Multi-class\". We will use this dataset to train an image classification model that classify whether an image has cracks or not."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get crack_labels dataset from the workspace\n",
"crack_labels = Dataset.get_by_name(workspace, 'crack_labels')\n",
"crack_labels"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure Estimator for training\n",
"\n",
"You can ask the system to build a conda environment based on your dependency specification. Once the environment is built, and if you don't change your dependencies, it will be reused in subsequent runs."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Environment\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"\n",
"conda_env = Environment('conda-env')\n",
"conda_env.python.conda_dependencies = CondaDependencies.create(pip_packages=['azureml-sdk',\n",
" 'azureml-contrib-dataset',\n",
" 'torch','torchvision',\n",
" 'azureml-dataprep[pandas]'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An estimator object is used to submit the run. Azure Machine Learning has pre-configured estimators for common machine learning frameworks, as well as generic Estimator. Create a generic estimator for by specifying\n",
"\n",
"* The name of the estimator object, `est`\n",
"* The directory that contains your scripts. All the files in this directory are uploaded into the cluster nodes for execution. \n",
"* The training script name, train.py\n",
"* The input dataset for training\n",
"* The compute target. In this case you will use the AmlCompute you created\n",
"* The environment definition for the experiment"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.train.estimator import Estimator\n",
"\n",
"est = Estimator(source_directory=script_folder, \n",
" entry_script='train.py',\n",
" inputs=[crack_labels.as_named_input('crack_labels')],\n",
" compute_target=compute_target,\n",
" environment_definition= conda_env)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Submit job to run\n",
"\n",
"Submit the estimator to the Azure ML experiment to kick off the execution."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run = exp.submit(est)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run.wait_for_completion(show_output=True)"
]
}
],
"metadata": {
"authors": [
{
"name": "sihhu"
}
],
"category": "tutorial",
"compute": [
"Remote"
],
"deployment": [
"None"
],
"exclude_from_index": false,
"framework": [
"Azure ML"
],
"friendly_name": "Introduction to labeled datasets",
"index_order": 1,
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
},
"nteract": {
"version": "nteract-front-end@1.0.0"
},
"star_tag": [
"featured"
],
"tags": [
"Dataset",
"label",
"Estimator"
],
"task": "Train"
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,106 @@
import os
import torchvision
import torchvision.transforms as transforms
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from azureml.core import Dataset, Run
import azureml.contrib.dataset
from azureml.contrib.dataset import FileHandlingOption, LabeledDatasetTask
run = Run.get_context()
# get input dataset by name
labeled_dataset = run.input_datasets['crack_labels']
pytorch_dataset = labeled_dataset.to_torchvision()
indices = torch.randperm(len(pytorch_dataset)).tolist()
dataset_train = torch.utils.data.Subset(pytorch_dataset, indices[:40])
dataset_test = torch.utils.data.Subset(pytorch_dataset, indices[-10:])
trainloader = torch.utils.data.DataLoader(dataset_train, batch_size=4,
shuffle=True, num_workers=0)
testloader = torch.utils.data.DataLoader(dataset_test, batch_size=4,
shuffle=True, num_workers=0)
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 71 * 71, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(x.size(0), 16 * 71 * 71)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item()
        if i % 5 == 4:  # print every 5 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 5))
            running_loss = 0.0
print('Finished Training')
classes = trainloader.dataset.dataset.labels
PATH = './cifar_net.pth'
torch.save(net.state_dict(), PATH)
dataiter = iter(testloader)
images, labels = next(dataiter)
net = Net()
net.load_state_dict(torch.load(PATH))
outputs = net(images)
_, predicted = torch.max(outputs, 1)
correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
print('Accuracy of the network on the 10 test images: %d %%' % (100 * correct / total))
pass
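
The first fully connected layer in `Net` hard-codes `16 * 71 * 71` input features, which ties the network to a particular input resolution. The following is a minimal sketch, assuming square inputs and the unpadded, stride-1 5x5 convolutions and 2x2 max pooling used in the script above, of how that number follows from the layer arithmetic. The helper names (`conv_out`, `pool_out`, `flattened_features`) and the 296-pixel side length are illustrative assumptions; the script does not state the image size.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a max-pooling layer along one dimension."""
    return (size - kernel) // stride + 1


def flattened_features(size, channels=16):
    """Feature count entering fc1 for a square input of the given side length."""
    size = pool_out(conv_out(size, 5))  # conv1 (5x5) followed by 2x2 pooling
    size = pool_out(conv_out(size, 5))  # conv2 (5x5) followed by 2x2 pooling
    return channels * size * size


if __name__ == "__main__":
    # 296 is a hypothetical side length that is consistent with 16 * 71 * 71
    print(flattened_features(296))
```

If the labeled images have a different resolution, the value passed to `nn.Linear` and `x.view` would need to change accordingly, or the images would need to be resized before training.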

View File

@@ -0,0 +1,35 @@
import os
def convert(imgf, labelf, outf, n):
    f = open(imgf, "rb")
    l = open(labelf, "rb")
    o = open(outf, "w")

    # skip the IDX headers (16 bytes for images, 8 bytes for labels)
    f.read(16)
    l.read(8)
    images = []

    # each row is the label followed by the 28 * 28 pixel values
    for i in range(n):
        image = [ord(l.read(1))]
        for j in range(28 * 28):
            image.append(ord(f.read(1)))
        images.append(image)

    for image in images:
        o.write(",".join(str(pix) for pix in image) + "\n")
    f.close()
    o.close()
    l.close()
mounted_input_path = os.environ['fashion_ds']
mounted_output_path = os.environ['AZUREML_DATAREFERENCE_prepared_fashion_ds']
os.makedirs(mounted_output_path, exist_ok=True)
convert(os.path.join(mounted_input_path, 'train-images-idx3-ubyte'),
os.path.join(mounted_input_path, 'train-labels-idx1-ubyte'),
os.path.join(mounted_output_path, 'mnist_train.csv'), 60000)
convert(os.path.join(mounted_input_path, 't10k-images-idx3-ubyte'),
os.path.join(mounted_input_path, 't10k-labels-idx1-ubyte'),
os.path.join(mounted_output_path, 'mnist_test.csv'), 10000)
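
The conversion above keeps every row in memory and closes its files manually. A possible variant, sketched here under the same assumptions about the IDX layout (16-byte image header, 8-byte label header, 28x28 single-byte pixels), streams one row at a time and uses context managers so the files are closed even if an error occurs. The names `convert_idx_to_csv` and `demo` are hypothetical, and the demo writes synthetic files with zero-filled headers rather than real Fashion MNIST data.

```python
import os
import tempfile

IMAGE_HEADER_BYTES = 16  # magic number, image count, rows, cols (4 bytes each)
LABEL_HEADER_BYTES = 8   # magic number, label count (4 bytes each)
PIXELS_PER_IMAGE = 28 * 28


def convert_idx_to_csv(image_path, label_path, out_path, n):
    """Write n rows of 'label,pixel0,...,pixel783' to out_path."""
    with open(image_path, "rb") as images, \
            open(label_path, "rb") as labels, \
            open(out_path, "w") as out:
        images.read(IMAGE_HEADER_BYTES)
        labels.read(LABEL_HEADER_BYTES)
        for _ in range(n):
            label = labels.read(1)
            pixels = images.read(PIXELS_PER_IMAGE)
            if len(label) != 1 or len(pixels) != PIXELS_PER_IMAGE:
                raise ValueError("input files hold fewer than n examples")
            # indexing or iterating over bytes yields integers in Python 3
            row = [label[0], *pixels]
            out.write(",".join(str(value) for value in row) + "\n")


def demo():
    """Convert two synthetic examples and return the CSV rows as lists of ints."""
    with tempfile.TemporaryDirectory() as tmp:
        image_path = os.path.join(tmp, "images-idx3-ubyte")
        label_path = os.path.join(tmp, "labels-idx1-ubyte")
        out_path = os.path.join(tmp, "out.csv")
        with open(image_path, "wb") as f:
            f.write(bytes(IMAGE_HEADER_BYTES))  # placeholder header
            f.write(bytes([0] * PIXELS_PER_IMAGE + [255] * PIXELS_PER_IMAGE))
        with open(label_path, "wb") as f:
            f.write(bytes(LABEL_HEADER_BYTES) + bytes([3, 7]))
        convert_idx_to_csv(image_path, label_path, out_path, 2)
        with open(out_path) as f:
            return [[int(v) for v in line.split(",")] for line in f]


if __name__ == "__main__":
    rows = demo()
    print(len(rows), len(rows[0]))
```

Streaming avoids building a 60,000-row list before writing, at the cost of failing partway through the output file if the inputs are shorter than expected.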

View File

@@ -0,0 +1,120 @@
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.layers.normalization import BatchNormalization
from keras.utils import to_categorical
from keras.callbacks import Callback
import numpy as np
import pandas as pd
import os
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from azureml.core import Run
# dataset object from the run
run = Run.get_context()
dataset = run.input_datasets['prepared_fashion_ds']
# split dataset into train and test set
(train_dataset, test_dataset) = dataset.random_split(percentage=0.8, seed=111)
# load dataset into pandas dataframe
data_train = train_dataset.to_pandas_dataframe()
data_test = test_dataset.to_pandas_dataframe()
img_rows, img_cols = 28, 28
input_shape = (img_rows, img_cols, 1)
X = np.array(data_train.iloc[:, 1:])
y = to_categorical(np.array(data_train.iloc[:, 0]))
# here we split off validation data to optimize the classifier during training
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=13)
# test data
X_test = np.array(data_test.iloc[:, 1:])
y_test = to_categorical(np.array(data_test.iloc[:, 0]))
X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1).astype('float32') / 255
X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1).astype('float32') / 255
X_val = X_val.reshape(X_val.shape[0], img_rows, img_cols, 1).astype('float32') / 255
batch_size = 256
num_classes = 10
epochs = 10
# construct neural network
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
kernel_initializer='he_normal',
input_shape=input_shape))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(Dropout(0.4))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adam(),
metrics=['accuracy'])
# start an Azure ML run
run = Run.get_context()
class LogRunMetrics(Callback):
    # callback at the end of every epoch
    def on_epoch_end(self, epoch, log):
        # logging the same metric name repeatedly creates a list of values
        run.log('Loss', log['loss'])
        run.log('Accuracy', log['accuracy'])
history = model.fit(X_train, y_train,
batch_size=batch_size,
epochs=epochs,
verbose=1,
validation_data=(X_val, y_val),
callbacks=[LogRunMetrics()])
score = model.evaluate(X_test, y_test, verbose=0)
# log a single value
run.log("Final test loss", score[0])
print('Test loss:', score[0])
run.log('Final test accuracy', score[1])
print('Test accuracy:', score[1])
plt.figure(figsize=(6, 3))
plt.title('Fashion MNIST with Keras ({} epochs)'.format(epochs), fontsize=14)
plt.plot(history.history['accuracy'], 'b-', label='Accuracy', lw=4, alpha=0.5)
plt.plot(history.history['loss'], 'r--', label='Loss', lw=4, alpha=0.5)
plt.legend(fontsize=12)
plt.grid(True)
# log an image
run.log_image('Loss v.s. Accuracy', plot=plt)
# create a ./outputs/model folder in the compute target
# files saved in the "./outputs" folder are automatically uploaded into run history
os.makedirs('./outputs/model', exist_ok=True)
# serialize NN architecture to JSON
model_json = model.to_json()
# save model JSON
with open('./outputs/model/model.json', 'w') as f:
    f.write(model_json)
# save model weights
model.save_weights('./outputs/model/model.h5')
print("model saved in ./outputs/model folder")
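
The training script depends on the CSV layout produced by the preparation step: the first column holds the class label, and the remaining 784 columns hold pixel values that are reshaped to 28x28 and scaled to the range [0, 1]. The following dependency-free sketch shows that transformation for a single row. `parse_row` and `one_hot` are hypothetical helpers written for illustration; the script itself does this with NumPy and `keras.utils.to_categorical`.

```python
def parse_row(line, rows=28, cols=28):
    """Split one CSV line into (label, image), scaling pixels to [0, 1]."""
    values = [int(v) for v in line.strip().split(",")]
    if len(values) != 1 + rows * cols:
        raise ValueError("expected 1 label and %d pixel values" % (rows * cols))
    label, pixels = values[0], values[1:]
    # rebuild the flat pixel list as a rows x cols grid
    image = [[pixels[r * cols + c] / 255 for c in range(cols)]
             for r in range(rows)]
    return label, image


def one_hot(label, num_classes=10):
    """Encode an integer class label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


if __name__ == "__main__":
    # a synthetic row: label 3 followed by 784 fully saturated pixels
    sample = "3," + ",".join(["255"] * 784)
    label, image = parse_row(sample)
    print(label, len(image), len(image[0]), one_hot(label))
```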

View File

@@ -0,0 +1,488 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License [2017] Zalando SE, https://tech.zalando.com"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/work-with-data/datasets-tutorial/pipeline-with-datasets/pipeline-for-image-classification.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Build a simple ML pipeline for image classification\n",
"\n",
"## Introduction\n",
"This tutorial shows how to train a simple deep neural network using the [Fashion MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset and Keras on Azure Machine Learning. Fashion-MNIST is a dataset of Zalando's article images\u00e2\u20ac\u201dconsisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.\n",
"\n",
"Learn how to:\n",
"\n",
"> * Set up your development environment\n",
"> * Create the Fashion MNIST dataset\n",
"> * Create a machine learning pipeline to train a simple deep learning neural network on a remote cluster\n",
"> * Retrieve input datasets from the experiment and register the output model with datasets\n",
"\n",
"## Prerequisite:\n",
"* Understand the [architecture and terms](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture) introduced by Azure Machine Learning\n",
"* If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, go through the [configuration notebook](../../../configuration.ipynb) to:\n",
" * install the latest version of AzureML SDK\n",
" * create a workspace and its configuration file (`config.json`)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set up your development environment\n",
"\n",
"All the setup for your development work can be accomplished in a Python notebook. Setup includes:\n",
"\n",
"* Importing Python packages\n",
"* Connecting to a workspace to enable communication between your local computer and remote resources\n",
"* Creating an experiment to track all your runs\n",
"* Creating a remote compute target to use for training\n",
"\n",
"### Import packages\n",
"\n",
"Import Python packages you need in this session. Also display the Azure Machine Learning SDK version."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import azureml.core\n",
"from azureml.core import Workspace, Dataset, Datastore, ComputeTarget, RunConfiguration, Experiment\n",
"from azureml.core.runconfig import CondaDependencies\n",
"from azureml.pipeline.steps import PythonScriptStep, EstimatorStep\n",
"from azureml.pipeline.core import Pipeline, PipelineData\n",
"from azureml.train.dnn import TensorFlow\n",
"\n",
"# check core SDK version number\n",
"print(\"Azure ML SDK Version: \", azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Connect to workspace\n",
"\n",
"Create a workspace object from the existing workspace. `Workspace.from_config()` reads the file **config.json** and loads the details into an object named `workspace`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# load workspace\n",
"workspace = Workspace.from_config()\n",
"print('Workspace name: ' + workspace.name, \n",
" 'Azure region: ' + workspace.location, \n",
" 'Subscription id: ' + workspace.subscription_id, \n",
" 'Resource group: ' + workspace.resource_group, sep='\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create experiment and a directory\n",
"\n",
"Create an experiment to track the runs in your workspace and a directory to deliver the necessary code from your computer to the remote resource."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# create an ML experiment\n",
"exp = Experiment(workspace=workspace, name='keras-mnist-fashion')\n",
"\n",
"# create a directory\n",
"script_folder = './keras-mnist-fashion'\n",
"os.makedirs(script_folder, exist_ok=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create or Attach existing compute resource\n",
"By using Azure Machine Learning Compute, a managed service, data scientists can train machine learning models on clusters of Azure virtual machines. Examples include VMs with GPU support. In this tutorial, you create Azure Machine Learning Compute as your training environment. The code below creates the compute clusters for you if they don't already exist in your workspace.\n",
"\n",
"**Creation of compute takes approximately 5 minutes.** If the AmlCompute with that name is already in your workspace the code will skip the creation process."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
"from azureml.core.compute_target import ComputeTargetException\n",
"\n",
"# choose a name for your cluster\n",
"cluster_name = \"your-cluster-name\"\n",
"\n",
"try:\n",
" compute_target = ComputeTarget(workspace=workspace, name=cluster_name)\n",
" print('Found existing compute target')\n",
"except ComputeTargetException:\n",
" print('Creating a new compute target...')\n",
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', \n",
" max_nodes=4)\n",
"\n",
" # create the cluster\n",
" compute_target = ComputeTarget.create(workspace, cluster_name, compute_config)\n",
"\n",
" # can poll for a minimum number of nodes and for a specific timeout. \n",
" # if no min node count is provided it uses the scale settings for the cluster\n",
" compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
"\n",
"# use get_status() to get a detailed status for the current cluster. \n",
"print(compute_target.get_status().serialize())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the Fashion MNIST dataset\n",
"\n",
"By creating a dataset, you create a reference to the data source location. If you applied any subsetting transformations to the dataset, they will be stored in the dataset as well. The data remains in its existing location, so no extra storage cost is incurred. \n",
"\n",
"Every workspace comes with a default [datastore](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-access-data) (and you can register more) which is backed by the Azure blob storage account associated with the workspace. We can use it to transfer data from local to the cloud, and create a dataset from it. We will now upload the [Fashion MNIST](./keras-mnist-fashion) to the default datastore (blob) within your workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"datastore = workspace.get_default_datastore()\n",
"datastore.upload_files(files = ['keras-mnist-fashion/t10k-images-idx3-ubyte', 'keras-mnist-fashion/t10k-labels-idx1-ubyte',\n",
" 'keras-mnist-fashion/train-images-idx3-ubyte','keras-mnist-fashion/train-labels-idx1-ubyte'],\n",
" target_path = 'mnist-fashion',\n",
" overwrite = True,\n",
" show_progress = True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then we will create an unregistered FileDataset pointing to the path in the datastore. You can also create a dataset from multiple paths. [Learn More](https://aka.ms/azureml/howto/createdatasets) "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fashion_ds = Dataset.File.from_files([(datastore, 'mnist-fashion')])\n",
"\n",
"# list the files referenced by fashion_ds\n",
"fashion_ds.to_path()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build 2-step ML pipeline\n",
"\n",
"The [Azure Machine Learning Pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-ml-pipelines) enables data scientists to create and manage multiple simple and complex workflows concurrently. A typical pipeline would have multiple tasks to prepare data, train, deploy and evaluate models. Individual steps in the pipeline can make use of diverse compute options (for example: CPU for data preparation and GPU for training) and languages. [Learn More](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/machine-learning-pipelines)\n",
"\n",
"\n",
"### Step 1: data preparation\n",
"\n",
"In step one, we will load the image and labels from Fashion MNIST dataset into mnist_train.csv and mnist_test.csv\n",
"\n",
"Each image is 28 pixels in height and 28 pixels in width, for a total of 784 pixels in total. Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255. Both mnist_train.csv and mnist_test.csv contain 785 columns. The first column consists of the class labels, which represent the article of clothing. The rest of the columns contain the pixel-values of the associated image."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# set up the compute environment to install required packages\n",
"conda = CondaDependencies.create(\n",
" pip_packages=['azureml-sdk','azureml-dataprep[fuse,pandas]'],\n",
" pin_sdk_version=False)\n",
"\n",
"conda.set_pip_option('--pre')\n",
"\n",
"run_config = RunConfiguration()\n",
"run_config.environment.python.conda_dependencies = conda"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Intermediate data (or output of a step) is represented by a `PipelineData` object. preprared_fashion_ds is produced as the output of step 1, and used as the input of step 2. PipelineData introduces a data dependency between steps, and creates an implicit execution order in the pipeline. You can register a `PipelineData` as a dataset and version the output data automatically. [Learn More](https://docs.microsoft.com/azure/machine-learning/service/how-to-version-track-datasets#version-a-pipeline-output-dataset) "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# define output data\n",
"prepared_fashion_ds = PipelineData('prepared_fashion_ds', datastore=datastore).as_dataset()\n",
"\n",
"# register output data as dataset\n",
"prepared_fashion_ds = prepared_fashion_ds.register(name='prepared_fashion_ds', create_new_version=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A **PythonScriptStep** is a basic, built-in step to run a Python Script on a compute target. It takes a script name and optionally other parameters like arguments for the script, compute target, inputs and outputs. If no compute target is specified, default compute target for the workspace is used. You can also use a [**RunConfiguration**](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.runconfiguration?view=azure-ml-py) to specify requirements for the PythonScriptStep, such as conda dependencies and docker image."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prep_step = PythonScriptStep(name='prepare step',\n",
" script_name=\"prepare.py\",\n",
" # mount fashion_ds dataset to the compute_target\n",
" inputs=[fashion_ds.as_named_input('fashion_ds').as_mount()],\n",
" outputs=[prepared_fashion_ds],\n",
" source_directory=script_folder,\n",
" compute_target=compute_target,\n",
" runconfig=run_config)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step 2: train CNN with Keras\n",
"\n",
"Next, we construct an `azureml.train.dnn.TensorFlow` estimator object. The TensorFlow estimator provides a simple way of launching a TensorFlow training job on a compute target. It automatically provides a Docker image that has TensorFlow installed.\n",
"\n",
"[EstimatorStep](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-steps/azureml.pipeline.steps.estimator_step.estimatorstep?view=azure-ml-py) adds a step to run a TensorFlow estimator in a pipeline. It takes a dataset as the input."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# set up training step with Tensorflow estimator\n",
"est = TensorFlow(entry_script='train.py',\n",
" source_directory=script_folder, \n",
" pip_packages = ['azureml-sdk','keras','numpy','scikit-learn', 'matplotlib'],\n",
" compute_target=compute_target)\n",
"\n",
"est_step = EstimatorStep(name='train step',\n",
" estimator=est,\n",
" estimator_entry_script_arguments=[],\n",
" # parse prepared_fashion_ds into TabularDataset and use it as the input\n",
" inputs=[prepared_fashion_ds.parse_delimited_files()],\n",
" compute_target=compute_target)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Build the pipeline\n",
"Once we have the steps (or steps collection), we can build the [pipeline](https://docs.microsoft.com/python/api/azureml-pipeline-core/azureml.pipeline.core.pipeline.pipeline?view=azure-ml-py).\n",
"\n",
"A pipeline is created with a list of steps and a workspace. Submit a pipeline using [submit](https://docs.microsoft.com/python/api/azureml-core/azureml.core.experiment(class)?view=azure-ml-py#submit-config--tags-none----kwargs-). When submit is called, a [PipelineRun](https://docs.microsoft.com/python/api/azureml-pipeline-core/azureml.pipeline.core.pipelinerun?view=azure-ml-py) is created which in turn creates [StepRun](https://docs.microsoft.com/python/api/azureml-pipeline-core/azureml.pipeline.core.steprun?view=azure-ml-py) objects for each step in the workflow."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# build pipeline & run experiment\n",
"pipeline = Pipeline(workspace, steps=[prep_step, est_step])\n",
"run = exp.submit(pipeline)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Monitor the PipelineRun"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"inputHidden": false,
"outputHidden": false
},
"outputs": [],
"source": [
"run.wait_for_completion(show_output=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run.find_step_run('train step')[0].get_metrics()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Register the input dataset and the output model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Azure Machine Learning datasets make it easy to trace how your data is used in ML. [Learn More](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-version-track-datasets#track-datasets-in-experiments)<br>\n",
"For each Machine Learning experiment, you can easily trace the datasets used as input through the `Run` object."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get input datasets\n",
"# use a separate name so the PythonScriptStep object `prep_step` is not overwritten\n",
"prep_run = run.find_step_run('prepare step')[0]\n",
"inputs = prep_run.get_details()['inputDatasets']\n",
"input_dataset = inputs[0]['dataset']\n",
"\n",
"# list the files referenced by input_dataset\n",
"input_dataset.to_path()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Register the input Fashion MNIST dataset with the workspace so that you can reuse it in other experiments or share it with your colleagues who have access to your workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fashion_ds = input_dataset.register(workspace = workspace,\n",
" name = 'fashion_ds',\n",
" description = 'image and label files from fashion mnist',\n",
" create_new_version = True)\n",
"fashion_ds"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Register the output model along with the dataset used to train it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run.find_step_run('train step')[0].register_model(model_name = 'keras-model', model_path = 'outputs/model/', \n",
" datasets =[('train test data',fashion_ds)])"
]
}
],
"metadata": {
"authors": [
{
"name": "sihhu"
}
],
"category": "tutorial",
"compute": [
"Remote"
],
"datasets": [
"Fashion MNIST"
],
"deployment": [
"None"
],
"exclude_from_index": false,
"framework": [
"Azure ML"
],
"friendly_name": "Datasets with ML Pipeline",
"index_order": 1,
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
},
"nteract": {
"version": "nteract-front-end@1.0.0"
},
"star_tag": [
"featured"
],
"tags": [
"Dataset",
"Pipeline",
"Estimator",
"ScriptRun"
],
"task": "Train"
},
"nbformat": 4,
"nbformat_minor": 2
}
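As a rough illustration of the CSV layout described in the notebook above (785 columns: one class label followed by 784 pixel values), the sketch below splits one row into its label and a 28 x 28 grid. The helper name `parse_row` and the synthetic sample row are assumptions made for this example; they are not part of the notebook or of the Azure ML SDK.

```python
import csv
import io

IMAGE_SIDE = 28                      # images are 28 x 28 pixels
PIXELS = IMAGE_SIDE * IMAGE_SIDE     # 784 pixel columns per row


def parse_row(row):
    """Split one CSV row into (label, 28x28 grid of ints). Hypothetical helper."""
    if len(row) != PIXELS + 1:
        raise ValueError(f"expected {PIXELS + 1} columns, got {len(row)}")
    label = int(row[0])                       # first column: class label
    pixels = [int(v) for v in row[1:]]        # remaining 784 columns: pixel values
    if any(p < 0 or p > 255 for p in pixels):
        raise ValueError("pixel values must be integers between 0 and 255")
    # Cut the flat list of 784 values into 28 rows of 28 values each.
    grid = [pixels[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE] for r in range(IMAGE_SIDE)]
    return label, grid


# Synthetic row for illustration only: label 9 followed by 784 pixel values.
sample_line = ",".join(["9"] + [str(i % 256) for i in range(PIXELS)])
label, grid = next(parse_row(r) for r in csv.reader(io.StringIO(sample_line)))
print(label, len(grid), len(grid[0]))
```

The column-count check is there because a row with anything other than 785 fields would otherwise be reshaped silently into a malformed image.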

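The notebook above notes that a `PipelineData` output shared between two steps creates an implicit execution order. The toy sketch below shows the general idea with a topological sort over declared inputs and outputs. It is not how Azure ML schedules pipelines internally; the `Step` class and the `execution_order` function are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Step:
    """Hypothetical stand-in for a pipeline step; not an Azure ML class."""
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def execution_order(steps):
    """Order steps so every producer runs before the steps that consume its output."""
    # Map each output name to the step that produces it.
    producer = {out: s.name for s in steps for out in s.outputs}
    sorter = TopologicalSorter()
    for s in steps:
        # A step depends on whichever step produces each of its inputs;
        # inputs with no producer (external datasets) add no dependency.
        deps = [producer[i] for i in s.inputs if i in producer]
        sorter.add(s.name, *deps)
    return list(sorter.static_order())


# Steps are listed out of order on purpose; the data dependency fixes the order.
steps = [
    Step("train step", inputs=["prepared_fashion_ds"]),
    Step("prepare step", inputs=["fashion_ds"], outputs=["prepared_fashion_ds"]),
]
print(execution_order(steps))
```

Because the ordering comes from the shared data name rather than from list position, the steps can be declared in any order, which mirrors why the notebook passes `prepared_fashion_ds` as both an output of step 1 and an input of step 2.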

@@ -522,7 +522,7 @@
], ],
"category": "tutorial", "category": "tutorial",
"compute": [ "compute": [
"local" "Local"
], ],
"datasets": [ "datasets": [
"NOAA" "NOAA"


@@ -13,23 +13,23 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/work-with-data/datasets-tutorial/train-with-datasets.png)" "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/work-with-data/datasets-tutorial/train-with-datasets/train-with-datasets.png)"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Train with Azure Machine Learning Datasets\n", "# Train with Azure Machine Learning datasets\n",
"Datasets are categorized into TabularDataset and FileDataset based on how users consume them in training. \n", "Datasets are categorized into TabularDataset and FileDataset based on how users consume them in training. \n",
"* A TabularDataset represents data in a tabular format by parsing the provided file or list of files. TabularDataset can be created from csv, tsv, parquet files, SQL query results etc. For the complete list, please visit our [documentation](https://aka.ms/tabulardataset-api-reference). It provides you with the ability to materialize the data into a pandas DataFrame.\n", "* A TabularDataset represents data in a tabular format by parsing the provided file or list of files. TabularDataset can be created from csv, tsv, parquet files, SQL query results etc. For the complete list, please visit our [documentation](https://aka.ms/tabulardataset-api-reference). It provides you with the ability to materialize the data into a pandas DataFrame.\n",
"* A FileDataset references single or multiple files in your datastores or public urls. This provides you with the ability to download or mount the files to your compute. The files can be of any format, which enables a wider range of machine learning scenarios including deep learning.\n", "* A FileDataset references single or multiple files in your datastores or public urls. This provides you with the ability to download or mount the files to your compute. The files can be of any format, which enables a wider range of machine learning scenarios including deep learning.\n",
"\n", "\n",
"In this tutorial, you will learn how to train with Azure Machine Learning Datasets:\n", "In this tutorial, you will learn how to train with Azure Machine Learning datasets:\n",
"\n", "\n",
"&#x2611; Use Datasets directly in your training script\n", "&#x2611; Use datasets directly in your training script\n",
"\n", "\n",
"&#x2611; Use Datasets to mount files to a remote compute" "&#x2611; Use datasets to mount files to a remote compute"
] ]
}, },
{ {
@@ -149,12 +149,12 @@
"metadata": {}, "metadata": {},
"source": [ "source": [
"You now have the necessary packages and compute resources to train a model in the cloud.\n", "You now have the necessary packages and compute resources to train a model in the cloud.\n",
"## Use Datasets directly in training\n", "## Use datasets directly in training\n",
"\n", "\n",
"### Create a TabularDataset\n", "### Create a TabularDataset\n",
"By creating a dataset, you create a reference to the data source location. If you applied any subsetting transformations to the dataset, they will be stored in the dataset as well. The data remains in its existing location, so no extra storage cost is incurred. \n", "By creating a dataset, you create a reference to the data source location. If you applied any subsetting transformations to the dataset, they will be stored in the dataset as well. The data remains in its existing location, so no extra storage cost is incurred. \n",
"\n", "\n",
"Every workspace comes with a default [datastore](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-access-data) (and you can register more) which is backed by the Azure blob storage account associated with the workspace. We can use it to transfer data from local to the cloud, and create Dataset from it. We will now upload the [Iris data](./train-dataset/Iris.csv) to the default datastore (blob) within your workspace." "Every workspace comes with a default [datastore](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-access-data) (and you can register more) which is backed by the Azure blob storage account associated with the workspace. We can use it to transfer data from local to the cloud, and create dataset from it. We will now upload the [Iris data](./train-dataset/Iris.csv) to the default datastore (blob) within your workspace."
] ]
}, },
{ {
@@ -174,7 +174,9 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Then we will create an unregistered TabularDataset pointing to the path in the datastore. You can also create a Dataset from multiple paths. [learn more](https://aka.ms/azureml/howto/createdatasets) " "Then we will create an unregistered TabularDataset pointing to the path in the datastore. You can also create a dataset from multiple paths. [learn more](https://aka.ms/azureml/howto/createdatasets) \n",
"\n",
"[TabularDataset](https://docs.microsoft.com/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py) represents data in a tabular format by parsing the provided file or list of files. This provides you with the ability to materialize the data into a Pandas or Spark DataFrame. You can create a TabularDataset object from .csv, .tsv, and parquet files, and from SQL query results. For a complete list, see [TabularDatasetFactory](https://docs.microsoft.com/python/api/azureml-core/azureml.data.dataset_factory.tabulardatasetfactory?view=azure-ml-py) class."
] ]
}, },
{ {
@@ -260,7 +262,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### Configure and use Datasets as the input to Estimator" "### Configure and use datasets as the input to Estimator"
] ]
}, },
{ {
@@ -294,7 +296,7 @@
"* The name of the estimator object, `est`\n", "* The name of the estimator object, `est`\n",
"* The directory that contains your scripts. All the files in this directory are uploaded into the cluster nodes for execution. \n", "* The directory that contains your scripts. All the files in this directory are uploaded into the cluster nodes for execution. \n",
"* The training script name, train_titanic.py\n", "* The training script name, train_titanic.py\n",
"* The input Dataset for training\n", "* The input dataset for training\n",
"* The compute target. In this case you will use the AmlCompute you created\n", "* The compute target. In this case you will use the AmlCompute you created\n",
"* The environment definition for the experiment" "* The environment definition for the experiment"
] ]
@@ -348,9 +350,9 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Use Datasets to mount files to a remote compute\n", "## Use datasets to mount files to a remote compute\n",
"\n", "\n",
"You can use the Dataset object to mount or download files referred by it. When you mount a file system, you attach that file system to a directory (mount point) and make it available to the system. Because mounting load files at the time of processing, it is usually faster than download.<br> \n", "You can use the `Dataset` object to mount or download files referred by it. When you mount a file system, you attach that file system to a directory (mount point) and make it available to the system. Because mounting load files at the time of processing, it is usually faster than download.<br> \n",
"Note: mounting is only available for Linux-based compute (DSVM/VM, AMLCompute, HDInsights)." "Note: mounting is only available for Linux-based compute (DSVM/VM, AMLCompute, HDInsights)."
] ]
}, },
@@ -365,7 +367,6 @@
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from sklearn.datasets import load_diabetes\n", "from sklearn.datasets import load_diabetes\n",
@@ -396,7 +397,9 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### Create a FileDataset" "### Create a FileDataset\n",
"\n",
"[FileDataset](https://docs.microsoft.com/python/api/azureml-core/azureml.data.file_dataset.filedataset?view=azure-ml-py) references single or multiple files in your datastores or public URLs. Using this method, you can download or mount the files to your compute as a FileDataset object. The files can be in any format, which enables a wider range of machine learning scenarios, including deep learning."
] ]
}, },
{ {
@@ -492,7 +495,7 @@
"src = ScriptRunConfig(source_directory=script_folder, \n", "src = ScriptRunConfig(source_directory=script_folder, \n",
" script='train_diabetes.py', \n", " script='train_diabetes.py', \n",
" # to mount the dataset on the remote compute and pass the mounted path as an argument to the training script\n", " # to mount the dataset on the remote compute and pass the mounted path as an argument to the training script\n",
" arguments =[dataset.as_named_input('diabetes').as_mount('tmp/dataset')])\n", " arguments =[dataset.as_named_input('diabetes').as_mount()])\n",
"\n", "\n",
"src.run_config.framework = 'python'\n", "src.run_config.framework = 'python'\n",
"src.run_config.environment = conda_env\n", "src.run_config.environment = conda_env\n",
@@ -533,7 +536,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### Register Datasets\n", "### Register datasets\n",
"Use the register() method to register datasets to your workspace so they can be shared with others, reused across various experiments, and referred to by name in your training script." "Use the register() method to register datasets to your workspace so they can be shared with others, reused across various experiments, and referred to by name in your training script."
] ]
}, },
@@ -553,10 +556,10 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Register models with Datasets\n", "## Register models with datasets\n",
"The last step in the training script wrote the model files in a directory named `outputs` in the VM of the cluster where the job is executed. `outputs` is a special directory in that all content in this directory is automatically uploaded to your workspace. This content appears in the run record in the experiment under your workspace. Hence, the model file is now also available in your workspace.\n", "The last step in the training script wrote the model files in a directory named `outputs` in the VM of the cluster where the job is executed. `outputs` is a special directory in that all content in this directory is automatically uploaded to your workspace. This content appears in the run record in the experiment under your workspace. Hence, the model file is now also available in your workspace.\n",
"\n", "\n",
"You can register models with Datasets for reproducibility and auditing purpose." "You can register models with datasets for reproducibility and auditing purpose."
] ]
}, },
{ {
@@ -606,11 +609,11 @@
], ],
"category": "tutorial", "category": "tutorial",
"compute": [ "compute": [
"remote" "Remote"
], ],
"datasets": [ "datasets": [
"Iris", "Iris",
"Daibetes" "Diabetes"
], ],
"deployment": [ "deployment": [
"None" "None"
@@ -642,9 +645,11 @@
"featured" "featured"
], ],
"tags": [ "tags": [
"Dataset" "Dataset",
"Estimator",
"ScriptRun"
], ],
"task": "Filtering" "task": "Train"
}, },
"nbformat": 4, "nbformat": 4,
"nbformat_minor": 2 "nbformat_minor": 2

148
index.md

@@ -10,7 +10,6 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
|Title| Task | Dataset | Training Compute | Deployment Target | ML Framework | Tags | |Title| Task | Dataset | Training Compute | Deployment Target | ML Framework | Tags |
|:----|:-----|:-------:|:----------------:|:-----------------:|:------------:|:------------:| |:----|:-----|:-------:|:----------------:|:-----------------:|:------------:|:------------:|
| [Using Azure ML environments](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/using-environments/using-environments.ipynb) | Creating and registering environments | None | Local | None | None | None | | [Using Azure ML environments](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/using-environments/using-environments.ipynb) | Creating and registering environments | None | Local | None | None | None |
| [Estimators in AML with hyperparameter tuning](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training-with-deep-learning/how-to-use-estimator/how-to-use-estimator.ipynb) | Use the Estimator pattern in Azure Machine Learning SDK | None | AML Compute | None | None | None | | [Estimators in AML with hyperparameter tuning](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training-with-deep-learning/how-to-use-estimator/how-to-use-estimator.ipynb) | Use the Estimator pattern in Azure Machine Learning SDK | None | AML Compute | None | None | None |
@@ -18,64 +17,37 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
|Title| Task | Dataset | Training Compute | Deployment Target | ML Framework | Tags | |Title| Task | Dataset | Training Compute | Deployment Target | ML Framework | Tags |
|:----|:-----|:-------:|:----------------:|:-----------------:|:------------:|:------------:| |:----|:-----|:-------:|:----------------:|:-----------------:|:------------:|:------------:|
| [Forecasting BikeShare Demand](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/forecasting-bike-share/auto-ml-forecasting-bike-share.ipynb) | forecasting | BikeShare | remote | None | Azure ML AutoML | Forecasting | | [Forecasting BikeShare Demand](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/forecasting-bike-share/auto-ml-forecasting-bike-share.ipynb) | Forecasting | BikeShare | Remote | None | Azure ML AutoML | Forecasting |
| [Forecasting orange juice sales with deployment](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb) | Forecasting | Orange Juice Sales | Remote | Azure Container Instance | Azure ML AutoML | None |
| [Forecasting orange juice sales with deployment](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb) | Forecasting | Orange Juice Sales | remote | Azure Container Instance | Azure ML AutoML | |
| [Forecasting with automated ML SQL integration](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/sql-server/energy-demand/auto-ml-sql-energy-demand.ipynb) | Forecasting | NYC Energy | Local | None | Azure ML AutoML | | | [Forecasting with automated ML SQL integration](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/sql-server/energy-demand/auto-ml-sql-energy-demand.ipynb) | Forecasting | NYC Energy | Local | None | Azure ML AutoML | |
| [Setup automated ML SQL integration](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/sql-server/setup/auto-ml-sql-setup.ipynb) | None | None | None | None | Azure ML AutoML | | | [Setup automated ML SQL integration](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/sql-server/setup/auto-ml-sql-setup.ipynb) | None | None | None | None | Azure ML AutoML | |
| [Register a model and deploy locally](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/deploy-to-local/register-model-deploy-local.ipynb) | Deployment | None | Local | Local | None | None |
| [Register a model and deploy locally](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/deploy-to-local/register-model-deploy-local.ipynb) | Deployment | | local | Local | None | None | | :star:[Data drift on aks](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/monitor-models/data-drift/drift-on-aks.ipynb) | Filtering | NOAA | Remote | AKS | Azure ML | Dataset, Timeseries, Drift |
| [Train and deploy a model using Python SDK](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb) | Training and deploying a model from a notebook | Diabetes | Local | Azure Container Instance | None | None |
| :star:[Data drift on aks](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/monitor-models/data-drift/drift-on-aks.ipynb) | Filtering | NOAA | remote | AKS | Azure ML | Dataset, Timeseries, Drift | | :star:[Data drift quickdemo](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/work-with-data/datadrift-tutorial/datadrift-tutorial.ipynb) | Filtering | NOAA | Remote | None | Azure ML | Dataset, Timeseries, Drift |
| :star:[Filtering data using Tabular Timeseiries Dataset related API](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/work-with-data/datasets-tutorial/tabular-timeseries-dataset-filtering.ipynb) | Filtering | NOAA | Local | None | Azure ML | Dataset, Tabular Timeseries |
| [](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb) | Training and deploying a model from a notebook | Diabetes | Local | Azure Container Instance | None | None | | :star:[Introduction to labeled datasets](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/work-with-data/datasets-tutorial/labeled-datasets/labeled-datasets.ipynb) | Train | | Remote | None | Azure ML | Dataset, label, Estimator |
| :star:[Datasets with ML Pipeline](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/work-with-data/datasets-tutorial/pipeline-with-datasets/pipeline-for-image-classification.ipynb) | Train | Fashion MNIST | Remote | None | Azure ML | Dataset, Pipeline, Estimator, ScriptRun |
| :star:[Data drift quickdemo](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/work-with-data/datadrift-tutorial/datadrift-tutorial.ipynb) | Filtering | NOAA | remote | None | Azure ML | Dataset, Timeseries, Drift | | :star:[Train with Datasets (Tabular and File)](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/work-with-data/datasets-tutorial/train-with-datasets/train-with-datasets.ipynb) | Train | Iris, Diabetes | Remote | None | Azure ML | Dataset, Estimator, ScriptRun |
| [Forecasting away from training data](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/forecasting-high-frequency/automl-forecasting-function.ipynb) | Forecasting | None | Remote | None | Azure ML AutoML | Forecasting, Confidence Intervals |
| :star:[Filtering data using Tabular Timeseiries Dataset related API](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/work-with-data/datasets-tutorial/tabular-timeseries-dataset-filtering.ipynb) | Filtering | NOAA | local | None | Azure ML | Dataset, Tabular Timeseries |
| :star:[Train with Datasets (Tabular and File)](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/work-with-data/datasets-tutorial/train-with-datasets.ipynb) | Filtering | Iris, Daibetes | remote | None | Azure ML | Dataset |
| [Forecasting away from training data](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/forecasting-high-frequency/automl-forecasting-function.ipynb) | forecasting | None | remote | None | Azure ML AutoML | Forecasting, Confidence Intervals |
| [Automated ML run with basic edition features.](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.ipynb) | Classification | Bankmarketing | AML | ACI | None | featurization, explainability, remote_run, AutomatedML | | [Automated ML run with basic edition features.](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.ipynb) | Classification | Bankmarketing | AML | ACI | None | featurization, explainability, remote_run, AutomatedML |
| [Classification of credit card fraudulent transactions using Automated ML](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb) | Classification | Creditcard | AML Compute | None | None | remote_run, AutomatedML |
| [Classification of credit card fraudulent transactions using Automated ML](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb) | Classification | creditcard | AML Compute | None | None | remote_run, AutomatedML |
| [Automated ML run with featurization and model explainability.](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/regression-hardware-performance-explanation-and-featurization/auto-ml-regression-hardware-performance-explanation-and-featurization.ipynb) | Regression | MachineData | AML | ACI | None | featurization, explainability, remote_run, AutomatedML | | [Automated ML run with featurization and model explainability.](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/regression-hardware-performance-explanation-and-featurization/auto-ml-regression-hardware-performance-explanation-and-featurization.ipynb) | Regression | MachineData | AML | ACI | None | featurization, explainability, remote_run, AutomatedML |
| [Use MLflow with Azure Machine Learning for training and deployment](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-deploy-pytorch/train-and-deploy-pytorch.ipynb) | Use MLflow with Azure Machine Learning to train and deploy Pa yTorch image classifier model | MNIST | AML Compute | Azure Container Instance | PyTorch | None | | [Use MLflow with Azure Machine Learning for training and deployment](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-deploy-pytorch/train-and-deploy-pytorch.ipynb) | Use MLflow with Azure Machine Learning to train and deploy Pa yTorch image classifier model | MNIST | AML Compute | Azure Container Instance | PyTorch | None |
| :star:[Azure Machine Learning Pipeline with DataTranferStep](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-data-transfer.ipynb) | Demonstrates the use of DataTranferStep | Custom | ADF | None | Azure ML | None | | :star:[Azure Machine Learning Pipeline with DataTranferStep](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-data-transfer.ipynb) | Demonstrates the use of DataTranferStep | Custom | ADF | None | Azure ML | None |
| [Getting Started with Azure Machine Learning Pipelines](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-getting-started.ipynb) | Getting Started notebook for ANML Pipelines | Custom | AML Compute | None | Azure ML | None | | [Getting Started with Azure Machine Learning Pipelines](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-getting-started.ipynb) | Getting Started notebook for ANML Pipelines | Custom | AML Compute | None | Azure ML | None |
| [Azure Machine Learning Pipeline with AzureBatchStep](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-azurebatch-to-run-a-windows-executable.ipynb) | Demonstrates the use of AzureBatchStep | Custom | Azure Batch | None | Azure ML | None | | [Azure Machine Learning Pipeline with AzureBatchStep](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-azurebatch-to-run-a-windows-executable.ipynb) | Demonstrates the use of AzureBatchStep | Custom | Azure Batch | None | Azure ML | None |
| [Azure Machine Learning Pipeline with EstimatorStep](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-estimatorstep.ipynb) | Demonstrates the use of EstimatorStep | Custom | AML Compute | None | Azure ML | None | | [Azure Machine Learning Pipeline with EstimatorStep](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-estimatorstep.ipynb) | Demonstrates the use of EstimatorStep | Custom | AML Compute | None | Azure ML | None |
| :star:[How to use ModuleStep with AML Pipelines](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-modulestep.ipynb) | Demonstrates the use of ModuleStep | Custom | AML Compute | None | Azure ML | None | | :star:[How to use ModuleStep with AML Pipelines](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-modulestep.ipynb) | Demonstrates the use of ModuleStep | Custom | AML Compute | None | Azure ML | None |
| :star:[How to use Pipeline Drafts to create a Published Pipeline](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-pipeline-drafts.ipynb) | Demonstrates the use of Pipeline Drafts | Custom | AML Compute | None | Azure ML | None | | :star:[How to use Pipeline Drafts to create a Published Pipeline](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-pipeline-drafts.ipynb) | Demonstrates the use of Pipeline Drafts | Custom | AML Compute | None | Azure ML | None |
| :star:[Azure Machine Learning Pipeline with HyperDriveStep](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-parameter-tuning-with-hyperdrive.ipynb) | Demonstrates the use of HyperDriveStep | Custom | AML Compute | None | Azure ML | None | | :star:[Azure Machine Learning Pipeline with HyperDriveStep](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-parameter-tuning-with-hyperdrive.ipynb) | Demonstrates the use of HyperDriveStep | Custom | AML Compute | None | Azure ML | None |
| :star:[How to Publish a Pipeline and Invoke the REST endpoint](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-publish-and-run-using-rest-endpoint.ipynb) | Demonstrates the use of Published Pipelines | Custom | AML Compute | None | Azure ML | None | | :star:[How to Publish a Pipeline and Invoke the REST endpoint](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-publish-and-run-using-rest-endpoint.ipynb) | Demonstrates the use of Published Pipelines | Custom | AML Compute | None | Azure ML | None |
| :star:[How to Setup a Schedule for a Published Pipeline](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-setup-schedule-for-a-published-pipeline.ipynb) | Demonstrates the use of Schedules for Published Pipelines | Custom | AML Compute | None | Azure ML | None | | :star:[How to Setup a Schedule for a Published Pipeline](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-setup-schedule-for-a-published-pipeline.ipynb) | Demonstrates the use of Schedules for Published Pipelines | Custom | AML Compute | None | Azure ML | None |
| [How to setup a versioned Pipeline Endpoint](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-setup-versioned-pipeline-endpoints.ipynb) | Demonstrates the use of PipelineEndpoint to run a specific version of the Published Pipeline | Custom | AML Compute | None | Azure ML | None | | [How to setup a versioned Pipeline Endpoint](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-setup-versioned-pipeline-endpoints.ipynb) | Demonstrates the use of PipelineEndpoint to run a specific version of the Published Pipeline | Custom | AML Compute | None | Azure ML | None |
| :star:[How to use DataPath as a PipelineParameter](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-showcasing-datapath-and-pipelineparameter.ipynb) | Demonstrates the use of DataPath as a PipelineParameter | Custom | AML Compute | None | Azure ML | None | | :star:[How to use DataPath as a PipelineParameter](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-showcasing-datapath-and-pipelineparameter.ipynb) | Demonstrates the use of DataPath as a PipelineParameter | Custom | AML Compute | None | Azure ML | None |
| [How to use AdlaStep with AML Pipelines](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-use-adla-as-compute-target.ipynb) | Demonstrates the use of AdlaStep | Custom | Azure Data Lake Analytics | None | Azure ML | None | | [How to use AdlaStep with AML Pipelines](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-use-adla-as-compute-target.ipynb) | Demonstrates the use of AdlaStep | Custom | Azure Data Lake Analytics | None | Azure ML | None |
| :star:[How to use DatabricksStep with AML Pipelines](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-use-databricks-as-compute-target.ipynb) | Demonstrates the use of DatabricksStep | Custom | Azure Databricks | None | Azure ML, Azure Databricks | None | | :star:[How to use DatabricksStep with AML Pipelines](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-use-databricks-as-compute-target.ipynb) | Demonstrates the use of DatabricksStep | Custom | Azure Databricks | None | Azure ML, Azure Databricks | None |
| :star:[How to use AutoMLStep with AML Pipelines](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-with-automated-machine-learning-step.ipynb) | Demonstrates the use of AutoMLStep | Custom | AML Compute | None | Automated Machine Learning | None | | :star:[How to use AutoMLStep with AML Pipelines](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-with-automated-machine-learning-step.ipynb) | Demonstrates the use of AutoMLStep | Custom | AML Compute | None | Automated Machine Learning | None |
| :star:[Azure Machine Learning Pipelines with Data Dependency](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-with-data-dependency-steps.ipynb) | Demonstrates how to construct a Pipeline with data dependency between steps | Custom | AML Compute | None | Azure ML | None | | :star:[Azure Machine Learning Pipelines with Data Dependency](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-with-data-dependency-steps.ipynb) | Demonstrates how to construct a Pipeline with data dependency between steps | Custom | AML Compute | None | Azure ML | None |
@@ -84,45 +56,25 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
|Title| Task | Dataset | Training Compute | Deployment Target | ML Framework | Tags | |Title| Task | Dataset | Training Compute | Deployment Target | ML Framework | Tags |
|:----|:-----|:-------:|:----------------:|:-----------------:|:------------:|:------------:| |:----|:-----|:-------:|:----------------:|:-----------------:|:------------:|:------------:|
| [Train a model with hyperparameter tuning](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/chainer/deployment/train-hyperparameter-tune-deploy-with-chainer/train-hyperparameter-tune-deploy-with-chainer.ipynb) | Train a Convolutional Neural Network (CNN) | MNIST | AML Compute | Azure Container Instance | Chainer | None | | [Train a model with hyperparameter tuning](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/chainer/deployment/train-hyperparameter-tune-deploy-with-chainer/train-hyperparameter-tune-deploy-with-chainer.ipynb) | Train a Convolutional Neural Network (CNN) | MNIST | AML Compute | Azure Container Instance | Chainer | None |
| [Distributed Training with Chainer](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/chainer/training/distributed-chainer/distributed-chainer.ipynb) | Use the Chainer estimator to perform distributed training | MNIST | AML Compute | None | Chainer | None | | [Distributed Training with Chainer](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/chainer/training/distributed-chainer/distributed-chainer.ipynb) | Use the Chainer estimator to perform distributed training | MNIST | AML Compute | None | Chainer | None |
| [Training with hyperparameter tuning using PyTorch](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/pytorch/deployment/train-hyperparameter-tune-deploy-with-pytorch/train-hyperparameter-tune-deploy-with-pytorch.ipynb) | Train an image classification model using transfer learning with the PyTorch estimator | ImageNet | AML Compute | Azure Container Instance | PyTorch | None | | [Training with hyperparameter tuning using PyTorch](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/pytorch/deployment/train-hyperparameter-tune-deploy-with-pytorch/train-hyperparameter-tune-deploy-with-pytorch.ipynb) | Train an image classification model using transfer learning with the PyTorch estimator | ImageNet | AML Compute | Azure Container Instance | PyTorch | None |
| [Distributed PyTorch](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/pytorch/training/distributed-pytorch-with-horovod/distributed-pytorch-with-horovod.ipynb) | Train a model using the distributed training via Horovod | MNIST | AML Compute | None | PyTorch | None | | [Distributed PyTorch](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/pytorch/training/distributed-pytorch-with-horovod/distributed-pytorch-with-horovod.ipynb) | Train a model using the distributed training via Horovod | MNIST | AML Compute | None | PyTorch | None |
| [Distributed training with PyTorch](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/pytorch/training/distributed-pytorch-with-nccl-gloo/distributed-pytorch-with-nccl-gloo.ipynb) | Train a model using distributed training via Nccl/Gloo | MNIST | AML Compute | None | PyTorch | None | | [Distributed training with PyTorch](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/pytorch/training/distributed-pytorch-with-nccl-gloo/distributed-pytorch-with-nccl-gloo.ipynb) | Train a model using distributed training via Nccl/Gloo | MNIST | AML Compute | None | PyTorch | None |
| [Training and hyperparameter tuning with Scikit-learn](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/scikit-learn/training/train-hyperparameter-tune-deploy-with-sklearn/train-hyperparameter-tune-deploy-with-sklearn.ipynb) | Train a support vector machine (SVM) to perform classification | Iris | AML Compute | None | Scikit-learn | None | | [Training and hyperparameter tuning with Scikit-learn](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/scikit-learn/training/train-hyperparameter-tune-deploy-with-sklearn/train-hyperparameter-tune-deploy-with-sklearn.ipynb) | Train a support vector machine (SVM) to perform classification | Iris | AML Compute | None | Scikit-learn | None |
| [Training and hyperparameter tuning using the TensorFlow estimator](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/tensorflow/deployment/train-hyperparameter-tune-deploy-with-tensorflow/train-hyperparameter-tune-deploy-with-tensorflow.ipynb) | Train a deep neural network | MNIST | AML Compute | Azure Container Instance | TensorFlow | None | | [Training and hyperparameter tuning using the TensorFlow estimator](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/tensorflow/deployment/train-hyperparameter-tune-deploy-with-tensorflow/train-hyperparameter-tune-deploy-with-tensorflow.ipynb) | Train a deep neural network | MNIST | AML Compute | Azure Container Instance | TensorFlow | None |
| [Distributed training using TensorFlow with Horovod](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/tensorflow/training/distributed-tensorflow-with-horovod/distributed-tensorflow-with-horovod.ipynb) | Use the TensorFlow estimator to train a word2vec model | None | AML Compute | None | TensorFlow | None | | [Distributed training using TensorFlow with Horovod](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/tensorflow/training/distributed-tensorflow-with-horovod/distributed-tensorflow-with-horovod.ipynb) | Use the TensorFlow estimator to train a word2vec model | None | AML Compute | None | TensorFlow | None |
| [Distributed TensorFlow with parameter server](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/tensorflow/training/distributed-tensorflow-with-parameter-server/distributed-tensorflow-with-parameter-server.ipynb) | Use the TensorFlow estimator to train a model using distributed training | MNIST | AML Compute | None | TensorFlow | None | | [Distributed TensorFlow with parameter server](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/tensorflow/training/distributed-tensorflow-with-parameter-server/distributed-tensorflow-with-parameter-server.ipynb) | Use the TensorFlow estimator to train a model using distributed training | MNIST | AML Compute | None | TensorFlow | None |
| [Hyperparameter tuning and warm start using the TensorFlow estimator](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/tensorflow/training/hyperparameter-tune-and-warm-start-with-tensorflow/hyperparameter-tune-and-warm-start-with-tensorflow.ipynb) | Train a deep neural network | MNIST | AML Compute | Azure Container Instance | TensorFlow | None | | [Hyperparameter tuning and warm start using the TensorFlow estimator](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/tensorflow/training/hyperparameter-tune-and-warm-start-with-tensorflow/hyperparameter-tune-and-warm-start-with-tensorflow.ipynb) | Train a deep neural network | MNIST | AML Compute | Azure Container Instance | TensorFlow | None |
| [Resuming a model](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/tensorflow/training/train-tensorflow-resume-training/train-tensorflow-resume-training.ipynb) | Resume a model in TensorFlow from a previously submitted run | MNIST | AML Compute | None | TensorFlow | None | | [Resuming a model](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/tensorflow/training/train-tensorflow-resume-training/train-tensorflow-resume-training.ipynb) | Resume a model in TensorFlow from a previously submitted run | MNIST | AML Compute | None | TensorFlow | None |
| [Training in Spark](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-in-spark/train-in-spark.ipynb) | Submitting a run on a Spark cluster | None | HDI cluster | None | PySpark | None | | [Training in Spark](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-in-spark/train-in-spark.ipynb) | Submitting a run on a Spark cluster | None | HDI cluster | None | PySpark | None |
| [Train on Azure Machine Learning Compute](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb) | Submit a run on Azure Machine Learning Compute. | Diabetes | AML Compute | None | None | None | | [Train on Azure Machine Learning Compute](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb) | Submit a run on Azure Machine Learning Compute. | Diabetes | AML Compute | None | None | None |
| [Train on local compute](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-local/train-on-local.ipynb) | Train a model locally | Diabetes | Local | None | None | None | | [Train on local compute](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-local/train-on-local.ipynb) | Train a model locally | Diabetes | Local | None | None | None |
| [Train in a remote Linux virtual machine](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-remote-vm/train-on-remote-vm.ipynb) | Configure and execute a run | Diabetes | Data Science Virtual Machine | None | None | None | | [Train in a remote Linux virtual machine](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-remote-vm/train-on-remote-vm.ipynb) | Configure and execute a run | Diabetes | Data Science Virtual Machine | None | None | None |
| [Using Tensorboard](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training-with-deep-learning/export-run-history-to-tensorboard/export-run-history-to-tensorboard.ipynb) | Export the run history as Tensorboard logs | None | None | None | TensorFlow | None | | [Using Tensorboard](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training-with-deep-learning/export-run-history-to-tensorboard/export-run-history-to-tensorboard.ipynb) | Export the run history as Tensorboard logs | None | None | None | TensorFlow | None |
| [Train a DNN using hyperparameter tuning and deploying with Keras](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-keras/train-hyperparameter-tune-deploy-with-keras.ipynb) | Create a multi-class classifier | MNIST | AML Compute | Azure Container Instance | TensorFlow | None | | [Train a DNN using hyperparameter tuning and deploying with Keras](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-keras/train-hyperparameter-tune-deploy-with-keras.ipynb) | Create a multi-class classifier | MNIST | AML Compute | Azure Container Instance | TensorFlow | None |
| [Managing your training runs](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/manage-runs/manage-runs.ipynb) | Monitor and complete runs | None | Local | None | None | None | | [Managing your training runs](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/manage-runs/manage-runs.ipynb) | Monitor and complete runs | None | Local | None | None | None |
| [Tensorboard integration with run history](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/tensorboard/tensorboard.ipynb) | Run a TensorFlow job and view its Tensorboard output live | None | Local, DSVM, AML Compute | None | TensorFlow | None | | [Tensorboard integration with run history](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/tensorboard/tensorboard.ipynb) | Run a TensorFlow job and view its Tensorboard output live | None | Local, DSVM, AML Compute | None | TensorFlow | None |
| [Use MLflow with AML for a local training run](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-local/train-local.ipynb) | Use MLflow tracking APIs together with Azure Machine Learning for storing your metrics and artifacts | Diabetes | Local | None | None | None | | [Use MLflow with AML for a local training run](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-local/train-local.ipynb) | Use MLflow tracking APIs together with Azure Machine Learning for storing your metrics and artifacts | Diabetes | Local | None | None | None |
| [Use MLflow with AML for a remote training run](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-remote/train-remote.ipynb) | Use MLflow tracking APIs together with AML for storing your metrics and artifacts | Diabetes | AML Compute | None | None | None | | [Use MLflow with AML for a remote training run](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-remote/train-remote.ipynb) | Use MLflow tracking APIs together with AML for storing your metrics and artifacts | Diabetes | AML Compute | None | None | None |
@@ -132,18 +84,13 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
|Title| Task | Dataset | Training Compute | Deployment Target | ML Framework | Tags | |Title| Task | Dataset | Training Compute | Deployment Target | ML Framework | Tags |
|:----|:-----|:-------:|:----------------:|:-----------------:|:------------:|:------------:| |:----|:-----|:-------:|:----------------:|:-----------------:|:------------:|:------------:|
| [Deploy MNIST digit recognition with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-inference-mnist-deploy.ipynb) | Image Classification | MNIST | local | Azure Container Instance | ONNX | ONNX Model Zoo | | [Deploy MNIST digit recognition with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-inference-mnist-deploy.ipynb) | Image Classification | MNIST | Local | Azure Container Instance | ONNX | ONNX Model Zoo |
| [Deploy Facial Expression Recognition (FER+) with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-inference-facial-expression-recognition-deploy.ipynb) | Facial Expression Recognition | Emotion FER | Local | Azure Container Instance | ONNX | ONNX Model Zoo |
| [Deploy Facial Expression Recognition (FER+) with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-inference-facial-expression-recognition-deploy.ipynb) | Facial Expression Recognition | Emotion FER | local | Azure Container Instance | ONNX | ONNX Model Zoo |
| :star:[Register model and deploy as webservice](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/deploy-to-cloud/model-register-and-deploy.ipynb) | Deploy a model with Azure Machine Learning | Diabetes | None | Azure Container Instance | Scikit-learn | None | | :star:[Register model and deploy as webservice](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/deploy-to-cloud/model-register-and-deploy.ipynb) | Deploy a model with Azure Machine Learning | Diabetes | None | Azure Container Instance | Scikit-learn | None |
| :star:[Deploy models to AKS using controlled roll out](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/deploy-with-controlled-rollout/deploy-aks-with-controlled-rollout.ipynb) | Deploy a model with Azure Machine Learning | Diabetes | None | Azure Kubernetes Service | Scikit-learn | None |
| [Train MNIST in PyTorch, convert, and deploy with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-train-pytorch-aml-deploy-mnist.ipynb) | Image Classification | MNIST | AML Compute | Azure Container Instance | ONNX | ONNX Converter | | [Train MNIST in PyTorch, convert, and deploy with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-train-pytorch-aml-deploy-mnist.ipynb) | Image Classification | MNIST | AML Compute | Azure Container Instance | ONNX | ONNX Converter |
| [Deploy ResNet50 with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-modelzoo-aml-deploy-resnet50.ipynb) | Image Classification | ImageNet | Local | Azure Container Instance | ONNX | ONNX Model Zoo |
| [Deploy ResNet50 with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-modelzoo-aml-deploy-resnet50.ipynb) | Image Classification | ImageNet | local | Azure Container Instance | ONNX | ONNX Model Zoo |
| [Deploy a model as a web service using MLflow](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/using-mlflow/deploy-model/deploy-model.ipynb) | Use MLflow with AML | Diabetes | None | Azure Container Instance | Scikit-learn | None | | [Deploy a model as a web service using MLflow](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/using-mlflow/deploy-model/deploy-model.ipynb) | Use MLflow with AML | Diabetes | None | Azure Container Instance | Scikit-learn | None |
| :star:[Convert and deploy TinyYolo with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-convert-aml-deploy-tinyyolo.ipynb) | Object Detection | PASCAL VOC | local | Azure Container Instance | ONNX | ONNX Converter | | :star:[Convert and deploy TinyYolo with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-convert-aml-deploy-tinyyolo.ipynb) | Object Detection | PASCAL VOC | local | Azure Container Instance | ONNX | ONNX Converter |
@@ -151,105 +98,48 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
## Other Notebooks ## Other Notebooks
|Title| Task | Dataset | Training Compute | Deployment Target | ML Framework | Tags | |Title| Task | Dataset | Training Compute | Deployment Target | ML Framework | Tags |
|:----|:-----|:-------:|:----------------:|:-----------------:|:------------:|:------------:| |:----|:-----|:-------:|:----------------:|:-----------------:|:------------:|:------------:|
| [DNN Text Featurization](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/classification-text-dnn/auto-ml-classification-text-dnn.ipynb) | Text featurization using DNNs for classification | None | | None | None | None | | [DNN Text Featurization](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/classification-text-dnn/auto-ml-classification-text-dnn.ipynb) | Text featurization using DNNs for classification | None | AML Compute | None | None | None |
| [Automated ML Grouping with Pipeline.](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/forecasting-grouping/auto-ml-forecasting-grouping.ipynb) | Use AzureML Pipeline to trigger multiple Automated ML runs. | Orange Juice Sales | AML Compute | Azure Container Instance | Scikit-learn, Pytorch | AutomatedML | | [Automated ML Grouping with Pipeline.](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/forecasting-grouping/auto-ml-forecasting-grouping.ipynb) | Use AzureML Pipeline to trigger multiple Automated ML runs. | Orange Juice Sales | AML Compute | Azure Container Instance | Scikit-learn, Pytorch | AutomatedML |
| [configuration](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) | | | | | | | | [configuration](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) | | | | | | |
| [file-dataset-image-inference-mnist](https://github.com/Azure/MachineLearningNotebooks/blob/master//contrib/batch_inferencing/file-dataset-image-inference-mnist.ipynb) | | | | | | |
| [tabular-dataset-inference-iris](https://github.com/Azure/MachineLearningNotebooks/blob/master//contrib/batch_inferencing/tabular-dataset-inference-iris.ipynb) | | | | | | |
| [lightgbm-example](https://github.com/Azure/MachineLearningNotebooks/blob/master//contrib/gbdt/lightgbm/lightgbm-example.ipynb) | | | | | | | | [lightgbm-example](https://github.com/Azure/MachineLearningNotebooks/blob/master//contrib/gbdt/lightgbm/lightgbm-example.ipynb) | | | | | | |
| [azure-ml-with-nvidia-rapids](https://github.com/Azure/MachineLearningNotebooks/blob/master//contrib/RAPIDS/azure-ml-with-nvidia-rapids.ipynb) | | | | | | | | [azure-ml-with-nvidia-rapids](https://github.com/Azure/MachineLearningNotebooks/blob/master//contrib/RAPIDS/azure-ml-with-nvidia-rapids.ipynb) | | | | | | |
| [auto-ml-continuous-retraining](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/continuous-retraining/auto-ml-continuous-retraining.ipynb) | | | | | | | | [auto-ml-continuous-retraining](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/continuous-retraining/auto-ml-continuous-retraining.ipynb) | | | | | | |
| [auto-ml-forecasting-beer-remote](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/forecasting-beer-remote/auto-ml-forecasting-beer-remote.ipynb) | | | | | | | | [auto-ml-forecasting-beer-remote](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/forecasting-beer-remote/auto-ml-forecasting-beer-remote.ipynb) | | | | | | |
| [auto-ml-forecasting-energy-demand](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/forecasting-energy-demand/auto-ml-forecasting-energy-demand.ipynb) | | | | | | |
| :star:[auto-ml-forecasting-energy-demand](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/forecasting-energy-demand/auto-ml-forecasting-energy-demand.ipynb) | Forecasting | | | | | |
| [auto-ml-regression](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.ipynb) | | | | | | |
| [build-model-run-history-03](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/amlsdk/build-model-run-history-03.ipynb) | | | | | | |
| [deploy-to-aci-04](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aci-04.ipynb) | | | | | | |
| [deploy-to-aks-05](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aks-05.ipynb) | | | | | | |
| [ingest-data-02](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/amlsdk/ingest-data-02.ipynb) | | | | | | |
| [installation-and-configuration-01](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/amlsdk/installation-and-configuration-01.ipynb) | | | | | | |
| [automl-databricks-local-01](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/automl/automl-databricks-local-01.ipynb) | | | | | | |
| [automl-databricks-local-with-deployment](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/automl/automl-databricks-local-with-deployment.ipynb) | | | | | | |
| [aml-pipelines-use-databricks-as-compute-target](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/databricks-as-remote-compute-target/aml-pipelines-use-databricks-as-compute-target.ipynb) | | | | | | |
| [accelerated-models-object-detection](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/accelerated-models/accelerated-models-object-detection.ipynb) | | | | | | |
| [accelerated-models-quickstart](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/accelerated-models/accelerated-models-quickstart.ipynb) | | | | | | |
| [accelerated-models-training](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/accelerated-models/accelerated-models-training.ipynb) | | | | | | |
| [multi-model-register-and-deploy](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/deploy-multi-model/multi-model-register-and-deploy.ipynb) | | | | | | |
| [register-model-deploy-local-advanced](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/deploy-to-local/register-model-deploy-local-advanced.ipynb) | | | | | | |
| [enable-app-insights-in-production-service](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/enable-app-insights-in-production-service/enable-app-insights-in-production-service.ipynb) | | | | | | |
| [onnx-model-register-and-deploy](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-model-register-and-deploy.ipynb) | | | | | | |
| [production-deploy-to-aks](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb) | | | | | | |
| [register-model-create-image-deploy-service](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb) | | | | | | |
| [tensorflow-model-register-and-deploy](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/tensorflow/tensorflow-model-register-and-deploy.ipynb) | | | | | | |
| [explain-model-on-amlcompute](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/azure-integration/remote-explanation/explain-model-on-amlcompute.ipynb) | | | | | | |
| [save-retrieve-explanations-run-history](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/azure-integration/run-history/save-retrieve-explanations-run-history.ipynb) | | | | | | |
| [train-explain-model-locally-and-deploy](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/azure-integration/scoring-time/train-explain-model-locally-and-deploy.ipynb) | | | | | | |
| [train-explain-model-on-amlcompute-and-deploy](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/azure-integration/scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb) | | | | | | |
| [advanced-feature-transformations-explain-local](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/tabular-data/advanced-feature-transformations-explain-local.ipynb) | | | | | | |
| [explain-binary-classification-local](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/tabular-data/explain-binary-classification-local.ipynb) | | | | | | |
| [explain-multiclass-classification-local](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/tabular-data/explain-multiclass-classification-local.ipynb) | | | | | | |
| [explain-regression-local](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/tabular-data/explain-regression-local.ipynb) | | | | | | |
| [simple-feature-transformations-explain-local](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/tabular-data/simple-feature-transformations-explain-local.ipynb) | | | | | | |
| [nyc-taxi-data-regression-model-building](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/nyc-taxi-data-regression-model-building/nyc-taxi-data-regression-model-building.ipynb) | | | | | | |
| [pipeline-batch-scoring](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.ipynb) | | | | | | |
| [pipeline-style-transfer](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/pipeline-style-transfer/pipeline-style-transfer.ipynb) | | | | | | |
| [authentication-in-azureml](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/manage-azureml-service/authentication-in-azureml/authentication-in-azureml.ipynb) | | | | | | |
| [Logging APIs](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/logging-api/logging-api.ipynb) | Logging APIs and analyzing results | None | None | None | None | None |
| [Logging APIs](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/logging-api/logging-api.ipynb) | Logging APIs and analyzing results | | None | None | None | None |
| [distributed-cntk-with-custom-docker](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training-with-deep-learning/distributed-cntk-with-custom-docker/distributed-cntk-with-custom-docker.ipynb) | | | | | | |
| [notebook_example](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training-with-deep-learning/how-to-use-estimator/notebook_example.ipynb) | | | | | | |
| [configuration](https://github.com/Azure/MachineLearningNotebooks/blob/master//setup-environment/configuration.ipynb) | | | | | | |
| [img-classification-part1-training](https://github.com/Azure/MachineLearningNotebooks/blob/master//tutorials/img-classification-part1-training.ipynb) | | | | | | |
| [img-classification-part2-deploy](https://github.com/Azure/MachineLearningNotebooks/blob/master//tutorials/img-classification-part2-deploy.ipynb) | | | | | | |
| [regression-automated-ml](https://github.com/Azure/MachineLearningNotebooks/blob/master//tutorials/regression-automated-ml.ipynb) | | | | | | |
| [tutorial-1st-experiment-sdk-train](https://github.com/Azure/MachineLearningNotebooks/blob/master//tutorials/tutorial-1st-experiment-sdk-train.ipynb) | | | | | | |
| [tutorial-pipeline-batch-scoring-classification](https://github.com/Azure/MachineLearningNotebooks/blob/master//tutorials/tutorial-pipeline-batch-scoring-classification.ipynb) | | | | | | |

Some files were not shown because too many files have changed in this diff.