diff --git a/README.md b/README.md
index 42006430..5edf1824 100644
--- a/README.md
+++ b/README.md
@@ -1,9 +1,6 @@
# Azure Machine Learning service example notebooks
-This repository contains example notebooks demonstrating the [Azure Machine Learning](https://azure.microsoft.com/en-us/services/machine-learning-service/) Python SDK
-which allows you to build, train, deploy and manage machine learning solutions using Azure. The AML SDK
-allows you the choice of using local or cloud compute resources, while managing
-and maintaining the complete data science workflow from the cloud.
+This repository contains example notebooks demonstrating the [Azure Machine Learning](https://azure.microsoft.com/en-us/services/machine-learning-service/) Python SDK, which allows you to build, train, deploy and manage machine learning solutions using Azure. The AML SDK gives you the choice of using local or cloud compute resources, while managing and maintaining the complete data science workflow from the cloud.

@@ -18,16 +15,17 @@ You should always run the [Configuration](./configuration.ipynb) notebook first
If you want to...
- * ...try out and explore Azure ML, start with image classification tutorials [part 1 training](./tutorials/img-classification-part1-training.ipynb) and [part 2 deployment](./tutorials/img-classification-part2-deploy.ipynb).
+ * ...try out and explore Azure ML, start with image classification tutorials: [Part 1 (Training)](./tutorials/img-classification-part1-training.ipynb) and [Part 2 (Deployment)](./tutorials/img-classification-part2-deploy.ipynb).
+ * ...prepare your data and do automated machine learning, start with regression tutorials: [Part 1 (Data Prep)](./tutorials/regression-part1-data-prep.ipynb) and [Part 2 (Automated ML)](./tutorials/regression-part2-automated-ml.ipynb).
* ...learn about experimentation and tracking run history, first [train within Notebook](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), then try [training on remote VM](./how-to-use-azureml/training/train-on-remote-vm/train-on-remote-vm.ipynb) and [using logging APIs](./how-to-use-azureml/training/logging-api/logging-api.ipynb).
* ...train deep learning models at scale, first learn about [Machine Learning Compute](./how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb), and then try [distributed hyperparameter tuning](./how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-pytorch/train-hyperparameter-tune-deploy-with-pytorch.ipynb) and [distributed training](./how-to-use-azureml/training-with-deep-learning/distributed-pytorch-with-horovod/distributed-pytorch-with-horovod.ipynb).
- * ...deploy model as realtime scoring service, first learn the basics by [training within Notebook and deploying to Azure Container Instance](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), then learn how to [register and manage models, and create Docker images](./how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb), and [production deploy models on Azure Kubernetes Cluster](./how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb).
- * ...deploy models as batch scoring service, first [train a model within Notebook](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), learn how to [register and manage models](./how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb), then [create Machine Learning Compute for scoring compute](./how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb), and [use Machine Learning Pipelines to deploy your model](./how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.ipynb).
+ * ...deploy models as a realtime scoring service, first learn the basics by [training within Notebook and deploying to Azure Container Instance](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), then learn how to [register and manage models, and create Docker images](./how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb), and [production deploy models on Azure Kubernetes Cluster](./how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb).
+ * ...deploy models as a batch scoring service, first [train a model within Notebook](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), learn how to [register and manage models](./how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb), then [create Machine Learning Compute for scoring compute](./how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb), and [use Machine Learning Pipelines to deploy your model](./how-to-use-azureml/machine-learning-pipelines/pipeline-mpi-batch-prediction.ipynb).
* ...monitor your deployed models, learn about using [App Insights](./how-to-use-azureml/deployment/enable-app-insights-in-production-service/enable-app-insights-in-production-service.ipynb) and [model data collection](./how-to-use-azureml/deployment/enable-data-collection-for-models-in-aks/enable-data-collection-for-models-in-aks.ipynb).
## Tutorials
-The [Tutorials](./tutorials) folder contains notebooks for the tutorials described in the [Azure Machine Learning documentation](https://aka.ms/aml-docs)
+The [Tutorials](./tutorials) folder contains notebooks for the tutorials described in the [Azure Machine Learning documentation](https://aka.ms/aml-docs).
## How to use Azure ML
@@ -45,9 +43,8 @@ The [How to use Azure ML](./how-to-use-azureml) folder contains specific example
## Documentation
* Quickstarts, end-to-end tutorials, and how-tos on the [official documentation site for Azure Machine Learning service](https://docs.microsoft.com/en-us/azure/machine-learning/service/).
-
- * [Python SDK reference]( https://docs.microsoft.com/en-us/python/api/overview/azure/ml/intro?view=azure-ml-py)
-
+ * [Python SDK reference](https://docs.microsoft.com/en-us/python/api/overview/azure/ml/intro?view=azure-ml-py)
+ * Azure ML Data Prep SDK [overview](https://aka.ms/data-prep-sdk), [Python SDK reference](https://aka.ms/aml-data-prep-apiref), and [tutorials and how-tos](https://aka.ms/aml-data-prep-notebooks).
---
@@ -56,4 +53,4 @@ The [How to use Azure ML](./how-to-use-azureml) folder contains specific example
Visit the following repos to see projects contributed by Azure ML users:
- [Fine tune natural language processing models using Azure Machine Learning service](https://github.com/Microsoft/AzureML-BERT)
- - [Fashion MNIST with Azure ML SDK](https://github.com/amynic/azureml-sdk-fashion)
+ - [Fashion MNIST with Azure ML SDK](https://github.com/amynic/azureml-sdk-fashion)
\ No newline at end of file
diff --git a/configuration.ipynb b/configuration.ipynb
index b82bfb3d..40744bc5 100644
--- a/configuration.ipynb
+++ b/configuration.ipynb
@@ -96,7 +96,7 @@
"source": [
"import azureml.core\n",
"\n",
- "print(\"This notebook was created using version 1.0.15 of the Azure ML SDK\")\n",
+ "print(\"This notebook was created using version 1.0.17 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},
diff --git a/how-to-use-azureml/README.md b/how-to-use-azureml/README.md
index f9260663..cedd4581 100644
--- a/how-to-use-azureml/README.md
+++ b/how-to-use-azureml/README.md
@@ -5,8 +5,8 @@ Learn how to use Azure Machine Learning services for experimentation and model m
As a prerequisite, run the [configuration notebook](../configuration.ipynb) first to set up your Azure ML Workspace. Then, run the notebooks in the following recommended order.
* [train-within-notebook](./training/train-within-notebook): Train a model while tracking run history, and learn how to deploy the model as a web service to Azure Container Instance.
-* [train-on-local](./training/train-on-local): Learn how to submit a run and use Azure ML managed run configuration.
-* [train-on-amlcompute](./training/train-on-amlcompute): Use a 1-n node managed compute cluster as a remote compute target for CPU or GPU based training.
+* [train-on-local](./training/train-on-local): Learn how to submit a run to the local computer and use Azure ML managed run configuration.
+* [train-on-amlcompute](./training/train-on-amlcompute): Use a 1-n node Azure ML managed compute cluster for remote runs on Azure CPU or GPU infrastructure.
* [train-on-remote-vm](./training/train-on-remote-vm): Use Data Science Virtual Machine as a target for remote runs.
* [logging-api](./training/logging-api): Learn about the details of logging metrics to run history.
* [register-model-create-image-deploy-service](./deployment/register-model-create-image-deploy-service): Learn about the details of model management.
diff --git a/how-to-use-azureml/automated-machine-learning/README.md b/how-to-use-azureml/automated-machine-learning/README.md
index 77c78292..6fa39dd1 100644
--- a/how-to-use-azureml/automated-machine-learning/README.md
+++ b/how-to-use-azureml/automated-machine-learning/README.md
@@ -229,6 +229,9 @@ If a sample notebook fails with an error that property, method or library does n
1) Check that you have selected the correct kernel in the Jupyter notebook. The kernel is displayed in the top right of the notebook page. It can be changed using the `Kernel | Change Kernel` menu option. For Azure Notebooks, it should be `Python 3.6`. For local conda environments, it should be the conda environment name that you specified in automl_setup. The default is azure_automl. Note that the kernel is saved as part of the notebook. So, if you switch to a new conda environment, you will have to select the new kernel in the notebook.
2) Check that the notebook is for the SDK version that you are using. You can check the SDK version by executing `azureml.core.VERSION` in a Jupyter notebook cell. You can download previous versions of the sample notebooks from GitHub by clicking the `Branch` button, selecting the `Tags` tab and then selecting the version.
+## Numpy import fails on Windows
+On some Windows environments, importing numpy fails with the latest Python version, 3.6.8. If you hit this issue, try Python version 3.6.7.
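+For example, a quick way to confirm which interpreter version the notebook kernel is running (standard library only):
+```python
+import sys
+
+# If this reports 3.6.8 on Windows and numpy fails to import, recreate the conda environment with Python 3.6.7.
+print(sys.version)
+```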
+
## Remote run: DsvmCompute.create fails
There are several reasons why the DsvmCompute.create can fail. The reason is usually in the error message but you have to look at the end of the error message for the detailed reason. Some common reasons are:
1) `Compute name is invalid, it should start with a letter, be between 2 and 16 character, and only include letters (a-zA-Z), numbers (0-9) and \'-\'.` Note that underscore is not allowed in the name.
diff --git a/how-to-use-azureml/automated-machine-learning/automl_env.yml b/how-to-use-azureml/automated-machine-learning/automl_env.yml
index 029cfd60..ef1584f2 100644
--- a/how-to-use-azureml/automated-machine-learning/automl_env.yml
+++ b/how-to-use-azureml/automated-machine-learning/automl_env.yml
@@ -2,7 +2,7 @@ name: azure_automl
dependencies:
# The python interpreter version.
# Currently Azure ML only supports 3.5.2 and later.
-- python=3.6
+- python>=3.5.2,<3.6.8
- nb_conda
- matplotlib==2.1.0
- numpy>=1.11.0,<1.15.0
@@ -12,6 +12,7 @@ dependencies:
- scikit-learn>=0.18.0,<=0.19.1
- pandas>=0.22.0,<0.23.0
- tensorflow>=1.12.0
+- py-xgboost<=0.80
- pip:
# Required packages for AzureML execution, history, and data preparation.
diff --git a/how-to-use-azureml/automated-machine-learning/automl_env_mac.yml b/how-to-use-azureml/automated-machine-learning/automl_env_mac.yml
index 0e16cd1d..a9ba417b 100644
--- a/how-to-use-azureml/automated-machine-learning/automl_env_mac.yml
+++ b/how-to-use-azureml/automated-machine-learning/automl_env_mac.yml
@@ -2,7 +2,7 @@ name: azure_automl
dependencies:
# The python interpreter version.
# Currently Azure ML only supports 3.5.2 and later.
-- python=3.6
+- python>=3.5.2,<3.6.8
- nb_conda
- matplotlib==2.1.0
- numpy>=1.15.3
@@ -12,6 +12,7 @@ dependencies:
- scikit-learn>=0.18.0,<=0.19.1
- pandas>=0.22.0,<0.23.0
- tensorflow>=1.12.0
+- py-xgboost<=0.80
- pip:
# Required packages for AzureML execution, history, and data preparation.
diff --git a/how-to-use-azureml/automated-machine-learning/classification-with-deployment/auto-ml-classification-with-deployment.ipynb b/how-to-use-azureml/automated-machine-learning/classification-with-deployment/auto-ml-classification-with-deployment.ipynb
index 68effc07..8937d8f7 100644
--- a/how-to-use-azureml/automated-machine-learning/classification-with-deployment/auto-ml-classification-with-deployment.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/classification-with-deployment/auto-ml-classification-with-deployment.ipynb
@@ -84,9 +84,9 @@
"ws = Workspace.from_config()\n",
"\n",
"# choose a name for experiment\n",
- "experiment_name = 'automl-local-classification'\n",
+ "experiment_name = 'automl-classification-deployment'\n",
"# project folder\n",
- "project_folder = './sample_projects/automl-local-classification'\n",
+ "project_folder = './sample_projects/automl-classification-deployment'\n",
"\n",
"experiment=Experiment(ws, experiment_name)\n",
"\n",
@@ -103,23 +103,6 @@
"outputDf.T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
@@ -289,8 +272,6 @@
"metadata": {},
"outputs": [],
"source": [
- "experiment_name = 'automl-local-classification'\n",
- "\n",
"experiment = Experiment(ws, experiment_name)\n",
"ml_run = AutoMLRun(experiment = experiment, run_id = local_run.id)"
]
diff --git a/how-to-use-azureml/automated-machine-learning/classification-with-whitelisting/auto-ml-classification-with-whitelisting.ipynb b/how-to-use-azureml/automated-machine-learning/classification-with-whitelisting/auto-ml-classification-with-whitelisting.ipynb
index 840d965b..27f399a3 100644
--- a/how-to-use-azureml/automated-machine-learning/classification-with-whitelisting/auto-ml-classification-with-whitelisting.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/classification-with-whitelisting/auto-ml-classification-with-whitelisting.ipynb
@@ -100,23 +100,6 @@
"outputDf.T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
diff --git a/how-to-use-azureml/automated-machine-learning/classification/auto-ml-classification.ipynb b/how-to-use-azureml/automated-machine-learning/classification/auto-ml-classification.ipynb
index f70e1faa..5455c02c 100644
--- a/how-to-use-azureml/automated-machine-learning/classification/auto-ml-classification.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/classification/auto-ml-classification.ipynb
@@ -81,8 +81,8 @@
"ws = Workspace.from_config()\n",
"\n",
"# Choose a name for the experiment and specify the project folder.\n",
- "experiment_name = 'automl-local-classification'\n",
- "project_folder = './sample_projects/automl-local-classification'\n",
+ "experiment_name = 'automl-classification'\n",
+ "project_folder = './sample_projects/automl-classification'\n",
"\n",
"experiment = Experiment(ws, experiment_name)\n",
"\n",
@@ -99,23 +99,6 @@
"outputDf.T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
diff --git a/how-to-use-azureml/automated-machine-learning/dataprep-remote-execution/auto-ml-dataprep-remote-execution.ipynb b/how-to-use-azureml/automated-machine-learning/dataprep-remote-execution/auto-ml-dataprep-remote-execution.ipynb
index 427a475e..373efc43 100644
--- a/how-to-use-azureml/automated-machine-learning/dataprep-remote-execution/auto-ml-dataprep-remote-execution.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/dataprep-remote-execution/auto-ml-dataprep-remote-execution.ipynb
@@ -49,23 +49,6 @@
"Currently, Data Prep only supports __Ubuntu 16__ and __Red Hat Enterprise Linux 7__. We are working on supporting more linux distros."
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
diff --git a/how-to-use-azureml/automated-machine-learning/dataprep/auto-ml-dataprep.ipynb b/how-to-use-azureml/automated-machine-learning/dataprep/auto-ml-dataprep.ipynb
index c81cdbad..22c9a89d 100644
--- a/how-to-use-azureml/automated-machine-learning/dataprep/auto-ml-dataprep.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/dataprep/auto-ml-dataprep.ipynb
@@ -49,23 +49,6 @@
"Currently, Data Prep only supports __Ubuntu 16__ and __Red Hat Enterprise Linux 7__. We are working on supporting more linux distros."
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
diff --git a/how-to-use-azureml/automated-machine-learning/exploring-previous-runs/auto-ml-exploring-previous-runs.ipynb b/how-to-use-azureml/automated-machine-learning/exploring-previous-runs/auto-ml-exploring-previous-runs.ipynb
index 8aa51733..309f00d4 100644
--- a/how-to-use-azureml/automated-machine-learning/exploring-previous-runs/auto-ml-exploring-previous-runs.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/exploring-previous-runs/auto-ml-exploring-previous-runs.ipynb
@@ -70,23 +70,6 @@
"ws = Workspace.from_config()"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
diff --git a/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb b/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb
index 3c3c674b..caf68330 100644
--- a/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb
@@ -147,8 +147,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "### Data Splitting\n",
- "For the purposes of demonstration and later forecast evaluation, we now split the data into a training and a testing set. The test set will contain the final 20 weeks of observed sales for each time-series."
+ "For demonstration purposes, we extract sales time-series for just a few of the stores:"
]
},
{
@@ -157,19 +156,37 @@
"metadata": {},
"outputs": [],
"source": [
- "ntest_periods = 20\n",
+ "use_stores = [2, 5, 8]\n",
+ "data_subset = data[data.Store.isin(use_stores)]\n",
+ "nseries = data_subset.groupby(grain_column_names).ngroups\n",
+ "print('Data subset contains {0} individual time-series.'.format(nseries))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Data Splitting\n",
+ "We now split the data into a training and a testing set for later forecast evaluation. The test set will contain the final 20 weeks of observed sales for each time-series. The splits should be stratified by series, so we use a group-by statement on the grain columns."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "n_test_periods = 20\n",
"\n",
"def split_last_n_by_grain(df, n):\n",
- " \"\"\"\n",
- " Group df by grain and split on last n rows for each group\n",
- " \"\"\"\n",
+ " \"\"\"Group df by grain and split on last n rows for each group.\"\"\"\n",
" df_grouped = (df.sort_values(time_column_name) # Sort by ascending time\n",
" .groupby(grain_column_names, group_keys=False))\n",
" df_head = df_grouped.apply(lambda dfg: dfg.iloc[:-n])\n",
" df_tail = df_grouped.apply(lambda dfg: dfg.iloc[-n:])\n",
" return df_head, df_tail\n",
"\n",
- "X_train, X_test = split_last_n_by_grain(data, ntest_periods)"
+ "X_train, X_test = split_last_n_by_grain(data_subset, n_test_periods)"
]
},
{
@@ -187,24 +204,7 @@
"\n",
"AutoML will currently train a single, regression-type model across **all** time-series in a given training set. This allows the model to generalize across related series.\n",
"\n",
- "You are almost ready to start an AutoML training job. We will first need to create a validation set from the existing training set (i.e. for hyper-parameter tuning): "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "nvalidation_periods = 20\n",
- "X_train, X_validate = split_last_n_by_grain(X_train, nvalidation_periods)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We also need to separate the target column from the rest of the DataFrame: "
+ "You are almost ready to start an AutoML training job. First, we need to separate the target column from the rest of the DataFrame: "
]
},
{
@@ -214,8 +214,7 @@
"outputs": [],
"source": [
"target_column_name = 'Quantity'\n",
- "y_train = X_train.pop(target_column_name).values\n",
- "y_validate = X_validate.pop(target_column_name).values "
+ "y_train = X_train.pop(target_column_name).values"
]
},
{
@@ -224,22 +223,31 @@
"source": [
"## Train\n",
"\n",
- "The AutoMLConfig object defines the settings and data for an AutoML training job. Here, we set necessary inputs like the task type, the number of AutoML iterations to try, and the training and validation data. \n",
+ "The AutoMLConfig object defines the settings and data for an AutoML training job. Here, we set necessary inputs like the task type, the number of AutoML iterations to try, the training data, and cross-validation parameters. \n",
"\n",
- "For forecasting tasks, there are some additional parameters that can be set: the name of the column holding the date/time and the grain column names. A time column is required for forecasting, while the grain is optional. If a grain is not given, the forecaster assumes that the whole dataset is a single time-series. We also pass a list of columns to drop prior to modeling. The _logQuantity_ column is completely correlated with the target quantity, so it must be removed to prevent a target leak. \n",
+ "For forecasting tasks, there are some additional parameters that can be set: the name of the column holding the date/time, the grain column names, and the maximum forecast horizon. A time column is required for forecasting, while the grain is optional. If a grain is not given, AutoML assumes that the whole dataset is a single time-series. We also pass a list of columns to drop prior to modeling. The _logQuantity_ column is completely correlated with the target quantity, so it must be removed to prevent a target leak.\n",
+ "\n",
+        "The forecast horizon is given in units of the time-series frequency; for instance, the OJ series frequency is weekly, so a horizon of 20 means that a trained model will estimate sales up to 20 weeks beyond the latest date in the training data for each series. In this example, we set the maximum horizon to the number of samples per series in the test set (n_test_periods). Generally, the value of this parameter will be dictated by business needs. For example, a demand planning organization that needs to estimate the next month of sales would set the horizon accordingly. \n",
+ "\n",
+ "Finally, a note about the cross-validation (CV) procedure for time-series data. AutoML uses out-of-sample error estimates to select a best pipeline/model, so it is important that the CV fold splitting is done correctly. Time-series can violate the basic statistical assumptions of the canonical K-Fold CV strategy, so AutoML implements a [rolling origin validation](https://robjhyndman.com/hyndsight/tscv/) procedure to create CV folds for time-series data. To use this procedure, you just need to specify the desired number of CV folds in the AutoMLConfig object. It is also possible to bypass CV and use your own validation set by setting the *X_valid* and *y_valid* parameters of AutoMLConfig.\n",
+ "\n",
+ "Here is a summary of AutoMLConfig parameters used for training the OJ model:\n",
"\n",
"|Property|Description|\n",
"|-|-|\n",
"|**task**|forecasting|\n",
"|**primary_metric**|This is the metric that you want to optimize.
Forecasting supports the following primary metrics
spearman_correlation
normalized_root_mean_squared_error
r2_score
normalized_mean_absolute_error\n",
"|**iterations**|Number of iterations. In each iteration, Auto ML trains a specific pipeline on the given data|\n",
- "|**X**|Training matrix of features, shape = [n_training_samples, n_features]|\n",
- "|**y**|Target values, shape = [n_training_samples, ]|\n",
- "|**X_valid**|Validation matrix of features, shape = [n_validation_samples, n_features]|\n",
- "|**y_valid**|Target values for validation, shape = [n_validation_samples, ]\n",
+ "|**X**|Training matrix of features as a pandas DataFrame, shape = [n_training_samples, n_features]|\n",
+ "|**y**|Target values as a numpy.ndarray, shape = [n_training_samples, ]|\n",
+ "|**n_cross_validations**|Number of cross-validation folds to use for model/pipeline selection|\n",
"|**enable_ensembling**|Allow AutoML to create ensembles of the best performing models\n",
"|**debug_log**|Log file path for writing debugging information\n",
- "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder. "
+ "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder.|\n",
+ "|**time_column_name**|Name of the datetime column in the input data|\n",
+ "|**grain_column_names**|Name(s) of the columns defining individual series in the input data|\n",
+ "|**drop_column_names**|Name(s) of columns to drop prior to modeling|\n",
+ "|**max_horizon**|Maximum desired forecast horizon in units of time-series frequency|"
]
},
{
@@ -248,10 +256,11 @@
"metadata": {},
"outputs": [],
"source": [
- "automl_settings = {\n",
+ "time_series_settings = {\n",
" 'time_column_name': time_column_name,\n",
" 'grain_column_names': grain_column_names,\n",
- " 'drop_column_names': ['logQuantity']\n",
+ " 'drop_column_names': ['logQuantity'],\n",
+ " 'max_horizon': n_test_periods\n",
"}\n",
"\n",
"automl_config = AutoMLConfig(task='forecasting',\n",
@@ -260,12 +269,11 @@
" iterations=10,\n",
" X=X_train,\n",
" y=y_train,\n",
- " X_valid=X_validate,\n",
- " y_valid=y_validate,\n",
+ " n_cross_validations=5,\n",
" enable_ensembling=False,\n",
" path=project_folder,\n",
" verbosity=logging.INFO,\n",
- " **automl_settings)"
+ " **time_series_settings)"
]
},
{
diff --git a/how-to-use-azureml/automated-machine-learning/missing-data-blacklist-early-termination/auto-ml-missing-data-blacklist-early-termination.ipynb b/how-to-use-azureml/automated-machine-learning/missing-data-blacklist-early-termination/auto-ml-missing-data-blacklist-early-termination.ipynb
index 2a4959d6..e6802499 100644
--- a/how-to-use-azureml/automated-machine-learning/missing-data-blacklist-early-termination/auto-ml-missing-data-blacklist-early-termination.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/missing-data-blacklist-early-termination/auto-ml-missing-data-blacklist-early-termination.ipynb
@@ -102,23 +102,6 @@
"outputDf.T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
diff --git a/how-to-use-azureml/automated-machine-learning/model-explanation/auto-ml-model-explanation.ipynb b/how-to-use-azureml/automated-machine-learning/model-explanation/auto-ml-model-explanation.ipynb
index e4df9d6b..8d9b5a12 100644
--- a/how-to-use-azureml/automated-machine-learning/model-explanation/auto-ml-model-explanation.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/model-explanation/auto-ml-model-explanation.ipynb
@@ -74,9 +74,9 @@
"ws = Workspace.from_config()\n",
"\n",
"# choose a name for experiment\n",
- "experiment_name = 'automl-local-classification'\n",
+ "experiment_name = 'automl-model-explanation'\n",
"# project folder\n",
- "project_folder = './sample_projects/automl-local-classification-model-explanation'\n",
+ "project_folder = './sample_projects/automl-model-explanation'\n",
"\n",
"experiment=Experiment(ws, experiment_name)\n",
"\n",
@@ -93,23 +93,6 @@
"outputDf.T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics=True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
diff --git a/how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.ipynb b/how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.ipynb
index 4b7750cf..74375c55 100644
--- a/how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.ipynb
@@ -96,23 +96,6 @@
"outputDf.T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
diff --git a/how-to-use-azureml/automated-machine-learning/remote-attach/auto-ml-remote-attach.ipynb b/how-to-use-azureml/automated-machine-learning/remote-attach/auto-ml-remote-attach.ipynb
index 9466826c..13ac6f37 100644
--- a/how-to-use-azureml/automated-machine-learning/remote-attach/auto-ml-remote-attach.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/remote-attach/auto-ml-remote-attach.ipynb
@@ -104,23 +104,6 @@
"outputDf.T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
@@ -130,7 +113,7 @@
"1. Create a Linux DSVM in Azure, following these [quick instructions](https://docs.microsoft.com/en-us/azure/machine-learning/desktop-workbench/how-to-create-dsvm-hdi). Make sure you use the Ubuntu flavor (not CentOS). Make sure that disk space is available under `/tmp` because AutoML creates files under `/tmp/azureml_run`s. The DSVM should have more cores than the number of parallel runs that you plan to enable. It should also have at least 4GB per core.\n",
"2. Enter the IP address, user name and password below.\n",
"\n",
- "**Note:** By default, SSH runs on port 22 and you don't need to change the port number below. If you've configured SSH to use a different port, change `dsvm_ssh_port` accordinglyaddress. [Read more](https://render.githubusercontent.com/documentation/sdk/ssh-issue.md) on changing SSH ports for security reasons."
+        "**Note:** By default, SSH runs on port 22 and you don't need to change the port number below. If you've configured SSH to use a different port, change `dsvm_ssh_port` accordingly. [Read more](https://docs.microsoft.com/en-us/azure/virtual-machines/troubleshooting/detailed-troubleshoot-ssh-connection) on changing SSH ports for security reasons."
]
},
{
diff --git a/how-to-use-azureml/automated-machine-learning/remote-batchai/auto-ml-remote-batchai.ipynb b/how-to-use-azureml/automated-machine-learning/remote-batchai/auto-ml-remote-batchai.ipynb
index 551a624e..d781231d 100644
--- a/how-to-use-azureml/automated-machine-learning/remote-batchai/auto-ml-remote-batchai.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/remote-batchai/auto-ml-remote-batchai.ipynb
@@ -67,6 +67,7 @@
"source": [
"import logging\n",
"import os\n",
+ "import csv\n",
"\n",
"from matplotlib import pyplot as plt\n",
"import numpy as np\n",
@@ -89,7 +90,7 @@
"\n",
"# Choose a name for the run history container in the workspace.\n",
"experiment_name = 'automl-remote-amlcompute'\n",
- "project_folder = './sample_projects/automl-remote-amlcompute'\n",
+ "project_folder = './project'\n",
"\n",
"experiment = Experiment(ws, experiment_name)\n",
"\n",
@@ -106,23 +107,6 @@
"outputDf.T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
@@ -171,6 +155,51 @@
" # For a more detailed view of current AmlCompute status, use get_status()."
]
},
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Data\n",
+ "For remote executions, you need to make the data accessible from the remote compute.\n",
+        "This can be done by uploading the data to the workspace's default datastore.\n",
+ "In this example, we upload scikit-learn's [load_digits](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) data."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "data_train = datasets.load_digits()\n",
+ "\n",
+ "if not os.path.isdir('data'):\n",
+ " os.mkdir('data')\n",
+ " \n",
+ "if not os.path.exists(project_folder):\n",
+ " os.makedirs(project_folder)\n",
+ " \n",
+ "pd.DataFrame(data_train.data).to_csv(\"data/X_train.tsv\", index=False, header=False, quoting=csv.QUOTE_ALL, sep=\"\\t\")\n",
+ "pd.DataFrame(data_train.target).to_csv(\"data/y_train.tsv\", index=False, header=False, sep=\"\\t\")\n",
+ "\n",
+ "ds = ws.get_default_datastore()\n",
+ "ds.upload(src_dir='./data', target_path='bai_data', overwrite=True, show_progress=True)\n",
+ "\n",
+ "from azureml.core.runconfig import DataReferenceConfiguration\n",
+ "dr = DataReferenceConfiguration(datastore_name=ds.name, \n",
+ " path_on_datastore='bai_data', \n",
+ " path_on_compute='/tmp/azureml_runs',\n",
+ " mode='download', # download files from datastore to compute target\n",
+ " overwrite=False)"
+ ]
+ },
{
"cell_type": "code",
"execution_count": null,
@@ -188,29 +217,13 @@
"conda_run_config.environment.docker.enabled = True\n",
"conda_run_config.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE\n",
"\n",
+        "# Set the data reference of the run configuration\n",
+ "conda_run_config.data_references = {ds.name: dr}\n",
+ "\n",
"cd = CondaDependencies.create(pip_packages=['azureml-sdk[automl]'], conda_packages=['numpy'])\n",
"conda_run_config.environment.python.conda_dependencies = cd"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Data\n",
- "For remote executions you should author a `get_data.py` file containing a `get_data()` function. This file should be in the root directory of the project. You can encapsulate code to read data either from a blob storage or local disk in this file.\n",
- "In this example, the `get_data()` function returns data using scikit-learn's [load_digits](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) method."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "if not os.path.exists(project_folder):\n",
- " os.makedirs(project_folder)"
- ]
- },
{
"cell_type": "code",
"execution_count": null,
@@ -219,17 +232,13 @@
"source": [
"%%writefile $project_folder/get_data.py\n",
"\n",
- "from sklearn import datasets\n",
- "from scipy import sparse\n",
- "import numpy as np\n",
+ "import pandas as pd\n",
"\n",
"def get_data():\n",
- " \n",
- " digits = datasets.load_digits()\n",
- " X_train = digits.data\n",
- " y_train = digits.target\n",
+ " X_train = pd.read_csv(\"/tmp/azureml_runs/bai_data/X_train.tsv\", delimiter=\"\\t\", header=None, quotechar='\"')\n",
+ " y_train = pd.read_csv(\"/tmp/azureml_runs/bai_data/y_train.tsv\", delimiter=\"\\t\", header=None, quotechar='\"')\n",
"\n",
- " return { \"X\" : X_train, \"y\" : y_train }"
+ " return { \"X\" : X_train.values, \"y\" : y_train[0].values }\n"
]
},
{
diff --git a/how-to-use-azureml/automated-machine-learning/remote-execution-with-datastore/auto-ml-remote-execution-with-datastore.ipynb b/how-to-use-azureml/automated-machine-learning/remote-execution-with-datastore/auto-ml-remote-execution-with-datastore.ipynb
index fe4d9c3e..7604f816 100644
--- a/how-to-use-azureml/automated-machine-learning/remote-execution-with-datastore/auto-ml-remote-execution-with-datastore.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/remote-execution-with-datastore/auto-ml-remote-execution-with-datastore.ipynb
@@ -99,23 +99,6 @@
"outputDf.T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics=True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
@@ -123,7 +106,7 @@
"### Create a Remote Linux DSVM\n",
"Note: If creation fails with a message about Marketplace purchase eligibilty, go to portal.azure.com, start creating DSVM there, and select \"Want to create programmatically\" to enable programmatic creation. Once you've enabled it, you can exit without actually creating VM.\n",
"\n",
- "**Note**: By default SSH runs on port 22 and you don't need to specify it. But if for security reasons you can switch to a different port (such as 5022), you can append the port number to the address. [Read more](https://render.githubusercontent.com/documentation/sdk/ssh-issue.md) on this."
+        "**Note**: By default SSH runs on port 22 and you don't need to specify it. But if you switch to a different port (such as 5022) for security reasons, you can append the port number to the address. [Read more](https://docs.microsoft.com/en-us/azure/virtual-machines/troubleshooting/detailed-troubleshoot-ssh-connection) on this."
]
},
{
diff --git a/how-to-use-azureml/automated-machine-learning/remote-execution/auto-ml-remote-execution.ipynb b/how-to-use-azureml/automated-machine-learning/remote-execution/auto-ml-remote-execution.ipynb
index 5dca304a..241c7a95 100644
--- a/how-to-use-azureml/automated-machine-learning/remote-execution/auto-ml-remote-execution.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/remote-execution/auto-ml-remote-execution.ipynb
@@ -68,6 +68,7 @@
"import logging\n",
"import os\n",
"import time\n",
+ "import csv\n",
"\n",
"from matplotlib import pyplot as plt\n",
"import numpy as np\n",
@@ -90,7 +91,7 @@
"\n",
"# Choose a name for the run history container in the workspace.\n",
"experiment_name = 'automl-remote-dsvm'\n",
- "project_folder = './sample_projects/automl-remote-dsvm'\n",
+ "project_folder = './project'\n",
"\n",
"experiment = Experiment(ws, experiment_name)\n",
"\n",
@@ -107,23 +108,6 @@
"outputDf.T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
@@ -153,6 +137,44 @@
" time.sleep(90) # Wait for ssh to be accessible"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Data\n",
+ "For remote executions, you need to make the data accessible from the remote compute.\n",
+        "This can be done by uploading the data to the workspace's default datastore.\n",
+ "In this example, we upload scikit-learn's [load_digits](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) data."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "data_train = datasets.load_digits()\n",
+ "\n",
+ "if not os.path.isdir('data'):\n",
+ " os.mkdir('data')\n",
+ " \n",
+ "if not os.path.exists(project_folder):\n",
+ " os.makedirs(project_folder)\n",
+ " \n",
+ "pd.DataFrame(data_train.data).to_csv(\"data/X_train.tsv\", index=False, header=False, quoting=csv.QUOTE_ALL, sep=\"\\t\")\n",
+ "pd.DataFrame(data_train.target).to_csv(\"data/y_train.tsv\", index=False, header=False, sep=\"\\t\")\n",
+ "\n",
+ "ds = ws.get_default_datastore()\n",
+ "ds.upload(src_dir='./data', target_path='re_data', overwrite=True, show_progress=True)\n",
+ "\n",
+ "from azureml.core.runconfig import DataReferenceConfiguration\n",
+ "dr = DataReferenceConfiguration(datastore_name=ds.name, \n",
+ " path_on_datastore='re_data', \n",
+ " path_on_compute='/tmp/azureml_runs',\n",
+ " mode='download', # download files from datastore to compute target\n",
+ " overwrite=False)"
+ ]
+ },
{
"cell_type": "code",
"execution_count": null,
@@ -168,29 +190,13 @@
"# Set compute target to the Linux DSVM\n",
"conda_run_config.target = dsvm_compute\n",
"\n",
+        "# Set the data reference of the run configuration\n",
+ "conda_run_config.data_references = {ds.name: dr}\n",
+ "\n",
"cd = CondaDependencies.create(pip_packages=['azureml-sdk[automl]'], conda_packages=['numpy'])\n",
"conda_run_config.environment.python.conda_dependencies = cd"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Data\n",
- "For remote executions you should author a `get_data.py` file containing a `get_data()` function. This file should be in the root directory of the project. You can encapsulate code to read data either from a blob storage or local disk in this file.\n",
- "In this example, the `get_data()` function returns data using scikit-learn's [load_digits](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) method."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "if not os.path.exists(project_folder):\n",
- " os.makedirs(project_folder)"
- ]
- },
{
"cell_type": "code",
"execution_count": null,
@@ -199,17 +205,13 @@
"source": [
"%%writefile $project_folder/get_data.py\n",
"\n",
- "from sklearn import datasets\n",
- "from scipy import sparse\n",
- "import numpy as np\n",
+ "import pandas as pd\n",
"\n",
"def get_data():\n",
- " \n",
- " digits = datasets.load_digits()\n",
- " X_train = digits.data[100:,:]\n",
- " y_train = digits.target[100:]\n",
+ " X_train = pd.read_csv(\"/tmp/azureml_runs/re_data/X_train.tsv\", delimiter=\"\\t\", header=None, quotechar='\"')\n",
+ " y_train = pd.read_csv(\"/tmp/azureml_runs/re_data/y_train.tsv\", delimiter=\"\\t\", header=None, quotechar='\"')\n",
"\n",
- " return { \"X\" : X_train, \"y\" : y_train }"
+ " return { \"X\" : X_train.values, \"y\" : y_train[0].values }\n"
]
},
{
diff --git a/how-to-use-azureml/automated-machine-learning/sample-weight/auto-ml-sample-weight.ipynb b/how-to-use-azureml/automated-machine-learning/sample-weight/auto-ml-sample-weight.ipynb
index f64c282d..70e7b0d8 100644
--- a/how-to-use-azureml/automated-machine-learning/sample-weight/auto-ml-sample-weight.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/sample-weight/auto-ml-sample-weight.ipynb
@@ -75,7 +75,7 @@
"experiment_name = 'non_sample_weight_experiment'\n",
"sample_weight_experiment_name = 'sample_weight_experiment'\n",
"\n",
- "project_folder = './sample_projects/automl-local-classification'\n",
+ "project_folder = './sample_projects/sample_weight'\n",
"\n",
"experiment = Experiment(ws, experiment_name)\n",
"sample_weight_experiment=Experiment(ws, sample_weight_experiment_name)\n",
@@ -93,23 +93,6 @@
"outputDf.T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
diff --git a/how-to-use-azureml/automated-machine-learning/sparse-data-train-test-split/auto-ml-sparse-data-train-test-split.ipynb b/how-to-use-azureml/automated-machine-learning/sparse-data-train-test-split/auto-ml-sparse-data-train-test-split.ipynb
index 24a1989f..5bcaccff 100644
--- a/how-to-use-azureml/automated-machine-learning/sparse-data-train-test-split/auto-ml-sparse-data-train-test-split.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/sparse-data-train-test-split/auto-ml-sparse-data-train-test-split.ipynb
@@ -79,9 +79,9 @@
"ws = Workspace.from_config()\n",
"\n",
"# choose a name for the experiment\n",
- "experiment_name = 'automl-local-missing-data'\n",
+ "experiment_name = 'sparse-data-train-test-split'\n",
"# project folder\n",
- "project_folder = './sample_projects/automl-local-missing-data'\n",
+ "project_folder = './sample_projects/sparse-data-train-test-split'\n",
"\n",
"experiment = Experiment(ws, experiment_name)\n",
"\n",
@@ -98,23 +98,6 @@
"outputDf.T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
diff --git a/how-to-use-azureml/automated-machine-learning/subsampling/auto-ml-subsampling-local.ipynb b/how-to-use-azureml/automated-machine-learning/subsampling/auto-ml-subsampling-local.ipynb
index 31cb2355..745b5c08 100644
--- a/how-to-use-azureml/automated-machine-learning/subsampling/auto-ml-subsampling-local.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/subsampling/auto-ml-subsampling-local.ipynb
@@ -88,23 +88,6 @@
"pd.DataFrame(data = output, index = ['']).T"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
diff --git a/how-to-use-azureml/azure-databricks/amlsdk/build-model-run-history-03.ipynb b/how-to-use-azureml/azure-databricks/amlsdk/build-model-run-history-03.ipynb
index 6611b90d..11ef98b9 100644
--- a/how-to-use-azureml/azure-databricks/amlsdk/build-model-run-history-03.ipynb
+++ b/how-to-use-azureml/azure-databricks/amlsdk/build-model-run-history-03.ipynb
@@ -11,13 +11,6 @@
"Licensed under the MIT License."
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- ""
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
@@ -60,14 +53,10 @@
"metadata": {},
"outputs": [],
"source": [
- "# import the Workspace class and check the azureml SDK version\n",
- "from azureml.core import Workspace\n",
- "\n",
- "ws = Workspace.from_config(auth = auth)\n",
- "print('Workspace name: ' + ws.name, \n",
- " 'Azure region: ' + ws.location, \n",
- " 'Subscription id: ' + ws.subscription_id, \n",
- " 'Resource group: ' + ws.resource_group, sep = '\\n')"
+ "# Set auth to be used by workspace related APIs.\n",
+ "# For automation or CI/CD ServicePrincipalAuthentication can be used.\n",
+ "# https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.authentication.serviceprincipalauthentication?view=azure-ml-py\n",
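+    "# For example, a minimal sketch with placeholder values (replace with your own tenant and service principal):\n",
+    "# from azureml.core.authentication import ServicePrincipalAuthentication\n",
+    "# auth = ServicePrincipalAuthentication('<tenant-id>', '<service-principal-id>', '<service-principal-password>')\n",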
+ "auth = None"
]
},
{
@@ -79,7 +68,7 @@
"# import the Workspace class and check the azureml SDK version\n",
"from azureml.core import Workspace\n",
"\n",
- "ws = Workspace.from_config()\n",
+ "ws = Workspace.from_config(auth = auth)\n",
"print('Workspace name: ' + ws.name, \n",
" 'Azure region: ' + ws.location, \n",
" 'Subscription id: ' + ws.subscription_id, \n",
@@ -350,9 +339,6 @@
"authors": [
{
"name": "pasha"
- },
- {
- "name": "wamartin"
}
],
"kernelspec": {
@@ -370,9 +356,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.7.0"
+ "version": "3.6.6"
},
- "name": "03.Build_model_runHistory",
+ "name": "build-model-run-history-03",
"notebookId": 3836944406456339
},
"nbformat": 4,
diff --git a/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aci-04.ipynb b/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aci-04.ipynb
index 015f117a..c3aa26d8 100644
--- a/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aci-04.ipynb
+++ b/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aci-04.ipynb
@@ -20,13 +20,6 @@
"Please Register Azure Container Instance(ACI) using Azure Portal: https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-supported-services#portal in your subscription before using the SDK to deploy your ML model to ACI."
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- ""
- ]
- },
{
"cell_type": "code",
"execution_count": null,
@@ -45,15 +38,10 @@
"metadata": {},
"outputs": [],
"source": [
- "from azureml.core import Workspace\n",
- "\n",
- "#'''\n",
- "ws = Workspace.from_config(auth = auth)\n",
- "print('Workspace name: ' + ws.name, \n",
- " 'Azure region: ' + ws.location, \n",
- " 'Subscription id: ' + ws.subscription_id, \n",
- " 'Resource group: ' + ws.resource_group, sep = '\\n')\n",
- "#'''"
+ "# Set auth to be used by workspace related APIs.\n",
+ "# For automation or CI/CD ServicePrincipalAuthentication can be used.\n",
+ "# https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.authentication.serviceprincipalauthentication?view=azure-ml-py\n",
+ "auth = None"
]
},
{
@@ -63,18 +51,12 @@
"outputs": [],
"source": [
"from azureml.core import Workspace\n",
- "import azureml.core\n",
"\n",
- "# Check core SDK version number\n",
- "print(\"SDK version:\", azureml.core.VERSION)\n",
- "\n",
- "#'''\n",
- "ws = Workspace.from_config()\n",
+ "ws = Workspace.from_config(auth = auth)\n",
"print('Workspace name: ' + ws.name, \n",
" 'Azure region: ' + ws.location, \n",
" 'Subscription id: ' + ws.subscription_id, \n",
- " 'Resource group: ' + ws.resource_group, sep = '\\n')\n",
- "#'''"
+ " 'Resource group: ' + ws.resource_group, sep = '\\n')"
]
},
{
@@ -293,24 +275,14 @@
"outputs": [],
"source": [
"#comment to not delete the web service\n",
- "#myservice.delete()"
+ "myservice.delete()"
]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": []
}
],
"metadata": {
"authors": [
{
"name": "pasha"
- },
- {
- "name": "wamartin"
}
],
"kernelspec": {
@@ -328,9 +300,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.7.0"
+ "version": "3.6.6"
},
- "name": "04.DeploytoACI",
+ "name": "deploy-to-aci-04",
"notebookId": 3836944406456376
},
"nbformat": 4,
diff --git a/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aks-existingimage-05.ipynb b/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aks-existingimage-05.ipynb
new file mode 100644
index 00000000..bb57cb2c
--- /dev/null
+++ b/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aks-existingimage-05.ipynb
@@ -0,0 +1,236 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Azure ML & Azure Databricks notebooks by Parashar Shah.\n",
+ "\n",
+ "Copyright (c) Microsoft Corporation. All rights reserved.\n",
+ "\n",
+ "Licensed under the MIT License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+    "This notebook uses the image from the ACI notebook for deploying to AKS."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import azureml.core\n",
+ "\n",
+ "# Check core SDK version number\n",
+ "print(\"SDK version:\", azureml.core.VERSION)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Set auth to be used by workspace related APIs.\n",
+ "# For automation or CI/CD ServicePrincipalAuthentication can be used.\n",
+ "# https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.authentication.serviceprincipalauthentication?view=azure-ml-py\n",
+ "auth = None"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core import Workspace\n",
+ "\n",
+ "ws = Workspace.from_config(auth = auth)\n",
+ "print('Workspace name: ' + ws.name, \n",
+ " 'Azure region: ' + ws.location, \n",
+ " 'Subscription id: ' + ws.subscription_id, \n",
+ " 'Resource group: ' + ws.resource_group, sep = '\\n')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# List images by ws\n",
+ "\n",
+ "from azureml.core.image import ContainerImage\n",
+ "for i in ContainerImage.list(workspace = ws):\n",
+ " print('{}(v.{} [{}]) stored at {} with build log {}'.format(i.name, i.version, i.creation_state, i.image_location, i.image_build_log_uri))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.image import Image\n",
+ "myimage = Image(workspace=ws, name=\"aciws\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#create AKS compute\n",
+ "#it may take 20-25 minutes to create a new cluster\n",
+ "\n",
+ "from azureml.core.compute import AksCompute, ComputeTarget\n",
+ "\n",
+ "# Use the default configuration (can also provide parameters to customize)\n",
+ "prov_config = AksCompute.provisioning_configuration()\n",
+ "\n",
+ "aks_name = 'ps-aks-demo2' \n",
+ "\n",
+ "# Create the cluster\n",
+ "aks_target = ComputeTarget.create(workspace = ws, \n",
+ " name = aks_name, \n",
+ " provisioning_configuration = prov_config)\n",
+ "\n",
+ "aks_target.wait_for_completion(show_output = True)\n",
+ "\n",
+ "print(aks_target.provisioning_state)\n",
+ "print(aks_target.provisioning_errors)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.webservice import Webservice\n",
+ "help( Webservice.deploy_from_image)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.webservice import Webservice, AksWebservice\n",
+ "from azureml.core.image import ContainerImage\n",
+ "\n",
+ "#Set the web service configuration (using default here with app insights)\n",
+ "aks_config = AksWebservice.deploy_configuration(enable_app_insights=True)\n",
+ "\n",
+ "#unique service name\n",
+ "service_name ='ps-aks-service'\n",
+ "\n",
+ "# Webservice creation using single command, there is a variant to use image directly as well.\n",
+ "aks_service = Webservice.deploy_from_image(\n",
+ " workspace=ws, \n",
+ " name=service_name,\n",
+ " deployment_config = aks_config,\n",
+ " image = myimage,\n",
+ " deployment_target = aks_target\n",
+ " )\n",
+ "\n",
+ "aks_service.wait_for_deployment(show_output=True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "aks_service.deployment_status"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#for using the Web HTTP API \n",
+ "print(aks_service.scoring_uri)\n",
+ "print(aks_service.get_keys())"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import json\n",
+ "\n",
+ "#get the some sample data\n",
+ "test_data_path = \"AdultCensusIncomeTest\"\n",
+ "test = spark.read.parquet(test_data_path).limit(5)\n",
+ "\n",
+ "test_json = json.dumps(test.toJSON().collect())\n",
+ "\n",
+ "print(test_json)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#using data defined above predict if income is >50K (1) or <=50K (0)\n",
+ "aks_service.run(input_data=test_json)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#comment to not delete the web service\n",
+ "aks_service.delete()\n",
+ "#image.delete()\n",
+ "#model.delete()\n",
+ "aks_target.delete() "
+ ]
+ }
+ ],
+ "metadata": {
+ "authors": [
+ {
+ "name": "pasha"
+ }
+ ],
+ "kernelspec": {
+ "display_name": "Python 3.6",
+ "language": "python",
+ "name": "python36"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.6.6"
+ },
+ "name": "deploy-to-aks-existingimage-05",
+ "notebookId": 1030695628045968
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
\ No newline at end of file
diff --git a/how-to-use-azureml/azure-databricks/amlsdk/ingest-data-02.ipynb b/how-to-use-azureml/azure-databricks/amlsdk/ingest-data-02.ipynb
index 5391a021..5126fb23 100644
--- a/how-to-use-azureml/azure-databricks/amlsdk/ingest-data-02.ipynb
+++ b/how-to-use-azureml/azure-databricks/amlsdk/ingest-data-02.ipynb
@@ -11,13 +11,6 @@
"Licensed under the MIT License."
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- ""
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
@@ -42,7 +35,7 @@
"outputs": [],
"source": [
"# Download AdultCensusIncome.csv from Azure CDN. This file has 32,561 rows.\n",
- "basedataurl = \"https://amldockerdatasets.azureedge.net\"\n",
+ "dataurl = \"https://amldockerdatasets.azureedge.net/AdultCensusIncome.csv\"\n",
"datafile = \"AdultCensusIncome.csv\"\n",
"datafile_dbfs = os.path.join(\"/dbfs\", datafile)\n",
"\n",
@@ -50,7 +43,7 @@
" print(\"found {} at {}\".format(datafile, datafile_dbfs))\n",
"else:\n",
" print(\"downloading {} to {}\".format(datafile, datafile_dbfs))\n",
- " urllib.request.urlretrieve(os.path.join(basedataurl, datafile), datafile_dbfs)"
+ " urllib.request.urlretrieve(dataurl, datafile_dbfs)"
]
},
{
@@ -152,9 +145,6 @@
"authors": [
{
"name": "pasha"
- },
- {
- "name": "wamartin"
}
],
"kernelspec": {
@@ -172,9 +162,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.7.0"
+ "version": "3.6.6"
},
- "name": "02.Ingest_data",
+ "name": "ingest-data-02",
"notebookId": 3836944406456362
},
"nbformat": 4,
diff --git a/how-to-use-azureml/azure-databricks/amlsdk/installation-and-configuration-01.ipynb b/how-to-use-azureml/azure-databricks/amlsdk/installation-and-configuration-01.ipynb
index 85d48624..4a247b67 100644
--- a/how-to-use-azureml/azure-databricks/amlsdk/installation-and-configuration-01.ipynb
+++ b/how-to-use-azureml/azure-databricks/amlsdk/installation-and-configuration-01.ipynb
@@ -35,13 +35,6 @@
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- ""
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
@@ -67,6 +60,18 @@
"# workspace_region = \"\""
]
},
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Set auth to be used by workspace related APIs.\n",
+ "# For automation or CI/CD ServicePrincipalAuthentication can be used.\n",
+ "# https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.authentication.serviceprincipalauthentication?view=azure-ml-py\n",
+ "auth = None"
+ ]
+ },
{
"cell_type": "code",
"execution_count": null,
@@ -82,6 +87,7 @@
" subscription_id = subscription_id,\n",
" resource_group = resource_group, \n",
" location = workspace_region,\n",
+ " auth = auth,\n",
" exist_ok=True)"
]
},
@@ -103,12 +109,13 @@
"source": [
"ws = Workspace(workspace_name = workspace_name,\n",
" subscription_id = subscription_id,\n",
- " resource_group = resource_group)\n",
+ " resource_group = resource_group,\n",
+ " auth = auth)\n",
"\n",
"# persist the subscription id, resource group name, and workspace name in aml_config/config.json.\n",
"ws.write_config()\n",
- "##if you need to give a different path/filename please use this\n",
- "##write_config(path=\"/databricks/driver/aml_config/\",file_name=)"
+ "#if you need to give a different path/filename please use this\n",
+ "#write_config(path=\"/databricks/driver/aml_config/\",file_name=)"
]
},
{
@@ -129,29 +136,19 @@
"# import the Workspace class and check the azureml SDK version\n",
"from azureml.core import Workspace\n",
"\n",
- "ws = Workspace.from_config()\n",
+ "ws = Workspace.from_config(auth = auth)\n",
"#ws = Workspace.from_config()\n",
"print('Workspace name: ' + ws.name, \n",
" 'Azure region: ' + ws.location, \n",
" 'Subscription id: ' + ws.subscription_id, \n",
" 'Resource group: ' + ws.resource_group, sep = '\\n')"
]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": []
}
],
"metadata": {
"authors": [
{
"name": "pasha"
- },
- {
- "name": "wamartin"
}
],
"kernelspec": {
@@ -169,10 +166,10 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.7.0"
+ "version": "3.6.6"
},
- "name": "01.Installation_and_Configuration",
- "notebookId": 3836944406456490
+ "name": "installation-and-configuration-01",
+ "notebookId": 3688394266452835
},
"nbformat": 4,
"nbformat_minor": 1
diff --git a/how-to-use-azureml/azure-databricks/automl/automl-databricks-local-01.ipynb b/how-to-use-azureml/azure-databricks/automl/automl-databricks-local-01.ipynb
index 4022ad86..379cb8bd 100644
--- a/how-to-use-azureml/azure-databricks/automl/automl-databricks-local-01.ipynb
+++ b/how-to-use-azureml/azure-databricks/automl/automl-databricks-local-01.ipynb
@@ -1,701 +1,633 @@
{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Copyright (c) Microsoft Corporation. All rights reserved.\n",
- "\n",
- "Licensed under the MIT License."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Automated ML on Azure Databricks\n",
- "\n",
- "In this example we use the scikit-learn's digit dataset to showcase how you can use AutoML for a simple classification problem.\n",
- "\n",
- "In this notebook you will learn how to:\n",
- "1. Create Azure Machine Learning Workspace object and initialize your notebook directory to easily reload this object from a configuration file.\n",
- "2. Create an `Experiment` in an existing `Workspace`.\n",
- "3. Configure Automated ML using `AutoMLConfig`.\n",
- "4. Train the model using Azure Databricks.\n",
- "5. Explore the results.\n",
- "6. Test the best fitted model.\n",
- "\n",
- "Before running this notebook, please follow the readme for using Automated ML on Azure Databricks for installing necessary libraries to your cluster."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We support installing AML SDK with Automated ML as library from GUI. When attaching a library follow this link and add the below string as your PyPi package. You can select the option to attach the library to all clusters or just one cluster.\n",
- "\n",
- "**azureml-sdk with automated ml**\n",
- "* Source: Upload Python Egg or PyPi\n",
- "* PyPi Name: `azureml-sdk[automl_databricks]`\n",
- "* Select Install Library"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Check the Azure ML Core SDK Version to Validate Your Installation"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import azureml.core\n",
- "\n",
- "print(\"SDK Version:\", azureml.core.VERSION)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Initialize an Azure ML Workspace\n",
- "### What is an Azure ML Workspace and Why Do I Need One?\n",
- "\n",
- "An Azure ML workspace is an Azure resource that organizes and coordinates the actions of many other Azure resources to assist in executing and sharing machine learning workflows. In particular, an Azure ML workspace coordinates storage, databases, and compute resources providing added functionality for machine learning experimentation, operationalization, and the monitoring of operationalized models.\n",
- "\n",
- "\n",
- "### What do I Need?\n",
- "\n",
- "To create or access an Azure ML workspace, you will need to import the Azure ML library and specify following information:\n",
- "* A name for your workspace. You can choose one.\n",
- "* Your subscription id. Use the `id` value from the `az account show` command output above.\n",
- "* The resource group name. The resource group organizes Azure resources and provides a default region for the resources in the group. The resource group will be created if it doesn't exist. Resource groups can be created and viewed in the [Azure portal](https://portal.azure.com)\n",
- "* Supported regions include `eastus2`, `eastus`,`westcentralus`, `southeastasia`, `westeurope`, `australiaeast`, `westus2`, `southcentralus`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##PUBLISHONLY\n",
- "#subscription_id = \"\" #you should be owner or contributor\n",
- "#resource_group = \"\" #you should be owner or contributor\n",
- "#workspace_name = \"\" #your workspace name\n",
- "#workspace_region = \"\" #your region"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Creating a Workspace\n",
- "If you already have access to an Azure ML workspace you want to use, you can skip this cell. Otherwise, this cell will create an Azure ML workspace for you in the specified subscription, provided you have the correct permissions for the given `subscription_id`.\n",
- "\n",
- "This will fail when:\n",
- "1. The workspace already exists.\n",
- "2. You do not have permission to create a workspace in the resource group.\n",
- "3. You are not a subscription owner or contributor and no Azure ML workspaces have ever been created in this subscription.\n",
- "\n",
- "If workspace creation fails for any reason other than already existing, please work with your IT administrator to provide you with the appropriate permissions or to provision the required resources.\n",
- "\n",
- "**Note:** Creation of a new workspace can take several minutes."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##TESTONLY\n",
- "# import auth creds from notebook parameters\n",
- "tenant = dbutils.widgets.get('tenant_id')\n",
- "username = dbutils.widgets.get('service_principal_id')\n",
- "password = dbutils.widgets.get('service_principal_password')\n",
- "\n",
- "auth = azureml.core.authentication.ServicePrincipalAuthentication(tenant, username, password)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##TESTONLY\n",
- "subscription_id = dbutils.widgets.get('subscription_id')\n",
- "resource_group = dbutils.widgets.get('resource_group')\n",
- "workspace_name = dbutils.widgets.get('workspace_name')\n",
- "workspace_region = dbutils.widgets.get('workspace_region')"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##TESTONLY\n",
- "# Import the Workspace class and check the Azure ML SDK version.\n",
- "from azureml.core import Workspace\n",
- "\n",
- "ws = Workspace.create(name = workspace_name,\n",
- " subscription_id = subscription_id,\n",
- " resource_group = resource_group, \n",
- " location = workspace_region,\n",
- " auth = auth,\n",
- " exist_ok=True)\n",
- "ws.get_details()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##PUBLISHONLY\n",
- "## Import the Workspace class and check the Azure ML SDK version.\n",
- "#from azureml.core import Workspace\n",
- "\n",
- "#ws = Workspace.create(name = workspace_name,\n",
- "# subscription_id = subscription_id,\n",
- "# resource_group = resource_group, \n",
- "# location = workspace_region, \n",
- "# exist_ok=True)\n",
- "#ws.get_details()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Configuring Your Local Environment\n",
- "You can validate that you have access to the specified workspace and write a configuration file to the default configuration location, `./aml_config/config.json`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##TESTONLY\n",
- "from azureml.core import Workspace\n",
- "\n",
- "ws = Workspace(workspace_name = workspace_name,\n",
- " subscription_id = subscription_id,\n",
- " resource_group = resource_group,\n",
- " auth = auth)\n",
- "\n",
- "# Persist the subscription id, resource group name, and workspace name in aml_config/config.json.\n",
- "ws.write_config()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##PUBLISHONLY\n",
- "#from azureml.core import Workspace\n",
- "#\n",
- "#ws = Workspace(workspace_name = workspace_name,\n",
- "# subscription_id = subscription_id,\n",
- "# resource_group = resource_group)\n",
- "#\n",
- "## Persist the subscription id, resource group name, and workspace name in aml_config/config.json.\n",
- "#ws.write_config()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Create a Folder to Host Sample Projects\n",
- "Finally, create a folder where all the sample projects will be hosted."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import os\n",
- "\n",
- "sample_projects_folder = './sample_projects'\n",
- "\n",
- "if not os.path.isdir(sample_projects_folder):\n",
- " os.mkdir(sample_projects_folder)\n",
- " \n",
- "print('Sample projects will be created in {}.'.format(sample_projects_folder))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Create an Experiment\n",
- "\n",
- "As part of the setup you have already created an Azure ML `Workspace` object. For Automated ML you will need to create an `Experiment` object, which is a named object in a `Workspace` used to run experiments."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import logging\n",
- "import os\n",
- "import random\n",
- "import time\n",
- "\n",
- "from matplotlib import pyplot as plt\n",
- "from matplotlib.pyplot import imshow\n",
- "import numpy as np\n",
- "import pandas as pd\n",
- "\n",
- "import azureml.core\n",
- "from azureml.core.experiment import Experiment\n",
- "from azureml.core.workspace import Workspace\n",
- "from azureml.train.automl import AutoMLConfig\n",
- "from azureml.train.automl.run import AutoMLRun"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Choose a name for the experiment and specify the project folder.\n",
- "experiment_name = 'automl-local-classification'\n",
- "project_folder = './sample_projects/automl-local-classification'\n",
- "\n",
- "experiment = Experiment(ws, experiment_name)\n",
- "\n",
- "output = {}\n",
- "output['SDK version'] = azureml.core.VERSION\n",
- "output['Subscription ID'] = ws.subscription_id\n",
- "output['Workspace Name'] = ws.name\n",
- "output['Resource Group'] = ws.resource_group\n",
- "output['Location'] = ws.location\n",
- "output['Project Directory'] = project_folder\n",
- "output['Experiment Name'] = experiment.name\n",
- "pd.set_option('display.max_colwidth', -1)\n",
- "pd.DataFrame(data = output, index = ['']).T"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Diagnostics\n",
- "\n",
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Registering Datastore"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Datastore is the way to save connection information to a storage service (e.g. Azure Blob, Azure Data Lake, Azure SQL) information to your workspace so you can access them without exposing credentials in your code. The first thing you will need to do is register a datastore, you can refer to our [python SDK documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.datastore.datastore?view=azure-ml-py) on how to register datastores. __Note: for best security practices, please do not check in code that contains registering datastores with secrets into your source control__\n",
- "\n",
- "The code below registers a datastore pointing to a publicly readable blob container."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.core import Datastore\n",
- "\n",
- "datastore_name = 'demo_training'\n",
- "Datastore.register_azure_blob_container(\n",
- " workspace = ws, \n",
- " datastore_name = datastore_name, \n",
- " container_name = 'automl-notebook-data', \n",
- " account_name = 'dprepdata'\n",
- ")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Below is an example on how to register a private blob container\n",
- "```python\n",
- "datastore = Datastore.register_azure_blob_container(\n",
- " workspace = ws, \n",
- " datastore_name = 'example_datastore', \n",
- " container_name = 'example-container', \n",
- " account_name = 'storageaccount',\n",
- " account_key = 'accountkey'\n",
- ")\n",
- "```\n",
- "The example below shows how to register an Azure Data Lake store. Please make sure you have granted the necessary permissions for the service principal to access the data lake.\n",
- "```python\n",
- "datastore = Datastore.register_azure_data_lake(\n",
- " workspace = ws,\n",
- " datastore_name = 'example_datastore',\n",
- " store_name = 'adlsstore',\n",
- " tenant_id = 'tenant-id-of-service-principal',\n",
- " client_id = 'client-id-of-service-principal',\n",
- " client_secret = 'client-secret-of-service-principal'\n",
- ")\n",
- "```"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Load Training Data Using DataPrep"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Automated ML takes a Dataflow as input.\n",
- "\n",
- "If you are familiar with Pandas and have done your data preparation work in Pandas already, you can use the `read_pandas_dataframe` method in dprep to convert the DataFrame to a Dataflow.\n",
- "```python\n",
- "df = pd.read_csv(...)\n",
- "# apply some transforms\n",
- "dprep.read_pandas_dataframe(df, temp_folder='/path/accessible/by/both/driver/and/worker')\n",
- "```\n",
- "\n",
- "If you just need to ingest data without doing any preparation, you can directly use AzureML Data Prep (Data Prep) to do so. The code below demonstrates this scenario. Data Prep also has data preparation capabilities, we have many [sample notebooks](https://github.com/Microsoft/AMLDataPrepDocs) demonstrating the capabilities.\n",
- "\n",
- "You will get the datastore you registered previously and pass it to Data Prep for reading. The data comes from the digits dataset: `sklearn.datasets.load_digits()`. `DataPath` points to a specific location within a datastore. "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import azureml.dataprep as dprep\n",
- "from azureml.data.datapath import DataPath\n",
- "\n",
- "datastore = Datastore.get(workspace = ws, name = datastore_name)\n",
- "\n",
- "X_train = dprep.read_csv(DataPath(datastore = datastore, path_on_datastore = 'X.csv')) \n",
- "y_train = dprep.read_csv(DataPath(datastore = datastore, path_on_datastore = 'y.csv')).to_long(dprep.ColumnSelector(term='.*', use_regex = True))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Review the Data Preparation Result\n",
- "You can peek the result of a Dataflow at any range using `skip(i)` and `head(j)`. Doing so evaluates only j records for all the steps in the Dataflow, which makes it fast even against large datasets."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "X_train.get_profile()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "y_train.get_profile()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Configure AutoML\n",
- "\n",
- "Instantiate an `AutoMLConfig` object to specify the settings and data used to run the experiment.\n",
- "\n",
- "|Property|Description|\n",
- "|-|-|\n",
- "|**task**|classification or regression|\n",
- "|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics:
accuracy
AUC_weighted
average_precision_score_weighted
norm_macro_recall
precision_score_weighted|\n",
- "|**primary_metric**|This is the metric that you want to optimize. Regression supports the following primary metrics:
spearman_correlation
normalized_root_mean_squared_error
r2_score
normalized_mean_absolute_error|\n",
- "|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n",
- "|**iterations**|Number of iterations. In each iteration AutoML trains a specific pipeline with the data.|\n",
- "|**n_cross_validations**|Number of cross validation splits.|\n",
- "|**spark_context**|Spark Context object. for Databricks, use spark_context=sc|\n",
- "|**max_concurrent_iterations**|Maximum number of iterations to execute in parallel. This should be <= number of worker nodes in your Azure Databricks cluster.|\n",
- "|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n",
- "|**y**|(sparse) array-like, shape = [n_samples, ], [n_samples, n_classes]
Multi-class targets. An indicator matrix turns on multilabel classification. This should be an array of integers.|\n",
- "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder.|\n",
- "|**preprocess**|set this to True to enable pre-processing of data eg. string to numeric using one-hot encoding|\n",
- "|**exit_score**|Target score for experiment. It is associated with the metric. eg. exit_score=0.995 will exit experiment after that|"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "automl_config = AutoMLConfig(task = 'classification',\n",
- " debug_log = 'automl_errors.log',\n",
- " primary_metric = 'AUC_weighted',\n",
- " iteration_timeout_minutes = 10,\n",
- " iterations = 5,\n",
- " preprocess = True,\n",
- " n_cross_validations = 10,\n",
- " max_concurrent_iterations = 2, #change it based on number of worker nodes\n",
- " verbosity = logging.INFO,\n",
- " spark_context=sc, #databricks/spark related\n",
- " X = X_train, \n",
- " y = y_train,\n",
- " path = project_folder)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Train the Models\n",
- "\n",
- "Call the `submit` method on the experiment object and pass the run configuration. Execution of local runs is synchronous. Depending on the data and the number of iterations this can run for a while."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "local_run = experiment.submit(automl_config, show_output = False) # for higher runs please use show_output=False and use the below"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Explore the Results"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Portal URL for Monitoring Runs\n",
- "\n",
- "The following will provide a link to the web interface to explore individual run details and status. In the future we might support output displayed in the notebook."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "displayHTML(\"Your experiment in Azure Portal: {}\".format(local_run.get_portal_url(), local_run.id))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The following will show the child runs and waits for the parent run to complete."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Retrieve All Child Runs after the experiment is completed (in portal)\n",
- "You can also use SDK methods to fetch all the child runs and see individual metrics that we log."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "children = list(local_run.get_children())\n",
- "metricslist = {}\n",
- "for run in children:\n",
- " properties = run.get_properties()\n",
- " metrics = {k: v for k, v in run.get_metrics().items() if isinstance(v, float)} \n",
- " metricslist[int(properties['iteration'])] = metrics\n",
- "\n",
- "rundata = pd.DataFrame(metricslist).sort_index(1)\n",
- "rundata"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Retrieve the Best Model after the above run is complete \n",
- "\n",
- "Below we select the best pipeline from our iterations. The `get_output` method returns the best run and the fitted model. The Model includes the pipeline and any pre-processing. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "best_run, fitted_model = local_run.get_output()\n",
- "print(best_run)\n",
- "print(fitted_model)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Best Model Based on Any Other Metric after the above run is complete based on the child run\n",
- "Show the run and the model that has the smallest `log_loss` value:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "lookup_metric = \"log_loss\"\n",
- "best_run, fitted_model = local_run.get_output(metric = lookup_metric)\n",
- "print(best_run)\n",
- "print(fitted_model)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Test the Best Fitted Model\n",
- "\n",
- "#### Load Test Data - you can split the dataset beforehand & pass Train dataset to AutoML and use Test dataset to evaluate the best model."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from sklearn import datasets\n",
- "digits = datasets.load_digits()\n",
- "X_test = digits.data[:10, :]\n",
- "y_test = digits.target[:10]\n",
- "images = digits.images[:10]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Testing Our Best Fitted Model\n",
- "We will try to predict digits and see how our model works. This is just an example to show you."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Randomly select digits and test.\n",
- "for index in np.random.choice(len(y_test), 2, replace = False):\n",
- " print(index)\n",
- " predicted = fitted_model.predict(X_test[index:index + 1])[0]\n",
- " label = y_test[index]\n",
- " title = \"Label value = %d Predicted value = %d \" % (label, predicted)\n",
- " fig = plt.figure(1, figsize = (3,3))\n",
- " ax1 = fig.add_axes((0,0,.8,.8))\n",
- " ax1.set_title(title)\n",
- " plt.imshow(images[index], cmap = plt.cm.gray_r, interpolation = 'nearest')\n",
- " display(fig)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "When deploying an automated ML trained model, please specify _pippackages=['azureml-sdk[automl]']_ in your CondaDependencies.\n",
- "\n",
- "Please refer to only the **Deploy** section in this notebook - Deployment of Automated ML trained model"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": []
- }
- ],
- "metadata": {
- "authors": [
- {
- "name": "savitam"
- },
- {
- "name": "wamartin"
- }
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Copyright (c) Microsoft Corporation. All rights reserved.\n",
+ "\n",
+ "Licensed under the MIT License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Automated ML on Azure Databricks\n",
+ "\n",
+ "In this example we use the scikit-learn's digit dataset to showcase how you can use AutoML for a simple classification problem.\n",
+ "\n",
+ "In this notebook you will learn how to:\n",
+ "1. Create Azure Machine Learning Workspace object and initialize your notebook directory to easily reload this object from a configuration file.\n",
+ "2. Create an `Experiment` in an existing `Workspace`.\n",
+ "3. Configure Automated ML using `AutoMLConfig`.\n",
+ "4. Train the model using Azure Databricks.\n",
+ "5. Explore the results.\n",
+ "6. Test the best fitted model.\n",
+ "\n",
+ "Before running this notebook, please follow the readme for using Automated ML on Azure Databricks for installing necessary libraries to your cluster."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We support installing AML SDK with Automated ML as library from GUI. When attaching a library follow this link and add the below string as your PyPi package. You can select the option to attach the library to all clusters or just one cluster.\n",
+ "\n",
+ "**azureml-sdk with automated ml**\n",
+ "* Source: Upload Python Egg or PyPi\n",
+ "* PyPi Name: `azureml-sdk[automl_databricks]`\n",
+ "* Select Install Library"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Check the Azure ML Core SDK Version to Validate Your Installation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import azureml.core\n",
+ "\n",
+ "print(\"SDK Version:\", azureml.core.VERSION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Initialize an Azure ML Workspace\n",
+ "### What is an Azure ML Workspace and Why Do I Need One?\n",
+ "\n",
+ "An Azure ML workspace is an Azure resource that organizes and coordinates the actions of many other Azure resources to assist in executing and sharing machine learning workflows. In particular, an Azure ML workspace coordinates storage, databases, and compute resources providing added functionality for machine learning experimentation, operationalization, and the monitoring of operationalized models.\n",
+ "\n",
+ "\n",
+ "### What do I Need?\n",
+ "\n",
+ "To create or access an Azure ML workspace, you will need to import the Azure ML library and specify following information:\n",
+ "* A name for your workspace. You can choose one.\n",
+ "* Your subscription id. Use the `id` value from the `az account show` command output above.\n",
+ "* The resource group name. The resource group organizes Azure resources and provides a default region for the resources in the group. The resource group will be created if it doesn't exist. Resource groups can be created and viewed in the [Azure portal](https://portal.azure.com)\n",
+ "* Supported regions include `eastus2`, `eastus`,`westcentralus`, `southeastasia`, `westeurope`, `australiaeast`, `westus2`, `southcentralus`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "subscription_id = \"\" #you should be owner or contributor\n",
+ "resource_group = \"\" #you should be owner or contributor\n",
+ "workspace_name = \"\" #your workspace name\n",
+ "workspace_region = \"\" #your region"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Creating a Workspace\n",
+ "If you already have access to an Azure ML workspace you want to use, you can skip this cell. Otherwise, this cell will create an Azure ML workspace for you in the specified subscription, provided you have the correct permissions for the given `subscription_id`.\n",
+ "\n",
+ "This will fail when:\n",
+ "1. The workspace already exists.\n",
+ "2. You do not have permission to create a workspace in the resource group.\n",
+ "3. You are not a subscription owner or contributor and no Azure ML workspaces have ever been created in this subscription.\n",
+ "\n",
+ "If workspace creation fails for any reason other than already existing, please work with your IT administrator to provide you with the appropriate permissions or to provision the required resources.\n",
+ "\n",
+ "**Note:** Creation of a new workspace can take several minutes."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Import the Workspace class and check the Azure ML SDK version.\n",
+ "from azureml.core import Workspace\n",
+ "\n",
+ "ws = Workspace.create(name = workspace_name,\n",
+ " subscription_id = subscription_id,\n",
+ " resource_group = resource_group, \n",
+ " location = workspace_region, \n",
+ " exist_ok=True)\n",
+ "ws.get_details()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Configuring Your Local Environment\n",
+ "You can validate that you have access to the specified workspace and write a configuration file to the default configuration location, `./aml_config/config.json`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core import Workspace\n",
+ "\n",
+ "ws = Workspace(workspace_name = workspace_name,\n",
+ " subscription_id = subscription_id,\n",
+ " resource_group = resource_group)\n",
+ "\n",
+ "# Persist the subscription id, resource group name, and workspace name in aml_config/config.json.\n",
+ "ws.write_config()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Create a Folder to Host Sample Projects\n",
+ "Finally, create a folder where all the sample projects will be hosted."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "sample_projects_folder = './sample_projects'\n",
+ "\n",
+ "if not os.path.isdir(sample_projects_folder):\n",
+ " os.mkdir(sample_projects_folder)\n",
+ " \n",
+ "print('Sample projects will be created in {}.'.format(sample_projects_folder))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Create an Experiment\n",
+ "\n",
+ "As part of the setup you have already created an Azure ML `Workspace` object. For Automated ML you will need to create an `Experiment` object, which is a named object in a `Workspace` used to run experiments."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import logging\n",
+ "import os\n",
+ "import random\n",
+ "import time\n",
+ "\n",
+ "from matplotlib import pyplot as plt\n",
+ "from matplotlib.pyplot import imshow\n",
+ "import numpy as np\n",
+ "import pandas as pd\n",
+ "\n",
+ "import azureml.core\n",
+ "from azureml.core.experiment import Experiment\n",
+ "from azureml.core.workspace import Workspace\n",
+ "from azureml.train.automl import AutoMLConfig\n",
+ "from azureml.train.automl.run import AutoMLRun"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Choose a name for the experiment and specify the project folder.\n",
+ "experiment_name = 'automl-local-classification'\n",
+ "project_folder = './sample_projects/automl-local-classification'\n",
+ "\n",
+ "experiment = Experiment(ws, experiment_name)\n",
+ "\n",
+ "output = {}\n",
+ "output['SDK version'] = azureml.core.VERSION\n",
+ "output['Subscription ID'] = ws.subscription_id\n",
+ "output['Workspace Name'] = ws.name\n",
+ "output['Resource Group'] = ws.resource_group\n",
+ "output['Location'] = ws.location\n",
+ "output['Project Directory'] = project_folder\n",
+ "output['Experiment Name'] = experiment.name\n",
+ "pd.set_option('display.max_colwidth', -1)\n",
+ "pd.DataFrame(data = output, index = ['']).T"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Diagnostics\n",
+ "\n",
+ "Opt-in diagnostics for better experience, quality, and security of future releases."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.telemetry import set_diagnostics_collection\n",
+ "set_diagnostics_collection(send_diagnostics = True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Registering Datastore"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Datastore is the way to save connection information to a storage service (e.g. Azure Blob, Azure Data Lake, Azure SQL) information to your workspace so you can access them without exposing credentials in your code. The first thing you will need to do is register a datastore, you can refer to our [python SDK documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.datastore.datastore?view=azure-ml-py) on how to register datastores. __Note: for best security practices, please do not check in code that contains registering datastores with secrets into your source control__\n",
+ "\n",
+ "The code below registers a datastore pointing to a publicly readable blob container."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core import Datastore\n",
+ "\n",
+ "datastore_name = 'demo_training'\n",
+ "Datastore.register_azure_blob_container(\n",
+ " workspace = ws, \n",
+ " datastore_name = datastore_name, \n",
+ " container_name = 'automl-notebook-data', \n",
+ " account_name = 'dprepdata'\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Below is an example on how to register a private blob container\n",
+ "```python\n",
+ "datastore = Datastore.register_azure_blob_container(\n",
+ " workspace = ws, \n",
+ " datastore_name = 'example_datastore', \n",
+ " container_name = 'example-container', \n",
+ " account_name = 'storageaccount',\n",
+ " account_key = 'accountkey'\n",
+ ")\n",
+ "```\n",
+ "The example below shows how to register an Azure Data Lake store. Please make sure you have granted the necessary permissions for the service principal to access the data lake.\n",
+ "```python\n",
+ "datastore = Datastore.register_azure_data_lake(\n",
+ " workspace = ws,\n",
+ " datastore_name = 'example_datastore',\n",
+ " store_name = 'adlsstore',\n",
+ " tenant_id = 'tenant-id-of-service-principal',\n",
+ " client_id = 'client-id-of-service-principal',\n",
+ " client_secret = 'client-secret-of-service-principal'\n",
+ ")\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Load Training Data Using DataPrep"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Automated ML takes a Dataflow as input.\n",
+ "\n",
+ "If you are familiar with Pandas and have done your data preparation work in Pandas already, you can use the `read_pandas_dataframe` method in dprep to convert the DataFrame to a Dataflow.\n",
+ "```python\n",
+ "df = pd.read_csv(...)\n",
+ "# apply some transforms\n",
+ "dprep.read_pandas_dataframe(df, temp_folder='/path/accessible/by/both/driver/and/worker')\n",
+ "```\n",
+ "\n",
+ "If you just need to ingest data without doing any preparation, you can directly use AzureML Data Prep (Data Prep) to do so. The code below demonstrates this scenario. Data Prep also has data preparation capabilities, we have many [sample notebooks](https://github.com/Microsoft/AMLDataPrepDocs) demonstrating the capabilities.\n",
+ "\n",
+ "You will get the datastore you registered previously and pass it to Data Prep for reading. The data comes from the digits dataset: `sklearn.datasets.load_digits()`. `DataPath` points to a specific location within a datastore. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import azureml.dataprep as dprep\n",
+ "from azureml.data.datapath import DataPath\n",
+ "\n",
+ "datastore = Datastore.get(workspace = ws, name = datastore_name)\n",
+ "\n",
+ "X_train = dprep.read_csv(DataPath(datastore = datastore, path_on_datastore = 'X.csv')) \n",
+ "y_train = dprep.read_csv(DataPath(datastore = datastore, path_on_datastore = 'y.csv')).to_long(dprep.ColumnSelector(term='.*', use_regex = True))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Review the Data Preparation Result\n",
+ "You can peek the result of a Dataflow at any range using `skip(i)` and `head(j)`. Doing so evaluates only j records for all the steps in the Dataflow, which makes it fast even against large datasets."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "X_train.get_profile()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "y_train.get_profile()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Configure AutoML\n",
+ "\n",
+ "Instantiate an `AutoMLConfig` object to specify the settings and data used to run the experiment.\n",
+ "\n",
+ "|Property|Description|\n",
+ "|-|-|\n",
+ "|**task**|classification or regression|\n",
+ "|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics:
accuracy
AUC_weighted
average_precision_score_weighted
norm_macro_recall
precision_score_weighted|\n",
+ "|**primary_metric**|This is the metric that you want to optimize. Regression supports the following primary metrics:
spearman_correlation
normalized_root_mean_squared_error
r2_score
normalized_mean_absolute_error|\n",
+ "|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n",
+ "|**iterations**|Number of iterations. In each iteration AutoML trains a specific pipeline with the data.|\n",
+ "|**n_cross_validations**|Number of cross validation splits.|\n",
+ "|**spark_context**|Spark Context object. for Databricks, use spark_context=sc|\n",
+ "|**max_concurrent_iterations**|Maximum number of iterations to execute in parallel. This should be <= number of worker nodes in your Azure Databricks cluster.|\n",
+ "|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n",
+ "|**y**|(sparse) array-like, shape = [n_samples, ], [n_samples, n_classes]
Multi-class targets. An indicator matrix turns on multilabel classification. This should be an array of integers.|\n",
+ "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder.|\n",
+ "|**preprocess**|set this to True to enable pre-processing of data eg. string to numeric using one-hot encoding|\n",
+ "|**exit_score**|Target score for experiment. It is associated with the metric. eg. exit_score=0.995 will exit experiment after that|"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "automl_config = AutoMLConfig(task = 'classification',\n",
+ " debug_log = 'automl_errors.log',\n",
+ " primary_metric = 'AUC_weighted',\n",
+ " iteration_timeout_minutes = 10,\n",
+ " iterations = 5,\n",
+ " preprocess = True,\n",
+ " n_cross_validations = 10,\n",
+ " max_concurrent_iterations = 2, #change it based on number of worker nodes\n",
+ " verbosity = logging.INFO,\n",
+ " spark_context=sc, #databricks/spark related\n",
+ " X = X_train, \n",
+ " y = y_train,\n",
+ " path = project_folder)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Train the Models\n",
+ "\n",
+ "Call the `submit` method on the experiment object and pass the run configuration. Execution of local runs is synchronous. Depending on the data and the number of iterations this can run for a while."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "local_run = experiment.submit(automl_config, show_output = False) # for higher runs please use show_output=False and use the below"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Explore the Results"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Portal URL for Monitoring Runs\n",
+ "\n",
+ "The following will provide a link to the web interface to explore individual run details and status. In the future we might support output displayed in the notebook."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "displayHTML(\"Your experiment in Azure Portal: {}\".format(local_run.get_portal_url(), local_run.id))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The following will show the child runs and waits for the parent run to complete."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Retrieve All Child Runs after the experiment is completed (in portal)\n",
+ "You can also use SDK methods to fetch all the child runs and see individual metrics that we log."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "children = list(local_run.get_children())\n",
+ "metricslist = {}\n",
+ "for run in children:\n",
+ " properties = run.get_properties()\n",
+ " metrics = {k: v for k, v in run.get_metrics().items() if isinstance(v, float)} \n",
+ " metricslist[int(properties['iteration'])] = metrics\n",
+ "\n",
+ "rundata = pd.DataFrame(metricslist).sort_index(1)\n",
+ "rundata"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Retrieve the Best Model after the above run is complete \n",
+ "\n",
+ "Below we select the best pipeline from our iterations. The `get_output` method returns the best run and the fitted model. The Model includes the pipeline and any pre-processing. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "best_run, fitted_model = local_run.get_output()\n",
+ "print(best_run)\n",
+ "print(fitted_model)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Best Model Based on Any Other Metric after the above run is complete based on the child run\n",
+ "Show the run and the model that has the smallest `log_loss` value:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "lookup_metric = \"log_loss\"\n",
+ "best_run, fitted_model = local_run.get_output(metric = lookup_metric)\n",
+ "print(best_run)\n",
+ "print(fitted_model)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Test the Best Fitted Model\n",
+ "\n",
+ "#### Load Test Data - you can split the dataset beforehand & pass Train dataset to AutoML and use Test dataset to evaluate the best model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from sklearn import datasets\n",
+ "digits = datasets.load_digits()\n",
+ "X_test = digits.data[:10, :]\n",
+ "y_test = digits.target[:10]\n",
+ "images = digits.images[:10]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Testing Our Best Fitted Model\n",
+ "We will try to predict digits and see how our model works. This is just an example to show you."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Randomly select digits and test.\n",
+ "for index in np.random.choice(len(y_test), 2, replace = False):\n",
+ " print(index)\n",
+ " predicted = fitted_model.predict(X_test[index:index + 1])[0]\n",
+ " label = y_test[index]\n",
+ " title = \"Label value = %d Predicted value = %d \" % (label, predicted)\n",
+ " fig = plt.figure(1, figsize = (3,3))\n",
+ " ax1 = fig.add_axes((0,0,.8,.8))\n",
+ " ax1.set_title(title)\n",
+ " plt.imshow(images[index], cmap = plt.cm.gray_r, interpolation = 'nearest')\n",
+ " display(fig)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "When deploying an automated ML trained model, please specify _pippackages=['azureml-sdk[automl]']_ in your CondaDependencies.\n",
+ "\n",
+ "Please refer to only the **Deploy** section in this notebook - Deployment of Automated ML trained model"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ }
],
- "kernelspec": {
- "display_name": "Python 3.6",
- "language": "python",
- "name": "python36"
+ "metadata": {
+ "authors": [
+ {
+ "name": "savitam"
+ },
+ {
+ "name": "wamartin"
+ }
+ ],
+ "kernelspec": {
+ "display_name": "Python 3.6",
+ "language": "python",
+ "name": "python36"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.6.5"
+ },
+ "name": "auto-ml-classification-local-adb",
+ "notebookId": 587284549713154
},
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.6.5"
- },
- "name": "auto-ml-classification-local-adb",
- "notebookId": 587284549713154
- },
- "nbformat": 4,
- "nbformat_minor": 1
-}
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
\ No newline at end of file
diff --git a/how-to-use-azureml/azure-databricks/automl/automl-databricks-local-with-deployment.ipynb b/how-to-use-azureml/azure-databricks/automl/automl-databricks-local-with-deployment.ipynb
index 1dce4fed..2254f111 100644
--- a/how-to-use-azureml/azure-databricks/automl/automl-databricks-local-with-deployment.ipynb
+++ b/how-to-use-azureml/azure-databricks/automl/automl-databricks-local-with-deployment.ipynb
@@ -1,858 +1,789 @@
{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Copyright (c) Microsoft Corporation. All rights reserved.\n",
- "\n",
- "Licensed under the MIT License."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We support installing AML SDK as library from GUI. When attaching a library follow this https://docs.databricks.com/user-guide/libraries.html and add the below string as your PyPi package. You can select the option to attach the library to all clusters or just one cluster.\n",
- "\n",
- "**install azureml-sdk with Automated ML**\n",
- "* Source: Upload Python Egg or PyPi\n",
- "* PyPi Name: `azureml-sdk[automl_databricks]`\n",
- "* Select Install Library"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# AutoML : Classification with Local Compute on Azure DataBricks with deployment to ACI\n",
- "\n",
- "In this example we use the scikit-learn's [digit dataset](http://scikit-learn.org/stable/datasets/index.html#optical-recognition-of-handwritten-digits-dataset) to showcase how you can use AutoML for a simple classification problem.\n",
- "\n",
- "In this notebook you will learn how to:\n",
- "1. Create Azure Machine Learning Workspace object and initialize your notebook directory to easily reload this object from a configuration file.\n",
- "2. Create an `Experiment` in an existing `Workspace`.\n",
- "3. Configure AutoML using `AutoMLConfig`.\n",
- "4. Train the model using AzureDataBricks.\n",
- "5. Explore the results.\n",
- "6. Register the model.\n",
- "7. Deploy the model.\n",
- "8. Test the best fitted model.\n",
- "\n",
- "Prerequisites:\n",
- "Before running this notebook, please follow the readme for installing necessary libraries to your cluster."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Register Machine Learning Services Resource Provider\n",
- "Microsoft.MachineLearningServices only needs to be registed once in the subscription. To register it:\n",
- "Start the Azure portal.\n",
- "Select your All services and then Subscription.\n",
- "Select the subscription that you want to use.\n",
- "Click on Resource providers\n",
- "Click the Register link next to Microsoft.MachineLearningServices"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Check the Azure ML Core SDK Version to Validate Your Installation"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import azureml.core\n",
- "\n",
- "print(\"SDK Version:\", azureml.core.VERSION)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Initialize an Azure ML Workspace\n",
- "### What is an Azure ML Workspace and Why Do I Need One?\n",
- "\n",
- "An Azure ML workspace is an Azure resource that organizes and coordinates the actions of many other Azure resources to assist in executing and sharing machine learning workflows. In particular, an Azure ML workspace coordinates storage, databases, and compute resources providing added functionality for machine learning experimentation, operationalization, and the monitoring of operationalized models.\n",
- "\n",
- "\n",
- "### What do I Need?\n",
- "\n",
- "To create or access an Azure ML workspace, you will need to import the Azure ML library and specify following information:\n",
- "* A name for your workspace. You can choose one.\n",
- "* Your subscription id. Use the `id` value from the `az account show` command output above.\n",
- "* The resource group name. The resource group organizes Azure resources and provides a default region for the resources in the group. The resource group will be created if it doesn't exist. Resource groups can be created and viewed in the [Azure portal](https://portal.azure.com)\n",
- "* Supported regions include `eastus2`, `eastus`,`westcentralus`, `southeastasia`, `westeurope`, `australiaeast`, `westus2`, `southcentralus`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##PUBLISHONLY\n",
- "#subscription_id = \"\" #you should be owner or contributor\n",
- "#resource_group = \"\" #you should be owner or contributor\n",
- "#workspace_name = \"\" #your workspace name\n",
- "#workspace_region = \"\" #your region"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Creating a Workspace\n",
- "If you already have access to an Azure ML workspace you want to use, you can skip this cell. Otherwise, this cell will create an Azure ML workspace for you in the specified subscription, provided you have the correct permissions for the given `subscription_id`.\n",
- "\n",
- "This will fail when:\n",
- "1. The workspace already exists.\n",
- "2. You do not have permission to create a workspace in the resource group.\n",
- "3. You are not a subscription owner or contributor and no Azure ML workspaces have ever been created in this subscription.\n",
- "\n",
- "If workspace creation fails for any reason other than already existing, please work with your IT administrator to provide you with the appropriate permissions or to provision the required resources.\n",
- "\n",
- "**Note:** Creation of a new workspace can take several minutes."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##TESTONLY\n",
- "# import auth creds from notebook parameters\n",
- "tenant = dbutils.widgets.get('tenant_id')\n",
- "username = dbutils.widgets.get('service_principal_id')\n",
- "password = dbutils.widgets.get('service_principal_password')\n",
- "\n",
- "auth = azureml.core.authentication.ServicePrincipalAuthentication(tenant, username, password)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##TESTONLY\n",
- "subscription_id = dbutils.widgets.get('subscription_id')\n",
- "resource_group = dbutils.widgets.get('resource_group')\n",
- "workspace_name = dbutils.widgets.get('workspace_name')\n",
- "workspace_region = dbutils.widgets.get('workspace_region')"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##TESTONLY\n",
- "# Import the Workspace class and check the Azure ML SDK version.\n",
- "from azureml.core import Workspace\n",
- "\n",
- "ws = Workspace.create(name = workspace_name,\n",
- " subscription_id = subscription_id,\n",
- " resource_group = resource_group, \n",
- " location = workspace_region,\n",
- " auth = auth,\n",
- " exist_ok=True)\n",
- "ws.get_details()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##PUBLISHONLY\n",
- "## Import the Workspace class and check the Azure ML SDK version.\n",
- "#from azureml.core import Workspace\n",
- "\n",
- "#ws = Workspace.create(name = workspace_name,\n",
- "# subscription_id = subscription_id,\n",
- "# resource_group = resource_group, \n",
- "# location = workspace_region, \n",
- "# exist_ok=True)\n",
- "#ws.get_details()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Configuring Your Local Environment\n",
- "You can validate that you have access to the specified workspace and write a configuration file to the default configuration location, `./aml_config/config.json`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##TESTONLY\n",
- "from azureml.core import Workspace\n",
- "\n",
- "ws = Workspace(workspace_name = workspace_name,\n",
- " subscription_id = subscription_id,\n",
- " resource_group = resource_group,\n",
- " auth = auth)\n",
- "\n",
- "# Persist the subscription id, resource group name, and workspace name in aml_config/config.json.\n",
- "ws.write_config()\n",
- "#write_config(path=\"/databricks/driver/aml_config/\",file_name=)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "##PUBLISHONLY\n",
- "#from azureml.core import Workspace\n",
- "#\n",
- "#ws = Workspace(workspace_name = workspace_name,\n",
- "# subscription_id = subscription_id,\n",
- "# resource_group = resource_group)\n",
- "#\n",
- "## Persist the subscription id, resource group name, and workspace name in aml_config/config.json.\n",
- "#ws.write_config()\n",
- "#write_config(path=\"/databricks/driver/aml_config/\",file_name=)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Create a Folder to Host Sample Projects\n",
- "Finally, create a folder where all the sample projects will be hosted."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import os\n",
- "\n",
- "sample_projects_folder = './sample_projects'\n",
- "\n",
- "if not os.path.isdir(sample_projects_folder):\n",
- " os.mkdir(sample_projects_folder)\n",
- " \n",
- "print('Sample projects will be created in {}.'.format(sample_projects_folder))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Create an Experiment\n",
- "\n",
- "As part of the setup you have already created an Azure ML `Workspace` object. For AutoML you will need to create an `Experiment` object, which is a named object in a `Workspace` used to run experiments."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import logging\n",
- "import os\n",
- "import random\n",
- "import time\n",
- "\n",
- "from matplotlib import pyplot as plt\n",
- "from matplotlib.pyplot import imshow\n",
- "import numpy as np\n",
- "import pandas as pd\n",
- "\n",
- "import azureml.core\n",
- "from azureml.core.experiment import Experiment\n",
- "from azureml.core.workspace import Workspace\n",
- "from azureml.train.automl import AutoMLConfig\n",
- "from azureml.train.automl.run import AutoMLRun"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Choose a name for the experiment and specify the project folder.\n",
- "experiment_name = 'automl-local-classification'\n",
- "project_folder = './sample_projects/automl-local-classification'\n",
- "\n",
- "experiment = Experiment(ws, experiment_name)\n",
- "\n",
- "output = {}\n",
- "output['SDK version'] = azureml.core.VERSION\n",
- "output['Subscription ID'] = ws.subscription_id\n",
- "output['Workspace Name'] = ws.name\n",
- "output['Resource Group'] = ws.resource_group\n",
- "output['Location'] = ws.location\n",
- "output['Project Directory'] = project_folder\n",
- "output['Experiment Name'] = experiment.name\n",
- "pd.set_option('display.max_colwidth', -1)\n",
- "pd.DataFrame(data = output, index = ['']).T"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Diagnostics\n",
- "\n",
- "Opt-in diagnostics for better experience, quality, and security of future releases."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.telemetry import set_diagnostics_collection\n",
- "set_diagnostics_collection(send_diagnostics = True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Registering Datastore"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Datastore is the way to save connection information to a storage service (e.g. Azure Blob, Azure Data Lake, Azure SQL) information to your workspace so you can access them without exposing credentials in your code. The first thing you will need to do is register a datastore, you can refer to our [python SDK documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.datastore.datastore?view=azure-ml-py) on how to register datastores. __Note: for best security practices, please do not check in code that contains registering datastores with secrets into your source control__\n",
- "\n",
- "The code below registers a datastore pointing to a publicly readable blob container."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.core import Datastore\n",
- "\n",
- "datastore_name = 'demo_training'\n",
- "Datastore.register_azure_blob_container(\n",
- " workspace = ws, \n",
- " datastore_name = datastore_name, \n",
- " container_name = 'automl-notebook-data', \n",
- " account_name = 'dprepdata'\n",
- ")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Below is an example on how to register a private blob container\n",
- "```python\n",
- "datastore = Datastore.register_azure_blob_container(\n",
- " workspace = ws, \n",
- " datastore_name = 'example_datastore', \n",
- " container_name = 'example-container', \n",
- " account_name = 'storageaccount',\n",
- " account_key = 'accountkey'\n",
- ")\n",
- "```\n",
- "The example below shows how to register an Azure Data Lake store. Please make sure you have granted the necessary permissions for the service principal to access the data lake.\n",
- "```python\n",
- "datastore = Datastore.register_azure_data_lake(\n",
- " workspace = ws,\n",
- " datastore_name = 'example_datastore',\n",
- " store_name = 'adlsstore',\n",
- " tenant_id = 'tenant-id-of-service-principal',\n",
- " client_id = 'client-id-of-service-principal',\n",
- " client_secret = 'client-secret-of-service-principal'\n",
- ")\n",
- "```"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Load Training Data Using DataPrep"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Automated ML takes a Dataflow as input.\n",
- "\n",
- "If you are familiar with Pandas and have done your data preparation work in Pandas already, you can use the `read_pandas_dataframe` method in dprep to convert the DataFrame to a Dataflow.\n",
- "```python\n",
- "df = pd.read_csv(...)\n",
- "# apply some transforms\n",
- "dprep.read_pandas_dataframe(df, temp_folder='/path/accessible/by/both/driver/and/worker')\n",
- "```\n",
- "\n",
- "If you just need to ingest data without doing any preparation, you can directly use AzureML Data Prep (Data Prep) to do so. The code below demonstrates this scenario. Data Prep also has data preparation capabilities, we have many [sample notebooks](https://github.com/Microsoft/AMLDataPrepDocs) demonstrating the capabilities.\n",
- "\n",
- "You will get the datastore you registered previously and pass it to Data Prep for reading. The data comes from the digits dataset: `sklearn.datasets.load_digits()`. `DataPath` points to a specific location within a datastore. "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import azureml.dataprep as dprep\n",
- "from azureml.data.datapath import DataPath\n",
- "\n",
- "datastore = Datastore.get(workspace = ws, name = datastore_name)\n",
- "\n",
- "X_train = dprep.read_csv(DataPath(datastore = datastore, path_on_datastore = 'X.csv')) \n",
- "y_train = dprep.read_csv(DataPath(datastore = datastore, path_on_datastore = 'y.csv')).to_long(dprep.ColumnSelector(term='.*', use_regex = True))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Review the Data Preparation Result\n",
- "You can peek the result of a Dataflow at any range using skip(i) and head(j). Doing so evaluates only j records for all the steps in the Dataflow, which makes it fast even against large datasets."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "X_train.get_profile()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "y_train.get_profile()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Configure AutoML\n",
- "\n",
- "Instantiate an `AutoMLConfig` object to specify the settings and data used to run the experiment.\n",
- "\n",
- "|Property|Description|\n",
- "|-|-|\n",
- "|**task**|classification or regression|\n",
- "|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics: <br>accuracy<br>AUC_weighted<br>average_precision_score_weighted<br>norm_macro_recall<br>precision_score_weighted|\n",
- "|**primary_metric**|This is the metric that you want to optimize. Regression supports the following primary metrics: <br>spearman_correlation<br>normalized_root_mean_squared_error<br>r2_score<br>normalized_mean_absolute_error|\n",
- "|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n",
- "|**iterations**|Number of iterations. In each iteration AutoML trains a specific pipeline with the data.|\n",
- "|**n_cross_validations**|Number of cross validation splits.|\n",
- "|**spark_context**|Spark Context object. for Databricks, use spark_context=sc|\n",
- "|**max_concurrent_iterations**|Maximum number of iterations to execute in parallel. This should be <= number of worker nodes in your Azure Databricks cluster.|\n",
- "|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n",
- "|**y**|(sparse) array-like, shape = [n_samples, ], [n_samples, n_classes]<br>Multi-class targets. An indicator matrix turns on multilabel classification. This should be an array of integers.|\n",
- "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder.|\n",
- "|**preprocess**|set this to True to enable pre-processing of data eg. string to numeric using one-hot encoding|\n",
- "|**exit_score**|Target score for experiment. It is associated with the metric. eg. exit_score=0.995 will exit experiment after that|"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "automl_config = AutoMLConfig(task = 'classification',\n",
- " debug_log = 'automl_errors.log',\n",
- " primary_metric = 'AUC_weighted',\n",
- " iteration_timeout_minutes = 10,\n",
- " iterations = 30,\n",
- " preprocess = True,\n",
- " n_cross_validations = 10,\n",
- " max_concurrent_iterations = 2, #change it based on number of worker nodes\n",
- " verbosity = logging.INFO,\n",
- " spark_context=sc, #databricks/spark related\n",
- " X = X_train, \n",
- " y = y_train,\n",
- " path = project_folder)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Train the Models\n",
- "\n",
- "Call the `submit` method on the experiment object and pass the run configuration. Execution of local runs is synchronous. Depending on the data and the number of iterations this can run for a while."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "local_run = experiment.submit(automl_config, show_output = False) # for higher runs please use show_output=False and use the below"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Explore the Results"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Portal URL for Monitoring Runs\n",
- "\n",
- "The following will provide a link to the web interface to explore individual run details and status. In the future we might support output displayed in the notebook."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "displayHTML(\"Azure Portal: {}\".format(local_run.get_portal_url(), local_run.id))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The following will show the child runs and waits for the parent run to complete."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Retrieve All Child Runs after the experiment is completed (in portal)\n",
- "You can also use SDK methods to fetch all the child runs and see individual metrics that we log."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "children = list(local_run.get_children())\n",
- "metricslist = {}\n",
- "for run in children:\n",
- " properties = run.get_properties()\n",
- " #print(properties)\n",
- " metrics = {k: v for k, v in run.get_metrics().items() if isinstance(v, float)} \n",
- " metricslist[int(properties['iteration'])] = metrics\n",
- "\n",
- "rundata = pd.DataFrame(metricslist).sort_index(1)\n",
- "rundata"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Retrieve the Best Model after the above run is complete \n",
- "\n",
- "Below we select the best pipeline from our iterations. The `get_output` method returns the best run and the fitted model. The Model includes the pipeline and any pre-processing. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "best_run, fitted_model = local_run.get_output()\n",
- "print(best_run)\n",
- "print(fitted_model)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Best Model Based on Any Other Metric after the above run is complete based on the child run\n",
- "Show the run and the model that has the smallest `log_loss` value:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "lookup_metric = \"log_loss\"\n",
- "best_run, fitted_model = local_run.get_output(metric = lookup_metric)\n",
- "print(best_run)\n",
- "print(fitted_model)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Register the Fitted Model for Deployment\n",
- "If neither metric nor iteration are specified in the register_model call, the iteration with the best primary metric is registered."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "description = 'AutoML Model'\n",
- "tags = None\n",
- "model = local_run.register_model(description = description, tags = tags)\n",
- "local_run.model_id # This will be written to the scoring script file later in the notebook."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Create Scoring Script\n",
- "Replace model_id with name of model from output of above register cell"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "%%writefile score.py\n",
- "import pickle\n",
- "import json\n",
- "import numpy\n",
- "import azureml.train.automl\n",
- "from sklearn.externals import joblib\n",
- "from azureml.core.model import Model\n",
- "\n",
- "\n",
- "def init():\n",
- " global model\n",
- " model_path = Model.get_model_path(model_name = '<>') # this name is model.id of model that we want to deploy\n",
- " # deserialize the model file back into a sklearn model\n",
- " model = joblib.load(model_path)\n",
- "\n",
- "def run(rawdata):\n",
- " try:\n",
- " data = json.loads(rawdata)['data']\n",
- " data = numpy.array(data)\n",
- " result = model.predict(data)\n",
- " except Exception as e:\n",
- " result = str(e)\n",
- " return json.dumps({\"error\": result})\n",
- " return json.dumps({\"result\":result.tolist()})"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Create a YAML File for the Environment"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.core.conda_dependencies import CondaDependencies\n",
- "\n",
- "myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn'], pip_packages=['azureml-sdk[automl]'])\n",
- "\n",
- "conda_env_file_name = 'mydeployenv.yml'\n",
- "myenv.save_to_file('.', conda_env_file_name)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Create ACI config"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "#deploy to ACI\n",
- "from azureml.core.webservice import AciWebservice, Webservice\n",
- "\n",
- "myaci_config = AciWebservice.deploy_configuration(\n",
- " cpu_cores = 2, \n",
- " memory_gb = 2, \n",
- " tags = {'name':'Databricks Azure ML ACI'}, \n",
- " description = 'This is for ADB and AutoML example.')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Deploy the Image as a Web Service on Azure Container Instance\n",
- "Replace servicename with any meaningful name of service"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "\n",
- "# this will take 10-15 minutes to finish\n",
- "\n",
- "service_name = \"<>\"\n",
- "runtime = \"spark-py\" \n",
- "driver_file = \"score.py\"\n",
- "my_conda_file = \"mydeployenv.yml\"\n",
- "\n",
- "# image creation\n",
- "from azureml.core.image import ContainerImage\n",
- "myimage_config = ContainerImage.image_configuration(execution_script = driver_file, \n",
- " runtime = runtime, \n",
- " conda_file = 'mydeployenv.yml')\n",
- "\n",
- "# Webservice creation\n",
- "myservice = Webservice.deploy_from_model(\n",
- " workspace=ws, \n",
- " name=service_name,\n",
- " deployment_config = myaci_config,\n",
- " models = [model],\n",
- " image_config = myimage_config\n",
- " )\n",
- "\n",
- "myservice.wait_for_deployment(show_output=True)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "#for using the Web HTTP API \n",
- "print(myservice.scoring_uri)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Test the Best Fitted Model\n",
- "\n",
- "#### Load Test Data - you can split the dataset beforehand & pass Train dataset to AutoML and use Test dataset to evaluate the best model."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from sklearn import datasets\n",
- "digits = datasets.load_digits()\n",
- "X_test = digits.data[:10, :]\n",
- "y_test = digits.target[:10]\n",
- "images = digits.images[:10]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Testing Our Best Fitted Model\n",
- "We will try to predict digits and see how our model works. This is just an example to show you."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Randomly select digits and test.\n",
- "for index in np.random.choice(len(y_test), 2, replace = False):\n",
- " print(index)\n",
- " predicted = fitted_model.predict(X_test[index:index + 1])[0]\n",
- " label = y_test[index]\n",
- " title = \"Label value = %d Predicted value = %d \" % (label, predicted)\n",
- " fig = plt.figure(1, figsize = (3,3))\n",
- " ax1 = fig.add_axes((0,0,.8,.8))\n",
- " ax1.set_title(title)\n",
- " plt.imshow(images[index], cmap = plt.cm.gray_r, interpolation = 'nearest')\n",
- " display(fig)"
- ]
- }
- ],
- "metadata": {
- "authors": [
- {
- "name": "savitam"
- },
- {
- "name": "wamartin"
- }
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Copyright (c) Microsoft Corporation. All rights reserved.\n",
+ "\n",
+ "Licensed under the MIT License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We support installing the AML SDK as a library from the Databricks GUI. When attaching a library, follow the instructions at https://docs.databricks.com/user-guide/libraries.html and add the string below as your PyPi package. You can select the option to attach the library to all clusters or just one cluster.\n",
+ "\n",
+ "**install azureml-sdk with Automated ML**\n",
+ "* Source: Upload Python Egg or PyPi\n",
+ "* PyPi Name: `azureml-sdk[automl_databricks]`\n",
+ "* Select Install Library"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# AutoML: Classification with Local Compute on Azure Databricks with Deployment to ACI\n",
+ "\n",
+ "In this example we use scikit-learn's [digits dataset](http://scikit-learn.org/stable/datasets/index.html#optical-recognition-of-handwritten-digits-dataset) to showcase how you can use AutoML for a simple classification problem.\n",
+ "\n",
+ "In this notebook you will learn how to:\n",
+ "1. Create an Azure Machine Learning `Workspace` object and initialize your notebook directory so you can easily reload this object from a configuration file.\n",
+ "2. Create an `Experiment` in an existing `Workspace`.\n",
+ "3. Configure AutoML using `AutoMLConfig`.\n",
+ "4. Train the model using Azure Databricks.\n",
+ "5. Explore the results.\n",
+ "6. Register the model.\n",
+ "7. Deploy the model.\n",
+ "8. Test the best fitted model.\n",
+ "\n",
+ "Prerequisites:\n",
+ "Before running this notebook, please follow the README to install the necessary libraries on your cluster."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Register Machine Learning Services Resource Provider\n",
+ "Microsoft.MachineLearningServices only needs to be registered once per subscription. To register it:\n",
+ "1. Start the Azure portal.\n",
+ "2. Select All services and then Subscriptions.\n",
+ "3. Select the subscription that you want to use.\n",
+ "4. Click Resource providers.\n",
+ "5. Click the Register link next to Microsoft.MachineLearningServices."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Check the Azure ML Core SDK Version to Validate Your Installation"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import azureml.core\n",
+ "\n",
+ "print(\"SDK Version:\", azureml.core.VERSION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Initialize an Azure ML Workspace\n",
+ "### What is an Azure ML Workspace and Why Do I Need One?\n",
+ "\n",
+ "An Azure ML workspace is an Azure resource that organizes and coordinates the actions of many other Azure resources to assist in executing and sharing machine learning workflows. In particular, an Azure ML workspace coordinates storage, databases, and compute resources providing added functionality for machine learning experimentation, operationalization, and the monitoring of operationalized models.\n",
+ "\n",
+ "\n",
+ "### What do I Need?\n",
+ "\n",
+ "To create or access an Azure ML workspace, you will need to import the Azure ML library and specify the following information:\n",
+ "* A name for your workspace. You can choose one.\n",
+ "* Your subscription id. Use the `id` value from the output of the `az account show` command.\n",
+ "* The resource group name. The resource group organizes Azure resources and provides a default region for the resources in the group. The resource group will be created if it doesn't exist. Resource groups can be created and viewed in the [Azure portal](https://portal.azure.com).\n",
+ "* Supported regions include `eastus2`, `eastus`, `westcentralus`, `southeastasia`, `westeurope`, `australiaeast`, `westus2`, `southcentralus`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "subscription_id = \"\" #you should be owner or contributor\n",
+ "resource_group = \"\" #you should be owner or contributor\n",
+ "workspace_name = \"\" #your workspace name\n",
+ "workspace_region = \"\" #your region"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Creating a Workspace\n",
+ "If you already have access to an Azure ML workspace you want to use, you can skip this cell. Otherwise, this cell will create an Azure ML workspace for you in the specified subscription, provided you have the correct permissions for the given `subscription_id`.\n",
+ "\n",
+ "This will fail when:\n",
+ "1. The workspace already exists.\n",
+ "2. You do not have permission to create a workspace in the resource group.\n",
+ "3. You are not a subscription owner or contributor and no Azure ML workspaces have ever been created in this subscription.\n",
+ "\n",
+ "If workspace creation fails for any reason other than already existing, please work with your IT administrator to provide you with the appropriate permissions or to provision the required resources.\n",
+ "\n",
+ "**Note:** Creation of a new workspace can take several minutes."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Import the Workspace class and create (or retrieve) the Azure ML workspace.\n",
+ "from azureml.core import Workspace\n",
+ "\n",
+ "ws = Workspace.create(name = workspace_name,\n",
+ " subscription_id = subscription_id,\n",
+ " resource_group = resource_group, \n",
+ " location = workspace_region, \n",
+ " exist_ok=True)\n",
+ "ws.get_details()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Configuring Your Local Environment\n",
+ "You can validate that you have access to the specified workspace and write a configuration file to the default configuration location, `./aml_config/config.json`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core import Workspace\n",
+ "\n",
+ "ws = Workspace(workspace_name = workspace_name,\n",
+ " subscription_id = subscription_id,\n",
+ " resource_group = resource_group)\n",
+ "\n",
+ "# Persist the subscription id, resource group name, and workspace name in aml_config/config.json.\n",
+ "ws.write_config()\n",
+ "# Optionally, write the config to a specific location on the Databricks driver (the file name shown is hypothetical):\n",
+ "# ws.write_config(path=\"/databricks/driver/aml_config/\", file_name=\"config.json\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Create a Folder to Host Sample Projects\n",
+ "Finally, create a folder where all the sample projects will be hosted."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "sample_projects_folder = './sample_projects'\n",
+ "\n",
+ "if not os.path.isdir(sample_projects_folder):\n",
+ " os.mkdir(sample_projects_folder)\n",
+ " \n",
+ "print('Sample projects will be created in {}.'.format(sample_projects_folder))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Create an Experiment\n",
+ "\n",
+ "As part of the setup you have already created an Azure ML `Workspace` object. For AutoML you will need to create an `Experiment` object, which is a named object in a `Workspace` used to run experiments."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import logging\n",
+ "import os\n",
+ "import random\n",
+ "import time\n",
+ "\n",
+ "from matplotlib import pyplot as plt\n",
+ "from matplotlib.pyplot import imshow\n",
+ "import numpy as np\n",
+ "import pandas as pd\n",
+ "\n",
+ "import azureml.core\n",
+ "from azureml.core.experiment import Experiment\n",
+ "from azureml.core.workspace import Workspace\n",
+ "from azureml.train.automl import AutoMLConfig\n",
+ "from azureml.train.automl.run import AutoMLRun"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Choose a name for the experiment and specify the project folder.\n",
+ "experiment_name = 'automl-local-classification'\n",
+ "project_folder = './sample_projects/automl-local-classification'\n",
+ "\n",
+ "experiment = Experiment(ws, experiment_name)\n",
+ "\n",
+ "output = {}\n",
+ "output['SDK version'] = azureml.core.VERSION\n",
+ "output['Subscription ID'] = ws.subscription_id\n",
+ "output['Workspace Name'] = ws.name\n",
+ "output['Resource Group'] = ws.resource_group\n",
+ "output['Location'] = ws.location\n",
+ "output['Project Directory'] = project_folder\n",
+ "output['Experiment Name'] = experiment.name\n",
+ "pd.set_option('display.max_colwidth', -1)\n",
+ "pd.DataFrame(data = output, index = ['']).T"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Diagnostics\n",
+ "\n",
+ "Opt-in diagnostics for better experience, quality, and security of future releases."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.telemetry import set_diagnostics_collection\n",
+ "set_diagnostics_collection(send_diagnostics = True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Registering Datastore"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A Datastore saves connection information for a storage service (e.g. Azure Blob, Azure Data Lake, Azure SQL) to your workspace, so you can access your data without exposing credentials in your code. The first thing you will need to do is register a datastore; you can refer to our [python SDK documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.datastore.datastore?view=azure-ml-py) on how to register datastores. __Note: for best security practices, please do not check code that registers datastores with secrets into your source control.__\n",
+ "\n",
+ "The code below registers a datastore pointing to a publicly readable blob container."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core import Datastore\n",
+ "\n",
+ "datastore_name = 'demo_training'\n",
+ "Datastore.register_azure_blob_container(\n",
+ " workspace = ws, \n",
+ " datastore_name = datastore_name, \n",
+ " container_name = 'automl-notebook-data', \n",
+ " account_name = 'dprepdata'\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Below is an example of how to register a private blob container:\n",
+ "```python\n",
+ "datastore = Datastore.register_azure_blob_container(\n",
+ " workspace = ws, \n",
+ " datastore_name = 'example_datastore', \n",
+ " container_name = 'example-container', \n",
+ " account_name = 'storageaccount',\n",
+ " account_key = 'accountkey'\n",
+ ")\n",
+ "```\n",
+ "The example below shows how to register an Azure Data Lake store. Please make sure you have granted the necessary permissions for the service principal to access the data lake.\n",
+ "```python\n",
+ "datastore = Datastore.register_azure_data_lake(\n",
+ " workspace = ws,\n",
+ " datastore_name = 'example_datastore',\n",
+ " store_name = 'adlsstore',\n",
+ " tenant_id = 'tenant-id-of-service-principal',\n",
+ " client_id = 'client-id-of-service-principal',\n",
+ " client_secret = 'client-secret-of-service-principal'\n",
+ ")\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Load Training Data Using DataPrep"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Automated ML takes a Dataflow as input.\n",
+ "\n",
+ "If you are familiar with Pandas and have done your data preparation work in Pandas already, you can use the `read_pandas_dataframe` method in dprep to convert the DataFrame to a Dataflow.\n",
+ "```python\n",
+ "df = pd.read_csv(...)\n",
+ "# apply some transforms\n",
+ "dprep.read_pandas_dataframe(df, temp_folder='/path/accessible/by/both/driver/and/worker')\n",
+ "```\n",
+ "\n",
+ "If you just need to ingest data without doing any preparation, you can use AzureML Data Prep (Data Prep) directly to do so; the code below demonstrates this scenario. Data Prep also has rich data preparation capabilities; we have many [sample notebooks](https://github.com/Microsoft/AMLDataPrepDocs) demonstrating them.\n",
+ "\n",
+ "You will get the datastore you registered previously and pass it to Data Prep for reading. The data comes from the digits dataset: `sklearn.datasets.load_digits()`. `DataPath` points to a specific location within a datastore. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import azureml.dataprep as dprep\n",
+ "from azureml.data.datapath import DataPath\n",
+ "\n",
+ "datastore = Datastore.get(workspace = ws, name = datastore_name)\n",
+ "\n",
+ "X_train = dprep.read_csv(DataPath(datastore = datastore, path_on_datastore = 'X.csv')) \n",
+ "y_train = dprep.read_csv(DataPath(datastore = datastore, path_on_datastore = 'y.csv')).to_long(dprep.ColumnSelector(term='.*', use_regex = True))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Review the Data Preparation Result\n",
+ "You can peek at the result of a Dataflow at any range using `skip(i)` and `head(j)`. Doing so evaluates only j records for all the steps in the Dataflow, which makes it fast even against large datasets. For example:\n",
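+ "```python\n",
+ "# A minimal sketch, assuming X_train is the Dataflow created above:\n",
+ "# preview 5 records starting at record 100 as a pandas DataFrame\n",
+ "X_train.skip(100).head(5)\n",
+ "```"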
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "X_train.get_profile()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "y_train.get_profile()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Configure AutoML\n",
+ "\n",
+ "Instantiate an `AutoMLConfig` object to specify the settings and data used to run the experiment.\n",
+ "\n",
+ "|Property|Description|\n",
+ "|-|-|\n",
+ "|**task**|classification or regression|\n",
+ "|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics: <br>accuracy<br>AUC_weighted<br>average_precision_score_weighted<br>norm_macro_recall<br>precision_score_weighted|\n",
+ "|**primary_metric**|This is the metric that you want to optimize. Regression supports the following primary metrics: <br>spearman_correlation<br>normalized_root_mean_squared_error<br>r2_score<br>normalized_mean_absolute_error|\n",
+ "|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n",
+ "|**iterations**|Number of iterations. In each iteration AutoML trains a specific pipeline with the data.|\n",
+ "|**n_cross_validations**|Number of cross validation splits.|\n",
+ "|**spark_context**|Spark Context object. For Databricks, use `spark_context=sc`.|\n",
+ "|**max_concurrent_iterations**|Maximum number of iterations to execute in parallel. This should be <= number of worker nodes in your Azure Databricks cluster.|\n",
+ "|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n",
+ "|**y**|(sparse) array-like, shape = [n_samples, ], [n_samples, n_classes]<br>Multi-class targets. An indicator matrix turns on multilabel classification. This should be an array of integers.|\n",
+ "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder.|\n",
+ "|**preprocess**|Set this to True to enable preprocessing of data, e.g. converting string features to numeric using one-hot encoding.|\n",
+ "|**exit_score**|Target score for the experiment, based on the primary metric. e.g. exit_score=0.995 will end the experiment once that score is reached.|"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "automl_config = AutoMLConfig(task = 'classification',\n",
+ " debug_log = 'automl_errors.log',\n",
+ " primary_metric = 'AUC_weighted',\n",
+ " iteration_timeout_minutes = 10,\n",
+ " iterations = 30,\n",
+ " preprocess = True,\n",
+ " n_cross_validations = 10,\n",
+ " max_concurrent_iterations = 2, #change it based on number of worker nodes\n",
+ " verbosity = logging.INFO,\n",
+ " spark_context=sc, #databricks/spark related\n",
+ " X = X_train, \n",
+ " y = y_train,\n",
+ " path = project_folder)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Train the Models\n",
+ "\n",
+ "Call the `submit` method on the experiment object and pass the run configuration. Execution of local runs is synchronous. Depending on the data and the number of iterations this can run for a while."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "local_run = experiment.submit(automl_config, show_output = False) # for larger runs, keep show_output=False and monitor progress via the portal link below"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Explore the Results"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Portal URL for Monitoring Runs\n",
+ "\n",
+ "The following will provide a link to the web interface to explore individual run details and status. In the future we might support output displayed in the notebook."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "displayHTML(\"Azure Portal: {} (Run Id: {})\".format(local_run.get_portal_url(), local_run.id))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "You can wait for the parent run, and all of its child runs, to complete before retrieving the results; a minimal sketch is shown below (assuming the `local_run` from the submit cell above):\n",
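+ "```python\n",
+ "# Block until the parent AutoML run and all child iterations finish\n",
+ "local_run.wait_for_completion(show_output = False)\n",
+ "```"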
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Retrieve All Child Runs\n",
+ "Once the experiment has completed (you can check its status in the portal), you can use SDK methods to fetch all the child runs and see the individual metrics that we log."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "children = list(local_run.get_children())\n",
+ "metricslist = {}\n",
+ "for run in children:\n",
+ " properties = run.get_properties()\n",
+ " #print(properties)\n",
+ " metrics = {k: v for k, v in run.get_metrics().items() if isinstance(v, float)} \n",
+ " metricslist[int(properties['iteration'])] = metrics\n",
+ "\n",
+ "rundata = pd.DataFrame(metricslist).sort_index(axis = 1)\n",
+ "rundata"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Retrieve the Best Model (after the above run is complete)\n",
+ "\n",
+ "Below we select the best pipeline from our iterations. The `get_output` method returns the best run and the fitted model. The Model includes the pipeline and any pre-processing. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*, for example:\n",
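+ "```python\n",
+ "# A sketch of the iteration-based overload (iteration 3 is an arbitrary example):\n",
+ "# best_run, fitted_model = local_run.get_output(iteration = 3)\n",
+ "```"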
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "best_run, fitted_model = local_run.get_output()\n",
+ "print(best_run)\n",
+ "print(fitted_model)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Best Model Based on Any Other Metric (after the above run is complete)\n",
+ "Show the run and the model that has the smallest `log_loss` value:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "lookup_metric = \"log_loss\"\n",
+ "best_run, fitted_model = local_run.get_output(metric = lookup_metric)\n",
+ "print(best_run)\n",
+ "print(fitted_model)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Register the Fitted Model for Deployment\n",
+ "If neither metric nor iteration is specified in the `register_model` call, the iteration with the best primary metric is registered. You can also register the model for a specific metric or iteration, as sketched below.\n",
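+ "```python\n",
+ "# A sketch of the metric/iteration overloads of register_model (values are arbitrary examples;\n",
+ "# description and tags are defined in the next cell):\n",
+ "# model = local_run.register_model(metric = 'log_loss', description = description, tags = tags)\n",
+ "# model = local_run.register_model(iteration = 3, description = description, tags = tags)\n",
+ "```"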
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "description = 'AutoML Model'\n",
+ "tags = None\n",
+ "model = local_run.register_model(description = description, tags = tags)\n",
+ "local_run.model_id # This will be written to the scoring script file later in the notebook."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Create Scoring Script\n",
+ "Replace model_id with name of model from output of above register cell"
+ "In the script below, replace the `'<>'` placeholder passed to `Model.get_model_path` with the model id from the output of the register cell above."
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%writefile score.py\n",
+ "import pickle\n",
+ "import json\n",
+ "import numpy\n",
+ "import azureml.train.automl\n",
+ "from sklearn.externals import joblib\n",
+ "from azureml.core.model import Model\n",
+ "\n",
+ "\n",
+ "def init():\n",
+ " global model\n",
+ " model_path = Model.get_model_path(model_name = '<>') # this name is model.id of model that we want to deploy\n",
+ " # deserialize the model file back into a sklearn model\n",
+ " model = joblib.load(model_path)\n",
+ "\n",
+ "def run(rawdata):\n",
+ " try:\n",
+ " data = json.loads(rawdata)['data']\n",
+ " data = numpy.array(data)\n",
+ " result = model.predict(data)\n",
+ " except Exception as e:\n",
+ " result = str(e)\n",
+ " return json.dumps({\"error\": result})\n",
+ " return json.dumps({\"result\":result.tolist()})"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Create a YAML File for the Environment"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.conda_dependencies import CondaDependencies\n",
+ "\n",
+ "myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn'], pip_packages=['azureml-sdk[automl]'])\n",
+ "\n",
+ "conda_env_file_name = 'mydeployenv.yml'\n",
+ "myenv.save_to_file('.', conda_env_file_name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Create ACI config"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#deploy to ACI\n",
+ "from azureml.core.webservice import AciWebservice, Webservice\n",
+ "\n",
+ "myaci_config = AciWebservice.deploy_configuration(\n",
+ " cpu_cores = 2, \n",
+ " memory_gb = 2, \n",
+ " tags = {'name':'Databricks Azure ML ACI'}, \n",
+ " description = 'This is for ADB and AutoML example.')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Deploy the Image as a Web Service on Azure Container Instance\n",
+ "In the cell below, replace the `service_name` placeholder with a meaningful name for your service."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "\n",
+ "# this will take 10-15 minutes to finish\n",
+ "\n",
+ "service_name = \"<>\"\n",
+ "runtime = \"spark-py\" \n",
+ "driver_file = \"score.py\"\n",
+ "my_conda_file = \"mydeployenv.yml\"\n",
+ "\n",
+ "# image creation\n",
+ "from azureml.core.image import ContainerImage\n",
+ "myimage_config = ContainerImage.image_configuration(execution_script = driver_file, \n",
+ " runtime = runtime, \n",
+ " conda_file = my_conda_file)\n",
+ "\n",
+ "# Webservice creation\n",
+ "myservice = Webservice.deploy_from_model(\n",
+ " workspace=ws, \n",
+ " name=service_name,\n",
+ " deployment_config = myaci_config,\n",
+ " models = [model],\n",
+ " image_config = myimage_config\n",
+ " )\n",
+ "\n",
+ "myservice.wait_for_deployment(show_output=True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Scoring URI for calling the web service over HTTP\n",
+ "print(myservice.scoring_uri)\n",
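+ "\n",
+ "# A minimal sketch of calling the service over HTTP (assumes the requests package is available;\n",
+ "# the payload below is a hypothetical all-zero digit image with 64 features):\n",
+ "# import requests, json\n",
+ "# sample = json.dumps({'data': [[0.0] * 64]})\n",
+ "# headers = {'Content-Type': 'application/json'}\n",
+ "# response = requests.post(myservice.scoring_uri, data = sample, headers = headers)\n",
+ "# print(response.text)"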
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Test the Best Fitted Model\n",
+ "\n",
+ "#### Load Test Data\n",
+ "For this quick test we load a small sample of the digits dataset. In practice, you can split the dataset beforehand, pass the training set to AutoML, and use the held-out test set to evaluate the best model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from sklearn import datasets\n",
+ "digits = datasets.load_digits()\n",
+ "X_test = digits.data[:10, :]\n",
+ "y_test = digits.target[:10]\n",
+ "images = digits.images[:10]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Testing Our Best Fitted Model\n",
+ "We will predict a few digits and see how the model performs. This is just a quick illustration, not a full evaluation."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Randomly select digits and test.\n",
+ "for index in np.random.choice(len(y_test), 2, replace = False):\n",
+ " print(index)\n",
+ " predicted = fitted_model.predict(X_test[index:index + 1])[0]\n",
+ " label = y_test[index]\n",
+ " title = \"Label value = %d Predicted value = %d \" % (label, predicted)\n",
+ " fig = plt.figure(1, figsize = (3,3))\n",
+ " ax1 = fig.add_axes((0,0,.8,.8))\n",
+ " ax1.set_title(title)\n",
+ " plt.imshow(images[index], cmap = plt.cm.gray_r, interpolation = 'nearest')\n",
+ " display(fig)"
+ ]
+ }
],
- "kernelspec": {
- "display_name": "Python 3.6",
- "language": "python",
- "name": "python36"
+ "metadata": {
+ "authors": [
+ {
+ "name": "savitam"
+ },
+ {
+ "name": "wamartin"
+ }
+ ],
+ "kernelspec": {
+ "display_name": "Python 3.6",
+ "language": "python",
+ "name": "python36"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.6.5"
+ },
+ "name": "auto-ml-classification-local-adb",
+ "notebookId": 2733885892129020
},
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.6.5"
- },
- "name": "auto-ml-classification-local-adb",
- "notebookId": 2733885892129020
- },
- "nbformat": 4,
- "nbformat_minor": 1
-}
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
\ No newline at end of file
diff --git a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-data-transfer.ipynb b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-data-transfer.ipynb
index 59a1b722..40415dfb 100644
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-data-transfer.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-data-transfer.ipynb
@@ -12,8 +12,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "# Azure Machine Learning Pipeline with DataTransferStep\n",
- "This notebook is used to demonstrate the use of DataTransferStep in Azure Machine Learning Pipeline.\n",
+ "# Azure Machine Learning Pipeline with DataTransferStep\n",
+ "This notebook is used to demonstrate the use of DataTransferStep in an Azure Machine Learning Pipeline.\n",
"\n",
"In certain cases, you will need to transfer data from one data location to another. For example, your data may be in Files storage and you may want to move it to Blob storage. Or, if your data is in an ADLS account and you want to make it available in the Blob storage. The built-in **DataTransferStep** class helps you transfer data in these situations.\n",
"\n",
@@ -466,4 +466,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
-}
+}
\ No newline at end of file
diff --git a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-azurebatch-to-run-a-windows-executable.ipynb b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-azurebatch-to-run-a-windows-executable.ipynb
index 620c2541..6af8c37b 100644
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-azurebatch-to-run-a-windows-executable.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-azurebatch-to-run-a-windows-executable.ipynb
@@ -67,8 +67,7 @@
"source": [
"Initialize a workspace object from persisted configuration. Make sure the config file is present at .\\config.json\n",
"\n",
- "If you don't have a config.json file, please go through the configuration Notebook located here:\n",
- "https://github.com/Azure/MachineLearningNotebooks. \n",
+ "If you don't have a config.json file, please go through the configuration Notebook located [here](https://github.com/Azure/MachineLearningNotebooks). \n",
"\n",
"This sets you up with a working config file that has information on your workspace, subscription id, etc. "
]
@@ -80,7 +79,11 @@
"outputs": [],
"source": [
"ws = Workspace.from_config()\n",
- "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
+ "\n",
+ "print('Workspace Name: ' + ws.name, \n",
+ " 'Azure Region: ' + ws.location, \n",
+ " 'Subscription Id: ' + ws.subscription_id, \n",
+ " 'Resource Group: ' + ws.resource_group, sep = '\\n')"
]
},
{
@@ -114,7 +117,8 @@
" batch_compute = BatchCompute(ws, batch_compute_name)\n",
"except ComputeTargetException:\n",
" print('Attaching Batch compute...')\n",
- " provisioning_config = BatchCompute.attach_configuration(resource_group=batch_resource_group, account_name=batch_account_name)\n",
+ " provisioning_config = BatchCompute.attach_configuration(resource_group=batch_resource_group, \n",
+ " account_name=batch_account_name)\n",
" batch_compute = ComputeTarget.attach(ws, batch_compute_name, provisioning_config)\n",
" batch_compute.wait_for_completion()\n",
" print(\"Provisioning state:{}\".format(batch_compute.provisioning_state))\n",
@@ -127,7 +131,19 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "## Setup DataStore"
+ "## Setup Datastore"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Set up the Blob storage associated with the workspace. \n",
+ "The following call retrieves the Azure Blob Store associated with your workspace. \n",
+ "Note that **workspaceblobstore** is the name of this store; it **cannot be changed** and must be used as is. \n",
+ " \n",
+ "If you want to register another Datastore, please follow the instructions [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-access-data#register-a-datastore). A minimal sketch with placeholder values is shown below.\n",
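+ "```python\n",
+ "# Hypothetical example: register an additional blob datastore (all values are placeholders)\n",
+ "# ds = Datastore.register_azure_blob_container(workspace=ws,\n",
+ "#                                              datastore_name='my_datastore',\n",
+ "#                                              container_name='my-container',\n",
+ "#                                              account_name='mystorageaccount',\n",
+ "#                                              account_key='<account-key>')\n",
+ "```"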
]
},
{
@@ -136,11 +152,12 @@
"metadata": {},
"outputs": [],
"source": [
- "# Blob storage associated with the workspace\n",
- "# The following call GETS the Azure Blob Store associated with your workspace.\n",
- "# Note that workspaceblobstore is **the name of this store and CANNOT BE CHANGED and must be used as is** \n",
- "default_blob_store = Datastore(ws, \"workspaceblobstore\")\n",
- "print(\"Blobstore name: {}\".format(def_blob_store.name))"
+ "datastore = Datastore(ws, \"workspaceblobstore\")\n",
+ "\n",
+ "print('Datastore details:')\n",
+ "print('Datastore Account Name: ' + datastore.account_name)\n",
+ "print('Datastore Workspace Name: ' + datastore.workspace.name)\n",
+ "print('Datastore Container Name: ' + datastore.container_name)"
]
},
{
@@ -154,7 +171,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "For this example we will upload a file in the provided DataStore. These are some helper methods to achieve that."
"For this example we will upload a file to the provided Datastore. Below are some helper methods to achieve that."
]
},
{
@@ -171,16 +188,16 @@
" return temp_dir\n",
"\n",
"\n",
- "def upload_file_to_datastore(datastore, path, content):\n",
- " dir = create_local_file(content=content, file_name=\"temp.file\")\n",
- " datastore.upload(src_dir=dir, target_path=path, overwrite=True, show_progress=True)"
+ "def upload_file_to_datastore(datastore, file_name, content):\n",
+ " src_dir = create_local_file(content=content, file_name=file_name)\n",
+ " datastore.upload(src_dir=src_dir, overwrite=True, show_progress=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "Here we associate the input DataReference with an existing file in the provided DataStore. Feel free to upload the file of your choice manually or use the *upload_testdata* method. "
+ "Here we associate the input DataReference with an existing file in the provided Datastore. Feel free to upload the file of your choice manually or use the *upload_file_to_datastore* method. "
]
},
{
@@ -189,14 +206,14 @@
"metadata": {},
"outputs": [],
"source": [
- "testdata_path=\"testdata.txt\"\n",
+ "file_name=\"input.txt\"\n",
"\n",
- "upload_file_to_datastore(datastore=default_blob_store, \n",
- " path=testdata_path, \n",
- " content=\"This is the content of the file\")\n",
+ "upload_file_to_datastore(datastore=datastore, \n",
+ " file_name=file_name, \n",
+ " content=\"this is the content of the file\")\n",
"\n",
- "testdata = DataReference(datastore=default_blob_store, \n",
- " path_on_datastore=testdata_path, \n",
+ "testdata = DataReference(datastore=datastore, \n",
+ " path_on_datastore=file_name, \n",
" data_reference_name=\"input\")\n",
"\n",
"outputdata = PipelineData(name=\"output\", datastore=datastore)"
@@ -224,7 +241,7 @@
"source": [
"binaries_folder = \"azurebatch/job_binaries\"\n",
"if not os.path.isdir(binaries_folder):\n",
- " os.mkdir(project_folder)\n",
+ " os.mkdir(binaries_folder)\n",
"\n",
"file_name=\"azurebatch.cmd\"\n",
"with open(path.join(binaries_folder, file_name), 'w') as f:\n",
diff --git a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-parameter-tuning-with-hyperdrive.ipynb b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-parameter-tuning-with-hyperdrive.ipynb
index 7ce0ef75..e1a03fea 100644
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-parameter-tuning-with-hyperdrive.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-parameter-tuning-with-hyperdrive.ipynb
@@ -29,7 +29,8 @@
"import os\n",
"import shutil\n",
"import urllib\n",
- "from azureml.core import Experiment\n",
+ "import azureml.core\n",
+ "from azureml.core import Workspace, Experiment\n",
"from azureml.core.datastore import Datastore\n",
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
"from azureml.exceptions import ComputeTargetException\n",
@@ -109,7 +110,7 @@
"metadata": {},
"source": [
"## Upload MNIST dataset to blob datastore \n",
- "A [datastore](https://docs.microsoft.com/azure/machine-learning/service/how-to-access-data) is a place where data can be stored that is then made accessible to a Run either by means of mounting or copying the data to the compute target. A datastore can either be backed by an Azure Blob Storage or and Azure File Share (ADLS will be supported in the future). In the next step, we will use Azure Blob Storage and upload the training and test set into the Azure Blob datastore, which we will then later be mount on a Batch AI cluster for training."
+ "A [datastore](https://docs.microsoft.com/azure/machine-learning/service/how-to-access-data) is a place where data can be stored that is then made accessible to a Run either by means of mounting or copying the data to the compute target. In the next step, we will use Azure Blob Storage and upload the training and test set into the Azure Blob datastore, which we will then later be mount on a Batch AI cluster for training."
]
},
{
@@ -118,7 +119,7 @@
"metadata": {},
"outputs": [],
"source": [
- "ds = Datastore(workspace=ws, name=\"MyBlobDatastore\")\n",
+ "ds = ws.get_default_datastore()\n",
"ds.upload(src_dir='./data/mnist', target_path='mnist', overwrite=True, show_progress=True)"
]
},
@@ -129,12 +130,12 @@
"## Retrieve or create a Azure Machine Learning compute\n",
"Azure Machine Learning Compute is a service for provisioning and managing clusters of Azure virtual machines for running machine learning workloads. Let's create a new Azure Machine Learning Compute in the current workspace, if it doesn't already exist. We will then run the training script on this compute target.\n",
"\n",
- "If we could not find the compute with the given name in the previous cell, then we will create a new compute here. We will create an Azure Machine Learning Compute containing **STANDARD_D2_V2 CPU VMs**. This process is broken down into the following steps:\n",
+ "If we could not find the compute with the given name in the previous cell, then we will create a new compute here. This process is broken down into the following steps:\n",
"\n",
"1. Create the configuration\n",
"2. Create the Azure Machine Learning compute\n",
"\n",
- "**This process will take about 3 minutes and is providing only sparse output in the process. Please make sure to wait until the call returns before moving to the next cell.**\n"
+ "**This process will take a few minutes and is providing only sparse output in the process. Please make sure to wait until the call returns before moving to the next cell.**\n"
]
},
{
@@ -143,7 +144,7 @@
"metadata": {},
"outputs": [],
"source": [
- "cluster_name = \"aml-compute\"\n",
+ "cluster_name = \"gpucluster\"\n",
"\n",
"try:\n",
" compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
@@ -320,7 +321,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "### Build the experiment"
+ "### Run the pipeline"
]
},
{
@@ -329,31 +330,15 @@
"metadata": {},
"outputs": [],
"source": [
- "pipeline = Pipeline(workspace=ws, steps=[hd_step])"
+ "pipeline = Pipeline(workspace=ws, steps=[hd_step])\n",
+ "pipeline_run = Experiment(ws, 'Hyperdrive_Test').submit(pipeline)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
- "### Submit the experiment "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "pipeline_run = Experiment(ws, 'Hyperdrive_Test').submit(pipeline)\n",
- "pipeline_run.wait_for_completion()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### View Run Details"
+ "### Monitor using widget"
]
},
{
@@ -365,6 +350,22 @@
"from azureml.widgets import RunDetails\n",
"RunDetails(pipeline_run).show()"
]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Wait for the completion of this Pipeline run"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "pipeline_run.wait_for_completion()"
+ ]
}
],
"metadata": {
diff --git a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-setup-schedule-for-a-published-pipeline.ipynb b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-setup-schedule-for-a-published-pipeline.ipynb
index 8d7abb8a..f3ea2731 100644
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-setup-schedule-for-a-published-pipeline.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-setup-schedule-for-a-published-pipeline.ipynb
@@ -204,7 +204,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "### Create a schedule for the pipeline"
+ "### Create a schedule for the pipeline using a recurrence\n",
+ "This schedule will run on a specified recurrence interval."
]
},
{
@@ -345,7 +346,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "### Change reccurence of the schedule"
+ "### Change recurrence of the schedule"
]
},
{
@@ -366,13 +367,58 @@
" wait_for_provisioning=True,\n",
" recurrence=recurrence)\n",
"\n",
- "fetched_schedule = Schedule.get_schedule(ws, fetched_schedule.id)\n",
+ "fetched_schedule = Schedule.get(ws, fetched_schedule.id)\n",
"\n",
"print(\"Updated schedule:\", fetched_schedule.id, \n",
" \"\\nNew name:\", fetched_schedule.name,\n",
" \"\\nNew frequency:\", fetched_schedule.recurrence.frequency,\n",
" \"\\nNew status:\", fetched_schedule.status)"
]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Create a schedule for the pipeline using a Datastore\n",
+ "This schedule will run when additions or modifications are made to Blobs in the Datastore container.\n",
+ "Note: Only Blob Datastores are supported."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.datastore import Datastore\n",
+ "\n",
+ "datastore = Datastore(workspace=ws, name=\"workspaceblobstore\")\n",
+ "\n",
+ "schedule = Schedule.create(workspace=ws, name=\"My_Schedule\",\n",
+ " pipeline_id=pub_pipeline_id, \n",
+ " experiment_name='Schedule_Run',\n",
+ " datastore=datastore,\n",
+ " wait_for_provisioning=True,\n",
+ " description=\"Schedule Run\")\n",
+ "\n",
+ "# You may want to make sure that the schedule is provisioned properly\n",
+ "# before making any further changes to the schedule\n",
+ "\n",
+ "print(\"Created schedule with id: {}\".format(schedule.id))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Set the wait_for_provisioning flag to False if you do not want to wait \n",
+ "# for the call to provision the schedule in the backend.\n",
+ "schedule.disable(wait_for_provisioning=True)\n",
+ "schedule = Schedule.get(ws, schedule_id)\n",
+ "print(\"Disabled schedule {}. New status is: {}\".format(schedule.id, schedule.status))"
+ ]
}
],
"metadata": {
diff --git a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-use-databricks-as-compute-target.ipynb b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-use-databricks-as-compute-target.ipynb
index 483a5753..4b92d211 100644
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-use-databricks-as-compute-target.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-use-databricks-as-compute-target.ipynb
@@ -168,7 +168,7 @@
"metadata": {},
"source": [
"## Data Connections with Inputs and Outputs\n",
- "The DatabricksStep supports Azure Blob and ADLS for inputs and outputs. You also will need to define a [Secrets](https://docs.azuredatabricks.net/user-guide/secrets/index.html) scope to enable authentication to external data sources such as Blob and ADLS from Databricks.\n",
+ "The DatabricksStep supports Azure Bloband ADLS for inputs and outputs. You also will need to define a [Secrets](https://docs.azuredatabricks.net/user-guide/secrets/index.html) scope to enable authentication to external data sources such as Blob and ADLS from Databricks.\n",
"\n",
"- Databricks documentation on [Azure Blob](https://docs.azuredatabricks.net/spark/latest/data-sources/azure/azure-storage.html)\n",
"- Databricks documentation on [ADLS](https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake.html)\n",
@@ -397,7 +397,7 @@
"metadata": {},
"source": [
"### 1. Running the demo notebook already added to the Databricks workspace\n",
- "Create a notebook in the Azure Databricks workspace, and provide the path to that notebook as the value associated with the environment variable \"DATABRICKS_NOTEBOOK_PATH\". This will then set the variable notebook_path when you run the code cell below:"
+ "Create a notebook in the Azure Databricks workspace, and provide the path to that notebook as the value associated with the environment variable \"DATABRICKS_NOTEBOOK_PATH\". This will then set the variable\u00c2\u00a0notebook_path\u00c2\u00a0when you run the code cell below:"
]
},
{
@@ -436,7 +436,6 @@
"source": [
"steps = [dbNbStep]\n",
"pipeline = Pipeline(workspace=ws, steps=steps)\n",
- "pipeline.validate()\n",
"pipeline_run = Experiment(ws, 'DB_Notebook_demo').submit(pipeline)\n",
"pipeline_run.wait_for_completion()"
]
@@ -706,4 +705,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
-}
+}
\ No newline at end of file
diff --git a/how-to-use-azureml/machine-learning-pipelines/pipeline-style-transfer/pipeline-style-transfer.ipynb b/how-to-use-azureml/machine-learning-pipelines/pipeline-style-transfer/pipeline-style-transfer.ipynb
index 93349181..3c78998e 100644
--- a/how-to-use-azureml/machine-learning-pipelines/pipeline-style-transfer/pipeline-style-transfer.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/pipeline-style-transfer/pipeline-style-transfer.ipynb
@@ -120,7 +120,7 @@
"metadata": {},
"source": [
"# Python Scripts\n",
- "We use an edited version of `neural_style_mpi.py` (original is [here](https://github.com/pytorch/examples/blob/master/fast_neural_style/neural_style/neural_style_mpi.py)). Scripts to split and stitch the video are thin wrappers to calls to `ffmpeg`. \n",
+ "We use an edited version of `neural_style_mpi.py` (original is [here](https://github.com/pytorch/examples/blob/master/fast_neural_style/neural_style/neural_style.py)). Scripts to split and stitch the video are thin wrappers to calls to `ffmpeg`. \n",
"\n",
"We install `ffmpeg` through conda dependencies."
]
@@ -201,6 +201,13 @@
" )"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The sample video **organutan.mp4** is stored at a publicly shared datastore. We are registering the datastore below. If you want to take a look at the original video, click here. (https://pipelinedata.blob.core.windows.net/sample-videos/orangutan.mp4)"
+ ]
+ },
{
"cell_type": "code",
"execution_count": null,
@@ -208,8 +215,8 @@
"outputs": [],
"source": [
"# datastore for input video\n",
- "account_name = \"happypathspublic\"\n",
- "video_ds = Datastore.register_azure_blob_container(ws, \"videos\", \"videos\",\n",
+ "account_name = \"pipelinedata\"\n",
+ "video_ds = Datastore.register_azure_blob_container(ws, \"videos\", \"sample-videos\",\n",
" account_name=account_name, overwrite=True)\n",
"\n",
"# datastore for models\n",
@@ -238,9 +245,10 @@
"metadata": {},
"outputs": [],
"source": [
+ "video_name=os.getenv(\"STYLE_TRANSFER_VIDEO_NAME\", \"orangutan.mp4\") \n",
"orangutan_video = DataReference(datastore=video_ds,\n",
" data_reference_name=\"video\",\n",
- " path_on_datastore=\"orangutan.mp4\", mode=\"download\")"
+ " path_on_datastore=video_name, mode=\"download\")"
]
},
{
@@ -542,7 +550,7 @@
"response = requests.post(rest_endpoint, \n",
" headers=aad_token,\n",
" json={\"ExperimentName\": \"style_transfer\",\n",
- " \"ParameterAssignments\": {\"style\": \"udnie\", \"nodecount\": 4}}) \n",
+ " \"ParameterAssignments\": {\"style\": \"udnie\", \"nodecount\": 3}}) \n",
"run_id = response.json()[\"Id\"]\n",
"\n",
"published_pipeline_run_udnie = PipelineRun(ws.experiments[\"style_transfer\"], run_id)\n",
diff --git a/how-to-use-azureml/manage-azureml-service/authentication-in-azureml/authentication-in-azure-ml.ipynb b/how-to-use-azureml/manage-azureml-service/authentication-in-azureml/authentication-in-azure-ml.ipynb
index e8352b69..957942d1 100644
--- a/how-to-use-azureml/manage-azureml-service/authentication-in-azureml/authentication-in-azure-ml.ipynb
+++ b/how-to-use-azureml/manage-azureml-service/authentication-in-azureml/authentication-in-azure-ml.ipynb
@@ -209,8 +209,8 @@
"\n",
"svc_pr = ServicePrincipalAuthentication(\n",
" tenant_id=\"my-tenant-id\",\n",
- " username=\"my-application-id\",\n",
- " password=svc_pr_password)\n",
+ " service_principal_id=\"my-application-id\",\n",
+ " service_principal_password=svc_pr_password)\n",
"\n",
"\n",
"ws = Workspace(\n",
diff --git a/how-to-use-azureml/training-with-deep-learning/README.md b/how-to-use-azureml/training-with-deep-learning/README.md
index 979450f6..975108f4 100644
--- a/how-to-use-azureml/training-with-deep-learning/README.md
+++ b/how-to-use-azureml/training-with-deep-learning/README.md
@@ -4,13 +4,15 @@ These examples show you:
1. [How to use the Estimator pattern in Azure ML](how-to-use-estimator)
2. [Train using TensorFlow Estimator and tune hyperparameters using Hyperdrive](train-hyperparameter-tune-deploy-with-tensorflow)
-3. [Train using Keras and tune hyperparameters using Hyperdrive](train-hyperparameter-tune-deploy-with-keras)
-4. [Train using Pytorch Estimator and tune hyperparameters using Hyperdrive](train-hyperparameter-tune-deploy-with-pytorch)
-5. [Distributed training using TensorFlow and Parameter Server](distributed-tensorflow-with-parameter-server)
-6. [Distributed training using TensorFlow and Horovod](distributed-tensorflow-with-horovod)
-7. [Distributed training using Pytorch and Horovod](distributed-pytorch-with-horovod)
-8. [Distributed training using CNTK and custom Docker image](distributed-cntk-with-custom-docker)
-9. [Export run history records to Tensorboard](export-run-history-to-tensorboard)
-10. [Use TensorBoard to monitor training execution](tensorboard)
+3. [Train using Pytorch Estimator and tune hyperparameters using Hyperdrive](train-hyperparameter-tune-deploy-with-pytorch)
+4. [Train using Keras and tune hyperparameters using Hyperdrive](train-hyperparameter-tune-deploy-with-keras)
+5. [Train using Chainer Estimator and tune hyperparameters using Hyperdrive](train-hyperparameter-tune-deploy-with-chainer)
+6. [Distributed training using TensorFlow and Parameter Server](distributed-tensorflow-with-parameter-server)
+7. [Distributed training using TensorFlow and Horovod](distributed-tensorflow-with-horovod)
+8. [Distributed training using Pytorch and Horovod](distributed-pytorch-with-horovod)
+9. [Distributed training using CNTK and custom Docker image](distributed-cntk-with-custom-docker)
+10. [Distributed training using Chainer](distributed-chainer)
+11. [Export run history records to Tensorboard](export-run-history-to-tensorboard)
+12. [Use TensorBoard to monitor training execution](tensorboard)
Learn more about how to use `Estimator` class to [train deep neural networks with Azure Machine Learning](https://docs.microsoft.com/azure/machine-learning/service/how-to-train-ml-models).
diff --git a/how-to-use-azureml/training-with-deep-learning/distributed-chainer/chainer_mnist_distributed.py b/how-to-use-azureml/training-with-deep-learning/distributed-chainer/chainer_mnist_distributed.py
new file mode 100644
index 00000000..deb4b5f6
--- /dev/null
+++ b/how-to-use-azureml/training-with-deep-learning/distributed-chainer/chainer_mnist_distributed.py
@@ -0,0 +1,153 @@
+
+import argparse
+
+import chainer
+import chainer.cuda
+import chainer.functions as F
+import chainer.links as L
+from chainer import training
+from chainer.training import extensions
+
+import chainermn
+import chainermn.datasets
+import chainermn.functions
+
+
+chainer.disable_experimental_feature_warning = True
+
+
+class MLP0SubA(chainer.Chain):
+ def __init__(self, comm, n_out):
+ super(MLP0SubA, self).__init__(
+ l1=L.Linear(784, n_out))
+
+ def __call__(self, x):
+ return F.relu(self.l1(x))
+
+
+class MLP0SubB(chainer.Chain):
+ def __init__(self, comm):
+ super(MLP0SubB, self).__init__()
+
+ def __call__(self, y):
+ return y
+
+
+class MLP0(chainermn.MultiNodeChainList):
+ # Model on worker 0.
+ def __init__(self, comm, n_out):
+ super(MLP0, self).__init__(comm=comm)
+ self.add_link(MLP0SubA(comm, n_out), rank_in=None, rank_out=1)
+ self.add_link(MLP0SubB(comm), rank_in=1, rank_out=None)
+
+
+class MLP1Sub(chainer.Chain):
+ def __init__(self, n_units, n_out):
+ super(MLP1Sub, self).__init__(
+ l2=L.Linear(None, n_units),
+ l3=L.Linear(None, n_out))
+
+ def __call__(self, h0):
+ h1 = F.relu(self.l2(h0))
+ return self.l3(h1)
+
+
+class MLP1(chainermn.MultiNodeChainList):
+ # Model on worker 1.
+ def __init__(self, comm, n_units, n_out):
+ super(MLP1, self).__init__(comm=comm)
+ self.add_link(MLP1Sub(n_units, n_out), rank_in=0, rank_out=0)
+
+
+def main():
+ parser = argparse.ArgumentParser(
+ description='ChainerMN example: pipelined neural network')
+ parser.add_argument('--batchsize', '-b', type=int, default=100,
+ help='Number of images in each mini-batch')
+ parser.add_argument('--epoch', '-e', type=int, default=20,
+ help='Number of sweeps over the dataset to train')
+ parser.add_argument('--gpu', '-g', action='store_true',
+ help='Use GPU')
+ parser.add_argument('--out', '-o', default='result',
+ help='Directory to output the result')
+ parser.add_argument('--unit', '-u', type=int, default=1000,
+ help='Number of units')
+ args = parser.parse_args()
+
+ # Prepare ChainerMN communicator.
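+ # Ranks are laid out on a 2-D grid: data_axis (rank % 2) selects which half
+ # of the model-parallel network this process owns, and model_axis (rank // 2)
+ # identifies its replica. data_comm groups ranks that own the same half (used
+ # by the data-parallel optimizer below); model_comm groups the two halves of
+ # one replica (used for model parallelism).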
+ if args.gpu:
+ comm = chainermn.create_communicator('hierarchical')
+ data_axis, model_axis = comm.rank % 2, comm.rank // 2
+ data_comm = comm.split(data_axis, comm.rank)
+ model_comm = comm.split(model_axis, comm.rank)
+ device = comm.intra_rank
+ else:
+ comm = chainermn.create_communicator('naive')
+ data_axis, model_axis = comm.rank % 2, comm.rank // 2
+ data_comm = comm.split(data_axis, comm.rank)
+ model_comm = comm.split(model_axis, comm.rank)
+ device = -1
+
+ if model_comm.size != 2:
+ raise ValueError(
+ 'This example can only be executed on an even number '
+ 'of processes.')
+
+ if comm.rank == 0:
+ print('==========================================')
+ if args.gpu:
+ print('Using GPUs')
+ print('Num unit: {}'.format(args.unit))
+ print('Num Minibatch-size: {}'.format(args.batchsize))
+ print('Num epoch: {}'.format(args.epoch))
+ print('==========================================')
+
+ if data_axis == 0:
+ model = L.Classifier(MLP0(model_comm, args.unit))
+ elif data_axis == 1:
+ model = MLP1(model_comm, args.unit, 10)
+
+ if device >= 0:
+ chainer.cuda.get_device_from_id(device).use()
+ model.to_gpu()
+
+ optimizer = chainermn.create_multi_node_optimizer(
+ chainer.optimizers.Adam(), data_comm)
+ optimizer.setup(model)
+
+ # Original dataset on worker 0 and 1.
+ # Datasets of worker 0 and 1 are split and distributed to all workers.
+ if model_axis == 0:
+ train, test = chainer.datasets.get_mnist()
+ if data_axis == 1:
+ train = chainermn.datasets.create_empty_dataset(train)
+ test = chainermn.datasets.create_empty_dataset(test)
+ else:
+ train, test = None, None
+ train = chainermn.scatter_dataset(train, data_comm, shuffle=True)
+ test = chainermn.scatter_dataset(test, data_comm, shuffle=True)
+
+ train_iter = chainer.iterators.SerialIterator(
+ train, args.batchsize, shuffle=False)
+ test_iter = chainer.iterators.SerialIterator(
+ test, args.batchsize, repeat=False, shuffle=False)
+
+ updater = training.StandardUpdater(train_iter, optimizer, device=device)
+ trainer = training.Trainer(updater, (args.epoch, 'epoch'), out=args.out)
+ evaluator = extensions.Evaluator(test_iter, model, device=device)
+ evaluator = chainermn.create_multi_node_evaluator(evaluator, data_comm)
+ trainer.extend(evaluator)
+
+ # Some display and output extensions are necessary only for worker 0.
+ if comm.rank == 0:
+ trainer.extend(extensions.LogReport())
+ trainer.extend(extensions.PrintReport(
+ ['epoch', 'main/loss', 'validation/main/loss',
+ 'main/accuracy', 'validation/main/accuracy', 'elapsed_time']))
+ trainer.extend(extensions.ProgressBar())
+
+ trainer.run()
+
+
+if __name__ == '__main__':
+ main()
diff --git a/how-to-use-azureml/training-with-deep-learning/distributed-chainer/distributed-chainer.ipynb b/how-to-use-azureml/training-with-deep-learning/distributed-chainer/distributed-chainer.ipynb
new file mode 100644
index 00000000..c9643d54
--- /dev/null
+++ b/how-to-use-azureml/training-with-deep-learning/distributed-chainer/distributed-chainer.ipynb
@@ -0,0 +1,315 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Copyright (c) Microsoft Corporation. All rights reserved.\n",
+ "\n",
+ "Licensed under the MIT License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Distributed Chainer\n",
+ "In this tutorial, you will run a Chainer training example on the [MNIST](http://yann.lecun.com/exdb/mnist/) dataset using ChainerMN distributed training across a GPU cluster."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Prerequisites\n",
+ "* Go through the [Configuration](../../../configuration.ipynb) notebook to install the Azure Machine Learning Python SDK and create an Azure ML `Workspace`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Check core SDK version number\n",
+ "import azureml.core\n",
+ "\n",
+ "print(\"SDK version:\", azureml.core.VERSION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Diagnostics\n",
+ "Opt-in diagnostics for better experience, quality, and security of future releases."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": [
+ "Diagnostics"
+ ]
+ },
+ "outputs": [],
+ "source": [
+ "from azureml.telemetry import set_diagnostics_collection\n",
+ "\n",
+ "set_diagnostics_collection(send_diagnostics=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Initialize workspace\n",
+ "\n",
+ "Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. `Workspace.from_config()` creates a workspace object from the details stored in `config.json`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.workspace import Workspace\n",
+ "\n",
+ "ws = Workspace.from_config()\n",
+ "print('Workspace name: ' + ws.name, \n",
+ " 'Azure region: ' + ws.location, \n",
+ " 'Subscription id: ' + ws.subscription_id, \n",
+ " 'Resource group: ' + ws.resource_group, sep = '\\n')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Create or attach existing AmlCompute\n",
+ "You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for training your model. In this tutorial, we use Azure ML managed compute ([AmlCompute](https://docs.microsoft.com/azure/machine-learning/service/how-to-set-up-training-targets#amlcompute)) for our remote training compute resource. Specifically, the below code creates an `STANDARD_NC6` GPU cluster that autoscales from `0` to `4` nodes.\n",
+ "\n",
+ "**Creation of AmlCompute takes approximately 5 minutes.** If the AmlCompute with that name is already in your workspace, this code will skip the creation process.\n",
+ "\n",
+ "As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.compute import ComputeTarget, AmlCompute\n",
+ "from azureml.core.compute_target import ComputeTargetException\n",
+ "\n",
+ "# choose a name for your cluster\n",
+ "cluster_name = \"gpucluster\"\n",
+ "\n",
+ "try:\n",
+ " compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
+ " print('Found existing compute target.')\n",
+ "except ComputeTargetException:\n",
+ " print('Creating a new compute target...')\n",
+ " compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6',\n",
+ " max_nodes=4)\n",
+ "\n",
+ " # create the cluster\n",
+ " compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
+ "\n",
+ " compute_target.wait_for_completion(show_output=True)\n",
+ "\n",
+ "# use get_status() to get a detailed status for the current AmlCompute. \n",
+ "print(compute_target.get_status().serialize())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The above code creates GPU compute. If you instead want to create CPU compute, provide a different VM size to the `vm_size` parameter, such as `STANDARD_D2_V2`."
+ ]
+ },
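+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "For example, a minimal sketch of a CPU cluster using the same provisioning API could look like the following (the name `cpucluster` is only an assumption):\n",
+ "```Python\n",
+ "# sketch only: CPU variant of the cluster created above\n",
+ "cpu_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2',\n",
+ "                                                   max_nodes=4)\n",
+ "cpu_target = ComputeTarget.create(ws, 'cpucluster', cpu_config)\n",
+ "cpu_target.wait_for_completion(show_output=True)\n",
+ "```"
+ ]
+ },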
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Train model on the remote compute\n",
+ "Now that we have the AmlCompute ready to go, let's run our distributed training job."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Create a project directory\n",
+ "Create a directory that will contain all the necessary code from your local machine that you will need access to on the remote resource. This includes the training script and any additional files your training script depends on."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "project_folder = './chainer-distr'\n",
+ "os.makedirs(project_folder, exist_ok=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Prepare training script\n",
+ "Now you will need to create your training script. In this tutorial, the script for distributed training of MNIST is already provided for you at `train_mnist.py`. In practice, you should be able to take any custom Chainer training script as is and run it with Azure ML without having to modify your code."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Once your script is ready, copy the training script `train_mnist.py` into the project directory."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import shutil\n",
+ "\n",
+ "shutil.copy('train_mnist.py', project_folder)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Create an experiment\n",
+ "Create an [Experiment](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#experiment) to track all the runs in your workspace for this distributed Chainer tutorial. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core import Experiment\n",
+ "\n",
+ "experiment_name = 'chainer-distr'\n",
+ "experiment = Experiment(ws, name=experiment_name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Create a Chainer estimator\n",
+ "The Azure ML SDK's Chainer estimator enables you to easily submit Chainer training jobs for both single-node and distributed runs."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.train.dnn import Chainer\n",
+ "\n",
+ "estimator = Chainer(source_directory=project_folder,\n",
+ " compute_target=compute_target,\n",
+ " entry_script='train_mnist.py',\n",
+ " node_count=2,\n",
+ " process_count_per_node=1,\n",
+ " distributed_backend='mpi',\n",
+ " use_gpu=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The above code specifies that we will run our training script on `2` nodes, with one worker per node. In order to execute a distributed run using MPI, you must provide the argument `distributed_backend='mpi'`. Using this estimator with these settings, Chainer and its dependencies will be installed for you. However, if your script also uses other packages, make sure to install them via the `Chainer` constructor's `pip_packages` or `conda_packages` parameters."
+ ]
+ },
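+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "For instance, a sketch of the same estimator with an extra pip package might look like the following; the package name is only an illustrative assumption, not a requirement of this tutorial.\n",
+ "```Python\n",
+ "from azureml.train.dnn import Chainer\n",
+ "\n",
+ "# illustrative only: install an assumed extra dependency alongside Chainer\n",
+ "estimator_with_extras = Chainer(source_directory=project_folder,\n",
+ "                                compute_target=compute_target,\n",
+ "                                entry_script='train_mnist.py',\n",
+ "                                node_count=2,\n",
+ "                                process_count_per_node=1,\n",
+ "                                distributed_backend='mpi',\n",
+ "                                pip_packages=['matplotlib'],\n",
+ "                                use_gpu=True)\n",
+ "```"
+ ]
+ },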
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Submit job\n",
+ "Run your experiment by submitting your estimator object. Note that this call is asynchronous."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "run = experiment.submit(estimator)\n",
+ "print(run)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Monitor your run\n",
+ "You can monitor the progress of the run with a Jupyter widget. Like the run submission, the widget is asynchronous and provides live updates every 10-15 seconds until the job completes. You can see that the widget automatically plots and visualizes the loss metric that we logged to the Azure ML run."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.widgets import RunDetails\n",
+ "\n",
+ "RunDetails(run).show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "run.wait_for_completion(show_output=True)"
+ ]
+ }
+ ],
+ "metadata": {
+ "authors": [
+ {
+ "name": "minxia"
+ }
+ ],
+ "kernelspec": {
+ "display_name": "Python 3.6",
+ "language": "python",
+ "name": "python36"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.6.6"
+ },
+ "msauthor": "minxia"
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
\ No newline at end of file
diff --git a/how-to-use-azureml/training-with-deep-learning/distributed-chainer/train_mnist.py b/how-to-use-azureml/training-with-deep-learning/distributed-chainer/train_mnist.py
new file mode 100644
index 00000000..29c77f2d
--- /dev/null
+++ b/how-to-use-azureml/training-with-deep-learning/distributed-chainer/train_mnist.py
@@ -0,0 +1,125 @@
+# Official ChainerMN example taken from
+# https://github.com/chainer/chainer/blob/master/examples/chainermn/mnist/train_mnist.py
+
+from __future__ import print_function
+
+import argparse
+
+import chainer
+import chainer.functions as F
+import chainer.links as L
+from chainer import training
+from chainer.training import extensions
+
+import chainermn
+
+
+class MLP(chainer.Chain):
+
+ def __init__(self, n_units, n_out):
+ super(MLP, self).__init__(
+ # the size of the inputs to each layer will be inferred
+ l1=L.Linear(784, n_units), # n_in -> n_units
+ l2=L.Linear(n_units, n_units), # n_units -> n_units
+ l3=L.Linear(n_units, n_out), # n_units -> n_out
+ )
+
+ def __call__(self, x):
+ h1 = F.relu(self.l1(x))
+ h2 = F.relu(self.l2(h1))
+ return self.l3(h2)
+
+
+def main():
+ parser = argparse.ArgumentParser(description='ChainerMN example: MNIST')
+ parser.add_argument('--batchsize', '-b', type=int, default=100,
+ help='Number of images in each mini-batch')
+ parser.add_argument('--communicator', type=str,
+ default='non_cuda_aware', help='Type of communicator')
+ parser.add_argument('--epoch', '-e', type=int, default=20,
+ help='Number of sweeps over the dataset to train')
+ parser.add_argument('--gpu', '-g', default=True,
+ help='Use GPU')
+ parser.add_argument('--out', '-o', default='result',
+ help='Directory to output the result')
+ parser.add_argument('--resume', '-r', default='',
+ help='Resume the training from snapshot')
+ parser.add_argument('--unit', '-u', type=int, default=1000,
+ help='Number of units')
+ args = parser.parse_args()
+
+ # Prepare ChainerMN communicator.
+
+ if args.gpu:
+ if args.communicator == 'naive':
+ print("Error: 'naive' communicator does not support GPU.\n")
+ exit(-1)
+ comm = chainermn.create_communicator(args.communicator)
+ device = comm.intra_rank
+ else:
+ if args.communicator != 'naive':
+ print('Warning: using naive communicator '
+ 'because only naive supports CPU-only execution')
+ comm = chainermn.create_communicator('naive')
+ device = -1
+
+ if comm.rank == 0:
+ print('==========================================')
+ print('Num process (COMM_WORLD): {}'.format(comm.size))
+ if args.gpu:
+ print('Using GPUs')
+ print('Using {} communicator'.format(args.communicator))
+ print('Num unit: {}'.format(args.unit))
+ print('Num Minibatch-size: {}'.format(args.batchsize))
+ print('Num epoch: {}'.format(args.epoch))
+ print('==========================================')
+
+ model = L.Classifier(MLP(args.unit, 10))
+ if device >= 0:
+ chainer.cuda.get_device_from_id(device).use()
+ model.to_gpu()
+
+ # Create a multi node optimizer from a standard Chainer optimizer.
+ optimizer = chainermn.create_multi_node_optimizer(
+ chainer.optimizers.Adam(), comm)
+ optimizer.setup(model)
+
+ # Split and distribute the dataset. Only worker 0 loads the whole dataset.
+ # Datasets of worker 0 are evenly split and distributed to all workers.
+ if comm.rank == 0:
+ train, test = chainer.datasets.get_mnist()
+ else:
+ train, test = None, None
+ train = chainermn.scatter_dataset(train, comm, shuffle=True)
+ test = chainermn.scatter_dataset(test, comm, shuffle=True)
+
+ train_iter = chainer.iterators.SerialIterator(train, args.batchsize)
+ test_iter = chainer.iterators.SerialIterator(test, args.batchsize,
+ repeat=False, shuffle=False)
+
+ updater = training.StandardUpdater(train_iter, optimizer, device=device)
+ trainer = training.Trainer(updater, (args.epoch, 'epoch'), out=args.out)
+
+ # Create a multi node evaluator from a standard Chainer evaluator.
+ evaluator = extensions.Evaluator(test_iter, model, device=device)
+ evaluator = chainermn.create_multi_node_evaluator(evaluator, comm)
+ trainer.extend(evaluator)
+
+ # Some display and output extensions are necessary only for one worker.
+ # (Otherwise, there would just be repeated outputs.)
+ if comm.rank == 0:
+ trainer.extend(extensions.dump_graph('main/loss'))
+ trainer.extend(extensions.LogReport())
+ trainer.extend(extensions.PrintReport(
+ ['epoch', 'main/loss', 'validation/main/loss',
+ 'main/accuracy', 'validation/main/accuracy', 'elapsed_time']))
+ trainer.extend(extensions.ProgressBar())
+
+ if args.resume:
+ chainer.serializers.load_npz(args.resume, trainer)
+
+ trainer.run()
+
+
+if __name__ == '__main__':
+ main()
diff --git a/how-to-use-azureml/training-with-deep-learning/export-run-history-to-tensorboard/export-run-history-to-tensorboard.ipynb b/how-to-use-azureml/training-with-deep-learning/export-run-history-to-tensorboard/export-run-history-to-tensorboard.ipynb
index 2911aeac..59c19e52 100644
--- a/how-to-use-azureml/training-with-deep-learning/export-run-history-to-tensorboard/export-run-history-to-tensorboard.ipynb
+++ b/how-to-use-azureml/training-with-deep-learning/export-run-history-to-tensorboard/export-run-history-to-tensorboard.ipynb
@@ -56,7 +56,7 @@
"metadata": {},
"outputs": [],
"source": [
- "!pip install azureml-contrib-tensorboard"
+ "!pip install azureml-tensorboard"
]
},
{
@@ -166,7 +166,7 @@
"outputs": [],
"source": [
"# Export Run History to Tensorboard logs\n",
- "from azureml.contrib.tensorboard.export import export_to_tensorboard\n",
+ "from azureml.tensorboard.export import export_to_tensorboard\n",
"import os\n",
"\n",
"logdir = 'exportedTBlogs'\n",
@@ -208,7 +208,7 @@
"metadata": {},
"outputs": [],
"source": [
- "from azureml.contrib.tensorboard import Tensorboard\n",
+ "from azureml.tensorboard import Tensorboard\n",
"\n",
"# The Tensorboard constructor takes an array of runs, so be sure and pass it in as a single-element array here\n",
"tb = Tensorboard([], local_root=logdir, port=6006)\n",
diff --git a/how-to-use-azureml/training-with-deep-learning/tensorboard/tensorboard.ipynb b/how-to-use-azureml/training-with-deep-learning/tensorboard/tensorboard.ipynb
index ea4f4a5b..8e355d98 100644
--- a/how-to-use-azureml/training-with-deep-learning/tensorboard/tensorboard.ipynb
+++ b/how-to-use-azureml/training-with-deep-learning/tensorboard/tensorboard.ipynb
@@ -57,7 +57,7 @@
"metadata": {},
"outputs": [],
"source": [
- "!pip install azureml-contrib-tensorboard"
+ "!pip install azureml-tensorboard"
]
},
{
@@ -239,7 +239,7 @@
"metadata": {},
"outputs": [],
"source": [
- "from azureml.contrib.tensorboard import Tensorboard\n",
+ "from azureml.tensorboard import Tensorboard\n",
"\n",
"# The Tensorboard constructor takes an array of runs, so be sure and pass it in as a single-element array here\n",
"tb = Tensorboard([run])\n",
@@ -293,7 +293,7 @@
"metadata": {},
"outputs": [],
"source": [
- "from azureml.core.compute import RemoteCompute\n",
+ "from azureml.core.compute import ComputeTarget, RemoteCompute\n",
"from azureml.core.compute_target import ComputeTargetException\n",
"\n",
"username = os.getenv('AZUREML_DSVM_USERNAME', default='')\n",
@@ -305,12 +305,11 @@
" attached_dsvm_compute = RemoteCompute(workspace=ws, name=compute_target_name)\n",
" print('found existing:', attached_dsvm_compute.name)\n",
"except ComputeTargetException:\n",
- " attached_dsvm_compute = RemoteCompute.attach(workspace=ws,\n",
- " name=compute_target_name,\n",
- " username=username,\n",
- " address=address,\n",
- " ssh_port=22,\n",
- " private_key_file='./.ssh/id_rsa')\n",
+ " config = RemoteCompute.attach_configuration(username=username,\n",
+ " address=address,\n",
+ " ssh_port=22,\n",
+ " private_key_file='./.ssh/id_rsa')\n",
+ " attached_dsvm_compute = ComputeTarget.attach(ws, compute_target_name, config)\n",
" \n",
" attached_dsvm_compute.wait_for_completion(show_output=True)"
]
@@ -407,10 +406,13 @@
"# choose a name for your cluster\n",
"cluster_name = \"cpucluster\"\n",
"\n",
- "try:\n",
- " compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
- " print('Found existing compute target.')\n",
- "except ComputeTargetException:\n",
+ "cts = ws.compute_targets\n",
+ "found = False\n",
+ "if cluster_name in cts and cts[cluster_name].type == 'AmlCompute':\n",
+ " found = True\n",
+ " print('Found existing compute target.')\n",
+ " compute_target = cts[cluster_name]\n",
+ "if not found:\n",
" print('Creating a new compute target...')\n",
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', \n",
" max_nodes=4)\n",
@@ -418,10 +420,10 @@
" # create the cluster\n",
" compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
"\n",
- "compute_target.wait_for_completion(show_output=True, min_node_count=1, timeout_in_minutes=20)\n",
+ "compute_target.wait_for_completion(show_output=True, min_node_count=None)\n",
"\n",
"# use get_status() to get a detailed status for the current cluster. \n",
- "print(compute_target.get_status().serialize())"
+ "# print(compute_target.get_status().serialize())"
]
},
{
diff --git a/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-chainer/chainer_mnist.py b/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-chainer/chainer_mnist.py
new file mode 100644
index 00000000..515ce8ba
--- /dev/null
+++ b/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-chainer/chainer_mnist.py
@@ -0,0 +1,136 @@
+
+import argparse
+
+import numpy as np
+
+import chainer
+from chainer import backend
+from chainer import backends
+from chainer.backends import cuda
+from chainer import Function, gradient_check, report, training, utils, Variable
+from chainer import datasets, iterators, optimizers, serializers
+from chainer import Link, Chain, ChainList
+import chainer.functions as F
+import chainer.links as L
+from chainer.training import extensions
+from chainer.dataset import concat_examples
+from chainer.backends.cuda import to_cpu
+
+from azureml.core.run import Run
+run = Run.get_context()
+
+
+class MyNetwork(Chain):
+
+ def __init__(self, n_mid_units=100, n_out=10):
+ super(MyNetwork, self).__init__()
+ with self.init_scope():
+ self.l1 = L.Linear(None, n_mid_units)
+ self.l2 = L.Linear(n_mid_units, n_mid_units)
+ self.l3 = L.Linear(n_mid_units, n_out)
+
+ def forward(self, x):
+ h = F.relu(self.l1(x))
+ h = F.relu(self.l2(h))
+ return self.l3(h)
+
+
+def main():
+ parser = argparse.ArgumentParser(description='Chainer example: MNIST')
+ parser.add_argument('--batchsize', '-b', type=int, default=100,
+ help='Number of images in each mini-batch')
+ parser.add_argument('--epochs', '-e', type=int, default=20,
+ help='Number of sweeps over the dataset to train')
+ parser.add_argument('--output_dir', '-o', default='./outputs',
+ help='Directory to output the result')
+ parser.add_argument('--gpu_id', '-g', type=int, default=0,
+ help='ID of the GPU to be used. Set to -1 if you use CPU')
+ args = parser.parse_args()
+
+ # Download the MNIST data if you haven't downloaded it yet
+ train, test = datasets.mnist.get_mnist(withlabel=True, ndim=1)
+
+ gpu_id = args.gpu_id
+ batchsize = args.batchsize
+ epochs = args.epochs
+ run.log('Batch size', np.int(batchsize))
+ run.log('Epochs', np.int(epochs))
+
+ train_iter = iterators.SerialIterator(train, batchsize)
+ test_iter = iterators.SerialIterator(test, batchsize,
+ repeat=False, shuffle=False)
+
+ model = MyNetwork()
+
+ if gpu_id >= 0:
+ # Make a specified GPU current
+ chainer.backends.cuda.get_device_from_id(gpu_id).use()
+ model.to_gpu() # Copy the model to the GPU
+
+ # Choose an optimizer algorithm
+ optimizer = optimizers.MomentumSGD(lr=0.01, momentum=0.9)
+
+ # Give the optimizer a reference to the model so that it
+ # can locate the model's parameters.
+ optimizer.setup(model)
+
+ while train_iter.epoch < epochs:
+ # ---------- One iteration of the training loop ----------
+ train_batch = train_iter.next()
+ image_train, target_train = concat_examples(train_batch, gpu_id)
+
+ # Calculate the prediction of the network
+ prediction_train = model(image_train)
+
+ # Calculate the loss with softmax_cross_entropy
+ loss = F.softmax_cross_entropy(prediction_train, target_train)
+
+ # Calculate the gradients in the network
+ model.cleargrads()
+ loss.backward()
+
+ # Update all the trainable parameters
+ optimizer.update()
+ # --------------------- until here ---------------------
+
+ # Check the validation accuracy of prediction after every epoch
+ if train_iter.is_new_epoch: # If this iteration is the final iteration of the current epoch
+
+ # Display the training loss
+ print('epoch:{:02d} train_loss:{:.04f} '.format(
+ train_iter.epoch, float(to_cpu(loss.array))), end='')
+
+ test_losses = []
+ test_accuracies = []
+ while True:
+ test_batch = test_iter.next()
+ image_test, target_test = concat_examples(test_batch, gpu_id)
+
+ # Forward the test data
+ prediction_test = model(image_test)
+
+ # Calculate the loss
+ loss_test = F.softmax_cross_entropy(prediction_test, target_test)
+ test_losses.append(to_cpu(loss_test.array))
+
+ # Calculate the accuracy
+ accuracy = F.accuracy(prediction_test, target_test)
+ accuracy.to_cpu()
+ test_accuracies.append(accuracy.array)
+
+ if test_iter.is_new_epoch:
+ test_iter.epoch = 0
+ test_iter.current_position = 0
+ test_iter.is_new_epoch = False
+ test_iter._pushed_position = None
+ break
+
+ val_accuracy = np.mean(test_accuracies)
+ print('val_loss:{:.04f} val_accuracy:{:.04f}'.format(
+ np.mean(test_losses), val_accuracy))
+
+ run.log("Accuracy", np.float(val_accuracy))
+
+
+if __name__ == '__main__':
+ main()
diff --git a/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-chainer/chainer_mnist_hd.py b/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-chainer/chainer_mnist_hd.py
new file mode 100644
index 00000000..46d43588
--- /dev/null
+++ b/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-chainer/chainer_mnist_hd.py
@@ -0,0 +1,134 @@
+
+import argparse
+
+import numpy as np
+
+import chainer
+from chainer import backend
+from chainer import backends
+from chainer.backends import cuda
+from chainer import Function, gradient_check, report, training, utils, Variable
+from chainer import datasets, iterators, optimizers, serializers
+from chainer import Link, Chain, ChainList
+import chainer.functions as F
+import chainer.links as L
+from chainer.training import extensions
+from chainer.dataset import concat_examples
+from chainer.backends.cuda import to_cpu
+
+from azureml.core.run import Run
+run = Run.get_context()
+
+
+class MyNetwork(Chain):
+
+ def __init__(self, n_mid_units=100, n_out=10):
+ super(MyNetwork, self).__init__()
+ with self.init_scope():
+ self.l1 = L.Linear(None, n_mid_units)
+ self.l2 = L.Linear(n_mid_units, n_mid_units)
+ self.l3 = L.Linear(n_mid_units, n_out)
+
+ def forward(self, x):
+ h = F.relu(self.l1(x))
+ h = F.relu(self.l2(h))
+ return self.l3(h)
+
+
+def main():
+ parser = argparse.ArgumentParser(description='Chainer example: MNIST')
+ parser.add_argument('--batchsize', '-b', type=int, default=100,
+ help='Number of images in each mini-batch')
+ parser.add_argument('--epochs', '-e', type=int, default=20,
+ help='Number of sweeps over the dataset to train')
+ parser.add_argument('--output_dir', '-o', default='./outputs',
+ help='Directory to output the result')
+ args = parser.parse_args()
+
+ # Download the MNIST data if you haven't downloaded it yet
+ train, test = datasets.mnist.get_mnist(withlabel=True, ndim=1)
+
+ batchsize = args.batchsize
+ epochs = args.epochs
+ run.log('Batch size', np.int(batchsize))
+ run.log('Epochs', np.int(epochs))
+
+ train_iter = iterators.SerialIterator(train, batchsize)
+ test_iter = iterators.SerialIterator(test, batchsize,
+ repeat=False, shuffle=False)
+
+ model = MyNetwork()
+
+ gpu_id = -1 # Set to -1 if you use CPU
+ if gpu_id >= 0:
+ # Make a specified GPU current
+ chainer.backends.cuda.get_device_from_id(0).use()
+ model.to_gpu() # Copy the model to the GPU
+
+ # Choose an optimizer algorithm
+ optimizer = optimizers.MomentumSGD(lr=0.01, momentum=0.9)
+
+ # Give the optimizer a reference to the model so that it
+ # can locate the model's parameters.
+ optimizer.setup(model)
+
+ while train_iter.epoch < epochs:
+ # ---------- One iteration of the training loop ----------
+ train_batch = train_iter.next()
+ image_train, target_train = concat_examples(train_batch, gpu_id)
+
+ # Calculate the prediction of the network
+ prediction_train = model(image_train)
+
+ # Calculate the loss with softmax_cross_entropy
+ loss = F.softmax_cross_entropy(prediction_train, target_train)
+
+ # Calculate the gradients in the network
+ model.cleargrads()
+ loss.backward()
+
+ # Update all the trainable parameters
+ optimizer.update()
+ # --------------------- until here ---------------------
+
+ # Check the validation accuracy of prediction after every epoch
+ if train_iter.is_new_epoch: # If this iteration is the final iteration of the current epoch
+
+ # Display the training loss
+ print('epoch:{:02d} train_loss:{:.04f} '.format(
+ train_iter.epoch, float(to_cpu(loss.array))), end='')
+
+ test_losses = []
+ test_accuracies = []
+ while True:
+ test_batch = test_iter.next()
+ image_test, target_test = concat_examples(test_batch, gpu_id)
+
+ # Forward the test data
+ prediction_test = model(image_test)
+
+ # Calculate the loss
+ loss_test = F.softmax_cross_entropy(prediction_test, target_test)
+ test_losses.append(to_cpu(loss_test.array))
+
+ # Calculate the accuracy
+ accuracy = F.accuracy(prediction_test, target_test)
+ accuracy.to_cpu()
+ test_accuracies.append(accuracy.array)
+
+ if test_iter.is_new_epoch:
+ test_iter.epoch = 0
+ test_iter.current_position = 0
+ test_iter.is_new_epoch = False
+ test_iter._pushed_position = None
+ break
+
+ val_accuracy = np.mean(test_accuracies)
+ print('val_loss:{:.04f} val_accuracy:{:.04f}'.format(
+ np.mean(test_losses), val_accuracy))
+
+ run.log("Accuracy", np.float(val_accuracy))
+
+
+if __name__ == '__main__':
+ main()
diff --git a/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-chainer/train-hyperparameter-tune-deploy-with-chainer.ipynb b/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-chainer/train-hyperparameter-tune-deploy-with-chainer.ipynb
new file mode 100644
index 00000000..28bf9b1c
--- /dev/null
+++ b/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-chainer/train-hyperparameter-tune-deploy-with-chainer.ipynb
@@ -0,0 +1,425 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Copyright (c) Microsoft Corporation. All rights reserved. \n",
+ "\n",
+ "Licensed under the MIT License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Train and hyperparameter tune with Chainer\n",
+ "\n",
+ "In this tutorial, we demonstrate how to use the Azure ML Python SDK to train a Convolutional Neural Network (CNN) on a single-node GPU with Chainer to perform handwritten digit recognition on the popular MNIST dataset. We will also demonstrate how to perform hyperparameter tuning of the model using Azure ML's HyperDrive service."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Prerequisites\n",
+ "* Go through the [Configuration](../../../configuration.ipynb) notebook to install the Azure Machine Learning Python SDK and create an Azure ML `Workspace`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Check core SDK version number\n",
+ "import azureml.core\n",
+ "\n",
+ "print(\"SDK version:\", azureml.core.VERSION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Diagnostics\n",
+ "Opt-in diagnostics for better experience, quality, and security of future releases."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "tags": [
+ "Diagnostics"
+ ]
+ },
+ "outputs": [],
+ "source": [
+ "from azureml.telemetry import set_diagnostics_collection\n",
+ "\n",
+ "set_diagnostics_collection(send_diagnostics=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Initialize workspace\n",
+ "Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. `Workspace.from_config()` creates a workspace object from the details stored in `config.json`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.workspace import Workspace\n",
+ "\n",
+ "ws = Workspace.from_config()\n",
+ "print('Workspace name: ' + ws.name, \n",
+ " 'Azure region: ' + ws.location, \n",
+ " 'Subscription id: ' + ws.subscription_id, \n",
+ " 'Resource group: ' + ws.resource_group, sep = '\\n')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Create or Attach existing AmlCompute\n",
+ "You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for training your model. In this tutorial, we use Azure ML managed compute ([AmlCompute](https://docs.microsoft.com/azure/machine-learning/service/how-to-set-up-training-targets#amlcompute)) for our remote training compute resource.\n",
+ "\n",
+ "**Creation of AmlCompute takes approximately 5 minutes.** If the AmlCompute with that name is already in your workspace, this code will skip the creation process.\n",
+ "\n",
+ "As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.compute import ComputeTarget, AmlCompute\n",
+ "from azureml.core.compute_target import ComputeTargetException\n",
+ "\n",
+ "# choose a name for your cluster\n",
+ "cluster_name = \"gpucluster\"\n",
+ "\n",
+ "try:\n",
+ " compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
+ " print('Found existing compute target.')\n",
+ "except ComputeTargetException:\n",
+ " print('Creating a new compute target...')\n",
+ " compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', \n",
+ " max_nodes=4)\n",
+ "\n",
+ " # create the cluster\n",
+ " compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
+ "\n",
+ " compute_target.wait_for_completion(show_output=True)\n",
+ "\n",
+ "# use get_status() to get a detailed status for the current cluster. \n",
+ "print(compute_target.get_status().serialize())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The above code creates a GPU cluster. If you instead want to create a CPU cluster, provide a different VM size to the `vm_size` parameter, such as `STANDARD_D2_V2`."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Train model on the remote compute\n",
+ "Now that you have your data and training script prepared, you are ready to train on your remote compute cluster. You can take advantage of Azure compute to leverage GPUs to cut down your training time. "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Create a project directory\n",
+ "Create a directory that will contain all the necessary code from your local machine that you will need access to on the remote resource. This includes the training script and any additional files your training script depends on."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "project_folder = './chainer-mnist'\n",
+ "os.makedirs(project_folder, exist_ok=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Prepare training script\n",
+ "Now you will need to create your training script. In this tutorial, the training script is already provided for you at `chainer_mnist.py`. In practice, you should be able to take any custom training script as is and run it with Azure ML without having to modify your code.\n",
+ "\n",
+ "However, if you would like to use Azure ML's [tracking and metrics](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#metrics) capabilities, you will have to add a small amount of Azure ML code inside your training script. \n",
+ "\n",
+ "In `chainer_mnist.py`, we will log some metrics to our Azure ML run. To do so, we will access the Azure ML `Run` object within the script:\n",
+ "```Python\n",
+ "from azureml.core.run import Run\n",
+ "run = Run.get_context()\n",
+ "```\n",
+ "Further within `chainer_mnist.py`, we log the batchsize and epochs parameters, and the highest accuracy the model achieves:\n",
+ "```Python\n",
+ "run.log('Batch size', np.int(args.batchsize))\n",
+ "run.log('Epochs', np.int(args.epochs))\n",
+ "\n",
+ "run.log('Accuracy', np.float(val_accuracy))\n",
+ "```\n",
+ "These run metrics will become particularly important when we begin hyperparameter tuning our model in the \"Tune model hyperparameters\" section."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Once your script is ready, copy the training script `chainer_mnist.py` into your project directory."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import shutil\n",
+ "\n",
+ "shutil.copy('chainer_mnist.py', project_folder)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Create an experiment\n",
+ "Create an [Experiment](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#experiment) to track all the runs in your workspace for this Chainer tutorial. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core import Experiment\n",
+ "\n",
+ "experiment_name = 'chainer-mnist'\n",
+ "experiment = Experiment(ws, name=experiment_name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Create a Chainer estimator\n",
+ "The Azure ML SDK's Chainer estimator enables you to easily submit Chainer training jobs for both single-node and distributed runs. The following code will define a single-node Chainer job."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.train.dnn import Chainer\n",
+ "\n",
+ "script_params = {\n",
+ " '--epochs': 10,\n",
+ " '--batchsize': 128,\n",
+ " '--output_dir': './outputs'\n",
+ "}\n",
+ "\n",
+ "estimator = Chainer(source_directory=project_folder, \n",
+ " script_params=script_params,\n",
+ " compute_target=compute_target,\n",
+ " pip_packages=['numpy', 'pytest'],\n",
+ " entry_script='chainer_mnist.py',\n",
+ " use_gpu=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The `script_params` parameter is a dictionary containing the command-line arguments to your training script `entry_script`. To leverage the Azure VM's GPU for training, we set `use_gpu=True`."
+ ]
+ },
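+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "For reference, here is a minimal sketch of how a training script such as `chainer_mnist.py` might consume these arguments with `argparse`; the actual script shipped with this sample may differ in details:\n",
+ "```Python\n",
+ "import argparse\n",
+ "\n",
+ "# parse the command-line arguments passed in via script_params\n",
+ "parser = argparse.ArgumentParser()\n",
+ "parser.add_argument('--epochs', type=int, default=10)\n",
+ "parser.add_argument('--batchsize', type=int, default=128)\n",
+ "parser.add_argument('--output_dir', type=str, default='./outputs')\n",
+ "args = parser.parse_args()\n",
+ "```"
+ ]
+ },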
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Submit job\n",
+ "Run your experiment by submitting your estimator object. Note that this call is asynchronous."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "run = experiment.submit(estimator)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Monitor your run\n",
+ "You can monitor the progress of the run with a Jupyter widget. Like the run submission, the widget is asynchronous and provides live updates every 10-15 seconds until the job completes."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.widgets import RunDetails\n",
+ "\n",
+ "RunDetails(run).show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# to get more details of your run\n",
+ "print(run.get_details())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Tune model hyperparameters\n",
+ "Now that we've seen how to do a simple Chainer training run using the SDK, let's see if we can further improve the accuracy of our model. We can optimize our model's hyperparameters using Azure Machine Learning's hyperparameter tuning capabilities."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Start a hyperparameter sweep\n",
+ "First, we will define the hyperparameter space to sweep over. Let's tune the batch size and epochs parameters. In this example we will use random sampling to try different configuration sets of hyperparameters to maximize our primary metric, accuracy.\n",
+ "\n",
+ "Then, we specify the early termination policy to use to early terminate poorly performing runs. Here we use the `BanditPolicy`, which will terminate any run that doesn't fall within the slack factor of our primary evaluation metric. In this tutorial, we will apply this policy every epoch (since we report our `Accuracy` metric every epoch and `evaluation_interval=1`). Notice we will delay the first policy evaluation until after the first `3` epochs (`delay_evaluation=3`).\n",
+ "Refer [here](https://docs.microsoft.com/azure/machine-learning/service/how-to-tune-hyperparameters#specify-an-early-termination-policy) for more information on the BanditPolicy and other policies available."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.train.hyperdrive.runconfig import HyperDriveRunConfig\n",
+ "from azureml.train.hyperdrive.sampling import RandomParameterSampling\n",
+ "from azureml.train.hyperdrive.policy import BanditPolicy\n",
+ "from azureml.train.hyperdrive.run import PrimaryMetricGoal\n",
+ "from azureml.train.hyperdrive.parameter_expressions import choice\n",
+ " \n",
+ "\n",
+ "param_sampling = RandomParameterSampling( {\n",
+ " \"--batchsize\": choice(128, 256),\n",
+ " \"--epochs\": choice(5, 10, 20, 40)\n",
+ " }\n",
+ ")\n",
+ "\n",
+ "hyperdrive_run_config = HyperDriveRunConfig(estimator=estimator,\n",
+ " hyperparameter_sampling=param_sampling, \n",
+ " primary_metric_name='Accuracy',\n",
+ " primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,\n",
+ " max_total_runs=8,\n",
+ " max_concurrent_runs=4)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Finally, lauch the hyperparameter tuning job."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# start the HyperDrive run\n",
+ "hyperdrive_run = experiment.submit(hyperdrive_run_config)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Monitor HyperDrive runs\n",
+ "You can monitor the progress of the runs with the following Jupyter widget. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "RunDetails(hyperdrive_run).show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "run.wait_for_completion(show_output=True)"
+ ]
+ }
+ ],
+ "metadata": {
+ "authors": [
+ {
+ "name": "minxia"
+ }
+ ],
+ "kernelspec": {
+ "display_name": "Python 3.6",
+ "language": "python",
+ "name": "python36"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.6.6"
+ },
+ "msauthor": "minxia"
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
\ No newline at end of file
diff --git a/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-keras/keras_mnist.py b/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-keras/keras_mnist.py
index 0ca63a2b..9f2529e6 100644
--- a/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-keras/keras_mnist.py
+++ b/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-keras/keras_mnist.py
@@ -28,6 +28,8 @@ parser.add_argument('--first-layer-neurons', type=int, dest='n_hidden_1', defaul
help='# of neurons in the first layer')
parser.add_argument('--second-layer-neurons', type=int, dest='n_hidden_2', default=100,
help='# of neurons in the second layer')
+parser.add_argument('--learning-rate', type=float, dest='learning_rate', default=0.001, help='learning rate')
+
args = parser.parse_args()
data_folder = args.data_folder
@@ -46,9 +48,9 @@ n_inputs = 28 * 28
n_h1 = args.n_hidden_1
n_h2 = args.n_hidden_2
n_outputs = 10
-
n_epochs = 20
batch_size = args.batch_size
+learning_rate = args.learning_rate
y_train = one_hot_encode(y_train, n_outputs)
y_test = one_hot_encode(y_test, n_outputs)
@@ -56,9 +58,9 @@ print(X_train.shape, y_train.shape, X_test.shape, y_test.shape, sep='\n')
# Build a simple MLP model
model = Sequential()
-# input layer
+# first hidden layer
model.add(Dense(n_h1, activation='relu', input_shape=(n_inputs,)))
-# hidden layer
+# second hidden layer
model.add(Dense(n_h2, activation='relu'))
# output layer
model.add(Dense(n_outputs, activation='softmax'))
@@ -66,7 +68,7 @@ model.add(Dense(n_outputs, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
- optimizer=RMSprop(),
+ optimizer=RMSprop(lr=learning_rate),
metrics=['accuracy'])
# start an Azure ML run
diff --git a/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-keras/train-hyperparameter-tune-deploy-with-keras.ipynb b/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-keras/train-hyperparameter-tune-deploy-with-keras.ipynb
index 4248dfca..1713c5c0 100644
--- a/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-keras/train-hyperparameter-tune-deploy-with-keras.ipynb
+++ b/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-keras/train-hyperparameter-tune-deploy-with-keras.ipynb
@@ -1,1141 +1,1171 @@
{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Copyright (c) Microsoft Corporation. All rights reserved.\n",
- "\n",
- "Licensed under the MIT License."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "nbpresent": {
- "id": "bf74d2e9-2708-49b1-934b-e0ede342f475"
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Copyright (c) Microsoft Corporation. All rights reserved.\n",
+ "\n",
+ "Licensed under the MIT License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "nbpresent": {
+ "id": "bf74d2e9-2708-49b1-934b-e0ede342f475"
+ }
+ },
+ "source": [
+ "# Training, hyperparameter tune, and deploy with Keras\n",
+ "\n",
+ "## Introduction\n",
+ "This tutorial shows how to train a simple deep neural network using the MNIST dataset and Keras on Azure Machine Learning. MNIST is a popular dataset consisting of 70,000 grayscale images. Each image is a handwritten digit of `28x28` pixels, representing number from 0 to 9. The goal is to create a multi-class classifier to identify the digit each image represents, and deploy it as a web service in Azure.\n",
+ "\n",
+ "For more information about the MNIST dataset, please visit [Yan LeCun's website](http://yann.lecun.com/exdb/mnist/).\n",
+ "\n",
+ "## Prerequisite:\n",
+ "* Understand the [architecture and terms](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture) introduced by Azure Machine Learning\n",
+ "* Go through the [configuration notebook](../../../configuration.ipynb) to:\n",
+ " * install the AML SDK\n",
+ " * create a workspace and its configuration file (`config.json`)\n",
+ "* For local scoring test, you will also need to have `tensorflow` and `keras` installed in the current Jupyter kernel."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Let's get started. First let's import some Python libraries."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "nbpresent": {
+ "id": "c377ea0c-0cd9-4345-9be2-e20fb29c94c3"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "%matplotlib inline\n",
+ "import numpy as np\n",
+ "import os\n",
+ "import matplotlib.pyplot as plt"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "nbpresent": {
+ "id": "edaa7f2f-2439-4148-b57a-8c794c0945ec"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "import azureml\n",
+ "from azureml.core import Workspace\n",
+ "\n",
+ "# check core SDK version number\n",
+ "print(\"Azure ML SDK Version: \", azureml.core.VERSION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Initialize workspace\n",
+ "Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. `Workspace.from_config()` creates a workspace object from the details stored in `config.json`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "ws = Workspace.from_config()\n",
+ "print('Workspace name: ' + ws.name, \n",
+ " 'Azure region: ' + ws.location, \n",
+ " 'Subscription id: ' + ws.subscription_id, \n",
+ " 'Resource group: ' + ws.resource_group, sep='\\n')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "nbpresent": {
+ "id": "59f52294-4a25-4c92-bab8-3b07f0f44d15"
+ }
+ },
+ "source": [
+ "## Create an Azure ML experiment\n",
+ "Let's create an experiment named \"keras-mnist\" and a folder to hold the training scripts. The script runs will be recorded under the experiment in Azure."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "nbpresent": {
+ "id": "bc70f780-c240-4779-96f3-bc5ef9a37d59"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "from azureml.core import Experiment\n",
+ "\n",
+ "script_folder = './keras-mnist'\n",
+ "os.makedirs(script_folder, exist_ok=True)\n",
+ "\n",
+ "exp = Experiment(workspace=ws, name='keras-mnist')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "nbpresent": {
+ "id": "defe921f-8097-44c3-8336-8af6700804a7"
+ }
+ },
+ "source": [
+ "## Download MNIST dataset\n",
+ "In order to train on the MNIST dataset we will first need to download it from Yan LeCun's web site directly and save them in a `data` folder locally."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import urllib\n",
+ "\n",
+ "os.makedirs('./data/mnist', exist_ok=True)\n",
+ "\n",
+ "urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz', filename='./data/mnist/train-images.gz')\n",
+ "urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz', filename='./data/mnist/train-labels.gz')\n",
+ "urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz', filename='./data/mnist/test-images.gz')\n",
+ "urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz', filename='./data/mnist/test-labels.gz')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "nbpresent": {
+ "id": "c3f2f57c-7454-4d3e-b38d-b0946cf066ea"
+ }
+ },
+ "source": [
+ "## Show some sample images\n",
+ "Let's load the downloaded compressed file into numpy arrays using some utility functions included in the `utils.py` library file from the current folder. Then we use `matplotlib` to plot 30 random images from the dataset along with their labels."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "nbpresent": {
+ "id": "396d478b-34aa-4afa-9898-cdce8222a516"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "from utils import load_data, one_hot_encode\n",
+ "\n",
+ "# note we also shrink the intensity values (X) from 0-255 to 0-1. This helps the neural network converge faster.\n",
+ "X_train = load_data('./data/mnist/train-images.gz', False) / 255.0\n",
+ "y_train = load_data('./data/mnist/train-labels.gz', True).reshape(-1)\n",
+ "\n",
+ "X_test = load_data('./data/mnist/test-images.gz', False) / 255.0\n",
+ "y_test = load_data('./data/mnist/test-labels.gz', True).reshape(-1)\n",
+ "\n",
+ "count = 0\n",
+ "sample_size = 30\n",
+ "plt.figure(figsize = (16, 6))\n",
+ "for i in np.random.permutation(X_train.shape[0])[:sample_size]:\n",
+ " count = count + 1\n",
+ " plt.subplot(1, sample_size, count)\n",
+ " plt.axhline('')\n",
+ " plt.axvline('')\n",
+ " plt.text(x = 10, y = -10, s = y_train[i], fontsize = 18)\n",
+ " plt.imshow(X_train[i].reshape(28, 28), cmap = plt.cm.Greys)\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Upload MNIST dataset to default datastore \n",
+ "A [datastore](https://docs.microsoft.com/azure/machine-learning/service/how-to-access-data) is a place where data can be stored that is then made accessible to a Run either by means of mounting or copying the data to the compute target. A datastore can either be backed by an Azure Blob Storage or and Azure File Share (ADLS will be supported in the future). For simple data handling, each workspace provides a default datastore that can be used, in case the data is not already in Blob Storage or File Share."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "ds = ws.get_default_datastore()"
+ ]
+ },
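+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The default datastore is sufficient for this tutorial. If your data already lives in an existing Azure Blob container, you could register that container as a datastore instead; a sketch with placeholder storage account details:\n",
+ "```Python\n",
+ "from azureml.core import Datastore\n",
+ "\n",
+ "# placeholder names and key -- replace with your own storage account details\n",
+ "custom_ds = Datastore.register_azure_blob_container(workspace=ws,\n",
+ "                                                    datastore_name='mnist_blob',\n",
+ "                                                    container_name='mnist-data',\n",
+ "                                                    account_name='<storage-account-name>',\n",
+ "                                                    account_key='<storage-account-key>')\n",
+ "```"
+ ]
+ },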
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In this next step, we will upload the training and test set into the workspace's default datastore, which we will then later be mount on an `AmlCompute` cluster for training."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "ds.upload(src_dir='./data/mnist', target_path='mnist', overwrite=True, show_progress=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Create or Attach existing AmlCompute\n",
+ "You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for training your model. In this tutorial, you create `AmlCompute` as your training compute resource."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "If we could not find the cluster with the given name, then we will create a new cluster here. We will create an `AmlCompute` cluster of `STANDARD_NC6` GPU VMs. This process is broken down into 3 steps:\n",
+ "1. create the configuration (this step is local and only takes a second)\n",
+ "2. create the cluster (this step will take about **20 seconds**)\n",
+ "3. provision the VMs to bring the cluster to the initial size (of 1 in this case). This step will take about **3-5 minutes** and is providing only sparse output in the process. Please make sure to wait until the call returns before moving to the next cell"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.compute import ComputeTarget, AmlCompute\n",
+ "from azureml.core.compute_target import ComputeTargetException\n",
+ "\n",
+ "# choose a name for your cluster\n",
+ "cluster_name = \"gpucluster\"\n",
+ "\n",
+ "try:\n",
+ " compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
+ " print('Found existing compute target')\n",
+ "except ComputeTargetException:\n",
+ " print('Creating a new compute target...')\n",
+ " compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', \n",
+ " max_nodes=4)\n",
+ "\n",
+ " # create the cluster\n",
+ " compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
+ "\n",
+ " # can poll for a minimum number of nodes and for a specific timeout. \n",
+ " # if no min node count is provided it uses the scale settings for the cluster\n",
+ " compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
+ "\n",
+ "# use get_status() to get a detailed status for the current cluster. \n",
+ "print(compute_target.get_status().serialize())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Now that you have created the compute target, let's see what the workspace's `compute_targets` property returns. You should now see one entry named \"gpucluster\" of type `AmlCompute`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "compute_targets = ws.compute_targets\n",
+ "for name, ct in compute_targets.items():\n",
+ " print(name, ct.type, ct.provisioning_state)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Copy the training files into the script folder\n",
+ "The Keras training script is already created for you. You can simply copy it into the script folder, together with the utility library used to load compressed data file into numpy array."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import shutil\n",
+ "\n",
+ "# the training logic is in the keras_mnist.py file.\n",
+ "shutil.copy('./keras_mnist.py', script_folder)\n",
+ "\n",
+ "# the utils.py just helps loading data from the downloaded MNIST dataset into numpy arrays.\n",
+ "shutil.copy('./utils.py', script_folder)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "nbpresent": {
+ "id": "2039d2d5-aca6-4f25-a12f-df9ae6529cae"
+ }
+ },
+ "source": [
+ "## Construct neural network in Keras\n",
+ "In the training script `keras_mnist.py`, it creates a very simple DNN (deep neural network), with just 2 hidden layers. The input layer has 28 * 28 = 784 neurons, each representing a pixel in an image. The first hidden layer has 300 neurons, and the second hidden layer has 100 neurons. The output layer has 10 neurons, each representing a targeted label from 0 to 9.\n",
+ "\n",
+ ""
+ ]
+ },
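+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In Keras terms, the core of that network looks roughly like the following (a condensed sketch of what `keras_mnist.py` builds; the full script is printed a few cells below):\n",
+ "```Python\n",
+ "from keras.models import Sequential\n",
+ "from keras.layers import Dense\n",
+ "from keras.optimizers import RMSprop\n",
+ "\n",
+ "model = Sequential()\n",
+ "# first hidden layer (300 neurons by default)\n",
+ "model.add(Dense(300, activation='relu', input_shape=(784,)))\n",
+ "# second hidden layer (100 neurons by default)\n",
+ "model.add(Dense(100, activation='relu'))\n",
+ "# output layer: one neuron per digit class\n",
+ "model.add(Dense(10, activation='softmax'))\n",
+ "\n",
+ "model.compile(loss='categorical_crossentropy',\n",
+ "              optimizer=RMSprop(),\n",
+ "              metrics=['accuracy'])\n",
+ "```"
+ ]
+ },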
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Azure ML concepts \n",
+ "Please note the following three things in the code below:\n",
+ "1. The script accepts arguments using the argparse package. In this case there is one argument `--data_folder` which specifies the file system folder in which the script can find the MNIST data\n",
+ "```\n",
+ " parser = argparse.ArgumentParser()\n",
+ " parser.add_argument('--data_folder')\n",
+ "```\n",
+ "2. The script is accessing the Azure ML `Run` object by executing `run = Run.get_context()`. Further down the script is using the `run` to report the loss and accuracy at the end of each epoch via callback.\n",
+ "```\n",
+ " run.log('Loss', log['loss'])\n",
+ " run.log('Accuracy', log['acc'])\n",
+ "```\n",
+ "3. When running the script on Azure ML, you can write files out to a folder `./outputs` that is relative to the root directory. This folder is specially tracked by Azure ML in the sense that any files written to that folder during script execution on the remote target will be picked up by Run History; these files (known as artifacts) will be available as part of the run history record."
+ ]
+ },
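+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "For illustration, reporting per-epoch metrics from Keras and writing the model to `./outputs` can be done along these lines (a sketch; the exact code lives in `keras_mnist.py` and may differ in details):\n",
+ "```Python\n",
+ "import os\n",
+ "from keras.callbacks import Callback\n",
+ "\n",
+ "# 'run' is the Run object obtained via Run.get_context() earlier in the script\n",
+ "class LogRunMetrics(Callback):\n",
+ "    # log loss and accuracy to the Azure ML run at the end of every epoch\n",
+ "    def on_epoch_end(self, epoch, log=None):\n",
+ "        run.log('Loss', log['loss'])\n",
+ "        run.log('Accuracy', log['acc'])\n",
+ "\n",
+ "# anything written under ./outputs is uploaded to the run history automatically\n",
+ "os.makedirs('./outputs/model', exist_ok=True)\n",
+ "with open('./outputs/model/model.json', 'w') as f:\n",
+ "    f.write(model.to_json())\n",
+ "model.save_weights('./outputs/model/model.h5')\n",
+ "```"
+ ]
+ },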
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The next cell will print out the training code for you to inspect."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "with open(os.path.join(script_folder, './keras_mnist.py'), 'r') as f:\n",
+ " print(f.read())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Create TensorFlow estimator & add Keras\n",
+ "Next, we construct an `azureml.train.dnn.TensorFlow` estimator object, use the `gpucluster` as compute target, and pass the mount-point of the datastore to the training code as a parameter.\n",
+ "The TensorFlow estimator is providing a simple way of launching a TensorFlow training job on a compute target. It will automatically provide a docker image that has TensorFlow installed. In this case, we add `keras` package (for the Keras framework obviously), and `matplotlib` package for plotting a \"Loss vs. Accuracy\" chart and record it in run history."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.train.dnn import TensorFlow\n",
+ "\n",
+ "script_params = {\n",
+ " '--data-folder': ds.path('mnist').as_mount(),\n",
+ " '--batch-size': 50,\n",
+ " '--first-layer-neurons': 300,\n",
+ " '--second-layer-neurons': 100,\n",
+ " '--learning-rate': 0.001\n",
+ "}\n",
+ "\n",
+ "est = TensorFlow(source_directory=script_folder,\n",
+ " script_params=script_params,\n",
+ " compute_target=compute_target, \n",
+ " conda_packages=['keras', 'matplotlib'],\n",
+ " entry_script='keras_mnist.py', \n",
+ " use_gpu=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "And if you are curious, this is what the mounting point looks like:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "print(ds.path('mnist').as_mount())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Submit job to run\n",
+ "Submit the estimator to the Azure ML experiment to kick off the execution."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "run = exp.submit(est)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Monitor the Run\n",
+ "As the Run is executed, it will go through the following stages:\n",
+ "1. Preparing: A docker image is created matching the Python environment specified by the TensorFlow estimator and it will be uploaded to the workspace's Azure Container Registry. This step will only happen once for each Python environment -- the container will then be cached for subsequent runs. Creating and uploading the image takes about **5 minutes**. While the job is preparing, logs are streamed to the run history and can be viewed to monitor the progress of the image creation.\n",
+ "\n",
+ "2. Scaling: If the compute needs to be scaled up (i.e. the AmlCompute cluster requires more nodes to execute the run than currently available), the cluster will attempt to scale up in order to make the required amount of nodes available. Scaling typically takes about **5 minutes**.\n",
+ "\n",
+ "3. Running: All scripts in the script folder are uploaded to the compute target, data stores are mounted/copied and the `entry_script` is executed. While the job is running, stdout and the `./logs` folder are streamed to the run history and can be viewed to monitor the progress of the run.\n",
+ "\n",
+ "4. Post-Processing: The `./outputs` folder of the run is copied over to the run history\n",
+ "\n",
+ "There are multiple ways to check the progress of a running job. We can use a Jupyter notebook widget. \n",
+ "\n",
+ "**Note: The widget will automatically update ever 10-15 seconds, always showing you the most up-to-date information about the run**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.widgets import RunDetails\n",
+ "RunDetails(run).show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We can also periodically check the status of the run object, and navigate to Azure portal to monitor the run."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "run"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "run.wait_for_completion(show_output=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In the outputs of the training script, it prints out the Keras version number. Please make a note of it."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### The Run object\n",
+ "The Run object provides the interface to the run history -- both to the job and to the control plane (this notebook), and both while the job is running and after it has completed. It provides a number of interesting features for instance:\n",
+ "* `run.get_details()`: Provides a rich set of properties of the run\n",
+ "* `run.get_metrics()`: Provides a dictionary with all the metrics that were reported for the Run\n",
+ "* `run.get_file_names()`: List all the files that were uploaded to the run history for this Run. This will include the `outputs` and `logs` folder, azureml-logs and other logs, as well as files that were explicitly uploaded to the run using `run.upload_file()`\n",
+ "\n",
+ "Below are some examples -- please run through them and inspect their output. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "run.get_details()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "run.get_metrics()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "run.get_file_names()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Download the saved model"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In the training script, the Keras model is saved into two files, `model.json` and `model.h5`, in the `outputs/models` folder on the gpucluster AmlCompute node. Azure ML automatically uploaded anything written in the `./outputs` folder into run history file store. Subsequently, we can use the `run` object to download the model files. They are under the the `outputs/model` folder in the run history file store, and are downloaded into a local folder named `model`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# create a model folder in the current directory\n",
+ "os.makedirs('./model', exist_ok=True)\n",
+ "\n",
+ "for f in run.get_file_names():\n",
+ " if f.startswith('outputs/model'):\n",
+ " output_file_path = os.path.join('./model', f.split('/')[-1])\n",
+ " print('Downloading from {} to {} ...'.format(f, output_file_path))\n",
+ " run.download_file(name=f, output_file_path=output_file_path)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Predict on the test set\n",
+ "Let's check the version of the local Keras. Make sure it matches with the version number printed out in the training script. Otherwise you might not be able to load the model properly."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import keras\n",
+ "import tensorflow as tf\n",
+ "\n",
+ "print(\"Keras version:\", keras.__version__)\n",
+ "print(\"Tensorflow version:\", tf.__version__)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Now let's load the downloaded model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from keras.models import model_from_json\n",
+ "\n",
+ "# load json and create model\n",
+ "json_file = open('model/model.json', 'r')\n",
+ "loaded_model_json = json_file.read()\n",
+ "json_file.close()\n",
+ "loaded_model = model_from_json(loaded_model_json)\n",
+ "# load weights into new model\n",
+ "loaded_model.load_weights(\"model/model.h5\")\n",
+ "print(\"Model loaded from disk.\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Feed test dataset to the persisted model to get predictions."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# evaluate loaded model on test data\n",
+ "loaded_model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])\n",
+ "y_test_ohe = one_hot_encode(y_test, 10)\n",
+ "y_hat = np.argmax(loaded_model.predict(X_test), axis=1)\n",
+ "\n",
+ "# print the first 30 labels and predictions\n",
+ "print('labels: \\t', y_test[:30])\n",
+ "print('predictions:\\t', y_hat[:30])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Calculate the overall accuracy by comparing the predicted value against the test set."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "print(\"Accuracy on the test set:\", np.average(y_hat == y_test))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Intelligent hyperparameter tuning\n",
+ "We have trained the model with one set of hyperparameters, now let's how we can do hyperparameter tuning by launching multiple runs on the cluster. First let's define the parameter space using random sampling."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.train.hyperdrive import RandomParameterSampling, BanditPolicy, HyperDriveRunConfig, PrimaryMetricGoal\n",
+ "from azureml.train.hyperdrive import choice, loguniform\n",
+ "\n",
+ "ps = RandomParameterSampling(\n",
+ " {\n",
+ " '--batch-size': choice(25, 50, 100),\n",
+ " '--first-layer-neurons': choice(10, 50, 200, 300, 500),\n",
+ " '--second-layer-neurons': choice(10, 50, 200, 500),\n",
+ " '--learning-rate': loguniform(-6, -1)\n",
+ " }\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Next, we will create a new estimator without the above parameters since they will be passed in later by Hyperdrive configuration. Note we still need to keep the `data-folder` parameter since that's not a hyperparamter we will sweep."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "est = TensorFlow(source_directory=script_folder,\n",
+ " script_params={'--data-folder': ds.path('mnist').as_mount()},\n",
+ " compute_target=compute_target,\n",
+ " conda_packages=['keras', 'matplotlib'],\n",
+ " entry_script='keras_mnist.py', \n",
+ " use_gpu=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Now we will define an early termnination policy. The `BanditPolicy` basically states to check the job every 2 iterations. If the primary metric (defined later) falls outside of the top 10% range, Azure ML terminate the job. This saves us from continuing to explore hyperparameters that don't show promise of helping reach our target metric."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "policy = BanditPolicy(evaluation_interval=2, slack_factor=0.1)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Now we are ready to configure a run configuration object, and specify the primary metric `Accuracy` that's recorded in your training runs. If you go back to visit the training script, you will notice that this value is being logged after every epoch (a full batch set). We also want to tell the service that we are looking to maximizing this value. We also set the number of samples to 20, and maximal concurrent job to 4, which is the same as the number of nodes in our computer cluster."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "hdc = HyperDriveRunConfig(estimator=est, \n",
+ " hyperparameter_sampling=ps, \n",
+ " policy=policy, \n",
+ " primary_metric_name='Accuracy', \n",
+ " primary_metric_goal=PrimaryMetricGoal.MAXIMIZE, \n",
+ " max_total_runs=20,\n",
+ " max_concurrent_runs=4)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Finally, let's launch the hyperparameter tuning job."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "hdr = exp.submit(config=hdc)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We can use a run history widget to show the progress. Be patient as this might take a while to complete."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "RunDetails(hdr).show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "hdr.wait_for_completion(show_output=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Find and register best model\n",
+ "When all the jobs finish, we can find out the one that has the highest accuracy."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "best_run = hdr.get_best_run_by_primary_metric()\n",
+ "print(best_run.get_details()['runDefinition']['Arguments'])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Now let's list the model files uploaded during the run."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "print(best_run.get_file_names())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We can then register the folder (and all files in it) as a model named `keras-dnn-mnist` under the workspace for deployment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "model = best_run.register_model(model_name='keras-mlp-mnist', model_path='outputs/model')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Deploy the model in ACI\n",
+ "Now we are ready to deploy the model as a web service running in Azure Container Instance [ACI](https://azure.microsoft.com/en-us/services/container-instances/). Azure Machine Learning accomplishes this by constructing a Docker image with the scoring logic and model baked in.\n",
+ "### Create score.py\n",
+ "First, we will create a scoring script that will be invoked by the web service call. \n",
+ "\n",
+ "* Note that the scoring script must have two required functions, `init()` and `run(input_data)`. \n",
+ " * In `init()` function, you typically load the model into a global object. This function is executed only once when the Docker container is started. \n",
+ " * In `run(input_data)` function, the model is used to predict a value based on the input data. The input and output to `run` typically use JSON as serialization and de-serialization format but you are not limited to that."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%writefile score.py\n",
+ "import json\n",
+ "import numpy as np\n",
+ "import os\n",
+ "from keras.models import model_from_json\n",
+ "\n",
+ "from azureml.core.model import Model\n",
+ "\n",
+ "def init():\n",
+ " global model\n",
+ " \n",
+ " model_root = Model.get_model_path('keras-mlp-mnist')\n",
+ " # load json and create model\n",
+ " json_file = open(os.path.join(model_root, 'model.json'), 'r')\n",
+ " model_json = json_file.read()\n",
+ " json_file.close()\n",
+ " model = model_from_json(model_json)\n",
+ " # load weights into new model\n",
+ " model.load_weights(os.path.join(model_root, \"model.h5\")) \n",
+ " model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])\n",
+ " \n",
+ "def run(raw_data):\n",
+ " data = np.array(json.loads(raw_data)['data'])\n",
+ " # make prediction\n",
+ " y_hat = np.argmax(model.predict(data), axis=1)\n",
+ " return y_hat.tolist()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Create myenv.yml\n",
+ "We also need to create an environment file so that Azure Machine Learning can install the necessary packages in the Docker image which are required by your scoring script. In this case, we need to specify conda packages `tensorflow` and `keras`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.runconfig import CondaDependencies\n",
+ "\n",
+ "cd = CondaDependencies.create()\n",
+ "cd.add_conda_package('tensorflow')\n",
+ "cd.add_conda_package('keras')\n",
+ "cd.save_to_file(base_directory='./', conda_file_path='myenv.yml')\n",
+ "\n",
+ "print(cd.serialize_to_string())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Deploy to ACI\n",
+ "We are almost ready to deploy. Create a deployment configuration and specify the number of CPUs and gigbyte of RAM needed for your ACI container. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.webservice import AciWebservice\n",
+ "\n",
+ "aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, \n",
+ " auth_enabled=True, # this flag generates API keys to secure access\n",
+ " memory_gb=1, \n",
+ " tags={'name':'mnist', 'framework': 'Keras'},\n",
+ " description='Keras MLP on MNIST')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### Deployment Process\n",
+ "Now we can deploy. **This cell will run for about 7-8 minutes**. Behind the scene, it will do the following:\n",
+ "1. **Build Docker image** \n",
+ "Build a Docker image using the scoring file (`score.py`), the environment file (`myenv.yml`), and the `model` object. \n",
+ "2. **Register image** \n",
+ "Register that image under the workspace. \n",
+ "3. **Ship to ACI** \n",
+ "And finally ship the image to the ACI infrastructure, start up a container in ACI using that image, and expose an HTTP endpoint to accept REST client calls."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from azureml.core.image import ContainerImage\n",
+ "\n",
+ "imgconfig = ContainerImage.image_configuration(execution_script=\"score.py\", \n",
+ " runtime=\"python\", \n",
+ " conda_file=\"myenv.yml\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%time\n",
+ "from azureml.core.webservice import Webservice\n",
+ "\n",
+ "service = Webservice.deploy_from_model(workspace=ws,\n",
+ " name='keras-mnist-svc',\n",
+ " deployment_config=aciconfig,\n",
+ " models=[model],\n",
+ " image_config=imgconfig)\n",
+ "\n",
+ "service.wait_for_deployment(show_output=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Tip: If something goes wrong with the deployment, the first thing to look at is the logs from the service by running the following command:**"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "print(service.get_logs())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "This is the scoring web service endpoint:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "print(service.scoring_uri)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Test the deployed model\n",
+ "Let's test the deployed model. Pick 30 random samples from the test set, and send it to the web service hosted in ACI. Note here we are using the `run` API in the SDK to invoke the service. You can also make raw HTTP calls using any HTTP tool such as curl.\n",
+ "\n",
+ "After the invocation, we print the returned predictions and plot them along with the input images. Use red font color and inversed image (white on black) to highlight the misclassified samples. Note since the model accuracy is pretty high, you might have to run the below cell a few times before you can see a misclassified sample."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import json\n",
+ "\n",
+ "# find 30 random samples from test set\n",
+ "n = 30\n",
+ "sample_indices = np.random.permutation(X_test.shape[0])[0:n]\n",
+ "\n",
+ "test_samples = json.dumps({\"data\": X_test[sample_indices].tolist()})\n",
+ "test_samples = bytes(test_samples, encoding='utf8')\n",
+ "\n",
+ "# predict using the deployed model\n",
+ "result = service.run(input_data=test_samples)\n",
+ "\n",
+ "# compare actual value vs. the predicted values:\n",
+ "i = 0\n",
+ "plt.figure(figsize = (20, 1))\n",
+ "\n",
+ "for s in sample_indices:\n",
+ " plt.subplot(1, n, i + 1)\n",
+ " plt.axhline('')\n",
+ " plt.axvline('')\n",
+ " \n",
+ " # use different color for misclassified sample\n",
+ " font_color = 'red' if y_test[s] != result[i] else 'black'\n",
+ " clr_map = plt.cm.gray if y_test[s] != result[i] else plt.cm.Greys\n",
+ " \n",
+ " plt.text(x=10, y=-10, s=y_hat[s], fontsize=18, color=font_color)\n",
+ " plt.imshow(X_test[s].reshape(28, 28), cmap=clr_map)\n",
+ " \n",
+ " i = i + 1\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We can retreive the API keys used for accessing the HTTP endpoint."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# retreive the API keys. two keys were generated.\n",
+ "key1, Key2 = service.get_keys()\n",
+ "print(key1)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We can now send construct raw HTTP request and send to the service. Don't forget to add key to the HTTP header."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import requests\n",
+ "\n",
+ "# send a random row from the test set to score\n",
+ "random_index = np.random.randint(0, len(X_test)-1)\n",
+ "input_data = \"{\\\"data\\\": [\" + str(list(X_test[random_index])) + \"]}\"\n",
+ "\n",
+ "headers = {'Content-Type':'application/json', 'Authorization': 'Bearer ' + key1}\n",
+ "\n",
+ "resp = requests.post(service.scoring_uri, input_data, headers=headers)\n",
+ "\n",
+ "print(\"POST to url\", service.scoring_uri)\n",
+ "#print(\"input data:\", input_data)\n",
+ "print(\"label:\", y_test[random_index])\n",
+ "print(\"prediction:\", resp.text)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Let's look at the workspace after the web service was deployed. You should see \n",
+ "* a registered model named 'keras-mlp-mnist' and with the id 'model:1'\n",
+ "* an image called 'keras-mnist-svc' and with a docker image location pointing to your workspace's Azure Container Registry (ACR) \n",
+ "* a webservice called 'keras-mnist-svc' with some scoring URL"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "models = ws.models\n",
+ "for name, model in models.items():\n",
+ " print(\"Model: {}, ID: {}\".format(name, model.id))\n",
+ " \n",
+ "images = ws.images\n",
+ "for name, image in images.items():\n",
+ " print(\"Image: {}, location: {}\".format(name, image.image_location))\n",
+ " \n",
+ "webservices = ws.webservices\n",
+ "for name, webservice in webservices.items():\n",
+ " print(\"Webservice: {}, scoring URI: {}\".format(name, webservice.scoring_uri))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Clean up\n",
+ "You can delete the ACI deployment with a simple delete API call."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "service.delete()"
+ ]
}
- },
- "source": [
- "# Training, hyperparameter tune, and deploy with Keras\n",
- "\n",
- "## Introduction\n",
- "This tutorial shows how to train a simple deep neural network using the MNIST dataset and Keras on Azure Machine Learning. MNIST is a popular dataset consisting of 70,000 grayscale images. Each image is a handwritten digit of `28x28` pixels, representing number from 0 to 9. The goal is to create a multi-class classifier to identify the digit each image represents, and deploy it as a web service in Azure.\n",
- "\n",
- "For more information about the MNIST dataset, please visit [Yan LeCun's website](http://yann.lecun.com/exdb/mnist/).\n",
- "\n",
- "## Prerequisite:\n",
- "* Understand the [architecture and terms](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture) introduced by Azure Machine Learning\n",
- "* Go through the [configuration notebook](../../../configuration.ipynb) to:\n",
- " * install the AML SDK\n",
- " * create a workspace and its configuration file (`config.json`)\n",
- "* For local scoring test, you will also need to have `tensorflow` and `keras` installed in the current Jupyter kernel."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Let's get started. First let's import some Python libraries."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "nbpresent": {
- "id": "c377ea0c-0cd9-4345-9be2-e20fb29c94c3"
- }
- },
- "outputs": [],
- "source": [
- "%matplotlib inline\n",
- "import numpy as np\n",
- "import os\n",
- "import matplotlib.pyplot as plt"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "nbpresent": {
- "id": "edaa7f2f-2439-4148-b57a-8c794c0945ec"
- }
- },
- "outputs": [],
- "source": [
- "import azureml\n",
- "from azureml.core import Workspace\n",
- "\n",
- "# check core SDK version number\n",
- "print(\"Azure ML SDK Version: \", azureml.core.VERSION)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Initialize workspace\n",
- "Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. `Workspace.from_config()` creates a workspace object from the details stored in `config.json`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "ws = Workspace.from_config()\n",
- "print('Workspace name: ' + ws.name, \n",
- " 'Azure region: ' + ws.location, \n",
- " 'Subscription id: ' + ws.subscription_id, \n",
- " 'Resource group: ' + ws.resource_group, sep = '\\n')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "nbpresent": {
- "id": "59f52294-4a25-4c92-bab8-3b07f0f44d15"
- }
- },
- "source": [
- "## Create an Azure ML experiment\n",
- "Let's create an experiment named \"keras-mnist\" and a folder to hold the training scripts. The script runs will be recorded under the experiment in Azure."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "nbpresent": {
- "id": "bc70f780-c240-4779-96f3-bc5ef9a37d59"
- }
- },
- "outputs": [],
- "source": [
- "from azureml.core import Experiment\n",
- "\n",
- "script_folder = './keras-mnist'\n",
- "os.makedirs(script_folder, exist_ok=True)\n",
- "\n",
- "exp = Experiment(workspace=ws, name='keras-mnist')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "nbpresent": {
- "id": "defe921f-8097-44c3-8336-8af6700804a7"
- }
- },
- "source": [
- "## Download MNIST dataset\n",
- "In order to train on the MNIST dataset we will first need to download it from Yan LeCun's web site directly and save them in a `data` folder locally."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import urllib\n",
- "\n",
- "os.makedirs('./data/mnist', exist_ok=True)\n",
- "\n",
- "urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz', filename='./data/mnist/train-images.gz')\n",
- "urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz', filename='./data/mnist/train-labels.gz')\n",
- "urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz', filename='./data/mnist/test-images.gz')\n",
- "urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz', filename='./data/mnist/test-labels.gz')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "nbpresent": {
- "id": "c3f2f57c-7454-4d3e-b38d-b0946cf066ea"
- }
- },
- "source": [
- "## Show some sample images\n",
- "Let's load the downloaded compressed file into numpy arrays using some utility functions included in the `utils.py` library file from the current folder. Then we use `matplotlib` to plot 30 random images from the dataset along with their labels."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "nbpresent": {
- "id": "396d478b-34aa-4afa-9898-cdce8222a516"
- }
- },
- "outputs": [],
- "source": [
- "from utils import load_data, one_hot_encode\n",
- "\n",
- "# note we also shrink the intensity values (X) from 0-255 to 0-1. This helps the neural network converge faster.\n",
- "X_train = load_data('./data/mnist/train-images.gz', False) / 255.0\n",
- "y_train = load_data('./data/mnist/train-labels.gz', True).reshape(-1)\n",
- "\n",
- "X_test = load_data('./data/mnist/test-images.gz', False) / 255.0\n",
- "y_test = load_data('./data/mnist/test-labels.gz', True).reshape(-1)\n",
- "\n",
- "count = 0\n",
- "sample_size = 30\n",
- "plt.figure(figsize = (16, 6))\n",
- "for i in np.random.permutation(X_train.shape[0])[:sample_size]:\n",
- " count = count + 1\n",
- " plt.subplot(1, sample_size, count)\n",
- " plt.axhline('')\n",
- " plt.axvline('')\n",
- " plt.text(x = 10, y = -10, s = y_train[i], fontsize = 18)\n",
- " plt.imshow(X_train[i].reshape(28, 28), cmap = plt.cm.Greys)\n",
- "plt.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Upload MNIST dataset to default datastore \n",
- "A [datastore](https://docs.microsoft.com/azure/machine-learning/service/how-to-access-data) is a place where data can be stored that is then made accessible to a Run either by means of mounting or copying the data to the compute target. A datastore can either be backed by an Azure Blob Storage or and Azure File Share (ADLS will be supported in the future). For simple data handling, each workspace provides a default datastore that can be used, in case the data is not already in Blob Storage or File Share."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "ds = ws.get_default_datastore()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "In this next step, we will upload the training and test set into the workspace's default datastore, which we will then later be mount on an `AmlCompute` cluster for training."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "ds.upload(src_dir='./data/mnist', target_path='mnist', overwrite=True, show_progress=True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Create or Attach existing AmlCompute\n",
- "You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for training your model. In this tutorial, you create `AmlCompute` as your training compute resource."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "If we could not find the cluster with the given name, then we will create a new cluster here. We will create an `AmlCompute` cluster of `STANDARD_NC6` GPU VMs. This process is broken down into 3 steps:\n",
- "1. create the configuration (this step is local and only takes a second)\n",
- "2. create the cluster (this step will take about **20 seconds**)\n",
- "3. provision the VMs to bring the cluster to the initial size (of 1 in this case). This step will take about **3-5 minutes** and is providing only sparse output in the process. Please make sure to wait until the call returns before moving to the next cell"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.core.compute import ComputeTarget, AmlCompute\n",
- "from azureml.core.compute_target import ComputeTargetException\n",
- "\n",
- "# choose a name for your cluster\n",
- "cluster_name = \"gpucluster\"\n",
- "\n",
- "try:\n",
- " compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
- " print('Found existing compute target')\n",
- "except ComputeTargetException:\n",
- " print('Creating a new compute target...')\n",
- " compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', \n",
- " max_nodes=4)\n",
- "\n",
- " # create the cluster\n",
- " compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
- "\n",
- " # can poll for a minimum number of nodes and for a specific timeout. \n",
- " # if no min node count is provided it uses the scale settings for the cluster\n",
- " compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
- "\n",
- "# use get_status() to get a detailed status for the current cluster. \n",
- "print(compute_target.get_status().serialize())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Now that you have created the compute target, let's see what the workspace's `compute_targets` property returns. You should now see one entry named 'gpucluster' of type `AmlCompute`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "compute_targets = ws.compute_targets\n",
- "for name, ct in compute_targets.items():\n",
- " print(name, ct.type, ct.provisioning_state)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Copy the training files into the script folder\n",
- "The TensorFlow training script is already created for you. You can simply copy it into the script folder, together with the utility library used to load compressed data file into numpy array."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import shutil\n",
- "\n",
- "# the training logic is in the keras_mnist.py file.\n",
- "shutil.copy('./keras_mnist.py', script_folder)\n",
- "\n",
- "# the utils.py just helps loading data from the downloaded MNIST dataset into numpy arrays.\n",
- "shutil.copy('./utils.py', script_folder)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "nbpresent": {
- "id": "2039d2d5-aca6-4f25-a12f-df9ae6529cae"
- }
- },
- "source": [
- "## Construct neural network in TensorFlow\n",
- "In the training script `keras_mnist.py`, it creates a very simple DNN (deep neural network), with just 2 hidden layers. The input layer has 28 * 28 = 784 neurons, each representing a pixel in an image. The first hidden layer has 300 neurons, and the second hidden layer has 100 neurons. The output layer has 10 neurons, each representing a targeted label from 0 to 9.\n",
- "\n",
- ""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Azure ML concepts \n",
- "Please note the following three things in the code below:\n",
- "1. The script accepts arguments using the argparse package. In this case there is one argument `--data_folder` which specifies the file system folder in which the script can find the MNIST data\n",
- "```\n",
- " parser = argparse.ArgumentParser()\n",
- " parser.add_argument('--data_folder')\n",
- "```\n",
- "2. The script is accessing the Azure ML `Run` object by executing `run = Run.get_context()`. Further down the script is using the `run` to report the loss and accuracy at the end of each epoch via callback.\n",
- "```\n",
- " run.log('Loss', log['loss'])\n",
- " run.log('Accuracy', log['acc'])\n",
- "```\n",
- "3. When running the script on Azure ML, you can write files out to a folder `./outputs` that is relative to the root directory. This folder is specially tracked by Azure ML in the sense that any files written to that folder during script execution on the remote target will be picked up by Run History; these files (known as artifacts) will be available as part of the run history record."
- ]
- },
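- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "For illustration, here is a minimal sketch of how a training script can combine these concepts -- it is not the actual `keras_mnist.py` (which is printed a couple of cells below), and the callback name `LogToAzureML` is purely hypothetical:\n",
- "```python\n",
- "import os\n",
- "from azureml.core import Run\n",
- "from keras.callbacks import Callback\n",
- "\n",
- "run = Run.get_context()\n",
- "\n",
- "class LogToAzureML(Callback):\n",
- "    # report loss and accuracy to the Azure ML run at the end of each epoch\n",
- "    def on_epoch_end(self, epoch, logs=None):\n",
- "        run.log('Loss', logs['loss'])\n",
- "        run.log('Accuracy', logs['acc'])\n",
- "\n",
- "# ... build the model, then train with: model.fit(..., callbacks=[LogToAzureML()])\n",
- "\n",
- "# anything written under ./outputs is uploaded to the run history as artifacts\n",
- "os.makedirs('./outputs/model', exist_ok=True)\n",
- "# model.save_weights('./outputs/model/model.h5')\n",
- "```"
- ]
- },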
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The next cell will print out the training code for you to inspect it."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "with open(os.path.join(script_folder, './keras_mnist.py'), 'r') as f:\n",
- " print(f.read())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Create TensorFlow estimator & add Keras\n",
- "Next, we construct an `azureml.train.dnn.TensorFlow` estimator object, use the `gpucluster` as compute target, and pass the mount-point of the datastore to the training code as a parameter.\n",
- "The TensorFlow estimator is providing a simple way of launching a TensorFlow training job on a compute target. It will automatically provide a docker image that has TensorFlow installed. In this case, we add `keras` package (for the Keras framework obviously), and `matplotlib` package for plotting a \"Loss vs. Accuracy\" chart and record it in run history."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.train.dnn import TensorFlow\n",
- "\n",
- "script_params = {\n",
- " '--data-folder': ds.path('mnist').as_mount(),\n",
- " '--batch-size': 50,\n",
- " '--first-layer-neurons': 300,\n",
- " '--second-layer-neurons': 100 \n",
- "}\n",
- "\n",
- "est = TensorFlow(source_directory=script_folder,\n",
- " script_params=script_params,\n",
- " compute_target=compute_target, \n",
- " conda_packages=['keras', 'matplotlib'],\n",
- " entry_script='keras_mnist.py', \n",
- " use_gpu=True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "And if you are curious, this is what the mounting point looks like:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "print(ds.path('mnist').as_mount())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Submit job to run\n",
- "Submit the estimator to an Azure ML experiment to kick off the execution."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "run = exp.submit(est)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Monitor the Run\n",
- "As the Run is executed, it will go through the following stages:\n",
- "1. Preparing: A docker image is created matching the Python environment specified by the TensorFlow estimator and it will be uploaded to the workspace's Azure Container Registry. This step will only happen once for each Python environment -- the container will then be cached for subsequent runs. Creating and uploading the image takes about **5 minutes**. While the job is preparing, logs are streamed to the run history and can be viewed to monitor the progress of the image creation.\n",
- "\n",
- "2. Scaling: If the compute needs to be scaled up (i.e. the AmlCompute cluster requires more nodes to execute the run than currently available), the cluster will attempt to scale up in order to make the required amount of nodes available. Scaling typically takes about **5 minutes**.\n",
- "\n",
- "3. Running: All scripts in the script folder are uploaded to the compute target, data stores are mounted/copied and the `entry_script` is executed. While the job is running, stdout and the `./logs` folder are streamed to the run history and can be viewed to monitor the progress of the run.\n",
- "\n",
- "4. Post-Processing: The `./outputs` folder of the run is copied over to the run history\n",
- "\n",
- "There are multiple ways to check the progress of a running job. We can use a Jupyter notebook widget. \n",
- "\n",
- "**Note: The widget will automatically update ever 10-15 seconds, always showing you the most up-to-date information about the run**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.widgets import RunDetails\n",
- "RunDetails(run).show()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can also periodically check the status of the run object, and navigate to Azure portal to monitor the run."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "run"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "run.wait_for_completion(show_output=True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### The Run object\n",
- "The Run object provides the interface to the run history -- both to the job and to the control plane (this notebook), and both while the job is running and after it has completed. It provides a number of interesting features for instance:\n",
- "* `run.get_details()`: Provides a rich set of properties of the run\n",
- "* `run.get_metrics()`: Provides a dictionary with all the metrics that were reported for the Run\n",
- "* `run.get_file_names()`: List all the files that were uploaded to the run history for this Run. This will include the `outputs` and `logs` folder, azureml-logs and other logs, as well as files that were explicitly uploaded to the run using `run.upload_file()`\n",
- "\n",
- "Below are some examples -- please run through them and inspect their output. "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "run.get_details()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "run.get_metrics()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "run.get_file_names()"
- ]
- },
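- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "If you want to attach additional files to the run history yourself, you can use `run.upload_file()`. A small sketch (the file name `notes.txt` is just a made-up example):\n",
- "```python\n",
- "# upload a local file into the run history under the outputs/ prefix\n",
- "run.upload_file(name='outputs/notes.txt', path_or_stream='./notes.txt')\n",
- "```"
- ]
- },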
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Download the saved model"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "In the training script, the Keras model is saved into two files, `model.json` and `model.h5`, in the `outputs/models` folder on the gpucluster AmlCompute node. Azure ML automatically uploaded anything written in the `./outputs` folder into run history file store. Subsequently, we can use the `run` object to download the model files. They are under the the `outputs/model` folder in the run history file store, and are downloaded into a local folder named `model`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# create a model folder in the current directory\n",
- "os.makedirs('./model', exist_ok=True)\n",
- "\n",
- "for f in run.get_file_names():\n",
- " if f.startswith('outputs/model'):\n",
- " output_file_path = os.path.join('./model', f.split('/')[-1])\n",
- " print('Downloading from {} to {} ...'.format(f, output_file_path))\n",
- " run.download_file(name=f, output_file_path=output_file_path)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Predict on the test set\n",
- "Now load the saved Kears model."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from keras.models import model_from_json\n",
- "\n",
- "# load json and create model\n",
- "json_file = open('model/model.json', 'r')\n",
- "loaded_model_json = json_file.read()\n",
- "json_file.close()\n",
- "loaded_model = model_from_json(loaded_model_json)\n",
- "# load weights into new model\n",
- "loaded_model.load_weights(\"model/model.h5\")\n",
- "print(\"Model loaded from disk.\")"
- ]
- },
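- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Optionally, print a summary of the loaded model to verify it matches the architecture described earlier (784 inputs, two hidden layers of 300 and 100 neurons, and 10 outputs)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# inspect the layers and parameter counts of the loaded model\n",
- "loaded_model.summary()"
- ]
- },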
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Feed test dataset to the persisted model to get predictions."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# evaluate loaded model on test data\n",
- "loaded_model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])\n",
- "y_test_ohe = one_hot_encode(y_test, 10)\n",
- "y_hat = np.argmax(loaded_model.predict(X_test), axis=1)\n",
- "\n",
- "# print the first 30 labels and predictions\n",
- "print('labels: \\t', y_test[:30])\n",
- "print('predictions:\\t', y_hat[:30])"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Calculate the overall accuracy by comparing the predicted value against the test set."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "print(\"Accuracy on the test set:\", np.average(y_hat == y_test))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Intelligent hyperparameter tuning\n",
- "We have trained the model with one set of hyperparameters, now let's how we can do hyperparameter tuning by launching multiple runs on the cluster. First let's define the parameter space using random sampling."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.train.hyperdrive import RandomParameterSampling, BanditPolicy, HyperDriveRunConfig, PrimaryMetricGoal\n",
- "from azureml.train.hyperdrive import choice, loguniform\n",
- "\n",
- "ps = RandomParameterSampling(\n",
- " {\n",
- " '--batch-size': choice(25, 50, 100),\n",
- " '--first-layer-neurons': choice(10, 50, 200, 300, 500),\n",
- " '--second-layer-neurons': choice(10, 50, 200, 500) \n",
- " }\n",
- ")"
- ]
- },
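- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Note that `loguniform` is imported above but not used in this tutorial. If the training script also exposed a continuous hyperparameter such as a learning rate (it does not here; the `--learning-rate` argument below is purely hypothetical), it could be sampled on a log scale like this:\n",
- "```python\n",
- "# hypothetical sketch: sample a learning rate between exp(-6) and exp(-1) on a log scale\n",
- "ps = RandomParameterSampling(\n",
- "    {\n",
- "        '--batch-size': choice(25, 50, 100),\n",
- "        '--learning-rate': loguniform(-6, -1)\n",
- "    }\n",
- ")\n",
- "```"
- ]
- },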
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Next, we will create a new estimator without the above parameters since they will be passed in later by Hyperdrive configuration. Note we still need to keep the `data-folder` parameter since that's not a hyperparamter we will sweep."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "est = TensorFlow(source_directory=script_folder,\n",
- " script_params={'--data-folder': ds.path('mnist').as_mount()},\n",
- " compute_target=compute_target,\n",
- " conda_packages=['keras', 'matplotlib'],\n",
- " entry_script='keras_mnist.py', \n",
- " use_gpu=True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Now we will define an early termnination policy. The `BanditPolicy` basically states to check the job every 2 iterations. If the primary metric (defined later) falls outside of the top 10% range, Azure ML terminate the job. This saves us from continuing to explore hyperparameters that don't show promise of helping reach our target metric."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "policy = BanditPolicy(evaluation_interval=2, slack_factor=0.1)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Now we are ready to configure a run configuration object, and specify the primary metric `Accuracy` that's recorded in your training runs. If you go back to visit the training script, you will notice that this value is being logged after every epoch (a full batch set). We also want to tell the service that we are looking to maximizing this value. We also set the number of samples to 20, and maximal concurrent job to 4, which is the same as the number of nodes in our computer cluster."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "htc = HyperDriveRunConfig(estimator=est, \n",
- " hyperparameter_sampling=ps, \n",
- " policy=policy, \n",
- " primary_metric_name='Accuracy', \n",
- " primary_metric_goal=PrimaryMetricGoal.MAXIMIZE, \n",
- " max_total_runs=20,\n",
- " max_concurrent_runs=4)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Finally, let's launch the hyperparameter tuning job."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "htr = exp.submit(config=htc)\n",
- "htr"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can use a run history widget to show the progress. Be patient as this might take a while to complete."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "RunDetails(htr).show()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "htr.wait_for_completion(show_output=True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Find and register best model\n",
- "When all the jobs finish, we can find out the one that has the highest accuracy."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "best_run = htr.get_best_run_by_primary_metric()"
- ]
- },
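- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Optionally, you can inspect the winning run's ID and its logged metrics. This sketch assumes the `Accuracy` metric logged by the training script, as described earlier."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "print('Best run id:', best_run.id)\n",
- "# 'Accuracy' was logged once per epoch, so this is a list of values\n",
- "print('Accuracy history:', best_run.get_metrics()['Accuracy'])"
- ]
- },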
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Now let's list the model files uploaded during the run."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "print(best_run.get_file_names())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can then register the folder (and all files in it) as a model named `keras-dnn-mnist` under the workspace for deployment."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "model = best_run.register_model(model_name='keras-mlp-mnist', model_path='outputs/model')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Deploy the model in ACI\n",
- "Now we are ready to deploy the model as a web service running in Azure Container Instance [ACI](https://azure.microsoft.com/en-us/services/container-instances/). Azure Machine Learning accomplishes this by constructing a Docker image with the scoring logic and model baked in.\n",
- "### Create score.py\n",
- "First, we will create a scoring script that will be invoked by the web service call. \n",
- "\n",
- "* Note that the scoring script must have two required functions, `init()` and `run(input_data)`. \n",
- " * In `init()` function, you typically load the model into a global object. This function is executed only once when the Docker container is started. \n",
- " * In `run(input_data)` function, the model is used to predict a value based on the input data. The input and output to `run` typically use JSON as serialization and de-serialization format but you are not limited to that."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "%%writefile score.py\n",
- "import json\n",
- "import numpy as np\n",
- "import os\n",
- "from keras.models import model_from_json\n",
- "\n",
- "from azureml.core.model import Model\n",
- "\n",
- "def init():\n",
- " global model\n",
- " \n",
- " model_root = Model.get_model_path('keras-mlp-mnist')\n",
- " # load json and create model\n",
- " json_file = open(os.path.join(model_root, 'model.json'), 'r')\n",
- " model_json = json_file.read()\n",
- " json_file.close()\n",
- " model = model_from_json(model_json)\n",
- " # load weights into new model\n",
- " model.load_weights(os.path.join(model_root, \"model.h5\")) \n",
- " model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])\n",
- " \n",
- "def run(raw_data):\n",
- " data = np.array(json.loads(raw_data)['data'])\n",
- " # make prediction\n",
- " y_hat = np.argmax(model.predict(data), axis=1)\n",
- " return y_hat.tolist()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Create myenv.yml\n",
- "We also need to create an environment file so that Azure Machine Learning can install the necessary packages in the Docker image which are required by your scoring script. In this case, we need to specify packages `numpy`, `tensorflow`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.core.runconfig import CondaDependencies\n",
- "\n",
- "cd = CondaDependencies.create()\n",
- "cd.add_tensorflow_conda_package()\n",
- "cd.add_conda_package('keras')\n",
- "cd.save_to_file(base_directory='./', conda_file_path='myenv.yml')\n",
- "\n",
- "print(cd.serialize_to_string())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Deploy to ACI\n",
- "We are almost ready to deploy. Create a deployment configuration and specify the number of CPUs and gigbyte of RAM needed for your ACI container. "
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.core.webservice import AciWebservice\n",
- "\n",
- "aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, \n",
- " auth_enabled=True, # this flag generates API keys to secure access\n",
- " memory_gb=1, \n",
- " tags={'name':'mnist', 'framework': 'Keras MLP'},\n",
- " description='Keras MLP on MNIST')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "#### Deployment Process\n",
- "Now we can deploy. **This cell will run for about 7-8 minutes**. Behind the scene, it will do the following:\n",
- "1. **Build Docker image** \n",
- "Build a Docker image using the scoring file (`score.py`), the environment file (`myenv.yml`), and the `model` object. \n",
- "2. **Register image** \n",
- "Register that image under the workspace. \n",
- "3. **Ship to ACI** \n",
- "And finally ship the image to the ACI infrastructure, start up a container in ACI using that image, and expose an HTTP endpoint to accept REST client calls."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "from azureml.core.image import ContainerImage\n",
- "\n",
- "imgconfig = ContainerImage.image_configuration(execution_script=\"score.py\", \n",
- " runtime=\"python\", \n",
- " conda_file=\"myenv.yml\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "%%time\n",
- "from azureml.core.webservice import Webservice\n",
- "\n",
- "service = Webservice.deploy_from_model(workspace=ws,\n",
- " name='keras-mnist-svc',\n",
- " deployment_config=aciconfig,\n",
- " models=[model],\n",
- " image_config=imgconfig)\n",
- "\n",
- "service.wait_for_deployment(show_output=True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "**Tip: If something goes wrong with the deployment, the first thing to look at is the logs from the service by running the following command:**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "print(service.get_logs())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "This is the scoring web service endpoint:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "print(service.scoring_uri)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Test the deployed model\n",
- "Let's test the deployed model. Pick 30 random samples from the test set, and send it to the web service hosted in ACI. Note here we are using the `run` API in the SDK to invoke the service. You can also make raw HTTP calls using any HTTP tool such as curl.\n",
- "\n",
- "After the invocation, we print the returned predictions and plot them along with the input images. Use red font color and inversed image (white on black) to highlight the misclassified samples. Note since the model accuracy is pretty high, you might have to run the below cell a few times before you can see a misclassified sample."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import json\n",
- "\n",
- "# find 30 random samples from test set\n",
- "n = 30\n",
- "sample_indices = np.random.permutation(X_test.shape[0])[0:n]\n",
- "\n",
- "test_samples = json.dumps({\"data\": X_test[sample_indices].tolist()})\n",
- "test_samples = bytes(test_samples, encoding='utf8')\n",
- "\n",
- "# predict using the deployed model\n",
- "result = service.run(input_data=test_samples)\n",
- "\n",
- "# compare actual value vs. the predicted values:\n",
- "i = 0\n",
- "plt.figure(figsize = (20, 1))\n",
- "\n",
- "for s in sample_indices:\n",
- " plt.subplot(1, n, i + 1)\n",
- " plt.axhline('')\n",
- " plt.axvline('')\n",
- " \n",
- " # use different color for misclassified sample\n",
- " font_color = 'red' if y_test[s] != result[i] else 'black'\n",
- " clr_map = plt.cm.gray if y_test[s] != result[i] else plt.cm.Greys\n",
- " \n",
- " plt.text(x=10, y=-10, s=y_hat[s], fontsize=18, color=font_color)\n",
- " plt.imshow(X_test[s].reshape(28, 28), cmap=clr_map)\n",
- " \n",
- " i = i + 1\n",
- "plt.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can retreive the API keys used for accessing the HTTP endpoint."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# retreive the API keys. two keys were generated.\n",
- "key1, Key2 = service.get_keys()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We can now send construct raw HTTP request and send to the service. Don't forget to add key to the HTTP header."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import requests\n",
- "\n",
- "# send a random row from the test set to score\n",
- "random_index = np.random.randint(0, len(X_test)-1)\n",
- "input_data = \"{\\\"data\\\": [\" + str(list(X_test[random_index])) + \"]}\"\n",
- "\n",
- "headers = {'Content-Type':'application/json', 'Authorization': 'Bearer ' + key1}\n",
- "\n",
- "resp = requests.post(service.scoring_uri, input_data, headers=headers)\n",
- "\n",
- "print(\"POST to url\", service.scoring_uri)\n",
- "#print(\"input data:\", input_data)\n",
- "print(\"label:\", y_test[random_index])\n",
- "print(\"prediction:\", resp.text)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Let's look at the workspace after the web service was deployed. You should see \n",
- "* a registered model named 'keras-mlp-mnist' and with the id 'model:1'\n",
- "* an image called 'keras-mnist-svc' and with a docker image location pointing to your workspace's Azure Container Registry (ACR) \n",
- "* a webservice called 'keras-mnist-svc' with some scoring URL"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "models = ws.models\n",
- "for name, model in models.items():\n",
- " print(\"Model: {}, ID: {}\".format(name, model.id))\n",
- " \n",
- "images = ws.images\n",
- "for name, image in images.items():\n",
- " print(\"Image: {}, location: {}\".format(name, image.image_location))\n",
- " \n",
- "webservices = ws.webservices\n",
- "for name, webservice in webservices.items():\n",
- " print(\"Webservice: {}, scoring URI: {}\".format(name, webservice.scoring_uri))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Clean up\n",
- "You can delete the ACI deployment with a simple delete API call."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "service.delete()"
- ]
- }
- ],
- "metadata": {
- "authors": [
- {
- "name": "haining"
- }
],
- "kernelspec": {
- "display_name": "Python 3.6",
- "language": "python",
- "name": "python36"
+ "metadata": {
+ "authors": [
+ {
+ "name": "haining"
+ }
+ ],
+ "kernelspec": {
+ "display_name": "Python 3.6",
+ "language": "python",
+ "name": "python36"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.6.7"
+ },
+ "msauthor": "haining"
},
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.6.8"
- },
- "msauthor": "haining"
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
\ No newline at end of file
diff --git a/how-to-use-azureml/training/logging-api/logging-api.ipynb b/how-to-use-azureml/training/logging-api/logging-api.ipynb
index 13a81ee1..91939312 100644
--- a/how-to-use-azureml/training/logging-api/logging-api.ipynb
+++ b/how-to-use-azureml/training/logging-api/logging-api.ipynb
@@ -217,7 +217,7 @@
"metadata": {},
"outputs": [],
"source": [
- "props = run.upload_file(name='myfile_in_the_cloud.txt', path_or_stream='./myfile.txt')\n",
+ "props = run.upload_file(name='outputs/myfile_in_the_cloud.txt', path_or_stream='./myfile.txt')\n",
"props.serialize()"
]
},
diff --git a/how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb b/how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb
index bdc24e91..45d7aa37 100644
--- a/how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb
+++ b/how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb
@@ -81,7 +81,7 @@
"from azureml.core import Experiment, Workspace\n",
"\n",
"# Check core SDK version number\n",
- "print(\"This notebook was created using version 1.0.15 of the Azure ML SDK\")\n",
+ "print(\"This notebook was created using version 1.0.2 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")\n",
"print(\"\")\n",
"\n",
@@ -138,7 +138,6 @@
"* We use `start_logging` to create a new run in this experiment\n",
"* We use `run.log()` to record a parameter, alpha, and an accuracy measure - the Mean Squared Error (MSE) to the run. We will be able to review and compare these measures in the Azure Portal at a later time.\n",
"* We store the resulting model in the **outputs** directory, which is automatically captured by AML when the run is complete.\n",
- "* We use `run.take_snapshot()` to capture *this* notebook so we can reproduce this experiment at a later time.\n",
"* We use `run.complete()` to indicate that the run is over and results can be captured and finalized"
]
},
@@ -173,9 +172,6 @@
"# Save the model to the outputs directory for capture\n",
"joblib.dump(value=regression_model, filename='outputs/model.pkl')\n",
"\n",
- "# Take a snapshot of the directory containing this notebook\n",
- "run.take_snapshot('./')\n",
- "\n",
"# Complete the run\n",
"run.complete()"
]
@@ -238,10 +234,7 @@
" run.log(name=\"mse\", value=mse)\n",
"\n",
" # Save the model to the outputs directory for capture\n",
- " joblib.dump(value=regression_model, filename='outputs/model.pkl')\n",
- " \n",
- " # Capture this notebook with the run\n",
- " run.take_snapshot('./')\n"
+ " joblib.dump(value=regression_model, filename='outputs/model.pkl')\n"
]
},
{
diff --git a/tutorials/img-classification-part1-training.ipynb b/tutorials/img-classification-part1-training.ipynb
index 3b4d4d6d..2d5a663d 100644
--- a/tutorials/img-classification-part1-training.ipynb
+++ b/tutorials/img-classification-part1-training.ipynb
@@ -94,7 +94,7 @@
"source": [
"# load workspace configuration from the config.json file in the current folder.\n",
"ws = Workspace.from_config()\n",
- "print(ws.name, ws.location, ws.resource_group, ws.location, sep = '\\t')"
+ "print(ws.name, ws.location, ws.resource_group, ws.location, sep='\\t')"
]
},
{
@@ -205,7 +205,7 @@
"import urllib.request\n",
"\n",
"data_folder = os.path.join(os.getcwd(), 'data')\n",
- "os.makedirs(data_folder, exist_ok = True)\n",
+ "os.makedirs(data_folder, exist_ok=True)\n",
"\n",
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz', filename=os.path.join(data_folder, 'train-images.gz'))\n",
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz', filename=os.path.join(data_folder, 'train-labels.gz'))\n",
@@ -304,7 +304,7 @@
"outputs": [],
"source": [
"import os\n",
- "script_folder = os.path.join(os.getcwd(), \"sklearn-mnist\")\n",
+ "script_folder = os.path.join(os.getcwd(), \"sklearn-mnist\")\n",
"os.makedirs(script_folder, exist_ok=True)"
]
},
@@ -341,7 +341,7 @@
"parser.add_argument('--regularization', type=float, dest='reg', default=0.01, help='regularization rate')\n",
"args = parser.parse_args()\n",
"\n",
- "data_folder = os.path.join(args.data_folder, 'mnist')\n",
+ "data_folder = args.data_folder\n",
"print('Data folder:', data_folder)\n",
"\n",
"# load train and test set into numpy arrays\n",
@@ -426,7 +426,7 @@
"* Parameters required from the training script \n",
"* Python packages needed for training\n",
"\n",
- "In this tutorial, this target is AmlCompute. All files in the script folder are uploaded into the cluster nodes for execution. The data_folder is set to use the datastore (`ds.as_mount()`)."
+ "In this tutorial, this target is AmlCompute. All files in the script folder are uploaded into the cluster nodes for execution. The data_folder is set to use the datastore (`ds.path('mnist').as_mount()`)."
]
},
{
@@ -442,8 +442,8 @@
"from azureml.train.estimator import Estimator\n",
"\n",
"script_params = {\n",
- " '--data-folder': ds.as_mount(),\n",
- " '--regularization': 0.8\n",
+ " '--data-folder': ds.path('mnist').as_mount(),\n",
+ " '--regularization': 0.05\n",
"}\n",
"\n",
"est = Estimator(source_directory=script_folder,\n",
@@ -453,13 +453,29 @@
" conda_packages=['scikit-learn'])"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "This is what the mounting point looks like:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "print(ds.path('mnist').as_mount())"
+ ]
+ },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Submit the job to the cluster\n",
"\n",
- "Run the experiment by submitting the estimator object."
+ "Run the experiment by submitting the estimator object. And you can navigate to Azure portal to monitor the run."
]
},
{
@@ -486,17 +502,17 @@
"\n",
"## Monitor a remote run\n",
"\n",
- "In total, the first run takes **approximately 10 minutes**. But for subsequent runs, as long as the script dependencies don't change, the same image is reused and hence the container start up time is much faster.\n",
+ "In total, the first run takes **approximately 10 minutes**. But for subsequent runs, as long as the dependencies (`conda_packages` parameter in the above estimator constructor) don't change, the same image is reused and hence the container start up time is much faster.\n",
"\n",
"Here is what's happening while you wait:\n",
"\n",
- "- **Image creation**: A Docker image is created matching the Python environment specified by the estimator. The image is uploaded to the workspace. Image creation and uploading takes **about 5 minutes**. \n",
+ "- **Image creation**: A Docker image is created matching the Python environment specified by the estimator. The image is built and stored in the ACR (Azure Container Registry) associated with your workspace. Image creation and uploading takes **about 5 minutes**. \n",
"\n",
" This stage happens once for each Python environment since the container is cached for subsequent runs. During image creation, logs are streamed to the run history. You can monitor the image creation progress using these logs.\n",
"\n",
"- **Scaling**: If the remote cluster requires more nodes to execute the run than currently available, additional nodes are added automatically. Scaling typically takes **about 5 minutes.**\n",
"\n",
- "- **Running**: In this stage, the necessary scripts and files are sent to the compute target, then data stores are mounted/copied, then the entry_script is run. While the job is running, stdout and the ./logs directory are streamed to the run history. You can monitor the run's progress using these logs.\n",
+ "- **Running**: In this stage, the necessary scripts and files are sent to the compute target, then data stores are mounted/copied, then the entry_script is run. While the job is running, stdout and the files in the ./logs directory are streamed to the run history. You can monitor the run's progress using these logs.\n",
"\n",
"- **Post-Processing**: The ./outputs directory of the run is copied over to the run history in your workspace so you can access these results.\n",
"\n",
@@ -526,7 +542,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "If you need to cancel a run, you can follow [these instructions](https://aka.ms/aml-docs-cancel-run)."
+ "By the way, if you need to cancel a run, you can follow [these instructions](https://aka.ms/aml-docs-cancel-run)."
]
},
{
@@ -535,7 +551,7 @@
"source": [
"### Get log results upon completion\n",
"\n",
- "Model training and monitoring happen in the background. Wait until the model has completed training before running more code. Use `wait_for_completion` to show when the model training is complete."
+ "Model training happens in the background. You can use `wait_for_completion` to block and wait until the model has completed training before running more code. "
]
},
{
@@ -550,7 +566,8 @@
},
"outputs": [],
"source": [
- "run.wait_for_completion(show_output=False) # specify True for a verbose log"
+ "# specify show_output to True for a verbose log\n",
+ "run.wait_for_completion(show_output=False) "
]
},
{
@@ -559,7 +576,7 @@
"source": [
"### Display run results\n",
"\n",
- "You now have a model trained on a remote cluster. Retrieve the accuracy of the model:"
+ "You now have a model trained on a remote cluster. Retrieve all the metrics logged during the run, including the accuracy of the model:"
]
},
{
@@ -620,7 +637,7 @@
"source": [
"# register model \n",
"model = run.register_model(model_name='sklearn_mnist', model_path='outputs/sklearn_mnist_model.pkl')\n",
- "print(model.name, model.id, model.version, sep = '\\t')"
+ "print(model.name, model.id, model.version, sep='\\t')"
]
},
{
@@ -663,9 +680,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.6.2"
+ "version": "3.6.8"
},
- "msauthor": "sgilley"
+ "msauthor": "haining"
},
"nbformat": 4,
"nbformat_minor": 2