Mirror of https://github.com/Azure/MachineLearningNotebooks.git, synced 2025-12-20 01:27:06 -05:00.

Compare commits: azureml-sd ... azureml-sd (13 commits)
| Author | SHA1 | Date |
|---|---|---|
|  | 879a272a8d |  |
|  | bc65bde097 |  |
|  | 690bdfbdbe |  |
|  | 3c02bd8782 |  |
|  | 5c14610a1c |  |
|  | 4e3afae6fb |  |
|  | a2144aa083 |  |
|  | 0e6334178f |  |
|  | 4ec9178d22 |  |
|  | 2aa7c53b0c |  |
|  | 553fa43e17 |  |
|  | e98131729e |  |
|  | fd2b09e2c2 |  |
@@ -103,7 +103,7 @@
 "source": [
 "import azureml.core\n",
 "\n",
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },
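For reference, a minimal sketch of the SDK version check these notebook cells run after the bump from 1.3.0 to 1.5.0; the `EXPECTED_SDK_VERSION` constant is only an illustrative name, the notebooks simply hard-code the string in the print call.

```python
# Hedged sketch of the version-check cell updated throughout this comparison.
import azureml.core

EXPECTED_SDK_VERSION = "1.5.0"  # illustrative constant; the notebooks inline this string
print(f"This notebook was created using version {EXPECTED_SDK_VERSION} of the Azure ML SDK")
print("You are currently using version", azureml.core.VERSION, "of the Azure ML SDK")
```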
@@ -1,8 +1,8 @@
 # Table of Contents
 1. [Automated ML Introduction](#introduction)
-1. [Setup using Azure Notebooks](#jupyter)
-1. [Setup using Azure Databricks](#databricks)
+1. [Setup using Compute Instances](#jupyter)
 1. [Setup using a Local Conda environment](#localconda)
+1. [Setup using Azure Databricks](#databricks)
 1. [Automated ML SDK Sample Notebooks](#samples)
 1. [Documentation](#documentation)
 1. [Running using python command](#pythoncommand)

@@ -21,13 +21,13 @@ Below are the three execution environments supported by automated ML.
 
 
 <a name="jupyter"></a>
-## Setup using Notebook VMs - Jupyter based notebooks from a Azure VM
+## Setup using Compute Instances - Jupyter based notebooks from a Azure Virtual Machine
 
 1. Open the [ML Azure portal](https://ml.azure.com)
 1. Select Compute
-1. Select Notebook VMs
+1. Select Compute Instances
 1. Click New
-1. Type a name for the Vm and select a VM type
+1. Type a Compute Name, select a Virtual Machine type and select a Virtual Machine size
 1. Click Create
 
 <a name="localconda"></a>

@@ -4,34 +4,28 @@ dependencies:
 # Currently Azure ML only supports 3.5.2 and later.
 - pip<=19.3.1
 - python>=3.5.2,<3.6.8
-- wheel==0.30.0
 - nb_conda
 - matplotlib==2.1.0
 - numpy>=1.16.0,<=1.16.2
 - cython
 - urllib3<1.24
-- scipy>=1.0.0,<=1.1.0
+- scipy==1.4.1
 - scikit-learn>=0.19.0,<=0.20.3
 - pandas>=0.22.0,<=0.23.4
 - py-xgboost<=0.90
-- fbprophet==0.5
+- conda-forge::fbprophet==0.5
-- pytorch=1.1.0
+- pytorch::pytorch=1.4.0
-- cudatoolkit=9.0
+- cudatoolkit=10.1.243
 
 - pip:
 # Required packages for AzureML execution, history, and data preparation.
 - azureml-defaults
-- azureml-dataprep[pandas]
 - azureml-train-automl
 - azureml-train
 - azureml-widgets
 - azureml-pipeline
 - pytorch-transformers==1.0.0
 - spacy==2.1.8
-- onnxruntime==1.0.0
+- pyarrow==0.17.0
 - https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
 
-channels:
-- anaconda
-- conda-forge
-- pytorch

@@ -5,34 +5,27 @@ dependencies:
 - pip<=19.3.1
 - nomkl
 - python>=3.5.2,<3.6.8
-- wheel==0.30.0
 - nb_conda
 - matplotlib==2.1.0
 - numpy>=1.16.0,<=1.16.2
 - cython
 - urllib3<1.24
-- scipy>=1.0.0,<=1.1.0
+- scipy==1.4.1
 - scikit-learn>=0.19.0,<=0.20.3
-- pandas>=0.22.0,<0.23.0
+- pandas>=0.22.0,<=0.23.4
-- py-xgboost<=0.80
+- py-xgboost<=0.90
-- fbprophet==0.5
+- conda-forge::fbprophet==0.5
-- pytorch=1.1.0
+- pytorch::pytorch=1.4.0
 - cudatoolkit=9.0
 
 - pip:
 # Required packages for AzureML execution, history, and data preparation.
 - azureml-defaults
-- azureml-dataprep[pandas]
 - azureml-train-automl
 - azureml-train
 - azureml-widgets
 - azureml-pipeline
 - pytorch-transformers==1.0.0
 - spacy==2.1.8
-- onnxruntime==1.0.0
+- pyarrow==0.17.0
 - https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
 
-channels:
-- anaconda
-- conda-forge
-- pytorch
@@ -41,7 +41,7 @@
 "\n",
 "In this example we use the UCI Bank Marketing dataset to showcase how you can use AutoML for a classification problem and deploy it to an Azure Container Instance (ACI). The classification goal is to predict if the client will subscribe to a term deposit with the bank.\n",
 "\n",
-"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. \n",
+"If you are using an Azure Machine Learning Compute Instance, you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. \n",
 "\n",
 "Please find the ONNX related documentations [here](https://github.com/onnx/onnx).\n",
 "\n",

@@ -105,7 +105,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },

@@ -643,7 +643,7 @@
 "\n",
 "### Retrieve the Best Model\n",
 "\n",
-"Below we select the best pipeline from our iterations. The `get_output` method on `automl_classifier` returns the best run and the fitted model for the last invocation. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*."
+"Below we select the best pipeline from our iterations. The `get_output` method returns the best run and the fitted model. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*."
 ]
 },
 {

@@ -42,7 +42,7 @@
 "\n",
 "This notebook is using remote compute to train the model.\n",
 "\n",
-"If you are using an Azure Machine Learning [Notebook VM](https://docs.microsoft.com/en-us/azure/machine-learning/service/tutorial-1st-experiment-sdk-setup), you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. \n",
+"If you are using an Azure Machine Learning Compute Instance, you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. \n",
 "\n",
 "In this notebook you will learn how to:\n",
 "1. Create an experiment using an existing workspace.\n",

@@ -93,7 +93,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },

@@ -322,7 +322,7 @@
 "\n",
 "### Retrieve the Best Model\n",
 "\n",
-"Below we select the best pipeline from our iterations. The `get_output` method on `automl_classifier` returns the best run and the fitted model for the last invocation. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*."
+"Below we select the best pipeline from our iterations. The `get_output` method returns the best run and the fitted model. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*."
 ]
 },
 {

@@ -97,7 +97,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },

@@ -194,8 +194,8 @@
 " '''\n",
 " remove = ('headers', 'footers', 'quotes')\n",
 " categories = [\n",
-" 'alt.atheism',\n",
+" 'rec.sport.baseball',\n",
-" 'talk.religion.misc',\n",
+" 'rec.sport.hockey',\n",
 " 'comp.graphics',\n",
 " 'sci.space',\n",
 " ]\n",

@@ -345,7 +345,8 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"You can test the model locally to get a feel of the input/output. This step may require additional package installations such as pytorch."
+"You can test the model locally to get a feel of the input/output. When the model contains BERT, this step will require pytorch and pytorch-transformers installed in your local environment. The exact versions of these packages can be found in the **automl_env.yml** file located in the local copy of your MachineLearningNotebooks folder here:\n",
+"MachineLearningNotebooks/how-to-use-azureml/automated-machine-learning/automl_env.yml"
 ]
 },
 {

@@ -481,7 +482,7 @@
 "source": [
 "script_folder = os.path.join(os.getcwd(), 'inference')\n",
 "os.makedirs(script_folder, exist_ok=True)\n",
-"shutil.copy2('infer.py', script_folder)"
+"shutil.copy('infer.py', script_folder)"
 ]
 },
 {
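Several notebooks in this comparison swap `shutil.copy2` for `shutil.copy` when staging the scoring script. A small sketch of the resulting cell, assuming `infer.py` sits next to the notebook as it does in these samples:

```python
import os
import shutil

# Stage the scoring script in a dedicated folder for the inference run.
# copy() copies file contents and permissions; copy2() would also try to
# preserve metadata such as timestamps, which is not needed here.
script_folder = os.path.join(os.getcwd(), 'inference')
os.makedirs(script_folder, exist_ok=True)
shutil.copy('infer.py', script_folder)
```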
@@ -5,7 +5,6 @@ dependencies:
 - azureml-train-automl
 - azureml-widgets
 - matplotlib
-- azurmel-train
 - https://download.pytorch.org/whl/cpu/torch-1.1.0-cp35-cp35m-win_amd64.whl
 - sentencepiece==0.1.82
 - pytorch-transformers==1.0

@@ -2,8 +2,7 @@ import numpy as np
 import argparse
 from azureml.core import Run
 from sklearn.externals import joblib
-from azureml.automl.core._vendor.automl.client.core.common import metrics
+from azureml.automl.core.shared import constants, metrics
-from automl.client.core.common import constants
 from azureml.core.model import Model
 
 
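The same import migration recurs in the inference, forecasting, and explainer scripts throughout this comparison: the vendored `automl.client.core.common` paths are replaced by the public `azureml.automl.core.shared` namespace. The before/after, taken directly from these hunks:

```python
# Old imports (removed in this change):
#   from azureml.automl.core._vendor.automl.client.core.common import metrics
#   from automl.client.core.common import constants
# New import (added in this change):
from azureml.automl.core.shared import constants, metrics
```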
@@ -88,7 +88,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },

@@ -114,7 +114,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },

@@ -572,7 +572,7 @@
 "\n",
 "script_folder = os.path.join(os.getcwd(), 'inference')\n",
 "os.makedirs(script_folder, exist_ok=True)\n",
-"shutil.copy2('infer.py', script_folder)"
+"shutil.copy('infer.py', script_folder)"
 ]
 },
 {

@@ -4,8 +4,7 @@ import argparse
 from azureml.core import Run
 from sklearn.externals import joblib
 from sklearn.metrics import mean_absolute_error, mean_squared_error
-from azureml.automl.core._vendor.automl.client.core.common import metrics
+from azureml.automl.core.shared import constants, metrics
-from automl.client.core.common import constants
 from pandas.tseries.frequencies import to_offset
 
 

@@ -87,7 +87,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },

@@ -453,8 +453,8 @@
 "\n",
 "script_folder = os.path.join(os.getcwd(), 'forecast')\n",
 "os.makedirs(script_folder, exist_ok=True)\n",
-"shutil.copy2('forecasting_script.py', script_folder)\n",
+"shutil.copy('forecasting_script.py', script_folder)\n",
-"shutil.copy2('forecasting_helper.py', script_folder)"
+"shutil.copy('forecasting_helper.py', script_folder)"
 ]
 },
 {

@@ -510,10 +510,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from azureml.automl.core._vendor.automl.client.core.common import metrics\n",
+"from azureml.automl.core.shared import constants, metrics\n",
 "from sklearn.metrics import mean_absolute_error, mean_squared_error\n",
 "from matplotlib import pyplot as plt\n",
-"from automl.client.core.common import constants\n",
 "\n",
 "# use automl metrics module\n",
 "scores = metrics.compute_metrics_regression(\n",

@@ -1,6 +1,6 @@
 import argparse
 import azureml.train.automl
-from azureml.automl.runtime._vendor.automl.client.core.runtime import forecasting_models
+from azureml.automl.runtime.shared import forecasting_models
 from azureml.core import Run
 from sklearn.externals import joblib
 import forecasting_helper

@@ -42,7 +42,7 @@
 "\n",
 "In this example we use the associated New York City energy demand dataset to showcase how you can use AutoML for a simple forecasting problem and explore the results. The goal is predict the energy demand for the next 48 hours based on historic time-series data.\n",
 "\n",
-"If you are using an Azure Machine Learning [Notebook VM](https://docs.microsoft.com/en-us/azure/machine-learning/service/tutorial-1st-experiment-sdk-setup), you are all set. Otherwise, go through the [configuration notebook](../../../configuration.ipynb) first, if you haven't already, to establish your connection to the AzureML Workspace.\n",
+"If you are using an Azure Machine Learning Compute Instance, you are all set. Otherwise, go through the [configuration notebook](../../../configuration.ipynb) first, if you haven't already, to establish your connection to the AzureML Workspace.\n",
 "\n",
 "In this notebook you will learn how to:\n",
 "1. Creating an Experiment using an existing Workspace\n",

@@ -97,7 +97,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },

@@ -507,9 +507,8 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from azureml.automl.core._vendor.automl.client.core.common import metrics\n",
+"from azureml.automl.core.shared import constants, metrics\n",
 "from matplotlib import pyplot as plt\n",
-"from automl.client.core.common import constants\n",
 "\n",
 "# use automl metrics module\n",
 "scores = metrics.compute_metrics_regression(\n",

@@ -668,9 +667,8 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from azureml.automl.core._vendor.automl.client.core.common import metrics\n",
+"from azureml.automl.core.shared import constants, metrics\n",
 "from matplotlib import pyplot as plt\n",
-"from automl.client.core.common import constants\n",
 "\n",
 "# use automl metrics module\n",
 "scores = metrics.compute_metrics_regression(\n",

@@ -95,7 +95,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },

@@ -355,9 +355,24 @@
 " label_column_name=target_label,\n",
 " **time_series_settings)\n",
 "\n",
-"remote_run = experiment.submit(automl_config, show_output=False)\n",
+"remote_run = experiment.submit(automl_config, show_output=False)"
-"remote_run.wait_for_completion()\n",
+]
-"\n",
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"remote_run.wait_for_completion()"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
 "# Retrieve the best model to use it further.\n",
 "_, fitted_model = remote_run.get_output()"
 ]
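The hunk above splits a single notebook cell into three so that submission, waiting, and model retrieval can be rerun independently. A rough sketch of the resulting flow, assuming `experiment` and `automl_config` are already defined as in the notebook:

```python
# Submit the AutoML experiment without streaming output to the cell.
remote_run = experiment.submit(automl_config, show_output=False)

# Block until the remote run finishes (now its own cell).
remote_run.wait_for_completion()

# Retrieve the best model to use it further.
_, fitted_model = remote_run.get_output()
```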
@@ -65,7 +65,8 @@
 "\n",
 "from azureml.core.workspace import Workspace\n",
 "from azureml.core.experiment import Experiment\n",
-"from azureml.train.automl import AutoMLConfig"
+"from azureml.train.automl import AutoMLConfig\n",
+"from azureml.automl.core.featurization import FeaturizationConfig"
 ]
 },
 {

@@ -81,7 +82,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },

@@ -318,17 +319,54 @@
 "target_column_name = 'Quantity'"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Customization\n",
+"\n",
+"The featurization customization in forecasting is an advanced feature in AutoML which allows our customers to change the default forecasting featurization behaviors and column types through `FeaturizationConfig`. The supported scenarios include,\n",
+"1. Column purposes update: Override feature type for the specified column. Currently supports DateTime, Categorical and Numeric. This customization can be used in the scenario that the type of the column cannot correctly reflect its purpose. Some numerical columns, for instance, can be treated as Categorical columns which need to be converted to categorical while some can be treated as epoch timestamp which need to be converted to datetime. To tell our SDK to correctly preprocess these columns, a configuration need to be add with the columns and their desired types.\n",
+"2. Transformer parameters update: Currently supports parameter change for Imputer only. User can customize imputation methods, the supported methods are constant for target data and mean, median, most frequent and constant for training data. This customization can be used for the scenario that our customers know which imputation methods fit best to the input data. For instance, some datasets use NaN to represent 0 which the correct behavior should impute all the missing value with 0. To achieve this behavior, these columns need to be configured as constant imputation with `fill_value` 0.\n",
+"3. Drop columns: Columns to drop from being featurized. These usually are the columns which are leaky or the columns contain no useful data.\n",
+"\n",
+"This step requires an Enterprise workspace to gain access to this feature. To learn more about creating an Enterprise workspace or upgrading to an Enterprise workspace from the Azure portal, please visit our [Workspace page.](https://docs.microsoft.com/azure/machine-learning/service/concept-workspace#upgrade)"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"featurization_config = FeaturizationConfig()\n",
+"featurization_config.drop_columns = ['logQuantity'] # 'logQuantity' is a leaky feature, so we remove it.\n",
+"# Force the CPWVOL5 feature to be numeric type.\n",
+"featurization_config.add_column_purpose('CPWVOL5', 'Numeric')\n",
+"# Fill missing values in the target column, Quantity, with zeros.\n",
+"featurization_config.add_transformer_params('Imputer', ['Quantity'], {\"strategy\": \"constant\", \"fill_value\": 0})\n",
+"# Fill missing values in the INCOME column with median value.\n",
+"featurization_config.add_transformer_params('Imputer', ['INCOME'], {\"strategy\": \"median\"})"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
 "## Train\n",
 "\n",
-"The AutoMLConfig object defines the settings and data for an AutoML training job. Here, we set necessary inputs like the task type, the number of AutoML iterations to try, the training data, and cross-validation parameters. \n",
+"The [AutoMLConfig](https://docs.microsoft.com/en-us/python/api/azureml-train-automl-client/azureml.train.automl.automlconfig.automlconfig?view=azure-ml-py) object defines the settings and data for an AutoML training job. Here, we set necessary inputs like the task type, the number of AutoML iterations to try, the training data, and cross-validation parameters.\n",
 "\n",
-"For forecasting tasks, there are some additional parameters that can be set: the name of the column holding the date/time, the grain column names, and the maximum forecast horizon. A time column is required for forecasting, while the grain is optional. If a grain is not given, AutoML assumes that the whole dataset is a single time-series. We also pass a list of columns to drop prior to modeling. The _logQuantity_ column is completely correlated with the target quantity, so it must be removed to prevent a target leak.\n",
+"For forecasting tasks, there are some additional parameters that can be set: the name of the column holding the date/time, the grain column names, and the maximum forecast horizon. A time column is required for forecasting, while the grain is optional. If grain columns are not given, AutoML assumes that the whole dataset is a single time-series. We also pass a list of columns to drop prior to modeling. The _logQuantity_ column is completely correlated with the target quantity, so it must be removed to prevent a target leak.\n",
+"\n",
+"The forecast horizon is given in units of the time-series frequency; for instance, the OJ series frequency is weekly, so a horizon of 20 means that a trained model will estimate sales up to 20 weeks beyond the latest date in the training data for each series. In this example, we set the maximum horizon to the number of samples per series in the test set (n_test_periods). Generally, the value of this parameter will be dictated by business needs. For example, a demand planning application that estimates the next month of sales should set the horizon according to suitable planning time-scales. Please see the [energy_demand notebook](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand) for more discussion of forecast horizon.\n",
+"\n",
+"We note here that AutoML can sweep over two types of time-series models:\n",
+"* Models that are trained for each series such as ARIMA and Facebook's Prophet. Note that these models are only available for [Enterprise Edition Workspaces](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-manage-workspace#upgrade).\n",
+"* Models trained across multiple time-series using a regression approach.\n",
+"\n",
+"In the first case, AutoML loops over all time-series in your dataset and trains one model (e.g. AutoArima or Prophet, as the case may be) for each series. This can result in long runtimes to train these models if there are a lot of series in the data. One way to mitigate this problem is to fit models for different series in parallel if you have multiple compute cores available. To enable this behavior, set the `max_cores_per_iteration` parameter in your AutoMLConfig as shown in the example in the next cell. \n",
 "\n",
-"The forecast horizon is given in units of the time-series frequency; for instance, the OJ series frequency is weekly, so a horizon of 20 means that a trained model will estimate sales up to 20 weeks beyond the latest date in the training data for each series. In this example, we set the maximum horizon to the number of samples per series in the test set (n_test_periods). Generally, the value of this parameter will be dictated by business needs. For example, a demand planning organizaion that needs to estimate the next month of sales would set the horizon accordingly. Please see the [energy_demand notebook](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand) for more discussion of forecast horizon.\n",
 "\n",
 "Finally, a note about the cross-validation (CV) procedure for time-series data. AutoML uses out-of-sample error estimates to select a best pipeline/model, so it is important that the CV fold splitting is done correctly. Time-series can violate the basic statistical assumptions of the canonical K-Fold CV strategy, so AutoML implements a [rolling origin validation](https://robjhyndman.com/hyndsight/tscv/) procedure to create CV folds for time-series data. To use this procedure, you just need to specify the desired number of CV folds in the AutoMLConfig object. It is also possible to bypass CV and use your own validation set by setting the *validation_data* parameter of AutoMLConfig.\n",
 "\n",
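Pulling together the new `FeaturizationConfig` import and the new customization cell, a condensed sketch of the featurization setup this change adds to the orange juice forecasting notebook (the column names CPWVOL5, Quantity and INCOME come from that dataset):

```python
from azureml.automl.core.featurization import FeaturizationConfig

featurization_config = FeaturizationConfig()
# 'logQuantity' is a leaky feature, so we remove it.
featurization_config.drop_columns = ['logQuantity']
# Force the CPWVOL5 feature to be numeric type.
featurization_config.add_column_purpose('CPWVOL5', 'Numeric')
# Fill missing values in the target column, Quantity, with zeros.
featurization_config.add_transformer_params('Imputer', ['Quantity'],
                                             {"strategy": "constant", "fill_value": 0})
# Fill missing values in the INCOME column with median value.
featurization_config.add_transformer_params('Imputer', ['INCOME'], {"strategy": "median"})
```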
@@ -349,8 +387,9 @@
 "|**debug_log**|Log file path for writing debugging information|\n",
 "|**time_column_name**|Name of the datetime column in the input data|\n",
 "|**grain_column_names**|Name(s) of the columns defining individual series in the input data|\n",
-"|**drop_column_names**|Name(s) of columns to drop prior to modeling|\n",
+"|**max_horizon**|Maximum desired forecast horizon in units of time-series frequency|\n",
-"|**max_horizon**|Maximum desired forecast horizon in units of time-series frequency|"
+"|**featurization**| 'auto' / 'off' / FeaturizationConfig Indicator for whether featurization step should be done automatically or not, or whether customized featurization should be used. Setting this enables AutoML to perform featurization on the input to handle *missing data*, and to perform some common *feature extraction*.|\n",
+"|**max_cores_per_iteration**|Maximum number of cores to utilize per iteration. A value of -1 indicates all available cores should be used.|"
 ]
 },
 {

@@ -362,7 +401,6 @@
 "time_series_settings = {\n",
 " 'time_column_name': time_column_name,\n",
 " 'grain_column_names': grain_column_names,\n",
-" 'drop_column_names': ['logQuantity'], # 'logQuantity' is a leaky feature, so we remove it.\n",
 " 'max_horizon': n_test_periods\n",
 "}\n",
 "\n",

@@ -374,8 +412,10 @@
 " label_column_name=target_column_name,\n",
 " compute_target=compute_target,\n",
 " enable_early_stopping=True,\n",
+" featurization=featurization_config,\n",
 " n_cross_validations=3,\n",
 " verbosity=logging.INFO,\n",
+" max_cores_per_iteration=-1,\n",
 " **time_series_settings)"
 ]
 },
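A sketch of how the updated settings fit together in the AutoMLConfig call. Only the lines marked in the hunks above are confirmed by this change; the `task` and `training_data` arguments (and the `train_dataset` name) sit outside these hunks and are assumptions for illustration.

```python
import logging
from azureml.train.automl import AutoMLConfig

# 'drop_column_names' is gone: dropping 'logQuantity' is now handled by
# featurization_config.drop_columns instead of the time-series settings.
time_series_settings = {
    'time_column_name': time_column_name,
    'grain_column_names': grain_column_names,
    'max_horizon': n_test_periods
}

automl_config = AutoMLConfig(task='forecasting',                  # assumed, outside this hunk
                             training_data=train_dataset,         # assumed dataset variable
                             label_column_name=target_column_name,
                             compute_target=compute_target,
                             enable_early_stopping=True,
                             featurization=featurization_config,  # new: custom featurization
                             n_cross_validations=3,
                             verbosity=logging.INFO,
                             max_cores_per_iteration=-1,          # new: use all available cores
                             **time_series_settings)
```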
@@ -425,6 +465,33 @@
 "model_name = best_run.properties['model_name']"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Transparency\n",
+"\n",
+"View updated featurization summary"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"custom_featurizer = fitted_model.named_steps['timeseriestransformer']"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"custom_featurizer.get_featurization_summary()"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
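Condensed from the new Transparency cells above, a short sketch of how the featurization summary is read off the fitted forecasting pipeline (assuming `fitted_model` came from `remote_run.get_output()` earlier in the notebook):

```python
# The fitted AutoML forecasting model is a pipeline whose time-series transformer
# records how each input column was featurized.
custom_featurizer = fitted_model.named_steps['timeseriestransformer']
summary = custom_featurizer.get_featurization_summary()
print(summary)
```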
@@ -478,7 +545,7 @@
 "source": [
 "If you are used to scikit pipelines, perhaps you expected `predict(X_test)`. However, forecasting requires a more general interface that also supplies the past target `y` values. Please use `forecast(X,y)` as `predict(X)` is reserved for internal purposes on forecasting models.\n",
 "\n",
-"The [energy demand forecasting notebook](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand) demonstrates the use of the forecast function in more detail in the context of using lags and rolling window features. "
+"The [forecast function notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/forecasting-high-frequency/auto-ml-forecasting-function.ipynb) demonstrates the use of the forecast function for a variety of use cases. Also, please see the [API documentation for the forecast function](https://docs.microsoft.com/en-us/python/api/azureml-automl-runtime/azureml.automl.runtime.shared.model_wrappers.forecastingpipelinewrapper?view=azure-ml-py#forecast-x-pred--typing-union-pandas-core-frame-dataframe--nonetype----none--y-pred--typing-union-pandas-core-frame-dataframe--numpy-ndarray--nonetype----none--forecast-destination--typing-union-pandas--libs-tslibs-timestamps-timestamp--nonetype----none--ignore-data-errors--bool---false-----typing-tuple-numpy-ndarray--pandas-core-frame-dataframe-)."
 ]
 },
 {

@@ -509,9 +576,8 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from azureml.automl.core._vendor.automl.client.core.common import metrics\n",
+"from azureml.automl.core.shared import constants, metrics\n",
 "from matplotlib import pyplot as plt\n",
-"from automl.client.core.common import constants\n",
 "\n",
 "# use automl metrics module\n",
 "scores = metrics.compute_metrics_regression(\n",

@@ -42,7 +42,7 @@
 "\n",
 "This notebook is using the local machine compute to train the model.\n",
 "\n",
-"If you are using an Azure Machine Learning [Notebook VM](https://docs.microsoft.com/en-us/azure/machine-learning/service/tutorial-1st-experiment-sdk-setup), you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. \n",
+"If you are using an Azure Machine Learning Compute Instance, you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. \n",
 "\n",
 "In this notebook you will learn how to:\n",
 "1. Create an experiment using an existing workspace.\n",

@@ -95,7 +95,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },

@@ -370,7 +370,7 @@
 "metadata": {},
 "source": [
 "#### Initialize the Mimic Explainer for feature importance\n",
-"For explaining the AutoML models, use the MimicWrapper from azureml.explain.model package. The MimicWrapper can be initialized with fields in automl_explainer_setup_obj, your workspace and a LightGBM model which acts as a surrogate model to explain the AutoML model (fitted_model here). The MimicWrapper also takes the automl_run object where engineered explanations will be uploaded."
+"For explaining the AutoML models, use the MimicWrapper from azureml.explain.model package. The MimicWrapper can be initialized with fields in automl_explainer_setup_obj, your workspace and a surrogate model to explain the AutoML model (fitted_model here). The MimicWrapper also takes the automl_run object where engineered explanations will be uploaded."
 ]
 },
 {

@@ -379,13 +379,14 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from azureml.explain.model.mimic.models.lightgbm_model import LGBMExplainableModel\n",
 "from azureml.explain.model.mimic_wrapper import MimicWrapper\n",
-"explainer = MimicWrapper(ws, automl_explainer_setup_obj.automl_estimator, LGBMExplainableModel, \n",
+"explainer = MimicWrapper(ws, automl_explainer_setup_obj.automl_estimator,\n",
+" explainable_model=automl_explainer_setup_obj.surrogate_model, \n",
 " init_dataset=automl_explainer_setup_obj.X_transform, run=automl_run,\n",
 " features=automl_explainer_setup_obj.engineered_feature_names, \n",
 " feature_maps=[automl_explainer_setup_obj.feature_map],\n",
-" classes=automl_explainer_setup_obj.classes)"
+" classes=automl_explainer_setup_obj.classes,\n",
+" explainer_kwargs=automl_explainer_setup_obj.surrogate_model_params)"
 ]
 },
 {
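Reassembling the MimicWrapper hunk above, a sketch of the updated explainer construction; `ws`, `automl_run`, and `automl_explainer_setup_obj` are assumed to come from the earlier explainer setup step, and the surrogate model is now taken from that setup object rather than being hard-coded to LightGBM:

```python
from azureml.explain.model.mimic_wrapper import MimicWrapper

# The surrogate model and its parameters now come from the setup object,
# so the explicit LGBMExplainableModel import is no longer needed.
explainer = MimicWrapper(ws, automl_explainer_setup_obj.automl_estimator,
                         explainable_model=automl_explainer_setup_obj.surrogate_model,
                         init_dataset=automl_explainer_setup_obj.X_transform, run=automl_run,
                         features=automl_explainer_setup_obj.engineered_feature_names,
                         feature_maps=[automl_explainer_setup_obj.feature_map],
                         classes=automl_explainer_setup_obj.classes,
                         explainer_kwargs=automl_explainer_setup_obj.surrogate_model_params)
```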
@@ -40,7 +40,7 @@
 "In this example we use the Hardware Performance Dataset to showcase how you can use AutoML for a simple regression problem. The Regression goal is to predict the performance of certain combinations of hardware parts.\n",
 "After training AutoML models for this regression data set, we show how you can compute model explanations on your remote compute using a sample explainer script.\n",
 "\n",
-"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. \n",
+"If you are using an Azure Machine Learning Compute Instance, you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. \n",
 "\n",
 "An Enterprise workspace is required for this notebook. To learn more about creating an Enterprise workspace or upgrading to an Enterprise workspace from the Azure portal, please visit our [Workspace page.](https://docs.microsoft.com/azure/machine-learning/service/concept-workspace#upgrade) \n",
 "\n",

@@ -98,7 +98,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },

@@ -10,8 +10,7 @@ from azureml.train.automl.runtime.automl_explain_utilities import AutoMLExplaine
 automl_setup_model_explanations, automl_check_model_if_explainable
 from azureml.explain.model.mimic.models.lightgbm_model import LGBMExplainableModel
 from azureml.explain.model.mimic_wrapper import MimicWrapper
-from automl.client.core.common.constants import MODEL_PATH
+from azureml.automl.core.shared.constants import MODEL_PATH
-from azureml.automl.core.shared.constants import MODEL_EXPLANATION_TAG
 from azureml.explain.model.scoring.scoring_explainer import TreeScoringExplainer, save
 
 

@@ -69,9 +68,6 @@ raw_explanations = explainer.explain(['local', 'global'], get_raw=True, tag='raw
 raw_feature_names=automl_explainer_setup_obj.raw_feature_names,
 eval_dataset=automl_explainer_setup_obj.X_test_transform)
 
-# Set tag that explanations completed
-automl_run.tag(MODEL_EXPLANATION_TAG, 'True')
-
 print("Engineered and raw explanations computed successfully")
 
 # Initialize the ScoringExplainer

@@ -40,7 +40,7 @@
 "## Introduction\n",
 "In this example we use the Hardware Performance Dataset to showcase how you can use AutoML for a simple regression problem. The Regression goal is to predict the performance of certain combinations of hardware parts.\n",
 "\n",
-"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. \n",
+"If you are using an Azure Machine Learning Compute Instance, you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. \n",
 "\n",
 "In this notebook you will learn how to:\n",
 "1. Create an `Experiment` in an existing `Workspace`.\n",

@@ -92,7 +92,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(\"This notebook was created using version 1.3.0 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.5.0 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },
@@ -1,23 +0,0 @@
--- This shows using the AutoMLForecast stored procedure to predict using a forecasting model for the nyc_energy dataset.
-
-DECLARE @Model NVARCHAR(MAX) = (SELECT TOP 1 Model FROM dbo.aml_model
-WHERE ExperimentName = 'automl-sql-forecast'
-ORDER BY CreatedDate DESC)
-
-DECLARE @max_horizon INT = 48
-DECLARE @split_time NVARCHAR(22) = (SELECT DATEADD(hour, -@max_horizon, MAX(timeStamp)) FROM nyc_energy WHERE demand IS NOT NULL)
-
-DECLARE @TestDataQuery NVARCHAR(MAX) = '
-SELECT CAST(timeStamp AS NVARCHAR(30)) AS timeStamp,
-demand,
-precip,
-temp
-FROM nyc_energy
-WHERE demand IS NOT NULL AND precip IS NOT NULL AND temp IS NOT NULL
-AND timeStamp > ''' + @split_time + ''''
-
-EXEC dbo.AutoMLForecast @input_query=@TestDataQuery,
-@label_column='demand',
-@time_column_name='timeStamp',
-@model=@model
-WITH RESULT SETS ((timeStamp DATETIME, grain NVARCHAR(255), predicted_demand FLOAT, precip FLOAT, temp FLOAT, actual_demand FLOAT))

@@ -1,10 +0,0 @@
--- This lists all the metrics for all iterations for the most recent run.
-
-DECLARE @RunId NVARCHAR(43)
-DECLARE @ExperimentName NVARCHAR(255)
-
-SELECT TOP 1 @ExperimentName=ExperimentName, @RunId=SUBSTRING(RunId, 1, 43)
-FROM aml_model
-ORDER BY CreatedDate DESC
-
-EXEC dbo.AutoMLGetMetrics @RunId, @ExperimentName

@@ -1,25 +0,0 @@
--- This shows using the AutoMLTrain stored procedure to create a forecasting model for the nyc_energy dataset.
-
-DECLARE @max_horizon INT = 48
-DECLARE @split_time NVARCHAR(22) = (SELECT DATEADD(hour, -@max_horizon, MAX(timeStamp)) FROM nyc_energy WHERE demand IS NOT NULL)
-
-DECLARE @TrainDataQuery NVARCHAR(MAX) = '
-SELECT CAST(timeStamp as NVARCHAR(30)) as timeStamp,
-demand,
-precip,
-temp
-FROM nyc_energy
-WHERE demand IS NOT NULL AND precip IS NOT NULL AND temp IS NOT NULL
-and timeStamp < ''' + @split_time + ''''
-
-INSERT INTO dbo.aml_model(RunId, ExperimentName, Model, LogFileText, WorkspaceName)
-EXEC dbo.AutoMLTrain @input_query= @TrainDataQuery,
-@label_column='demand',
-@task='forecasting',
-@iterations=10,
-@iteration_timeout_minutes=5,
-@time_column_name='timeStamp',
-@max_horizon=@max_horizon,
-@experiment_name='automl-sql-forecast',
-@primary_metric='normalized_root_mean_squared_error'
-
@@ -1,161 +0,0 @@
|
|||||||
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Train a model and use it for prediction\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"Before running this notebook, run the auto-ml-sql-setup.ipynb notebook."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Set the default database"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"USE [automl]\r\n",
|
|
||||||
"GO"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Use the AutoMLTrain stored procedure to create a forecasting model for the nyc_energy dataset."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"INSERT INTO dbo.aml_model(RunId, ExperimentName, Model, LogFileText, WorkspaceName)\r\n",
|
|
||||||
"EXEC dbo.AutoMLTrain @input_query='\r\n",
|
|
||||||
"SELECT CAST(timeStamp as NVARCHAR(30)) as timeStamp,\r\n",
|
|
||||||
" demand,\r\n",
|
|
||||||
"\t precip,\r\n",
|
|
||||||
"\t temp,\r\n",
|
|
||||||
"\t CASE WHEN timeStamp < ''2017-01-01'' THEN 0 ELSE 1 END AS is_validate_column\r\n",
|
|
||||||
"FROM nyc_energy\r\n",
|
|
||||||
"WHERE demand IS NOT NULL AND precip IS NOT NULL AND temp IS NOT NULL\r\n",
|
|
||||||
"and timeStamp < ''2017-02-01''',\r\n",
|
|
||||||
"@label_column='demand',\r\n",
|
|
||||||
"@task='forecasting',\r\n",
|
|
||||||
"@iterations=10,\r\n",
|
|
||||||
"@iteration_timeout_minutes=5,\r\n",
|
|
||||||
"@time_column_name='timeStamp',\r\n",
|
|
||||||
"@is_validate_column='is_validate_column',\r\n",
|
|
||||||
"@experiment_name='automl-sql-forecast',\r\n",
|
|
||||||
"@primary_metric='normalized_root_mean_squared_error'"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Use the AutoMLPredict stored procedure to predict using the forecasting model for the nyc_energy dataset."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"DECLARE @Model NVARCHAR(MAX) = (SELECT TOP 1 Model FROM dbo.aml_model\r\n",
|
|
||||||
" WHERE ExperimentName = 'automl-sql-forecast'\r\n",
|
|
||||||
"\t\t\t\t\t\t\t\tORDER BY CreatedDate DESC)\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"EXEC dbo.AutoMLPredict @input_query='\r\n",
|
|
||||||
"SELECT CAST(timeStamp AS NVARCHAR(30)) AS timeStamp,\r\n",
|
|
||||||
" demand,\r\n",
|
|
||||||
"\t precip,\r\n",
|
|
||||||
"\t temp\r\n",
|
|
||||||
"FROM nyc_energy\r\n",
|
|
||||||
"WHERE demand IS NOT NULL AND precip IS NOT NULL AND temp IS NOT NULL\r\n",
|
|
||||||
"AND timeStamp >= ''2017-02-01''',\r\n",
|
|
||||||
"@label_column='demand',\r\n",
|
|
||||||
"@model=@model\r\n",
|
|
||||||
"WITH RESULT SETS ((timeStamp NVARCHAR(30), actual_demand FLOAT, precip FLOAT, temp FLOAT, predicted_demand FLOAT))"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## List all the metrics for all iterations for the most recent training run."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"DECLARE @RunId NVARCHAR(43)\r\n",
|
|
||||||
"DECLARE @ExperimentName NVARCHAR(255)\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"SELECT TOP 1 @ExperimentName=ExperimentName, @RunId=SUBSTRING(RunId, 1, 43)\r\n",
|
|
||||||
"FROM aml_model\r\n",
|
|
||||||
"ORDER BY CreatedDate DESC\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"EXEC dbo.AutoMLGetMetrics @RunId, @ExperimentName"
|
|
||||||
]
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "jeffshep"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"category": "tutorial",
|
|
||||||
"compute": [
|
|
||||||
"Local"
|
|
||||||
],
|
|
||||||
"datasets": [
|
|
||||||
"NYC Energy"
|
|
||||||
],
|
|
||||||
"deployment": [
|
|
||||||
"None"
|
|
||||||
],
|
|
||||||
"exclude_from_index": false,
|
|
||||||
"framework": [
|
|
||||||
"Azure ML AutoML"
|
|
||||||
],
|
|
||||||
"tags": [
|
|
||||||
""
|
|
||||||
],
|
|
||||||
"friendly_name": "Forecasting with automated ML SQL integration",
|
|
||||||
"index_order": 1,
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "sql",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"name": "sql",
|
|
||||||
"version": ""
|
|
||||||
},
|
|
||||||
"task": "Forecasting"
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 2
|
|
||||||
}
|
|
||||||
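The AutoMLPredict cell above returns actual_demand and predicted_demand side by side, so the forecast can also be sanity-checked outside SQL. Below is a minimal pandas sketch of that check; the DataFrame shape and the range-based normalization are assumptions for illustration and may not match the exact normalization automated ML applies to normalized_root_mean_squared_error.

import numpy as np
import pandas as pd

def normalized_rmse(results: pd.DataFrame) -> float:
    # results is assumed to hold the AutoMLPredict result set:
    # timeStamp, actual_demand, precip, temp, predicted_demand
    errors = results["actual_demand"] - results["predicted_demand"]
    rmse = float(np.sqrt(np.mean(np.square(errors))))
    # Normalize by the range of the actuals (one common convention).
    return rmse / float(results["actual_demand"].max() - results["actual_demand"].min())

example = pd.DataFrame({"actual_demand": [5.0, 6.0, 7.0],
                        "predicted_demand": [5.2, 5.9, 6.8]})
print(normalized_rmse(example))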
@@ -1,92 +0,0 @@
|
|||||||
-- This procedure forecasts values based on a forecasting model returned by AutoMLTrain.
|
|
||||||
-- It returns a dataset with the forecasted values.
|
|
||||||
SET ANSI_NULLS ON
|
|
||||||
GO
|
|
||||||
SET QUOTED_IDENTIFIER ON
|
|
||||||
GO
|
|
||||||
CREATE OR ALTER PROCEDURE [dbo].[AutoMLForecast]
|
|
||||||
(
|
|
||||||
@input_query NVARCHAR(MAX), -- A SQL query returning data to predict on.
|
|
||||||
@model NVARCHAR(MAX), -- A model returned from AutoMLTrain.
|
|
||||||
@time_column_name NVARCHAR(255)='', -- The name of the timestamp column for forecasting.
|
|
||||||
@label_column NVARCHAR(255)='', -- Optional name of the column from input_query, which should be ignored when predicting
|
|
||||||
@y_query_column NVARCHAR(255)='', -- Optional value column that can be used for predicting.
|
|
||||||
-- If specified, this can contain values for past times (after the model was trained)
|
|
||||||
-- and NaN for future times.
|
|
||||||
@forecast_column_name NVARCHAR(255) = 'predicted'
|
|
||||||
-- The name of the output column containing the forecast value.
|
|
||||||
) AS
|
|
||||||
BEGIN
|
|
||||||
|
|
||||||
EXEC sp_execute_external_script @language = N'Python', @script = N'import pandas as pd
|
|
||||||
import azureml.core
|
|
||||||
import numpy as np
|
|
||||||
from azureml.train.automl import AutoMLConfig
|
|
||||||
import pickle
|
|
||||||
import codecs
|
|
||||||
|
|
||||||
model_obj = pickle.loads(codecs.decode(model.encode(), "base64"))
|
|
||||||
|
|
||||||
test_data = input_data.copy()
|
|
||||||
|
|
||||||
if label_column != "" and label_column is not None:
|
|
||||||
y_test = test_data.pop(label_column).values
|
|
||||||
else:
|
|
||||||
y_test = None
|
|
||||||
|
|
||||||
if y_query_column != "" and y_query_column is not None:
|
|
||||||
y_query = test_data.pop(y_query_column).values
|
|
||||||
else:
|
|
||||||
y_query = np.repeat(np.nan, len(test_data))
|
|
||||||
|
|
||||||
X_test = test_data
|
|
||||||
|
|
||||||
if time_column_name != "" and time_column_name is not None:
|
|
||||||
X_test[time_column_name] = pd.to_datetime(X_test[time_column_name])
|
|
||||||
|
|
||||||
y_fcst, X_trans = model_obj.forecast(X_test, y_query)
|
|
||||||
|
|
||||||
def align_outputs(y_forecast, X_trans, X_test, y_test, forecast_column_name):
|
|
||||||
# Demonstrates how to get the output aligned to the inputs
|
|
||||||
# using pandas indexes. Helps understand what happened if
|
|
||||||
# the output shape differs from the input shape, or if
|
|
||||||
# the data got re-sorted by time and grain during forecasting.
|
|
||||||
|
|
||||||
# Typical causes of misalignment are:
|
|
||||||
# * we predicted some periods that were missing in actuals -> drop from eval
|
|
||||||
# * model was asked to predict past max_horizon -> increase max horizon
|
|
||||||
# * data at start of X_test was needed for lags -> provide previous periods
|
|
||||||
|
|
||||||
df_fcst = pd.DataFrame({forecast_column_name : y_forecast})
|
|
||||||
# y and X outputs are aligned by forecast() function contract
|
|
||||||
df_fcst.index = X_trans.index
|
|
||||||
|
|
||||||
# align original X_test to y_test
|
|
||||||
X_test_full = X_test.copy()
|
|
||||||
if y_test is not None:
|
|
||||||
X_test_full[label_column] = y_test
|
|
||||||
|
|
||||||
# X_test_full does not include origin, so reset for merge
|
|
||||||
df_fcst.reset_index(inplace=True)
|
|
||||||
X_test_full = X_test_full.reset_index().drop(columns=''index'')
|
|
||||||
together = df_fcst.merge(X_test_full, how=''right'')
|
|
||||||
|
|
||||||
# drop rows where prediction or actuals are nan
|
|
||||||
# happens because of missing actuals
|
|
||||||
# or at edges of time due to lags/rolling windows
|
|
||||||
clean = together[together[[label_column, forecast_column_name]].notnull().all(axis=1)]
|
|
||||||
return(clean)
|
|
||||||
|
|
||||||
combined_output = align_outputs(y_fcst, X_trans, X_test, y_test, forecast_column_name)
|
|
||||||
|
|
||||||
'
|
|
||||||
, @input_data_1 = @input_query
|
|
||||||
, @input_data_1_name = N'input_data'
|
|
||||||
, @output_data_1_name = N'combined_output'
|
|
||||||
, @params = N'@model NVARCHAR(MAX), @time_column_name NVARCHAR(255), @label_column NVARCHAR(255), @y_query_column NVARCHAR(255), @forecast_column_name NVARCHAR(255)'
|
|
||||||
, @model = @model
|
|
||||||
, @time_column_name = @time_column_name
|
|
||||||
, @label_column = @label_column
|
|
||||||
, @y_query_column = @y_query_column
|
|
||||||
, @forecast_column_name = @forecast_column_name
|
|
||||||
END
|
|
||||||
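The align_outputs helper in the procedure above relies on the index contract of forecast(): the returned values line up with the transformed inputs, so they can be joined back onto the original rows and incomplete rows dropped. Here is a toy, self-contained sketch of the same idea with made-up numbers, using a plain index join instead of the merge used in the procedure.

import numpy as np
import pandas as pd

# Made-up test rows; one actual is missing, as happens with gaps in nyc_energy.
X_test = pd.DataFrame({
    "timeStamp": pd.date_range("2017-02-01", periods=4, freq="D"),
    "demand": [6.1, 5.8, np.nan, 6.3],
})
y_forecast = np.array([6.0, 5.9, 6.2, 6.1])

# Forecast values aligned to the input index, then joined back on that index.
df_fcst = pd.DataFrame({"predicted": y_forecast}, index=X_test.index)
together = df_fcst.join(X_test)

# Drop rows where either the actual or the prediction is missing.
clean = together[together[["demand", "predicted"]].notnull().all(axis=1)]
print(clean)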
@@ -1,70 +0,0 @@
|
|||||||
-- This procedure returns a list of metrics for each iteration of a run.
|
|
||||||
SET ANSI_NULLS ON
|
|
||||||
GO
|
|
||||||
SET QUOTED_IDENTIFIER ON
|
|
||||||
GO
|
|
||||||
CREATE OR ALTER PROCEDURE [dbo].[AutoMLGetMetrics]
|
|
||||||
(
|
|
||||||
@run_id NVARCHAR(250), -- The RunId
|
|
||||||
@experiment_name NVARCHAR(32)='automl-sql-test', -- This can be used to find the experiment in the Azure Portal.
|
|
||||||
@connection_name NVARCHAR(255)='default' -- The AML connection to use.
|
|
||||||
) AS
|
|
||||||
BEGIN
|
|
||||||
DECLARE @tenantid NVARCHAR(255)
|
|
||||||
DECLARE @appid NVARCHAR(255)
|
|
||||||
DECLARE @password NVARCHAR(255)
|
|
||||||
DECLARE @config_file NVARCHAR(255)
|
|
||||||
|
|
||||||
SELECT @tenantid=TenantId, @appid=AppId, @password=Password, @config_file=ConfigFile
|
|
||||||
FROM aml_connection
|
|
||||||
WHERE ConnectionName = @connection_name;
|
|
||||||
|
|
||||||
EXEC sp_execute_external_script @language = N'Python', @script = N'import pandas as pd
|
|
||||||
import logging
|
|
||||||
import azureml.core
|
|
||||||
import numpy as np
|
|
||||||
from azureml.core.experiment import Experiment
|
|
||||||
from azureml.train.automl.run import AutoMLRun
|
|
||||||
from azureml.core.authentication import ServicePrincipalAuthentication
|
|
||||||
from azureml.core.workspace import Workspace
|
|
||||||
|
|
||||||
auth = ServicePrincipalAuthentication(tenantid, appid, password)
|
|
||||||
|
|
||||||
ws = Workspace.from_config(path=config_file, auth=auth)
|
|
||||||
|
|
||||||
experiment = Experiment(ws, experiment_name)
|
|
||||||
|
|
||||||
ml_run = AutoMLRun(experiment = experiment, run_id = run_id)
|
|
||||||
|
|
||||||
children = list(ml_run.get_children())
|
|
||||||
iterationlist = []
|
|
||||||
metricnamelist = []
|
|
||||||
metricvaluelist = []
|
|
||||||
|
|
||||||
for run in children:
|
|
||||||
properties = run.get_properties()
|
|
||||||
if "iteration" in properties:
|
|
||||||
iteration = int(properties["iteration"])
|
|
||||||
for metric_name, metric_value in run.get_metrics().items():
|
|
||||||
if isinstance(metric_value, float):
|
|
||||||
iterationlist.append(iteration)
|
|
||||||
metricnamelist.append(metric_name)
|
|
||||||
metricvaluelist.append(metric_value)
|
|
||||||
|
|
||||||
metrics = pd.DataFrame({"iteration": iterationlist, "metric_name": metricnamelist, "metric_value": metricvaluelist})
|
|
||||||
'
|
|
||||||
, @output_data_1_name = N'metrics'
|
|
||||||
, @params = N'@run_id NVARCHAR(250),
|
|
||||||
@experiment_name NVARCHAR(32),
|
|
||||||
@tenantid NVARCHAR(255),
|
|
||||||
@appid NVARCHAR(255),
|
|
||||||
@password NVARCHAR(255),
|
|
||||||
@config_file NVARCHAR(255)'
|
|
||||||
, @run_id = @run_id
|
|
||||||
, @experiment_name = @experiment_name
|
|
||||||
, @tenantid = @tenantid
|
|
||||||
, @appid = @appid
|
|
||||||
, @password = @password
|
|
||||||
, @config_file = @config_file
|
|
||||||
WITH RESULT SETS ((iteration INT, metric_name NVARCHAR(100), metric_value FLOAT))
|
|
||||||
END
|
|
||||||
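AutoMLGetMetrics returns its metrics in long form (iteration, metric_name, metric_value). A small pandas sketch, using made-up values, shows one convenient way to pivot that result set when comparing iterations; the numbers are illustrative only.

import pandas as pd

# Toy long-form table shaped like the AutoMLGetMetrics result set.
metrics = pd.DataFrame({
    "iteration": [0, 0, 1, 1],
    "metric_name": ["normalized_root_mean_squared_error", "r2_score"] * 2,
    "metric_value": [0.08, 0.91, 0.07, 0.93],
})

# One row per iteration, one column per metric.
wide = metrics.pivot(index="iteration", columns="metric_name", values="metric_value")
print(wide.sort_values("normalized_root_mean_squared_error"))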
@@ -1,41 +0,0 @@
|
|||||||
-- This procedure predicts values based on a model returned by AutoMLTrain and a dataset.
|
|
||||||
-- It returns the dataset with a new column added, which is the predicted value.
|
|
||||||
SET ANSI_NULLS ON
|
|
||||||
GO
|
|
||||||
SET QUOTED_IDENTIFIER ON
|
|
||||||
GO
|
|
||||||
CREATE OR ALTER PROCEDURE [dbo].[AutoMLPredict]
|
|
||||||
(
|
|
||||||
@input_query NVARCHAR(MAX), -- A SQL query returning data to predict on.
|
|
||||||
@model NVARCHAR(MAX), -- A model returned from AutoMLTrain.
|
|
||||||
@label_column NVARCHAR(255)='' -- Optional name of the column from input_query, which should be ignored when predicting
|
|
||||||
) AS
|
|
||||||
BEGIN
|
|
||||||
|
|
||||||
EXEC sp_execute_external_script @language = N'Python', @script = N'import pandas as pd
|
|
||||||
import azureml.core
|
|
||||||
import numpy as np
|
|
||||||
from azureml.train.automl import AutoMLConfig
|
|
||||||
import pickle
|
|
||||||
import codecs
|
|
||||||
|
|
||||||
model_obj = pickle.loads(codecs.decode(model.encode(), "base64"))
|
|
||||||
|
|
||||||
test_data = input_data.copy()
|
|
||||||
|
|
||||||
if label_column != "" and label_column is not None:
|
|
||||||
y_test = test_data.pop(label_column).values
|
|
||||||
X_test = test_data
|
|
||||||
|
|
||||||
predicted = model_obj.predict(X_test)
|
|
||||||
|
|
||||||
combined_output = input_data.assign(predicted=predicted)
|
|
||||||
|
|
||||||
'
|
|
||||||
, @input_data_1 = @input_query
|
|
||||||
, @input_data_1_name = N'input_data'
|
|
||||||
, @output_data_1_name = N'combined_output'
|
|
||||||
, @params = N'@model NVARCHAR(MAX), @label_column NVARCHAR(255)'
|
|
||||||
, @model = @model
|
|
||||||
, @label_column = @label_column
|
|
||||||
END
|
|
||||||
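AutoMLTrain stores the fitted pipeline as a base64-encoded pickle string, and AutoMLPredict decodes it with the matching calls before predicting. The round trip can be exercised on its own; the scikit-learn estimator below is only a stand-in for the real AutoML pipeline.

import codecs
import pickle

from sklearn.linear_model import LinearRegression

# Stand-in for the fitted AutoML model; any picklable estimator behaves the same.
model = LinearRegression().fit([[0.0], [1.0]], [0.0, 1.0])

# Encoding used before the model is written to dbo.aml_model.Model.
encoded = codecs.encode(pickle.dumps(model), "base64").decode()

# Decoding used by AutoMLPredict / AutoMLForecast before calling predict().
restored = pickle.loads(codecs.decode(encoded.encode(), "base64"))
print(restored.predict([[2.0]]))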
@@ -1,240 +0,0 @@
|
|||||||
-- This stored procedure uses automated machine learning to train several models
|
|
||||||
-- and returns the best model.
|
|
||||||
--
|
|
||||||
-- The result set has several columns:
|
|
||||||
-- best_run - iteration ID for the best model
|
|
||||||
-- experiment_name - experiment name passed in with the @experiment_name parameter
|
|
||||||
-- fitted_model - best model found
|
|
||||||
-- log_file_text - AutoML debug_log contents
|
|
||||||
-- workspace - name of the Azure ML workspace where run history is stored
|
|
||||||
--
|
|
||||||
-- An example call for a classification problem is:
|
|
||||||
-- insert into dbo.aml_model(RunId, ExperimentName, Model, LogFileText, WorkspaceName)
|
|
||||||
-- exec dbo.AutoMLTrain @input_query='
|
|
||||||
-- SELECT top 100000
|
|
||||||
-- CAST([pickup_datetime] AS NVARCHAR(30)) AS pickup_datetime
|
|
||||||
-- ,CAST([dropoff_datetime] AS NVARCHAR(30)) AS dropoff_datetime
|
|
||||||
-- ,[passenger_count]
|
|
||||||
-- ,[trip_time_in_secs]
|
|
||||||
-- ,[trip_distance]
|
|
||||||
-- ,[payment_type]
|
|
||||||
-- ,[tip_class]
|
|
||||||
-- FROM [dbo].[nyctaxi_sample] order by [hack_license] ',
|
|
||||||
-- @label_column = 'tip_class',
|
|
||||||
-- @iterations=10
|
|
||||||
--
|
|
||||||
-- An example call for forecasting is:
|
|
||||||
-- insert into dbo.aml_model(RunId, ExperimentName, Model, LogFileText, WorkspaceName)
|
|
||||||
-- exec dbo.AutoMLTrain @input_query='
|
|
||||||
-- select cast(timeStamp as nvarchar(30)) as timeStamp,
|
|
||||||
-- demand,
|
|
||||||
-- precip,
|
|
||||||
-- temp,
|
|
||||||
-- case when timeStamp < ''2017-01-01'' then 0 else 1 end as is_validate_column
|
|
||||||
-- from nyc_energy
|
|
||||||
-- where demand is not null and precip is not null and temp is not null
|
|
||||||
-- and timeStamp < ''2017-02-01''',
|
|
||||||
-- @label_column='demand',
|
|
||||||
-- @task='forecasting',
|
|
||||||
-- @iterations=10,
|
|
||||||
-- @iteration_timeout_minutes=5,
|
|
||||||
-- @time_column_name='timeStamp',
|
|
||||||
-- @is_validate_column='is_validate_column',
|
|
||||||
-- @experiment_name='automl-sql-forecast',
|
|
||||||
-- @primary_metric='normalized_root_mean_squared_error'
|
|
||||||
|
|
||||||
SET ANSI_NULLS ON
|
|
||||||
GO
|
|
||||||
SET QUOTED_IDENTIFIER ON
|
|
||||||
GO
|
|
||||||
CREATE OR ALTER PROCEDURE [dbo].[AutoMLTrain]
|
|
||||||
(
|
|
||||||
@input_query NVARCHAR(MAX), -- The SQL Query that will return the data to train and validate the model.
|
|
||||||
@label_column NVARCHAR(255)='Label', -- The name of the column in the result of @input_query that is the label.
|
|
||||||
@primary_metric NVARCHAR(40)='AUC_weighted', -- The metric to optimize.
|
|
||||||
@iterations INT=100, -- The maximum number of pipelines to train.
|
|
||||||
@task NVARCHAR(40)='classification', -- The type of task. Can be classification, regression or forecasting.
|
|
||||||
@experiment_name NVARCHAR(32)='automl-sql-test', -- This can be used to find the experiment in the Azure Portal.
|
|
||||||
@iteration_timeout_minutes INT = 15, -- The maximum time in minutes for training a single pipeline.
|
|
||||||
@experiment_timeout_hours FLOAT = 1, -- The maximum time in hours for training all pipelines.
|
|
||||||
@n_cross_validations INT = 3, -- The number of cross validations.
|
|
||||||
@blacklist_models NVARCHAR(MAX) = '', -- A comma separated list of algos that will not be used.
|
|
||||||
-- The list of possible models can be found at:
|
|
||||||
-- https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-auto-train#configure-your-experiment-settings
|
|
||||||
@whitelist_models NVARCHAR(MAX) = '', -- A comma separated list of algos that can be used.
|
|
||||||
-- The list of possible models can be found at:
|
|
||||||
-- https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-auto-train#configure-your-experiment-settings
|
|
||||||
@experiment_exit_score FLOAT = 0,                        -- Stop the experiment if this score is achieved.
|
|
||||||
@sample_weight_column NVARCHAR(255)='', -- The name of the column in the result of @input_query that gives a sample weight.
|
|
||||||
@is_validate_column NVARCHAR(255)='', -- The name of the column in the result of @input_query that indicates if the row is for training or validation.
|
|
||||||
-- In the values of the column, 0 means for training and 1 means for validation.
|
|
||||||
@time_column_name NVARCHAR(255)='', -- The name of the timestamp column for forecasting.
|
|
||||||
@connection_name NVARCHAR(255)='default', -- The AML connection to use.
|
|
||||||
@max_horizon INT = 0 -- A forecast horizon is a time span into the future (or just beyond the latest date in the training data)
|
|
||||||
-- where forecasts of the target quantity are needed.
|
|
||||||
-- For example, if data is recorded daily and max_horizon is 5, we will predict 5 days ahead.
|
|
||||||
) AS
|
|
||||||
BEGIN
|
|
||||||
|
|
||||||
DECLARE @tenantid NVARCHAR(255)
|
|
||||||
DECLARE @appid NVARCHAR(255)
|
|
||||||
DECLARE @password NVARCHAR(255)
|
|
||||||
DECLARE @config_file NVARCHAR(255)
|
|
||||||
|
|
||||||
SELECT @tenantid=TenantId, @appid=AppId, @password=Password, @config_file=ConfigFile
|
|
||||||
FROM aml_connection
|
|
||||||
WHERE ConnectionName = @connection_name;
|
|
||||||
|
|
||||||
EXEC sp_execute_external_script @language = N'Python', @script = N'import pandas as pd
|
|
||||||
import logging
|
|
||||||
import azureml.core
|
|
||||||
import pandas as pd
|
|
||||||
import numpy as np
|
|
||||||
from azureml.core.experiment import Experiment
|
|
||||||
from azureml.train.automl import AutoMLConfig
|
|
||||||
from sklearn import datasets
|
|
||||||
import pickle
|
|
||||||
import codecs
|
|
||||||
from azureml.core.authentication import ServicePrincipalAuthentication
|
|
||||||
from azureml.core.workspace import Workspace
|
|
||||||
|
|
||||||
if __name__.startswith("sqlindb"):
|
|
||||||
auth = ServicePrincipalAuthentication(tenantid, appid, password)
|
|
||||||
|
|
||||||
ws = Workspace.from_config(path=config_file, auth=auth)
|
|
||||||
|
|
||||||
project_folder = "./sample_projects/" + experiment_name
|
|
||||||
|
|
||||||
experiment = Experiment(ws, experiment_name)
|
|
||||||
|
|
||||||
data_train = input_data
|
|
||||||
X_valid = None
|
|
||||||
y_valid = None
|
|
||||||
sample_weight_valid = None
|
|
||||||
|
|
||||||
if is_validate_column != "" and is_validate_column is not None:
|
|
||||||
data_train = input_data[input_data[is_validate_column] <= 0]
|
|
||||||
data_valid = input_data[input_data[is_validate_column] > 0]
|
|
||||||
data_train.pop(is_validate_column)
|
|
||||||
data_valid.pop(is_validate_column)
|
|
||||||
y_valid = data_valid.pop(label_column).values
|
|
||||||
if sample_weight_column != "" and sample_weight_column is not None:
|
|
||||||
sample_weight_valid = data_valid.pop(sample_weight_column).values
|
|
||||||
X_valid = data_valid
|
|
||||||
n_cross_validations = None
|
|
||||||
|
|
||||||
y_train = data_train.pop(label_column).values
|
|
||||||
|
|
||||||
sample_weight = None
|
|
||||||
if sample_weight_column != "" and sample_weight_column is not None:
|
|
||||||
sample_weight = data_train.pop(sample_weight_column).values
|
|
||||||
|
|
||||||
X_train = data_train
|
|
||||||
|
|
||||||
if experiment_timeout_hours == 0:
|
|
||||||
experiment_timeout_hours = None
|
|
||||||
|
|
||||||
if experiment_exit_score == 0:
|
|
||||||
experiment_exit_score = None
|
|
||||||
|
|
||||||
if blacklist_models == "":
|
|
||||||
blacklist_models = None
|
|
||||||
|
|
||||||
if blacklist_models is not None:
|
|
||||||
blacklist_models = blacklist_models.replace(" ", "").split(",")
|
|
||||||
|
|
||||||
if whitelist_models == "":
|
|
||||||
whitelist_models = None
|
|
||||||
|
|
||||||
if whitelist_models is not None:
|
|
||||||
whitelist_models = whitelist_models.replace(" ", "").split(",")
|
|
||||||
|
|
||||||
automl_settings = {}
|
|
||||||
preprocess = True
|
|
||||||
if time_column_name != "" and time_column_name is not None:
|
|
||||||
automl_settings = { "time_column_name": time_column_name }
|
|
||||||
preprocess = False
|
|
||||||
if max_horizon > 0:
|
|
||||||
automl_settings["max_horizon"] = max_horizon
|
|
||||||
|
|
||||||
log_file_name = "automl_sqlindb_errors.log"
|
|
||||||
|
|
||||||
automl_config = AutoMLConfig(task = task,
|
|
||||||
debug_log = log_file_name,
|
|
||||||
primary_metric = primary_metric,
|
|
||||||
iteration_timeout_minutes = iteration_timeout_minutes,
|
|
||||||
experiment_timeout_hours = experiment_timeout_hours,
|
|
||||||
iterations = iterations,
|
|
||||||
n_cross_validations = n_cross_validations,
|
|
||||||
preprocess = preprocess,
|
|
||||||
verbosity = logging.INFO,
|
|
||||||
X = X_train,
|
|
||||||
y = y_train,
|
|
||||||
path = project_folder,
|
|
||||||
blacklist_models = blacklist_models,
|
|
||||||
whitelist_models = whitelist_models,
|
|
||||||
experiment_exit_score = experiment_exit_score,
|
|
||||||
sample_weight = sample_weight,
|
|
||||||
X_valid = X_valid,
|
|
||||||
y_valid = y_valid,
|
|
||||||
sample_weight_valid = sample_weight_valid,
|
|
||||||
**automl_settings)
|
|
||||||
|
|
||||||
local_run = experiment.submit(automl_config, show_output = True)
|
|
||||||
|
|
||||||
best_run, fitted_model = local_run.get_output()
|
|
||||||
|
|
||||||
pickled_model = codecs.encode(pickle.dumps(fitted_model), "base64").decode()
|
|
||||||
|
|
||||||
log_file_text = ""
|
|
||||||
|
|
||||||
try:
|
|
||||||
with open(log_file_name, "r") as log_file:
|
|
||||||
log_file_text = log_file.read()
|
|
||||||
except:
|
|
||||||
log_file_text = "Log file not found"
|
|
||||||
|
|
||||||
returned_model = pd.DataFrame({"best_run": [best_run.id], "experiment_name": [experiment_name], "fitted_model": [pickled_model], "log_file_text": [log_file_text], "workspace": [ws.name]}, dtype=np.dtype(np.str))
|
|
||||||
'
|
|
||||||
, @input_data_1 = @input_query
|
|
||||||
, @input_data_1_name = N'input_data'
|
|
||||||
, @output_data_1_name = N'returned_model'
|
|
||||||
, @params = N'@label_column NVARCHAR(255),
|
|
||||||
@primary_metric NVARCHAR(40),
|
|
||||||
@iterations INT, @task NVARCHAR(40),
|
|
||||||
@experiment_name NVARCHAR(32),
|
|
||||||
@iteration_timeout_minutes INT,
|
|
||||||
@experiment_timeout_hours FLOAT,
|
|
||||||
@n_cross_validations INT,
|
|
||||||
@blacklist_models NVARCHAR(MAX),
|
|
||||||
@whitelist_models NVARCHAR(MAX),
|
|
||||||
@experiment_exit_score FLOAT,
|
|
||||||
@sample_weight_column NVARCHAR(255),
|
|
||||||
@is_validate_column NVARCHAR(255),
|
|
||||||
@time_column_name NVARCHAR(255),
|
|
||||||
@tenantid NVARCHAR(255),
|
|
||||||
@appid NVARCHAR(255),
|
|
||||||
@password NVARCHAR(255),
|
|
||||||
@config_file NVARCHAR(255),
|
|
||||||
@max_horizon INT'
|
|
||||||
, @label_column = @label_column
|
|
||||||
, @primary_metric = @primary_metric
|
|
||||||
, @iterations = @iterations
|
|
||||||
, @task = @task
|
|
||||||
, @experiment_name = @experiment_name
|
|
||||||
, @iteration_timeout_minutes = @iteration_timeout_minutes
|
|
||||||
, @experiment_timeout_hours = @experiment_timeout_hours
|
|
||||||
, @n_cross_validations = @n_cross_validations
|
|
||||||
, @blacklist_models = @blacklist_models
|
|
||||||
, @whitelist_models = @whitelist_models
|
|
||||||
, @experiment_exit_score = @experiment_exit_score
|
|
||||||
, @sample_weight_column = @sample_weight_column
|
|
||||||
, @is_validate_column = @is_validate_column
|
|
||||||
, @time_column_name = @time_column_name
|
|
||||||
, @tenantid = @tenantid
|
|
||||||
, @appid = @appid
|
|
||||||
, @password = @password
|
|
||||||
, @config_file = @config_file
|
|
||||||
, @max_horizon = @max_horizon
|
|
||||||
WITH RESULT SETS ((best_run NVARCHAR(250), experiment_name NVARCHAR(100), fitted_model VARCHAR(MAX), log_file_text NVARCHAR(MAX), workspace NVARCHAR(100)))
|
|
||||||
END
|
|
||||||
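The @is_validate_column handling inside AutoMLTrain is easy to lose in the procedure body, so here is the same split distilled into a few pandas lines on a made-up frame: rows marked 0 train the model, rows marked 1 validate it, and cross validation is switched off when an explicit validation set exists.

import pandas as pd

# Made-up frame shaped like the forecasting input query used earlier in this change.
input_data = pd.DataFrame({
    "timeStamp": ["2016-12-30", "2016-12-31", "2017-01-01", "2017-01-02"],
    "demand": [5.1, 5.3, 5.0, 5.2],
    "is_validate_column": [0, 0, 1, 1],
})

data_train = input_data[input_data["is_validate_column"] <= 0].copy()
data_valid = input_data[input_data["is_validate_column"] > 0].copy()
for frame in (data_train, data_valid):
    frame.pop("is_validate_column")

y_train = data_train.pop("demand").values
y_valid = data_valid.pop("demand").values
X_train, X_valid = data_train, data_valid
n_cross_validations = None  # mirrors the procedure when a validation set is supplied
print(len(X_train), len(X_valid))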
@@ -1,18 +0,0 @@
|
|||||||
-- This is a table to store the Azure ML connection information.
|
|
||||||
SET ANSI_NULLS ON
|
|
||||||
GO
|
|
||||||
|
|
||||||
SET QUOTED_IDENTIFIER ON
|
|
||||||
GO
|
|
||||||
|
|
||||||
CREATE TABLE [dbo].[aml_connection](
|
|
||||||
[Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
|
|
||||||
[ConnectionName] [nvarchar](255) NULL,
|
|
||||||
[TenantId] [nvarchar](255) NULL,
|
|
||||||
[AppId] [nvarchar](255) NULL,
|
|
||||||
[Password] [nvarchar](255) NULL,
|
|
||||||
[ConfigFile] [nvarchar](255) NULL
|
|
||||||
) ON [PRIMARY]
|
|
||||||
GO
|
|
||||||
|
|
||||||
|
|
||||||
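The columns of dbo.aml_connection map one-to-one onto the authentication calls the procedures make. A minimal sketch of that hand-off is below, using the same placeholder values as the setup notebook later in this change; replace them with real ones.

from azureml.core.authentication import ServicePrincipalAuthentication
from azureml.core.workspace import Workspace

# Placeholder values matching the row inserted into dbo.aml_connection during setup.
tenantid = "11111111-2222-3333-4444-555555555555"   # TenantId
appid = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"      # AppId
password = "insertpasswordhere"                     # Password
config_file = "/tmp/aml/config.json"                # ConfigFile

auth = ServicePrincipalAuthentication(tenantid, appid, password)
ws = Workspace.from_config(path=config_file, auth=auth)
print(ws.name)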
@@ -1,22 +0,0 @@
|
|||||||
-- This is a table to hold the results from the AutoMLTrain procedure.
|
|
||||||
SET ANSI_NULLS ON
|
|
||||||
GO
|
|
||||||
|
|
||||||
SET QUOTED_IDENTIFIER ON
|
|
||||||
GO
|
|
||||||
|
|
||||||
CREATE TABLE [dbo].[aml_model](
|
|
||||||
[Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
|
|
||||||
[Model] [varchar](max) NOT NULL, -- The model, which can be passed to AutoMLPredict for testing or prediction.
|
|
||||||
[RunId] [nvarchar](250) NULL, -- The RunId, which can be used to view the model in the Azure Portal.
|
|
||||||
[CreatedDate] [datetime] NULL,
|
|
||||||
[ExperimentName] [nvarchar](100) NULL, -- Azure ML Experiment Name
|
|
||||||
[WorkspaceName] [nvarchar](100) NULL, -- Azure ML Workspace Name
|
|
||||||
[LogFileText] [nvarchar](max) NULL
|
|
||||||
)
|
|
||||||
GO
|
|
||||||
|
|
||||||
ALTER TABLE [dbo].[aml_model] ADD DEFAULT (getutcdate()) FOR [CreatedDate]
|
|
||||||
GO
|
|
||||||
|
|
||||||
|
|
||||||
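Rows in dbo.aml_model can also be consumed from a client, the same way the forecast notebook selects the newest model for an experiment. The sketch below is hypothetical: pyodbc and the DSN name are assumptions, not part of this change; only the query shape and the decoding mirror the procedures above.

import codecs
import pickle

import pyodbc  # assumed client library, not used by the stored procedures

conn = pyodbc.connect("DSN=automl")  # hypothetical DSN for the 'automl' database
row = conn.execute(
    "SELECT TOP 1 Model FROM dbo.aml_model "
    "WHERE ExperimentName = 'automl-sql-forecast' "
    "ORDER BY CreatedDate DESC"
).fetchone()

# Same decoding as AutoMLPredict.
model = pickle.loads(codecs.decode(row.Model.encode(), "base64"))
print(type(model))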
@@ -1,581 +0,0 @@
|
|||||||
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Set up Azure ML Automated Machine Learning on SQL Server 2019 CTP 2.4 big data cluster\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"\\# Prerequisites: \r\n",
|
|
||||||
"\\# - An Azure subscription and resource group \r\n",
|
|
||||||
"\\# - An Azure Machine Learning workspace \r\n",
|
|
||||||
"\\# - A SQL Server 2019 CTP 2.4 big data cluster with Internet access and a database named 'automl' \r\n",
|
|
||||||
"\\# - Azure CLI \r\n",
|
|
||||||
"\\# - kubectl command \r\n",
|
|
||||||
"\\# - The https://github.com/Azure/MachineLearningNotebooks repository downloaded (cloned) to your local machine\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"\\# In the 'automl' database, create a table named 'dbo.nyc_energy' as follows: \r\n",
|
|
||||||
"\\# - In SQL Server Management Studio, right-click the 'automl' database, select Tasks, then Import Flat File. \r\n",
|
|
||||||
"\\# - Select the file AzureMlCli\\notebooks\\how-to-use-azureml\\automated-machine-learning\\forecasting-energy-demand\\nyc_energy.csv. \r\n",
|
|
||||||
"\\# - Using the \"Modify Columns\" page, allow nulls for all columns. \r\n",
|
|
||||||
"\r\n",
|
|
||||||
"\\# Create an Azure Machine Learning Workspace using the instructions at https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-workspace \r\n",
|
|
||||||
"\r\n",
|
|
||||||
"\\# Create an Azure service principal. You can do this with the following commands: \r\n",
|
|
||||||
"\r\n",
|
|
||||||
"az login \r\n",
|
|
||||||
"az account set --subscription *subscriptionid* \r\n",
|
|
||||||
"\r\n",
|
|
||||||
"\\# The following command prints out the **appId** and **tenant**, \r\n",
|
|
||||||
"\\# which you insert into the indicated cell later in this notebook \r\n",
|
|
||||||
"\\# to allow AutoML to authenticate with Azure: \r\n",
|
|
||||||
"\r\n",
|
|
||||||
"az ad sp create-for-rbac --name *principlename* --password *password*\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"\\# Log into the master instance of SQL Server 2019 CTP 2.4: \r\n",
|
|
||||||
"kubectl exec -it mssql-master-pool-0 -n *clustername* -c mssql-server -- /bin/bash\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"mkdir /tmp/aml\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"cd /tmp/aml\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"\\# **Modify** the following with your subscription_id, resource_group, and workspace_name: \r\n",
|
|
||||||
"cat > config.json << EOF \r\n",
|
|
||||||
"{ \r\n",
|
|
||||||
" \"subscription_id\": \"123456ab-78cd-0123-45ef-abcd12345678\", \r\n",
|
|
||||||
" \"resource_group\": \"myrg1\", \r\n",
|
|
||||||
" \"workspace_name\": \"myws1\" \r\n",
|
|
||||||
"} \r\n",
|
|
||||||
"EOF\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"\\# The directory referenced below is appropriate for the master instance of SQL Server 2019 CTP 2.4.\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"cd /opt/mssql/mlservices/runtime/python/bin\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"./python -m pip install azureml-sdk[automl]\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"./python -m pip install --upgrade numpy \r\n",
|
|
||||||
"\r\n",
|
|
||||||
"./python -m pip install --upgrade sklearn\r\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"-- Enable external scripts to allow invoking Python\r\n",
|
|
||||||
"sp_configure 'external scripts enabled',1 \r\n",
|
|
||||||
"reconfigure with override \r\n",
|
|
||||||
"GO\r\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"-- Use database 'automl'\r\n",
|
|
||||||
"USE [automl]\r\n",
|
|
||||||
"GO"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"-- This is a table to hold the Azure ML connection information.\r\n",
|
|
||||||
"SET ANSI_NULLS ON\r\n",
|
|
||||||
"GO\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"SET QUOTED_IDENTIFIER ON\r\n",
|
|
||||||
"GO\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"CREATE TABLE [dbo].[aml_connection](\r\n",
|
|
||||||
" [Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,\r\n",
|
|
||||||
"\t[ConnectionName] [nvarchar](255) NULL,\r\n",
|
|
||||||
"\t[TenantId] [nvarchar](255) NULL,\r\n",
|
|
||||||
"\t[AppId] [nvarchar](255) NULL,\r\n",
|
|
||||||
"\t[Password] [nvarchar](255) NULL,\r\n",
|
|
||||||
"\t[ConfigFile] [nvarchar](255) NULL\r\n",
|
|
||||||
") ON [PRIMARY]\r\n",
|
|
||||||
"GO"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Copy the values from create-for-rbac above into the cell below"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"-- Use the following values:\r\n",
|
|
||||||
"-- Leave the name as 'Default'\r\n",
|
|
||||||
"-- Insert <tenant> returned by create-for-rbac above\r\n",
|
|
||||||
"-- Insert <AppId> returned by create-for-rbac above\r\n",
|
|
||||||
"-- Insert <password> used in create-for-rbac above\r\n",
|
|
||||||
"-- Leave <path> as '/tmp/aml/config.json'\r\n",
|
|
||||||
"INSERT INTO [dbo].[aml_connection] \r\n",
|
|
||||||
"VALUES (\r\n",
|
|
||||||
" N'Default', -- Name\r\n",
|
|
||||||
" N'11111111-2222-3333-4444-555555555555', -- Tenant\r\n",
|
|
||||||
" N'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee', -- AppId\r\n",
|
|
||||||
" N'insertpasswordhere', -- Password\r\n",
|
|
||||||
" N'/tmp/aml/config.json' -- Path\r\n",
|
|
||||||
" );\r\n",
|
|
||||||
"GO"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"-- This is a table to hold the results from the AutoMLTrain procedure.\r\n",
|
|
||||||
"SET ANSI_NULLS ON\r\n",
|
|
||||||
"GO\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"SET QUOTED_IDENTIFIER ON\r\n",
|
|
||||||
"GO\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"CREATE TABLE [dbo].[aml_model](\r\n",
|
|
||||||
" [Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,\r\n",
|
|
||||||
" [Model] [varchar](max) NOT NULL, -- The model, which can be passed to AutoMLPredict for testing or prediction.\r\n",
|
|
||||||
" [RunId] [nvarchar](250) NULL, -- The RunId, which can be used to view the model in the Azure Portal.\r\n",
|
|
||||||
" [CreatedDate] [datetime] NULL,\r\n",
|
|
||||||
" [ExperimentName] [nvarchar](100) NULL, -- Azure ML Experiment Name\r\n",
|
|
||||||
" [WorkspaceName] [nvarchar](100) NULL, -- Azure ML Workspace Name\r\n",
|
|
||||||
"\t[LogFileText] [nvarchar](max) NULL\r\n",
|
|
||||||
") \r\n",
|
|
||||||
"GO\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"ALTER TABLE [dbo].[aml_model] ADD DEFAULT (getutcdate()) FOR [CreatedDate]\r\n",
|
|
||||||
"GO\r\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"-- This stored procedure uses automated machine learning to train several models\r\n",
|
|
||||||
"-- and return the best model.\r\n",
|
|
||||||
"--\r\n",
|
|
||||||
"-- The result set has several columns:\r\n",
|
|
||||||
"-- best_run - ID of the best model found\r\n",
|
|
||||||
"-- experiment_name - training run name\r\n",
|
|
||||||
"-- fitted_model - best model found\r\n",
|
|
||||||
"-- log_file_text - console output\r\n",
|
|
||||||
"-- workspace - name of the Azure ML workspace where run history is stored\r\n",
|
|
||||||
"--\r\n",
|
|
||||||
"-- An example call for a classification problem is:\r\n",
|
|
||||||
"-- insert into dbo.aml_model(RunId, ExperimentName, Model, LogFileText, WorkspaceName)\r\n",
|
|
||||||
"-- exec dbo.AutoMLTrain @input_query='\r\n",
|
|
||||||
"-- SELECT top 100000 \r\n",
|
|
||||||
"-- CAST([pickup_datetime] AS NVARCHAR(30)) AS pickup_datetime\r\n",
|
|
||||||
"-- ,CAST([dropoff_datetime] AS NVARCHAR(30)) AS dropoff_datetime\r\n",
|
|
||||||
"-- ,[passenger_count]\r\n",
|
|
||||||
"-- ,[trip_time_in_secs]\r\n",
|
|
||||||
"-- ,[trip_distance]\r\n",
|
|
||||||
"-- ,[payment_type]\r\n",
|
|
||||||
"-- ,[tip_class]\r\n",
|
|
||||||
"-- FROM [dbo].[nyctaxi_sample] order by [hack_license] ',\r\n",
|
|
||||||
"-- @label_column = 'tip_class',\r\n",
|
|
||||||
"-- @iterations=10\r\n",
|
|
||||||
"-- \r\n",
|
|
||||||
"-- An example call for forecasting is:\r\n",
|
|
||||||
"-- insert into dbo.aml_model(RunId, ExperimentName, Model, LogFileText, WorkspaceName)\r\n",
|
|
||||||
"-- exec dbo.AutoMLTrain @input_query='\r\n",
|
|
||||||
"-- select cast(timeStamp as nvarchar(30)) as timeStamp,\r\n",
|
|
||||||
"-- demand,\r\n",
|
|
||||||
"-- \t precip,\r\n",
|
|
||||||
"-- \t temp,\r\n",
|
|
||||||
"-- case when timeStamp < ''2017-01-01'' then 0 else 1 end as is_validate_column\r\n",
|
|
||||||
"-- from nyc_energy\r\n",
|
|
||||||
"-- where demand is not null and precip is not null and temp is not null\r\n",
|
|
||||||
"-- and timeStamp < ''2017-02-01''',\r\n",
|
|
||||||
"-- @label_column='demand',\r\n",
|
|
||||||
"-- @task='forecasting',\r\n",
|
|
||||||
"-- @iterations=10,\r\n",
|
|
||||||
"-- @iteration_timeout_minutes=5,\r\n",
|
|
||||||
"-- @time_column_name='timeStamp',\r\n",
|
|
||||||
"-- @is_validate_column='is_validate_column',\r\n",
|
|
||||||
"-- @experiment_name='automl-sql-forecast',\r\n",
|
|
||||||
"-- @primary_metric='normalized_root_mean_squared_error'\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"SET ANSI_NULLS ON\r\n",
|
|
||||||
"GO\r\n",
|
|
||||||
"SET QUOTED_IDENTIFIER ON\r\n",
|
|
||||||
"GO\r\n",
|
|
||||||
"CREATE OR ALTER PROCEDURE [dbo].[AutoMLTrain]\r\n",
|
|
||||||
" (\r\n",
|
|
||||||
" @input_query NVARCHAR(MAX), -- The SQL Query that will return the data to train and validate the model.\r\n",
|
|
||||||
" @label_column NVARCHAR(255)='Label', -- The name of the column in the result of @input_query that is the label.\r\n",
|
|
||||||
" @primary_metric NVARCHAR(40)='AUC_weighted', -- The metric to optimize.\r\n",
|
|
||||||
" @iterations INT=100, -- The maximum number of pipelines to train.\r\n",
|
|
||||||
" @task NVARCHAR(40)='classification', -- The type of task. Can be classification, regression or forecasting.\r\n",
|
|
||||||
" @experiment_name NVARCHAR(32)='automl-sql-test', -- This can be used to find the experiment in the Azure Portal.\r\n",
|
|
||||||
" @iteration_timeout_minutes INT = 15, -- The maximum time in minutes for training a single pipeline. \r\n",
|
|
||||||
" @experiment_timeout_hours FLOAT = 1, -- The maximum time in hours for training all pipelines.\r\n",
|
|
||||||
" @n_cross_validations INT = 3, -- The number of cross validations.\r\n",
|
|
||||||
" @blacklist_models NVARCHAR(MAX) = '', -- A comma separated list of algos that will not be used.\r\n",
|
|
||||||
" -- The list of possible models can be found at:\r\n",
|
|
||||||
" -- https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-auto-train#configure-your-experiment-settings\r\n",
|
|
||||||
" @whitelist_models NVARCHAR(MAX) = '', -- A comma separated list of algos that can be used.\r\n",
|
|
||||||
" -- The list of possible models can be found at:\r\n",
|
|
||||||
" -- https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-auto-train#configure-your-experiment-settings\r\n",
|
|
||||||
" @experiment_exit_score FLOAT = 0, -- Stop the experiment if this score is acheived.\r\n",
|
|
||||||
" @sample_weight_column NVARCHAR(255)='', -- The name of the column in the result of @input_query that gives a sample weight.\r\n",
|
|
||||||
" @is_validate_column NVARCHAR(255)='', -- The name of the column in the result of @input_query that indicates if the row is for training or validation.\r\n",
|
|
||||||
"\t -- In the values of the column, 0 means for training and 1 means for validation.\r\n",
|
|
||||||
" @time_column_name NVARCHAR(255)='', -- The name of the timestamp column for forecasting.\r\n",
|
|
||||||
"\t@connection_name NVARCHAR(255)='default' -- The AML connection to use.\r\n",
|
|
||||||
" ) AS\r\n",
|
|
||||||
"BEGIN\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" DECLARE @tenantid NVARCHAR(255)\r\n",
|
|
||||||
" DECLARE @appid NVARCHAR(255)\r\n",
|
|
||||||
" DECLARE @password NVARCHAR(255)\r\n",
|
|
||||||
" DECLARE @config_file NVARCHAR(255)\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"\tSELECT @tenantid=TenantId, @appid=AppId, @password=Password, @config_file=ConfigFile\r\n",
|
|
||||||
"\tFROM aml_connection\r\n",
|
|
||||||
"\tWHERE ConnectionName = @connection_name;\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"\tEXEC sp_execute_external_script @language = N'Python', @script = N'import pandas as pd\r\n",
|
|
||||||
"import logging \r\n",
|
|
||||||
"import azureml.core \r\n",
|
|
||||||
"import pandas as pd\r\n",
|
|
||||||
"import numpy as np\r\n",
|
|
||||||
"from azureml.core.experiment import Experiment \r\n",
|
|
||||||
"from azureml.train.automl import AutoMLConfig \r\n",
|
|
||||||
"from sklearn import datasets \r\n",
|
|
||||||
"import pickle\r\n",
|
|
||||||
"import codecs\r\n",
|
|
||||||
"from azureml.core.authentication import ServicePrincipalAuthentication \r\n",
|
|
||||||
"from azureml.core.workspace import Workspace \r\n",
|
|
||||||
"\r\n",
|
|
||||||
"if __name__.startswith(\"sqlindb\"):\r\n",
|
|
||||||
" auth = ServicePrincipalAuthentication(tenantid, appid, password) \r\n",
|
|
||||||
" \r\n",
|
|
||||||
" ws = Workspace.from_config(path=config_file, auth=auth) \r\n",
|
|
||||||
" \r\n",
|
|
||||||
" project_folder = \"./sample_projects/\" + experiment_name\r\n",
|
|
||||||
" \r\n",
|
|
||||||
" experiment = Experiment(ws, experiment_name) \r\n",
|
|
||||||
"\r\n",
|
|
||||||
" data_train = input_data\r\n",
|
|
||||||
" X_valid = None\r\n",
|
|
||||||
" y_valid = None\r\n",
|
|
||||||
" sample_weight_valid = None\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" if is_validate_column != \"\" and is_validate_column is not None:\r\n",
|
|
||||||
" data_train = input_data[input_data[is_validate_column] <= 0]\r\n",
|
|
||||||
" data_valid = input_data[input_data[is_validate_column] > 0]\r\n",
|
|
||||||
" data_train.pop(is_validate_column)\r\n",
|
|
||||||
" data_valid.pop(is_validate_column)\r\n",
|
|
||||||
" y_valid = data_valid.pop(label_column).values\r\n",
|
|
||||||
" if sample_weight_column != \"\" and sample_weight_column is not None:\r\n",
|
|
||||||
" sample_weight_valid = data_valid.pop(sample_weight_column).values\r\n",
|
|
||||||
" X_valid = data_valid\r\n",
|
|
||||||
" n_cross_validations = None\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" y_train = data_train.pop(label_column).values\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" sample_weight = None\r\n",
|
|
||||||
" if sample_weight_column != \"\" and sample_weight_column is not None:\r\n",
|
|
||||||
" sample_weight = data_train.pop(sample_weight_column).values\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" X_train = data_train\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" if experiment_timeout_hours == 0:\r\n",
|
|
||||||
" experiment_timeout_hours = None\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" if experiment_exit_score == 0:\r\n",
|
|
||||||
" experiment_exit_score = None\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" if blacklist_models == \"\":\r\n",
|
|
||||||
" blacklist_models = None\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" if blacklist_models is not None:\r\n",
|
|
||||||
" blacklist_models = blacklist_models.replace(\" \", \"\").split(\",\")\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" if whitelist_models == \"\":\r\n",
|
|
||||||
" whitelist_models = None\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" if whitelist_models is not None:\r\n",
|
|
||||||
" whitelist_models = whitelist_models.replace(\" \", \"\").split(\",\")\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" automl_settings = {}\r\n",
|
|
||||||
" preprocess = True\r\n",
|
|
||||||
" if time_column_name != \"\" and time_column_name is not None:\r\n",
|
|
||||||
" automl_settings = { \"time_column_name\": time_column_name }\r\n",
|
|
||||||
" preprocess = False\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" log_file_name = \"automl_errors.log\"\r\n",
|
|
||||||
"\t \r\n",
|
|
||||||
" automl_config = AutoMLConfig(task = task, \r\n",
|
|
||||||
" debug_log = log_file_name, \r\n",
|
|
||||||
" primary_metric = primary_metric, \r\n",
|
|
||||||
" iteration_timeout_minutes = iteration_timeout_minutes, \r\n",
|
|
||||||
" experiment_timeout_hours = experiment_timeout_hours,\r\n",
|
|
||||||
" iterations = iterations, \r\n",
|
|
||||||
" n_cross_validations = n_cross_validations, \r\n",
|
|
||||||
" preprocess = preprocess,\r\n",
|
|
||||||
" verbosity = logging.INFO, \r\n",
|
|
||||||
" X = X_train, \r\n",
|
|
||||||
" y = y_train, \r\n",
|
|
||||||
" path = project_folder,\r\n",
|
|
||||||
" blacklist_models = blacklist_models,\r\n",
|
|
||||||
" whitelist_models = whitelist_models,\r\n",
|
|
||||||
" experiment_exit_score = experiment_exit_score,\r\n",
|
|
||||||
" sample_weight = sample_weight,\r\n",
|
|
||||||
" X_valid = X_valid,\r\n",
|
|
||||||
" y_valid = y_valid,\r\n",
|
|
||||||
" sample_weight_valid = sample_weight_valid,\r\n",
|
|
||||||
" **automl_settings) \r\n",
|
|
||||||
" \r\n",
|
|
||||||
" local_run = experiment.submit(automl_config, show_output = True) \r\n",
|
|
||||||
"\r\n",
|
|
||||||
" best_run, fitted_model = local_run.get_output()\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" pickled_model = codecs.encode(pickle.dumps(fitted_model), \"base64\").decode()\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" log_file_text = \"\"\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" try:\r\n",
|
|
||||||
" with open(log_file_name, \"r\") as log_file:\r\n",
|
|
||||||
" log_file_text = log_file.read()\r\n",
|
|
||||||
" except:\r\n",
|
|
||||||
" log_file_text = \"Log file not found\"\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" returned_model = pd.DataFrame({\"best_run\": [best_run.id], \"experiment_name\": [experiment_name], \"fitted_model\": [pickled_model], \"log_file_text\": [log_file_text], \"workspace\": [ws.name]}, dtype=np.dtype(np.str))\r\n",
|
|
||||||
"'\r\n",
|
|
||||||
"\t, @input_data_1 = @input_query\r\n",
|
|
||||||
"\t, @input_data_1_name = N'input_data'\r\n",
|
|
||||||
"\t, @output_data_1_name = N'returned_model'\r\n",
|
|
||||||
"\t, @params = N'@label_column NVARCHAR(255), \r\n",
|
|
||||||
"\t @primary_metric NVARCHAR(40),\r\n",
|
|
||||||
"\t\t\t\t @iterations INT, @task NVARCHAR(40),\r\n",
|
|
||||||
"\t\t\t\t @experiment_name NVARCHAR(32),\r\n",
|
|
||||||
"\t\t\t\t @iteration_timeout_minutes INT,\r\n",
|
|
||||||
"\t\t\t\t @experiment_timeout_hours FLOAT,\r\n",
|
|
||||||
"\t\t\t\t @n_cross_validations INT,\r\n",
|
|
||||||
"\t\t\t\t @blacklist_models NVARCHAR(MAX),\r\n",
|
|
||||||
"\t\t\t\t @whitelist_models NVARCHAR(MAX),\r\n",
|
|
||||||
"\t\t\t\t @experiment_exit_score FLOAT,\r\n",
|
|
||||||
"\t\t\t\t @sample_weight_column NVARCHAR(255),\r\n",
|
|
||||||
"\t\t\t\t @is_validate_column NVARCHAR(255),\r\n",
|
|
||||||
"\t\t\t\t @time_column_name NVARCHAR(255),\r\n",
|
|
||||||
"\t\t\t\t @tenantid NVARCHAR(255),\r\n",
|
|
||||||
"\t\t\t\t @appid NVARCHAR(255),\r\n",
|
|
||||||
"\t\t\t\t @password NVARCHAR(255),\r\n",
|
|
||||||
"\t\t\t\t @config_file NVARCHAR(255)'\r\n",
|
|
||||||
"\t, @label_column = @label_column\r\n",
|
|
||||||
"\t, @primary_metric = @primary_metric\r\n",
|
|
||||||
"\t, @iterations = @iterations\r\n",
|
|
||||||
"\t, @task = @task\r\n",
|
|
||||||
"\t, @experiment_name = @experiment_name\r\n",
|
|
||||||
"\t, @iteration_timeout_minutes = @iteration_timeout_minutes\r\n",
|
|
||||||
"\t, @experiment_timeout_hours = @experiment_timeout_hours\r\n",
|
|
||||||
"\t, @n_cross_validations = @n_cross_validations\r\n",
|
|
||||||
"\t, @blacklist_models = @blacklist_models\r\n",
|
|
||||||
"\t, @whitelist_models = @whitelist_models\r\n",
|
|
||||||
"\t, @experiment_exit_score = @experiment_exit_score\r\n",
|
|
||||||
"\t, @sample_weight_column = @sample_weight_column\r\n",
|
|
||||||
"\t, @is_validate_column = @is_validate_column\r\n",
|
|
||||||
"\t, @time_column_name = @time_column_name\r\n",
|
|
||||||
"\t, @tenantid = @tenantid\r\n",
|
|
||||||
"\t, @appid = @appid\r\n",
|
|
||||||
"\t, @password = @password\r\n",
|
|
||||||
"\t, @config_file = @config_file\r\n",
|
|
||||||
"WITH RESULT SETS ((best_run NVARCHAR(250), experiment_name NVARCHAR(100), fitted_model VARCHAR(MAX), log_file_text NVARCHAR(MAX), workspace NVARCHAR(100)))\r\n",
|
|
||||||
"END"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"-- This procedure returns a list of metrics for each iteration of a training run.\r\n",
|
|
||||||
"SET ANSI_NULLS ON\r\n",
|
|
||||||
"GO\r\n",
|
|
||||||
"SET QUOTED_IDENTIFIER ON\r\n",
|
|
||||||
"GO\r\n",
|
|
||||||
"CREATE OR ALTER PROCEDURE [dbo].[AutoMLGetMetrics]\r\n",
|
|
||||||
" (\r\n",
|
|
||||||
"\t@run_id NVARCHAR(250), -- The RunId\r\n",
|
|
||||||
" @experiment_name NVARCHAR(32)='automl-sql-test', -- This can be used to find the experiment in the Azure Portal.\r\n",
|
|
||||||
" @connection_name NVARCHAR(255)='default' -- The AML connection to use.\r\n",
|
|
||||||
" ) AS\r\n",
|
|
||||||
"BEGIN\r\n",
|
|
||||||
" DECLARE @tenantid NVARCHAR(255)\r\n",
|
|
||||||
" DECLARE @appid NVARCHAR(255)\r\n",
|
|
||||||
" DECLARE @password NVARCHAR(255)\r\n",
|
|
||||||
" DECLARE @config_file NVARCHAR(255)\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"\tSELECT @tenantid=TenantId, @appid=AppId, @password=Password, @config_file=ConfigFile\r\n",
|
|
||||||
"\tFROM aml_connection\r\n",
|
|
||||||
"\tWHERE ConnectionName = @connection_name;\r\n",
|
|
||||||
"\r\n",
|
|
||||||
" EXEC sp_execute_external_script @language = N'Python', @script = N'import pandas as pd\r\n",
|
|
||||||
"import logging \r\n",
|
|
||||||
"import azureml.core \r\n",
|
|
||||||
"import numpy as np\r\n",
|
|
||||||
"from azureml.core.experiment import Experiment \r\n",
|
|
||||||
"from azureml.train.automl.run import AutoMLRun\r\n",
|
|
||||||
"from azureml.core.authentication import ServicePrincipalAuthentication \r\n",
|
|
||||||
"from azureml.core.workspace import Workspace \r\n",
|
|
||||||
"\r\n",
|
|
||||||
"auth = ServicePrincipalAuthentication(tenantid, appid, password) \r\n",
|
|
||||||
" \r\n",
|
|
||||||
"ws = Workspace.from_config(path=config_file, auth=auth) \r\n",
|
|
||||||
" \r\n",
|
|
||||||
"experiment = Experiment(ws, experiment_name) \r\n",
|
|
||||||
"\r\n",
|
|
||||||
"ml_run = AutoMLRun(experiment = experiment, run_id = run_id)\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"children = list(ml_run.get_children())\r\n",
|
|
||||||
"iterationlist = []\r\n",
|
|
||||||
"metricnamelist = []\r\n",
|
|
||||||
"metricvaluelist = []\r\n",
|
|
||||||
"\r\n",
|
|
||||||
"for run in children:\r\n",
|
|
||||||
" properties = run.get_properties()\r\n",
|
|
||||||
" if \"iteration\" in properties:\r\n",
|
|
||||||
" iteration = int(properties[\"iteration\"])\r\n",
|
|
||||||
" for metric_name, metric_value in run.get_metrics().items():\r\n",
|
|
||||||
" if isinstance(metric_value, float):\r\n",
|
|
||||||
" iterationlist.append(iteration)\r\n",
|
|
||||||
" metricnamelist.append(metric_name)\r\n",
|
|
||||||
" metricvaluelist.append(metric_value)\r\n",
|
|
||||||
" \r\n",
|
|
||||||
"metrics = pd.DataFrame({\"iteration\": iterationlist, \"metric_name\": metricnamelist, \"metric_value\": metricvaluelist})\r\n",
|
|
||||||
"'\r\n",
|
|
||||||
" , @output_data_1_name = N'metrics'\r\n",
|
|
||||||
"\t, @params = N'@run_id NVARCHAR(250), \r\n",
|
|
||||||
"\t\t\t\t @experiment_name NVARCHAR(32),\r\n",
|
|
||||||
" \t\t\t\t @tenantid NVARCHAR(255),\r\n",
|
|
||||||
"\t\t\t\t @appid NVARCHAR(255),\r\n",
|
|
||||||
"\t\t\t\t @password NVARCHAR(255),\r\n",
|
|
||||||
"\t\t\t\t @config_file NVARCHAR(255)'\r\n",
|
|
||||||
" , @run_id = @run_id\r\n",
|
|
||||||
"\t, @experiment_name = @experiment_name\r\n",
|
|
||||||
"\t, @tenantid = @tenantid\r\n",
|
|
||||||
"\t, @appid = @appid\r\n",
|
|
||||||
"\t, @password = @password\r\n",
|
|
||||||
"\t, @config_file = @config_file\r\n",
|
|
||||||
"WITH RESULT SETS ((iteration INT, metric_name NVARCHAR(100), metric_value FLOAT))\r\n",
|
|
||||||
"END"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"-- This procedure predicts values based on a model returned by AutoMLTrain and a dataset.\r\n",
|
|
||||||
"-- It returns the dataset with a new column added, which is the predicted value.\r\n",
|
|
||||||
"SET ANSI_NULLS ON\r\n",
|
|
||||||
"GO\r\n",
|
|
||||||
"SET QUOTED_IDENTIFIER ON\r\n",
|
|
||||||
"GO\r\n",
|
|
||||||
"CREATE OR ALTER PROCEDURE [dbo].[AutoMLPredict]\r\n",
|
|
||||||
" (\r\n",
|
|
||||||
" @input_query NVARCHAR(MAX), -- A SQL query returning data to predict on.\r\n",
|
|
||||||
" @model NVARCHAR(MAX), -- A model returned from AutoMLTrain.\r\n",
|
|
||||||
" @label_column NVARCHAR(255)='' -- Optional name of the column from input_query, which should be ignored when predicting\r\n",
|
|
||||||
" ) AS \r\n",
|
|
||||||
"BEGIN \r\n",
|
|
||||||
" \r\n",
|
|
||||||
" EXEC sp_execute_external_script @language = N'Python', @script = N'import pandas as pd \r\n",
|
|
||||||
"import azureml.core \r\n",
|
|
||||||
"import numpy as np \r\n",
|
|
||||||
"from azureml.train.automl import AutoMLConfig \r\n",
|
|
||||||
"import pickle \r\n",
|
|
||||||
"import codecs \r\n",
|
|
||||||
" \r\n",
|
|
||||||
"model_obj = pickle.loads(codecs.decode(model.encode(), \"base64\")) \r\n",
|
|
||||||
" \r\n",
|
|
||||||
"test_data = input_data.copy() \r\n",
|
|
||||||
"\r\n",
|
|
||||||
"if label_column != \"\" and label_column is not None:\r\n",
|
|
||||||
" y_test = test_data.pop(label_column).values \r\n",
|
|
||||||
"X_test = test_data \r\n",
|
|
||||||
" \r\n",
|
|
||||||
"predicted = model_obj.predict(X_test) \r\n",
|
|
||||||
" \r\n",
|
|
||||||
"combined_output = input_data.assign(predicted=predicted)\r\n",
|
|
||||||
" \r\n",
|
|
||||||
"' \r\n",
|
|
||||||
" , @input_data_1 = @input_query \r\n",
|
|
||||||
" , @input_data_1_name = N'input_data' \r\n",
|
|
||||||
" , @output_data_1_name = N'combined_output' \r\n",
|
|
||||||
" , @params = N'@model NVARCHAR(MAX), @label_column NVARCHAR(255)' \r\n",
|
|
||||||
" , @model = @model \r\n",
|
|
||||||
"\t, @label_column = @label_column\r\n",
|
|
||||||
"END"
|
|
||||||
]
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "jeffshep"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"category": "tutorial",
|
|
||||||
"compute": [
|
|
||||||
"None"
|
|
||||||
],
|
|
||||||
"datasets": [
|
|
||||||
"None"
|
|
||||||
],
|
|
||||||
"deployment": [
|
|
||||||
"None"
|
|
||||||
],
|
|
||||||
"exclude_from_index": false,
|
|
||||||
"framework": [
|
|
||||||
"Azure ML AutoML"
|
|
||||||
],
|
|
||||||
"friendly_name": "Setup automated ML SQL integration",
|
|
||||||
"index_order": 1,
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "sql",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"name": "sql",
|
|
||||||
"version": ""
|
|
||||||
},
|
|
||||||
"tags": [
|
|
||||||
""
|
|
||||||
],
|
|
||||||
"task": "None"
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 2
|
|
||||||
}
|
|
||||||
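The setup notebook above writes /tmp/aml/config.json with a shell heredoc; the same file can be produced from Python, which can be convenient when preparing the master instance from a client machine. The subscription, resource group, and workspace names below are the notebook's placeholders, not real values.

import json

# Placeholder values from the setup notebook; replace with your own.
config = {
    "subscription_id": "123456ab-78cd-0123-45ef-abcd12345678",
    "resource_group": "myrg1",
    "workspace_name": "myws1",
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=4)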
@@ -542,7 +542,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from automl.client.core.common import constants\n",
+"from azureml.automl.core.shared import constants\n",
 "conda_env_file_name = 'conda_env.yml'\n",
 "best_run.download_file(name=\"outputs/conda_env_v_1_0_0.yml\", output_file_path=conda_env_file_name)\n",
 "with open(conda_env_file_name, \"r\") as conda_file:\n",
@@ -564,7 +564,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"from automl.client.core.common import constants\n",
+"from azureml.automl.core.shared import constants\n",
 "script_file_name = 'scoring_file.py'\n",
 "best_run.download_file(name=\"outputs/scoring_file_v_1_0_0.py\", output_file_path=script_file_name)\n",
 "with open(script_file_name, \"r\") as scoring_file:\n",
@@ -1,497 +0,0 @@
|
|||||||
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
|
||||||
"\n",
|
|
||||||
"Licensed under the MIT License."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Azure ML Hardware Accelerated Object Detection"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"This tutorial will show you how to deploy an object detection service based on the SSD-VGG model in just a few minutes using the Azure Machine Learning Accelerated AI service.\n",
|
|
||||||
"\n",
|
|
||||||
"We will use the SSD-VGG model accelerated on an FPGA. Our Accelerated Models Service handles translating deep neural networks (DNN) into an FPGA program.\n",
|
|
||||||
"\n",
|
|
||||||
"The steps in this notebook are: \n",
|
|
||||||
"1. [Setup Environment](#set-up-environment)\n",
|
|
||||||
"* [Construct Model](#construct-model)\n",
|
|
||||||
" * Image Preprocessing\n",
|
|
||||||
" * Featurizer\n",
|
|
||||||
" * Save Model\n",
|
|
||||||
" * Save input and output tensor names\n",
|
|
||||||
"* [Create Image](#create-image)\n",
|
|
||||||
"* [Deploy Image](#deploy-image)\n",
|
|
||||||
"* [Test the Service](#test-service)\n",
|
|
||||||
" * Create Client\n",
|
|
||||||
" * Serve the model\n",
|
|
||||||
"* [Cleanup](#cleanup)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"set-up-environment\"></a>\n",
|
|
||||||
"## 1. Set up Environment\n",
|
|
||||||
"### 1.a. Imports"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import os\n",
|
|
||||||
"import tensorflow as tf"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 1.b. Retrieve Workspace\n",
|
|
||||||
"If you haven't created a Workspace, please follow [this notebook](\"../../../configuration.ipynb\") to do so. If you have, run the codeblock below to retrieve it. "
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core import Workspace\n",
|
|
||||||
"\n",
|
|
||||||
"ws = Workspace.from_config()\n",
|
|
||||||
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"construct-model\"></a>\n",
|
|
||||||
"## 2. Construct model\n",
|
|
||||||
"### 2.a. Image preprocessing\n",
|
|
||||||
"We'd like our service to accept JPEG images as input. However the input to SSD-VGG is a float tensor of shape \\[1, 300, 300, 3\\]. The first dimension is batch, then height, width, and channels (i.e. NHWC). To bridge this gap, we need code that decodes JPEG images and resizes them appropriately for input to SSD-VGG. The Accelerated AI service can execute TensorFlow graphs as part of the service and we'll use that ability to do the image preprocessing. This code defines a TensorFlow graph that preprocesses an array of JPEG images (as TensorFlow strings) and produces a tensor that is ready to be featurized by SSD-VGG.\n",
|
|
||||||
"\n",
|
|
||||||
"**Note:** Expect to see TF deprecation warnings until we port our SDK over to use Tensorflow 2.0."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Input images as a two-dimensional tensor containing an arbitrary number of images represented a strings\n",
|
|
||||||
"import azureml.accel.models.utils as utils\n",
|
|
||||||
"tf.reset_default_graph()\n",
|
|
||||||
"\n",
|
|
||||||
"in_images = tf.placeholder(tf.string)\n",
|
|
||||||
"image_tensors = utils.preprocess_array(in_images, output_width=300, output_height=300, preserve_aspect_ratio=False)\n",
|
|
||||||
"print(image_tensors.shape)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
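For intuition, a roughly equivalent hand-written TensorFlow 1.x preprocessing graph is sketched below. This is only an illustration of the decode-and-resize idea; the exact scaling and aspect-ratio handling inside `utils.preprocess_array` are assumptions and may differ from the SDK's actual implementation.

```python
import tensorflow as tf

# Illustrative sketch only: decode a batch of JPEG strings and resize them to the
# [None, 300, 300, 3] float tensor that SSD-VGG expects. Details are assumptions,
# not the implementation of utils.preprocess_array.
def sketch_preprocess(jpeg_strings):
    def _decode_and_resize(jpeg_bytes):
        img = tf.image.decode_jpeg(jpeg_bytes, channels=3)   # [H, W, 3] uint8
        return tf.image.resize_images(img, [300, 300])       # [300, 300, 3] float32
    return tf.map_fn(_decode_and_resize, jpeg_strings, dtype=tf.float32)

jpegs = tf.placeholder(tf.string, shape=[None])
batch = sketch_preprocess(jpegs)   # shape (?, 300, 300, 3)
```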
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 2.b. Featurizer\n",
|
|
||||||
"The SSD-VGG model is different from our other models in that it generates 12 tensor outputs. These corresponds to x,y displacements of the anchor boxes and the detection confidence (for 21 classes). Because these outputs are not convenient to work with, we will later use a pre-defined post-processing utility to transform the outputs into a simplified list of bounding boxes with their respective class and confidence.\n",
|
|
||||||
"\n",
|
|
||||||
"For more information about the output tensors, take this example: the output tensor 'ssd_300_vgg/block4_box/Reshape_1:0' has a shape of [None, 37, 37, 4, 21]. This gives the pre-softmax confidence for 4 anchor boxes situated at each site of a 37 x 37 grid imposed on the image, one confidence score for each of the 21 classes. The first dimension is the batch dimension. Likewise, 'ssd_300_vgg/block4_box/Reshape:0' has shape [None, 37, 37, 4, 4] and encodes the (cx, cy) center shift and rescaling (sw, sh) relative to each anchor box. Refer to the [SSD-VGG paper](https://arxiv.org/abs/1512.02325) to understand how these are computed. The other 10 tensors are defined similarly."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
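As a hedged illustration of how such a localization output is typically turned into a box, the sketch below mirrors the standard SSD decoding described in the paper. The prior-scaling constants are the common SSD convention and are an assumption here; the SDK's post-processing utility may use different values.

```python
import numpy as np

# Sketch of standard SSD box decoding for one anchor at one grid cell.
# anchor = (acx, acy, aw, ah) in fractional image coordinates; loc = (dcx, dcy, dsw, dsh).
# The prior_scaling values follow the usual SSD convention and are an assumption.
def decode_ssd_box(anchor, loc, prior_scaling=(0.1, 0.1, 0.2, 0.2)):
    acx, acy, aw, ah = anchor
    dcx, dcy, dsw, dsh = loc
    cx = acx + dcx * prior_scaling[0] * aw
    cy = acy + dcy * prior_scaling[1] * ah
    w = aw * np.exp(dsw * prior_scaling[2])
    h = ah * np.exp(dsh * prior_scaling[3])
    # Return (y1, x1, y2, x2), the box format used by the post-processing step later on.
    return (cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2)
```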
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.accel.models import SsdVgg\n",
|
|
||||||
"\n",
|
|
||||||
"saved_model_dir = os.path.join(os.path.expanduser('~'), 'models')\n",
|
|
||||||
"model_graph = SsdVgg(saved_model_dir, is_frozen = True)\n",
|
|
||||||
"\n",
|
|
||||||
"print('SSD-VGG Input Tensors:')\n",
|
|
||||||
"for idx, input_name in enumerate(model_graph.input_tensor_list):\n",
|
|
||||||
" print('{}, {}'.format(input_name, model_graph.get_input_dims(idx)))\n",
|
|
||||||
" \n",
|
|
||||||
"print('SSD-VGG Output Tensors:')\n",
|
|
||||||
"for idx, output_name in enumerate(model_graph.output_tensor_list):\n",
|
|
||||||
" print('{}, {}'.format(output_name, model_graph.get_output_dims(idx)))\n",
|
|
||||||
"\n",
|
|
||||||
"ssd_outputs = model_graph.import_graph_def(image_tensors, is_training=False)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 2.c. Save Model\n",
|
|
||||||
"Now that we loaded both parts of the tensorflow graph (preprocessor and SSD-VGG featurizer), we can save the graph and associated variables to a directory which we can register as an Azure ML Model."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"model_name = \"ssdvgg\"\n",
|
|
||||||
"model_save_path = os.path.join(saved_model_dir, model_name, \"saved_model\")\n",
|
|
||||||
"print(\"Saving model in {}\".format(model_save_path))\n",
|
|
||||||
"\n",
|
|
||||||
"output_map = {}\n",
|
|
||||||
"for i, output in enumerate(ssd_outputs):\n",
|
|
||||||
" output_map['out_{}'.format(i)] = output\n",
|
|
||||||
"\n",
|
|
||||||
"with tf.Session() as sess:\n",
|
|
||||||
" model_graph.restore_weights(sess)\n",
|
|
||||||
" tf.saved_model.simple_save(sess, \n",
|
|
||||||
" model_save_path, \n",
|
|
||||||
" inputs={'images': in_images}, \n",
|
|
||||||
" outputs=output_map)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 2.d. Important! Save names of input and output tensors\n",
|
|
||||||
"\n",
|
|
||||||
"These input and output tensors that were created during the preprocessing and classifier steps are also going to be used when **converting the model** to an Accelerated Model that can run on FPGA's and for **making an inferencing request**. It is very important to save this information!"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {
|
|
||||||
"tags": [
|
|
||||||
"register model from file"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"input_tensors = in_images.name\n",
|
|
||||||
"# We will use the list of output tensors during inferencing\n",
|
|
||||||
"output_tensors = [output.name for output in ssd_outputs]\n",
|
|
||||||
"# However, for multiple output tensors, our AccelOnnxConverter will \n",
|
|
||||||
"# accept comma-delimited strings (lists will cause error)\n",
|
|
||||||
"output_tensors_str = \",\".join(output_tensors)\n",
|
|
||||||
"\n",
|
|
||||||
"print(input_tensors)\n",
|
|
||||||
"print(output_tensors)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"create-image\"></a>\n",
|
|
||||||
"## 3. Create AccelContainerImage\n",
|
|
||||||
"Below we will execute all the same steps as in the [Quickstart](./accelerated-models-quickstart.ipynb#create-image) to package the model we have saved locally into an accelerated Docker image saved in our workspace. To complete all the steps, it may take a few minutes. For more details on each step, check out the [Quickstart section on model registration](./accelerated-models-quickstart.ipynb#register-model)."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core import Workspace\n",
|
|
||||||
"from azureml.core.model import Model\n",
|
|
||||||
"from azureml.core.image import Image\n",
|
|
||||||
"from azureml.accel import AccelOnnxConverter\n",
|
|
||||||
"from azureml.accel import AccelContainerImage\n",
|
|
||||||
"\n",
|
|
||||||
"# Retrieve workspace\n",
|
|
||||||
"ws = Workspace.from_config()\n",
|
|
||||||
"print(\"Successfully retrieved workspace:\", ws.name, ws.resource_group, ws.location, ws.subscription_id, '\\n')\n",
|
|
||||||
"\n",
|
|
||||||
"# Register model\n",
|
|
||||||
"registered_model = Model.register(workspace = ws,\n",
|
|
||||||
" model_path = model_save_path,\n",
|
|
||||||
" model_name = model_name)\n",
|
|
||||||
"print(\"Successfully registered: \", registered_model.name, registered_model.description, registered_model.version, '\\n', sep = '\\t')\n",
|
|
||||||
"\n",
|
|
||||||
"# Convert model\n",
|
|
||||||
"convert_request = AccelOnnxConverter.convert_tf_model(ws, registered_model, input_tensors, output_tensors_str)\n",
|
|
||||||
"if convert_request.wait_for_completion(show_output = False):\n",
|
|
||||||
" # If the above call succeeded, get the converted model\n",
|
|
||||||
" converted_model = convert_request.result\n",
|
|
||||||
" print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
|
|
||||||
" converted_model.id, converted_model.created_time, '\\n')\n",
|
|
||||||
"else:\n",
|
|
||||||
" print(\"Model conversion failed. Showing output.\")\n",
|
|
||||||
" convert_request.wait_for_completion(show_output = True)\n",
|
|
||||||
"\n",
|
|
||||||
"# Package into AccelContainerImage\n",
|
|
||||||
"image_config = AccelContainerImage.image_configuration()\n",
|
|
||||||
"# Image name must be lowercase\n",
|
|
||||||
"image_name = \"{}-image\".format(model_name)\n",
|
|
||||||
"image = Image.create(name = image_name,\n",
|
|
||||||
" models = [converted_model],\n",
|
|
||||||
" image_config = image_config, \n",
|
|
||||||
" workspace = ws)\n",
|
|
||||||
"image.wait_for_creation()\n",
|
|
||||||
"print(\"Created AccelContainerImage: {} {} {}\\n\".format(image.name, image.creation_state, image.image_location))"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"deploy-image\"></a>\n",
|
|
||||||
"## 4. Deploy image\n",
|
|
||||||
"Once you have an Azure ML Accelerated Image in your Workspace, you can deploy it to two destinations, to a Databox Edge machine or to an AKS cluster. \n",
|
|
||||||
"\n",
|
|
||||||
"### 4.a. Deploy to Databox Edge Machine using IoT Hub\n",
|
|
||||||
"See the sample [here](https://github.com/Azure-Samples/aml-real-time-ai/) for using the Azure IoT CLI extension for deploying your Docker image to your Databox Edge Machine.\n",
|
|
||||||
"\n",
|
|
||||||
"### 4.b. Deploy to AKS Cluster\n",
|
|
||||||
"Same as in the [Quickstart section on image deployment](./accelerated-models-quickstart.ipynb#deploy-image), we are going to create an AKS cluster with FPGA-enabled machines, then deploy our service to it.\n",
|
|
||||||
"#### Create AKS ComputeTarget"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.compute import AksCompute, ComputeTarget\n",
|
|
||||||
"\n",
|
|
||||||
"# Uses the specific FPGA enabled VM (sku: Standard_PB6s)\n",
|
|
||||||
"# Standard_PB6s are available in: eastus, westus2, westeurope, southeastasia\n",
|
|
||||||
"prov_config = AksCompute.provisioning_configuration(vm_size = \"Standard_PB6s\",\n",
|
|
||||||
" agent_count = 1, \n",
|
|
||||||
" location = \"eastus\")\n",
|
|
||||||
"\n",
|
|
||||||
"aks_name = 'aks-pb6-obj'\n",
|
|
||||||
"# Create the cluster\n",
|
|
||||||
"aks_target = ComputeTarget.create(workspace = ws, \n",
|
|
||||||
" name = aks_name, \n",
|
|
||||||
" provisioning_configuration = prov_config)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Provisioning an AKS cluster might take awhile (15 or so minutes), and we want to wait until it's successfully provisioned before we can deploy a service to it. If you interrupt this cell, provisioning of the cluster will continue. You can re-run it or check the status in your Workspace under Compute."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"%%time\n",
|
|
||||||
"aks_target.wait_for_completion(show_output = True)\n",
|
|
||||||
"print(aks_target.provisioning_state)\n",
|
|
||||||
"print(aks_target.provisioning_errors)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Deploy AccelContainerImage to AKS ComputeTarget"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"%%time\n",
|
|
||||||
"from azureml.core.webservice import Webservice, AksWebservice\n",
|
|
||||||
"\n",
|
|
||||||
"# Set the web service configuration (for creating a test service, we don't want autoscale enabled)\n",
|
|
||||||
"# Authentication is enabled by default, but for testing we specify False\n",
|
|
||||||
"aks_config = AksWebservice.deploy_configuration(autoscale_enabled=False,\n",
|
|
||||||
" num_replicas=1,\n",
|
|
||||||
" auth_enabled = False)\n",
|
|
||||||
"\n",
|
|
||||||
"aks_service_name ='my-aks-service-3'\n",
|
|
||||||
"\n",
|
|
||||||
"aks_service = Webservice.deploy_from_image(workspace = ws,\n",
|
|
||||||
" name = aks_service_name,\n",
|
|
||||||
" image = image,\n",
|
|
||||||
" deployment_config = aks_config,\n",
|
|
||||||
" deployment_target = aks_target)\n",
|
|
||||||
"aks_service.wait_for_deployment(show_output = True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"test-service\"></a>\n",
|
|
||||||
"## 5. Test the service\n",
|
|
||||||
"<a id=\"create-client\"></a>\n",
|
|
||||||
"### 5.a. Create Client\n",
|
|
||||||
"The image supports gRPC and the TensorFlow Serving \"predict\" API. We will create a PredictionClient from the Webservice object that can call into the docker image to get predictions. If you do not have the Webservice object, you can also create [PredictionClient](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel.predictionclient?view=azure-ml-py) directly.\n",
|
|
||||||
"\n",
|
|
||||||
"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice.deploy_configuration(), see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).\n",
|
|
||||||
"**WARNING:** If you are running on Azure Notebooks free compute, you will not be able to make outgoing calls to your service. Try locating your client on a different machine to consume it."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
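If authentication was left enabled, a minimal hedged sketch of the key-retrieval step mentioned in the note above is shown here; it only uses the documented `AksWebservice.get_keys()` call and does not fill in the elided `PredictionClient(...)` arguments.

```python
# Hedged sketch: only relevant if the service was deployed with auth_enabled=True.
# AksWebservice.get_keys() returns the primary and secondary authentication keys;
# per the linked documentation, either key can then be supplied as the
# access_token argument of PredictionClient(..., access_token=primary_key).
primary_key, secondary_key = aks_service.get_keys()
```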
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Using the grpc client in AzureML Accelerated Models SDK\n",
|
|
||||||
"from azureml.accel import client_from_service\n",
|
|
||||||
"\n",
|
|
||||||
"# Initialize AzureML Accelerated Models client\n",
|
|
||||||
"client = client_from_service(aks_service)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"You can adapt the client [code](https://github.com/Azure/aml-real-time-ai/blob/master/pythonlib/amlrealtimeai/client.py) to meet your needs. There is also an example C# [client](https://github.com/Azure/aml-real-time-ai/blob/master/sample-clients/csharp).\n",
|
|
||||||
"\n",
|
|
||||||
"The service provides an API that is compatible with TensorFlow Serving. There are instructions to download a sample client [here](https://www.tensorflow.org/serving/setup)."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"serve-model\"></a>\n",
|
|
||||||
"### 5.b. Serve the model\n",
|
|
||||||
"The SSD-VGG model returns the confidence and bounding boxes for all possible anchor boxes. As mentioned earlier, we will use a post-processing routine to transform this into a list of bounding boxes (y1, x1, y2, x2) where x, y are fractional coordinates measured from left and top respectively. A respective list of classes and scores is also returned to tag each bounding box. Below we make use of this information to draw the bounding boxes on top the original image. Note that in the post-processing routine we select a confidence threshold of 0.5."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import cv2\n",
|
|
||||||
"from matplotlib import pyplot as plt\n",
|
|
||||||
"\n",
|
|
||||||
"colors_tableau = [(255, 255, 255), (31, 119, 180), (174, 199, 232), (255, 127, 14), (255, 187, 120),\n",
|
|
||||||
" (44, 160, 44), (152, 223, 138), (214, 39, 40), (255, 152, 150),\n",
|
|
||||||
" (148, 103, 189), (197, 176, 213), (140, 86, 75), (196, 156, 148),\n",
|
|
||||||
" (227, 119, 194), (247, 182, 210), (127, 127, 127), (199, 199, 199),\n",
|
|
||||||
" (188, 189, 34), (219, 219, 141), (23, 190, 207), (158, 218, 229)]\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"def draw_boxes_on_img(img, classes, scores, bboxes, thickness=2):\n",
|
|
||||||
" shape = img.shape\n",
|
|
||||||
" for i in range(bboxes.shape[0]):\n",
|
|
||||||
" bbox = bboxes[i]\n",
|
|
||||||
" color = colors_tableau[classes[i]]\n",
|
|
||||||
" # Draw bounding box...\n",
|
|
||||||
" p1 = (int(bbox[0] * shape[0]), int(bbox[1] * shape[1]))\n",
|
|
||||||
" p2 = (int(bbox[2] * shape[0]), int(bbox[3] * shape[1]))\n",
|
|
||||||
" cv2.rectangle(img, p1[::-1], p2[::-1], color, thickness)\n",
|
|
||||||
" # Draw text...\n",
|
|
||||||
" s = '%s/%.3f' % (classes[i], scores[i])\n",
|
|
||||||
" p1 = (p1[0]-5, p1[1])\n",
|
|
||||||
" cv2.putText(img, s, p1[::-1], cv2.FONT_HERSHEY_DUPLEX, 0.4, color, 1)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import azureml.accel._external.ssdvgg_utils as ssdvgg_utils\n",
|
|
||||||
"\n",
|
|
||||||
"result = client.score_file(path=\"meeting.jpg\", input_name=input_tensors, outputs=output_tensors)\n",
|
|
||||||
"classes, scores, bboxes = ssdvgg_utils.postprocess(result, select_threshold=0.5)\n",
|
|
||||||
"\n",
|
|
||||||
"img = cv2.imread('meeting.jpg', 1)\n",
|
|
||||||
"img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)\n",
|
|
||||||
"draw_boxes_on_img(img, classes, scores, bboxes)\n",
|
|
||||||
"plt.imshow(img)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"cleanup\"></a>\n",
|
|
||||||
"## 6. Cleanup\n",
|
|
||||||
"It's important to clean up your resources, so that you won't incur unnecessary costs. In the [next notebook](./accelerated-models-training.ipynb) you will learn how to train a classfier on a new dataset using transfer learning."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"aks_service.delete()\n",
|
|
||||||
"aks_target.delete()\n",
|
|
||||||
"image.delete()\n",
|
|
||||||
"registered_model.delete()\n",
|
|
||||||
"converted_model.delete()"
|
|
||||||
]
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "coverste"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "paledger"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "sukha"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "python",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"codemirror_mode": {
|
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.5.6"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 2
|
|
||||||
}
|
|
||||||
@@ -1,7 +0,0 @@
name: accelerated-models-object-detection
dependencies:
- pip:
  - azureml-sdk
  - azureml-accel-models[cpu]
  - opencv-python
  - matplotlib
@@ -1,555 +0,0 @@
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
|
||||||
"\n",
|
|
||||||
"Licensed under the MIT License."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Azure ML Hardware Accelerated Models Quickstart"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"This tutorial will show you how to deploy an image recognition service based on the ResNet 50 classifier using the Azure Machine Learning Accelerated Models service. Get more information about our service from our [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-accelerate-with-fpgas), [API reference](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel?view=azure-ml-py), or [forum](https://aka.ms/aml-forum).\n",
|
|
||||||
"\n",
|
|
||||||
"We will use an accelerated ResNet50 featurizer running on an FPGA. Our Accelerated Models Service handles translating deep neural networks (DNN) into an FPGA program.\n",
|
|
||||||
"\n",
|
|
||||||
"For more information about using other models besides Resnet50, see the [README](./README.md).\n",
|
|
||||||
"\n",
|
|
||||||
"The steps covered in this notebook are: \n",
|
|
||||||
"1. [Set up environment](#set-up-environment)\n",
|
|
||||||
"* [Construct model](#construct-model)\n",
|
|
||||||
" * Image Preprocessing\n",
|
|
||||||
" * Featurizer (Resnet50)\n",
|
|
||||||
" * Classifier\n",
|
|
||||||
" * Save Model\n",
|
|
||||||
"* [Register Model](#register-model)\n",
|
|
||||||
"* [Convert into Accelerated Model](#convert-model)\n",
|
|
||||||
"* [Create Image](#create-image)\n",
|
|
||||||
"* [Deploy](#deploy-image)\n",
|
|
||||||
"* [Test service](#test-service)\n",
|
|
||||||
"* [Clean-up](#clean-up)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"set-up-environment\"></a>\n",
|
|
||||||
"## 1. Set up environment"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import os\n",
|
|
||||||
"import tensorflow as tf"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Retrieve Workspace\n",
|
|
||||||
"If you haven't created a Workspace, please follow [this notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) to do so. If you have, run the codeblock below to retrieve it. "
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core import Workspace\n",
|
|
||||||
"\n",
|
|
||||||
"ws = Workspace.from_config()\n",
|
|
||||||
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"construct-model\"></a>\n",
|
|
||||||
"## 2. Construct model\n",
|
|
||||||
"\n",
|
|
||||||
"There are three parts to the model we are deploying: pre-processing, featurizer with ResNet50, and classifier with ImageNet dataset. Then we will save this complete Tensorflow model graph locally before registering it to your Azure ML Workspace.\n",
|
|
||||||
"\n",
|
|
||||||
"### 2.a. Image preprocessing\n",
|
|
||||||
"We'd like our service to accept JPEG images as input. However the input to ResNet50 is a tensor. So we need code that decodes JPEG images and does the preprocessing required by ResNet50. The Accelerated AI service can execute TensorFlow graphs as part of the service and we'll use that ability to do the image preprocessing. This code defines a TensorFlow graph that preprocesses an array of JPEG images (as strings) and produces a tensor that is ready to be featurized by ResNet50.\n",
|
|
||||||
"\n",
|
|
||||||
"**Note:** Expect to see TF deprecation warnings until we port our SDK over to use Tensorflow 2.0."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Input images as a two-dimensional tensor containing an arbitrary number of images represented a strings\n",
|
|
||||||
"import azureml.accel.models.utils as utils\n",
|
|
||||||
"tf.reset_default_graph()\n",
|
|
||||||
"\n",
|
|
||||||
"in_images = tf.placeholder(tf.string)\n",
|
|
||||||
"image_tensors = utils.preprocess_array(in_images)\n",
|
|
||||||
"print(image_tensors.shape)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 2.b. Featurizer\n",
|
|
||||||
"We use ResNet50 as a featurizer. In this step we initialize the model. This downloads a TensorFlow checkpoint of the quantized ResNet50."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.accel.models import QuantizedResnet50\n",
|
|
||||||
"save_path = os.path.expanduser('~/models')\n",
|
|
||||||
"model_graph = QuantizedResnet50(save_path, is_frozen = True)\n",
|
|
||||||
"feature_tensor = model_graph.import_graph_def(image_tensors)\n",
|
|
||||||
"print(model_graph.version)\n",
|
|
||||||
"print(feature_tensor.name)\n",
|
|
||||||
"print(feature_tensor.shape)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 2.c. Classifier\n",
|
|
||||||
"The model we downloaded includes a classifier which takes the output of the ResNet50 and identifies an image. This classifier is trained on the ImageNet dataset. We are going to use this classifier for our service. The next [notebook](./accelerated-models-training.ipynb) shows how to train a classifier for a different data set. The input to the classifier is a tensor matching the output of our ResNet50 featurizer."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"classifier_output = model_graph.get_default_classifier(feature_tensor)\n",
|
|
||||||
"print(classifier_output)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 2.d. Save Model\n",
|
|
||||||
"Now that we loaded all three parts of the tensorflow graph (preprocessor, resnet50 featurizer, and the classifier), we can save the graph and associated variables to a directory which we can register as an Azure ML Model."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# model_name must be lowercase\n",
|
|
||||||
"model_name = \"resnet50\"\n",
|
|
||||||
"model_save_path = os.path.join(save_path, model_name)\n",
|
|
||||||
"print(\"Saving model in {}\".format(model_save_path))\n",
|
|
||||||
"\n",
|
|
||||||
"with tf.Session() as sess:\n",
|
|
||||||
" model_graph.restore_weights(sess)\n",
|
|
||||||
" tf.saved_model.simple_save(sess, model_save_path,\n",
|
|
||||||
" inputs={'images': in_images},\n",
|
|
||||||
" outputs={'output_alias': classifier_output})"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 2.e. Important! Save names of input and output tensors\n",
|
|
||||||
"\n",
|
|
||||||
"These input and output tensors that were created during the preprocessing and classifier steps are also going to be used when **converting the model** to an Accelerated Model that can run on FPGA's and for **making an inferencing request**. It is very important to save this information! You can see our defaults for all the models in the [README](./README.md).\n",
|
|
||||||
"\n",
|
|
||||||
"By default for Resnet50, these are the values you should see when running the cell below: \n",
|
|
||||||
"* input_tensors = \"Placeholder:0\"\n",
|
|
||||||
"* output_tensors = \"classifier/resnet_v1_50/predictions/Softmax:0\""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {
|
|
||||||
"tags": [
|
|
||||||
"register model from file"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"input_tensors = in_images.name\n",
|
|
||||||
"output_tensors = classifier_output.name\n",
|
|
||||||
"\n",
|
|
||||||
"print(input_tensors)\n",
|
|
||||||
"print(output_tensors)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"register-model\"></a>\n",
|
|
||||||
"## 3. Register Model"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"You can add tags and descriptions to your models. Using tags, you can track useful information such as the name and version of the machine learning library used to train the model. Note that tags must be alphanumeric."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {
|
|
||||||
"tags": [
|
|
||||||
"register model from file"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.model import Model\n",
|
|
||||||
"\n",
|
|
||||||
"registered_model = Model.register(workspace = ws,\n",
|
|
||||||
" model_path = model_save_path,\n",
|
|
||||||
" model_name = model_name)\n",
|
|
||||||
"\n",
|
|
||||||
"print(\"Successfully registered: \", registered_model.name, registered_model.description, registered_model.version, sep = '\\t')"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
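The markdown above mentions tags and descriptions, which the cell just shown does not use. As a hedged variant (the tag keys, values, and description text below are made-up examples), `Model.register` also accepts `tags` and `description`:

```python
from azureml.core.model import Model

# Hedged example: same registration as above, but with illustrative tags and a
# description. The tag keys/values are made-up examples, not required names.
registered_model = Model.register(workspace=ws,
                                  model_path=model_save_path,
                                  model_name=model_name,
                                  tags={"framework": "tensorflow", "featurizer": "resnet50"},
                                  description="ResNet50 featurizer with the default ImageNet classifier")
```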
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"convert-model\"></a>\n",
|
|
||||||
"## 4. Convert Model"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"For conversion you need to provide names of input and output tensors. This information can be found from the model_graph you saved in step 2.e. above.\n",
|
|
||||||
"\n",
|
|
||||||
"**Note**: Conversion may take a while and on average for FPGA model it is about 1-3 minutes and it depends on model type."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {
|
|
||||||
"tags": [
|
|
||||||
"register model from file"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.accel import AccelOnnxConverter\n",
|
|
||||||
"\n",
|
|
||||||
"convert_request = AccelOnnxConverter.convert_tf_model(ws, registered_model, input_tensors, output_tensors)\n",
|
|
||||||
"\n",
|
|
||||||
"if convert_request.wait_for_completion(show_output = False):\n",
|
|
||||||
" # If the above call succeeded, get the converted model\n",
|
|
||||||
" converted_model = convert_request.result\n",
|
|
||||||
" print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
|
|
||||||
" converted_model.id, converted_model.created_time, '\\n')\n",
|
|
||||||
"else:\n",
|
|
||||||
" print(\"Model conversion failed. Showing output.\")\n",
|
|
||||||
" convert_request.wait_for_completion(show_output = True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"create-image\"></a>\n",
|
|
||||||
"## 5. Package the model into an Image"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"You can add tags and descriptions to image. Also, for FPGA model an image can only contain **single** model.\n",
|
|
||||||
"\n",
|
|
||||||
"**Note**: The following command can take few minutes. "
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.image import Image\n",
|
|
||||||
"from azureml.accel import AccelContainerImage\n",
|
|
||||||
"\n",
|
|
||||||
"image_config = AccelContainerImage.image_configuration()\n",
|
|
||||||
"# Image name must be lowercase\n",
|
|
||||||
"image_name = \"{}-image\".format(model_name)\n",
|
|
||||||
"\n",
|
|
||||||
"image = Image.create(name = image_name,\n",
|
|
||||||
" models = [converted_model],\n",
|
|
||||||
" image_config = image_config, \n",
|
|
||||||
" workspace = ws)\n",
|
|
||||||
"image.wait_for_creation(show_output = False)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"deploy-image\"></a>\n",
|
|
||||||
"## 6. Deploy\n",
|
|
||||||
"Once you have an Azure ML Accelerated Image in your Workspace, you can deploy it to two destinations, to a Databox Edge machine or to an AKS cluster. \n",
|
|
||||||
"\n",
|
|
||||||
"### 6.a. Databox Edge Machine using IoT Hub\n",
|
|
||||||
"See the sample [here](https://github.com/Azure-Samples/aml-real-time-ai/) for using the Azure IoT CLI extension for deploying your Docker image to your Databox Edge Machine.\n",
|
|
||||||
"\n",
|
|
||||||
"### 6.b. Azure Kubernetes Service (AKS) using Azure ML Service\n",
|
|
||||||
"We are going to create an AKS cluster with FPGA-enabled machines, then deploy our service to it. For more information, see [AKS official docs](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where#aks).\n",
|
|
||||||
"\n",
|
|
||||||
"#### Create AKS ComputeTarget"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {
|
|
||||||
"tags": [
|
|
||||||
"sample-akscompute-provision"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.compute import AksCompute, ComputeTarget\n",
|
|
||||||
"\n",
|
|
||||||
"# Uses the specific FPGA enabled VM (sku: Standard_PB6s)\n",
|
|
||||||
"# Standard_PB6s are available in: eastus, westus2, westeurope, southeastasia\n",
|
|
||||||
"prov_config = AksCompute.provisioning_configuration(vm_size = \"Standard_PB6s\",\n",
|
|
||||||
" agent_count = 1, \n",
|
|
||||||
" location = \"eastus\")\n",
|
|
||||||
"\n",
|
|
||||||
"aks_name = 'my-aks-pb6'\n",
|
|
||||||
"# Create the cluster\n",
|
|
||||||
"aks_target = ComputeTarget.create(workspace = ws, \n",
|
|
||||||
" name = aks_name, \n",
|
|
||||||
" provisioning_configuration = prov_config)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Provisioning an AKS cluster might take awhile (15 or so minutes), and we want to wait until it's successfully provisioned before we can deploy a service to it. If you interrupt this cell, provisioning of the cluster will continue. You can also check the status in your Workspace under Compute."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"%%time\n",
|
|
||||||
"aks_target.wait_for_completion(show_output = True)\n",
|
|
||||||
"print(aks_target.provisioning_state)\n",
|
|
||||||
"print(aks_target.provisioning_errors)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Deploy AccelContainerImage to AKS ComputeTarget"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"%%time\n",
|
|
||||||
"from azureml.core.webservice import Webservice, AksWebservice\n",
|
|
||||||
"\n",
|
|
||||||
"# Set the web service configuration (for creating a test service, we don't want autoscale enabled)\n",
|
|
||||||
"# Authentication is enabled by default, but for testing we specify False\n",
|
|
||||||
"aks_config = AksWebservice.deploy_configuration(autoscale_enabled=False,\n",
|
|
||||||
" num_replicas=1,\n",
|
|
||||||
" auth_enabled = False)\n",
|
|
||||||
"\n",
|
|
||||||
"aks_service_name ='my-aks-service-1'\n",
|
|
||||||
"\n",
|
|
||||||
"aks_service = Webservice.deploy_from_image(workspace = ws,\n",
|
|
||||||
" name = aks_service_name,\n",
|
|
||||||
" image = image,\n",
|
|
||||||
" deployment_config = aks_config,\n",
|
|
||||||
" deployment_target = aks_target)\n",
|
|
||||||
"aks_service.wait_for_deployment(show_output = True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"test-service\"></a>\n",
|
|
||||||
"## 7. Test the service"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 7.a. Create Client\n",
|
|
||||||
"The image supports gRPC and the TensorFlow Serving \"predict\" API. We will create a PredictionClient from the Webservice object that can call into the docker image to get predictions. If you do not have the Webservice object, you can also create [PredictionClient](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel.predictionclient?view=azure-ml-py) directly.\n",
|
|
||||||
"\n",
|
|
||||||
"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice, see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).\n",
|
|
||||||
"**WARNING:** If you are running on Azure Notebooks free compute, you will not be able to make outgoing calls to your service. Try locating your client on a different machine to consume it."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Using the grpc client in AzureML Accelerated Models SDK\n",
|
|
||||||
"from azureml.accel import client_from_service\n",
|
|
||||||
"\n",
|
|
||||||
"# Initialize AzureML Accelerated Models client\n",
|
|
||||||
"client = client_from_service(aks_service)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"You can adapt the client [code](https://github.com/Azure/aml-real-time-ai/blob/master/pythonlib/amlrealtimeai/client.py) to meet your needs. There is also an example C# [client](https://github.com/Azure/aml-real-time-ai/blob/master/sample-clients/csharp).\n",
|
|
||||||
"\n",
|
|
||||||
"The service provides an API that is compatible with TensorFlow Serving. There are instructions to download a sample client [here](https://www.tensorflow.org/serving/setup)."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
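Because the endpoint is TensorFlow Serving compatible, a plain TF Serving gRPC client can also call it. The sketch below is illustrative only: it assumes the `tensorflow-serving-api` package is installed, and the address placeholder, port, timeout, and the use of the input tensor name as the request key are all assumptions rather than confirmed details of this service.

```python
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Illustrative only: the scoring address/port, timeout, and input key are assumptions.
channel = grpc.insecure_channel("<scoring-host>:80")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

with open("./snowleopardgaze.jpg", "rb") as f:
    jpeg_bytes = f.read()

request = predict_pb2.PredictRequest()
# Assumption: the service keys its inputs by the input tensor name saved earlier.
request.inputs[input_tensors].CopyFrom(tf.make_tensor_proto([jpeg_bytes]))
response = stub.Predict(request, 10.0)  # 10-second timeout
```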
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 7.b. Serve the model\n",
|
|
||||||
"To understand the results we need a mapping to the human readable imagenet classes"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import requests\n",
|
|
||||||
"classes_entries = requests.get(\"https://raw.githubusercontent.com/Lasagne/Recipes/master/examples/resnet50/imagenet_classes.txt\").text.splitlines()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Score image with input and output tensor names\n",
|
|
||||||
"results = client.score_file(path=\"./snowleopardgaze.jpg\", \n",
|
|
||||||
" input_name=input_tensors, \n",
|
|
||||||
" outputs=output_tensors)\n",
|
|
||||||
"\n",
|
|
||||||
"# map results [class_id] => [confidence]\n",
|
|
||||||
"results = enumerate(results)\n",
|
|
||||||
"# sort results by confidence\n",
|
|
||||||
"sorted_results = sorted(results, key=lambda x: x[1], reverse=True)\n",
|
|
||||||
"# print top 5 results\n",
|
|
||||||
"for top in sorted_results[:5]:\n",
|
|
||||||
" print(classes_entries[top[0]], 'confidence:', top[1])"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"clean-up\"></a>\n",
|
|
||||||
"## 8. Clean-up\n",
|
|
||||||
"Run the cell below to delete your webservice, image, and model (must be done in that order). In the [next notebook](./accelerated-models-training.ipynb) you will learn how to train a classfier on a new dataset using transfer learning and finetune the weights."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"aks_service.delete()\n",
|
|
||||||
"aks_target.delete()\n",
|
|
||||||
"image.delete()\n",
|
|
||||||
"registered_model.delete()\n",
|
|
||||||
"converted_model.delete()"
|
|
||||||
]
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "coverste"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "paledger"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "aibhalla"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "python",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"codemirror_mode": {
|
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.7.3"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 2
|
|
||||||
}
|
|
||||||
@@ -1,5 +0,0 @@
name: accelerated-models-quickstart
dependencies:
- pip:
  - azureml-sdk
  - azureml-accel-models[cpu]
@@ -1,870 +0,0 @@
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
|
||||||
"\n",
|
|
||||||
"Licensed under the MIT License."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Training with the Azure Machine Learning Accelerated Models Service"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"This notebook will introduce how to apply common machine learning techniques, like transfer learning, custom weights, and unquantized vs. quantized models, when working with our Azure Machine Learning Accelerated Models Service (Azure ML Accel Models).\n",
|
|
||||||
"\n",
|
|
||||||
"We will use Tensorflow for the preprocessing steps, ResNet50 for the featurizer, and the Keras API (built on Tensorflow backend) to build the classifier layers instead of the default ImageNet classifier used in Quickstart. Then we will train the model, evaluate it, and deploy it to run on an FPGA.\n",
|
|
||||||
"\n",
|
|
||||||
"#### Transfer Learning and Custom weights\n",
|
|
||||||
"We will walk you through two ways to build and train a ResNet50 model on the Kaggle Cats and Dogs dataset: transfer learning only and then transfer learning with custom weights.\n",
|
|
||||||
"\n",
|
|
||||||
"In using transfer learning, our goal is to re-purpose the ResNet50 model already trained on the [ImageNet image dataset](http://www.image-net.org/) as a basis for our training of the Kaggle Cats and Dogs dataset. The ResNet50 featurizer will be imported as frozen, so only the Keras classifier will be trained.\n",
|
|
||||||
"\n",
|
|
||||||
"With the addition of custom weights, we will build the model so that the ResNet50 featurizer weights as not frozen. This will let us retrain starting with custom weights trained with ImageNet on ResNet50 and then use the Kaggle Cats and Dogs dataset to retrain and fine-tune the quantized version of the model.\n",
|
|
||||||
"\n",
|
|
||||||
"#### Unquantized vs. Quantized models\n",
|
|
||||||
"The unquantized version of our models (ie. Resnet50, Resnet152, Densenet121, Vgg16, SsdVgg) uses native float precision (32-bit floats), which will be faster at training. We will use this for our first run through, then fine-tune the weights with the quantized version. The quantized version of our models (i.e. QuantizedResnet50, QuantizedResnet152, QuantizedDensenet121, QuantizedVgg16, QuantizedSsdVgg) will have the same node names as the unquantized version, but use quantized operations and will match the performance of the model when running on an FPGA.\n",
|
|
||||||
"\n",
|
|
||||||
"#### Contents\n",
|
|
||||||
"1. [Setup Environment](#setup)\n",
|
|
||||||
"* [Prepare Data](#prepare-data)\n",
|
|
||||||
"* [Construct Model](#construct-model)\n",
|
|
||||||
" * Preprocessor\n",
|
|
||||||
" * Classifier\n",
|
|
||||||
" * Model construction\n",
|
|
||||||
"* [Train Model](#train-model)\n",
|
|
||||||
"* [Test Model](#test-model)\n",
|
|
||||||
"* [Execution](#execution)\n",
|
|
||||||
" * [Transfer Learning](#transfer-learning)\n",
|
|
||||||
" * [Transfer Learning with Custom Weights](#custom-weights)\n",
|
|
||||||
"* [Create Image](#create-image)\n",
|
|
||||||
"* [Deploy Image](#deploy-image)\n",
|
|
||||||
"* [Test the service](#test-service)\n",
|
|
||||||
"* [Clean-up](#cleanup)\n",
|
|
||||||
"* [Appendix](#appendix)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"setup\"></a>\n",
|
|
||||||
"## 1. Setup Environment\n",
|
|
||||||
"#### 1.a. Please set up your environment as described in the [Quickstart](./accelerated-models-quickstart.ipynb), meaning:\n",
|
|
||||||
"* Make sure your Workspace config.json exists and has the correct info\n",
|
|
||||||
"* Install Tensorflow\n",
|
|
||||||
"\n",
|
|
||||||
"#### 1.b. Download dataset into ~/catsanddogs \n",
|
|
||||||
"The dataset we will be using for training can be downloaded [here](https://www.microsoft.com/en-us/download/details.aspx?id=54765). Download the zip and extract to a directory named 'catsanddogs' under your user directory (\"~/catsanddogs\"). \n",
|
|
||||||
"\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
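As a hedged convenience sketch for the download step above (the zip file name is an assumption; download the archive manually from the linked Microsoft page first), extracting it into the expected location could look like this:

```python
import os
import zipfile

# Assumes the Kaggle Cats and Dogs zip has already been downloaded to the home
# directory; the file name below is an assumption and may differ on your machine.
zip_path = os.path.expanduser("~/kagglecatsanddogs.zip")
data_dir = os.path.expanduser("~/catsanddogs")

os.makedirs(data_dir, exist_ok=True)
with zipfile.ZipFile(zip_path) as archive:
    archive.extractall(data_dir)
```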
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### 1.c. Import packages"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import os\n",
|
|
||||||
"import sys\n",
|
|
||||||
"import tensorflow as tf\n",
|
|
||||||
"import numpy as np\n",
|
|
||||||
"from keras import backend as K\n",
|
|
||||||
"import sklearn\n",
|
|
||||||
"import tqdm"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### 1.d. Create directories for later use\n",
|
|
||||||
"After you train your model in float32, you'll write the weights to a place on disk. We also need a location to store the models that get downloaded."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"custom_weights_dir = os.path.expanduser(\"~/custom-weights\")\n",
|
|
||||||
"saved_model_dir = os.path.expanduser(\"~/models\")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
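For reference, a minimal sketch of how trained float32 weights could be written into `custom_weights_dir` with a `tf.train.Saver` is shown below. The checkpoint prefix name is a made-up example, and this is only an illustration of the general technique, not the notebook's own training code.

```python
import os
import tensorflow as tf

# Minimal sketch, assuming a tf.Session with trained variables is currently active.
# The checkpoint prefix 'resnet50_catsdogs' is a made-up example name.
def save_custom_weights(sess, weights_dir=custom_weights_dir):
    os.makedirs(weights_dir, exist_ok=True)
    saver = tf.train.Saver()
    return saver.save(sess, os.path.join(weights_dir, "resnet50_catsdogs"))
```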
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"prepare-data\"></a>\n",
|
|
||||||
"## 2. Prepare Data\n",
|
|
||||||
"Load the files we are going to use for training and testing. By default this notebook uses only a very small subset of the Cats and Dogs dataset. That makes it run relatively quickly."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import glob\n",
|
|
||||||
"import imghdr\n",
|
|
||||||
"datadir = os.path.expanduser(\"~/catsanddogs\")\n",
|
|
||||||
"\n",
|
|
||||||
"cat_files = glob.glob(os.path.join(datadir, 'PetImages', 'Cat', '*.jpg'))\n",
|
|
||||||
"dog_files = glob.glob(os.path.join(datadir, 'PetImages', 'Dog', '*.jpg'))\n",
|
|
||||||
"\n",
|
|
||||||
"# Limit the data set to make the notebook execute quickly.\n",
|
|
||||||
"cat_files = cat_files[:64]\n",
|
|
||||||
"dog_files = dog_files[:64]\n",
|
|
||||||
"\n",
|
|
||||||
"# The data set has a few images that are not jpeg. Remove them.\n",
|
|
||||||
"cat_files = [f for f in cat_files if imghdr.what(f) == 'jpeg']\n",
|
|
||||||
"dog_files = [f for f in dog_files if imghdr.what(f) == 'jpeg']\n",
|
|
||||||
"\n",
|
|
||||||
"if(not len(cat_files) or not len(dog_files)):\n",
|
|
||||||
" print(\"Please download the Kaggle Cats and Dogs dataset form https://www.microsoft.com/en-us/download/details.aspx?id=54765 and extract the zip to \" + datadir) \n",
|
|
||||||
" raise ValueError(\"Data not found\")\n",
|
|
||||||
"else:\n",
|
|
||||||
" print(cat_files[0])\n",
|
|
||||||
" print(dog_files[0])"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Construct a numpy array as labels\n",
|
|
||||||
"image_paths = cat_files + dog_files\n",
|
|
||||||
"total_files = len(cat_files) + len(dog_files)\n",
|
|
||||||
"labels = np.zeros(total_files)\n",
|
|
||||||
"labels[len(cat_files):] = 1"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Split images data as training data and test data\n",
|
|
||||||
"from sklearn.model_selection import train_test_split\n",
|
|
||||||
"onehot_labels = np.array([[0,1] if i else [1,0] for i in labels])\n",
|
|
||||||
"img_train, img_test, label_train, label_test = train_test_split(image_paths, onehot_labels, random_state=42, shuffle=True)\n",
|
|
||||||
"\n",
|
|
||||||
"print(len(img_train), len(img_test), label_train.shape, label_test.shape)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"construct-model\"></a>\n",
|
|
||||||
"## 3. Construct Model\n",
|
|
||||||
"We will define the functions to handle creating the preprocessor and the classifier first, and then run them together to actually construct the model with the Resnet50 featurizer in a single Tensorflow session in a separate cell.\n",
|
|
||||||
"\n",
|
|
||||||
"We use ResNet50 for the featurizer and build our own classifier using Keras layers. We train the featurizer and the classifier as one model. We will provide parameters to determine whether we are using the quantized version and whether we are using custom weights in training or not."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 3.a. Define image preprocessing step\n",
|
|
||||||
"Same as in the Quickstart, before passing image dataset to the ResNet50 featurizer, we need to preprocess the input file to get it into the form expected by ResNet50. ResNet50 expects float tensors representing the images in BGR, channel last order. We've provided a default implementation of the preprocessing that you can use.\n",
|
|
||||||
"\n",
|
|
||||||
"**Note:** Expect to see TF deprecation warnings until we port our SDK over to use Tensorflow 2.0."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import azureml.accel.models.utils as utils\n",
|
|
||||||
"\n",
|
|
||||||
"def preprocess_images(scaling_factor=1.0):\n",
|
|
||||||
" # Convert images to 3D tensors [width,height,channel] - channels are in BGR order.\n",
|
|
||||||
" in_images = tf.placeholder(tf.string)\n",
|
|
||||||
" image_tensors = utils.preprocess_array(in_images, 'RGB', scaling_factor)\n",
|
|
||||||
" return in_images, image_tensors"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 3.b. Define classifier\n",
|
|
||||||
"We use Keras layer APIs to construct the classifier. Because we're using the tensorflow backend, we can train this classifier in one session with our Resnet50 model."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"def construct_classifier(in_tensor, seed=None):\n",
|
|
||||||
" from keras.layers import Dropout, Dense, Flatten\n",
|
|
||||||
" from keras.initializers import glorot_uniform\n",
|
|
||||||
" K.set_session(tf.get_default_session())\n",
|
|
||||||
"\n",
|
|
||||||
" FC_SIZE = 1024\n",
|
|
||||||
" NUM_CLASSES = 2\n",
|
|
||||||
"\n",
|
|
||||||
" x = Dropout(0.2, input_shape=(1, 1, int(in_tensor.shape[3]),), seed=seed)(in_tensor)\n",
|
|
||||||
" x = Dense(FC_SIZE, activation='relu', input_dim=(1, 1, int(in_tensor.shape[3]),),\n",
|
|
||||||
" kernel_initializer=glorot_uniform(seed=seed), bias_initializer='zeros')(x)\n",
|
|
||||||
" x = Flatten()(x)\n",
|
|
||||||
" preds = Dense(NUM_CLASSES, activation='softmax', input_dim=FC_SIZE, name='classifier_output',\n",
|
|
||||||
" kernel_initializer=glorot_uniform(seed=seed), bias_initializer='zeros')(x)\n",
|
|
||||||
" return preds"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### 3.c. Define model construction\n",
|
|
||||||
"Now that the preprocessor and classifier for the model are defined, we can define how we want to construct the model. \n",
|
|
||||||
"\n",
|
|
||||||
"Constructing the model has these steps: \n",
|
|
||||||
"1. Get preprocessing steps\n",
|
|
||||||
"* Get featurizer using the Azure ML Accel Models SDK:\n",
|
|
||||||
" * import the graph definition\n",
|
|
||||||
" * restore the weights of the model into a Tensorflow session\n",
|
|
||||||
"* Get classifier\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"def construct_model(quantized, starting_weights_directory = None):\n",
|
|
||||||
" from azureml.accel.models import Resnet50, QuantizedResnet50\n",
|
|
||||||
" \n",
|
|
||||||
" # Convert images to 3D tensors [width,height,channel]\n",
|
|
||||||
" in_images, image_tensors = preprocess_images(1.0)\n",
|
|
||||||
"\n",
|
|
||||||
" # Construct featurizer using quantized or unquantized ResNet50 model\n",
|
|
||||||
" if not quantized:\n",
|
|
||||||
" featurizer = Resnet50(saved_model_dir)\n",
|
|
||||||
" else:\n",
|
|
||||||
" featurizer = QuantizedResnet50(saved_model_dir, custom_weights_directory = starting_weights_directory)\n",
|
|
||||||
"\n",
|
|
||||||
" features = featurizer.import_graph_def(input_tensor=image_tensors)\n",
|
|
||||||
" \n",
|
|
||||||
" # Construct classifier\n",
|
|
||||||
" preds = construct_classifier(features)\n",
|
|
||||||
" \n",
|
|
||||||
" # Initialize weights\n",
|
|
||||||
" sess = tf.get_default_session()\n",
|
|
||||||
" tf.global_variables_initializer().run()\n",
|
|
||||||
"\n",
|
|
||||||
" featurizer.restore_weights(sess)\n",
|
|
||||||
"\n",
|
|
||||||
" return in_images, image_tensors, features, preds, featurizer"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"train-model\"></a>\n",
|
|
||||||
"## 4. Train Model"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"def read_files(files):\n",
|
|
||||||
" \"\"\" Read files to array\"\"\"\n",
|
|
||||||
" contents = []\n",
|
|
||||||
" for path in files:\n",
|
|
||||||
" with open(path, 'rb') as f:\n",
|
|
||||||
" contents.append(f.read())\n",
|
|
||||||
" return contents"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"def train_model(preds, in_images, img_train, label_train, is_retrain = False, train_epoch = 10, learning_rate=None):\n",
|
|
||||||
" \"\"\" training model \"\"\"\n",
|
|
||||||
" from keras.objectives import binary_crossentropy\n",
|
|
||||||
" from tqdm import tqdm\n",
|
|
||||||
" \n",
|
|
||||||
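" # Default learning rate: 0.01 for initial float32 training, 0.001 when fine-tuning with is_retrain=True\n",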
" learning_rate = learning_rate if learning_rate else 0.001 if is_retrain else 0.01\n",
|
|
||||||
" \n",
|
|
||||||
" # Specify the loss function\n",
|
|
||||||
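" # Labels are expected as one-hot vectors over the two classes\n",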
" in_labels = tf.placeholder(tf.float32, shape=(None, 2)) \n",
|
|
||||||
" cross_entropy = tf.reduce_mean(binary_crossentropy(in_labels, preds))\n",
|
|
||||||
" optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)\n",
|
|
||||||
"\n",
|
|
||||||
" def chunks(a, b, n):\n",
|
|
||||||
" \"\"\"Yield successive n-sized chunks from a and b.\"\"\"\n",
|
|
||||||
" if (len(a) != len(b)):\n",
|
|
||||||
" print(\"a and b are not equal in chunks(a,b,n)\")\n",
|
|
||||||
" raise ValueError(\"Parameter error\")\n",
|
|
||||||
"\n",
|
|
||||||
" for i in range(0, len(a), n):\n",
|
|
||||||
" yield a[i:i + n], b[i:i + n]\n",
|
|
||||||
"\n",
|
|
||||||
" chunk_size = 16\n",
|
|
||||||
" chunk_num = len(label_train) / chunk_size\n",
|
|
||||||
"\n",
|
|
||||||
" sess = tf.get_default_session()\n",
|
|
||||||
" for epoch in range(train_epoch):\n",
|
|
||||||
" avg_loss = 0\n",
|
|
||||||
" for img_chunk, label_chunk in tqdm(chunks(img_train, label_train, chunk_size)):\n",
|
|
||||||
" contents = read_files(img_chunk)\n",
|
|
||||||
" _, loss = sess.run([optimizer, cross_entropy],\n",
|
|
||||||
" feed_dict={in_images: contents,\n",
|
|
||||||
" in_labels: label_chunk,\n",
|
|
||||||
" K.learning_phase(): 1})\n",
|
|
||||||
" avg_loss += loss / chunk_num\n",
|
|
||||||
" print(\"Epoch:\", (epoch + 1), \"loss = \", \"{:.3f}\".format(avg_loss))\n",
|
|
||||||
" \n",
|
|
||||||
" # Reach desired performance\n",
|
|
||||||
" if (avg_loss < 0.001):\n",
|
|
||||||
" break"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"test-model\"></a>"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"test-model\"></a>\n",
|
|
||||||
"## 5. Test Model"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"def test_model(preds, in_images, img_test, label_test):\n",
|
|
||||||
" \"\"\"Test the model\"\"\"\n",
|
|
||||||
" from keras.metrics import categorical_accuracy\n",
|
|
||||||
"\n",
|
|
||||||
" in_labels = tf.placeholder(tf.float32, shape=(None, 2))\n",
|
|
||||||
" accuracy = tf.reduce_mean(categorical_accuracy(in_labels, preds))\n",
|
|
||||||
" contents = read_files(img_test)\n",
|
|
||||||
"\n",
|
|
||||||
" accuracy = accuracy.eval(feed_dict={in_images: contents,\n",
|
|
||||||
" in_labels: label_test,\n",
|
|
||||||
" K.learning_phase(): 0})\n",
|
|
||||||
" return accuracy"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"execution\"></a>\n",
|
|
||||||
"## 6. Execute steps\n",
|
|
||||||
"You can run through the Transfer Learning section, then skip to Create AccelContainerImage. By default, because the custom weights section takes much longer for training twice, it is not saved as executable cells. You can copy the code or change cell type to 'Code'.\n",
|
|
||||||
"\n",
|
|
||||||
"<a id=\"transfer-learning\"></a>\n",
|
|
||||||
"### 6.a. Training using Transfer Learning"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"%%time\n",
|
|
||||||
"# Launch the training\n",
|
|
||||||
"tf.reset_default_graph()\n",
|
|
||||||
"sess = tf.Session(graph=tf.get_default_graph())\n",
|
|
||||||
"\n",
|
|
||||||
"with sess.as_default():\n",
|
|
||||||
" in_images, image_tensors, features, preds, featurizer = construct_model(quantized=True)\n",
|
|
||||||
" train_model(preds, in_images, img_train, label_train, is_retrain=False, train_epoch=10, learning_rate=0.01) \n",
|
|
||||||
" accuracy = test_model(preds, in_images, img_test, label_test) \n",
|
|
||||||
" print(\"Accuracy:\", accuracy)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Save Model"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"model_name = 'resnet50-catsanddogs-tl'\n",
|
|
||||||
"model_save_path = os.path.join(saved_model_dir, model_name)\n",
|
|
||||||
"\n",
|
|
||||||
"tf.saved_model.simple_save(sess, model_save_path,\n",
|
|
||||||
" inputs={'images': in_images},\n",
|
|
||||||
" outputs={'output_alias': preds})\n",
|
|
||||||
"\n",
|
|
||||||
"input_tensors = in_images.name\n",
|
|
||||||
"output_tensors = preds.name\n",
|
|
||||||
"\n",
|
|
||||||
"print(input_tensors)\n",
|
|
||||||
"print(output_tensors)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"custom-weights\"></a>\n",
|
|
||||||
"### 6.b. Traning using Custom Weights\n",
|
|
||||||
"\n",
|
|
||||||
"Because the quantized graph defintion and the float32 graph defintion share the same node names in the graph definitions, we can initally train the weights in float32, and then reload them with the quantized operations (which take longer) to fine-tune the model.\n",
|
|
||||||
"\n",
|
|
||||||
"First we train the model with custom weights but without quantization. Training is done with native float precision (32-bit floats). We load the training data set and batch the training with 10 epochs. When the performance reaches desired level or starts decredation, we stop the training iteration and save the weights as tensorflow checkpoint files. "
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Launch the training\n",
|
|
||||||
"```\n",
|
|
||||||
"tf.reset_default_graph()\n",
|
|
||||||
"sess = tf.Session(graph=tf.get_default_graph())\n",
|
|
||||||
"\n",
|
|
||||||
"with sess.as_default():\n",
|
|
||||||
" in_images, image_tensors, features, preds, featurizer = construct_model(quantized=False)\n",
|
|
||||||
" train_model(preds, in_images, img_train, label_train, is_retrain=False, train_epoch=10) \n",
|
|
||||||
" accuracy = test_model(preds, in_images, img_test, label_test) \n",
|
|
||||||
" print(\"Accuracy:\", accuracy)\n",
|
|
||||||
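" # Save the trained float32 weights as TensorFlow checkpoint files for the quantized fine-tuning step below\n",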
" featurizer.save_weights(custom_weights_dir + \"/rn50\", tf.get_default_session())\n",
|
|
||||||
"```"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Test Model\n",
|
|
||||||
"After training, we evaluate the trained model's accuracy on test dataset with quantization. So that we know the model's performance if it is deployed on the FPGA."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"```\n",
|
|
||||||
"tf.reset_default_graph()\n",
|
|
||||||
"sess = tf.Session(graph=tf.get_default_graph())\n",
|
|
||||||
"\n",
|
|
||||||
"with sess.as_default():\n",
|
|
||||||
" print(\"Testing trained model with quantization\")\n",
|
|
||||||
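" # starting_weights_directory reloads the float32 checkpoints saved above into the quantized graph\n",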
" in_images, image_tensors, features, preds, quantized_featurizer = construct_model(quantized=True, starting_weights_directory=custom_weights_dir)\n",
|
|
||||||
" accuracy = test_model(preds, in_images, img_test, label_test) \n",
|
|
||||||
" print(\"Accuracy:\", accuracy)\n",
|
|
||||||
"```"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Fine-Tune Model\n",
|
|
||||||
"Sometimes, the model's accuracy can drop significantly after quantization. In those cases, we need to retrain the model enabled with quantization to get better model accuracy."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"```\n",
|
|
||||||
"if (accuracy < 0.93):\n",
|
|
||||||
" with sess.as_default():\n",
|
|
||||||
" print(\"Fine-tuning model with quantization\")\n",
|
|
||||||
" train_model(preds, in_images, img_train, label_train, is_retrain=True, train_epoch=10)\n",
|
|
||||||
" accuracy = test_model(preds, in_images, img_test, label_test) \n",
|
|
||||||
" print(\"Accuracy:\", accuracy)\n",
|
|
||||||
"```"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Save Model"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"```\n",
|
|
||||||
"model_name = 'resnet50-catsanddogs-cw'\n",
|
|
||||||
"model_save_path = os.path.join(saved_model_dir, model_name)\n",
|
|
||||||
"\n",
|
|
||||||
"tf.saved_model.simple_save(sess, model_save_path,\n",
|
|
||||||
" inputs={'images': in_images},\n",
|
|
||||||
" outputs={'output_alias': preds})\n",
|
|
||||||
"\n",
|
|
||||||
"input_tensors = in_images.name\n",
|
|
||||||
"output_tensors = preds.name\n",
|
|
||||||
"```"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"create-image\"></a>\n",
|
|
||||||
"## 7. Create AccelContainerImage\n",
|
|
||||||
"\n",
|
|
||||||
"Below we will execute all the same steps as in the [Quickstart](./accelerated-models-quickstart.ipynb#create-image) to package the model we have saved locally into an accelerated Docker image saved in our workspace. To complete all the steps, it may take a few minutes. For more details on each step, check out the [Quickstart section on model registration](./accelerated-models-quickstart.ipynb#register-model)."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core import Workspace\n",
|
|
||||||
"from azureml.core.model import Model\n",
|
|
||||||
"from azureml.core.image import Image\n",
|
|
||||||
"from azureml.accel import AccelOnnxConverter\n",
|
|
||||||
"from azureml.accel import AccelContainerImage\n",
|
|
||||||
"\n",
|
|
||||||
"# Retrieve workspace\n",
|
|
||||||
"ws = Workspace.from_config()\n",
|
|
||||||
"print(\"Successfully retrieved workspace:\", ws.name, ws.resource_group, ws.location, ws.subscription_id, '\\n')\n",
|
|
||||||
"\n",
|
|
||||||
"# Register model\n",
|
|
||||||
"registered_model = Model.register(workspace = ws,\n",
|
|
||||||
" model_path = model_save_path,\n",
|
|
||||||
" model_name = model_name)\n",
|
|
||||||
"print(\"Successfully registered: \", registered_model.name, registered_model.description, registered_model.version, '\\n', sep = '\\t')\n",
|
|
||||||
"\n",
|
|
||||||
"# Convert model\n",
|
|
||||||
"convert_request = AccelOnnxConverter.convert_tf_model(ws, registered_model, input_tensors, output_tensors)\n",
|
|
||||||
"if convert_request.wait_for_completion(show_output = False):\n",
|
|
||||||
" # If the above call succeeded, get the converted model\n",
|
|
||||||
" converted_model = convert_request.result\n",
|
|
||||||
" print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
|
|
||||||
" converted_model.id, converted_model.created_time, '\\n')\n",
|
|
||||||
"else:\n",
|
|
||||||
" print(\"Model conversion failed. Showing output.\")\n",
|
|
||||||
" convert_request.wait_for_completion(show_output = True)\n",
|
|
||||||
"\n",
|
|
||||||
"# Package into AccelContainerImage\n",
|
|
||||||
"image_config = AccelContainerImage.image_configuration()\n",
|
|
||||||
"# Image name must be lowercase\n",
|
|
||||||
"image_name = \"{}-image\".format(model_name)\n",
|
|
||||||
"image = Image.create(name = image_name,\n",
|
|
||||||
" models = [converted_model],\n",
|
|
||||||
" image_config = image_config, \n",
|
|
||||||
" workspace = ws)\n",
|
|
||||||
"image.wait_for_creation()\n",
|
|
||||||
"print(\"Created AccelContainerImage: {} {} {}\\n\".format(image.name, image.creation_state, image.image_location))"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"deploy-image\"></a>\n",
|
|
||||||
"## 8. Deploy image\n",
|
|
||||||
"Once you have an Azure ML Accelerated Image in your Workspace, you can deploy it to two destinations, to a Databox Edge machine or to an AKS cluster. \n",
|
|
||||||
"\n",
|
|
||||||
"### 8.a. Deploy to Databox Edge Machine using IoT Hub\n",
|
|
||||||
"See the sample [here](https://github.com/Azure-Samples/aml-real-time-ai/) for using the Azure IoT CLI extension for deploying your Docker image to your Databox Edge Machine.\n",
|
|
||||||
"\n",
|
|
||||||
"### 8.b. Deploy to AKS Cluster"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Create AKS ComputeTarget"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.compute import AksCompute, ComputeTarget\n",
|
|
||||||
"\n",
|
|
||||||
"# Uses the specific FPGA enabled VM (sku: Standard_PB6s)\n",
|
|
||||||
"# Standard_PB6s are available in: eastus, westus2, westeurope, southeastasia\n",
|
|
||||||
"prov_config = AksCompute.provisioning_configuration(vm_size = \"Standard_PB6s\",\n",
|
|
||||||
" agent_count = 1,\n",
|
|
||||||
" location = \"eastus\")\n",
|
|
||||||
"\n",
|
|
||||||
"aks_name = 'aks-pb6-tl'\n",
|
|
||||||
"# Create the cluster\n",
|
|
||||||
"aks_target = ComputeTarget.create(workspace = ws, \n",
|
|
||||||
" name = aks_name, \n",
|
|
||||||
" provisioning_configuration = prov_config)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Provisioning an AKS cluster might take awhile (15 or so minutes), and we want to wait until it's successfully provisioned before we can deploy a service to it. If you interrupt this cell, provisioning of the cluster will continue. You can re-run it or check the status in your Workspace under Compute."
|
|
||||||
]
|
|
||||||
},
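{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the cluster was already provisioned in an earlier run, you can retrieve it by name instead of creating it again. A minimal sketch, assuming the `aks_name` used above:\n",
"```\n",
"from azureml.core.compute import ComputeTarget\n",
"\n",
"# Look up an existing AKS compute target in the workspace by name\n",
"aks_target = ComputeTarget(workspace=ws, name=aks_name)\n",
"```"
]
},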
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"%%time\n",
|
|
||||||
"aks_target.wait_for_completion(show_output = True)\n",
|
|
||||||
"print(aks_target.provisioning_state)\n",
|
|
||||||
"print(aks_target.provisioning_errors)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"#### Deploy AccelContainerImage to AKS ComputeTarget"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {
|
|
||||||
"tags": [
|
|
||||||
"sample-akswebservice-deploy-from-image"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"%%time\n",
|
|
||||||
"from azureml.core.webservice import Webservice, AksWebservice\n",
|
|
||||||
"\n",
|
|
||||||
"# Set the web service configuration (for creating a test service, we don't want autoscale enabled)\n",
|
|
||||||
"# Authentication is enabled by default, but for testing we specify False\n",
|
|
||||||
"aks_config = AksWebservice.deploy_configuration(autoscale_enabled=False,\n",
|
|
||||||
" num_replicas=1,\n",
|
|
||||||
" auth_enabled = False)\n",
|
|
||||||
"\n",
|
|
||||||
"aks_service_name ='my-aks-service-2'\n",
|
|
||||||
"\n",
|
|
||||||
"aks_service = Webservice.deploy_from_image(workspace = ws,\n",
|
|
||||||
" name = aks_service_name,\n",
|
|
||||||
" image = image,\n",
|
|
||||||
" deployment_config = aks_config,\n",
|
|
||||||
" deployment_target = aks_target)\n",
|
|
||||||
"aks_service.wait_for_deployment(show_output = True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"test-service\"></a>\n",
|
|
||||||
"## 9. Test the service\n",
|
|
||||||
"\n",
|
|
||||||
"<a id=\"create-client\"></a>\n",
|
|
||||||
"### 9.a. Create Client\n",
|
|
||||||
"The image supports gRPC and the TensorFlow Serving \"predict\" API. We will create a PredictionClient from the Webservice object that can call into the docker image to get predictions. If you do not have the Webservice object, you can also create [PredictionClient](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel.predictionclient?view=azure-ml-py) directly.\n",
|
|
||||||
"\n",
|
|
||||||
"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice.deploy_configuration(), see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).\n",
|
|
||||||
"**WARNING:** If you are running on Azure Notebooks free compute, you will not be able to make outgoing calls to your service. Try locating your client on a different machine to consume it."
|
|
||||||
]
|
|
||||||
},
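{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you only have the scoring URI (for example, when calling from a different machine), below is a sketch of constructing the client directly; it assumes PredictionClient accepts address, port, use_ssl and service_name arguments, with access_token added when auth is enabled:\n",
"```\n",
"from azureml.accel import PredictionClient\n",
"\n",
"# Derive the host and port from the service's scoring URI\n",
"address = aks_service.scoring_uri\n",
"ssl_enabled = address.startswith(\"https\")\n",
"address = address[address.find('/') + 2:].strip('/')\n",
"port = 443 if ssl_enabled else 80\n",
"\n",
"client = PredictionClient(address=address, port=port,\n",
"                          use_ssl=ssl_enabled, service_name=aks_service.name)\n",
"```"
]
},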
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Using the grpc client in AzureML Accelerated Models SDK\n",
|
|
||||||
"from azureml.accel import client_from_service\n",
|
|
||||||
"\n",
|
|
||||||
"# Initialize AzureML Accelerated Models client\n",
|
|
||||||
"client = client_from_service(aks_service)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"serve-model\"></a>\n",
|
|
||||||
"### 9.b. Serve the model\n",
|
|
||||||
"Let's see how our service does on a few images. It may get a few wrong."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Specify an image to classify\n",
|
|
||||||
"print('CATS')\n",
|
|
||||||
"for image_file in cat_files[:8]:\n",
|
|
||||||
" results = client.score_file(path=image_file, \n",
|
|
||||||
" input_name=input_tensors, \n",
|
|
||||||
" outputs=output_tensors)\n",
|
|
||||||
" result = 'CORRECT ' if results[0] > results[1] else 'WRONG '\n",
|
|
||||||
" print(result + str(results))\n",
|
|
||||||
"print('DOGS')\n",
|
|
||||||
"for image_file in dog_files[:8]:\n",
|
|
||||||
" results = client.score_file(path=image_file, \n",
|
|
||||||
" input_name=input_tensors, \n",
|
|
||||||
" outputs=output_tensors)\n",
|
|
||||||
" result = 'CORRECT ' if results[1] > results[0] else 'WRONG '\n",
|
|
||||||
" print(result + str(results))"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"cleanup\"></a>\n",
|
|
||||||
"## 10. Cleanup\n",
|
|
||||||
"It's important to clean up your resources, so that you won't incur unnecessary costs."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"aks_service.delete()\n",
|
|
||||||
"aks_target.delete()\n",
|
|
||||||
"image.delete()\n",
|
|
||||||
"registered_model.delete()\n",
|
|
||||||
"converted_model.delete()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"<a id=\"appendix\"></a>\n",
|
|
||||||
"## 11. Appendix"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"License for plot_confusion_matrix:\n",
|
|
||||||
"\n",
|
|
||||||
"New BSD License\n",
|
|
||||||
"\n",
|
|
||||||
"Copyright (c) 2007-2018 The scikit-learn developers.\n",
|
|
||||||
"All rights reserved.\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"Redistribution and use in source and binary forms, with or without\n",
|
|
||||||
"modification, are permitted provided that the following conditions are met:\n",
|
|
||||||
"\n",
|
|
||||||
" a. Redistributions of source code must retain the above copyright notice,\n",
|
|
||||||
" this list of conditions and the following disclaimer.\n",
|
|
||||||
" b. Redistributions in binary form must reproduce the above copyright\n",
|
|
||||||
" notice, this list of conditions and the following disclaimer in the\n",
|
|
||||||
" documentation and/or other materials provided with the distribution.\n",
|
|
||||||
" c. Neither the name of the Scikit-learn Developers nor the names of\n",
|
|
||||||
" its contributors may be used to endorse or promote products\n",
|
|
||||||
" derived from this software without specific prior written\n",
|
|
||||||
" permission. \n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\"\n",
|
|
||||||
"AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\n",
|
|
||||||
"IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE\n",
|
|
||||||
"ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR\n",
|
|
||||||
"ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\n",
|
|
||||||
"DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\n",
|
|
||||||
"SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\n",
|
|
||||||
"CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT\n",
|
|
||||||
"LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY\n",
|
|
||||||
"OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH\n",
|
|
||||||
"DAMAGE.\n"
|
|
||||||
]
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "coverste"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"name": "paledger"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "python",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"codemirror_mode": {
|
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.5.6"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 2
|
|
||||||
}
|
|
||||||
@@ -1,8 +0,0 @@
|
|||||||
name: accelerated-models-training
|
|
||||||
dependencies:
|
|
||||||
- pip:
|
|
||||||
- azureml-sdk
|
|
||||||
- azureml-accel-models[cpu]
|
|
||||||
- keras
|
|
||||||
- tqdm
|
|
||||||
- sklearn
|
|
||||||
Binary file not shown.
|
Before Width: | Height: | Size: 74 KiB |
Binary file not shown.
|
Before Width: | Height: | Size: 79 KiB |
@@ -383,6 +383,8 @@
|
|||||||
"- an inference configuration\n",
|
"- an inference configuration\n",
|
||||||
"- a single column tabular dataset, where each row contains a string representing sample request data sent to the service.\n",
|
"- a single column tabular dataset, where each row contains a string representing sample request data sent to the service.\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"Please, note that profiling is a long running operation and can take up to 25 minutes depending on the size of the dataset.\n",
|
||||||
|
"\n",
|
||||||
"At this point we only support profiling of services that expect their request data to be a string, for example: string serialized json, text, string serialized image, etc. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.\n",
|
"At this point we only support profiling of services that expect their request data to be a string, for example: string serialized json, text, string serialized image, etc. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized json. In this case we created a dataset based one hundred instances of the same request data. In real world scenarios however, we suggest that you use larger datasets with various inputs, especially if your model resource usage/behavior is input dependent."
|
"Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized json. In this case we created a dataset based one hundred instances of the same request data. In real world scenarios however, we suggest that you use larger datasets with various inputs, especially if your model resource usage/behavior is input dependent."
|
||||||
@@ -483,6 +485,7 @@
|
|||||||
" cpu=1.0,\n",
|
" cpu=1.0,\n",
|
||||||
" memory_in_gb=0.5)\n",
|
" memory_in_gb=0.5)\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"# profiling is a long running operation and may take up to 25 min\n",
|
||||||
"profile.wait_for_completion(True)\n",
|
"profile.wait_for_completion(True)\n",
|
||||||
"details = profile.get_details()"
|
"details = profile.get_details()"
|
||||||
]
|
]
|
||||||
|
|||||||
@@ -86,7 +86,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"You can add tags and descriptions to your models. we are using `sklearn_regression_model.pkl` file in the current directory as a model with the name `sklearn_regression_model_local_adv` in the workspace.\n",
|
"You can add tags and descriptions to your models. we are using `sklearn_regression_model.pkl` file in the current directory as a model with the name `sklearn_regression_model` in the workspace.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Using tags, you can track useful information such as the name and version of the machine learning library used to train the model, framework, category, target customer etc. Note that tags must be alphanumeric."
|
"Using tags, you can track useful information such as the name and version of the machine learning library used to train the model, framework, category, target customer etc. Note that tags must be alphanumeric."
|
||||||
]
|
]
|
||||||
@@ -105,7 +105,7 @@
|
|||||||
"from azureml.core.model import Model\n",
|
"from azureml.core.model import Model\n",
|
||||||
"\n",
|
"\n",
|
||||||
"model = Model.register(model_path=\"sklearn_regression_model.pkl\",\n",
|
"model = Model.register(model_path=\"sklearn_regression_model.pkl\",\n",
|
||||||
" model_name=\"sklearn_regression_model_local_adv\",\n",
|
" model_name=\"sklearn_regression_model\",\n",
|
||||||
" tags={'area': \"diabetes\", 'type': \"regression\"},\n",
|
" tags={'area': \"diabetes\", 'type': \"regression\"},\n",
|
||||||
" description=\"Ridge regression model to predict diabetes\",\n",
|
" description=\"Ridge regression model to predict diabetes\",\n",
|
||||||
" workspace=ws)"
|
" workspace=ws)"
|
||||||
@@ -126,12 +126,12 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"import os\n",
|
"import os\n",
|
||||||
"\n",
|
"\n",
|
||||||
"source_directory = \"C:/abc\"\n",
|
"source_directory = \"source_directory\"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"os.makedirs(source_directory, exist_ok=True)\n",
|
"os.makedirs(source_directory, exist_ok=True)\n",
|
||||||
"os.makedirs(\"C:/abc/x/y\", exist_ok=True)\n",
|
"os.makedirs(os.path.join(source_directory, \"x/y\"), exist_ok=True)\n",
|
||||||
"os.makedirs(\"C:/abc/env\", exist_ok=True)\n",
|
"os.makedirs(os.path.join(source_directory, \"env\"), exist_ok=True)\n",
|
||||||
"os.makedirs(\"C:/abc/dockerstep\", exist_ok=True)"
|
"os.makedirs(os.path.join(source_directory, \"dockerstep\"), exist_ok=True)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -147,7 +147,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%writefile C:/abc/x/y/score.py\n",
|
"%%writefile source_directory/x/y/score.py\n",
|
||||||
"import os\n",
|
"import os\n",
|
||||||
"import pickle\n",
|
"import pickle\n",
|
||||||
"import json\n",
|
"import json\n",
|
||||||
@@ -170,7 +170,7 @@
|
|||||||
" global name\n",
|
" global name\n",
|
||||||
" # note here, entire source directory on inference config gets added into image\n",
|
" # note here, entire source directory on inference config gets added into image\n",
|
||||||
" # bellow is the example how you can use any extra files in image\n",
|
" # bellow is the example how you can use any extra files in image\n",
|
||||||
" with open('./abc/extradata.json') as json_file: \n",
|
" with open('./source_directory/extradata.json') as json_file:\n",
|
||||||
" data = json.load(json_file)\n",
|
" data = json.load(json_file)\n",
|
||||||
" name = data[\"people\"][0][\"name\"]\n",
|
" name = data[\"people\"][0][\"name\"]\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -191,9 +191,7 @@
|
|||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
"source": [
|
||||||
"Please note that you must indicate azureml-defaults with verion >= 1.0.45 as a pip dependency for your environemnt. This package contains the functionality needed to host the model as a web service."
|
"Please note that you must indicate azureml-defaults with verion >= 1.0.45 as a pip dependency for your environemnt. This package contains the functionality needed to host the model as a web service."
|
||||||
]
|
]
|
||||||
@@ -204,7 +202,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%writefile C:/abc/env/myenv.yml\n",
|
"%%writefile source_directory/env/myenv.yml\n",
|
||||||
"name: project_environment\n",
|
"name: project_environment\n",
|
||||||
"dependencies:\n",
|
"dependencies:\n",
|
||||||
" - python=3.6.2\n",
|
" - python=3.6.2\n",
|
||||||
@@ -221,7 +219,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%writefile C:/abc/extradata.json\n",
|
"%%writefile source_directory/extradata.json\n",
|
||||||
"{\n",
|
"{\n",
|
||||||
" \"people\": [\n",
|
" \"people\": [\n",
|
||||||
" {\n",
|
" {\n",
|
||||||
@@ -255,13 +253,14 @@
|
|||||||
"from azureml.core.model import InferenceConfig\n",
|
"from azureml.core.model import InferenceConfig\n",
|
||||||
"\n",
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"myenv = Environment.from_conda_specification(name='myenv', file_path='env/myenv.yml')\n",
|
"myenv = Environment.from_conda_specification(name='myenv', file_path='myenv.yml')\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# explicitly set base_image to None when setting base_dockerfile\n",
|
"# explicitly set base_image to None when setting base_dockerfile\n",
|
||||||
"myenv.docker.base_image = None\n",
|
"myenv.docker.base_image = None\n",
|
||||||
"myenv.docker.base_dockerfile = \"RUN echo \\\"this is test\\\"\"\n",
|
"myenv.docker.base_dockerfile = \"FROM mcr.microsoft.com/azureml/base:intelmpi2018.3-ubuntu16.04\\nRUN echo \\\"this is test\\\"\"\n",
|
||||||
|
"myenv.inferencing_stack_version = \"latest\"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"inference_config = InferenceConfig(source_directory=\"C:/abc\",\n",
|
"inference_config = InferenceConfig(source_directory=source_directory,\n",
|
||||||
" entry_script=\"x/y/score.py\",\n",
|
" entry_script=\"x/y/score.py\",\n",
|
||||||
" environment=myenv)\n"
|
" environment=myenv)\n"
|
||||||
]
|
]
|
||||||
@@ -379,7 +378,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"%%writefile C:/abc/x/y/score.py\n",
|
"%%writefile source_directory/x/y/score.py\n",
|
||||||
"import os\n",
|
"import os\n",
|
||||||
"import pickle\n",
|
"import pickle\n",
|
||||||
"import json\n",
|
"import json\n",
|
||||||
@@ -401,7 +400,7 @@
|
|||||||
" global name, from_location\n",
|
" global name, from_location\n",
|
||||||
" # note here, entire source directory on inference config gets added into image\n",
|
" # note here, entire source directory on inference config gets added into image\n",
|
||||||
" # bellow is the example how you can use any extra files in image\n",
|
" # bellow is the example how you can use any extra files in image\n",
|
||||||
" with open('./abc/extradata.json') as json_file: \n",
|
" with open('source_directory/extradata.json') as json_file: \n",
|
||||||
" data = json.load(json_file)\n",
|
" data = json.load(json_file)\n",
|
||||||
" name = data[\"people\"][0][\"name\"]\n",
|
" name = data[\"people\"][0][\"name\"]\n",
|
||||||
" from_location = data[\"people\"][0][\"from\"]\n",
|
" from_location = data[\"people\"][0][\"from\"]\n",
|
||||||
|
|||||||
@@ -82,7 +82,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"You can add tags and descriptions to your models. we are using `sklearn_regression_model.pkl` file in the current directory as a model with the name `sklearn_regression_model_local` in the workspace.\n",
|
"You can add tags and descriptions to your models. we are using `sklearn_regression_model.pkl` file in the current directory as a model with the name `sklearn_regression_model` in the workspace.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Using tags, you can track useful information such as the name and version of the machine learning library used to train the model, framework, category, target customer etc. Note that tags must be alphanumeric."
|
"Using tags, you can track useful information such as the name and version of the machine learning library used to train the model, framework, category, target customer etc. Note that tags must be alphanumeric."
|
||||||
]
|
]
|
||||||
@@ -100,7 +100,7 @@
|
|||||||
"from azureml.core.model import Model\n",
|
"from azureml.core.model import Model\n",
|
||||||
"\n",
|
"\n",
|
||||||
"model = Model.register(model_path=\"sklearn_regression_model.pkl\",\n",
|
"model = Model.register(model_path=\"sklearn_regression_model.pkl\",\n",
|
||||||
" model_name=\"sklearn_regression_model_local\",\n",
|
" model_name=\"sklearn_regression_model\",\n",
|
||||||
" tags={'area': \"diabetes\", 'type': \"regression\"},\n",
|
" tags={'area': \"diabetes\", 'type': \"regression\"},\n",
|
||||||
" description=\"Ridge regression model to predict diabetes\",\n",
|
" description=\"Ridge regression model to predict diabetes\",\n",
|
||||||
" workspace=ws)"
|
" workspace=ws)"
|
||||||
@@ -159,6 +159,8 @@
|
|||||||
"- an inference configuration\n",
|
"- an inference configuration\n",
|
||||||
"- a single column tabular dataset, where each row contains a string representing sample request data sent to the service.\n",
|
"- a single column tabular dataset, where each row contains a string representing sample request data sent to the service.\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"Please, note that profiling is a long running operation and can take up to 25 minutes depending on the size of the dataset.\n",
|
||||||
|
"\n",
|
||||||
"At this point we only support profiling of services that expect their request data to be a string, for example: string serialized json, text, string serialized image, etc. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.\n",
|
"At this point we only support profiling of services that expect their request data to be a string, for example: string serialized json, text, string serialized image, etc. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized json. In this case we created a dataset based one hundred instances of the same request data. In real world scenarios however, we suggest that you use larger datasets with various inputs, especially if your model resource usage/behavior is input dependent."
|
"Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized json. In this case we created a dataset based one hundred instances of the same request data. In real world scenarios however, we suggest that you use larger datasets with various inputs, especially if your model resource usage/behavior is input dependent."
|
||||||
@@ -245,6 +247,7 @@
|
|||||||
" cpu=1.0,\n",
|
" cpu=1.0,\n",
|
||||||
" memory_in_gb=0.5)\n",
|
" memory_in_gb=0.5)\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"# profiling is a long running operation and may take up to 25 min\n",
|
||||||
"profile.wait_for_completion(True)\n",
|
"profile.wait_for_completion(True)\n",
|
||||||
"details = profile.get_details()"
|
"details = profile.get_details()"
|
||||||
]
|
]
|
||||||
|
|||||||
@@ -4,4 +4,4 @@ dependencies:
|
|||||||
- azureml-sdk
|
- azureml-sdk
|
||||||
- numpy
|
- numpy
|
||||||
- git+https://github.com/apple/coremltools@v2.1
|
- git+https://github.com/apple/coremltools@v2.1
|
||||||
- onnxmltools==1.3.1
|
- onnxmltools
|
||||||
|
|||||||
@@ -6,4 +6,4 @@ dependencies:
|
|||||||
- matplotlib
|
- matplotlib
|
||||||
- numpy
|
- numpy
|
||||||
- onnx
|
- onnx
|
||||||
- opencv-python
|
- opencv-python-headless
|
||||||
|
|||||||
@@ -6,4 +6,4 @@ dependencies:
|
|||||||
- matplotlib
|
- matplotlib
|
||||||
- numpy
|
- numpy
|
||||||
- onnx
|
- onnx
|
||||||
- opencv-python
|
- opencv-python-headless
|
||||||
|
|||||||
@@ -59,8 +59,44 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"# Register the model\n",
|
"# Download the model\n",
|
||||||
"Register an existing trained model, add descirption and tags. Prior to registering the model, you should have a TensorFlow [Saved Model](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/saved_model/README.md) in the `resnet50` directory. You can download a [pretrained resnet50](http://download.tensorflow.org/models/official/20181001_resnet/savedmodels/resnet_v1_fp32_savedmodel_NCHW_jpg.tar.gz) and unpack it to that directory."
|
"\n",
|
||||||
|
"Prior to registering the model, you should have a TensorFlow [Saved Model](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/saved_model/README.md) in the `resnet50` directory. This cell will download a [pretrained resnet50](http://download.tensorflow.org/models/official/20181001_resnet/savedmodels/resnet_v1_fp32_savedmodel_NCHW_jpg.tar.gz) and unpack it to that directory."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import os\n",
|
||||||
|
"import requests\n",
|
||||||
|
"import shutil\n",
|
||||||
|
"import tarfile\n",
|
||||||
|
"import tempfile\n",
|
||||||
|
"\n",
|
||||||
|
"from io import BytesIO\n",
|
||||||
|
"\n",
|
||||||
|
"model_url = \"http://download.tensorflow.org/models/official/20181001_resnet/savedmodels/resnet_v1_fp32_savedmodel_NCHW_jpg.tar.gz\"\n",
|
||||||
|
"\n",
|
||||||
|
"archive_prefix = \"./resnet_v1_fp32_savedmodel_NCHW_jpg/1538686758/\"\n",
|
||||||
|
"target_folder = \"resnet50\"\n",
|
||||||
|
"\n",
|
||||||
|
"if not os.path.exists(target_folder):\n",
|
||||||
|
" response = requests.get(model_url)\n",
|
||||||
|
" archive = tarfile.open(fileobj=BytesIO(response.content))\n",
|
||||||
|
" with tempfile.TemporaryDirectory() as temp_folder:\n",
|
||||||
|
" archive.extractall(temp_folder)\n",
|
||||||
|
" shutil.copytree(os.path.join(temp_folder, archive_prefix), target_folder)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Register the model\n",
|
||||||
|
"Register an existing trained model, add description and tags."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -69,13 +105,13 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"#Register the model\n",
|
|
||||||
"from azureml.core.model import Model\n",
|
"from azureml.core.model import Model\n",
|
||||||
"model = Model.register(model_path = \"resnet50\", # this points to a local file\n",
|
"\n",
|
||||||
" model_name = \"resnet50\", # this is the name the model is registered as\n",
|
"model = Model.register(model_path=\"resnet50\", # This points to the local directory to upload.\n",
|
||||||
" tags = {'area': \"Image classification\", 'type': \"classification\"},\n",
|
" model_name=\"resnet50\", # This is the name the model is registered as.\n",
|
||||||
" description = \"Image classification trained on Imagenet Dataset\",\n",
|
" tags={'area': \"Image classification\", 'type': \"classification\"},\n",
|
||||||
" workspace = ws)\n",
|
" description=\"Image classification trained on Imagenet Dataset\",\n",
|
||||||
|
" workspace=ws)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"print(model.name, model.description, model.version)"
|
"print(model.name, model.description, model.version)"
|
||||||
]
|
]
|
||||||
|
|||||||
@@ -212,6 +212,8 @@
|
|||||||
"- an inference configuration\n",
|
"- an inference configuration\n",
|
||||||
"- a single column tabular dataset, where each row contains a string representing sample request data sent to the service.\n",
|
"- a single column tabular dataset, where each row contains a string representing sample request data sent to the service.\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"Please, note that profiling is a long running operation and can take up to 25 minutes depending on the size of the dataset.\n",
|
||||||
|
"\n",
|
||||||
"At this point we only support profiling of services that expect their request data to be a string, for example: string serialized json, text, string serialized image, etc. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.\n",
|
"At this point we only support profiling of services that expect their request data to be a string, for example: string serialized json, text, string serialized image, etc. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized json. In this case we created a dataset based one hundred instances of the same request data. In real world scenarios however, we suggest that you use larger datasets with various inputs, especially if your model resource usage/behavior is input dependent."
|
"Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized json. In this case we created a dataset based one hundred instances of the same request data. In real world scenarios however, we suggest that you use larger datasets with various inputs, especially if your model resource usage/behavior is input dependent."
|
||||||
@@ -312,6 +314,7 @@
|
|||||||
" cpu=1.0,\n",
|
" cpu=1.0,\n",
|
||||||
" memory_in_gb=0.5)\n",
|
" memory_in_gb=0.5)\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"# profiling is a long running operation and may take up to 25 min\n",
|
||||||
"profile.wait_for_completion(True)\n",
|
"profile.wait_for_completion(True)\n",
|
||||||
"details = profile.get_details()"
|
"details = profile.get_details()"
|
||||||
]
|
]
|
||||||
|
|||||||
@@ -243,8 +243,25 @@
|
|||||||
" 'azureml-interpret', 'sklearn-pandas', 'azureml-dataprep'\n",
|
" 'azureml-interpret', 'sklearn-pandas', 'azureml-dataprep'\n",
|
||||||
"]\n",
|
"]\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"# Note: this is to pin the scikit-learn and pandas versions to be same as notebook.\n",
|
||||||
|
"# In production scenario user would choose their dependencies\n",
|
||||||
|
"import pkg_resources\n",
|
||||||
|
"available_packages = pkg_resources.working_set\n",
|
||||||
|
"sklearn_ver = None\n",
|
||||||
|
"pandas_ver = None\n",
|
||||||
|
"for dist in available_packages:\n",
|
||||||
|
" if dist.key == 'scikit-learn':\n",
|
||||||
|
" sklearn_ver = dist.version\n",
|
||||||
|
" elif dist.key == 'pandas':\n",
|
||||||
|
" pandas_ver = dist.version\n",
|
||||||
|
"sklearn_dep = 'scikit-learn'\n",
|
||||||
|
"pandas_dep = 'pandas'\n",
|
||||||
|
"if sklearn_ver:\n",
|
||||||
|
" sklearn_dep = 'scikit-learn=={}'.format(sklearn_ver)\n",
|
||||||
|
"if pandas_ver:\n",
|
||||||
|
" pandas_dep = 'pandas=={}'.format(pandas_ver)\n",
|
||||||
"# specify CondaDependencies obj\n",
|
"# specify CondaDependencies obj\n",
|
||||||
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'],\n",
|
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n",
|
||||||
" pip_packages=azureml_pip_packages)\n",
|
" pip_packages=azureml_pip_packages)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Now submit a run on AmlCompute\n",
|
"# Now submit a run on AmlCompute\n",
|
||||||
@@ -344,8 +361,25 @@
|
|||||||
" 'azureml-interpret', 'azureml-dataprep'\n",
|
" 'azureml-interpret', 'azureml-dataprep'\n",
|
||||||
"]\n",
|
"]\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"# Note: this is to pin the scikit-learn and pandas versions to be same as notebook.\n",
|
||||||
|
"# In production scenario user would choose their dependencies\n",
|
||||||
|
"import pkg_resources\n",
|
||||||
|
"available_packages = pkg_resources.working_set\n",
|
||||||
|
"sklearn_ver = None\n",
|
||||||
|
"pandas_ver = None\n",
|
||||||
|
"for dist in available_packages:\n",
|
||||||
|
" if dist.key == 'scikit-learn':\n",
|
||||||
|
" sklearn_ver = dist.version\n",
|
||||||
|
" elif dist.key == 'pandas':\n",
|
||||||
|
" pandas_ver = dist.version\n",
|
||||||
|
"sklearn_dep = 'scikit-learn'\n",
|
||||||
|
"pandas_dep = 'pandas'\n",
|
||||||
|
"if sklearn_ver:\n",
|
||||||
|
" sklearn_dep = 'scikit-learn=={}'.format(sklearn_ver)\n",
|
||||||
|
"if pandas_ver:\n",
|
||||||
|
" pandas_dep = 'pandas=={}'.format(pandas_ver)\n",
|
||||||
"# specify CondaDependencies obj\n",
|
"# specify CondaDependencies obj\n",
|
||||||
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'],\n",
|
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n",
|
||||||
" pip_packages=azureml_pip_packages)\n",
|
" pip_packages=azureml_pip_packages)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"from azureml.core import Run\n",
|
"from azureml.core import Run\n",
|
||||||
@@ -457,8 +491,25 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"# Note: this is to pin the scikit-learn and pandas versions to be same as notebook.\n",
|
||||||
|
"# In production scenario user would choose their dependencies\n",
|
||||||
|
"import pkg_resources\n",
|
||||||
|
"available_packages = pkg_resources.working_set\n",
|
||||||
|
"sklearn_ver = None\n",
|
||||||
|
"pandas_ver = None\n",
|
||||||
|
"for dist in available_packages:\n",
|
||||||
|
" if dist.key == 'scikit-learn':\n",
|
||||||
|
" sklearn_ver = dist.version\n",
|
||||||
|
" elif dist.key == 'pandas':\n",
|
||||||
|
" pandas_ver = dist.version\n",
|
||||||
|
"sklearn_dep = 'scikit-learn'\n",
|
||||||
|
"pandas_dep = 'pandas'\n",
|
||||||
|
"if sklearn_ver:\n",
|
||||||
|
" sklearn_dep = 'scikit-learn=={}'.format(sklearn_ver)\n",
|
||||||
|
"if pandas_ver:\n",
|
||||||
|
" pandas_dep = 'pandas=={}'.format(pandas_ver)\n",
|
||||||
"# specify CondaDependencies obj\n",
|
"# specify CondaDependencies obj\n",
|
||||||
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'],\n",
|
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n",
|
||||||
" pip_packages=azureml_pip_packages)\n",
|
" pip_packages=azureml_pip_packages)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"from azureml.core import Run\n",
|
"from azureml.core import Run\n",
|
||||||
@@ -696,6 +747,7 @@
|
|||||||
"1. [Save model explanations via Azure Machine Learning Run History](../run-history/save-retrieve-explanations-run-history.ipynb)\n",
|
"1. [Save model explanations via Azure Machine Learning Run History](../run-history/save-retrieve-explanations-run-history.ipynb)\n",
|
||||||
"1. Inferencing time: deploy a classification model and explainer:\n",
|
"1. Inferencing time: deploy a classification model and explainer:\n",
|
||||||
" 1. [Deploy a locally-trained model and explainer](../scoring-time/train-explain-model-locally-and-deploy.ipynb)\n",
|
" 1. [Deploy a locally-trained model and explainer](../scoring-time/train-explain-model-locally-and-deploy.ipynb)\n",
|
||||||
|
" 1. [Deploy a locally-trained keras model and explainer](../scoring-time/train-explain-model-keras-locally-and-deploy.ipynb)\n",
|
||||||
" 1. [Deploy a remotely-trained model and explainer](../scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb)"
|
" 1. [Deploy a remotely-trained model and explainer](../scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
|||||||
@@ -591,6 +591,7 @@
|
|||||||
"1. [Run explainers remotely on Azure Machine Learning Compute (AMLCompute)](../remote-explanation/explain-model-on-amlcompute.ipynb)\n",
|
"1. [Run explainers remotely on Azure Machine Learning Compute (AMLCompute)](../remote-explanation/explain-model-on-amlcompute.ipynb)\n",
|
||||||
"1. Inferencing time: deploy a classification model and explainer:\n",
|
"1. Inferencing time: deploy a classification model and explainer:\n",
|
||||||
" 1. [Deploy a locally-trained model and explainer](../scoring-time/train-explain-model-locally-and-deploy.ipynb)\n",
|
" 1. [Deploy a locally-trained model and explainer](../scoring-time/train-explain-model-locally-and-deploy.ipynb)\n",
|
||||||
|
" 1. [Deploy a locally-trained keras model and explainer](../scoring-time/train-explain-model-keras-locally-and-deploy.ipynb)\n",
|
||||||
" 1. [Deploy a remotely-trained model and explainer](../scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb)"
|
" 1. [Deploy a remotely-trained model and explainer](../scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
|||||||
@@ -328,8 +328,25 @@
|
|||||||
"]\n",
|
"]\n",
|
||||||
" \n",
|
" \n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"# Note: this is to pin the scikit-learn and pandas versions to be same as notebook.\n",
|
||||||
|
"# In production scenario user would choose their dependencies\n",
|
||||||
|
"import pkg_resources\n",
|
||||||
|
"available_packages = pkg_resources.working_set\n",
|
||||||
|
"sklearn_ver = None\n",
|
||||||
|
"pandas_ver = None\n",
|
||||||
|
"for dist in available_packages:\n",
|
||||||
|
" if dist.key == 'scikit-learn':\n",
|
||||||
|
" sklearn_ver = dist.version\n",
|
||||||
|
" elif dist.key == 'pandas':\n",
|
||||||
|
" pandas_ver = dist.version\n",
|
||||||
|
"sklearn_dep = 'scikit-learn'\n",
|
||||||
|
"pandas_dep = 'pandas'\n",
|
||||||
|
"if sklearn_ver:\n",
|
||||||
|
" sklearn_dep = 'scikit-learn=={}'.format(sklearn_ver)\n",
|
||||||
|
"if pandas_ver:\n",
|
||||||
|
" pandas_dep = 'pandas=={}'.format(pandas_ver)\n",
|
||||||
"# specify CondaDependencies obj\n",
|
"# specify CondaDependencies obj\n",
|
||||||
"myenv = CondaDependencies.create(conda_packages=['scikit-learn', 'pandas'],\n",
|
"myenv = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n",
|
||||||
" pip_packages=['sklearn-pandas', 'pyyaml'] + azureml_pip_packages,\n",
|
" pip_packages=['sklearn-pandas', 'pyyaml'] + azureml_pip_packages,\n",
|
||||||
" pin_sdk_version=False)\n",
|
" pin_sdk_version=False)\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -453,7 +470,8 @@
 " 1. [Advanced feature transformations](https://github.com/interpretml/interpret-community/blob/master/notebooks/advanced-feature-transformations-explain-local.ipynb)\n",
 "1. [Save model explanations via Azure Machine Learning Run History](../run-history/save-retrieve-explanations-run-history.ipynb)\n",
 "1. [Run explainers remotely on Azure Machine Learning Compute (AMLCompute)](../remote-explanation/explain-model-on-amlcompute.ipynb)\n",
-"1. [Inferencing time: deploy a remotely-trained model and explainer](./train-explain-model-on-amlcompute-and-deploy.ipynb)"
+"1. [Inferencing time: deploy a remotely-trained model and explainer](./train-explain-model-on-amlcompute-and-deploy.ipynb)\n",
+"1. [Inferencing time: deploy a locally-trained keras model and explainer](./train-explain-model-keras-locally-and-deploy.ipynb)"
 ]
 },
 {
@@ -246,8 +246,25 @@
 " \n",
 "\n",
 "\n",
+"# Note: this is to pin the scikit-learn version to be same as notebook.\n",
+"# In production scenario user would choose their dependencies\n",
+"import pkg_resources\n",
+"available_packages = pkg_resources.working_set\n",
+"sklearn_ver = None\n",
+"pandas_ver = None\n",
+"for dist in available_packages:\n",
+"    if dist.key == 'scikit-learn':\n",
+"        sklearn_ver = dist.version\n",
+"    elif dist.key == 'pandas':\n",
+"        pandas_ver = dist.version\n",
+"sklearn_dep = 'scikit-learn'\n",
+"pandas_dep = 'pandas'\n",
+"if sklearn_ver:\n",
+"    sklearn_dep = 'scikit-learn=={}'.format(sklearn_ver)\n",
+"if pandas_ver:\n",
+"    pandas_dep = 'pandas=={}'.format(pandas_ver)\n",
 "# specify CondaDependencies obj\n",
-"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'],\n",
+"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n",
 " pip_packages=['sklearn_pandas', 'pyyaml'] + azureml_pip_packages,\n",
 " pin_sdk_version=False)\n",
 "# Now submit a run on AmlCompute\n",
@@ -397,8 +414,25 @@
 "]\n",
 " \n",
 "\n",
+"# Note: this is to pin the scikit-learn and pandas versions to be same as notebook.\n",
+"# In production scenario user would choose their dependencies\n",
+"import pkg_resources\n",
+"available_packages = pkg_resources.working_set\n",
+"sklearn_ver = None\n",
+"pandas_ver = None\n",
+"for dist in available_packages:\n",
+"    if dist.key == 'scikit-learn':\n",
+"        sklearn_ver = dist.version\n",
+"    elif dist.key == 'pandas':\n",
+"        pandas_ver = dist.version\n",
+"sklearn_dep = 'scikit-learn'\n",
+"pandas_dep = 'pandas'\n",
+"if sklearn_ver:\n",
+"    sklearn_dep = 'scikit-learn=={}'.format(sklearn_ver)\n",
+"if pandas_ver:\n",
+"    pandas_dep = 'pandas=={}'.format(pandas_ver)\n",
 "# specify CondaDependencies obj\n",
-"myenv = CondaDependencies.create(conda_packages=['scikit-learn', 'pandas'],\n",
+"myenv = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n",
 " pip_packages=['sklearn-pandas', 'pyyaml'] + azureml_pip_packages,\n",
 " pin_sdk_version=False)\n",
 "\n",
@@ -491,7 +525,8 @@
 " 1. [Advanced feature transformations](https://github.com/interpretml/interpret-community/blob/master/notebooks/advanced-feature-transformations-explain-local.ipynb)\n",
 "1. [Save model explanations via Azure Machine Learning Run History](../run-history/save-retrieve-explanations-run-history.ipynb)\n",
 "1. [Run explainers remotely on Azure Machine Learning Compute (AMLCompute)](../remote-explanation/explain-model-on-amlcompute.ipynb)\n",
-"1. [Inferencing time: deploy a locally-trained model and explainer](./train-explain-model-locally-and-deploy.ipynb)"
+"1. [Inferencing time: deploy a locally-trained model and explainer](./train-explain-model-locally-and-deploy.ipynb)\n",
+"1. [Inferencing time: deploy a locally-trained keras model and explainer](./train-explain-model-keras-locally-and-deploy.ipynb)"
 ]
 },
 {
@@ -537,259 +537,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"## Deploy the model in ACI\n",
|
"For model deployment, please refer to [Training, hyperparameter tune, and deploy with TensorFlow](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/ml-frameworks/tensorflow/deployment/train-hyperparameter-tune-deploy-with-tensorflow/train-hyperparameter-tune-deploy-with-tensorflow.ipynb)."
|
||||||
"Now we are ready to deploy the model as a web service running in Azure Container Instance [ACI](https://azure.microsoft.com/en-us/services/container-instances/). \n",
|
|
||||||
"### Create score.py\n",
|
|
||||||
"First, we will create a scoring script that will be invoked by the web service call. \n",
|
|
||||||
"\n",
|
|
||||||
"* Note that the scoring script must have two required functions, `init()` and `run(input_data)`. \n",
|
|
||||||
" * In `init()` function, you typically load the model into a global object. This function is executed only once when the Docker container is started. \n",
|
|
||||||
" * In `run(input_data)` function, the model is used to predict a value based on the input data. The input and output to `run` typically use JSON as serialization and de-serialization format but you are not limited to that."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"%%writefile score.py\n",
|
|
||||||
"import json\n",
|
|
||||||
"import numpy as np\n",
|
|
||||||
"import os\n",
|
|
||||||
"import tensorflow as tf\n",
|
|
||||||
"\n",
|
|
||||||
"def init():\n",
|
|
||||||
" global X, output, sess\n",
|
|
||||||
" tf.reset_default_graph()\n",
|
|
||||||
" # AZUREML_MODEL_DIR is an environment variable created during deployment.\n",
|
|
||||||
" # It is the path to the model folder (./azureml-models/$MODEL_NAME/$VERSION)\n",
|
|
||||||
" # For multiple models, it points to the folder containing all deployed models (./azureml-models)\n",
|
|
||||||
" model_root = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model')\n",
|
|
||||||
" saver = tf.train.import_meta_graph(os.path.join(model_root, 'mnist-tf.model.meta'))\n",
|
|
||||||
" X = tf.get_default_graph().get_tensor_by_name(\"network/X:0\")\n",
|
|
||||||
" output = tf.get_default_graph().get_tensor_by_name(\"network/output/MatMul:0\")\n",
|
|
||||||
" \n",
|
|
||||||
" sess = tf.Session()\n",
|
|
||||||
" saver.restore(sess, os.path.join(model_root, 'mnist-tf.model'))\n",
|
|
||||||
"\n",
|
|
||||||
"def run(raw_data):\n",
|
|
||||||
" data = np.array(json.loads(raw_data)['data'])\n",
|
|
||||||
" # make prediction\n",
|
|
||||||
" out = output.eval(session=sess, feed_dict={X: data})\n",
|
|
||||||
" y_hat = np.argmax(out, axis=1)\n",
|
|
||||||
" return y_hat.tolist()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Create myenv.yml\n",
|
|
||||||
"We also need to create an environment file so that Azure Machine Learning can install the necessary packages in the Docker image which are required by your scoring script. In this case, we need to specify packages `numpy`, `tensorflow`."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.runconfig import CondaDependencies\n",
|
|
||||||
"\n",
|
|
||||||
"cd = CondaDependencies.create()\n",
|
|
||||||
"cd.add_conda_package('numpy')\n",
|
|
||||||
"cd.add_tensorflow_conda_package()\n",
|
|
||||||
"cd.save_to_file(base_directory='./', conda_file_path='myenv.yml')\n",
|
|
||||||
"\n",
|
|
||||||
"print(cd.serialize_to_string())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Deploy to ACI\n",
|
|
||||||
"Now we can deploy. **This cell will run for about 7-8 minutes**. Behind the scene, AzureML will build a Docker container image with the given configuration, if already not available. This image will be deployed to the ACI infrastructure and the scoring script and model will be mounted on the container. The model will then be available as a web service with an HTTP endpoint to accept REST client calls."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"%%time\n",
|
|
||||||
"from azureml.core.environment import Environment\n",
|
|
||||||
"from azureml.core.model import Model, InferenceConfig\n",
|
|
||||||
"from azureml.core.webservice import AciWebservice\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"myenv = Environment.from_conda_specification(name=\"env\", file_path=\"myenv.yml\")\n",
|
|
||||||
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)\n",
|
|
||||||
"\n",
|
|
||||||
"aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, \n",
|
|
||||||
" memory_gb=1, \n",
|
|
||||||
" tags={'name':'mnist', 'framework': 'TensorFlow DNN'},\n",
|
|
||||||
" description='Tensorflow DNN on MNIST')\n",
|
|
||||||
"\n",
|
|
||||||
"service = Model.deploy(ws, 'tf-mnist-svc', [model], inference_config, aciconfig)\n",
|
|
||||||
"service.wait_for_deployment(show_output=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"**Tip: If something goes wrong with the deployment, the first thing to look at is the logs from the service by running the following command:**"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"print(service.get_logs())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"This is the scoring web service endpoint:"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"print(service.scoring_uri)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Test the deployed model\n",
|
|
||||||
"Let's test the deployed model. Pick 30 random samples from the test set, and send it to the web service hosted in ACI. Note here we are using the `run` API in the SDK to invoke the service. You can also make raw HTTP calls using any HTTP tool such as curl.\n",
|
|
||||||
"\n",
|
|
||||||
"After the invocation, we print the returned predictions and plot them along with the input images. Use red font color and inversed image (white on black) to highlight the misclassified samples. Note since the model accuracy is pretty high, you might have to run the below cell a few times before you can see a misclassified sample."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import json\n",
|
|
||||||
"\n",
|
|
||||||
"# find 30 random samples from test set\n",
|
|
||||||
"n = 30\n",
|
|
||||||
"sample_indices = np.random.permutation(X_test.shape[0])[0:n]\n",
|
|
||||||
"\n",
|
|
||||||
"test_samples = json.dumps({\"data\": X_test[sample_indices].tolist()})\n",
|
|
||||||
"test_samples = bytes(test_samples, encoding='utf8')\n",
|
|
||||||
"\n",
|
|
||||||
"# predict using the deployed model\n",
|
|
||||||
"result = service.run(input_data=test_samples)\n",
|
|
||||||
"\n",
|
|
||||||
"# compare actual value vs. the predicted values:\n",
|
|
||||||
"i = 0\n",
|
|
||||||
"plt.figure(figsize = (20, 1))\n",
|
|
||||||
"\n",
|
|
||||||
"for s in sample_indices:\n",
|
|
||||||
" plt.subplot(1, n, i + 1)\n",
|
|
||||||
" plt.axhline('')\n",
|
|
||||||
" plt.axvline('')\n",
|
|
||||||
" \n",
|
|
||||||
" # use different color for misclassified sample\n",
|
|
||||||
" font_color = 'red' if y_test[s] != result[i] else 'black'\n",
|
|
||||||
" clr_map = plt.cm.gray if y_test[s] != result[i] else plt.cm.Greys\n",
|
|
||||||
" \n",
|
|
||||||
" plt.text(x=10, y=-10, s=y_hat[s], fontsize=18, color=font_color)\n",
|
|
||||||
" plt.imshow(X_test[s].reshape(28, 28), cmap=clr_map)\n",
|
|
||||||
" \n",
|
|
||||||
" i = i + 1\n",
|
|
||||||
"plt.show()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"We can also send raw HTTP request to the service."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import requests\n",
|
|
||||||
"\n",
|
|
||||||
"# send a random row from the test set to score\n",
|
|
||||||
"random_index = np.random.randint(0, len(X_test)-1)\n",
|
|
||||||
"input_data = \"{\\\"data\\\": [\" + str(list(X_test[random_index])) + \"]}\"\n",
|
|
||||||
"\n",
|
|
||||||
"headers = {'Content-Type':'application/json'}\n",
|
|
||||||
"\n",
|
|
||||||
"resp = requests.post(service.scoring_uri, input_data, headers=headers)\n",
|
|
||||||
"\n",
|
|
||||||
"print(\"POST to url\", service.scoring_uri)\n",
|
|
||||||
"print(\"input data:\", input_data)\n",
|
|
||||||
"print(\"label:\", y_test[random_index])\n",
|
|
||||||
"print(\"prediction:\", resp.text)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Let's look at the workspace after the web service was deployed. You should see \n",
|
|
||||||
"* a registered model named 'model' and with the id 'model:1'\n",
|
|
||||||
"* an image called 'tf-mnist' and with a docker image location pointing to your workspace's Azure Container Registry (ACR) \n",
|
|
||||||
"* a webservice called 'tf-mnist' with some scoring URL"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"models = ws.models\n",
|
|
||||||
"for name, model in models.items():\n",
|
|
||||||
" print(\"Model: {}, ID: {}\".format(name, model.id))\n",
|
|
||||||
" \n",
|
|
||||||
"images = ws.images\n",
|
|
||||||
"for name, image in images.items():\n",
|
|
||||||
" print(\"Image: {}, location: {}\".format(name, image.image_location))\n",
|
|
||||||
" \n",
|
|
||||||
"webservices = ws.webservices\n",
|
|
||||||
"for name, webservice in webservices.items():\n",
|
|
||||||
" print(\"Webservice: {}, scoring URI: {}\".format(name, webservice.scoring_uri))"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Clean up\n",
|
|
||||||
"You can delete the ACI deployment with a simple delete API call."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"service.delete()"
|
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
|
|||||||
@@ -70,11 +70,7 @@
 "from azureml.core.experiment import Experiment\n",
 "from azureml.core.workspace import Workspace\n",
 "from azureml.train.automl import AutoMLConfig\n",
-"from azureml.core.compute import AmlCompute\n",
-"from azureml.core.compute import ComputeTarget\n",
 "from azureml.core.dataset import Dataset\n",
-"from azureml.core.runconfig import RunConfiguration\n",
-"from azureml.core.conda_dependencies import CondaDependencies\n",
 "\n",
 "from azureml.pipeline.steps import AutoMLStep\n",
 "\n",
@@ -138,31 +134,25 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# Choose a name for your cluster.\n",
+"from azureml.core.compute import AmlCompute\n",
+"from azureml.core.compute import ComputeTarget\n",
+"from azureml.core.compute_target import ComputeTargetException\n",
+"\n",
+"# Choose a name for your CPU cluster\n",
 "amlcompute_cluster_name = \"cpu-cluster\"\n",
 "\n",
-"found = False\n",
-"# Check if this compute target already exists in the workspace.\n",
-"cts = ws.compute_targets\n",
-"if amlcompute_cluster_name in cts and cts[amlcompute_cluster_name].type == 'AmlCompute':\n",
-"    found = True\n",
-"    print('Found existing compute target.')\n",
-"    compute_target = cts[amlcompute_cluster_name]\n",
-"    \n",
-"if not found:\n",
-"    print('Creating a new compute target...')\n",
-"    provisioning_config = AmlCompute.provisioning_configuration(vm_size = \"STANDARD_D2_V2\", # for GPU, use \"STANDARD_NC6\"\n",
-" #vm_priority = 'lowpriority', # optional\n",
-" max_nodes = 4)\n",
+"# Verify that cluster does not exist already\n",
+"try:\n",
+"    compute_target = ComputeTarget(workspace=ws, name=amlcompute_cluster_name)\n",
+"    print('Found existing cluster, use it.')\n",
+"except ComputeTargetException:\n",
+"    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2',# for GPU, use \"STANDARD_NC6\"\n",
+" #vm_priority = 'lowpriority', # optional\n",
+" max_nodes=4)\n",
+"    compute_target = ComputeTarget.create(ws, amlcompute_cluster_name, compute_config)\n",
 "\n",
-"    # Create the cluster.\n",
-"    compute_target = ComputeTarget.create(ws, amlcompute_cluster_name, provisioning_config)\n",
-"    \n",
-"    # Can poll for a minimum number of nodes and for a specific timeout.\n",
-"    # If no min_node_count is provided, it will use the scale settings for the cluster.\n",
-"    compute_target.wait_for_completion(show_output = True, min_node_count = 1, timeout_in_minutes = 10)\n",
-"    \n",
-"    # For a more detailed view of current AmlCompute status, use get_status()."
+"compute_target.wait_for_completion(show_output=True, min_node_count = 1, timeout_in_minutes = 10)\n",
+"# For a more detailed view of current AmlCompute status, use get_status()."
 ]
 },
 {
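Note on the hunk above: the try/except form is idempotent, so re-running the cell reuses an existing cluster instead of failing. A minimal sketch of the same pattern as a reusable helper (the helper name is ours; the SDK calls are the ones shown in the diff):

```python
# Illustrative helper wrapping the get-or-create pattern from the hunk above.
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException

def get_or_create_cluster(ws, name="cpu-cluster", vm_size="STANDARD_D2_V2", max_nodes=4):
    try:
        # Reuse the cluster if it already exists in the workspace.
        target = ComputeTarget(workspace=ws, name=name)
        print("Found existing cluster, use it.")
    except ComputeTargetException:
        # Otherwise provision a new AmlCompute cluster with the requested size.
        config = AmlCompute.provisioning_configuration(vm_size=vm_size, max_nodes=max_nodes)
        target = ComputeTarget.create(ws, name, config)
    target.wait_for_completion(show_output=True, min_node_count=1, timeout_in_minutes=10)
    return target
```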
@@ -686,8 +686,7 @@
 " \"n_cross_validations\": 5\n",
 "}\n",
 "\n",
-"train_X = output_split_train.parse_parquet_files(file_extension=None).keep_columns(['pickup_weekday','pickup_hour', 'distance','passengers', 'vendor'])\n",
-"train_y = output_split_train.parse_parquet_files(file_extension=None).keep_columns('cost')\n",
+"training_dataset = output_split_train.parse_parquet_files(file_extension=None).keep_columns(['pickup_weekday','pickup_hour', 'distance','passengers', 'vendor', 'cost'])\n",
 "\n",
 "automl_config = AutoMLConfig(task = 'regression',\n",
 " debug_log = 'automated_ml_errors.log',\n",

@@ -695,8 +694,8 @@
 " compute_target = aml_compute,\n",
 " run_configuration = aml_run_config,\n",
 " featurization = 'auto',\n",
-" X = train_X,\n",
-" y = train_y,\n",
+" training_data = training_dataset,\n",
+" label_column_name = 'cost',\n",
 " **automl_settings)\n",
 " \n",
 "print(\"AutoML config created.\")"
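The two hunks above replace the deprecated `X`/`y` arguments with a single `training_data` dataset plus `label_column_name`. A hedged sketch of the resulting call, reusing the variable names from the diff (which are defined earlier in that notebook):

```python
# Illustrative sketch of the new-style AutoMLConfig; variable names follow the hunks above.
from azureml.train.automl import AutoMLConfig

automl_settings = {"n_cross_validations": 5}

automl_config = AutoMLConfig(task='regression',
                             debug_log='automated_ml_errors.log',
                             compute_target=aml_compute,        # AmlCompute target created earlier in the notebook
                             run_configuration=aml_run_config,  # RunConfiguration created earlier in the notebook
                             featurization='auto',
                             training_data=training_dataset,    # dataset that still contains the 'cost' label column
                             label_column_name='cost',
                             **automl_settings)
```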
@@ -341,7 +341,7 @@
 "from azureml.core import Environment\n",
 "from azureml.core.runconfig import CondaDependencies, DEFAULT_CPU_IMAGE\n",
 "\n",
-"batch_conda_deps = CondaDependencies.create(pip_packages=[\"tensorflow==1.13.1\", \"pillow\"])\n",
+"batch_conda_deps = CondaDependencies.create(pip_packages=[\"tensorflow==1.15.2\", \"pillow\"])\n",
 "\n",
 "batch_env = Environment(name=\"batch_environment\")\n",
 "batch_env.python.conda_dependencies = batch_conda_deps\n",
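For context, the pinned TensorFlow dependency above feeds the batch-scoring `Environment` shown in the surrounding lines; a minimal sketch of the completed environment, assuming the `DEFAULT_CPU_IMAGE` base image that the same import brings in:

```python
# Minimal sketch: the batch-scoring environment built around the pinned TensorFlow dependency.
from azureml.core import Environment
from azureml.core.runconfig import CondaDependencies, DEFAULT_CPU_IMAGE

batch_conda_deps = CondaDependencies.create(pip_packages=["tensorflow==1.15.2", "pillow"])

batch_env = Environment(name="batch_environment")
batch_env.python.conda_dependencies = batch_conda_deps
batch_env.docker.base_image = DEFAULT_CPU_IMAGE  # assumption: CPU base image, as in similar batch-inference notebooks
```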
@@ -3,5 +3,6 @@ dependencies:
 - pip:
   - azureml-sdk
   - azureml-contrib-pipeline-steps
+  - azureml-pipeline-steps
   - azureml-widgets
   - requests
@@ -1,350 +0,0 @@
|
|||||||
import json
|
|
||||||
import tempfile
|
|
||||||
|
|
||||||
import numpy as np
|
|
||||||
import copy
|
|
||||||
import time
|
|
||||||
import torch
|
|
||||||
import torch._six
|
|
||||||
|
|
||||||
from pycocotools.cocoeval import COCOeval
|
|
||||||
from pycocotools.coco import COCO
|
|
||||||
import pycocotools.mask as mask_util
|
|
||||||
|
|
||||||
from collections import defaultdict
|
|
||||||
|
|
||||||
import utils
|
|
||||||
|
|
||||||
|
|
||||||
class CocoEvaluator(object):
|
|
||||||
def __init__(self, coco_gt, iou_types):
|
|
||||||
assert isinstance(iou_types, (list, tuple))
|
|
||||||
coco_gt = copy.deepcopy(coco_gt)
|
|
||||||
self.coco_gt = coco_gt
|
|
||||||
|
|
||||||
self.iou_types = iou_types
|
|
||||||
self.coco_eval = {}
|
|
||||||
for iou_type in iou_types:
|
|
||||||
self.coco_eval[iou_type] = COCOeval(coco_gt, iouType=iou_type)
|
|
||||||
|
|
||||||
self.img_ids = []
|
|
||||||
self.eval_imgs = {k: [] for k in iou_types}
|
|
||||||
|
|
||||||
def update(self, predictions):
|
|
||||||
img_ids = list(np.unique(list(predictions.keys())))
|
|
||||||
self.img_ids.extend(img_ids)
|
|
||||||
|
|
||||||
for iou_type in self.iou_types:
|
|
||||||
results = self.prepare(predictions, iou_type)
|
|
||||||
coco_dt = loadRes(self.coco_gt, results) if results else COCO()
|
|
||||||
coco_eval = self.coco_eval[iou_type]
|
|
||||||
|
|
||||||
coco_eval.cocoDt = coco_dt
|
|
||||||
coco_eval.params.imgIds = list(img_ids)
|
|
||||||
img_ids, eval_imgs = evaluate(coco_eval)
|
|
||||||
|
|
||||||
self.eval_imgs[iou_type].append(eval_imgs)
|
|
||||||
|
|
||||||
def synchronize_between_processes(self):
|
|
||||||
for iou_type in self.iou_types:
|
|
||||||
self.eval_imgs[iou_type] = np.concatenate(self.eval_imgs[iou_type], 2)
|
|
||||||
create_common_coco_eval(self.coco_eval[iou_type], self.img_ids, self.eval_imgs[iou_type])
|
|
||||||
|
|
||||||
def accumulate(self):
|
|
||||||
for coco_eval in self.coco_eval.values():
|
|
||||||
coco_eval.accumulate()
|
|
||||||
|
|
||||||
def summarize(self):
|
|
||||||
for iou_type, coco_eval in self.coco_eval.items():
|
|
||||||
print("IoU metric: {}".format(iou_type))
|
|
||||||
coco_eval.summarize()
|
|
||||||
|
|
||||||
def prepare(self, predictions, iou_type):
|
|
||||||
if iou_type == "bbox":
|
|
||||||
return self.prepare_for_coco_detection(predictions)
|
|
||||||
elif iou_type == "segm":
|
|
||||||
return self.prepare_for_coco_segmentation(predictions)
|
|
||||||
elif iou_type == "keypoints":
|
|
||||||
return self.prepare_for_coco_keypoint(predictions)
|
|
||||||
else:
|
|
||||||
raise ValueError("Unknown iou type {}".format(iou_type))
|
|
||||||
|
|
||||||
def prepare_for_coco_detection(self, predictions):
|
|
||||||
coco_results = []
|
|
||||||
for original_id, prediction in predictions.items():
|
|
||||||
if len(prediction) == 0:
|
|
||||||
continue
|
|
||||||
|
|
||||||
boxes = prediction["boxes"]
|
|
||||||
boxes = convert_to_xywh(boxes).tolist()
|
|
||||||
scores = prediction["scores"].tolist()
|
|
||||||
labels = prediction["labels"].tolist()
|
|
||||||
|
|
||||||
coco_results.extend(
|
|
||||||
[
|
|
||||||
{
|
|
||||||
"image_id": original_id,
|
|
||||||
"category_id": labels[k],
|
|
||||||
"bbox": box,
|
|
||||||
"score": scores[k],
|
|
||||||
}
|
|
||||||
for k, box in enumerate(boxes)
|
|
||||||
]
|
|
||||||
)
|
|
||||||
return coco_results
|
|
||||||
|
|
||||||
def prepare_for_coco_segmentation(self, predictions):
|
|
||||||
coco_results = []
|
|
||||||
for original_id, prediction in predictions.items():
|
|
||||||
if len(prediction) == 0:
|
|
||||||
continue
|
|
||||||
|
|
||||||
scores = prediction["scores"]
|
|
||||||
labels = prediction["labels"]
|
|
||||||
masks = prediction["masks"]
|
|
||||||
|
|
||||||
masks = masks > 0.5
|
|
||||||
|
|
||||||
scores = prediction["scores"].tolist()
|
|
||||||
labels = prediction["labels"].tolist()
|
|
||||||
|
|
||||||
rles = [
|
|
||||||
mask_util.encode(np.array(mask[0, :, :, np.newaxis], dtype=np.uint8, order="F"))[0]
|
|
||||||
for mask in masks
|
|
||||||
]
|
|
||||||
for rle in rles:
|
|
||||||
rle["counts"] = rle["counts"].decode("utf-8")
|
|
||||||
|
|
||||||
coco_results.extend(
|
|
||||||
[
|
|
||||||
{
|
|
||||||
"image_id": original_id,
|
|
||||||
"category_id": labels[k],
|
|
||||||
"segmentation": rle,
|
|
||||||
"score": scores[k],
|
|
||||||
}
|
|
||||||
for k, rle in enumerate(rles)
|
|
||||||
]
|
|
||||||
)
|
|
||||||
return coco_results
|
|
||||||
|
|
||||||
def prepare_for_coco_keypoint(self, predictions):
|
|
||||||
coco_results = []
|
|
||||||
for original_id, prediction in predictions.items():
|
|
||||||
if len(prediction) == 0:
|
|
||||||
continue
|
|
||||||
|
|
||||||
boxes = prediction["boxes"]
|
|
||||||
boxes = convert_to_xywh(boxes).tolist()
|
|
||||||
scores = prediction["scores"].tolist()
|
|
||||||
labels = prediction["labels"].tolist()
|
|
||||||
keypoints = prediction["keypoints"]
|
|
||||||
keypoints = keypoints.flatten(start_dim=1).tolist()
|
|
||||||
|
|
||||||
coco_results.extend(
|
|
||||||
[
|
|
||||||
{
|
|
||||||
"image_id": original_id,
|
|
||||||
"category_id": labels[k],
|
|
||||||
'keypoints': keypoint,
|
|
||||||
"score": scores[k],
|
|
||||||
}
|
|
||||||
for k, keypoint in enumerate(keypoints)
|
|
||||||
]
|
|
||||||
)
|
|
||||||
return coco_results
|
|
||||||
|
|
||||||
|
|
||||||
def convert_to_xywh(boxes):
|
|
||||||
xmin, ymin, xmax, ymax = boxes.unbind(1)
|
|
||||||
return torch.stack((xmin, ymin, xmax - xmin, ymax - ymin), dim=1)
|
|
||||||
|
|
||||||
|
|
||||||
def merge(img_ids, eval_imgs):
|
|
||||||
all_img_ids = utils.all_gather(img_ids)
|
|
||||||
all_eval_imgs = utils.all_gather(eval_imgs)
|
|
||||||
|
|
||||||
merged_img_ids = []
|
|
||||||
for p in all_img_ids:
|
|
||||||
merged_img_ids.extend(p)
|
|
||||||
|
|
||||||
merged_eval_imgs = []
|
|
||||||
for p in all_eval_imgs:
|
|
||||||
merged_eval_imgs.append(p)
|
|
||||||
|
|
||||||
merged_img_ids = np.array(merged_img_ids)
|
|
||||||
merged_eval_imgs = np.concatenate(merged_eval_imgs, 2)
|
|
||||||
|
|
||||||
# keep only unique (and in sorted order) images
|
|
||||||
merged_img_ids, idx = np.unique(merged_img_ids, return_index=True)
|
|
||||||
merged_eval_imgs = merged_eval_imgs[..., idx]
|
|
||||||
|
|
||||||
return merged_img_ids, merged_eval_imgs
|
|
||||||
|
|
||||||
|
|
||||||
def create_common_coco_eval(coco_eval, img_ids, eval_imgs):
|
|
||||||
img_ids, eval_imgs = merge(img_ids, eval_imgs)
|
|
||||||
img_ids = list(img_ids)
|
|
||||||
eval_imgs = list(eval_imgs.flatten())
|
|
||||||
|
|
||||||
coco_eval.evalImgs = eval_imgs
|
|
||||||
coco_eval.params.imgIds = img_ids
|
|
||||||
coco_eval._paramsEval = copy.deepcopy(coco_eval.params)
|
|
||||||
|
|
||||||
|
|
||||||
#################################################################
|
|
||||||
# From pycocotools, just removed the prints and fixed
|
|
||||||
# a Python3 bug about unicode not defined
|
|
||||||
#################################################################
|
|
||||||
|
|
||||||
# Ideally, pycocotools wouldn't have hard-coded prints
|
|
||||||
# so that we could avoid copy-pasting those two functions
|
|
||||||
|
|
||||||
def createIndex(self):
|
|
||||||
# create index
|
|
||||||
# print('creating index...')
|
|
||||||
anns, cats, imgs = {}, {}, {}
|
|
||||||
imgToAnns, catToImgs = defaultdict(list), defaultdict(list)
|
|
||||||
if 'annotations' in self.dataset:
|
|
||||||
for ann in self.dataset['annotations']:
|
|
||||||
imgToAnns[ann['image_id']].append(ann)
|
|
||||||
anns[ann['id']] = ann
|
|
||||||
|
|
||||||
if 'images' in self.dataset:
|
|
||||||
for img in self.dataset['images']:
|
|
||||||
imgs[img['id']] = img
|
|
||||||
|
|
||||||
if 'categories' in self.dataset:
|
|
||||||
for cat in self.dataset['categories']:
|
|
||||||
cats[cat['id']] = cat
|
|
||||||
|
|
||||||
if 'annotations' in self.dataset and 'categories' in self.dataset:
|
|
||||||
for ann in self.dataset['annotations']:
|
|
||||||
catToImgs[ann['category_id']].append(ann['image_id'])
|
|
||||||
|
|
||||||
# print('index created!')
|
|
||||||
|
|
||||||
# create class members
|
|
||||||
self.anns = anns
|
|
||||||
self.imgToAnns = imgToAnns
|
|
||||||
self.catToImgs = catToImgs
|
|
||||||
self.imgs = imgs
|
|
||||||
self.cats = cats
|
|
||||||
|
|
||||||
|
|
||||||
maskUtils = mask_util
|
|
||||||
|
|
||||||
|
|
||||||
def loadRes(self, resFile):
|
|
||||||
"""
|
|
||||||
Load result file and return a result api object.
|
|
||||||
:param resFile (str) : file name of result file
|
|
||||||
:return: res (obj) : result api object
|
|
||||||
"""
|
|
||||||
res = COCO()
|
|
||||||
res.dataset['images'] = [img for img in self.dataset['images']]
|
|
||||||
|
|
||||||
# print('Loading and preparing results...')
|
|
||||||
# tic = time.time()
|
|
||||||
if isinstance(resFile, torch._six.string_classes):
|
|
||||||
anns = json.load(open(resFile))
|
|
||||||
elif type(resFile) == np.ndarray:
|
|
||||||
anns = self.loadNumpyAnnotations(resFile)
|
|
||||||
else:
|
|
||||||
anns = resFile
|
|
||||||
assert type(anns) == list, 'results in not an array of objects'
|
|
||||||
annsImgIds = [ann['image_id'] for ann in anns]
|
|
||||||
assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds())), \
|
|
||||||
'Results do not correspond to current coco set'
|
|
||||||
if 'caption' in anns[0]:
|
|
||||||
imgIds = set([img['id'] for img in res.dataset['images']]) & set([ann['image_id'] for ann in anns])
|
|
||||||
res.dataset['images'] = [img for img in res.dataset['images'] if img['id'] in imgIds]
|
|
||||||
for id, ann in enumerate(anns):
|
|
||||||
ann['id'] = id + 1
|
|
||||||
elif 'bbox' in anns[0] and not anns[0]['bbox'] == []:
|
|
||||||
res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])
|
|
||||||
for id, ann in enumerate(anns):
|
|
||||||
bb = ann['bbox']
|
|
||||||
x1, x2, y1, y2 = [bb[0], bb[0] + bb[2], bb[1], bb[1] + bb[3]]
|
|
||||||
if 'segmentation' not in ann:
|
|
||||||
ann['segmentation'] = [[x1, y1, x1, y2, x2, y2, x2, y1]]
|
|
||||||
ann['area'] = bb[2] * bb[3]
|
|
||||||
ann['id'] = id + 1
|
|
||||||
ann['iscrowd'] = 0
|
|
||||||
elif 'segmentation' in anns[0]:
|
|
||||||
res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])
|
|
||||||
for id, ann in enumerate(anns):
|
|
||||||
# now only support compressed RLE format as segmentation results
|
|
||||||
ann['area'] = maskUtils.area(ann['segmentation'])
|
|
||||||
if 'bbox' not in ann:
|
|
||||||
ann['bbox'] = maskUtils.toBbox(ann['segmentation'])
|
|
||||||
ann['id'] = id + 1
|
|
||||||
ann['iscrowd'] = 0
|
|
||||||
elif 'keypoints' in anns[0]:
|
|
||||||
res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])
|
|
||||||
for id, ann in enumerate(anns):
|
|
||||||
s = ann['keypoints']
|
|
||||||
x = s[0::3]
|
|
||||||
y = s[1::3]
|
|
||||||
x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
|
|
||||||
ann['area'] = (x2 - x1) * (y2 - y1)
|
|
||||||
ann['id'] = id + 1
|
|
||||||
ann['bbox'] = [x1, y1, x2 - x1, y2 - y1]
|
|
||||||
# print('DONE (t={:0.2f}s)'.format(time.time()- tic))
|
|
||||||
|
|
||||||
res.dataset['annotations'] = anns
|
|
||||||
createIndex(res)
|
|
||||||
return res
|
|
||||||
|
|
||||||
|
|
||||||
def evaluate(self):
|
|
||||||
'''
|
|
||||||
Run per image evaluation on given images and store results (a list of dict) in self.evalImgs
|
|
||||||
:return: None
|
|
||||||
'''
|
|
||||||
# tic = time.time()
|
|
||||||
# print('Running per image evaluation...')
|
|
||||||
p = self.params
|
|
||||||
# add backward compatibility if useSegm is specified in params
|
|
||||||
if p.useSegm is not None:
|
|
||||||
p.iouType = 'segm' if p.useSegm == 1 else 'bbox'
|
|
||||||
print('useSegm (deprecated) is not None. Running {} evaluation'.format(p.iouType))
|
|
||||||
# print('Evaluate annotation type *{}*'.format(p.iouType))
|
|
||||||
p.imgIds = list(np.unique(p.imgIds))
|
|
||||||
if p.useCats:
|
|
||||||
p.catIds = list(np.unique(p.catIds))
|
|
||||||
p.maxDets = sorted(p.maxDets)
|
|
||||||
self.params = p
|
|
||||||
|
|
||||||
self._prepare()
|
|
||||||
# loop through images, area range, max detection number
|
|
||||||
catIds = p.catIds if p.useCats else [-1]
|
|
||||||
|
|
||||||
if p.iouType == 'segm' or p.iouType == 'bbox':
|
|
||||||
computeIoU = self.computeIoU
|
|
||||||
elif p.iouType == 'keypoints':
|
|
||||||
computeIoU = self.computeOks
|
|
||||||
self.ious = {
|
|
||||||
(imgId, catId): computeIoU(imgId, catId)
|
|
||||||
for imgId in p.imgIds
|
|
||||||
for catId in catIds}
|
|
||||||
|
|
||||||
evaluateImg = self.evaluateImg
|
|
||||||
maxDet = p.maxDets[-1]
|
|
||||||
evalImgs = [
|
|
||||||
evaluateImg(imgId, catId, areaRng, maxDet)
|
|
||||||
for catId in catIds
|
|
||||||
for areaRng in p.areaRng
|
|
||||||
for imgId in p.imgIds
|
|
||||||
]
|
|
||||||
# this is NOT in the pycocotools code, but could be done outside
|
|
||||||
evalImgs = np.asarray(evalImgs).reshape(
|
|
||||||
len(catIds), len(p.areaRng), len(p.imgIds))
|
|
||||||
self._paramsEval = copy.deepcopy(self.params)
|
|
||||||
# toc = time.time()
|
|
||||||
# print('DONE (t={:0.2f}s).'.format(toc-tic))
|
|
||||||
return p.imgIds, evalImgs
|
|
||||||
|
|
||||||
#################################################################
|
|
||||||
# end of straight copy from pycocotools, just removing the prints
|
|
||||||
#################################################################
|
|
||||||
@@ -1,252 +0,0 @@
|
|||||||
import copy
|
|
||||||
import os
|
|
||||||
from PIL import Image
|
|
||||||
|
|
||||||
import torch
|
|
||||||
import torch.utils.data
|
|
||||||
import torchvision
|
|
||||||
|
|
||||||
from pycocotools import mask as coco_mask
|
|
||||||
from pycocotools.coco import COCO
|
|
||||||
|
|
||||||
import transforms as T
|
|
||||||
|
|
||||||
|
|
||||||
class FilterAndRemapCocoCategories(object):
|
|
||||||
def __init__(self, categories, remap=True):
|
|
||||||
self.categories = categories
|
|
||||||
self.remap = remap
|
|
||||||
|
|
||||||
def __call__(self, image, target):
|
|
||||||
anno = target["annotations"]
|
|
||||||
anno = [obj for obj in anno if obj["category_id"] in self.categories]
|
|
||||||
if not self.remap:
|
|
||||||
target["annotations"] = anno
|
|
||||||
return image, target
|
|
||||||
anno = copy.deepcopy(anno)
|
|
||||||
for obj in anno:
|
|
||||||
obj["category_id"] = self.categories.index(obj["category_id"])
|
|
||||||
target["annotations"] = anno
|
|
||||||
return image, target
|
|
||||||
|
|
||||||
|
|
||||||
def convert_coco_poly_to_mask(segmentations, height, width):
|
|
||||||
masks = []
|
|
||||||
for polygons in segmentations:
|
|
||||||
rles = coco_mask.frPyObjects(polygons, height, width)
|
|
||||||
mask = coco_mask.decode(rles)
|
|
||||||
if len(mask.shape) < 3:
|
|
||||||
mask = mask[..., None]
|
|
||||||
mask = torch.as_tensor(mask, dtype=torch.uint8)
|
|
||||||
mask = mask.any(dim=2)
|
|
||||||
masks.append(mask)
|
|
||||||
if masks:
|
|
||||||
masks = torch.stack(masks, dim=0)
|
|
||||||
else:
|
|
||||||
masks = torch.zeros((0, height, width), dtype=torch.uint8)
|
|
||||||
return masks
|
|
||||||
|
|
||||||
|
|
||||||
class ConvertCocoPolysToMask(object):
|
|
||||||
def __call__(self, image, target):
|
|
||||||
w, h = image.size
|
|
||||||
|
|
||||||
image_id = target["image_id"]
|
|
||||||
image_id = torch.tensor([image_id])
|
|
||||||
|
|
||||||
anno = target["annotations"]
|
|
||||||
|
|
||||||
anno = [obj for obj in anno if obj['iscrowd'] == 0]
|
|
||||||
|
|
||||||
boxes = [obj["bbox"] for obj in anno]
|
|
||||||
# guard against no boxes via resizing
|
|
||||||
boxes = torch.as_tensor(boxes, dtype=torch.float32).reshape(-1, 4)
|
|
||||||
boxes[:, 2:] += boxes[:, :2]
|
|
||||||
boxes[:, 0::2].clamp_(min=0, max=w)
|
|
||||||
boxes[:, 1::2].clamp_(min=0, max=h)
|
|
||||||
|
|
||||||
classes = [obj["category_id"] for obj in anno]
|
|
||||||
classes = torch.tensor(classes, dtype=torch.int64)
|
|
||||||
|
|
||||||
segmentations = [obj["segmentation"] for obj in anno]
|
|
||||||
masks = convert_coco_poly_to_mask(segmentations, h, w)
|
|
||||||
|
|
||||||
keypoints = None
|
|
||||||
if anno and "keypoints" in anno[0]:
|
|
||||||
keypoints = [obj["keypoints"] for obj in anno]
|
|
||||||
keypoints = torch.as_tensor(keypoints, dtype=torch.float32)
|
|
||||||
num_keypoints = keypoints.shape[0]
|
|
||||||
if num_keypoints:
|
|
||||||
keypoints = keypoints.view(num_keypoints, -1, 3)
|
|
||||||
|
|
||||||
keep = (boxes[:, 3] > boxes[:, 1]) & (boxes[:, 2] > boxes[:, 0])
|
|
||||||
boxes = boxes[keep]
|
|
||||||
classes = classes[keep]
|
|
||||||
masks = masks[keep]
|
|
||||||
if keypoints is not None:
|
|
||||||
keypoints = keypoints[keep]
|
|
||||||
|
|
||||||
target = {}
|
|
||||||
target["boxes"] = boxes
|
|
||||||
target["labels"] = classes
|
|
||||||
target["masks"] = masks
|
|
||||||
target["image_id"] = image_id
|
|
||||||
if keypoints is not None:
|
|
||||||
target["keypoints"] = keypoints
|
|
||||||
|
|
||||||
# for conversion to coco api
|
|
||||||
area = torch.tensor([obj["area"] for obj in anno])
|
|
||||||
iscrowd = torch.tensor([obj["iscrowd"] for obj in anno])
|
|
||||||
target["area"] = area
|
|
||||||
target["iscrowd"] = iscrowd
|
|
||||||
|
|
||||||
return image, target
|
|
||||||
|
|
||||||
|
|
||||||
def _coco_remove_images_without_annotations(dataset, cat_list=None):
|
|
||||||
def _has_only_empty_bbox(anno):
|
|
||||||
return all(any(o <= 1 for o in obj["bbox"][2:]) for obj in anno)
|
|
||||||
|
|
||||||
def _count_visible_keypoints(anno):
|
|
||||||
return sum(sum(1 for v in ann["keypoints"][2::3] if v > 0) for ann in anno)
|
|
||||||
|
|
||||||
min_keypoints_per_image = 10
|
|
||||||
|
|
||||||
def _has_valid_annotation(anno):
|
|
||||||
# if it's empty, there is no annotation
|
|
||||||
if len(anno) == 0:
|
|
||||||
return False
|
|
||||||
# if all boxes have close to zero area, there is no annotation
|
|
||||||
if _has_only_empty_bbox(anno):
|
|
||||||
return False
|
|
||||||
# keypoints task have a slight different critera for considering
|
|
||||||
# if an annotation is valid
|
|
||||||
if "keypoints" not in anno[0]:
|
|
||||||
return True
|
|
||||||
# for keypoint detection tasks, only consider valid images those
|
|
||||||
# containing at least min_keypoints_per_image
|
|
||||||
if _count_visible_keypoints(anno) >= min_keypoints_per_image:
|
|
||||||
return True
|
|
||||||
return False
|
|
||||||
|
|
||||||
assert isinstance(dataset, torchvision.datasets.CocoDetection)
|
|
||||||
ids = []
|
|
||||||
for ds_idx, img_id in enumerate(dataset.ids):
|
|
||||||
ann_ids = dataset.coco.getAnnIds(imgIds=img_id, iscrowd=None)
|
|
||||||
anno = dataset.coco.loadAnns(ann_ids)
|
|
||||||
if cat_list:
|
|
||||||
anno = [obj for obj in anno if obj["category_id"] in cat_list]
|
|
||||||
if _has_valid_annotation(anno):
|
|
||||||
ids.append(ds_idx)
|
|
||||||
|
|
||||||
dataset = torch.utils.data.Subset(dataset, ids)
|
|
||||||
return dataset
|
|
||||||
|
|
||||||
|
|
||||||
def convert_to_coco_api(ds):
|
|
||||||
coco_ds = COCO()
|
|
||||||
# annotation IDs need to start at 1, not 0, see torchvision issue #1530
|
|
||||||
ann_id = 1
|
|
||||||
dataset = {'images': [], 'categories': [], 'annotations': []}
|
|
||||||
categories = set()
|
|
||||||
for img_idx in range(len(ds)):
|
|
||||||
# find better way to get target
|
|
||||||
# targets = ds.get_annotations(img_idx)
|
|
||||||
img, targets = ds[img_idx]
|
|
||||||
image_id = targets["image_id"].item()
|
|
||||||
img_dict = {}
|
|
||||||
img_dict['id'] = image_id
|
|
||||||
img_dict['height'] = img.shape[-2]
|
|
||||||
img_dict['width'] = img.shape[-1]
|
|
||||||
dataset['images'].append(img_dict)
|
|
||||||
bboxes = targets["boxes"]
|
|
||||||
bboxes[:, 2:] -= bboxes[:, :2]
|
|
||||||
bboxes = bboxes.tolist()
|
|
||||||
labels = targets['labels'].tolist()
|
|
||||||
areas = targets['area'].tolist()
|
|
||||||
iscrowd = targets['iscrowd'].tolist()
|
|
||||||
if 'masks' in targets:
|
|
||||||
masks = targets['masks']
|
|
||||||
# make masks Fortran contiguous for coco_mask
|
|
||||||
masks = masks.permute(0, 2, 1).contiguous().permute(0, 2, 1)
|
|
||||||
if 'keypoints' in targets:
|
|
||||||
keypoints = targets['keypoints']
|
|
||||||
keypoints = keypoints.reshape(keypoints.shape[0], -1).tolist()
|
|
||||||
num_objs = len(bboxes)
|
|
||||||
for i in range(num_objs):
|
|
||||||
ann = {}
|
|
||||||
ann['image_id'] = image_id
|
|
||||||
ann['bbox'] = bboxes[i]
|
|
||||||
ann['category_id'] = labels[i]
|
|
||||||
categories.add(labels[i])
|
|
||||||
ann['area'] = areas[i]
|
|
||||||
ann['iscrowd'] = iscrowd[i]
|
|
||||||
ann['id'] = ann_id
|
|
||||||
if 'masks' in targets:
|
|
||||||
ann["segmentation"] = coco_mask.encode(masks[i].numpy())
|
|
||||||
if 'keypoints' in targets:
|
|
||||||
ann['keypoints'] = keypoints[i]
|
|
||||||
ann['num_keypoints'] = sum(k != 0 for k in keypoints[i][2::3])
|
|
||||||
dataset['annotations'].append(ann)
|
|
||||||
ann_id += 1
|
|
||||||
dataset['categories'] = [{'id': i} for i in sorted(categories)]
|
|
||||||
coco_ds.dataset = dataset
|
|
||||||
coco_ds.createIndex()
|
|
||||||
return coco_ds
|
|
||||||
|
|
||||||
|
|
||||||
def get_coco_api_from_dataset(dataset):
|
|
||||||
for _ in range(10):
|
|
||||||
if isinstance(dataset, torchvision.datasets.CocoDetection):
|
|
||||||
break
|
|
||||||
if isinstance(dataset, torch.utils.data.Subset):
|
|
||||||
dataset = dataset.dataset
|
|
||||||
if isinstance(dataset, torchvision.datasets.CocoDetection):
|
|
||||||
return dataset.coco
|
|
||||||
return convert_to_coco_api(dataset)
|
|
||||||
|
|
||||||
|
|
||||||
class CocoDetection(torchvision.datasets.CocoDetection):
|
|
||||||
def __init__(self, img_folder, ann_file, transforms):
|
|
||||||
super(CocoDetection, self).__init__(img_folder, ann_file)
|
|
||||||
self._transforms = transforms
|
|
||||||
|
|
||||||
def __getitem__(self, idx):
|
|
||||||
img, target = super(CocoDetection, self).__getitem__(idx)
|
|
||||||
image_id = self.ids[idx]
|
|
||||||
target = dict(image_id=image_id, annotations=target)
|
|
||||||
if self._transforms is not None:
|
|
||||||
img, target = self._transforms(img, target)
|
|
||||||
return img, target
|
|
||||||
|
|
||||||
|
|
||||||
def get_coco(root, image_set, transforms, mode='instances'):
|
|
||||||
anno_file_template = "{}_{}2017.json"
|
|
||||||
PATHS = {
|
|
||||||
"train": ("train2017", os.path.join("annotations", anno_file_template.format(mode, "train"))),
|
|
||||||
"val": ("val2017", os.path.join("annotations", anno_file_template.format(mode, "val"))),
|
|
||||||
# "train": ("val2017", os.path.join("annotations", anno_file_template.format(mode, "val")))
|
|
||||||
}
|
|
||||||
|
|
||||||
t = [ConvertCocoPolysToMask()]
|
|
||||||
|
|
||||||
if transforms is not None:
|
|
||||||
t.append(transforms)
|
|
||||||
transforms = T.Compose(t)
|
|
||||||
|
|
||||||
img_folder, ann_file = PATHS[image_set]
|
|
||||||
img_folder = os.path.join(root, img_folder)
|
|
||||||
ann_file = os.path.join(root, ann_file)
|
|
||||||
|
|
||||||
dataset = CocoDetection(img_folder, ann_file, transforms=transforms)
|
|
||||||
|
|
||||||
if image_set == "train":
|
|
||||||
dataset = _coco_remove_images_without_annotations(dataset)
|
|
||||||
|
|
||||||
# dataset = torch.utils.data.Subset(dataset, [i for i in range(500)])
|
|
||||||
|
|
||||||
return dataset
|
|
||||||
|
|
||||||
|
|
||||||
def get_coco_kp(root, image_set, transforms):
|
|
||||||
return get_coco(root, image_set, transforms, mode="person_keypoints")
|
|
||||||
@@ -1,77 +0,0 @@
|
|||||||
import numpy as np
|
|
||||||
import os
|
|
||||||
import torch.utils.data
|
|
||||||
|
|
||||||
from azureml.core import Run
|
|
||||||
from PIL import Image
|
|
||||||
|
|
||||||
|
|
||||||
class PennFudanDataset(torch.utils.data.Dataset):
|
|
||||||
def __init__(self, root, transforms=None):
|
|
||||||
self.root = root
|
|
||||||
self.transforms = transforms
|
|
||||||
|
|
||||||
# load all image files, sorting them to ensure that they are aligned
|
|
||||||
self.img_dir = os.path.join(root, "PNGImages")
|
|
||||||
self.mask_dir = os.path.join(root, "PedMasks")
|
|
||||||
|
|
||||||
self.imgs = list(sorted(os.listdir(self.img_dir)))
|
|
||||||
self.masks = list(sorted(os.listdir(self.mask_dir)))
|
|
||||||
|
|
||||||
def __getitem__(self, idx):
|
|
||||||
# load images ad masks
|
|
||||||
img_path = os.path.join(self.img_dir, self.imgs[idx])
|
|
||||||
mask_path = os.path.join(self.mask_dir, self.masks[idx])
|
|
||||||
|
|
||||||
img = Image.open(img_path).convert("RGB")
|
|
||||||
# note that we haven't converted the mask to RGB,
|
|
||||||
# because each color corresponds to a different instance
|
|
||||||
# with 0 being background
|
|
||||||
mask = Image.open(mask_path)
|
|
||||||
|
|
||||||
mask = np.array(mask)
|
|
||||||
# instances are encoded as different colors
|
|
||||||
obj_ids = np.unique(mask)
|
|
||||||
# first id is the background, so remove it
|
|
||||||
obj_ids = obj_ids[1:]
|
|
||||||
|
|
||||||
# split the color-encoded mask into a set
|
|
||||||
# of binary masks
|
|
||||||
masks = mask == obj_ids[:, None, None]
|
|
||||||
|
|
||||||
# get bounding box coordinates for each mask
|
|
||||||
num_objs = len(obj_ids)
|
|
||||||
boxes = []
|
|
||||||
for i in range(num_objs):
|
|
||||||
pos = np.where(masks[i])
|
|
||||||
xmin = np.min(pos[1])
|
|
||||||
xmax = np.max(pos[1])
|
|
||||||
ymin = np.min(pos[0])
|
|
||||||
ymax = np.max(pos[0])
|
|
||||||
boxes.append([xmin, ymin, xmax, ymax])
|
|
||||||
|
|
||||||
boxes = torch.as_tensor(boxes, dtype=torch.float32)
|
|
||||||
# there is only one class
|
|
||||||
labels = torch.ones((num_objs,), dtype=torch.int64)
|
|
||||||
masks = torch.as_tensor(masks, dtype=torch.uint8)
|
|
||||||
|
|
||||||
image_id = torch.tensor([idx])
|
|
||||||
area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
|
|
||||||
# suppose all instances are not crowd
|
|
||||||
iscrowd = torch.zeros((num_objs,), dtype=torch.int64)
|
|
||||||
|
|
||||||
target = {}
|
|
||||||
target["boxes"] = boxes
|
|
||||||
target["labels"] = labels
|
|
||||||
target["masks"] = masks
|
|
||||||
target["image_id"] = image_id
|
|
||||||
target["area"] = area
|
|
||||||
target["iscrowd"] = iscrowd
|
|
||||||
|
|
||||||
if self.transforms is not None:
|
|
||||||
img, target = self.transforms(img, target)
|
|
||||||
|
|
||||||
return img, target
|
|
||||||
|
|
||||||
def __len__(self):
|
|
||||||
return len(self.imgs)
|
|
||||||
@@ -1,16 +0,0 @@
-# From https://github.com/microsoft/AzureML-BERT/blob/master/finetune/PyTorch/dockerfile
-
-FROM mcr.microsoft.com/azureml/base-gpu:openmpi3.1.2-cuda10.1-cudnn7-ubuntu18.04
-
-RUN apt update && apt install git -y && rm -rf /var/lib/apt/lists/*
-
-RUN /opt/miniconda/bin/conda update -n base -c defaults conda
-RUN /opt/miniconda/bin/conda install -y cython=0.29.15 numpy=1.18.1
-RUN /opt/miniconda/bin/conda install -y pytorch=1.4 torchvision=0.5.0 -c pytorch
-
-# Install cocoapi, required for drawing bounding boxes
-RUN git clone https://github.com/cocodataset/cocoapi.git && cd cocoapi/PythonAPI && python setup.py build_ext install
-
-RUN pip install azureml-defaults
-RUN pip install "azureml-dataprep[fuse]"
-RUN pip install pandas pyarrow
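The Dockerfile removed above was used to build a custom training image; for readers wondering how such a file plugs into the SDK, a hedged sketch (the environment name and file path are assumptions):

```python
# Illustrative only: pointing an Azure ML Environment at a custom Dockerfile like the one removed above.
from azureml.core import Environment

detection_env = Environment(name="maskrcnn-env")        # assumed name
detection_env.docker.base_dockerfile = "./dockerfile"   # assumed path to the Dockerfile shown above
detection_env.python.user_managed_dependencies = True   # dependencies are baked into the image
```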
@@ -1,108 +0,0 @@
|
|||||||
import math
|
|
||||||
import sys
|
|
||||||
import time
|
|
||||||
import torch
|
|
||||||
|
|
||||||
import torchvision.models.detection.mask_rcnn
|
|
||||||
|
|
||||||
from coco_utils import get_coco_api_from_dataset
|
|
||||||
from coco_eval import CocoEvaluator
|
|
||||||
import utils
|
|
||||||
|
|
||||||
|
|
||||||
def train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq):
|
|
||||||
model.train()
|
|
||||||
metric_logger = utils.MetricLogger(delimiter=" ")
|
|
||||||
metric_logger.add_meter('lr', utils.SmoothedValue(window_size=1, fmt='{value:.6f}'))
|
|
||||||
header = 'Epoch: [{}]'.format(epoch)
|
|
||||||
|
|
||||||
lr_scheduler = None
|
|
||||||
if epoch == 0:
|
|
||||||
warmup_factor = 1. / 1000
|
|
||||||
warmup_iters = min(1000, len(data_loader) - 1)
|
|
||||||
|
|
||||||
lr_scheduler = utils.warmup_lr_scheduler(optimizer, warmup_iters, warmup_factor)
|
|
||||||
|
|
||||||
for images, targets in metric_logger.log_every(data_loader, print_freq, header):
|
|
||||||
images = list(image.to(device) for image in images)
|
|
||||||
targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
|
|
||||||
|
|
||||||
loss_dict = model(images, targets)
|
|
||||||
|
|
||||||
losses = sum(loss for loss in loss_dict.values())
|
|
||||||
|
|
||||||
# reduce losses over all GPUs for logging purposes
|
|
||||||
loss_dict_reduced = utils.reduce_dict(loss_dict)
|
|
||||||
losses_reduced = sum(loss for loss in loss_dict_reduced.values())
|
|
||||||
|
|
||||||
loss_value = losses_reduced.item()
|
|
||||||
|
|
||||||
if not math.isfinite(loss_value):
|
|
||||||
print("Loss is {}, stopping training".format(loss_value))
|
|
||||||
print(loss_dict_reduced)
|
|
||||||
sys.exit(1)
|
|
||||||
|
|
||||||
optimizer.zero_grad()
|
|
||||||
losses.backward()
|
|
||||||
optimizer.step()
|
|
||||||
|
|
||||||
if lr_scheduler is not None:
|
|
||||||
lr_scheduler.step()
|
|
||||||
|
|
||||||
metric_logger.update(loss=losses_reduced, **loss_dict_reduced)
|
|
||||||
metric_logger.update(lr=optimizer.param_groups[0]["lr"])
|
|
||||||
|
|
||||||
|
|
||||||
def _get_iou_types(model):
|
|
||||||
model_without_ddp = model
|
|
||||||
if isinstance(model, torch.nn.parallel.DistributedDataParallel):
|
|
||||||
model_without_ddp = model.module
|
|
||||||
iou_types = ["bbox"]
|
|
||||||
if isinstance(model_without_ddp, torchvision.models.detection.MaskRCNN):
|
|
||||||
iou_types.append("segm")
|
|
||||||
if isinstance(model_without_ddp, torchvision.models.detection.KeypointRCNN):
|
|
||||||
iou_types.append("keypoints")
|
|
||||||
return iou_types
|
|
||||||
|
|
||||||
|
|
||||||
@torch.no_grad()
|
|
||||||
def evaluate(model, data_loader, device):
|
|
||||||
n_threads = torch.get_num_threads()
|
|
||||||
# FIXME remove this and make paste_masks_in_image run on the GPU
|
|
||||||
torch.set_num_threads(1)
|
|
||||||
cpu_device = torch.device("cpu")
|
|
||||||
model.eval()
|
|
||||||
metric_logger = utils.MetricLogger(delimiter=" ")
|
|
||||||
header = 'Test:'
|
|
||||||
|
|
||||||
coco = get_coco_api_from_dataset(data_loader.dataset)
|
|
||||||
iou_types = _get_iou_types(model)
|
|
||||||
coco_evaluator = CocoEvaluator(coco, iou_types)
|
|
||||||
|
|
||||||
for image, targets in metric_logger.log_every(data_loader, 100, header):
|
|
||||||
image = list(img.to(device) for img in image)
|
|
||||||
targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
|
|
||||||
|
|
||||||
torch.cuda.synchronize()
|
|
||||||
model_time = time.time()
|
|
||||||
outputs = model(image)
|
|
||||||
|
|
||||||
outputs = [{k: v.to(cpu_device) for k, v in t.items()} for t in outputs]
|
|
||||||
model_time = time.time() - model_time
|
|
||||||
|
|
||||||
res = {target["image_id"].item(): output for target, output in zip(targets, outputs)}
|
|
||||||
evaluator_time = time.time()
|
|
||||||
coco_evaluator.update(res)
|
|
||||||
evaluator_time = time.time() - evaluator_time
|
|
||||||
metric_logger.update(model_time=model_time, evaluator_time=evaluator_time)
|
|
||||||
|
|
||||||
# gather the stats from all processes
|
|
||||||
metric_logger.synchronize_between_processes()
|
|
||||||
print("Averaged stats:", metric_logger)
|
|
||||||
coco_evaluator.synchronize_between_processes()
|
|
||||||
|
|
||||||
# accumulate predictions from all images
|
|
||||||
coco_evaluator.accumulate()
|
|
||||||
coco_evaluator.summarize()
|
|
||||||
torch.set_num_threads(n_threads)
|
|
||||||
return coco_evaluator
|
|
||||||
@@ -1,23 +0,0 @@
-import torchvision
-
-from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
-from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor
-
-
-def get_instance_segmentation_model(num_classes):
-    # load an instance segmentation model pre-trained on COCO
-    model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
-
-    # get the number of input features for the classifier
-    in_features = model.roi_heads.box_predictor.cls_score.in_features
-    # replace the pre-trained head with a new one
-    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
-
-    # now get the number of input features for the mask classifier
-    in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
-    hidden_layer = 256
-    # and replace the mask predictor with a new one
-    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask,
-                                                       hidden_layer,
-                                                       num_classes)
-    return model
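For reference, a minimal usage sketch of the removed `get_instance_segmentation_model` helper (illustrative only; it relies on the torchvision behavior that detection models in eval mode take a list of image tensors and return one prediction dict per image):

```python
# Illustrative usage of the removed helper; not part of this change set.
import torch

model = get_instance_segmentation_model(num_classes=2)  # background + pedestrian
model.eval()
with torch.no_grad():
    # One random 3xHxW image; real code would pass normalized dataset images.
    predictions = model([torch.rand(3, 300, 400)])
print(predictions[0]["boxes"].shape, predictions[0]["masks"].shape)
```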
@@ -1,544 +0,0 @@
|
|||||||
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
|
||||||
"\n",
|
|
||||||
"Licensed under the MIT License."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"# Object detection with PyTorch, Mask R-CNN, and a custom Dockerfile\n",
|
|
||||||
"\n",
|
|
||||||
"In this tutorial, you will finetune a pre-trained [Mask R-CNN](https://arxiv.org/abs/1703.06870) model on images from the [Penn-Fudan Database for Pedestrian Detection and Segmentation](https://www.cis.upenn.edu/~jshi/ped_html/). The dataset has 170 images with 345 instances of pedestrians. After running this tutorial, you will have a model that can outline the silhouettes of all pedestrians within an image.\n",
|
|
||||||
"\n",
|
|
||||||
"You\u00e2\u20ac\u2122ll use Azure Machine Learning to: \n",
|
|
||||||
"\n",
|
|
||||||
"- Initialize a workspace \n",
|
|
||||||
"- Create a compute cluster\n",
|
|
||||||
"- Define a training environment\n",
|
|
||||||
"- Train a model remotely\n",
|
|
||||||
"- Register your model\n",
|
|
||||||
"- Generate predictions locally\n",
|
|
||||||
"\n",
|
|
||||||
"## Prerequisities\n",
|
|
||||||
"\n",
|
|
||||||
"- If you are using an Azure Machine Learning Notebook VM, your environment already meets these prerequisites. Otherwise, go through the [configuration notebook](../../../../../configuration.ipynb) to install the Azure Machine Learning Python SDK and [create an Azure ML Workspace](https://docs.microsoft.com/azure/machine-learning/how-to-manage-workspace#create-a-workspace). You also need matplotlib 3.2, pycocotools-2.0.0, torchvision >= 0.5.0 and torch >= 1.4.0.\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Check core SDK version number, check other dependencies\n",
|
|
||||||
"import azureml.core\n",
|
|
||||||
"import matplotlib\n",
|
|
||||||
"import pycocotools\n",
|
|
||||||
"import torch\n",
|
|
||||||
"import torchvision\n",
|
|
||||||
"\n",
|
|
||||||
"print(\"SDK version:\", azureml.core.VERSION)\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Diagnostics\n",
|
|
||||||
"\n",
|
|
||||||
"Opt-in diagnostics for better experience, quality, and security in future releases."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.telemetry import set_diagnostics_collection\n",
|
|
||||||
"\n",
|
|
||||||
"set_diagnostics_collection(send_diagnostics=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Initialize a workspace\n",
|
|
||||||
"\n",
|
|
||||||
"Initialize a [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) object from the existing workspace you created in the Prerequisites step. `Workspace.from_config()` creates a workspace object from the details stored in `config.json`, using the [from_config()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.workspace(class)?view=azure-ml-py#from-config-path-none--auth-none---logger-none---file-name-none-) method."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.workspace import Workspace\n",
|
|
||||||
"\n",
|
|
||||||
"ws = Workspace.from_config()\n",
|
|
||||||
"print('Workspace name: ' + ws.name, \n",
|
|
||||||
" 'Azure region: ' + ws.location, \n",
|
|
||||||
" 'Subscription id: ' + ws.subscription_id, \n",
|
|
||||||
" 'Resource group: ' + ws.resource_group, sep='\\n')"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Create or attach existing Azure ML Managed Compute\n",
|
|
||||||
"\n",
|
|
||||||
"You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/concept-compute-target) for training your model. In this tutorial, we use [Azure ML managed compute](https://docs.microsoft.com/azure/machine-learning/how-to-set-up-training-targets#amlcompute) for our remote training compute resource. Specifically, the below code creates a `STANDARD_NC6` GPU cluster that autoscales from 0 to 4 nodes.\n",
|
|
||||||
"\n",
|
|
||||||
"**Creation of Compute takes approximately 5 minutes.** If the Aauzre ML Compute with that name is already in your workspace, this code will skip the creation process. \n",
|
|
||||||
"\n",
|
|
||||||
"As with other Azure servies, there are limits on certain resources associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/azure/machine-learning/how-to-manage-quotas) on the default limits and how to request more quota.\n",
|
|
||||||
"\n",
|
|
||||||
"> Note that the below code creates GPU compute. If you instead want to create CPU compute, provide a different VM size to the `vm_size` parameter, such as `STANDARD_D2_V2`."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
|
||||||
"from azureml.core.compute_target import ComputeTargetException\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"# choose a name for your cluster\n",
|
|
||||||
"cluster_name = 'gpu-cluster'\n",
|
|
||||||
"\n",
|
|
||||||
"try:\n",
|
|
||||||
" compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
|
|
||||||
" print('Found existing compute target.')\n",
|
|
||||||
"except ComputeTargetException:\n",
|
|
||||||
" print('Creating a new compute target...')\n",
|
|
||||||
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', \n",
|
|
||||||
" max_nodes=4)\n",
|
|
||||||
"\n",
|
|
||||||
" # create the cluster\n",
|
|
||||||
" compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
|
|
||||||
"\n",
|
|
||||||
" compute_target.wait_for_completion(show_output=True)\n",
|
|
||||||
"\n",
|
|
||||||
"# use get_status() to get a detailed status for the current cluster. \n",
|
|
||||||
"print(compute_target.get_status().serialize())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Define a training environment\n",
|
|
||||||
"\n",
|
|
||||||
"### Create a project directory\n",
|
|
||||||
"Create a directory that will contain all the code from your local machine that you will need access to on the remote resource. This includes the training script an any additional files your training script depends on."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import os\n",
|
|
||||||
"\n",
|
|
||||||
"project_folder = './pytorch-peds'\n",
|
|
||||||
"\n",
|
|
||||||
"try:\n",
|
|
||||||
" os.makedirs(project_folder, exist_ok=False)\n",
|
|
||||||
"except FileExistsError:\n",
|
|
||||||
" print('project folder {} exists, moving on...'.format(project_folder))"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Copy training script and dependencies into project directory"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import shutil\n",
|
|
||||||
"\n",
|
|
||||||
"files_to_copy = ['data', 'model', 'script', 'utils', 'transforms', 'coco_eval', 'engine', 'coco_utils']\n",
|
|
||||||
"for file in files_to_copy:\n",
|
|
||||||
" shutil.copy(os.path.join(os.getcwd(), (file + '.py')), project_folder)\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Create an experiment"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core import Experiment\n",
|
|
||||||
"\n",
|
|
||||||
"experiment_name = 'pytorch-peds'\n",
|
|
||||||
"experiment = Experiment(ws, name=experiment_name)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Specify dependencies with a custom Dockerfile\n",
|
|
||||||
"\n",
|
|
||||||
"There are a number of ways to [use environments](https://docs.microsoft.com/azure/machine-learning/how-to-use-environments) for specifying dependencies during model training. In this case, we use a custom Dockerfile."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core import Environment\n",
|
|
||||||
"\n",
|
|
||||||
"my_env = Environment(name='maskr-docker')\n",
|
|
||||||
"my_env.docker.enabled = True\n",
|
|
||||||
"with open(\"dockerfiles/Dockerfile\", \"r\") as f:\n",
|
|
||||||
" dockerfile_contents=f.read()\n",
|
|
||||||
"my_env.docker.base_dockerfile=dockerfile_contents\n",
|
|
||||||
"my_env.docker.base_image = None\n",
|
|
||||||
"my_env.python.interpreter_path = '/opt/miniconda/bin/python'\n",
|
|
||||||
"my_env.python.user_managed_dependencies = True\n",
|
|
||||||
"\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Create a ScriptRunConfig\n",
|
|
||||||
"\n",
|
|
||||||
"Use the [ScriptRunConfig](https://docs.microsoft.com/python/api/azureml-core/azureml.core.scriptrunconfig?view=azure-ml-py) class to define your run. Specify the source directory, compute target, and environment."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.train.dnn import PyTorch\n",
|
|
||||||
"from azureml.core import ScriptRunConfig\n",
|
|
||||||
"\n",
|
|
||||||
"model_name = 'pytorch-peds'\n",
|
|
||||||
"output_dir = './outputs/'\n",
|
|
||||||
"n_epochs = 2\n",
|
|
||||||
"\n",
|
|
||||||
"script_args = [\n",
|
|
||||||
" '--model_name', model_name,\n",
|
|
||||||
" '--output_dir', output_dir,\n",
|
|
||||||
" '--n_epochs', n_epochs,\n",
|
|
||||||
"]\n",
|
|
||||||
"# Add training script to run config\n",
|
|
||||||
"runconfig = ScriptRunConfig(\n",
|
|
||||||
" source_directory=project_folder,\n",
|
|
||||||
" script=\"script.py\",\n",
|
|
||||||
" arguments=script_args)\n",
|
|
||||||
"\n",
|
|
||||||
"# Attach compute target to run config\n",
|
|
||||||
"runconfig.run_config.target = cluster_name\n",
|
|
||||||
"\n",
|
|
||||||
"# Uncomment the line below if you want to try this locally first\n",
|
|
||||||
"#runconfig.run_config.target = \"local\"\n",
|
|
||||||
"\n",
|
|
||||||
"# Attach environment to run config\n",
|
|
||||||
"runconfig.run_config.environment = my_env"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Train remotely\n",
|
|
||||||
"\n",
|
|
||||||
"### Submit your run"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Submit run \n",
|
|
||||||
"run = experiment.submit(runconfig)\n",
|
|
||||||
"\n",
|
|
||||||
"# to get more details of your run\n",
|
|
||||||
"print(run.get_details())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Monitor your run\n",
|
|
||||||
"\n",
|
|
||||||
"Use a widget to keep track of your run. You can also view the status of the run within the [Azure Machine Learning service portal](https://ml.azure.com)."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.widgets import RunDetails\n",
|
|
||||||
"\n",
|
|
||||||
"RunDetails(run).show()\n",
|
|
||||||
"run.wait_for_completion(show_output=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Test your model\n",
|
|
||||||
"\n",
|
|
||||||
"Now that we are done training, let's see how well this model actually performs.\n",
|
|
||||||
"\n",
|
|
||||||
"### Get your latest run\n",
|
|
||||||
"First, pull the latest run using `experiment.get_runs()`, which lists runs from `experiment` in reverse chronological order."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core import Run\n",
|
|
||||||
"\n",
|
|
||||||
"last_run = next(experiment.get_runs())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Register your model\n",
|
|
||||||
"Next, [register the model](https://docs.microsoft.com/azure/machine-learning/concept-model-management-and-deployment#register-package-and-deploy-models-from-anywhere) from your run. Registering your model assigns it a version and helps you with auditability."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"last_run.register_model(model_name=model_name, model_path=os.path.join(output_dir, model_name))"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Download your model\n",
|
|
||||||
"Next, download this registered model. Notice how we can initialize the `Model` object with the name of the registered model, rather than a path to the file itself."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core import Model\n",
|
|
||||||
"\n",
|
|
||||||
"model = Model(workspace=ws, name=model_name)\n",
|
|
||||||
"path = model.download(target_dir='model', exist_ok=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Use your model to make a prediction\n",
|
|
||||||
"\n",
|
|
||||||
"Run inferencing on a single test image and display the results."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import torch\n",
|
|
||||||
"from azureml.core import Dataset\n",
|
|
||||||
"from data import PennFudanDataset\n",
|
|
||||||
"from script import get_transform, download_data, NUM_CLASSES\n",
|
|
||||||
"from model import get_instance_segmentation_model\n",
|
|
||||||
"\n",
|
|
||||||
"if torch.cuda.is_available():\n",
|
|
||||||
" device = torch.device('cuda')\n",
|
|
||||||
"else:\n",
|
|
||||||
" device = torch.device('cpu')\n",
|
|
||||||
"\n",
|
|
||||||
"# Instantiate model with correct weights, cast to correct device, place in evaluation mode\n",
|
|
||||||
"predict_model = get_instance_segmentation_model(NUM_CLASSES)\n",
|
|
||||||
"predict_model.to(device)\n",
|
|
||||||
"predict_model.load_state_dict(torch.load(path, map_location=device))\n",
|
|
||||||
"predict_model.eval()\n",
|
|
||||||
"\n",
|
|
||||||
"# Load dataset\n",
|
|
||||||
"root_dir=download_data()\n",
|
|
||||||
"dataset_test = PennFudanDataset(root=root_dir, transforms=get_transform(train=False))\n",
|
|
||||||
"\n",
|
|
||||||
"# pick one image from the test set\n",
|
|
||||||
"img, _ = dataset_test[0]\n",
|
|
||||||
"\n",
|
|
||||||
"with torch.no_grad():\n",
|
|
||||||
" prediction = predict_model([img.to(device)])\n",
|
|
||||||
"\n",
|
|
||||||
"# model = torch.load(path)\n",
|
|
||||||
"#torch.load(model.get_model_path(model_name='outputs/model.pt'))"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Display the input image\n",
|
|
||||||
"\n",
|
|
||||||
"While tensors are great for computers, a tensor of RGB values doesn't mean much to a human. Let's display the input image in a way that a human could understand."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from PIL import Image\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"Image.fromarray(img.mul(255).permute(1, 2, 0).byte().numpy())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Display the predicted masks\n",
|
|
||||||
"\n",
|
|
||||||
"The prediction consists of masks, displaying the outline of pedestrians in the image. Let's take a look at the first two masks, below."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"Image.fromarray(prediction[0]['masks'][0, 0].mul(255).byte().cpu().numpy())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"Image.fromarray(prediction[0]['masks'][1, 0].mul(255).byte().cpu().numpy())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Next steps\n",
|
|
||||||
"\n",
|
|
||||||
"Congratulations! You just trained a Mask R-CNN model with PyTorch in Azure Machine Learning. As next steps, consider:\n",
|
|
||||||
"1. Learn more about using PyTorch in Azure Machine Learning service by checking out the [README](./README.md]\n",
|
|
||||||
"2. Try exporting your model to [ONNX](https://docs.microsoft.com/azure/machine-learning/concept-onnx) for accelerated inferencing."
|
|
||||||
]
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "gopalv"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"category": "training",
|
|
||||||
"compute": [
|
|
||||||
"AML Compute"
|
|
||||||
],
|
|
||||||
"datasets": [
|
|
||||||
"Custom"
|
|
||||||
],
|
|
||||||
"deployment": [
|
|
||||||
"None"
|
|
||||||
],
|
|
||||||
"exclude_from_index": false,
|
|
||||||
"framework": [
|
|
||||||
"PyTorch"
|
|
||||||
],
|
|
||||||
"friendly_name": "PyTorch object detection",
|
|
||||||
"index_order": 1,
|
|
||||||
"kernel_info": {
|
|
||||||
"name": "python3"
|
|
||||||
},
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "python",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"codemirror_mode": {
|
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.6.5-final"
|
|
||||||
},
|
|
||||||
"nteract": {
|
|
||||||
"version": "nteract-front-end@1.0.0"
|
|
||||||
},
|
|
||||||
"tags": [
|
|
||||||
"remote run",
|
|
||||||
"docker"
|
|
||||||
],
|
|
||||||
"task": "Fine-tune PyTorch object detection model with a custom dockerfile"
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 2
|
|
||||||
}
|
|
||||||
@@ -1,14 +0,0 @@
name: pytorch-mask-rcnn
dependencies:
  - cython
  - pytorch -c pytorch
  - torchvision -c pytorch
  - pip:
    - azureml-sdk
    - azureml-widgets
    - azureml-dataprep
    - fuse
    - pandas
    - matplotlib
    - pillow==7.0.0
    - git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI
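As an aside, a conda specification like the one above can also be turned directly into an Azure ML environment instead of the custom Dockerfile used in the notebook. A hedged sketch (the file name here is an assumption, not something defined in this repo):

from azureml.core import Environment

# Assumes the YAML above is saved as 'pytorch-mask-rcnn.yml' next to this code.
env = Environment.from_conda_specification(name='pytorch-mask-rcnn',
                                           file_path='pytorch-mask-rcnn.yml')

Note that for conda to resolve `pytorch` and `torchvision` this way, the channel would normally be listed in a `channels:` section of the spec rather than as an inline `-c pytorch` flag.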
@@ -1,117 +0,0 @@
import argparse
import os
import torch
import torchvision
import transforms as T
import urllib.request
import utils

from azureml.core import Dataset, Run
from data import PennFudanDataset
from engine import train_one_epoch, evaluate
from model import get_instance_segmentation_model
from zipfile import ZipFile

NUM_CLASSES = 2


def download_data():
    data_file = 'PennFudanPed.zip'
    ds_path = 'PennFudanPed/'
    urllib.request.urlretrieve('https://www.cis.upenn.edu/~jshi/ped_html/PennFudanPed.zip', data_file)
    zip = ZipFile(file=data_file)
    zip.extractall(path=ds_path)
    return os.path.join(ds_path, zip.namelist()[0])


def get_transform(train):
    transforms = []
    # converts the image, a PIL image, into a PyTorch Tensor
    transforms.append(T.ToTensor())
    if train:
        # during training, randomly flip the training images
        # and ground-truth for data augmentation
        transforms.append(T.RandomHorizontalFlip(0.5))
    return T.Compose(transforms)


def main():
    print("Torch version:", torch.__version__)
    # get command-line arguments
    parser = argparse.ArgumentParser()
    parser.add_argument('--model_name', type=str, default="pytorch-peds.pt",
                        help='name with which to register your model')
    parser.add_argument('--output_dir', default="local-outputs",
                        type=str, help='output directory')
    parser.add_argument('--n_epochs', type=int,
                        default=10, help='number of epochs')
    args = parser.parse_args()

    # In case user inputs a nested output directory
    os.makedirs(name=args.output_dir, exist_ok=True)

    # Get a dataset by name
    root_dir = download_data()

    # use our dataset and defined transformations
    dataset = PennFudanDataset(root=root_dir, transforms=get_transform(train=True))
    dataset_test = PennFudanDataset(root=root_dir, transforms=get_transform(train=False))

    # split the dataset in train and test set
    torch.manual_seed(1)
    indices = torch.randperm(len(dataset)).tolist()
    dataset = torch.utils.data.Subset(dataset, indices[:-50])
    dataset_test = torch.utils.data.Subset(dataset_test, indices[-50:])

    # define training and validation data loaders
    data_loader = torch.utils.data.DataLoader(
        dataset, batch_size=2, shuffle=True, num_workers=4,
        collate_fn=utils.collate_fn)

    data_loader_test = torch.utils.data.DataLoader(
        dataset_test, batch_size=1, shuffle=False, num_workers=4,
        collate_fn=utils.collate_fn)

    if torch.cuda.is_available():
        print('Using GPU')
        device = torch.device('cuda')
    else:
        print('Using CPU')
        device = torch.device('cpu')

    # our dataset has two classes only - background and person
    num_classes = NUM_CLASSES

    # get the model using our helper function
    model = get_instance_segmentation_model(num_classes)

    # move model to the right device
    model.to(device)

    # construct an optimizer
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=0.005,
                                momentum=0.9, weight_decay=0.0005)

    # and a learning rate scheduler which decreases the learning rate by
    # 10x every 3 epochs
    lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                                   step_size=3,
                                                   gamma=0.1)

    for epoch in range(args.n_epochs):
        # train for one epoch, printing every 10 iterations
        train_one_epoch(
            model, optimizer, data_loader, device, epoch, print_freq=10)
        # update the learning rate
        lr_scheduler.step()
        # evaluate on the test dataset
        evaluate(model, data_loader_test, device=device)

    # Saving the state dict is the recommended method, per
    # https://pytorch.org/tutorials/beginner/saving_loading_models.html
    torch.save(model.state_dict(), os.path.join(args.output_dir, args.model_name))


if __name__ == '__main__':
    main()
@@ -1,50 +0,0 @@
import random
import torch

from torchvision.transforms import functional as F


def _flip_coco_person_keypoints(kps, width):
    flip_inds = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
    flipped_data = kps[:, flip_inds]
    flipped_data[..., 0] = width - flipped_data[..., 0]
    # Maintain COCO convention that if visibility == 0, then x, y = 0
    inds = flipped_data[..., 2] == 0
    flipped_data[inds] = 0
    return flipped_data


class Compose(object):
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, image, target):
        for t in self.transforms:
            image, target = t(image, target)
        return image, target


class RandomHorizontalFlip(object):
    def __init__(self, prob):
        self.prob = prob

    def __call__(self, image, target):
        if random.random() < self.prob:
            height, width = image.shape[-2:]
            image = image.flip(-1)
            bbox = target["boxes"]
            bbox[:, [0, 2]] = width - bbox[:, [2, 0]]
            target["boxes"] = bbox
            if "masks" in target:
                target["masks"] = target["masks"].flip(-1)
            if "keypoints" in target:
                keypoints = target["keypoints"]
                keypoints = _flip_coco_person_keypoints(keypoints, width)
                target["keypoints"] = keypoints
        return image, target


class ToTensor(object):
    def __call__(self, image, target):
        image = F.to_tensor(image)
        return image, target
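Unlike the stock `torchvision.transforms`, the classes above operate on an `(image, target)` pair so that boxes and masks stay in sync with a flipped image. A minimal, self-contained usage sketch (the image and target below are invented for illustration and are not part of the original file):

import torch
from PIL import Image

# prob=1.0 makes the flip deterministic for the demo
tfms = Compose([ToTensor(), RandomHorizontalFlip(prob=1.0)])

image = Image.new('RGB', (100, 120))                     # width=100, height=120
target = {
    'boxes': torch.tensor([[10., 20., 40., 60.]]),       # xmin, ymin, xmax, ymax
    'masks': torch.zeros((1, 120, 100), dtype=torch.uint8),
}
image, target = tfms(image, target)
print(image.shape, target['boxes'])                      # torch.Size([3, 120, 100]) tensor([[60., 20., 90., 60.]])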
@@ -1,326 +0,0 @@
|
|||||||
from __future__ import print_function
|
|
||||||
|
|
||||||
from collections import defaultdict, deque
|
|
||||||
import datetime
|
|
||||||
import pickle
|
|
||||||
import time
|
|
||||||
|
|
||||||
import torch
|
|
||||||
import torch.distributed as dist
|
|
||||||
|
|
||||||
import errno
|
|
||||||
import os
|
|
||||||
|
|
||||||
|
|
||||||
class SmoothedValue(object):
|
|
||||||
"""Track a series of values and provide access to smoothed values over a
|
|
||||||
window or the global series average.
|
|
||||||
"""
|
|
||||||
|
|
||||||
def __init__(self, window_size=20, fmt=None):
|
|
||||||
if fmt is None:
|
|
||||||
fmt = "{median:.4f} ({global_avg:.4f})"
|
|
||||||
self.deque = deque(maxlen=window_size)
|
|
||||||
self.total = 0.0
|
|
||||||
self.count = 0
|
|
||||||
self.fmt = fmt
|
|
||||||
|
|
||||||
def update(self, value, n=1):
|
|
||||||
self.deque.append(value)
|
|
||||||
self.count += n
|
|
||||||
self.total += value * n
|
|
||||||
|
|
||||||
def synchronize_between_processes(self):
|
|
||||||
"""
|
|
||||||
Warning: does not synchronize the deque!
|
|
||||||
"""
|
|
||||||
if not is_dist_avail_and_initialized():
|
|
||||||
return
|
|
||||||
t = torch.tensor([self.count, self.total], dtype=torch.float64, device='cuda')
|
|
||||||
dist.barrier()
|
|
||||||
dist.all_reduce(t)
|
|
||||||
t = t.tolist()
|
|
||||||
self.count = int(t[0])
|
|
||||||
self.total = t[1]
|
|
||||||
|
|
||||||
@property
|
|
||||||
def median(self):
|
|
||||||
d = torch.tensor(list(self.deque))
|
|
||||||
return d.median().item()
|
|
||||||
|
|
||||||
@property
|
|
||||||
def avg(self):
|
|
||||||
d = torch.tensor(list(self.deque), dtype=torch.float32)
|
|
||||||
return d.mean().item()
|
|
||||||
|
|
||||||
@property
|
|
||||||
def global_avg(self):
|
|
||||||
return self.total / self.count
|
|
||||||
|
|
||||||
@property
|
|
||||||
def max(self):
|
|
||||||
return max(self.deque)
|
|
||||||
|
|
||||||
@property
|
|
||||||
def value(self):
|
|
||||||
return self.deque[-1]
|
|
||||||
|
|
||||||
def __str__(self):
|
|
||||||
return self.fmt.format(
|
|
||||||
median=self.median,
|
|
||||||
avg=self.avg,
|
|
||||||
global_avg=self.global_avg,
|
|
||||||
max=self.max,
|
|
||||||
value=self.value)
|
|
||||||
|
|
||||||
|
|
||||||
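# --- Illustrative usage sketch (not part of the original file) ----------------
# SmoothedValue keeps a sliding window of recent updates plus a running global
# total, which is what MetricLogger prints during training, e.g.:
#
#     meter = SmoothedValue(window_size=3, fmt="{median:.2f} ({global_avg:.2f})")
#     for v in [1.0, 2.0, 3.0, 10.0]:
#         meter.update(v)
#     print(meter)   # "3.00 (4.00)": median of the last 3 values, average of all 4
# ------------------------------------------------------------------------------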
def all_gather(data):
|
|
||||||
"""
|
|
||||||
Run all_gather on arbitrary picklable data (not necessarily tensors)
|
|
||||||
Args:
|
|
||||||
data: any picklable object
|
|
||||||
Returns:
|
|
||||||
list[data]: list of data gathered from each rank
|
|
||||||
"""
|
|
||||||
world_size = get_world_size()
|
|
||||||
if world_size == 1:
|
|
||||||
return [data]
|
|
||||||
|
|
||||||
# serialized to a Tensor
|
|
||||||
buffer = pickle.dumps(data)
|
|
||||||
storage = torch.ByteStorage.from_buffer(buffer)
|
|
||||||
tensor = torch.ByteTensor(storage).to("cuda")
|
|
||||||
|
|
||||||
# obtain Tensor size of each rank
|
|
||||||
local_size = torch.tensor([tensor.numel()], device="cuda")
|
|
||||||
size_list = [torch.tensor([0], device="cuda") for _ in range(world_size)]
|
|
||||||
dist.all_gather(size_list, local_size)
|
|
||||||
size_list = [int(size.item()) for size in size_list]
|
|
||||||
max_size = max(size_list)
|
|
||||||
|
|
||||||
# receiving Tensor from all ranks
|
|
||||||
# we pad the tensor because torch all_gather does not support
|
|
||||||
# gathering tensors of different shapes
|
|
||||||
tensor_list = []
|
|
||||||
for _ in size_list:
|
|
||||||
tensor_list.append(torch.empty((max_size,), dtype=torch.uint8, device="cuda"))
|
|
||||||
if local_size != max_size:
|
|
||||||
padding = torch.empty(size=(max_size - local_size,), dtype=torch.uint8, device="cuda")
|
|
||||||
tensor = torch.cat((tensor, padding), dim=0)
|
|
||||||
dist.all_gather(tensor_list, tensor)
|
|
||||||
|
|
||||||
data_list = []
|
|
||||||
for size, tensor in zip(size_list, tensor_list):
|
|
||||||
buffer = tensor.cpu().numpy().tobytes()[:size]
|
|
||||||
data_list.append(pickle.loads(buffer))
|
|
||||||
|
|
||||||
return data_list
|
|
||||||
|
|
||||||
|
|
||||||
def reduce_dict(input_dict, average=True):
|
|
||||||
"""
|
|
||||||
Args:
|
|
||||||
input_dict (dict): all the values will be reduced
|
|
||||||
average (bool): whether to do average or sum
|
|
||||||
Reduce the values in the dictionary from all processes so that all processes
|
|
||||||
have the averaged results. Returns a dict with the same fields as
|
|
||||||
input_dict, after reduction.
|
|
||||||
"""
|
|
||||||
world_size = get_world_size()
|
|
||||||
if world_size < 2:
|
|
||||||
return input_dict
|
|
||||||
with torch.no_grad():
|
|
||||||
names = []
|
|
||||||
values = []
|
|
||||||
# sort the keys so that they are consistent across processes
|
|
||||||
for k in sorted(input_dict.keys()):
|
|
||||||
names.append(k)
|
|
||||||
values.append(input_dict[k])
|
|
||||||
values = torch.stack(values, dim=0)
|
|
||||||
dist.all_reduce(values)
|
|
||||||
if average:
|
|
||||||
values /= world_size
|
|
||||||
reduced_dict = {k: v for k, v in zip(names, values)}
|
|
||||||
return reduced_dict
|
|
||||||
|
|
||||||
|
|
||||||
class MetricLogger(object):
|
|
||||||
def __init__(self, delimiter="\t"):
|
|
||||||
self.meters = defaultdict(SmoothedValue)
|
|
||||||
self.delimiter = delimiter
|
|
||||||
|
|
||||||
def update(self, **kwargs):
|
|
||||||
for k, v in kwargs.items():
|
|
||||||
if isinstance(v, torch.Tensor):
|
|
||||||
v = v.item()
|
|
||||||
assert isinstance(v, (float, int))
|
|
||||||
self.meters[k].update(v)
|
|
||||||
|
|
||||||
def __getattr__(self, attr):
|
|
||||||
if attr in self.meters:
|
|
||||||
return self.meters[attr]
|
|
||||||
if attr in self.__dict__:
|
|
||||||
return self.__dict__[attr]
|
|
||||||
raise AttributeError("'{}' object has no attribute '{}'".format(
|
|
||||||
type(self).__name__, attr))
|
|
||||||
|
|
||||||
def __str__(self):
|
|
||||||
loss_str = []
|
|
||||||
for name, meter in self.meters.items():
|
|
||||||
loss_str.append(
|
|
||||||
"{}: {}".format(name, str(meter))
|
|
||||||
)
|
|
||||||
return self.delimiter.join(loss_str)
|
|
||||||
|
|
||||||
def synchronize_between_processes(self):
|
|
||||||
for meter in self.meters.values():
|
|
||||||
meter.synchronize_between_processes()
|
|
||||||
|
|
||||||
def add_meter(self, name, meter):
|
|
||||||
self.meters[name] = meter
|
|
||||||
|
|
||||||
def log_every(self, iterable, print_freq, header=None):
|
|
||||||
i = 0
|
|
||||||
if not header:
|
|
||||||
header = ''
|
|
||||||
start_time = time.time()
|
|
||||||
end = time.time()
|
|
||||||
iter_time = SmoothedValue(fmt='{avg:.4f}')
|
|
||||||
data_time = SmoothedValue(fmt='{avg:.4f}')
|
|
||||||
space_fmt = ':' + str(len(str(len(iterable)))) + 'd'
|
|
||||||
if torch.cuda.is_available():
|
|
||||||
log_msg = self.delimiter.join([
|
|
||||||
header,
|
|
||||||
'[{0' + space_fmt + '}/{1}]',
|
|
||||||
'eta: {eta}',
|
|
||||||
'{meters}',
|
|
||||||
'time: {time}',
|
|
||||||
'data: {data}',
|
|
||||||
'max mem: {memory:.0f}'
|
|
||||||
])
|
|
||||||
else:
|
|
||||||
log_msg = self.delimiter.join([
|
|
||||||
header,
|
|
||||||
'[{0' + space_fmt + '}/{1}]',
|
|
||||||
'eta: {eta}',
|
|
||||||
'{meters}',
|
|
||||||
'time: {time}',
|
|
||||||
'data: {data}'
|
|
||||||
])
|
|
||||||
MB = 1024.0 * 1024.0
|
|
||||||
for obj in iterable:
|
|
||||||
data_time.update(time.time() - end)
|
|
||||||
yield obj
|
|
||||||
iter_time.update(time.time() - end)
|
|
||||||
if i % print_freq == 0 or i == len(iterable) - 1:
|
|
||||||
eta_seconds = iter_time.global_avg * (len(iterable) - i)
|
|
||||||
eta_string = str(datetime.timedelta(seconds=int(eta_seconds)))
|
|
||||||
if torch.cuda.is_available():
|
|
||||||
print(log_msg.format(
|
|
||||||
i, len(iterable), eta=eta_string,
|
|
||||||
meters=str(self),
|
|
||||||
time=str(iter_time), data=str(data_time),
|
|
||||||
memory=torch.cuda.max_memory_allocated() / MB))
|
|
||||||
else:
|
|
||||||
print(log_msg.format(
|
|
||||||
i, len(iterable), eta=eta_string,
|
|
||||||
meters=str(self),
|
|
||||||
time=str(iter_time), data=str(data_time)))
|
|
||||||
i += 1
|
|
||||||
end = time.time()
|
|
||||||
total_time = time.time() - start_time
|
|
||||||
total_time_str = str(datetime.timedelta(seconds=int(total_time)))
|
|
||||||
print('{} Total time: {} ({:.4f} s / it)'.format(
|
|
||||||
header, total_time_str, total_time / len(iterable)))
|
|
||||||
|
|
||||||
|
|
||||||
def collate_fn(batch):
|
|
||||||
return tuple(zip(*batch))
|
|
||||||
|
|
||||||
|
|
||||||
def warmup_lr_scheduler(optimizer, warmup_iters, warmup_factor):
|
|
||||||
|
|
||||||
def f(x):
|
|
||||||
if x >= warmup_iters:
|
|
||||||
return 1
|
|
||||||
alpha = float(x) / warmup_iters
|
|
||||||
return warmup_factor * (1 - alpha) + alpha
|
|
||||||
|
|
||||||
return torch.optim.lr_scheduler.LambdaLR(optimizer, f)
|
|
||||||
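# Illustrative note (not part of the original file): with warmup_iters=5 and
# warmup_factor=0.2, the multiplier returned by f ramps
# 0.20 -> 0.36 -> 0.52 -> 0.68 -> 0.84 over the first five scheduler steps and
# is 1.0 from step 5 onwards, i.e. the learning rate warms up linearly to its
# nominal value.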
|
|
||||||
|
|
||||||
def mkdir(path):
|
|
||||||
try:
|
|
||||||
os.makedirs(path)
|
|
||||||
except OSError as e:
|
|
||||||
if e.errno != errno.EEXIST:
|
|
||||||
raise
|
|
||||||
|
|
||||||
|
|
||||||
def setup_for_distributed(is_master):
|
|
||||||
"""
|
|
||||||
This function disables printing when not in master process
|
|
||||||
"""
|
|
||||||
import builtins as __builtin__
|
|
||||||
builtin_print = __builtin__.print
|
|
||||||
|
|
||||||
def print(*args, **kwargs):
|
|
||||||
force = kwargs.pop('force', False)
|
|
||||||
if is_master or force:
|
|
||||||
builtin_print(*args, **kwargs)
|
|
||||||
|
|
||||||
__builtin__.print = print
|
|
||||||
|
|
||||||
|
|
||||||
def is_dist_avail_and_initialized():
|
|
||||||
if not dist.is_available():
|
|
||||||
return False
|
|
||||||
if not dist.is_initialized():
|
|
||||||
return False
|
|
||||||
return True
|
|
||||||
|
|
||||||
|
|
||||||
def get_world_size():
|
|
||||||
if not is_dist_avail_and_initialized():
|
|
||||||
return 1
|
|
||||||
return dist.get_world_size()
|
|
||||||
|
|
||||||
|
|
||||||
def get_rank():
|
|
||||||
if not is_dist_avail_and_initialized():
|
|
||||||
return 0
|
|
||||||
return dist.get_rank()
|
|
||||||
|
|
||||||
|
|
||||||
def is_main_process():
|
|
||||||
return get_rank() == 0
|
|
||||||
|
|
||||||
|
|
||||||
def save_on_master(*args, **kwargs):
|
|
||||||
if is_main_process():
|
|
||||||
torch.save(*args, **kwargs)
|
|
||||||
|
|
||||||
|
|
||||||
def init_distributed_mode(args):
|
|
||||||
if 'RANK' in os.environ and 'WORLD_SIZE' in os.environ:
|
|
||||||
args.rank = int(os.environ["RANK"])
|
|
||||||
args.world_size = int(os.environ['WORLD_SIZE'])
|
|
||||||
args.gpu = int(os.environ['LOCAL_RANK'])
|
|
||||||
elif 'SLURM_PROCID' in os.environ:
|
|
||||||
args.rank = int(os.environ['SLURM_PROCID'])
|
|
||||||
args.gpu = args.rank % torch.cuda.device_count()
|
|
||||||
else:
|
|
||||||
print('Not using distributed mode')
|
|
||||||
args.distributed = False
|
|
||||||
return
|
|
||||||
|
|
||||||
args.distributed = True
|
|
||||||
|
|
||||||
torch.cuda.set_device(args.gpu)
|
|
||||||
args.dist_backend = 'nccl'
|
|
||||||
print('| distributed init (rank {}): {}'.format(
|
|
||||||
args.rank, args.dist_url), flush=True)
|
|
||||||
torch.distributed.init_process_group(backend=args.dist_backend, init_method=args.dist_url,
|
|
||||||
world_size=args.world_size, rank=args.rank)
|
|
||||||
torch.distributed.barrier()
|
|
||||||
setup_for_distributed(args.rank == 0)
|
|
||||||
@@ -4,33 +4,100 @@
|
|||||||
import numpy as np
|
import numpy as np
|
||||||
import argparse
|
import argparse
|
||||||
import os
|
import os
|
||||||
|
import re
|
||||||
import tensorflow as tf
|
import tensorflow as tf
|
||||||
|
import time
|
||||||
import glob
|
import glob
|
||||||
|
|
||||||
from azureml.core import Run
|
from azureml.core import Run
|
||||||
from utils import load_data
|
from utils import load_data
|
||||||
|
from tensorflow.keras import Model, layers
|
||||||
|
|
||||||
|
|
||||||
|
# Create TF Model.
|
||||||
|
class NeuralNet(Model):
|
||||||
|
# Set layers.
|
||||||
|
def __init__(self):
|
||||||
|
super(NeuralNet, self).__init__()
|
||||||
|
# First hidden layer.
|
||||||
|
self.h1 = layers.Dense(n_h1, activation=tf.nn.relu)
|
||||||
|
# Second hidden layer.
|
||||||
|
self.h2 = layers.Dense(n_h2, activation=tf.nn.relu)
|
||||||
|
self.out = layers.Dense(n_outputs)
|
||||||
|
|
||||||
|
# Set forward pass.
|
||||||
|
def call(self, x, is_training=False):
|
||||||
|
x = self.h1(x)
|
||||||
|
x = self.h2(x)
|
||||||
|
x = self.out(x)
|
||||||
|
if not is_training:
|
||||||
|
# Apply softmax when not training.
|
||||||
|
x = tf.nn.softmax(x)
|
||||||
|
return x
|
||||||
|
|
||||||
|
|
||||||
|
def cross_entropy_loss(y, logits):
|
||||||
|
# Convert labels to int 64 for tf cross-entropy function.
|
||||||
|
y = tf.cast(y, tf.int64)
|
||||||
|
# Apply softmax to logits and compute cross-entropy.
|
||||||
|
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
|
||||||
|
# Average loss across the batch.
|
||||||
|
return tf.reduce_mean(loss)
|
||||||
|
|
||||||
|
|
||||||
|
# Accuracy metric.
|
||||||
|
def accuracy(y_pred, y_true):
|
||||||
|
# Predicted class is the index of highest score in prediction vector (i.e. argmax).
|
||||||
|
correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.cast(y_true, tf.int64))
|
||||||
|
return tf.reduce_mean(tf.cast(correct_prediction, tf.float32), axis=-1)
|
||||||
|
|
||||||
|
|
||||||
|
# Optimization process.
|
||||||
|
def run_optimization(x, y):
|
||||||
|
# Wrap computation inside a GradientTape for automatic differentiation.
|
||||||
|
with tf.GradientTape() as g:
|
||||||
|
# Forward pass.
|
||||||
|
logits = neural_net(x, is_training=True)
|
||||||
|
# Compute loss.
|
||||||
|
loss = cross_entropy_loss(y, logits)
|
||||||
|
|
||||||
|
# Variables to update, i.e. trainable variables.
|
||||||
|
trainable_variables = neural_net.trainable_variables
|
||||||
|
|
||||||
|
# Compute gradients.
|
||||||
|
gradients = g.gradient(loss, trainable_variables)
|
||||||
|
|
||||||
|
# Update W and b following gradients.
|
||||||
|
optimizer.apply_gradients(zip(gradients, trainable_variables))
|
||||||
|
|
||||||
|
|
||||||
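# --- Illustrative smoke test of the TF2 pieces above (not part of the original
# script). The layer and output sizes are assumptions for this sketch; in the
# real script n_h1, n_h2 and n_outputs are defined elsewhere (not shown in this
# hunk).
#
#     n_h1, n_h2, n_outputs = 128, 128, 10
#     neural_net = NeuralNet()
#     optimizer = tf.optimizers.SGD(0.01)
#     x = tf.random.uniform([32, 784])
#     y = tf.random.uniform([32], maxval=10, dtype=tf.int32)
#     run_optimization(x, y)                      # one gradient step
#     print(accuracy(neural_net(x), y).numpy())   # accuracy on the same batch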
print("TensorFlow version:", tf.__version__)
|
print("TensorFlow version:", tf.__version__)
|
||||||
|
|
||||||
parser = argparse.ArgumentParser()
|
parser = argparse.ArgumentParser()
|
||||||
parser.add_argument('--data-folder', type=str, dest='data_folder', help='data folder mounting point')
|
parser.add_argument('--data-folder', type=str, dest='data_folder', default='data', help='data folder mounting point')
|
||||||
parser.add_argument('--batch-size', type=int, dest='batch_size', default=50, help='mini batch size for training')
|
parser.add_argument('--batch-size', type=int, dest='batch_size', default=128, help='mini batch size for training')
|
||||||
parser.add_argument('--first-layer-neurons', type=int, dest='n_hidden_1', default=100,
|
parser.add_argument('--first-layer-neurons', type=int, dest='n_hidden_1', default=128,
|
||||||
help='# of neurons in the first layer')
|
help='# of neurons in the first layer')
|
||||||
parser.add_argument('--second-layer-neurons', type=int, dest='n_hidden_2', default=100,
|
parser.add_argument('--second-layer-neurons', type=int, dest='n_hidden_2', default=128,
|
||||||
help='# of neurons in the second layer')
|
help='# of neurons in the second layer')
|
||||||
parser.add_argument('--learning-rate', type=float, dest='learning_rate', default=0.01, help='learning rate')
|
parser.add_argument('--learning-rate', type=float, dest='learning_rate', default=0.01, help='learning rate')
|
||||||
|
parser.add_argument('--resume-from', type=str, default=None,
|
||||||
|
help='location of the model or checkpoint files from where to resume the training')
|
||||||
args = parser.parse_args()
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
previous_model_location = args.resume_from
|
||||||
|
# You can also use an environment variable to get the model/checkpoint files location
|
||||||
|
# previous_model_location = os.path.expandvars(os.getenv("AZUREML_DATAREFERENCE_MODEL_LOCATION", None))
|
||||||
|
|
||||||
data_folder = args.data_folder
|
data_folder = args.data_folder
|
||||||
print('Data folder:', data_folder)
|
print('Data folder:', data_folder)
|
||||||
|
|
||||||
# load train and test set into numpy arrays
|
# load train and test set into numpy arrays
|
||||||
# note we scale the pixel intensity values to 0-1 (by dividing it with 255.0) so the model can converge faster.
|
# note we scale the pixel intensity values to 0-1 (by dividing it with 255.0) so the model can converge faster.
|
||||||
X_train = load_data(glob.glob(os.path.join(data_folder, '**/train-images-idx3-ubyte.gz'),
|
X_train = load_data(glob.glob(os.path.join(data_folder, '**/train-images-idx3-ubyte.gz'),
|
||||||
recursive=True)[0], False) / 255.0
|
recursive=True)[0], False) / np.float32(255.0)
|
||||||
X_test = load_data(glob.glob(os.path.join(data_folder, '**/t10k-images-idx3-ubyte.gz'),
|
X_test = load_data(glob.glob(os.path.join(data_folder, '**/t10k-images-idx3-ubyte.gz'),
|
||||||
recursive=True)[0], False) / 255.0
|
recursive=True)[0], False) / np.float32(255.0)
|
||||||
y_train = load_data(glob.glob(os.path.join(data_folder, '**/train-labels-idx1-ubyte.gz'),
|
y_train = load_data(glob.glob(os.path.join(data_folder, '**/train-labels-idx1-ubyte.gz'),
|
||||||
recursive=True)[0], True).reshape(-1)
|
recursive=True)[0], True).reshape(-1)
|
||||||
y_test = load_data(glob.glob(os.path.join(data_folder, '**/t10k-labels-idx1-ubyte.gz'),
|
y_test = load_data(glob.glob(os.path.join(data_folder, '**/t10k-labels-idx1-ubyte.gz'),
|
||||||
@@ -48,65 +115,76 @@ learning_rate = args.learning_rate
|
|||||||
n_epochs = 20
|
n_epochs = 20
|
||||||
batch_size = args.batch_size
|
batch_size = args.batch_size
|
||||||
|
|
||||||
with tf.name_scope('network'):
|
# Build neural network model.
|
||||||
# construct the DNN
|
neural_net = NeuralNet()
|
||||||
X = tf.placeholder(tf.float32, shape=(None, n_inputs), name='X')
|
|
||||||
y = tf.placeholder(tf.int64, shape=(None), name='y')
|
|
||||||
h1 = tf.layers.dense(X, n_h1, activation=tf.nn.relu, name='h1')
|
|
||||||
h2 = tf.layers.dense(h1, n_h2, activation=tf.nn.relu, name='h2')
|
|
||||||
output = tf.layers.dense(h2, n_outputs, name='output')
|
|
||||||
|
|
||||||
with tf.name_scope('train'):
|
# Stochastic gradient descent optimizer.
|
||||||
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=output)
|
optimizer = tf.optimizers.SGD(learning_rate)
|
||||||
loss = tf.reduce_mean(cross_entropy, name='loss')
|
|
||||||
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
|
|
||||||
train_op = optimizer.minimize(loss)
|
|
||||||
|
|
||||||
with tf.name_scope('eval'):
|
|
||||||
correct = tf.nn.in_top_k(output, y, 1)
|
|
||||||
acc_op = tf.reduce_mean(tf.cast(correct, tf.float32))
|
|
||||||
|
|
||||||
init = tf.global_variables_initializer()
|
|
||||||
saver = tf.train.Saver()
|
|
||||||
|
|
||||||
# start an Azure ML run
|
# start an Azure ML run
|
||||||
run = Run.get_context()
|
run = Run.get_context()
|
||||||
|
|
||||||
with tf.Session() as sess:
|
if previous_model_location:
|
||||||
init.run()
|
# Restore variables from latest checkpoint.
|
||||||
for epoch in range(n_epochs):
|
checkpoint = tf.train.Checkpoint(model=neural_net, optimizer=optimizer)
|
||||||
|
checkpoint_file_path = tf.train.latest_checkpoint(previous_model_location)
|
||||||
|
checkpoint.restore(checkpoint_file_path)
|
||||||
|
checkpoint_filename = os.path.basename(checkpoint_file_path)
|
||||||
|
num_found = re.search(r'\d+', checkpoint_filename)
|
||||||
|
if num_found:
|
||||||
|
start_epoch = int(num_found.group(0))
|
||||||
|
print("Resuming from epoch {}".format(str(start_epoch)))
|
||||||
|
|
||||||
# randomly shuffle training set
|
start_time = time.perf_counter()
|
||||||
indices = np.random.permutation(training_set_size)
|
for epoch in range(0, n_epochs):
|
||||||
X_train = X_train[indices]
|
|
||||||
y_train = y_train[indices]
|
|
||||||
|
|
||||||
# batch index
|
# randomly shuffle training set
|
||||||
b_start = 0
|
indices = np.random.permutation(training_set_size)
|
||||||
b_end = b_start + batch_size
|
X_train = X_train[indices]
|
||||||
for _ in range(training_set_size // batch_size):
|
y_train = y_train[indices]
|
||||||
# get a batch
|
|
||||||
X_batch, y_batch = X_train[b_start: b_end], y_train[b_start: b_end]
|
|
||||||
|
|
||||||
# update batch index for the next batch
|
# batch index
|
||||||
b_start = b_start + batch_size
|
b_start = 0
|
||||||
b_end = min(b_start + batch_size, training_set_size)
|
b_end = b_start + batch_size
|
||||||
|
for _ in range(training_set_size // batch_size):
|
||||||
|
# get a batch
|
||||||
|
X_batch, y_batch = X_train[b_start: b_end], y_train[b_start: b_end]
|
||||||
|
|
||||||
# train
|
# update batch index for the next batch
|
||||||
sess.run(train_op, feed_dict={X: X_batch, y: y_batch})
|
b_start = b_start + batch_size
|
||||||
# evaluate training set
|
b_end = min(b_start + batch_size, training_set_size)
|
||||||
acc_train = acc_op.eval(feed_dict={X: X_batch, y: y_batch})
|
|
||||||
# evaluate validation set
|
|
||||||
acc_val = acc_op.eval(feed_dict={X: X_test, y: y_test})
|
|
||||||
|
|
||||||
# log accuracies
|
# train
|
||||||
run.log('training_acc', np.float(acc_train))
|
run_optimization(X_batch, y_batch)
|
||||||
run.log('validation_acc', np.float(acc_val))
|
|
||||||
print(epoch, '-- Training accuracy:', acc_train, '\b Validation accuracy:', acc_val)
|
|
||||||
y_hat = np.argmax(output.eval(feed_dict={X: X_test}), axis=1)
|
|
||||||
|
|
||||||
run.log('final_acc', np.float(acc_val))
|
# evaluate training set
|
||||||
|
pred = neural_net(X_batch, is_training=False)
|
||||||
|
acc_train = accuracy(pred, y_batch)
|
||||||
|
|
||||||
os.makedirs('./outputs/model', exist_ok=True)
|
# evaluate validation set
|
||||||
# files saved in the "./outputs" folder are automatically uploaded into run history
|
pred = neural_net(X_test, is_training=False)
|
||||||
saver.save(sess, './outputs/model/mnist-tf.model')
|
acc_val = accuracy(pred, y_test)
|
||||||
|
|
||||||
|
# log accuracies
|
||||||
|
run.log('training_acc', np.float(acc_train))
|
||||||
|
run.log('validation_acc', np.float(acc_val))
|
||||||
|
print(epoch, '-- Training accuracy:', acc_train, '\b Validation accuracy:', acc_val)
|
||||||
|
|
||||||
|
# Save checkpoints in the "./outputs" folder so that they are automatically uploaded into run history.
|
||||||
|
checkpoint_dir = './outputs/'
|
||||||
|
checkpoint = tf.train.Checkpoint(model=neural_net, optimizer=optimizer)
|
||||||
|
|
||||||
|
if epoch % 2 == 0:
|
||||||
|
checkpoint.save(checkpoint_dir)
|
||||||
|
|
||||||
|
run.log('final_acc', np.float(acc_val))
|
||||||
|
os.makedirs('./outputs/model', exist_ok=True)
|
||||||
|
|
||||||
|
# files saved in the "./outputs" folder are automatically uploaded into run history
|
||||||
|
# this is a workaround for https://github.com/tensorflow/tensorflow/issues/33913 and will be fixed once we move to >tf2.1
|
||||||
|
neural_net._set_inputs(X_train)
|
||||||
|
tf.saved_model.save(neural_net, './outputs/model/')
|
||||||
|
|
||||||
|
stop_time = time.perf_counter()
|
||||||
|
training_time = (stop_time - start_time) * 1000
|
||||||
|
print("Total time in milliseconds for training: {}".format(str(training_time)))
|
||||||
|
|||||||
@@ -170,18 +170,19 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"import urllib\n",
|
"import urllib.request\n",
|
||||||
"data_folder = 'data'\n",
|
"\n",
|
||||||
|
"data_folder = os.path.join(os.getcwd(), 'data')\n",
|
||||||
"os.makedirs(data_folder, exist_ok=True)\n",
|
"os.makedirs(data_folder, exist_ok=True)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/train-images-idx3-ubyte.gz',\n",
|
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/train-images-idx3-ubyte.gz',\n",
|
||||||
" filename=os.path.join(data_folder, 'train-images.gz'))\n",
|
" filename=os.path.join(data_folder, 'train-images-idx3-ubyte.gz'))\n",
|
||||||
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/train-labels-idx1-ubyte.gz',\n",
|
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/train-labels-idx1-ubyte.gz',\n",
|
||||||
" filename=os.path.join(data_folder, 'train-labels.gz'))\n",
|
" filename=os.path.join(data_folder, 'train-labels-idx1-ubyte.gz'))\n",
|
||||||
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/t10k-images-idx3-ubyte.gz',\n",
|
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/t10k-images-idx3-ubyte.gz',\n",
|
||||||
" filename=os.path.join(data_folder, 'test-images.gz'))\n",
|
" filename=os.path.join(data_folder, 't10k-images-idx3-ubyte.gz'))\n",
|
||||||
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/t10k-labels-idx1-ubyte.gz',\n",
|
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/t10k-labels-idx1-ubyte.gz',\n",
|
||||||
" filename=os.path.join(data_folder, 'test-labels.gz'))"
|
" filename=os.path.join(data_folder, 't10k-labels-idx1-ubyte.gz'))"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -209,11 +210,10 @@
|
|||||||
"from utils import load_data\n",
|
"from utils import load_data\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# note we also shrink the intensity values (X) from 0-255 to 0-1. This helps the neural network converge faster.\n",
|
"# note we also shrink the intensity values (X) from 0-255 to 0-1. This helps the neural network converge faster.\n",
|
||||||
"X_train = load_data(os.path.join(data_folder, 'train-images.gz'), False) / 255.0\n",
|
"X_train = load_data(os.path.join(data_folder, 'train-images-idx3-ubyte.gz'), False) / np.float32(255.0)\n",
|
||||||
"y_train = load_data(os.path.join(data_folder, 'train-labels.gz'), True).reshape(-1)\n",
|
"X_test = load_data(os.path.join(data_folder, 't10k-images-idx3-ubyte.gz'), False) / np.float32(255.0)\n",
|
||||||
"\n",
|
"y_train = load_data(os.path.join(data_folder, 'train-labels-idx1-ubyte.gz'), True).reshape(-1)\n",
|
||||||
"X_test = load_data(os.path.join(data_folder, 'test-images.gz'), False) / 255.0\n",
|
"y_test = load_data(os.path.join(data_folder, 't10k-labels-idx1-ubyte.gz'), True).reshape(-1)\n",
|
||||||
"y_test = load_data(os.path.join(data_folder, 'test-labels.gz'), True).reshape(-1)\n",
|
|
||||||
"\n",
|
"\n",
|
||||||
"count = 0\n",
|
"count = 0\n",
|
||||||
"sample_size = 30\n",
|
"sample_size = 30\n",
|
||||||
@@ -447,9 +447,9 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"script_params = {\n",
|
"script_params = {\n",
|
||||||
" '--data-folder': dataset.as_named_input('mnist').as_mount(),\n",
|
" '--data-folder': dataset.as_named_input('mnist').as_mount(),\n",
|
||||||
" '--batch-size': 50,\n",
|
" '--batch-size': 64,\n",
|
||||||
" '--first-layer-neurons': 300,\n",
|
" '--first-layer-neurons': 256,\n",
|
||||||
" '--second-layer-neurons': 100,\n",
|
" '--second-layer-neurons': 128,\n",
|
||||||
" '--learning-rate': 0.01\n",
|
" '--learning-rate': 0.01\n",
|
||||||
"}\n",
|
"}\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -458,6 +458,7 @@
|
|||||||
" compute_target=compute_target,\n",
|
" compute_target=compute_target,\n",
|
||||||
" entry_script='tf_mnist.py',\n",
|
" entry_script='tf_mnist.py',\n",
|
||||||
" use_gpu=True,\n",
|
" use_gpu=True,\n",
|
||||||
|
" framework_version='2.0',\n",
|
||||||
" pip_packages=['azureml-dataprep[pandas,fuse]'])"
|
" pip_packages=['azureml-dataprep[pandas,fuse]'])"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -622,14 +623,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# create a model folder in the current directory\n",
|
"run.download_files(prefix='outputs/model', output_directory='./model', append_prefix=False)"
|
||||||
"os.makedirs('./model', exist_ok=True)\n",
|
|
||||||
"\n",
|
|
||||||
"for f in run.get_file_names():\n",
|
|
||||||
" if f.startswith('outputs/model'):\n",
|
|
||||||
" output_file_path = os.path.join('./model', f.split('/')[-1])\n",
|
|
||||||
" print('Downloading from {} to {} ...'.format(f, output_file_path))\n",
|
|
||||||
" run.download_file(name=f, output_file_path=output_file_path)"
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -649,22 +643,7 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"import tensorflow as tf\n",
|
"import tensorflow as tf\n",
|
||||||
"\n",
|
"imported_model = tf.saved_model.load('./model')"
|
||||||
"tf.reset_default_graph()\n",
|
|
||||||
"\n",
|
|
||||||
"saver = tf.train.import_meta_graph(\"./model/mnist-tf.model.meta\")\n",
|
|
||||||
"graph = tf.get_default_graph()\n",
|
|
||||||
"\n",
|
|
||||||
"for op in graph.get_operations():\n",
|
|
||||||
" if op.name.startswith('network'):\n",
|
|
||||||
" print(op.name)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Feed test dataset to the persisted model to get predictions."
|
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -673,16 +652,8 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# input tensor. this is an array of 784 elements, each representing the intensity of a pixel in the digit image.\n",
|
"pred =imported_model(X_test)\n",
|
||||||
"X = tf.get_default_graph().get_tensor_by_name(\"network/X:0\")\n",
|
"y_hat = np.argmax(pred, axis=1)\n",
|
||||||
"# output tensor. this is an array of 10 elements, each representing the probability of predicted value of the digit.\n",
|
|
||||||
"output = tf.get_default_graph().get_tensor_by_name(\"network/output/MatMul:0\")\n",
|
|
||||||
"\n",
|
|
||||||
"with tf.Session() as sess:\n",
|
|
||||||
" saver.restore(sess, './model/mnist-tf.model')\n",
|
|
||||||
" k = output.eval(feed_dict={X : X_test})\n",
|
|
||||||
"# get the prediction, which is the index of the element that has the largest probability value.\n",
|
|
||||||
"y_hat = np.argmax(k, axis=1)\n",
|
|
||||||
"\n",
|
"\n",
|
||||||
"# print the first 30 labels and predictions\n",
|
"# print the first 30 labels and predictions\n",
|
||||||
"print('labels: \\t', y_test[:30])\n",
|
"print('labels: \\t', y_test[:30])\n",
|
||||||
@@ -690,10 +661,12 @@
|
|||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"Calculate the overall accuracy by comparing the predicted value against the test set."
|
"print(\"Accuracy on the test set:\", np.average(y_hat == y_test))"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
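If a finer-grained view than the overall accuracy is helpful, a per-digit breakdown can be computed from the same arrays; a small sketch that only assumes `y_hat` and `y_test` from the cells above.

```python
import numpy as np

# Per-class accuracy on the test set (assumes y_hat and y_test from the cells above).
for digit in range(10):
    mask = y_test == digit
    print('digit', digit, 'accuracy:', np.average(y_hat[mask] == y_test[mask]))
```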
{
|
{
|
||||||
@@ -724,9 +697,9 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"ps = RandomParameterSampling(\n",
|
"ps = RandomParameterSampling(\n",
|
||||||
" {\n",
|
" {\n",
|
||||||
" '--batch-size': choice(25, 50, 100),\n",
|
" '--batch-size': choice(32, 64, 128),\n",
|
||||||
" '--first-layer-neurons': choice(10, 50, 200, 300, 500),\n",
|
" '--first-layer-neurons': choice(16, 64, 128, 256, 512),\n",
|
||||||
" '--second-layer-neurons': choice(10, 50, 200, 500),\n",
|
" '--second-layer-neurons': choice(16, 64, 256, 512),\n",
|
||||||
" '--learning-rate': loguniform(-6, -1)\n",
|
" '--learning-rate': loguniform(-6, -1)\n",
|
||||||
" }\n",
|
" }\n",
|
||||||
")"
|
")"
|
||||||
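For context, a hedged sketch of how a sampling space like `ps` is typically wired into a HyperDrive run. The policy, run counts, and metric goal below are illustrative assumptions; only `validation_acc` is grounded, since it is the metric logged by `tf_mnist.py`.

```python
from azureml.train.hyperdrive import BanditPolicy, HyperDriveConfig, PrimaryMetricGoal

# Illustrative wiring of the sampling space above into a HyperDrive configuration;
# 'est' is the TensorFlow estimator defined below.
hd_config = HyperDriveConfig(estimator=est,
                             hyperparameter_sampling=ps,
                             policy=BanditPolicy(evaluation_interval=2, slack_factor=0.1),
                             primary_metric_name='validation_acc',
                             primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                             max_total_runs=8,
                             max_concurrent_runs=4)
```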
@@ -748,7 +721,8 @@
|
|||||||
"est = TensorFlow(source_directory=script_folder,\n",
|
"est = TensorFlow(source_directory=script_folder,\n",
|
||||||
" script_params={'--data-folder': dataset.as_named_input('mnist').as_mount()},\n",
|
" script_params={'--data-folder': dataset.as_named_input('mnist').as_mount()},\n",
|
||||||
" compute_target=compute_target,\n",
|
" compute_target=compute_target,\n",
|
||||||
" entry_script='tf_mnist.py', \n",
|
" entry_script='tf_mnist.py',\n",
|
||||||
|
" framework_version='2.0',\n",
|
||||||
" use_gpu=True,\n",
|
" use_gpu=True,\n",
|
||||||
" pip_packages=['azureml-dataprep[pandas,fuse]'])"
|
" pip_packages=['azureml-dataprep[pandas,fuse]'])"
|
||||||
]
|
]
|
||||||
@@ -928,24 +902,20 @@
|
|||||||
"from azureml.core.model import Model\n",
|
"from azureml.core.model import Model\n",
|
||||||
"\n",
|
"\n",
|
||||||
"def init():\n",
|
"def init():\n",
|
||||||
" global X, output, sess\n",
|
" global tf_model\n",
|
||||||
" tf.reset_default_graph()\n",
|
|
||||||
" model_root = os.getenv('AZUREML_MODEL_DIR')\n",
|
" model_root = os.getenv('AZUREML_MODEL_DIR')\n",
|
||||||
" # the name of the folder in which to look for tensorflow model files\n",
|
" # the name of the folder in which to look for tensorflow model files\n",
|
||||||
" tf_model_folder = 'model'\n",
|
" tf_model_folder = 'model'\n",
|
||||||
" saver = tf.train.import_meta_graph(\n",
|
" \n",
|
||||||
" os.path.join(model_root, tf_model_folder, 'mnist-tf.model.meta'))\n",
|
" tf_model = tf.saved_model.load(os.path.join(model_root, tf_model_folder))\n",
|
||||||
" X = tf.get_default_graph().get_tensor_by_name(\"network/X:0\")\n",
|
|
||||||
" output = tf.get_default_graph().get_tensor_by_name(\"network/output/MatMul:0\")\n",
|
|
||||||
"\n",
|
|
||||||
" sess = tf.Session()\n",
|
|
||||||
" saver.restore(sess, os.path.join(model_root, tf_model_folder, 'mnist-tf.model'))\n",
|
|
||||||
"\n",
|
"\n",
|
||||||
"def run(raw_data):\n",
|
"def run(raw_data):\n",
|
||||||
" data = np.array(json.loads(raw_data)['data'])\n",
|
" data = np.array(json.loads(raw_data)['data'], dtype=np.float32)\n",
|
||||||
|
" \n",
|
||||||
" # make prediction\n",
|
" # make prediction\n",
|
||||||
" out = output.eval(session=sess, feed_dict={X: data})\n",
|
" out = tf_model(data)\n",
|
||||||
" y_hat = np.argmax(out, axis=1)\n",
|
" y_hat = np.argmax(out, axis=1)\n",
|
||||||
|
"\n",
|
||||||
" return y_hat.tolist()"
|
" return y_hat.tolist()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
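If the `init` and `run` functions above are also defined in the notebook session (rather than only written out to the scoring script), a quick local smoke test might look like the following. `AZUREML_MODEL_DIR` and the `./model` folder are assumptions that mirror the download step earlier in the notebook.

```python
import json
import os

# Assumption: ./model holds the downloaded SavedModel, so AZUREML_MODEL_DIR/model resolves to it.
os.environ['AZUREML_MODEL_DIR'] = '.'
init()

# Note: score.py's run() would shadow the notebook's Azure ML run object, so this is a sketch only.
sample = json.dumps({'data': X_test[:3].tolist()})
print(run(sample))  # expected: a list of three predicted digit labels
```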
@@ -967,7 +937,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"cd = CondaDependencies.create()\n",
|
"cd = CondaDependencies.create()\n",
|
||||||
"cd.add_conda_package('numpy')\n",
|
"cd.add_conda_package('numpy')\n",
|
||||||
"cd.add_pip_package('tensorflow==1.13.1')\n",
|
"cd.add_pip_package('tensorflow==2.0.0')\n",
|
||||||
"cd.add_pip_package(\"azureml-defaults\")\n",
|
"cd.add_pip_package(\"azureml-defaults\")\n",
|
||||||
"cd.save_to_file(base_directory='./', conda_file_path='myenv.yml')\n",
|
"cd.save_to_file(base_directory='./', conda_file_path='myenv.yml')\n",
|
||||||
"\n",
|
"\n",
|
||||||
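A minimal sketch of how `myenv.yml` might then be consumed for deployment; the `score.py` file name and environment name are assumptions, and the notebook's own deployment cells may differ.

```python
from azureml.core.environment import Environment
from azureml.core.model import InferenceConfig

# Build an environment from the conda file written above and pair it with the scoring script.
myenv = Environment.from_conda_specification(name='tf2-mnist-env', file_path='myenv.yml')
inference_config = InferenceConfig(entry_script='score.py', environment=myenv)
```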
|
|||||||
@@ -1,13 +1,12 @@
|
|||||||
name: train-hyperparameter-tune-deploy-with-tensorflow
|
name: train-hyperparameter-tune-deploy-with-tensorflow
|
||||||
dependencies:
|
dependencies:
|
||||||
- numpy
|
- numpy
|
||||||
- tensorflow==1.10.0
|
|
||||||
- matplotlib
|
- matplotlib
|
||||||
- pip:
|
- pip:
|
||||||
- azureml-sdk
|
- azureml-sdk
|
||||||
- azureml-widgets
|
- azureml-widgets
|
||||||
- pandas
|
- pandas
|
||||||
- keras
|
- keras
|
||||||
|
- tensorflow==2.0.0
|
||||||
- matplotlib
|
- matplotlib
|
||||||
- azureml-dataprep
|
|
||||||
- fuse
|
- fuse
|
||||||
|
|||||||
@@ -7,6 +7,5 @@ dependencies:
|
|||||||
- tensorflow-gpu==1.13.2
|
- tensorflow-gpu==1.13.2
|
||||||
- horovod==0.16.1
|
- horovod==0.16.1
|
||||||
- matplotlib
|
- matplotlib
|
||||||
- azureml-dataprep
|
|
||||||
- pandas
|
- pandas
|
||||||
- fuse
|
- fuse
|
||||||
|
|||||||
@@ -175,13 +175,13 @@
|
|||||||
"os.makedirs(data_folder, exist_ok=True)\n",
|
"os.makedirs(data_folder, exist_ok=True)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/train-images-idx3-ubyte.gz',\n",
|
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/train-images-idx3-ubyte.gz',\n",
|
||||||
" filename=os.path.join(data_folder, 'train-images.gz'))\n",
|
" filename=os.path.join(data_folder, 'train-images-idx3-ubyte.gz'))\n",
|
||||||
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/train-labels-idx1-ubyte.gz',\n",
|
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/train-labels-idx1-ubyte.gz',\n",
|
||||||
" filename=os.path.join(data_folder, 'train-labels.gz'))\n",
|
" filename=os.path.join(data_folder, 'train-labels-idx1-ubyte.gz'))\n",
|
||||||
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/t10k-images-idx3-ubyte.gz',\n",
|
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/t10k-images-idx3-ubyte.gz',\n",
|
||||||
" filename=os.path.join(data_folder, 'test-images.gz'))\n",
|
" filename=os.path.join(data_folder, 't10k-images-idx3-ubyte.gz'))\n",
|
||||||
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/t10k-labels-idx1-ubyte.gz',\n",
|
"urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/t10k-labels-idx1-ubyte.gz',\n",
|
||||||
" filename=os.path.join(data_folder, 'test-labels.gz'))"
|
" filename=os.path.join(data_folder, 't10k-labels-idx1-ubyte.gz'))"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
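The notebook reads these `*-ubyte.gz` files with the `load_data` helper from `utils.py`. As a rough idea of what such a reader does (an assumption about its behavior, not the repository's actual implementation), an MNIST idx parser looks roughly like this:

```python
import gzip
import struct

import numpy as np


def load_data_sketch(path, is_label):
    # Hypothetical reader for MNIST idx .gz files, mirroring how load_data is used above.
    with gzip.open(path, 'rb') as f:
        if is_label:
            struct.unpack('>II', f.read(8))                    # magic number, item count
            return np.frombuffer(f.read(), dtype=np.uint8)
        _, n_images, rows, cols = struct.unpack('>IIII', f.read(16))
        return np.frombuffer(f.read(), dtype=np.uint8).reshape(n_images, rows * cols)
```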
{
|
{
|
||||||
@@ -209,10 +209,10 @@
|
|||||||
"from utils import load_data\n",
|
"from utils import load_data\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# note we also shrink the intensity values (X) from 0-255 to 0-1. This helps the model converge faster.\n",
|
"# note we also shrink the intensity values (X) from 0-255 to 0-1. This helps the model converge faster.\n",
|
||||||
"X_train = load_data(os.path.join(data_folder, 'train-images.gz'), False) / 255.0\n",
|
"X_train = load_data(os.path.join(data_folder, 'train-images-idx3-ubyte.gz'), False) / 255.0\n",
|
||||||
"X_test = load_data(os.path.join(data_folder, 'test-images.gz'), False) / 255.0\n",
|
"X_test = load_data(os.path.join(data_folder, 't10k-images-idx3-ubyte.gz'), False) / 255.0\n",
|
||||||
"y_train = load_data(os.path.join(data_folder, 'train-labels.gz'), True).reshape(-1)\n",
|
"y_train = load_data(os.path.join(data_folder, 'train-labels-idx1-ubyte.gz'), True).reshape(-1)\n",
|
||||||
"y_test = load_data(os.path.join(data_folder, 'test-labels.gz'), True).reshape(-1)\n",
|
"y_test = load_data(os.path.join(data_folder, 't10k-labels-idx1-ubyte.gz'), True).reshape(-1)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# now let's show some randomly chosen images from the training set.\n",
|
"# now let's show some randomly chosen images from the training set.\n",
|
||||||
"count = 0\n",
|
"count = 0\n",
|
||||||
@@ -243,10 +243,10 @@
|
|||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"from azureml.core.dataset import Dataset\n",
|
"from azureml.core.dataset import Dataset\n",
|
||||||
"web_paths = ['http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz',\n",
|
"web_paths = ['https://azureopendatastorage.blob.core.windows.net/mnist/train-images-idx3-ubyte.gz',\n",
|
||||||
" 'http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz',\n",
|
" 'https://azureopendatastorage.blob.core.windows.net/mnist/train-labels-idx1-ubyte.gz',\n",
|
||||||
" 'http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz',\n",
|
" 'https://azureopendatastorage.blob.core.windows.net/mnist/t10k-images-idx3-ubyte.gz',\n",
|
||||||
" 'http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz'\n",
|
" 'https://azureopendatastorage.blob.core.windows.net/mnist/t10k-labels-idx1-ubyte.gz'\n",
|
||||||
" ]\n",
|
" ]\n",
|
||||||
"dataset = Dataset.File.from_files(path = web_paths)"
|
"dataset = Dataset.File.from_files(path = web_paths)"
|
||||||
]
|
]
|
||||||
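Optionally, a file dataset created this way can be registered so later runs can fetch it by name. A small sketch, assuming the workspace object is `ws` as usual in these notebooks; the dataset name is illustrative.

```python
# Register the dataset in the workspace so it can be retrieved later with Dataset.get_by_name.
dataset = dataset.register(workspace=ws, name='mnist-opendatasets', create_new_version=True)
print(dataset.name, dataset.version)
```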
@@ -445,9 +445,9 @@
|
|||||||
"# ensure latest azureml-dataprep and other required packages installed in the environment\n",
|
"# ensure latest azureml-dataprep and other required packages installed in the environment\n",
|
||||||
"cd = CondaDependencies.create(pip_packages=['keras',\n",
|
"cd = CondaDependencies.create(pip_packages=['keras',\n",
|
||||||
" 'azureml-sdk',\n",
|
" 'azureml-sdk',\n",
|
||||||
" 'tensorflow==1.14.0',\n",
|
" 'tensorflow==2.0.0',\n",
|
||||||
" 'matplotlib',\n",
|
" 'matplotlib',\n",
|
||||||
" 'azureml-dataprep[pandas,fuse]>=1.1.14'])\n",
|
" 'azureml-dataprep[pandas,fuse]'])\n",
|
||||||
"\n",
|
"\n",
|
||||||
"env.python.conda_dependencies = cd"
|
"env.python.conda_dependencies = cd"
|
||||||
]
|
]
|
||||||
@@ -466,9 +466,9 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"script_params = {\n",
|
"script_params = {\n",
|
||||||
" '--data-folder': dataset.as_named_input('mnist').as_mount(),\n",
|
" '--data-folder': dataset.as_named_input('mnist').as_mount(),\n",
|
||||||
" '--batch-size': 50,\n",
|
" '--batch-size': 64,\n",
|
||||||
" '--first-layer-neurons': 300,\n",
|
" '--first-layer-neurons': 256,\n",
|
||||||
" '--second-layer-neurons': 100,\n",
|
" '--second-layer-neurons': 128,\n",
|
||||||
" '--learning-rate': 0.01\n",
|
" '--learning-rate': 0.01\n",
|
||||||
"}\n",
|
"}\n",
|
||||||
"\n",
|
"\n",
|
||||||
@@ -476,7 +476,7 @@
|
|||||||
" script_params=script_params,\n",
|
" script_params=script_params,\n",
|
||||||
" compute_target=compute_target,\n",
|
" compute_target=compute_target,\n",
|
||||||
" entry_script='tf_mnist.py', \n",
|
" entry_script='tf_mnist.py', \n",
|
||||||
" framework_version='1.13',\n",
|
" framework_version='2.0',\n",
|
||||||
" environment_definition= env)"
|
" environment_definition= env)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -534,9 +534,9 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"ps = RandomParameterSampling(\n",
|
"ps = RandomParameterSampling(\n",
|
||||||
" {\n",
|
" {\n",
|
||||||
" '--batch-size': choice(25, 50, 100),\n",
|
" '--batch-size': choice(32, 64, 128),\n",
|
||||||
" '--first-layer-neurons': choice(10, 50, 200, 300, 500),\n",
|
" '--first-layer-neurons': choice(16, 64, 128, 256, 512),\n",
|
||||||
" '--second-layer-neurons': choice(10, 50, 200, 500),\n",
|
" '--second-layer-neurons': choice(16, 64, 256, 512),\n",
|
||||||
" '--learning-rate': loguniform(-6, -1)\n",
|
" '--learning-rate': loguniform(-6, -1)\n",
|
||||||
" }\n",
|
" }\n",
|
||||||
")"
|
")"
|
||||||
@@ -558,7 +558,8 @@
|
|||||||
"est = TensorFlow(source_directory=script_folder,\n",
|
"est = TensorFlow(source_directory=script_folder,\n",
|
||||||
" script_params={'--data-folder': dataset.as_named_input('mnist').as_mount()},\n",
|
" script_params={'--data-folder': dataset.as_named_input('mnist').as_mount()},\n",
|
||||||
" compute_target=compute_target,\n",
|
" compute_target=compute_target,\n",
|
||||||
" entry_script='tf_mnist.py', \n",
|
" entry_script='tf_mnist.py',\n",
|
||||||
|
" framework_version='2.0',\n",
|
||||||
" environment_definition = env)"
|
" environment_definition = env)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -566,7 +567,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Next we will define an early termination policy. This will terminate poorly performing runs automatically, reducing waste of resources and instead using them to explore other parameter configurations. In this example, we will use the `TruncationSelectionPolicy`, truncating the bottom-performing 10% of runs. The policy checks the job every 2 iterations. If the primary metric (defined later) falls in the bottom 25% range, Azure ML terminates the job. This saves us from continuing to explore hyperparameters that don't show promise of helping us reach our target metric."
|
"Next we will define an early termination policy. This will terminate poorly performing runs automatically, reducing waste of resources and instead using them to explore other parameter configurations. In this example, we will use the `TruncationSelectionPolicy`, truncating the bottom-performing 25% of runs. The policy checks the job every 2 iterations. If the primary metric (defined later) falls in the bottom 25% range, Azure ML terminates the job. This saves us from continuing to explore hyperparameters that don't show promise of helping us reach our target metric."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
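For reference, the policy described above maps directly onto the HyperDrive API; a minimal sketch of the object the paragraph refers to:

```python
from azureml.train.hyperdrive import TruncationSelectionPolicy

# Truncate the bottom-performing 25% of runs, evaluating every 2 intervals, as described above.
early_termination_policy = TruncationSelectionPolicy(evaluation_interval=2, truncation_percentage=25)
```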
{
|
{
|
||||||
|
|||||||
@@ -7,7 +7,6 @@ dependencies:
|
|||||||
- azureml-widgets
|
- azureml-widgets
|
||||||
- pandas
|
- pandas
|
||||||
- keras
|
- keras
|
||||||
- tensorflow==1.14.0
|
- tensorflow
|
||||||
- matplotlib
|
- matplotlib
|
||||||
- azureml-dataprep
|
|
||||||
- fuse
|
- fuse
|
||||||
|
|||||||
@@ -11,15 +11,74 @@ import glob
|
|||||||
|
|
||||||
from azureml.core import Run
|
from azureml.core import Run
|
||||||
from utils import load_data
|
from utils import load_data
|
||||||
|
from tensorflow.keras import Model, layers
|
||||||
|
|
||||||
|
|
||||||
|
# Create TF Model.
|
||||||
|
class NeuralNet(Model):
|
||||||
|
# Set layers.
|
||||||
|
def __init__(self):
|
||||||
|
super(NeuralNet, self).__init__()
|
||||||
|
# First hidden layer.
|
||||||
|
self.h1 = layers.Dense(n_h1, activation=tf.nn.relu)
|
||||||
|
# Second hidden layer.
|
||||||
|
self.h2 = layers.Dense(n_h2, activation=tf.nn.relu)
|
||||||
|
self.out = layers.Dense(n_outputs)
|
||||||
|
|
||||||
|
# Set forward pass.
|
||||||
|
def call(self, x, is_training=False):
|
||||||
|
x = self.h1(x)
|
||||||
|
x = self.h2(x)
|
||||||
|
x = self.out(x)
|
||||||
|
if not is_training:
|
||||||
|
# Apply softmax when not training.
|
||||||
|
x = tf.nn.softmax(x)
|
||||||
|
return x
|
||||||
|
|
||||||
|
|
||||||
|
def cross_entropy_loss(y, logits):
|
||||||
|
# Convert labels to int 64 for tf cross-entropy function.
|
||||||
|
y = tf.cast(y, tf.int64)
|
||||||
|
# Apply softmax to logits and compute cross-entropy.
|
||||||
|
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits)
|
||||||
|
# Average loss across the batch.
|
||||||
|
return tf.reduce_mean(loss)
|
||||||
|
|
||||||
|
|
||||||
|
# Accuracy metric.
|
||||||
|
def accuracy(y_pred, y_true):
|
||||||
|
# Predicted class is the index of highest score in prediction vector (i.e. argmax).
|
||||||
|
correct_prediction = tf.equal(tf.argmax(y_pred, 1), tf.cast(y_true, tf.int64))
|
||||||
|
return tf.reduce_mean(tf.cast(correct_prediction, tf.float32), axis=-1)
|
||||||
|
|
||||||
|
|
||||||
|
# Optimization process.
|
||||||
|
def run_optimization(x, y):
|
||||||
|
# Wrap computation inside a GradientTape for automatic differentiation.
|
||||||
|
with tf.GradientTape() as g:
|
||||||
|
# Forward pass.
|
||||||
|
logits = neural_net(x, is_training=True)
|
||||||
|
# Compute loss.
|
||||||
|
loss = cross_entropy_loss(y, logits)
|
||||||
|
|
||||||
|
# Variables to update, i.e. trainable variables.
|
||||||
|
trainable_variables = neural_net.trainable_variables
|
||||||
|
|
||||||
|
# Compute gradients.
|
||||||
|
gradients = g.gradient(loss, trainable_variables)
|
||||||
|
|
||||||
|
# Update W and b following gradients.
|
||||||
|
optimizer.apply_gradients(zip(gradients, trainable_variables))
|
||||||
|
|
||||||
|
|
||||||
print("TensorFlow version:", tf.__version__)
|
print("TensorFlow version:", tf.__version__)
|
||||||
|
|
||||||
parser = argparse.ArgumentParser()
|
parser = argparse.ArgumentParser()
|
||||||
parser.add_argument('--data-folder', type=str, dest='data_folder', help='data folder mounting point')
|
parser.add_argument('--data-folder', type=str, dest='data_folder', default='data', help='data folder mounting point')
|
||||||
parser.add_argument('--batch-size', type=int, dest='batch_size', default=50, help='mini batch size for training')
|
parser.add_argument('--batch-size', type=int, dest='batch_size', default=128, help='mini batch size for training')
|
||||||
parser.add_argument('--first-layer-neurons', type=int, dest='n_hidden_1', default=100,
|
parser.add_argument('--first-layer-neurons', type=int, dest='n_hidden_1', default=128,
|
||||||
help='# of neurons in the first layer')
|
help='# of neurons in the first layer')
|
||||||
parser.add_argument('--second-layer-neurons', type=int, dest='n_hidden_2', default=100,
|
parser.add_argument('--second-layer-neurons', type=int, dest='n_hidden_2', default=128,
|
||||||
help='# of neurons in the second layer')
|
help='# of neurons in the second layer')
|
||||||
parser.add_argument('--learning-rate', type=float, dest='learning_rate', default=0.01, help='learning rate')
|
parser.add_argument('--learning-rate', type=float, dest='learning_rate', default=0.01, help='learning rate')
|
||||||
parser.add_argument('--resume-from', type=str, default=None,
|
parser.add_argument('--resume-from', type=str, default=None,
|
||||||
@@ -36,9 +95,9 @@ print('Data folder:', data_folder)
|
|||||||
# load train and test set into numpy arrays
|
# load train and test set into numpy arrays
|
||||||
# note we scale the pixel intensity values to 0-1 (by dividing it with 255.0) so the model can converge faster.
|
# note we scale the pixel intensity values to 0-1 (by dividing it with 255.0) so the model can converge faster.
|
||||||
X_train = load_data(glob.glob(os.path.join(data_folder, '**/train-images-idx3-ubyte.gz'),
|
X_train = load_data(glob.glob(os.path.join(data_folder, '**/train-images-idx3-ubyte.gz'),
|
||||||
recursive=True)[0], False) / 255.0
|
recursive=True)[0], False) / np.float32(255.0)
|
||||||
X_test = load_data(glob.glob(os.path.join(data_folder, '**/t10k-images-idx3-ubyte.gz'),
|
X_test = load_data(glob.glob(os.path.join(data_folder, '**/t10k-images-idx3-ubyte.gz'),
|
||||||
recursive=True)[0], False) / 255.0
|
recursive=True)[0], False) / np.float32(255.0)
|
||||||
y_train = load_data(glob.glob(os.path.join(data_folder, '**/train-labels-idx1-ubyte.gz'),
|
y_train = load_data(glob.glob(os.path.join(data_folder, '**/train-labels-idx1-ubyte.gz'),
|
||||||
recursive=True)[0], True).reshape(-1)
|
recursive=True)[0], True).reshape(-1)
|
||||||
y_test = load_data(glob.glob(os.path.join(data_folder, '**/t10k-labels-idx1-ubyte.gz'),
|
y_test = load_data(glob.glob(os.path.join(data_folder, '**/t10k-labels-idx1-ubyte.gz'),
|
||||||
@@ -56,88 +115,77 @@ learning_rate = args.learning_rate
|
|||||||
n_epochs = 20
|
n_epochs = 20
|
||||||
batch_size = args.batch_size
|
batch_size = args.batch_size
|
||||||
|
|
||||||
with tf.name_scope('network'):
|
# Build neural network model.
|
||||||
# construct the DNN
|
neural_net = NeuralNet()
|
||||||
X = tf.placeholder(tf.float32, shape=(None, n_inputs), name='X')
|
|
||||||
y = tf.placeholder(tf.int64, shape=(None), name='y')
|
|
||||||
h1 = tf.layers.dense(X, n_h1, activation=tf.nn.relu, name='h1')
|
|
||||||
h2 = tf.layers.dense(h1, n_h2, activation=tf.nn.relu, name='h2')
|
|
||||||
output = tf.layers.dense(h2, n_outputs, name='output')
|
|
||||||
|
|
||||||
with tf.name_scope('train'):
|
# Stochastic gradient descent optimizer.
|
||||||
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=output)
|
optimizer = tf.optimizers.SGD(learning_rate)
|
||||||
loss = tf.reduce_mean(cross_entropy, name='loss')
|
|
||||||
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
|
|
||||||
train_op = optimizer.minimize(loss)
|
|
||||||
|
|
||||||
with tf.name_scope('eval'):
|
|
||||||
correct = tf.nn.in_top_k(output, y, 1)
|
|
||||||
acc_op = tf.reduce_mean(tf.cast(correct, tf.float32))
|
|
||||||
|
|
||||||
init = tf.global_variables_initializer()
|
|
||||||
saver = tf.train.Saver()
|
|
||||||
|
|
||||||
# start an Azure ML run
|
# start an Azure ML run
|
||||||
run = Run.get_context()
|
run = Run.get_context()
|
||||||
|
|
||||||
with tf.Session() as sess:
|
if previous_model_location:
|
||||||
start_time = time.perf_counter()
|
# Restore variables from latest checkpoint.
|
||||||
|
checkpoint = tf.train.Checkpoint(model=neural_net, optimizer=optimizer)
|
||||||
|
checkpoint_file_path = tf.train.latest_checkpoint(previous_model_location)
|
||||||
|
checkpoint.restore(checkpoint_file_path)
|
||||||
|
checkpoint_filename = os.path.basename(checkpoint_file_path)
|
||||||
|
num_found = re.search(r'\d+', checkpoint_filename)
|
||||||
|
if num_found:
|
||||||
|
start_epoch = int(num_found.group(0))
|
||||||
|
print("Resuming from epoch {}".format(str(start_epoch)))
|
||||||
|
|
||||||
start_epoch = 0
|
start_time = time.perf_counter()
|
||||||
if previous_model_location:
|
for epoch in range(0, n_epochs):
|
||||||
checkpoint_file_path = tf.train.latest_checkpoint(previous_model_location)
|
|
||||||
saver.restore(sess, checkpoint_file_path)
|
|
||||||
checkpoint_filename = os.path.basename(checkpoint_file_path)
|
|
||||||
num_found = re.search(r'\d+', checkpoint_filename)
|
|
||||||
if num_found:
|
|
||||||
start_epoch = int(num_found.group(0))
|
|
||||||
print("Resuming from epoch {}".format(str(start_epoch)))
|
|
||||||
else:
|
|
||||||
init.run()
|
|
||||||
|
|
||||||
for epoch in range(start_epoch, n_epochs):
|
# randomly shuffle training set
|
||||||
|
indices = np.random.permutation(training_set_size)
|
||||||
|
X_train = X_train[indices]
|
||||||
|
y_train = y_train[indices]
|
||||||
|
|
||||||
# randomly shuffle training set
|
# batch index
|
||||||
indices = np.random.permutation(training_set_size)
|
b_start = 0
|
||||||
X_train = X_train[indices]
|
b_end = b_start + batch_size
|
||||||
y_train = y_train[indices]
|
for _ in range(training_set_size // batch_size):
|
||||||
|
# get a batch
|
||||||
|
X_batch, y_batch = X_train[b_start: b_end], y_train[b_start: b_end]
|
||||||
|
|
||||||
# batch index
|
# update batch index for the next batch
|
||||||
b_start = 0
|
b_start = b_start + batch_size
|
||||||
b_end = b_start + batch_size
|
b_end = min(b_start + batch_size, training_set_size)
|
||||||
for _ in range(training_set_size // batch_size):
|
|
||||||
# get a batch
|
|
||||||
X_batch, y_batch = X_train[b_start: b_end], y_train[b_start: b_end]
|
|
||||||
|
|
||||||
# update batch index for the next batch
|
# train
|
||||||
b_start = b_start + batch_size
|
run_optimization(X_batch, y_batch)
|
||||||
b_end = min(b_start + batch_size, training_set_size)
|
|
||||||
|
|
||||||
# train
|
# evaluate training set
|
||||||
sess.run(train_op, feed_dict={X: X_batch, y: y_batch})
|
pred = neural_net(X_batch, is_training=False)
|
||||||
# evaluate training set
|
acc_train = accuracy(pred, y_batch)
|
||||||
acc_train = acc_op.eval(feed_dict={X: X_batch, y: y_batch})
|
|
||||||
# evaluate validation set
|
|
||||||
acc_val = acc_op.eval(feed_dict={X: X_test, y: y_test})
|
|
||||||
|
|
||||||
time.sleep(10)
|
# evaluate validation set
|
||||||
|
pred = neural_net(X_test, is_training=False)
|
||||||
|
acc_val = accuracy(pred, y_test)
|
||||||
|
|
||||||
# log accuracies
|
# log accuracies
|
||||||
run.log('training_acc', np.float(acc_train))
|
run.log('training_acc', np.float(acc_train))
|
||||||
run.log('validation_acc', np.float(acc_val))
|
run.log('validation_acc', np.float(acc_val))
|
||||||
print(epoch, '-- Training accuracy:', acc_train, '\b Validation accuracy:', acc_val)
|
print(epoch, '-- Training accuracy:', acc_train, '\b Validation accuracy:', acc_val)
|
||||||
y_hat = np.argmax(output.eval(feed_dict={X: X_test}), axis=1)
|
|
||||||
|
|
||||||
# Save checkpoints in the "./outputs" folder so that they are automatically uploaded into run history.
|
# Save checkpoints in the "./outputs" folder so that they are automatically uploaded into run history.
|
||||||
if epoch % 2 == 0:
|
checkpoint_dir = './outputs/'
|
||||||
saver.save(sess, './outputs/', global_step=epoch)
|
checkpoint = tf.train.Checkpoint(model=neural_net, optimizer=optimizer)
|
||||||
|
|
||||||
run.log('final_acc', np.float(acc_val))
|
if epoch % 2 == 0:
|
||||||
|
checkpoint.save(checkpoint_dir)
|
||||||
|
time.sleep(3)
|
||||||
|
|
||||||
os.makedirs('./outputs/model', exist_ok=True)
|
run.log('final_acc', np.float(acc_val))
|
||||||
# files saved in the "./outputs" folder are automatically uploaded into run history
|
os.makedirs('./outputs/model', exist_ok=True)
|
||||||
saver.save(sess, './outputs/model/mnist-tf.model')
|
|
||||||
|
|
||||||
stop_time = time.perf_counter()
|
# files saved in the "./outputs" folder are automatically uploaded into run history
|
||||||
training_time = (stop_time - start_time) * 1000
|
# this is workaround for https://github.com/tensorflow/tensorflow/issues/33913 and will be fixed once we move to >tf2.1
|
||||||
print("Total time in milliseconds for training: {}".format(str(training_time)))
|
neural_net._set_inputs(X_train)
|
||||||
|
tf.saved_model.save(neural_net, './outputs/model/')
|
||||||
|
|
||||||
|
stop_time = time.perf_counter()
|
||||||
|
training_time = (stop_time - start_time) * 1000
|
||||||
|
print("Total time in milliseconds for training: {}".format(str(training_time)))
|
||||||
|
|||||||
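As an aside, not part of the original script: in TensorFlow 2, an update step like `run_optimization` is often wrapped in `tf.function` so the computation graph is traced once and reused across batches. A hedged sketch, reusing the names defined in the script above:

```python
import tensorflow as tf


@tf.function
def run_optimization_compiled(x, y):
    # Same logic as run_optimization above, but traced into a reusable graph by tf.function.
    with tf.GradientTape() as g:
        loss = cross_entropy_loss(y, neural_net(x, is_training=True))
    gradients = g.gradient(loss, neural_net.trainable_variables)
    optimizer.apply_gradients(zip(gradients, neural_net.trainable_variables))
    return loss
```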
@@ -7,5 +7,4 @@ dependencies:
|
|||||||
- keras
|
- keras
|
||||||
- tensorflow==1.14.0
|
- tensorflow==1.14.0
|
||||||
- matplotlib
|
- matplotlib
|
||||||
- azureml-dataprep
|
|
||||||
- fuse
|
- fuse
|
||||||
|
|||||||
@@ -184,11 +184,10 @@
|
|||||||
"prov_config = AksCompute.provisioning_configuration()\n",
|
"prov_config = AksCompute.provisioning_configuration()\n",
|
||||||
"\n",
|
"\n",
|
||||||
"aks_name = 'drift-aks'\n",
|
"aks_name = 'drift-aks'\n",
|
||||||
|
"aks_target = ws.compute_targets.get(aks_name)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# Create the cluster\n",
|
"# Create the cluster\n",
|
||||||
"try:\n",
|
"if not aks_target:\n",
|
||||||
" aks_target = ws.compute_targets[aks_name]\n",
|
|
||||||
"except KeyError:\n",
|
|
||||||
" aks_target = ComputeTarget.create(workspace = ws,\n",
|
" aks_target = ComputeTarget.create(workspace = ws,\n",
|
||||||
" name = aks_name,\n",
|
" name = aks_name,\n",
|
||||||
" provisioning_configuration = prov_config)\n",
|
" provisioning_configuration = prov_config)\n",
|
||||||
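After creating or fetching the AKS target, a deployment usually waits for provisioning to finish; a short sketch (the rest of this notebook cell is not shown in the hunk above):

```python
# Wait until the AKS cluster is ready before deploying to it.
aks_target.wait_for_completion(show_output=True)
print(aks_target.provisioning_state, aks_target.provisioning_errors)
```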
|
|||||||
118
how-to-use-azureml/reinforcement-learning/README.md
Normal file
@@ -0,0 +1,118 @@
|
|||||||
|
|
||||||
|
# Azure Machine Learning - Reinforcement Learning (Public Preview)
|
||||||
|
|
||||||
|
<!--
|
||||||
|
Guidelines on README format: https://review.docs.microsoft.com/help/onboard/admin/samples/concepts/readme-template?branch=master
|
||||||
|
|
||||||
|
Guidance on onboarding samples to docs.microsoft.com/samples: https://review.docs.microsoft.com/help/onboard/admin/samples/process/onboarding?branch=master
|
||||||
|
|
||||||
|
Taxonomies for products and languages: https://review.docs.microsoft.com/new-hope/information-architecture/metadata/taxonomies?branch=master
|
||||||
|
-->
|
||||||
|
|
||||||
|
This is an introduction to the [Azure Machine Learning](https://docs.microsoft.com/en-us/azure/machine-learning/service/) Reinforcement Learning (Public Preview) using the [Ray](https://github.com/ray-project/ray/) framework.
|
||||||
|
|
||||||
|
Using these samples, you will be able to do the following.
|
||||||
|
|
||||||
|
1. Use an Azure Machine Learning workspace, set up a virtual network, and create compute clusters for running Ray.
|
||||||
|
2. Run some experiments to train a reinforcement learning agent using Ray and RLlib.
|
||||||
|
|
||||||
|
## Contents
|
||||||
|
|
||||||
|
| File/folder | Description |
|
||||||
|
|-------------------|--------------------------------------------|
|
||||||
|
| [devenv_setup.ipynb](setup/devenv_setup.ipynb) | Notebook to setup development environment for Azure ML RL |
|
||||||
|
| [cartpole_ci.ipynb](cartpole-on-compute-instance/cartpole_ci.ipynb) | Notebook to train a Cartpole playing agent on an Azure ML Compute Instance |
|
||||||
|
| [cartpole_cc.ipynb](cartpole-on-single-compute/cartpole_cc.ipynb) | Notebook to train a Cartpole playing agent on an Azure ML Compute Cluster (single node) |
|
||||||
|
| [pong_rllib.ipynb](atari-on-distributed-compute/pong_rllib.ipynb) | Notebook to train Pong agent using RLlib on multiple compute targets |
|
||||||
|
| [minecraft.ipynb](minecraft-on-distributed-compute/minecraft.ipynb) | Notebook to train an agent to navigate through a lava maze in the Minecraft game |
|
||||||
|
|
||||||
|
## Prerequisites
|
||||||
|
|
||||||
|
To make use of these samples, you need the following.
|
||||||
|
|
||||||
|
* A Microsoft Azure subscription.
|
||||||
|
* A Microsoft Azure resource group.
|
||||||
|
* An Azure Machine Learning Workspace in the resource group. Please make sure that the VM sizes `STANDARD_NC6` and `STANDARD_D2_V2` are supported in the workspace's region.
|
||||||
|
* A virtual network set up in the resource group.
|
||||||
|
* A virtual network is needed for the examples that train on multiple compute targets.
|
||||||
|
* The [devenv_setup.ipynb](setup/devenv_setup.ipynb) notebook shows you how to create a virtual network. You can alternatively use an existing virtual network; make sure it is in the same region as the workspace.
|
||||||
|
* Any network security group defined on the virtual network must allow network traffic on ports used by Azure infrastructure services. This is described in more detail in the [devenv_setup.ipynb](setup/devenv_setup.ipynb) notebook.
|
||||||
|
|
||||||
|
|
||||||
|
## Setup
|
||||||
|
|
||||||
|
You can run these samples in the following ways.
|
||||||
|
|
||||||
|
* On an Azure ML Compute Instance or Notebook VM.
|
||||||
|
* On a workstation with Python and the Azure ML Python SDK installed.
|
||||||
|
|
||||||
|
### Azure ML Compute Instance or Notebook VM
|
||||||
|
#### Update packages
|
||||||
|
|
||||||
|
|
||||||
|
We recommend that you update the required Python packages before you proceed. The following commands are meant to be entered in a Python interpreter, such as a notebook.
|
||||||
|
|
||||||
|
```shell
|
||||||
|
# We recommend updating pip to the latest version.
|
||||||
|
!pip install --upgrade pip
|
||||||
|
# Update matplotlib for plotting charts
|
||||||
|
!pip install --upgrade matplotlib
|
||||||
|
# Update Azure Machine Learning SDK to the latest version
|
||||||
|
!pip install --upgrade azureml-sdk
|
||||||
|
# For Jupyter notebook widget used in samples
|
||||||
|
!pip install --upgrade azureml-widgets
|
||||||
|
# For Tensorboard used in samples
|
||||||
|
!pip install --upgrade azureml-tensorboard
|
||||||
|
# Install Azure Machine Learning Reinforcement Learning SDK
|
||||||
|
!pip install --upgrade azureml-contrib-reinforcementlearning
|
||||||
|
```
|
||||||
|
|
||||||
|
### Your own workstation
|
||||||
|
#### Install/update packages
|
||||||
|
|
||||||
|
For a local workstation, create a Python environment and install the [Azure Machine Learning SDK](https://docs.microsoft.com/en-us/python/api/overview/azure/ml/install?view=azure-ml-py) and the RL SDK. We recommend Python 3.6 or higher.
|
||||||
|
|
||||||
|
```shell
|
||||||
|
# Activate your environment first.
|
||||||
|
# e.g.,
|
||||||
|
# conda activate amlrl
|
||||||
|
# We recommend updating pip to the latest version.
|
||||||
|
pip install --upgrade pip
|
||||||
|
# Install/upgrade matplotlib for plotting charts
|
||||||
|
pip install --upgrade matplotlib
|
||||||
|
# Install/upgrade tensorboard used in samples
|
||||||
|
pip install --upgrade tensorboard
|
||||||
|
# Install/upgrade Azure ML SDK to the latest version
|
||||||
|
pip install --upgrade azureml-sdk
|
||||||
|
# For Jupyter notebook widget used in samples
|
||||||
|
pip install --upgrade azureml-widgets
|
||||||
|
# For Tensorboard used in samples
|
||||||
|
pip install --upgrade azureml-tensorboard
|
||||||
|
# Install Azure Machine Learning Reinforcement Learning SDK
|
||||||
|
pip install --upgrade azureml-contrib-reinforcementlearning
|
||||||
|
# To use the notebook widget, you may need to register and enable the Azure ML extensions first.
|
||||||
|
jupyter nbextension install --py --user azureml.widgets
|
||||||
|
jupyter nbextension enable --py --user azureml.widgets
|
||||||
|
```
|
||||||
|
|
||||||
|
## Contributing
|
||||||
|
|
||||||
|
This project welcomes contributions and suggestions. Most contributions require you to agree to a
|
||||||
|
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
|
||||||
|
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
|
||||||
|
|
||||||
|
When you submit a pull request, a CLA bot will automatically determine whether you need to provide
|
||||||
|
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
|
||||||
|
provided by the bot. You will only need to do this once across all repos using our CLA.
|
||||||
|
|
||||||
|
This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
|
||||||
|
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
|
||||||
|
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
|
||||||
|
|
||||||
|
For more on SDK concepts, please refer to [notebooks](https://github.com/Azure/MachineLearningNotebooks).
|
||||||
|
|
||||||
|
**Please let us know your [feedback](https://github.com/Azure/MachineLearningNotebooks/labels/Reinforcement%20Learning).**
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|

|
||||||
@@ -0,0 +1,39 @@
|
|||||||
|
import ray
|
||||||
|
import ray.tune as tune
|
||||||
|
from ray.rllib import train
|
||||||
|
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
|
||||||
|
from azureml.core import Run
|
||||||
|
from utils import callbacks
|
||||||
|
|
||||||
|
DEFAULT_RAY_ADDRESS = 'localhost:6379'
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
|
||||||
|
# Parse arguments
|
||||||
|
train_parser = train.create_parser()
|
||||||
|
|
||||||
|
args = train_parser.parse_args()
|
||||||
|
print("Algorithm config:", args.config)
|
||||||
|
|
||||||
|
if args.ray_address is None:
|
||||||
|
args.ray_address = DEFAULT_RAY_ADDRESS
|
||||||
|
|
||||||
|
ray.init(address=args.ray_address)
|
||||||
|
|
||||||
|
tune.run(run_or_experiment=args.run,
|
||||||
|
config={
|
||||||
|
"env": args.env,
|
||||||
|
"num_gpus": args.config["num_gpus"],
|
||||||
|
"num_workers": args.config["num_workers"],
|
||||||
|
"callbacks": {"on_train_result": callbacks.on_train_result},
|
||||||
|
"sample_batch_size": 50,
|
||||||
|
"train_batch_size": 1000,
|
||||||
|
"num_sgd_iter": 2,
|
||||||
|
"num_data_loader_buffers": 2,
|
||||||
|
"model": {"dim": 42},
|
||||||
|
},
|
||||||
|
stop=args.stop,
|
||||||
|
local_dir='./logs')
|
||||||
@@ -0,0 +1,17 @@
|
|||||||
|
'''RLlib callbacks module:
|
||||||
|
Common callback methods to be passed to RLlib trainer.
|
||||||
|
'''
|
||||||
|
|
||||||
|
from azureml.core import Run
|
||||||
|
|
||||||
|
|
||||||
|
def on_train_result(info):
|
||||||
|
'''Callback on train result to record metrics returned by trainer.
|
||||||
|
'''
|
||||||
|
run = Run.get_context()
|
||||||
|
run.log(
|
||||||
|
name='episode_reward_mean',
|
||||||
|
value=info["result"]["episode_reward_mean"])
|
||||||
|
run.log(
|
||||||
|
name='episodes_total',
|
||||||
|
value=info["result"]["episodes_total"])
|
||||||
Binary file not shown. (added image, 340 KiB)
@@ -0,0 +1,604 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
||||||
|
"\n",
|
||||||
|
"Licensed under the MIT License."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Azure ML Reinforcement Learning Sample - Pong problem\n",
|
||||||
|
"Azure ML Reinforcement Learning (Azure ML RL) is a managed service for running distributed RL (reinforcement learning) simulation and training using the Ray framework.\n",
|
||||||
|
"This example uses Ray RLlib to train a Pong playing agent on a multi-node cluster.\n",
|
||||||
|
"\n",
|
||||||
|
"## Pong problem\n",
|
||||||
|
"[Pong](https://en.wikipedia.org/wiki/Pong) is a two-dimensional sports game that simulates table tennis. The player controls an in-game paddle by moving it vertically across the left or right side of the screen. They can compete against another player controlling a second paddle on the opposing side. Players use the paddles to hit a ball back and forth."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"<table style=\"width:50%\">\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th style=\"text-align: center;\"><img src=\"./images/pong.gif\" alt=\"Pong image\" align=\"middle\" margin-left=\"auto\" margin-right=\"auto\"/></th>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr style=\"text-align: center;\">\n",
|
||||||
|
" <th>Fig 1. Pong game animation (from <a href=\"https://towardsdatascience.com/intro-to-reinforcement-learning-pong-92a94aa0f84d\">towardsdatascience.com</a>).</th>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
"</table>"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"The goal here is to train an agent to win an episode of Pong against its opponent by a margin of at least 18 points. An episode in Pong runs until one of the players reaches a score of 21. An episode is the term used across all [OpenAI gym](https://gym.openai.com/envs/Pong-v0/) environments for a strictly defined task.\n",
|
||||||
|
"\n",
|
||||||
|
"Training a Pong agent is a CPU-intensive task, and this example demonstrates how to use the Azure ML RL service to train an agent faster in a distributed, parallel environment. Below in this notebook you'll learn more about using the head and worker compute targets to train an agent."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Prerequisite\n",
|
||||||
|
"\n",
|
||||||
|
"You should have completed the [Azure ML Reinforcement Learning Sample - Setting Up Development Environment](../setup/devenv_setup.ipynb) notebook to set up a virtual network. This virtual network will be used here for the head and worker compute targets. It is highly recommended that you first go through the [Azure ML Reinforcement Learning Sample - Cartpole Problem](../cartpole-on-single-compute/cartpole_cc.ipynb) to understand the basics of Azure ML RL and Ray RLlib used in this notebook."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Set up Development Environment\n",
|
||||||
|
"The following subsections show typical steps to set up your development environment. Setup includes:\n",
|
||||||
|
"\n",
|
||||||
|
"* Connecting to a workspace to enable communication between your local machine and remote resources\n",
|
||||||
|
"* Creating an experiment to track all your runs\n",
|
||||||
|
"* Creating a remote head and worker compute target on a vnet to use for training"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Azure Machine Learning SDK\n",
|
||||||
|
"Display the Azure Machine Learning SDK version."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%matplotlib inline\n",
|
||||||
|
"\n",
|
||||||
|
"# Azure ML core imports\n",
|
||||||
|
"import azureml.core\n",
|
||||||
|
"\n",
|
||||||
|
"# Check core SDK version number\n",
|
||||||
|
"print(\"Azure ML SDK Version: \", azureml.core.VERSION)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Get Azure ML workspace\n",
|
||||||
|
"Get a reference to an existing Azure ML workspace."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core import Workspace\n",
|
||||||
|
"\n",
|
||||||
|
"ws = Workspace.from_config()\n",
|
||||||
|
"print(ws.name, ws.location, ws.resource_group, sep = ' | ')"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Create Azure ML experiment\n",
|
||||||
|
"Create an experiment to track the runs in your workspace."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core.experiment import Experiment\n",
|
||||||
|
"\n",
|
||||||
|
"# Experiment name\n",
|
||||||
|
"experiment_name = 'rllib-pong-multi-node'\n",
|
||||||
|
"exp = Experiment(workspace=ws, name=experiment_name)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Specify the name of your vnet\n",
|
||||||
|
"\n",
|
||||||
|
"The resource group you use must contain a vnet. Specify here the name of the vnet you created in the [Azure ML Reinforcement Learning Sample - Setting Up Development Environment](../setup/devenv_setup.ipynb)."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Virtual network name\n",
|
||||||
|
"vnet_name = 'your_vnet'"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Create head compute cluster\n",
|
||||||
|
"\n",
|
||||||
|
"In this example, we show how to set up separate compute clusters for the Ray head and Ray worker nodes. First we define the head cluster with GPU for the Ray head node. One CPU of the head node will be used for the Ray head process and the rest of the CPUs will be used by the Ray worker processes."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core.compute import AmlCompute, ComputeTarget\n",
|
||||||
|
"\n",
|
||||||
|
"# Choose a name for the Ray head cluster\n",
|
||||||
|
"head_compute_name = 'head-gpu'\n",
|
||||||
|
"head_compute_min_nodes = 0\n",
|
||||||
|
"head_compute_max_nodes = 2\n",
|
||||||
|
"\n",
|
||||||
|
"# This example uses GPU VM. For using CPU VM, set SKU to STANDARD_D2_V2\n",
|
||||||
|
"head_vm_size = 'STANDARD_NC6'\n",
|
||||||
|
"\n",
|
||||||
|
"if head_compute_name in ws.compute_targets:\n",
|
||||||
|
" head_compute_target = ws.compute_targets[head_compute_name]\n",
|
||||||
|
" if head_compute_target and type(head_compute_target) is AmlCompute:\n",
|
||||||
|
" if head_compute_target.provisioning_state == 'Succeeded':\n",
|
||||||
|
" print('found head compute target. just use it', head_compute_name)\n",
|
||||||
|
" else: \n",
|
||||||
|
" raise Exception('found head compute target but it is in state', head_compute_target.provisioning_state)\n",
|
||||||
|
"else:\n",
|
||||||
|
" print('creating a new head compute target...')\n",
|
||||||
|
" provisioning_config = AmlCompute.provisioning_configuration(vm_size=head_vm_size,\n",
|
||||||
|
" min_nodes=head_compute_min_nodes, \n",
|
||||||
|
" max_nodes=head_compute_max_nodes,\n",
|
||||||
|
" vnet_resourcegroup_name=ws.resource_group,\n",
|
||||||
|
" vnet_name=vnet_name,\n",
|
||||||
|
" subnet_name='default')\n",
|
||||||
|
"\n",
|
||||||
|
" # Create the cluster\n",
|
||||||
|
" head_compute_target = ComputeTarget.create(ws, head_compute_name, provisioning_config)\n",
|
||||||
|
" \n",
|
||||||
|
" # Can poll for a minimum number of nodes and for a specific timeout. \n",
|
||||||
|
" # If no min node count is provided it will use the scale settings for the cluster\n",
|
||||||
|
" head_compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
|
||||||
|
" \n",
|
||||||
|
" # For a more detailed view of current AmlCompute status, use get_status()\n",
|
||||||
|
" print(head_compute_target.get_status().serialize())"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Create worker compute cluster\n",
|
||||||
|
"\n",
|
||||||
|
"Now we create a compute cluster with CPUs for the additional Ray worker nodes. The CPUs in these worker nodes are used by Ray worker processes. Each Ray worker node may run multiple Ray worker processes, depending on the number of CPUs on the node, and Ray can distribute multiple worker tasks across each worker node."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Choose a name for your Ray worker cluster\n",
|
||||||
|
"worker_compute_name = 'worker-cpu'\n",
|
||||||
|
"worker_compute_min_nodes = 0 \n",
|
||||||
|
"worker_compute_max_nodes = 4\n",
|
||||||
|
"\n",
|
||||||
|
"# This example uses CPU VM. For using GPU VM, set SKU to STANDARD_NC6\n",
|
||||||
|
"worker_vm_size = 'STANDARD_D2_V2'\n",
|
||||||
|
"\n",
|
||||||
|
"# Create the compute target if it hasn't been created already\n",
|
||||||
|
"if worker_compute_name in ws.compute_targets:\n",
|
||||||
|
" worker_compute_target = ws.compute_targets[worker_compute_name]\n",
|
||||||
|
" if worker_compute_target and type(worker_compute_target) is AmlCompute:\n",
|
||||||
|
" if worker_compute_target.provisioning_state == 'Succeeded':\n",
|
||||||
|
" print('found worker compute target. just use it', worker_compute_name)\n",
|
||||||
|
" else: \n",
|
||||||
|
" raise Exception('found worker compute target but it is in state', head_compute_target.provisioning_state)\n",
|
||||||
|
"else:\n",
|
||||||
|
" print('creating a new worker compute target...')\n",
|
||||||
|
" provisioning_config = AmlCompute.provisioning_configuration(vm_size=worker_vm_size,\n",
|
||||||
|
" min_nodes=worker_compute_min_nodes, \n",
|
||||||
|
" max_nodes=worker_compute_max_nodes,\n",
|
||||||
|
" vnet_resourcegroup_name=ws.resource_group,\n",
|
||||||
|
" vnet_name=vnet_name,\n",
|
||||||
|
" subnet_name='default')\n",
|
||||||
|
"\n",
|
||||||
|
" # Create the cluster\n",
|
||||||
|
" worker_compute_target = ComputeTarget.create(ws, worker_compute_name, provisioning_config)\n",
|
||||||
|
" \n",
|
||||||
|
" # Can poll for a minimum number of nodes and for a specific timeout. \n",
|
||||||
|
" # If no min node count is provided it will use the scale settings for the cluster\n",
|
||||||
|
" worker_compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
|
||||||
|
" \n",
|
||||||
|
" # For a more detailed view of current AmlCompute status, use get_status()\n",
|
||||||
|
" print(worker_compute_target.get_status().serialize())"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Train Pong Agent Using Azure ML RL\n",
|
||||||
|
"To facilitate reinforcement learning, the Azure Machine Learning Python SDK provides a high-level abstraction, the _ReinforcementLearningEstimator_ class, which allows users to easily construct RL run configurations for the underlying RL framework. Azure ML RL initially supports the [Ray framework](https://ray.io/) and its highly customizable [RLlib](https://ray.readthedocs.io/en/latest/rllib.html#rllib-scalable-reinforcement-learning). In this section we show how to use _ReinforcementLearningEstimator_ and the Ray/RLlib framework to train a Pong-playing agent.\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"### Define worker configuration\n",
|
||||||
|
"Define a `WorkerConfiguration` using your worker compute target. We also specify the number of nodes in the worker compute target to be used for training and additional PIP packages to install on those nodes as a part of setup.\n",
|
||||||
|
"In this case, we define the PIP packages as dependencies for both head and worker nodes. With this setup, the game simulations will run directly on the worker compute nodes."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.contrib.train.rl import WorkerConfiguration\n",
|
||||||
|
"\n",
|
||||||
|
"# Pip packages we will use for both head and worker\n",
|
||||||
|
"pip_packages=[\"ray[rllib]==0.8.3\"] # Latest version of Ray has fixes for issues related to object transfers\n",
|
||||||
|
"\n",
|
||||||
|
"# Specify the Ray worker configuration\n",
|
||||||
|
"worker_conf = WorkerConfiguration(\n",
|
||||||
|
" \n",
|
||||||
|
" # Azure ML compute cluster to run Ray workers\n",
|
||||||
|
" compute_target=worker_compute_target, \n",
|
||||||
|
" \n",
|
||||||
|
" # Number of worker nodes\n",
|
||||||
|
" node_count=4,\n",
|
||||||
|
" \n",
|
||||||
|
" # GPU\n",
|
||||||
|
" use_gpu=False, \n",
|
||||||
|
" \n",
|
||||||
|
" # PIP packages to use\n",
|
||||||
|
" pip_packages=pip_packages\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Create reinforcement learning estimator\n",
|
||||||
|
"\n",
|
||||||
|
"The `ReinforcementLearningEstimator` is used to submit a job to Azure Machine Learning to start the Ray experiment run. We define here the training script parameters that will be passed to the estimator.\n",
|
||||||
|
"\n",
|
||||||
|
"We set `episode_reward_mean` to 18 because we want to stop the training as soon as the trained agent reaches an average win margin of at least 18 points over the opponent across all episodes in the training epoch.\n",
|
||||||
|
"The number of Ray worker processes is defined by the `num_workers` parameter. We set it to 13 because we have 13 CPUs available in our compute targets. Multiple Ray worker processes parallelize agent training and help us reach our goal faster.\n",
|
||||||
|
"\n",
|
||||||
|
"```\n",
|
||||||
|
"Number of CPUs in head_compute_target = 6 CPUs in 1 node = 6\n",
|
||||||
|
"Number of CPUs in worker_compute_target = 2 CPUs in each of 4 nodes = 8\n",
|
||||||
|
"Number of CPUs available = (Number of CPUs in head_compute_target) + (Number of CPUs in worker_compute_target) - (1 CPU for head node) = 6 + 8 - 1 = 13\n",
|
||||||
|
"```"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.contrib.train.rl import ReinforcementLearningEstimator, Ray\n",
|
||||||
|
"\n",
|
||||||
|
"training_algorithm = \"IMPALA\"\n",
|
||||||
|
"rl_environment = \"PongNoFrameskip-v4\"\n",
|
||||||
|
"\n",
|
||||||
|
"# Training script parameters\n",
|
||||||
|
"script_params = {\n",
|
||||||
|
" \n",
|
||||||
|
" # Training algorithm, IMPALA in this case\n",
|
||||||
|
" \"--run\": training_algorithm,\n",
|
||||||
|
" \n",
|
||||||
|
" # Environment, Pong in this case\n",
|
||||||
|
" \"--env\": rl_environment,\n",
|
||||||
|
" \n",
|
||||||
|
"    # Add additional single quotes at both ends of string values as we have spaces in the \n",
|
||||||
|
" # string parameters, outermost quotes are not passed to scripts as they are not actually part of string\n",
|
||||||
|
" # Number of GPUs\n",
|
||||||
|
" # Number of ray workers\n",
|
||||||
|
" \"--config\": '\\'{\"num_gpus\": 1, \"num_workers\": 13}\\'',\n",
|
||||||
|
" \n",
|
||||||
|
" # Target episode reward mean to stop the training\n",
|
||||||
|
" # Total training time in seconds\n",
|
||||||
|
" \"--stop\": '\\'{\"episode_reward_mean\": 18, \"time_total_s\": 3600}\\'',\n",
|
||||||
|
"}\n",
|
||||||
|
"\n",
|
||||||
|
"# RL estimator\n",
|
||||||
|
"rl_estimator = ReinforcementLearningEstimator(\n",
|
||||||
|
" \n",
|
||||||
|
" # Location of source files\n",
|
||||||
|
" source_directory='files',\n",
|
||||||
|
" \n",
|
||||||
|
" # Python script file\n",
|
||||||
|
" entry_script=\"pong_rllib.py\",\n",
|
||||||
|
" \n",
|
||||||
|
" # Parameters to pass to the script file\n",
|
||||||
|
" # Defined above.\n",
|
||||||
|
" script_params=script_params,\n",
|
||||||
|
" \n",
|
||||||
|
" # The Azure ML compute target set up for Ray head nodes\n",
|
||||||
|
" compute_target=head_compute_target,\n",
|
||||||
|
" \n",
|
||||||
|
" # Pip packages\n",
|
||||||
|
" pip_packages=pip_packages,\n",
|
||||||
|
" \n",
|
||||||
|
" # GPU usage\n",
|
||||||
|
" use_gpu=True,\n",
|
||||||
|
" \n",
|
||||||
|
" # RL framework. Currently must be Ray.\n",
|
||||||
|
" rl_framework=Ray(),\n",
|
||||||
|
" \n",
|
||||||
|
" # Ray worker configuration defined above.\n",
|
||||||
|
" worker_configuration=worker_conf,\n",
|
||||||
|
" \n",
|
||||||
|
" # How long to wait for whole cluster to start\n",
|
||||||
|
" cluster_coordination_timeout_seconds=3600,\n",
|
||||||
|
" \n",
|
||||||
|
" # Maximum time for the whole Ray job to run\n",
|
||||||
|
" # This will cut off the run after an hour\n",
|
||||||
|
" max_run_duration_seconds=3600,\n",
|
||||||
|
" \n",
|
||||||
|
" # Allow the docker container Ray runs in to make full use\n",
|
||||||
|
" # of the shared memory available from the host OS.\n",
|
||||||
|
" shm_size=24*1024*1024*1024\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Training script\n",
|
||||||
|
"As recommended in the [RLlib](https://ray.readthedocs.io/en/latest/rllib.html) documentation, we use the Ray [Tune](https://ray.readthedocs.io/en/latest/tune.html) API to run the training algorithm. All the RLlib built-in trainers are compatible with the Tune API. Here we use tune.run() to execute a built-in training algorithm. For convenience, below you can see the part of the entry script where we make this call.\n",
|
||||||
|
"\n",
|
||||||
|
"```python\n",
|
||||||
|
" tune.run(run_or_experiment=args.run,\n",
|
||||||
|
" config={\n",
|
||||||
|
" \"env\": args.env,\n",
|
||||||
|
" \"num_gpus\": args.config[\"num_gpus\"],\n",
|
||||||
|
" \"num_workers\": args.config[\"num_workers\"],\n",
|
||||||
|
" \"callbacks\": {\"on_train_result\": callbacks.on_train_result},\n",
|
||||||
|
" \"sample_batch_size\": 50,\n",
|
||||||
|
" \"train_batch_size\": 1000,\n",
|
||||||
|
" \"num_sgd_iter\": 2,\n",
|
||||||
|
" \"num_data_loader_buffers\": 2,\n",
|
||||||
|
" \"model\": {\"dim\": 42},\n",
|
||||||
|
" },\n",
|
||||||
|
" stop=args.stop,\n",
|
||||||
|
" local_dir='./logs')\n",
|
||||||
|
"```"
|
||||||
|
]
|
||||||
|
},
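The `--config` and `--stop` values above reach the training script as JSON strings, which is why the extra single quotes are added around them. Below is a minimal, hypothetical sketch (not the actual entry script) of how such JSON-valued arguments could be parsed with `argparse` and `json.loads`; the argument names mirror the ones used above.

```python
# Sketch only: parsing JSON-valued script parameters such as --config and --stop.
import argparse
import json

parser = argparse.ArgumentParser()
parser.add_argument("--run", type=str)            # training algorithm, e.g. IMPALA
parser.add_argument("--env", type=str)            # environment id, e.g. PongNoFrameskip-v4
parser.add_argument("--config", type=json.loads)  # JSON string -> dict
parser.add_argument("--stop", type=json.loads)    # JSON string -> dict

# Example invocation with the values used in this notebook.
args = parser.parse_args([
    "--run", "IMPALA",
    "--env", "PongNoFrameskip-v4",
    "--config", '{"num_gpus": 1, "num_workers": 13}',
    "--stop", '{"episode_reward_mean": 18, "time_total_s": 3600}',
])
print(args.config["num_workers"], args.stop["episode_reward_mean"])
```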
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Submit the estimator to start a run\n",
|
||||||
|
"Now we use the rl_estimator configured above to submit a run."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"run = exp.submit(config=rl_estimator)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Monitor the run\n",
|
||||||
|
"\n",
|
||||||
|
"Azure ML provides a Jupyter widget to show the real-time status of an experiment run. You could use this widget to monitor the status of runs. The widget shows the list of two child runs, one for head compute target run and one for worker compute target run, as well. You can click on the link under Status to see the details of the child run."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.widgets import RunDetails\n",
|
||||||
|
"\n",
|
||||||
|
"RunDetails(run).show()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Wait for the run to complete before proceeding. If you want to stop the run, you may skip this and move to next section below. \n",
|
||||||
|
"\n",
|
||||||
|
"**Note: the run may take anywhere from 30 minutes to 45 minutes to complete.**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"run.wait_for_completion()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Stop the run\n",
|
||||||
|
"\n",
|
||||||
|
"To cancel the run, call run.cancel()."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# run.cancel()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Performance of the agent during training\n",
|
||||||
|
"\n",
|
||||||
|
"Let's get the reward metrics for the training run agent and observe how the agent's rewards improved over the training iterations and how the agent learns to win the Pong game. \n",
|
||||||
|
"\n",
|
||||||
|
"Collect the episode reward metrics from the worker run's metrics. "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Get all child runs\n",
|
||||||
|
"child_runs = list(run.get_children(_rehydrate_runs=False))\n",
|
||||||
|
"\n",
|
||||||
|
"# Get the reward metrics from worker run\n",
|
||||||
|
"if child_runs[0].id.endswith(\"_worker\"):\n",
|
||||||
|
" episode_reward_mean = child_runs[0].get_metrics(name='episode_reward_mean')\n",
|
||||||
|
"else:\n",
|
||||||
|
" episode_reward_mean = child_runs[1].get_metrics(name='episode_reward_mean')"
|
||||||
|
]
|
||||||
|
},
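If the ordering of the child runs is not guaranteed, the worker run can equivalently be selected by its id suffix. A small sketch, assuming (as in this notebook) exactly one child run whose id ends with `_worker`:

```python
# Select the worker child run by id suffix and read its reward metrics
# (sketch; assumes exactly one child run whose id ends with "_worker").
worker_run = next(r for r in child_runs if r.id.endswith("_worker"))
episode_reward_mean = worker_run.get_metrics(name='episode_reward_mean')
```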
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Plot the reward metrics. "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import matplotlib.pyplot as plt\n",
|
||||||
|
"\n",
|
||||||
|
"plt.plot(episode_reward_mean['episode_reward_mean'])\n",
|
||||||
|
"plt.xlabel('training_iteration')\n",
|
||||||
|
"plt.ylabel('episode_reward_mean')\n",
|
||||||
|
"plt.show()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"We observe that during the training over multiple episodes, the agent learn to win the Pong game against opponent with our target of 18 points in each episode of 21 points.\n",
|
||||||
|
"**Congratulations!! You have trained your Pong agent to win a game marvelously.**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Cleaning up\n",
|
||||||
|
"For your convenience, below you can find code snippets to clean up any resources created as part of this tutorial that you don't wish to retain."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# To archive the created experiment:\n",
|
||||||
|
"#experiment.archive()\n",
|
||||||
|
"\n",
|
||||||
|
"# To delete the compute targets:\n",
|
||||||
|
"#head_compute_target.delete()\n",
|
||||||
|
"#worker_compute_target.delete()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Next\n",
|
||||||
|
"In this example, you learnt how to solve distributed RL training problems using head and worker compute targets. This is currently the last introductory tutorial for Azure Machine Learning service's Reinforcement Learning offering. We would love to hear your feedback to build the features you need!"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"authors": [
|
||||||
|
{
|
||||||
|
"name": "vineetg"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3.6",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python36"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.7.4"
|
||||||
|
},
|
||||||
|
"notice": "Copyright (c) Microsoft Corporation. All rights reserved.\u00e2\u20ac\u00afLicensed under the MIT License.\u00e2\u20ac\u00af "
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 4
|
||||||
|
}
|
||||||
@@ -0,0 +1,7 @@
|
|||||||
|
name: pong_rllib
|
||||||
|
dependencies:
|
||||||
|
- pip:
|
||||||
|
- azureml-sdk
|
||||||
|
- azureml-contrib-reinforcementlearning
|
||||||
|
- azureml-widgets
|
||||||
|
- matplotlib
|
||||||
@@ -0,0 +1,700 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
||||||
|
"\n",
|
||||||
|
"Licensed under the MIT License."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Azure ML Reinforcement Learning Sample - Cartpole Problem on Compute Instance\n",
|
||||||
|
"\n",
|
||||||
|
"Azure ML Reinforcement Learning (Azure ML RL) is a managed service for running reinforcement learning training and simulation. With Azure MLRL, data scientists can start developing RL systems on one machine, and scale to compute clusters with 100\u00e2\u20ac\u2122s of nodes if needed.\n",
|
||||||
|
"\n",
|
||||||
|
"This example shows how to use Azure ML RL to train a Cartpole playing agent on a compute instance."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Cartpole problem\n",
|
||||||
|
"\n",
|
||||||
|
"Cartpole, also known as [Inverted Pendulum](https://en.wikipedia.org/wiki/Inverted_pendulum), is a pendulum with a center of mass above its pivot point. This formation is essentially unstable and will easily fall over but can be kept balanced by applying appropriate horizontal forces to the pivot point.\n",
|
||||||
|
"\n",
|
||||||
|
"<table style=\"width:50%\">\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th>\n",
|
||||||
|
" <img src=\"./images/cartpole.png\" alt=\"Cartpole image\" /> \n",
|
||||||
|
" </th>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
" <tr>\n",
|
||||||
|
" <th><p>Fig 1. Cartpole problem schematic description (from <a href=\"https://towardsdatascience.com/cartpole-introduction-to-reinforcement-learning-ed0eb5b58288\">towardsdatascience.com</a>).</p></th>\n",
|
||||||
|
" </tr>\n",
|
||||||
|
"</table>\n",
|
||||||
|
"\n",
|
||||||
|
"The goal here is to train an agent to keep the cartpole balanced by applying appropriate forces to the pivot point.\n",
|
||||||
|
"\n",
|
||||||
|
"See [this video](https://www.youtube.com/watch?v=XiigTGKZfks) for a real-world demonstration of cartpole problem."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Prerequisite\n",
|
||||||
|
"The user should have completed the Azure Machine Learning Tutorial: [Get started creating your first ML experiment with the Python SDK](https://docs.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-sdk-setup). You will need to make sure that you have a valid subscription id, a resource group and a workspace. All datastores and datasets you use should be associated with your workspace."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Set up Development Environment\n",
|
||||||
|
"The following subsections show typical steps to setup your development environment. Setup includes:\n",
|
||||||
|
"\n",
|
||||||
|
"* Connecting to a workspace to enable communication between your local machine and remote resources\n",
|
||||||
|
"* Creating an experiment to track all your runs\n",
|
||||||
|
"* Using a Compute Instance as compute target"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Azure ML SDK \n",
|
||||||
|
"Display the Azure ML SDK version."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import azureml.core\n",
|
||||||
|
"print(\"Azure ML SDK Version: \", azureml.core.VERSION)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Get Azure ML workspace\n",
|
||||||
|
"Get a reference to an existing Azure ML workspace."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core import Workspace\n",
|
||||||
|
"\n",
|
||||||
|
"ws = Workspace.from_config()\n",
|
||||||
|
"print(ws.name, ws.location, ws.resource_group, sep = ' | ')"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Use Compute Instance as compute target\n",
|
||||||
|
"\n",
|
||||||
|
"A compute target is a designated compute resource where you run your training and simulation scripts. This location may be your local machine or a cloud-based compute resource. For more information see [What are compute targets in Azure Machine Learning?](https://docs.microsoft.com/en-us/azure/machine-learning/concept-compute-target)\n",
|
||||||
|
"\n",
|
||||||
|
"The code below shows how to use current compute instance as a compute target. First some helper functions:"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import os.path\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"# Get information about the currently running compute instance (notebook VM), like its name and prefix.\n",
|
||||||
|
"def load_nbvm():\n",
|
||||||
|
" if not os.path.isfile(\"/mnt/azmnt/.nbvm\"):\n",
|
||||||
|
" return None\n",
|
||||||
|
" with open(\"/mnt/azmnt/.nbvm\", 'r') as file:\n",
|
||||||
|
" return {key:value for (key, value) in [line.strip().split('=') for line in file]}\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"# Get information about the capabilities of an azureml.core.compute.AmlCompute target\n",
|
||||||
|
"# In particular how much RAM + GPU + HDD it has.\n",
|
||||||
|
"def get_compute_size(self, workspace):\n",
|
||||||
|
" for size in self.supported_vmsizes(workspace):\n",
|
||||||
|
" if(size['name'].upper() == self.vm_size):\n",
|
||||||
|
" return size\n",
|
||||||
|
"\n",
|
||||||
|
"azureml.core.compute.ComputeTarget.size = get_compute_size\n",
|
||||||
|
"del(get_compute_size)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Then we use these helper functions to get a handle to current compute instance."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Load current compute instance info\n",
|
||||||
|
"current_compute_instance = load_nbvm()\n",
|
||||||
|
"print(\"Current compute instance:\", current_compute_instance)\n",
|
||||||
|
"\n",
|
||||||
|
"# For this demo, let's use the current compute instance as the compute target, if available\n",
|
||||||
|
"if current_compute_instance:\n",
|
||||||
|
" instance_name = current_compute_instance['instance']\n",
|
||||||
|
"else:\n",
|
||||||
|
" instance_name = next(iter(ws.compute_targets))\n",
|
||||||
|
"\n",
|
||||||
|
"compute_target = ws.compute_targets[instance_name]\n",
|
||||||
|
"\n",
|
||||||
|
"print(\"Compute target status:\")\n",
|
||||||
|
"print(compute_target.get_status().serialize())\n",
|
||||||
|
"\n",
|
||||||
|
"print(\"Compute target size:\")\n",
|
||||||
|
"print(compute_target.size(ws))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Create Azure ML experiment\n",
|
||||||
|
"Create an experiment to track the runs in your workspace. "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core.experiment import Experiment\n",
|
||||||
|
"\n",
|
||||||
|
"experiment_name = 'CartPole-v0-CI'\n",
|
||||||
|
"exp = Experiment(workspace=ws, name=experiment_name)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Train Cartpole Agent Using Azure ML RL\n",
|
||||||
|
"To facilitate reinforcement learning, Azure Machine Learning Python SDK provides a high level abstraction, the _ReinforcementLearningEstimator_ class, which allows users to easily construct RL run configurations for the underlying RL framework. Azure ML RL initially supports the [Ray framework](https://ray.io/) and its highly customizable [RLlib](https://ray.readthedocs.io/en/latest/rllib.html#rllib-scalable-reinforcement-learning). In this section we show how to use _ReinforcementLearningEstimator_ and Ray/RLlib framework to train a cartpole playing agent. "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Create reinforcement learning estimator\n",
|
||||||
|
"\n",
|
||||||
|
"The code below creates an instance of *ReinforcementLearningEstimator*, `training_estimator`, which then will be used to submit a job to Azure Machine Learning to start the Ray experiment run.\n",
|
||||||
|
"\n",
|
||||||
|
"Note that this example is purposely simplified to the minimum. Here is a short description of the parameters we are passing into the constructor:\n",
|
||||||
|
"\n",
|
||||||
|
"- `source_directory`, local directory containing your training script(s) and helper modules,\n",
|
||||||
|
"- `entry_script`, path to your entry script relative to the source directory,\n",
|
||||||
|
"- `script_params`, constant parameters to be passed to each run of training script,\n",
|
||||||
|
"- `compute_target`, reference to the compute target in which the trainer and worker(s) jobs will be executed,\n",
|
||||||
|
"- `rl_framework`, the RL framework to be used (currently must be Ray).\n",
|
||||||
|
"\n",
|
||||||
|
"We use the `script_params` parameter to pass in general and algorithm-specific parameters to the training script.\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.contrib.train.rl import ReinforcementLearningEstimator, Ray\n",
|
||||||
|
"\n",
|
||||||
|
"training_algorithm = \"PPO\"\n",
|
||||||
|
"rl_environment = \"CartPole-v0\"\n",
|
||||||
|
"\n",
|
||||||
|
"script_params = {\n",
|
||||||
|
"\n",
|
||||||
|
" # Training algorithm\n",
|
||||||
|
" \"--run\": training_algorithm,\n",
|
||||||
|
" \n",
|
||||||
|
" # Training environment\n",
|
||||||
|
" \"--env\": rl_environment,\n",
|
||||||
|
" \n",
|
||||||
|
" # Algorithm-specific parameters\n",
|
||||||
|
" \"--config\": '\\'{\"num_gpus\": 0, \"num_workers\": 1}\\'',\n",
|
||||||
|
" \n",
|
||||||
|
" # Stop conditions\n",
|
||||||
|
" \"--stop\": '\\'{\"episode_reward_mean\": 200, \"time_total_s\": 300}\\'',\n",
|
||||||
|
" \n",
|
||||||
|
" # Frequency of taking checkpoints\n",
|
||||||
|
" \"--checkpoint-freq\": 2,\n",
|
||||||
|
" \n",
|
||||||
|
" # If a checkpoint should be taken at the end - optional argument with no value\n",
|
||||||
|
" \"--checkpoint-at-end\": \"\",\n",
|
||||||
|
" \n",
|
||||||
|
" # Log directory\n",
|
||||||
|
" \"--local-dir\": './logs'\n",
|
||||||
|
"}\n",
|
||||||
|
"\n",
|
||||||
|
"training_estimator = ReinforcementLearningEstimator(\n",
|
||||||
|
"\n",
|
||||||
|
" # Location of source files\n",
|
||||||
|
" source_directory='files',\n",
|
||||||
|
" \n",
|
||||||
|
" # Python script file\n",
|
||||||
|
" entry_script='cartpole_training.py',\n",
|
||||||
|
" \n",
|
||||||
|
" # A dictionary of arguments to pass to the training script specified in ``entry_script``\n",
|
||||||
|
" script_params=script_params,\n",
|
||||||
|
" \n",
|
||||||
|
" # The Azure ML compute target set up for Ray head nodes\n",
|
||||||
|
" compute_target=compute_target,\n",
|
||||||
|
" \n",
|
||||||
|
" # RL framework. Currently must be Ray.\n",
|
||||||
|
" rl_framework=Ray()\n",
|
||||||
|
")"
|
||||||
|
]
|
||||||
|
},
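For reference, the estimator translates the `script_params` dictionary above into command-line arguments for `cartpole_training.py`. Conceptually the invocation looks roughly like the argument list below; this is illustrative only (the exact quoting is handled by the SDK, and the outermost single quotes are stripped before the values reach the script).

```python
# Rough shape of the arguments received by cartpole_training.py (illustrative only).
argv = [
    "--run", "PPO",
    "--env", "CartPole-v0",
    "--config", '{"num_gpus": 0, "num_workers": 1}',
    "--stop", '{"episode_reward_mean": 200, "time_total_s": 300}',
    "--checkpoint-freq", "2",
    "--checkpoint-at-end",          # flag-style parameter (empty value above)
    "--local-dir", "./logs",
]
```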
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Training script\n",
|
||||||
|
"\n",
|
||||||
|
"As recommended in RLlib documentations, we use Ray Tune API to run the training algorithm. All the RLlib built-in trainers are compatible with the Tune API. Here we use `tune.run()` to execute a built-in training algorithm. For convenience, down below you can see part of the entry script where we make this call.\n",
|
||||||
|
"\n",
|
||||||
|
"This is the list of parameters we are passing into `tune.run()` via the `script_params` parameter:\n",
|
||||||
|
"\n",
|
||||||
|
"- `run_or_experiment`: name of the [built-in algorithm](https://ray.readthedocs.io/en/latest/rllib-algorithms.html#rllib-algorithms), 'PPO' in our example,\n",
|
||||||
|
"- `config`: Algorithm-specific configuration. This includes specifying the environment, `env`, which in our example is the gym **[CartPole-v0](https://gym.openai.com/envs/CartPole-v0/)** environment,\n",
|
||||||
|
"- `stop`: stopping conditions, which could be any of the metrics returned by the trainer. Here we use \"mean of episode reward\", and \"total training time in seconds\" as stop conditions, and\n",
|
||||||
|
"- `checkpoint_freq` and `checkpoint_at_end`: Frequency of taking checkpoints (number of training iterations between checkpoints), and if a checkpoint should be taken at the end.\n",
|
||||||
|
"\n",
|
||||||
|
"We also specify the `local_dir`, the directory in which the training logs, checkpoints and other training artificats will be recorded. \n",
|
||||||
|
"\n",
|
||||||
|
"See [RLlib Training APIs](https://ray.readthedocs.io/en/latest/rllib-training.html#rllib-training-apis) for more details, and also [Training (tune.run, tune.Experiment)](https://ray.readthedocs.io/en/latest/tune/api_docs/execution.html#training-tune-run-tune-experiment) for the complete list of parameters.\n",
|
||||||
|
"\n",
|
||||||
|
"```python\n",
|
||||||
|
"import ray\n",
|
||||||
|
"import ray.tune as tune\n",
|
||||||
|
"\n",
|
||||||
|
"if __name__ == \"__main__\":\n",
|
||||||
|
"\n",
|
||||||
|
" # parse arguments ...\n",
|
||||||
|
" \n",
|
||||||
|
" # Intitialize ray\n",
|
||||||
|
" ay.init(address=args.ray_address)\n",
|
||||||
|
"\n",
|
||||||
|
" # Run training task using tune.run\n",
|
||||||
|
" tune.run(\n",
|
||||||
|
" run_or_experiment=args.run,\n",
|
||||||
|
" config=dict(args.config, env=args.env),\n",
|
||||||
|
" stop=args.stop,\n",
|
||||||
|
" checkpoint_freq=args.checkpoint_freq,\n",
|
||||||
|
" checkpoint_at_end=args.checkpoint_at_end,\n",
|
||||||
|
" local_dir=args.local_dir\n",
|
||||||
|
" )\n",
|
||||||
|
"```"
|
||||||
|
]
|
||||||
|
},
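The checkpoints and logs referenced later in this notebook are written by Tune under `local_dir`. The sketch below is approximate (trial directory names vary), but it shows why later cells look for files named `checkpoint-<n>` and skip the `.tune_metadata` companions.

```python
# Approximate layout of Tune output under local_dir (directory names are illustrative):
#
#   ./logs/
#     PPO/
#       PPO_CartPole-v0_<trial-id>/
#         checkpoint_2/
#           checkpoint-2                 # checkpoint file evaluated during rollout
#           checkpoint-2.tune_metadata   # companion metadata, filtered out later
#         checkpoint_4/
#           ...
#         progress.csv
#         result.json
```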
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Submit the estimator to start experiment\n",
|
||||||
|
"Now we use the *training_estimator* to submit a run."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"training_run = exp.submit(training_estimator)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Monitor experiment\n",
|
||||||
|
"Azure ML provides a Jupyter widget to show the real-time status of an experiment run. You could use this widget to monitor status of the runs.\n",
|
||||||
|
"\n",
|
||||||
|
"Note that _ReinforcementLearningEstimator_ creates at least two runs: (a) A parent run, i.e. the run returned above, and (b) a collection of child runs. The number of the child runs depends on the configuration of the reinforcement learning estimator. In our simple scenario, configured above, only one child run will be created.\n",
|
||||||
|
"\n",
|
||||||
|
"The widget will show a list of the child runs as well. You can click on the link under **Status** to see the details of a child run."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.widgets import RunDetails\n",
|
||||||
|
"\n",
|
||||||
|
"RunDetails(training_run).show()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Stop the run\n",
|
||||||
|
"\n",
|
||||||
|
"To cancel the run, call `training_run.cancel()`."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Uncomment line below to cancel the run\n",
|
||||||
|
"# training_run.cancel()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Wait for completion\n",
|
||||||
|
"Wait for the run to complete before proceeding.\n",
|
||||||
|
"\n",
|
||||||
|
"**Note: The run may take a few minutes to complete.**"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"training_run.wait_for_completion()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Get a handle to the child run\n",
|
||||||
|
"You can obtain a handle to the child run as follows. In our scenario, there is only one child run, we have it called `child_run_0`."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import time\n",
|
||||||
|
"\n",
|
||||||
|
"child_run_0 = None\n",
|
||||||
|
"timeout = 30\n",
|
||||||
|
"while timeout > 0 and not child_run_0:\n",
|
||||||
|
" child_runs = list(training_run.get_children())\n",
|
||||||
|
" print('Number of child runs:', len(child_runs))\n",
|
||||||
|
" if len(child_runs) > 0:\n",
|
||||||
|
" child_run_0 = child_runs[0]\n",
|
||||||
|
" break\n",
|
||||||
|
" time.sleep(2) # Wait for 2 seconds\n",
|
||||||
|
" timeout -= 2\n",
|
||||||
|
"\n",
|
||||||
|
"print('Child run info:')\n",
|
||||||
|
"print(child_run_0)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Evaluate Trained Agent and See Results\n",
|
||||||
|
"\n",
|
||||||
|
"We can evaluate a previously trained policy using the `rollout.py` helper script provided by RLlib (see [Evaluating Trained Policies](https://ray.readthedocs.io/en/latest/rllib-training.html#evaluating-trained-policies) for more details). Here we use an adaptation of this script to reconstruct a policy from a checkpoint taken and saved during training. We took these checkpoints by setting `checkpoint-freq` and `checkpoint-at-end` parameters above.\n",
|
||||||
|
"\n",
|
||||||
|
"In this section we show how to get access to these checkpoints data, and then how to use them to evaluate the trained policy."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Create a dataset of training artifacts\n",
|
||||||
|
"To evaluate a trained policy (a checkpoint) we need to make the checkpoint accessible to the rollout script. All the training artifacts are stored in workspace default datastore under **azureml/<run_id>** directory.\n",
|
||||||
|
"\n",
|
||||||
|
"Here we create a file dataset from the stored artifacts, and then use this dataset to feed these data to rollout estimator."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core import Dataset\n",
|
||||||
|
"\n",
|
||||||
|
"run_id = child_run_0.id # Or set to run id of a completed run (e.g. 'rl-cartpole-v0_1587572312_06e04ace_head')\n",
|
||||||
|
"run_artifacts_path = os.path.join('azureml', run_id)\n",
|
||||||
|
"print(\"Run artifacts path:\", run_artifacts_path)\n",
|
||||||
|
"\n",
|
||||||
|
"# Create a file dataset object from the files stored on default datastore\n",
|
||||||
|
"datastore = ws.get_default_datastore()\n",
|
||||||
|
"training_artifacts_ds = Dataset.File.from_files(datastore.path(os.path.join(run_artifacts_path, '**')))"
|
||||||
|
]
|
||||||
|
},
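If you prefer to inspect the artifacts locally rather than consuming them through a mount on the remote run, the file dataset created above can also be downloaded. A small optional sketch; the `./artifacts` target path is an arbitrary choice:

```python
# Optional: download the training artifacts locally for manual inspection.
# The './artifacts' target path is an arbitrary choice for this sketch.
downloaded_files = training_artifacts_ds.download(target_path='./artifacts', overwrite=True)
print("Downloaded", len(downloaded_files), "files to ./artifacts")
```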
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"To verify, we can print out the number (and paths) of all the files in the dataset, as follows."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"artifacts_paths = training_artifacts_ds.to_path()\n",
|
||||||
|
"print(\"Number of files in dataset:\", len(artifacts_paths))\n",
|
||||||
|
"\n",
|
||||||
|
"# Uncomment line below to print all file paths\n",
|
||||||
|
"#print(\"Artifacts dataset file paths: \", artifacts_paths)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Evaluate a trained policy\n",
|
||||||
|
"We need to configure another reinforcement learning estimator, `rollout_estimator`, and then use it to submit another run. Note that the entry script for this estimator now points to `cartpole-rollout.py` script.\n",
|
||||||
|
"Also note how we pass the checkpoints dataset to this script using `inputs` parameter of the _ReinforcementLearningEstimator_.\n",
|
||||||
|
"\n",
|
||||||
|
"We are using script parameters to pass in the same algorithm and the same environment used during training. We also specify the checkpoint number of the checkpoint we wish to evaluate, `checkpoint-number`, and number of the steps we shall run the rollout, `steps`.\n",
|
||||||
|
"\n",
|
||||||
|
"The checkpoints dataset will be accessible to the rollout script as a mounted folder. The mounted folder and the checkpoint number, passed in via `checkpoint-number`, will be used to create a path to the checkpoint we are going to evaluate. The created checkpoint path then will be passed into RLlib rollout script for evaluation.\n",
|
||||||
|
"\n",
|
||||||
|
"Let's find the checkpoints and the last checkpoint number first."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Find checkpoints and last checkpoint number\n",
|
||||||
|
"from os import path\n",
|
||||||
|
"checkpoint_files = [\n",
|
||||||
|
" os.path.basename(file) for file in training_artifacts_ds.to_path() \\\n",
|
||||||
|
" if os.path.basename(file).startswith('checkpoint-') and \\\n",
|
||||||
|
" not os.path.basename(file).endswith('tune_metadata')\n",
|
||||||
|
"]\n",
|
||||||
|
"\n",
|
||||||
|
"checkpoint_numbers = []\n",
|
||||||
|
"for file in checkpoint_files:\n",
|
||||||
|
" checkpoint_numbers.append(int(file.split('-')[1]))\n",
|
||||||
|
"\n",
|
||||||
|
"print(\"Checkpoints:\", checkpoint_numbers)\n",
|
||||||
|
"\n",
|
||||||
|
"last_checkpoint_number = max(checkpoint_numbers)\n",
|
||||||
|
"print(\"Last checkpoint number:\", last_checkpoint_number)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Now let's configure rollout estimator. Note that we use the last checkpoint for evaluation. The assumption is that the last checkpoint points to our best trained agent. You may change this to any of the checkpoint numbers printed above and observe the effect."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"script_params = { \n",
|
||||||
|
" # Checkpoint number of the checkpoint from which to roll out\n",
|
||||||
|
" \"--checkpoint-number\": last_checkpoint_number,\n",
|
||||||
|
"\n",
|
||||||
|
" # Training algorithm\n",
|
||||||
|
" \"--run\": training_algorithm,\n",
|
||||||
|
" \n",
|
||||||
|
" # Training environment\n",
|
||||||
|
" \"--env\": rl_environment,\n",
|
||||||
|
" \n",
|
||||||
|
" # Algorithm-specific parameters\n",
|
||||||
|
" \"--config\": '{}',\n",
|
||||||
|
" \n",
|
||||||
|
" # Number of rollout steps \n",
|
||||||
|
" \"--steps\": 2000,\n",
|
||||||
|
" \n",
|
||||||
|
" # If should repress rendering of the environment\n",
|
||||||
|
" \"--no-render\": \"\"\n",
|
||||||
|
"}\n",
|
||||||
|
"\n",
|
||||||
|
"rollout_estimator = ReinforcementLearningEstimator(\n",
|
||||||
|
" # Location of source files\n",
|
||||||
|
" source_directory='files',\n",
|
||||||
|
" \n",
|
||||||
|
" # Python script file\n",
|
||||||
|
" entry_script='cartpole_rollout.py',\n",
|
||||||
|
" \n",
|
||||||
|
" # A dictionary of arguments to pass to the rollout script specified in ``entry_script``\n",
|
||||||
|
" script_params = script_params,\n",
|
||||||
|
" \n",
|
||||||
|
" # Data inputs\n",
|
||||||
|
" inputs=[\n",
|
||||||
|
" training_artifacts_ds.as_named_input('artifacts_dataset'),\n",
|
||||||
|
" training_artifacts_ds.as_named_input('artifacts_path').as_mount()],\n",
|
||||||
|
" \n",
|
||||||
|
" # The Azure ML compute target\n",
|
||||||
|
" compute_target=compute_target,\n",
|
||||||
|
" \n",
|
||||||
|
" # RL framework. Currently must be Ray.\n",
|
||||||
|
" rl_framework=Ray(),\n",
|
||||||
|
" \n",
|
||||||
|
" # Additional pip packages to install\n",
|
||||||
|
" pip_packages = ['azureml-dataprep[fuse,pandas]'])"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Same as before, we use the *rollout_estimator* to submit a run."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"rollout_run = exp.submit(rollout_estimator)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"And then, similar to the training section, we can monitor the real-time progress of the rollout run and its chid as follows. If you browse logs of the child run you can see the evaluation results recorded in driver_log.txt file. Note that you may need to wait several minutes before these results become available."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.widgets import RunDetails\n",
|
||||||
|
"\n",
|
||||||
|
"RunDetails(rollout_run).show()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Wait for completion of the rollout run, or you may cancel the run."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Uncomment line below to cancel the run\n",
|
||||||
|
"#rollout_run.cancel()\n",
|
||||||
|
"rollout_run.wait_for_completion()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Cleaning up\n",
|
||||||
|
"For your convenience, below you can find code snippets to clean up any resources created as part of this tutorial that you don't wish to retain."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# To archive the created experiment:\n",
|
||||||
|
"#exp.archive()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Next\n",
|
||||||
|
"This example was about running Azure ML RL (Ray/RLlib Framework) on compute instance. Please see [Cartpole problem](../cartpole-on-single-compute/cartpole_cc.ipynb)\n",
|
||||||
|
"example which uses Ray RLlib to train a Cartpole playing agent on a single node remote compute.\n"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"authors": [
|
||||||
|
{
|
||||||
|
"name": "adrosa"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"name": "hoazari"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3.6",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python36"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.6.9"
|
||||||
|
},
|
||||||
|
"notice": "Copyright (c) Microsoft Corporation. All rights reserved. Licensed under the MIT License."
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 4
|
||||||
|
}
|
||||||
@@ -0,0 +1,6 @@
|
|||||||
|
name: cartpole_ci
|
||||||
|
dependencies:
|
||||||
|
- pip:
|
||||||
|
- azureml-sdk
|
||||||
|
- azureml-contrib-reinforcementlearning
|
||||||
|
- azureml-widgets
|
||||||
@@ -0,0 +1,119 @@
|
|||||||
|
import argparse
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
|
||||||
|
import ray
|
||||||
|
from ray.rllib import rollout
|
||||||
|
from ray.tune.registry import get_trainable_cls
|
||||||
|
|
||||||
|
from azureml.core import Run
|
||||||
|
|
||||||
|
from utils import callbacks
|
||||||
|
|
||||||
|
|
||||||
|
DEFAULT_RAY_ADDRESS = 'localhost:6379'
|
||||||
|
|
||||||
|
|
||||||
|
def run_rollout(args, parser, ray_address):
|
||||||
|
|
||||||
|
config = args.config
|
||||||
|
if not args.env:
|
||||||
|
if not config.get("env"):
|
||||||
|
parser.error("the following arguments are required: --env")
|
||||||
|
args.env = config.get("env")
|
||||||
|
|
||||||
|
ray.init(address=ray_address)
|
||||||
|
|
||||||
|
# Create the Trainer from config.
|
||||||
|
cls = get_trainable_cls(args.run)
|
||||||
|
agent = cls(env=args.env, config=config)
|
||||||
|
|
||||||
|
# Load state from checkpoint.
|
||||||
|
agent.restore(args.checkpoint)
|
||||||
|
num_steps = int(args.steps)
|
||||||
|
num_episodes = int(args.episodes)
|
||||||
|
|
||||||
|
# Determine the video output directory.
|
||||||
|
use_arg_monitor = False
|
||||||
|
try:
|
||||||
|
args.video_dir
|
||||||
|
except AttributeError:
|
||||||
|
print("There is no such attribute: args.video_dir")
|
||||||
|
use_arg_monitor = True
|
||||||
|
|
||||||
|
video_dir = None
|
||||||
|
if not use_arg_monitor:
|
||||||
|
if args.monitor:
|
||||||
|
video_dir = os.path.join("./logs", "video")
|
||||||
|
elif args.video_dir:
|
||||||
|
video_dir = os.path.expanduser(args.video_dir)
|
||||||
|
|
||||||
|
# Do the actual rollout.
|
||||||
|
with rollout.RolloutSaver(
|
||||||
|
args.out,
|
||||||
|
args.use_shelve,
|
||||||
|
write_update_file=args.track_progress,
|
||||||
|
target_steps=num_steps,
|
||||||
|
target_episodes=num_episodes,
|
||||||
|
save_info=args.save_info) as saver:
|
||||||
|
if use_arg_monitor:
|
||||||
|
rollout.rollout(
|
||||||
|
agent,
|
||||||
|
args.env,
|
||||||
|
num_steps,
|
||||||
|
num_episodes,
|
||||||
|
saver,
|
||||||
|
args.no_render,
|
||||||
|
args.monitor)
|
||||||
|
else:
|
||||||
|
rollout.rollout(
|
||||||
|
agent, args.env,
|
||||||
|
num_steps,
|
||||||
|
num_episodes,
|
||||||
|
saver,
|
||||||
|
args.no_render, video_dir)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
|
||||||
|
# Add positional argument - serves as placeholder for checkpoint
|
||||||
|
argvc = sys.argv[1:]
|
||||||
|
argvc.insert(0, 'checkpoint-placeholder')
|
||||||
|
|
||||||
|
# Parse arguments
|
||||||
|
rollout_parser = rollout.create_parser()
|
||||||
|
|
||||||
|
rollout_parser.add_argument(
|
||||||
|
'--checkpoint-number', required=False, type=int, default=1,
|
||||||
|
help='Checkpoint number of the checkpoint from which to roll out')
|
||||||
|
|
||||||
|
rollout_parser.add_argument(
|
||||||
|
'--ray-address', required=False, default=DEFAULT_RAY_ADDRESS,
|
||||||
|
help='The address of the Ray cluster to connect to')
|
||||||
|
|
||||||
|
args = rollout_parser.parse_args(argvc)
|
||||||
|
|
||||||
|
# Get a handle to run
|
||||||
|
run = Run.get_context()
|
||||||
|
|
||||||
|
# Get handles to the training artifacts dataset and mount path
|
||||||
|
artifacts_dataset = run.input_datasets['artifacts_dataset']
|
||||||
|
artifacts_path = run.input_datasets['artifacts_path']
|
||||||
|
|
||||||
|
# Find checkpoint file to be evaluated
|
||||||
|
checkpoint_id = '-' + str(args.checkpoint_number)
|
||||||
|
checkpoint_files = list(filter(
|
||||||
|
lambda filename: filename.endswith(checkpoint_id),
|
||||||
|
artifacts_dataset.to_path()))
|
||||||
|
|
||||||
|
checkpoint_file = checkpoint_files[0]
|
||||||
|
if checkpoint_file[0] == '/':
|
||||||
|
checkpoint_file = checkpoint_file[1:]
|
||||||
|
checkpoint = os.path.join(artifacts_path, checkpoint_file)
|
||||||
|
print('Checkpoint:', checkpoint)
|
||||||
|
|
||||||
|
# Set rollout checkpoint
|
||||||
|
args.checkpoint = checkpoint
|
||||||
|
|
||||||
|
# Start rollout
|
||||||
|
run_rollout(args, rollout_parser, args.ray_address)
|
||||||
@@ -0,0 +1,41 @@
|
|||||||
|
import argparse
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
|
||||||
|
import ray
|
||||||
|
from ray.rllib import train
|
||||||
|
from ray import tune
|
||||||
|
|
||||||
|
from utils import callbacks
|
||||||
|
|
||||||
|
|
||||||
|
DEFAULT_RAY_ADDRESS = 'localhost:6379'
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
|
||||||
|
# Parse arguments and add callbacks to config
|
||||||
|
train_parser = train.create_parser()
|
||||||
|
|
||||||
|
args = train_parser.parse_args()
|
||||||
|
args.config["callbacks"] = {"on_train_result": callbacks.on_train_result}
|
||||||
|
|
||||||
|
# Trace if video capturing is on
|
||||||
|
if 'monitor' in args.config and args.config['monitor']:
|
||||||
|
print("Video capturing is ON!")
|
||||||
|
|
||||||
|
# Start (connect to) Ray cluster
|
||||||
|
if args.ray_address is None:
|
||||||
|
args.ray_address = DEFAULT_RAY_ADDRESS
|
||||||
|
|
||||||
|
ray.init(address=args.ray_address)
|
||||||
|
|
||||||
|
# Run training task using tune.run
|
||||||
|
tune.run(
|
||||||
|
run_or_experiment=args.run,
|
||||||
|
config=dict(args.config, env=args.env),
|
||||||
|
stop=args.stop,
|
||||||
|
checkpoint_freq=args.checkpoint_freq,
|
||||||
|
checkpoint_at_end=args.checkpoint_at_end,
|
||||||
|
local_dir=args.local_dir
|
||||||
|
)
|
||||||
@@ -0,0 +1,17 @@
|
|||||||
|
'''RLlib callbacks module:
|
||||||
|
Common callback methods to be passed to RLlib trainer.
|
||||||
|
'''
|
||||||
|
|
||||||
|
from azureml.core import Run
|
||||||
|
|
||||||
|
|
||||||
|
def on_train_result(info):
|
||||||
|
'''Callback on train result to record metrics returned by trainer.
|
||||||
|
'''
|
||||||
|
run = Run.get_context()
|
||||||
|
run.log(
|
||||||
|
name='episode_reward_mean',
|
||||||
|
value=info["result"]["episode_reward_mean"])
|
||||||
|
run.log(
|
||||||
|
name='episodes_total',
|
||||||
|
value=info["result"]["episodes_total"])
|
||||||
@@ -0,0 +1,13 @@
|
|||||||
|
'''Misc module:
|
||||||
|
Miscellaneous helper functions and utilities.
|
||||||
|
'''
|
||||||
|
|
||||||
|
import os
|
||||||
|
import glob
|
||||||
|
|
||||||
|
|
||||||
|
# Helper function to find a file or folder path
|
||||||
|
def find_path(name, path_prefix):
|
||||||
|
for root, _, _ in os.walk(path_prefix):
|
||||||
|
if glob.glob(os.path.join(root, name)):
|
||||||
|
return root
|
||||||
Binary file not shown.
|
After Width: | Height: | Size: 1.3 KiB |
File diff suppressed because it is too large
@@ -0,0 +1,6 @@
|
|||||||
|
name: cartpole_cc
|
||||||
|
dependencies:
|
||||||
|
- pip:
|
||||||
|
- azureml-sdk
|
||||||
|
- azureml-contrib-reinforcementlearning
|
||||||
|
- azureml-widgets
|
||||||
@@ -0,0 +1,119 @@
|
|||||||
|
import argparse
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
|
||||||
|
import ray
|
||||||
|
from ray.rllib import rollout
|
||||||
|
from ray.tune.registry import get_trainable_cls
|
||||||
|
|
||||||
|
from azureml.core import Run
|
||||||
|
|
||||||
|
from utils import callbacks
|
||||||
|
|
||||||
|
|
||||||
|
DEFAULT_RAY_ADDRESS = 'localhost:6379'
|
||||||
|
|
||||||
|
|
||||||
|
def run_rollout(args, parser, ray_address):
|
||||||
|
|
||||||
|
config = args.config
|
||||||
|
if not args.env:
|
||||||
|
if not config.get("env"):
|
||||||
|
parser.error("the following arguments are required: --env")
|
||||||
|
args.env = config.get("env")
|
||||||
|
|
||||||
|
ray.init(address=ray_address)
|
||||||
|
|
||||||
|
# Create the Trainer from config.
|
||||||
|
cls = get_trainable_cls(args.run)
|
||||||
|
agent = cls(env=args.env, config=config)
|
||||||
|
|
||||||
|
# Load state from checkpoint.
|
||||||
|
agent.restore(args.checkpoint)
|
||||||
|
num_steps = int(args.steps)
|
||||||
|
num_episodes = int(args.episodes)
|
||||||
|
|
||||||
|
# Determine the video output directory.
|
||||||
|
use_arg_monitor = False
|
||||||
|
try:
|
||||||
|
args.video_dir
|
||||||
|
except AttributeError:
|
||||||
|
print("There is no such attribute: args.video_dir")
|
||||||
|
use_arg_monitor = True
|
||||||
|
|
||||||
|
video_dir = None
|
||||||
|
if not use_arg_monitor:
|
||||||
|
if args.monitor:
|
||||||
|
video_dir = os.path.join("./logs", "video")
|
||||||
|
elif args.video_dir:
|
||||||
|
video_dir = os.path.expanduser(args.video_dir)
|
||||||
|
|
||||||
|
# Do the actual rollout.
|
||||||
|
with rollout.RolloutSaver(
|
||||||
|
args.out,
|
||||||
|
args.use_shelve,
|
||||||
|
write_update_file=args.track_progress,
|
||||||
|
target_steps=num_steps,
|
||||||
|
target_episodes=num_episodes,
|
||||||
|
save_info=args.save_info) as saver:
|
||||||
|
if use_arg_monitor:
|
||||||
|
rollout.rollout(
|
||||||
|
agent,
|
||||||
|
args.env,
|
||||||
|
num_steps,
|
||||||
|
num_episodes,
|
||||||
|
saver,
|
||||||
|
args.no_render,
|
||||||
|
args.monitor)
|
||||||
|
else:
|
||||||
|
rollout.rollout(
|
||||||
|
agent, args.env,
|
||||||
|
num_steps,
|
||||||
|
num_episodes,
|
||||||
|
saver,
|
||||||
|
args.no_render, video_dir)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
|
||||||
|
# Add positional argument - serves as placeholder for checkpoint
|
||||||
|
argvc = sys.argv[1:]
|
||||||
|
argvc.insert(0, 'checkpoint-placeholder')
|
||||||
|
|
||||||
|
# Parse arguments
|
||||||
|
rollout_parser = rollout.create_parser()
|
||||||
|
|
||||||
|
rollout_parser.add_argument(
|
||||||
|
'--checkpoint-number', required=False, type=int, default=1,
|
||||||
|
help='Checkpoint number of the checkpoint from which to roll out')
|
||||||
|
|
||||||
|
rollout_parser.add_argument(
|
||||||
|
'--ray-address', required=False, default=DEFAULT_RAY_ADDRESS,
|
||||||
|
help='The address of the Ray cluster to connect to')
|
||||||
|
|
||||||
|
args = rollout_parser.parse_args(argvc)
|
||||||
|
|
||||||
|
# Get a handle to run
|
||||||
|
run = Run.get_context()
|
||||||
|
|
||||||
|
# Get handles to the training artifacts dataset and mount path
|
||||||
|
artifacts_dataset = run.input_datasets['artifacts_dataset']
|
||||||
|
artifacts_path = run.input_datasets['artifacts_path']
|
||||||
|
|
||||||
|
# Find checkpoint file to be evaluated
|
||||||
|
checkpoint_id = '-' + str(args.checkpoint_number)
|
||||||
|
checkpoint_files = list(filter(
|
||||||
|
lambda filename: filename.endswith(checkpoint_id),
|
||||||
|
artifacts_dataset.to_path()))
|
||||||
|
|
||||||
|
checkpoint_file = checkpoint_files[0]
|
||||||
|
if checkpoint_file[0] == '/':
|
||||||
|
checkpoint_file = checkpoint_file[1:]
|
||||||
|
checkpoint = os.path.join(artifacts_path, checkpoint_file)
|
||||||
|
print('Checkpoint:', checkpoint)
|
||||||
|
|
||||||
|
# Set rollout checkpoint
|
||||||
|
args.checkpoint = checkpoint
|
||||||
|
|
||||||
|
# Start rollout
|
||||||
|
run_rollout(args, rollout_parser, args.ray_address)
|
||||||
@@ -0,0 +1,41 @@
|
|||||||
|
import argparse
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
|
||||||
|
import ray
|
||||||
|
from ray.rllib import train
|
||||||
|
from ray import tune
|
||||||
|
|
||||||
|
from utils import callbacks
|
||||||
|
|
||||||
|
|
||||||
|
DEFAULT_RAY_ADDRESS = 'localhost:6379'
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
|
||||||
|
# Parse arguments and add callbacks to config
|
||||||
|
train_parser = train.create_parser()
|
||||||
|
|
||||||
|
args = train_parser.parse_args()
|
||||||
|
args.config["callbacks"] = {"on_train_result": callbacks.on_train_result}
|
||||||
|
|
||||||
|
# Trace if video capturing is on
|
||||||
|
if 'monitor' in args.config and args.config['monitor']:
|
||||||
|
print("Video capturing is ON!")
|
||||||
|
|
||||||
|
# Start (connect to) Ray cluster
|
||||||
|
if args.ray_address is None:
|
||||||
|
args.ray_address = DEFAULT_RAY_ADDRESS
|
||||||
|
|
||||||
|
ray.init(address=args.ray_address)
|
||||||
|
|
||||||
|
# Run training task using tune.run
|
||||||
|
tune.run(
|
||||||
|
run_or_experiment=args.run,
|
||||||
|
config=dict(args.config, env=args.env),
|
||||||
|
stop=args.stop,
|
||||||
|
checkpoint_freq=args.checkpoint_freq,
|
||||||
|
checkpoint_at_end=args.checkpoint_at_end,
|
||||||
|
local_dir=args.local_dir
|
||||||
|
)
|
||||||
@@ -0,0 +1,29 @@
|
|||||||
|
FROM mcr.microsoft.com/azureml/base:openmpi3.1.2-ubuntu18.04
|
||||||
|
|
||||||
|
RUN apt-get update && apt-get install -y --no-install-recommends \
|
||||||
|
python-opengl \
|
||||||
|
rsync \
|
||||||
|
xvfb && \
|
||||||
|
apt-get clean -y && \
|
||||||
|
rm -rf /var/lib/apt/lists/* && \
|
||||||
|
rm -rf /usr/share/man/*
|
||||||
|
|
||||||
|
RUN conda install -y conda=4.7.12 python=3.6.2 && conda clean -ay && \
|
||||||
|
pip install --no-cache-dir \
|
||||||
|
azureml-defaults \
|
||||||
|
azureml-dataprep[fuse,pandas] \
|
||||||
|
azureml-contrib-reinforcementlearning \
|
||||||
|
gputil \
|
||||||
|
cloudpickle==1.3.0 \
|
||||||
|
tensorboardX \
|
||||||
|
tensorflow==1.14.0 \
|
||||||
|
tabulate \
|
||||||
|
dm_tree \
|
||||||
|
lz4 \
|
||||||
|
ray==0.8.3 \
|
||||||
|
ray[rllib,dashboard,tune]==0.8.3 \
|
||||||
|
psutil \
|
||||||
|
setproctitle \
|
||||||
|
gym[atari] && \
|
||||||
|
conda install -y -c conda-forge x264='1!152.20180717' ffmpeg=4.0.2 && \
|
||||||
|
conda install opencv
|
||||||
@@ -0,0 +1,17 @@
|
|||||||
|
'''RLlib callbacks module:
|
||||||
|
Common callback methods to be passed to RLlib trainer.
|
||||||
|
'''
|
||||||
|
|
||||||
|
from azureml.core import Run
|
||||||
|
|
||||||
|
|
||||||
|
def on_train_result(info):
|
||||||
|
'''Callback on train result to record metrics returned by trainer.
|
||||||
|
'''
|
||||||
|
run = Run.get_context()
|
||||||
|
run.log(
|
||||||
|
name='episode_reward_mean',
|
||||||
|
value=info["result"]["episode_reward_mean"])
|
||||||
|
run.log(
|
||||||
|
name='episodes_total',
|
||||||
|
value=info["result"]["episodes_total"])
|
||||||
@@ -0,0 +1,13 @@
|
|||||||
|
'''Misc module:
|
||||||
|
Miscellaneous helper functions and utilities.
|
||||||
|
'''
|
||||||
|
|
||||||
|
import os
|
||||||
|
import glob
|
||||||
|
|
||||||
|
|
||||||
|
# Helper function to find a file or folder path
|
||||||
|
def find_path(name, path_prefix):
|
||||||
|
for root, _, _ in os.walk(path_prefix):
|
||||||
|
if glob.glob(os.path.join(root, name)):
|
||||||
|
return root
|
||||||
Binary file not shown.
|
After Width: | Height: | Size: 1.3 KiB |
@@ -0,0 +1,70 @@
|
|||||||
|
FROM mcr.microsoft.com/azureml/base:openmpi3.1.2-ubuntu18.04
|
||||||
|
|
||||||
|
# Install some basic utilities
|
||||||
|
RUN apt-get update && apt-get install -y \
|
||||||
|
curl \
|
||||||
|
ca-certificates \
|
||||||
|
sudo \
|
||||||
|
cpio \
|
||||||
|
git \
|
||||||
|
bzip2 \
|
||||||
|
libx11-6 \
|
||||||
|
tmux \
|
||||||
|
htop \
|
||||||
|
gcc \
|
||||||
|
xvfb \
|
||||||
|
python-opengl \
|
||||||
|
x11-xserver-utils \
|
||||||
|
ffmpeg \
|
||||||
|
mesa-utils \
|
||||||
|
nano \
|
||||||
|
vim \
|
||||||
|
rsync \
|
||||||
|
&& rm -rf /var/lib/apt/lists/*
|
||||||
|
|
||||||
|
# Create a working directory
|
||||||
|
RUN mkdir /app
|
||||||
|
WORKDIR /app
|
||||||
|
|
||||||
|
# Install Minecraft needed libraries
|
||||||
|
RUN mkdir -p /usr/share/man/man1 && \
|
||||||
|
sudo apt-get update && \
|
||||||
|
sudo apt-get install -y \
|
||||||
|
openjdk-8-jre-headless=8u162-b12-1 \
|
||||||
|
openjdk-8-jdk-headless=8u162-b12-1 \
|
||||||
|
openjdk-8-jre=8u162-b12-1 \
|
||||||
|
openjdk-8-jdk=8u162-b12-1
|
||||||
|
|
||||||
|
# Create a Python 3.7 environment
|
||||||
|
RUN conda install conda-build \
|
||||||
|
&& conda create -y --name py37 python=3.7.3 \
|
||||||
|
&& conda clean -ya
|
||||||
|
ENV CONDA_DEFAULT_ENV=py37
|
||||||
|
|
||||||
|
# Install minerl
|
||||||
|
RUN pip install --upgrade --user minerl
|
||||||
|
|
||||||
|
RUN pip install \
|
||||||
|
pandas \
|
||||||
|
matplotlib \
|
||||||
|
numpy \
|
||||||
|
scipy \
|
||||||
|
azureml-defaults \
|
||||||
|
tensorboardX \
|
||||||
|
tensorflow==1.15rc2 \
|
||||||
|
tabulate \
|
||||||
|
dm_tree \
|
||||||
|
lz4 \
|
||||||
|
ray==0.8.3 \
|
||||||
|
ray[rllib]==0.8.3 \
|
||||||
|
ray[tune]==0.8.3
|
||||||
|
|
||||||
|
COPY patch_files/* /root/.local/lib/python3.7/site-packages/minerl/env/Malmo/Minecraft/src/main/java/com/microsoft/Malmo/Client/
|
||||||
|
|
||||||
|
# Start minerl to pre-fetch minerl files (saves time when starting minerl during training)
|
||||||
|
RUN xvfb-run -a -s "-screen 0 1400x900x24" python -c "import gym; import minerl; env = gym.make('MineRLTreechop-v0'); env.close();"
|
||||||
|
|
||||||
|
RUN pip install --index-url https://test.pypi.org/simple/ malmo && \
|
||||||
|
python -c "import malmo.minecraftbootstrap; malmo.minecraftbootstrap.download();"
|
||||||
|
|
||||||
|
ENV MALMO_XSD_PATH="/app/MalmoPlatform/Schemas"
|
||||||
Some files were not shown because too many files have changed in this diff