Mirror of https://github.com/Azure/MachineLearningNotebooks.git
update samples from Release-137 as a part of 1.0.53 SDK release
@@ -38,6 +38,7 @@ The [How to use Azure ML](./how-to-use-azureml) folder contains specific example
 - [Machine Learning Pipelines](./how-to-use-azureml/machine-learning-pipelines) - Examples showing how to create and use reusable pipelines for training and batch scoring
 - [Deployment](./how-to-use-azureml/deployment) - Examples showing how to deploy and manage machine learning models and solutions
 - [Azure Databricks](./how-to-use-azureml/azure-databricks) - Examples showing how to use Azure ML with Azure Databricks
+- [Monitor Models](./how-to-use-azureml/monitor-models) - Examples showing how to enable model monitoring services such as DataDrift

 ---
 ## Documentation
@@ -52,6 +53,7 @@ The [How to use Azure ML](./how-to-use-azureml) folder contains specific example

 Visit following repos to see projects contributed by Azure ML users:

+- [AMLSamples](https://github.com/Azure/AMLSamples) Number of end-to-end examples, including face recognition, predictive maintenance, customer churn and sentiment analysis.
 - [Fine tune natural language processing models using Azure Machine Learning service](https://github.com/Microsoft/AzureML-BERT)
 - [Fashion MNIST with Azure ML SDK](https://github.com/amynic/azureml-sdk-fashion)

@@ -103,7 +103,7 @@
 "source": [
 "import azureml.core\n",
 "\n",
-"print(\"This notebook was created using version 1.0.48 of the Azure ML SDK\")\n",
+"print(\"This notebook was created using version 1.0.53 of the Azure ML SDK\")\n",
 "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
 ]
 },
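Unwrapped from the escaped notebook JSON, the updated version check reads as the following plain Python (a sketch of the same two statements):

```python
import azureml.core

# The samples in this release were authored against SDK 1.0.53.
print("This notebook was created using version 1.0.53 of the Azure ML SDK")
print("You are currently using version", azureml.core.VERSION, "of the Azure ML SDK")
```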
@@ -2,6 +2,7 @@ name: azure_automl
 dependencies:
 # The python interpreter version.
 # Currently Azure ML only supports 3.5.2 and later.
+- pip
 - nomkl
 - python>=3.5.2,<3.6.8
 - nb_conda
@@ -578,7 +578,7 @@
 "metadata": {
 "authors": [
 {
-"name": "xiaga@microsoft.com, tosingli@microsoft.com, erwright@microsoft.com"
+"name": "erwright"
 }
 ],
 "kernelspec": {
@@ -587,7 +587,7 @@
 "metadata": {
 "authors": [
 {
-"name": "xiaga, tosingli, erwright"
+"name": "erwright"
 }
 ],
 "kernelspec": {
@@ -829,7 +829,7 @@
 "metadata": {
 "authors": [
 {
-"name": "erwright, tosingli"
+"name": "erwright"
 }
 ],
 "kernelspec": {
@@ -87,7 +87,7 @@ These instruction setup the integration for SQL Server 2017 on Windows.
 sudo /opt/mssql/mlservices/bin/python/python -m pip install --upgrade sklearn
 ```
 7. Start SQL Server.
-8. Execute the files aml_model.sql, aml_connection.sql, AutoMLGetMetrics.sql, AutoMLPredict.sql and AutoMLTrain.sql in SQL Server Management Studio.
+8. Execute the files aml_model.sql, aml_connection.sql, AutoMLGetMetrics.sql, AutoMLPredict.sql, AutoMLForecast.sql and AutoMLTrain.sql in SQL Server Management Studio.
 9. Create an Azure Machine Learning Workspace. You can use the instructions at: [https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-workspace)
 10. Create a config.json file using the subscription id, resource group name and workspace name that you used to create the workspace. The file is described at: [https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-environment#workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-environment#workspace)
 11. Create an Azure service principal. You can do this with the commands:
@@ -109,5 +109,5 @@ First you need to load the sample data in the database.

 You can then run the queries in the energy-demand folder:
 * TrainEnergyDemand.sql runs AutoML, trains multiple models on data and selects the best model.
-* PredictEnergyDemand.sql predicts based on the most recent training run.
+* ForecastEnergyDemand.sql forecasts based on the most recent training run.
 * GetMetrics.sql returns all the metrics for each model in the most recent training run.
@@ -12,7 +12,7 @@ Easily create and train a model using various deep neural networks (DNNs) as a f
 To learn more about the azureml-accel-model classes, see the section [Model Classes](#model-classes) below or the [Azure ML Accel Models SDK documentation](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel?view=azure-ml-py).

 ### Step 1: Create an Azure ML workspace
-Follow [these instructions](https://docs.microsoft.com/en-us/azure/machine-learning/service/quickstart-create-workspace-with-python) to install the Azure ML SDK on your local machine, create an Azure ML workspace, and set up your notebook environment, which is required for the next step.
+Follow [these instructions](https://docs.microsoft.com/en-us/azure/machine-learning/service/setup-create-workspace) to install the Azure ML SDK on your local machine, create an Azure ML workspace, and set up your notebook environment, which is required for the next step.

 ### Step 2: Check your FPGA quota
 Use the Azure CLI to check whether you have quota.
@@ -1,5 +1,12 @@
 {
 "cells": [
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+""
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -230,11 +237,14 @@
 "\n",
 "# Convert model\n",
 "convert_request = AccelOnnxConverter.convert_tf_model(ws, registered_model, input_tensors, output_tensors_str)\n",
-"# If it fails, you can run wait_for_completion again with show_output=True.\n",
-"convert_request.wait_for_completion(show_output=False)\n",
-"converted_model = convert_request.result\n",
-"print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
+"if convert_request.wait_for_completion(show_output = False):\n",
+" # If the above call succeeded, get the converted model\n",
+" converted_model = convert_request.result\n",
+" print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
 " converted_model.id, converted_model.created_time, '\\n')\n",
+"else:\n",
+" print(\"Model conversion failed. Showing output.\")\n",
+" convert_request.wait_for_completion(show_output = True)\n",
 "\n",
 "# Package into AccelContainerImage\n",
 "image_config = AccelContainerImage.image_configuration()\n",
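Unwrapped from the escaped notebook JSON, the new conversion cell follows this pattern (a sketch; `ws`, `registered_model`, `input_tensors` and `output_tensors_str` are defined earlier in the notebook):

```python
from azureml.accel import AccelOnnxConverter

# Convert the registered TensorFlow model to an accelerated ONNX model.
convert_request = AccelOnnxConverter.convert_tf_model(ws, registered_model, input_tensors, output_tensors_str)

if convert_request.wait_for_completion(show_output=False):
    # If the above call succeeded, get the converted model
    converted_model = convert_request.result
    print("\nSuccessfully converted: ", converted_model.name, converted_model.url, converted_model.version,
          converted_model.id, converted_model.created_time, '\n')
else:
    print("Model conversion failed. Showing output.")
    convert_request.wait_for_completion(show_output=True)
```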
@@ -298,6 +308,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"%%time\n",
 "aks_target.wait_for_completion(show_output = True)\n",
 "print(aks_target.provisioning_state)\n",
 "print(aks_target.provisioning_errors)"
@@ -316,6 +327,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"%%time\n",
 "from azureml.core.webservice import Webservice, AksWebservice\n",
 "\n",
 "# Set the web service configuration (for creating a test service, we don't want autoscale enabled)\n",
@@ -342,10 +354,9 @@
 "## 5. Test the service\n",
 "<a id=\"create-client\"></a>\n",
 "### 5.a. Create Client\n",
-"The image supports gRPC and the TensorFlow Serving \"predict\" API. We have a client that can call into the docker image to get predictions. \n",
-"\n",
-"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice.deploy_configuration(), see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).",
+"The image supports gRPC and the TensorFlow Serving \"predict\" API. We will create a PredictionClient from the Webservice object that can call into the docker image to get predictions. If you do not have the Webservice object, you can also create [PredictionClient](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel.predictionclient?view=azure-ml-py) directly.\n",
 "\n",
+"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice.deploy_configuration(), see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).\n",
 "**WARNING:** If you are running on Azure Notebooks free compute, you will not be able to make outgoing calls to your service. Try locating your client on a different machine to consume it."
 ]
 },
@@ -356,18 +367,10 @@
 "outputs": [],
 "source": [
 "# Using the grpc client in AzureML Accelerated Models SDK\n",
-"from azureml.accel.client import PredictionClient\n",
-"\n",
-"address = aks_service.scoring_uri\n",
-"ssl_enabled = address.startswith(\"https\")\n",
-"address = address[address.find('/')+2:].strip('/')\n",
-"port = 443 if ssl_enabled else 80\n",
+"from azureml.accel import client_from_service\n",
 "\n",
 "# Initialize AzureML Accelerated Models client\n",
-"client = PredictionClient(address=address,\n",
-" port=port,\n",
-" use_ssl=ssl_enabled,\n",
-" service_name=aks_service.name)"
+"client = client_from_service(aks_service)"
 ]
 },
 {
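In plain Python, this cell switches from hand-building a PredictionClient to the client_from_service helper shown in the diff (a sketch; `aks_service` is the AksWebservice deployed in the previous cells):

```python
# Before: construct the gRPC PredictionClient manually from the scoring URI
from azureml.accel.client import PredictionClient

address = aks_service.scoring_uri
ssl_enabled = address.startswith("https")
address = address[address.find('/') + 2:].strip('/')
port = 443 if ssl_enabled else 80
client = PredictionClient(address=address,
                          port=port,
                          use_ssl=ssl_enabled,
                          service_name=aks_service.name)

# After: let the SDK derive the connection details from the Webservice object
from azureml.accel import client_from_service

client = client_from_service(aks_service)
```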
@@ -486,7 +489,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.0"
+"version": "3.5.6"
 }
 },
 "nbformat": 4,
@@ -0,0 +1,8 @@
+name: accelerated-models-object-detection
+dependencies:
+- pip:
+  - azureml-sdk
+  - azureml-accel-models
+  - tensorflow
+  - opencv-python
+  - matplotlib
@@ -1,5 +1,12 @@
 {
 "cells": [
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+""
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -270,12 +277,15 @@
 "from azureml.accel import AccelOnnxConverter\n",
 "\n",
 "convert_request = AccelOnnxConverter.convert_tf_model(ws, registered_model, input_tensors, output_tensors)\n",
-"# If it fails, you can run wait_for_completion again with show_output=True.\n",
-"convert_request.wait_for_completion(show_output = False)\n",
-"# If the above call succeeded, get the converted model\n",
-"converted_model = convert_request.result\n",
-"print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
-" converted_model.id, converted_model.created_time, '\\n')"
+"\n",
+"if convert_request.wait_for_completion(show_output = False):\n",
+" # If the above call succeeded, get the converted model\n",
+" converted_model = convert_request.result\n",
+" print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
+" converted_model.id, converted_model.created_time, '\\n')\n",
+"else:\n",
+" print(\"Model conversion failed. Showing output.\")\n",
+" convert_request.wait_for_completion(show_output = True)"
 ]
 },
 {
@@ -366,6 +376,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"%%time\n",
 "aks_target.wait_for_completion(show_output = True)\n",
 "print(aks_target.provisioning_state)\n",
 "print(aks_target.provisioning_errors)"
@@ -384,9 +395,10 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"%%time\n",
 "from azureml.core.webservice import Webservice, AksWebservice\n",
 "\n",
-"#Set the web service configuration (for creating a test service, we don't want autoscale enabled)\n",
+"# Set the web service configuration (for creating a test service, we don't want autoscale enabled)\n",
 "# Authentication is enabled by default, but for testing we specify False\n",
 "aks_config = AksWebservice.deploy_configuration(autoscale_enabled=False,\n",
 " num_replicas=1,\n",
@@ -415,10 +427,9 @@
 "metadata": {},
 "source": [
 "### 7.a. Create Client\n",
-"The image supports gRPC and the TensorFlow Serving \"predict\" API. We have a client that can call into the docker image to get predictions.\n",
-"\n",
-"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice, see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).",
+"The image supports gRPC and the TensorFlow Serving \"predict\" API. We will create a PredictionClient from the Webservice object that can call into the docker image to get predictions. If you do not have the Webservice object, you can also create [PredictionClient](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel.predictionclient?view=azure-ml-py) directly.\n",
 "\n",
+"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice, see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).\n",
 "**WARNING:** If you are running on Azure Notebooks free compute, you will not be able to make outgoing calls to your service. Try locating your client on a different machine to consume it."
 ]
 },
@@ -429,18 +440,10 @@
 "outputs": [],
 "source": [
 "# Using the grpc client in AzureML Accelerated Models SDK\n",
-"from azureml.accel.client import PredictionClient\n",
-"\n",
-"address = aks_service.scoring_uri\n",
-"ssl_enabled = address.startswith(\"https\")\n",
-"address = address[address.find('/')+2:].strip('/')\n",
-"port = 443 if ssl_enabled else 80\n",
+"from azureml.accel import client_from_service\n",
 "\n",
 "# Initialize AzureML Accelerated Models client\n",
-"client = PredictionClient(address=address,\n",
-" port=port,\n",
-" use_ssl=ssl_enabled,\n",
-" service_name=aks_service.name)"
+"client = client_from_service(aks_service)"
 ]
 },
 {
@@ -540,7 +543,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.0"
+"version": "3.5.6"
 }
 },
 "nbformat": 4,
@@ -0,0 +1,6 @@
+name: accelerated-models-quickstart
+dependencies:
+- pip:
+  - azureml-sdk
+  - azureml-accel-models
+  - tensorflow
@@ -1,5 +1,12 @@
 {
 "cells": [
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+""
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -410,6 +417,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"%%time\n",
 "# Launch the training\n",
 "tf.reset_default_graph()\n",
 "sess = tf.Session(graph=tf.get_default_graph())\n",
@@ -582,11 +590,14 @@
 "\n",
 "# Convert model\n",
 "convert_request = AccelOnnxConverter.convert_tf_model(ws, registered_model, input_tensors, output_tensors)\n",
-"# If it fails, you can run wait_for_completion again with show_output=True.\n",
-"convert_request.wait_for_completion(show_output=False)\n",
-"converted_model = convert_request.result\n",
-"print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
+"if convert_request.wait_for_completion(show_output = False):\n",
+" # If the above call succeeded, get the converted model\n",
+" converted_model = convert_request.result\n",
+" print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
 " converted_model.id, converted_model.created_time, '\\n')\n",
+"else:\n",
+" print(\"Model conversion failed. Showing output.\")\n",
+" convert_request.wait_for_completion(show_output = True)\n",
 "\n",
 "# Package into AccelContainerImage\n",
 "image_config = AccelContainerImage.image_configuration()\n",
@@ -655,6 +666,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"%%time\n",
 "aks_target.wait_for_completion(show_output = True)\n",
 "print(aks_target.provisioning_state)\n",
 "print(aks_target.provisioning_errors)"
@@ -673,6 +685,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"%%time\n",
 "from azureml.core.webservice import Webservice, AksWebservice\n",
 "\n",
 "# Set the web service configuration (for creating a test service, we don't want autoscale enabled)\n",
@@ -700,10 +713,9 @@
 "\n",
 "<a id=\"create-client\"></a>\n",
 "### 9.a. Create Client\n",
-"The image supports gRPC and the TensorFlow Serving \"predict\" API. We have a client that can call into the docker image to get predictions. \n",
-"\n",
-"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice.deploy_configuration(), see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).",
+"The image supports gRPC and the TensorFlow Serving \"predict\" API. We will create a PredictionClient from the Webservice object that can call into the docker image to get predictions. If you do not have the Webservice object, you can also create [PredictionClient](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel.predictionclient?view=azure-ml-py) directly.\n",
 "\n",
+"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice.deploy_configuration(), see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).\n",
 "**WARNING:** If you are running on Azure Notebooks free compute, you will not be able to make outgoing calls to your service. Try locating your client on a different machine to consume it."
 ]
 },
@@ -714,18 +726,10 @@
 "outputs": [],
 "source": [
 "# Using the grpc client in AzureML Accelerated Models SDK\n",
-"from azureml.accel.client import PredictionClient\n",
-"\n",
-"address = aks_service.scoring_uri\n",
-"ssl_enabled = address.startswith(\"https\")\n",
-"address = address[address.find('/')+2:].strip('/')\n",
-"port = 443 if ssl_enabled else 80\n",
+"from azureml.accel import client_from_service\n",
 "\n",
 "# Initialize AzureML Accelerated Models client\n",
-"client = PredictionClient(address=address,\n",
-" port=port,\n",
-" use_ssl=ssl_enabled,\n",
-" service_name=aks_service.name)"
+"client = client_from_service(aks_service)"
 ]
 },
 {
@@ -854,7 +858,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.0"
+"version": "3.5.6"
 }
 },
 "nbformat": 4,
@@ -0,0 +1,9 @@
+name: accelerated-models-training
+dependencies:
+- pip:
+  - azureml-sdk
+  - azureml-accel-models
+  - tensorflow
+  - keras
+  - tqdm
+  - sklearn
@@ -150,7 +150,9 @@
 "> Estimator object initialization involves specifying a list of DataReference objects in its 'inputs' parameter.\n",
 " In Pipelines, a step can take another step's output or DataReferences as input. So when creating an EstimatorStep,\n",
 " the parameters 'inputs' and 'outputs' need to be set explicitly and that will override 'inputs' parameter\n",
-" specified in the Estimator object."
+" specified in the Estimator object.\n",
+" \n",
+"> The best practice is to use separate folders for scripts and its dependent files for each step and specify that folder as the `source_directory` for the step. This helps reduce the size of the snapshot created for the step (only the specific folder is snapshotted). Since changes in any files in the `source_directory` would trigger a re-upload of the snapshot, this helps keep the reuse of the step when there are no changes in the `source_directory` of the step."
 ]
 },
 {
@@ -170,7 +172,9 @@
 " data_reference_name=\"input_data\",\n",
 " path_on_datastore=\"20newsgroups/20news.pkl\")\n",
 "\n",
-"output = PipelineData(\"output\", datastore=def_blob_store)"
+"output = PipelineData(\"output\", datastore=def_blob_store)\n",
+"\n",
+"source_directory = 'estimator_train'"
 ]
 },
 {
@@ -181,7 +185,7 @@
 "source": [
 "from azureml.train.estimator import Estimator\n",
 "\n",
-"est = Estimator(source_directory='.', \n",
+"est = Estimator(source_directory=source_directory, \n",
 " compute_target=cpu_cluster, \n",
 " entry_script='dummy_train.py', \n",
 " conda_packages=['scikit-learn'])"
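Collected into plain Python, the updated cells of this hunk look roughly like this (a sketch; `cpu_cluster`, `def_blob_store` and the 20newsgroups DataReference are defined in earlier cells of the notebook):

```python
from azureml.pipeline.core import PipelineData
from azureml.train.estimator import Estimator

# Intermediate pipeline output written to the default blob store
output = PipelineData("output", datastore=def_blob_store)

# Keep the step's script and its dependent files in their own folder so only that
# folder is snapshotted and the step can be reused when nothing in it changes.
source_directory = 'estimator_train'

est = Estimator(source_directory=source_directory,
                compute_target=cpu_cluster,
                entry_script='dummy_train.py',
                conda_packages=['scikit-learn'])
```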
@@ -88,7 +88,11 @@
 "metadata": {},
 "source": [
 "## Create an Azure ML experiment\n",
-"Let's create an experiment named \"tf-mnist\" and a folder to hold the training scripts. The script runs will be recorded under the experiment in Azure.\n"
+"Let's create an experiment named \"tf-mnist\" and a folder to hold the training scripts. \n",
+"\n",
+"> The best practice is to use separate folders for scripts and its dependent files for each step. This helps reduce the size of the snapshot created for the step (only the specific folder is snapshotted). Since changes in any files in the `source_directory` would trigger a re-upload of the snapshot, this helps keep the reuse of the step when there are no changes in the `source_directory` of the step. \n",
+"\n",
+"> The script runs will be recorded under the experiment in Azure."
 ]
 },
 {
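The code cell that follows this markdown normally creates the folder and the experiment; a minimal sketch, assuming the folder is simply named after the experiment (the actual cell in the notebook may differ):

```python
import os
from azureml.core import Experiment

# A dedicated folder keeps only the training script in the step's snapshot.
script_folder = os.path.join(os.getcwd(), "tf-mnist")  # folder name assumed for illustration
os.makedirs(script_folder, exist_ok=True)

exp = Experiment(workspace=ws, name="tf-mnist")
```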
@@ -57,10 +57,8 @@
 "ws = Workspace.from_config()\n",
 "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')\n",
 "\n",
-"# Default datastore (Azure file storage)\n",
-"def_file_store = ws.get_default_datastore() \n",
-"print(\"Default datastore's name: {}\".format(def_file_store.name))\n",
-"\n",
+"# Default datastore (Azure blob storage)\n",
+"# def_blob_store = ws.get_default_datastore()\n",
 "def_blob_store = Datastore(ws, \"workspaceblobstore\")\n",
 "print(\"Blobstore's name: {}\".format(def_blob_store.name))"
 ]
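Unwrapped from the notebook JSON, the corrected datastore cell is essentially the following (a sketch; in the notebook the imports appear in an earlier cell):

```python
from azureml.core import Workspace, Datastore

ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\n')

# Default datastore (Azure blob storage)
# def_blob_store = ws.get_default_datastore()
def_blob_store = Datastore(ws, "workspaceblobstore")
print("Blobstore's name: {}".format(def_blob_store.name))
```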
@@ -147,7 +145,9 @@
 "#### Define a Step that consumes a datasource and produces intermediate data.\n",
 "In this step, we define a step that consumes a datasource and produces intermediate data.\n",
 "\n",
-"**Open `train.py` in the local machine and examine the arguments, inputs, and outputs for the script. That will give you a good sense of why the script argument names used below are important.** "
+"**Open `train.py` in the local machine and examine the arguments, inputs, and outputs for the script. That will give you a good sense of why the script argument names used below are important.** \n",
+"\n",
+"The best practice is to use separate folders for scripts and its dependent files for each step and specify that folder as the `source_directory` for the step. This helps reduce the size of the snapshot created for the step (only the specific folder is snapshotted). Since changes in any files in the `source_directory` would trigger a re-upload of the snapshot, this helps keep the reuse of the step when there are no changes in the `source_directory` of the step."
 ]
 },
 {
@@ -158,13 +158,16 @@
 "source": [
 "# trainStep consumes the datasource (Datareference) in the previous step\n",
 "# and produces processed_data1\n",
+"\n",
+"source_directory = \"publish_run_train\"\n",
+"\n",
 "trainStep = PythonScriptStep(\n",
 " script_name=\"train.py\", \n",
 " arguments=[\"--input_data\", blob_input_data, \"--output_train\", processed_data1],\n",
 " inputs=[blob_input_data],\n",
 " outputs=[processed_data1],\n",
 " compute_target=aml_compute, \n",
-" source_directory='.'\n",
+" source_directory=source_directory\n",
 ")\n",
 "print(\"trainStep created\")"
 ]
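Unwrapped, the updated step definition has this shape (a sketch; `blob_input_data`, `processed_data1` and `aml_compute` come from earlier cells):

```python
from azureml.pipeline.steps import PythonScriptStep

# Scripts for this step live in their own folder so only that folder is snapshotted.
source_directory = "publish_run_train"

trainStep = PythonScriptStep(
    script_name="train.py",
    arguments=["--input_data", blob_input_data, "--output_train", processed_data1],
    inputs=[blob_input_data],
    outputs=[processed_data1],
    compute_target=aml_compute,
    source_directory=source_directory
)
print("trainStep created")
```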
@@ -188,6 +191,7 @@
 "# extractStep to use the intermediate data produced by step4\n",
 "# This step also produces an output processed_data2\n",
 "processed_data2 = PipelineData(\"processed_data2\", datastore=def_blob_store)\n",
+"source_directory = \"publish_run_extract\"\n",
 "\n",
 "extractStep = PythonScriptStep(\n",
 " script_name=\"extract.py\",\n",
@@ -195,7 +199,7 @@
 " inputs=[processed_data1],\n",
 " outputs=[processed_data2],\n",
 " compute_target=aml_compute, \n",
-" source_directory='.')\n",
+" source_directory=source_directory)\n",
 "print(\"extractStep created\")"
 ]
 },
@@ -247,8 +251,7 @@
 "source": [
 "# Now define step6 that takes two inputs (both intermediate data), and produce an output\n",
 "processed_data3 = PipelineData(\"processed_data3\", datastore=def_blob_store)\n",
-"\n",
-"\n",
+"source_directory = \"publish_run_compare\"\n",
 "\n",
 "compareStep = PythonScriptStep(\n",
 " script_name=\"compare.py\",\n",
@@ -256,7 +259,7 @@
 " inputs=[processed_data1, processed_data2],\n",
 " outputs=[processed_data3], \n",
 " compute_target=aml_compute, \n",
-" source_directory='.')\n",
+" source_directory=source_directory)\n",
 "print(\"compareStep created\")"
 ]
 },
@@ -103,7 +103,7 @@
 "metadata": {},
 "source": [
 "### Define a pipeline step\n",
-"Define a single step pipeline for demonstration purpose."
+"Define a single step pipeline for demonstration purpose. The best practice is to use separate folders for scripts and its dependent files for each step and specify that folder as the `source_directory` for the step. This helps reduce the size of the snapshot created for the step (only the specific folder is snapshotted). Since changes in any files in the `source_directory` would trigger a re-upload of the snapshot, this helps keep the reuse of the step when there are no changes in the `source_directory` of the step."
 ]
 },
 {
@@ -114,11 +114,13 @@
 "source": [
 "from azureml.pipeline.steps import PythonScriptStep\n",
 "\n",
+"source_directory = \"publish_run_train\"\n",
+"\n",
 "trainStep = PythonScriptStep(\n",
 " name=\"Training_Step\",\n",
 " script_name=\"train.py\", \n",
 " compute_target=aml_compute_target, \n",
-" source_directory='.'\n",
+" source_directory=source_directory\n",
 ")\n",
 "print(\"TrainStep created\")"
 ]
@@ -76,7 +76,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"#### Initialization, Steps to create a Pipeline"
+"#### Initialization, Steps to create a Pipeline\n",
+"\n",
+"The best practice is to use separate folders for scripts and its dependent files for each step and specify that folder as the `source_directory` for the step. This helps reduce the size of the snapshot created for the step (only the specific folder is snapshotted). Since changes in any files in the `source_directory` would trigger a re-upload of the snapshot, this helps keep the reuse of the step when there are no changes in the `source_directory` of the step."
 ]
 },
 {
@@ -105,7 +107,7 @@
 " aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
 "\n",
 "# source_directory\n",
-"source_directory = '.'\n",
+"source_directory = 'publish_run_train'\n",
 "# define a single step pipeline for demonstration purpose.\n",
 "trainStep = PythonScriptStep(\n",
 " name=\"Training_Step\",\n",
@@ -290,7 +290,9 @@
 "- **priority:** the priority value to use for the current job *(optional)*\n",
 "- **runtime_version:** the runtime version of the Data Lake Analytics engine *(optional)*\n",
 "- **source_directory:** folder that contains the script, assemblies etc. *(optional)*\n",
-"- **hash_paths:** list of paths to hash to detect a change (script file is always hashed) *(optional)*"
+"- **hash_paths:** list of paths to hash to detect a change (script file is always hashed) *(optional)*\n",
+"\n",
+"The best practice is to use separate folders for scripts and its dependent files for each step and specify that folder as the `source_directory` for the step. This helps reduce the size of the snapshot created for the step (only the specific folder is snapshotted). Since changes in any files in the `source_directory` would trigger a re-upload of the snapshot, this helps keep the reuse of the step when there are no changes in the `source_directory` of the step."
 ]
 },
 {
@@ -175,7 +175,7 @@
 "metadata": {},
 "source": [
 "## Data Connections with Inputs and Outputs\n",
-"The DatabricksStep supports Azure Blob and ADLS for inputs and outputs. You also will need to define a [Secrets](https://docs.azuredatabricks.net/user-guide/secrets/index.html) scope to enable authentication to external data sources such as Blob and ADLS from Databricks.\n",
+"The DatabricksStep supports DBFS, Azure Blob and ADLS for inputs and outputs. You also will need to define a [Secrets](https://docs.azuredatabricks.net/user-guide/secrets/index.html) scope to enable authentication to external data sources such as Blob and ADLS from Databricks.\n",
 "\n",
 "- Databricks documentation on [Azure Blob](https://docs.azuredatabricks.net/spark/latest/data-sources/azure/azure-storage.html)\n",
 "- Databricks documentation on [ADLS](https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake.html)\n",
@@ -108,7 +108,9 @@
 "metadata": {},
 "source": [
 "## Create an Azure ML experiment\n",
-"Let's create an experiment named \"automl-classification\" and a folder to hold the training scripts. The script runs will be recorded under the experiment in Azure.\n"
+"Let's create an experiment named \"automl-classification\" and a folder to hold the training scripts. The script runs will be recorded under the experiment in Azure.\n",
+"\n",
+"The best practice is to use separate folders for scripts and its dependent files for each step and specify that folder as the `source_directory` for the step. This helps reduce the size of the snapshot created for the step (only the specific folder is snapshotted). Since changes in any files in the `source_directory` would trigger a re-upload of the snapshot, this helps keep the reuse of the step when there are no changes in the `source_directory` of the step."
 ]
 },
 {
@@ -76,14 +76,20 @@
 "ws = Workspace.from_config()\n",
 "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')\n",
 "\n",
-"# Default datastore (Azure file storage)\n",
-"def_file_store = ws.get_default_datastore() \n",
-"print(\"Default datastore's name: {}\".format(def_file_store.name))\n",
-"\n",
+"# Default datastore (Azure blob storage)\n",
+"# def_blob_store = ws.get_default_datastore()\n",
 "def_blob_store = Datastore(ws, \"workspaceblobstore\")\n",
 "print(\"Blobstore's name: {}\".format(def_blob_store.name))"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"### Source Directory\n",
+"The best practice is to use separate folders for scripts and its dependent files for each step and specify that folder as the `source_directory` for the step. This helps reduce the size of the snapshot created for the step (only the specific folder is snapshotted). Since changes in any files in the `source_directory` would trigger a re-upload of the snapshot, this helps keep the reuse of the step when there are no changes in the `source_directory` of the step."
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -91,7 +97,7 @@
 "outputs": [],
 "source": [
 "# source directory\n",
-"source_directory = '.'\n",
+"source_directory = 'data_dependency_run_train'\n",
 " \n",
 "print('Sample scripts will be created in {} directory.'.format(source_directory))"
 ]
@@ -340,6 +346,7 @@
 "# step5 to use the intermediate data produced by step4\n",
 "# This step also produces an output processed_data2\n",
 "processed_data2 = PipelineData(\"processed_data2\", datastore=def_blob_store)\n",
+"source_directory = \"data_dependency_run_extract\"\n",
 "\n",
 "extractStep = PythonScriptStep(\n",
 " script_name=\"extract.py\",\n",
@@ -386,6 +393,7 @@
 "source": [
 "# Now define the compare step which takes two inputs and produces an output\n",
 "processed_data3 = PipelineData(\"processed_data3\", datastore=def_blob_store)\n",
+"source_directory = \"data_dependency_run_compare\"\n",
 "\n",
 "compareStep = PythonScriptStep(\n",
 " script_name=\"compare.py\",\n",
@@ -0,0 +1,24 @@
+# Copyright (c) Microsoft. All rights reserved.
+# Licensed under the MIT license.
+
+import argparse
+import os
+
+print("In compare.py")
+print("As a data scientist, this is where I use my compare code.")
+parser = argparse.ArgumentParser("compare")
+parser.add_argument("--compare_data1", type=str, help="compare_data1 data")
+parser.add_argument("--compare_data2", type=str, help="compare_data2 data")
+parser.add_argument("--output_compare", type=str, help="output_compare directory")
+parser.add_argument("--pipeline_param", type=int, help="pipeline parameter")
+
+args = parser.parse_args()
+
+print("Argument 1: %s" % args.compare_data1)
+print("Argument 2: %s" % args.compare_data2)
+print("Argument 3: %s" % args.output_compare)
+print("Argument 4: %s" % args.pipeline_param)
+
+if not (args.output_compare is None):
+    os.makedirs(args.output_compare, exist_ok=True)
+    print("%s created" % args.output_compare)
@@ -0,0 +1,21 @@
+# Copyright (c) Microsoft. All rights reserved.
+# Licensed under the MIT license.
+
+import argparse
+import os
+
+print("In extract.py")
+print("As a data scientist, this is where I use my extract code.")
+
+parser = argparse.ArgumentParser("extract")
+parser.add_argument("--input_extract", type=str, help="input_extract data")
+parser.add_argument("--output_extract", type=str, help="output_extract directory")
+
+args = parser.parse_args()
+
+print("Argument 1: %s" % args.input_extract)
+print("Argument 2: %s" % args.output_extract)
+
+if not (args.output_extract is None):
+    os.makedirs(args.output_extract, exist_ok=True)
+    print("%s created" % args.output_extract)
@@ -0,0 +1,22 @@
+# Copyright (c) Microsoft. All rights reserved.
+# Licensed under the MIT license.
+
+import argparse
+import os
+
+print("In train.py")
+print("As a data scientist, this is where I use my training code.")
+
+parser = argparse.ArgumentParser("train")
+
+parser.add_argument("--input_data", type=str, help="input data")
+parser.add_argument("--output_train", type=str, help="output_train directory")
+
+args = parser.parse_args()
+
+print("Argument 1: %s" % args.input_data)
+print("Argument 2: %s" % args.output_train)
+
+if not (args.output_train is None):
+    os.makedirs(args.output_train, exist_ok=True)
+    print("%s created" % args.output_train)
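These helper scripts only parse arguments and create their output folders; the pipeline steps earlier in this commit feed them by name. A sketch of how the names line up, assembled from the trainStep hunk above (`blob_input_data`, `processed_data1` and `aml_compute` come from the notebook, not from this file):

```python
# The step passes pipeline objects as command-line arguments...
trainStep = PythonScriptStep(
    script_name="train.py",
    arguments=["--input_data", blob_input_data, "--output_train", processed_data1],
    inputs=[blob_input_data],
    outputs=[processed_data1],
    compute_target=aml_compute,
    source_directory="publish_run_train"
)

# ...and train.py reads them back with argparse, matching the names above:
#   parser.add_argument("--input_data", ...)
#   parser.add_argument("--output_train", ...)
```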
@@ -0,0 +1,30 @@
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT License.
+import argparse
+import os
+
+print("*********************************************************")
+print("Hello Azure ML!")
+
+parser = argparse.ArgumentParser()
+parser.add_argument('--datadir', type=str, help="data directory")
+parser.add_argument('--output', type=str, help="output")
+args = parser.parse_args()
+
+print("Argument 1: %s" % args.datadir)
+print("Argument 2: %s" % args.output)
+
+if not (args.output is None):
+    os.makedirs(args.output, exist_ok=True)
+    print("%s created" % args.output)
+
+try:
+    from azureml.core import Run
+    run = Run.get_context()
+    print("Log Fibonacci numbers.")
+    run.log_list('Fibonacci numbers', [0, 1, 1, 2, 3, 5, 8, 13, 21, 34])
+    run.complete()
+except:
+    print("Warning: you need to install Azure ML SDK in order to log metrics.")
+
+print("*********************************************************")
@@ -0,0 +1,24 @@
+# Copyright (c) Microsoft. All rights reserved.
+# Licensed under the MIT license.
+
+import argparse
+import os
+
+print("In compare.py")
+print("As a data scientist, this is where I use my compare code.")
+parser = argparse.ArgumentParser("compare")
+parser.add_argument("--compare_data1", type=str, help="compare_data1 data")
+parser.add_argument("--compare_data2", type=str, help="compare_data2 data")
+parser.add_argument("--output_compare", type=str, help="output_compare directory")
+parser.add_argument("--pipeline_param", type=int, help="pipeline parameter")
+
+args = parser.parse_args()
+
+print("Argument 1: %s" % args.compare_data1)
+print("Argument 2: %s" % args.compare_data2)
+print("Argument 3: %s" % args.output_compare)
+print("Argument 4: %s" % args.pipeline_param)
+
+if not (args.output_compare is None):
+    os.makedirs(args.output_compare, exist_ok=True)
+    print("%s created" % args.output_compare)
@@ -0,0 +1,21 @@
+# Copyright (c) Microsoft. All rights reserved.
+# Licensed under the MIT license.
+
+import argparse
+import os
+
+print("In extract.py")
+print("As a data scientist, this is where I use my extract code.")
+
+parser = argparse.ArgumentParser("extract")
+parser.add_argument("--input_extract", type=str, help="input_extract data")
+parser.add_argument("--output_extract", type=str, help="output_extract directory")
+
+args = parser.parse_args()
+
+print("Argument 1: %s" % args.input_extract)
+print("Argument 2: %s" % args.output_extract)
+
+if not (args.output_extract is None):
+    os.makedirs(args.output_extract, exist_ok=True)
+    print("%s created" % args.output_extract)
@@ -0,0 +1,22 @@
+# Copyright (c) Microsoft. All rights reserved.
+# Licensed under the MIT license.
+
+import argparse
+import os
+
+print("In train.py")
+print("As a data scientist, this is where I use my training code.")
+
+parser = argparse.ArgumentParser("train")
+
+parser.add_argument("--input_data", type=str, help="input data")
+parser.add_argument("--output_train", type=str, help="output_train directory")
+
+args = parser.parse_args()
+
+print("Argument 1: %s" % args.input_data)
+print("Argument 2: %s" % args.output_train)
+
+if not (args.output_train is None):
+    os.makedirs(args.output_train, exist_ok=True)
+    print("%s created" % args.output_train)
@@ -0,0 +1,724 @@
+{
+"cells": [
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"# Track Data Drift between Training and Inference Data in Production \n",
+"\n",
+"With this notebook, you will learn how to enable the DataDrift service to automatically track and determine whether your inference data is drifting from the data your model was initially trained on. The DataDrift service provides metrics and visualizations to help stakeholders identify which specific features cause the concept drift to occur.\n",
+"\n",
+"Please email driftfeedback@microsoft.com with any issues. A member from the DataDrift team will respond shortly. \n",
+"\n",
+"The DataDrift Public Preview API can be found [here](https://docs.microsoft.com/en-us/python/api/azureml-contrib-datadrift/?view=azure-ml-py). "
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+""
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"# Prerequisites and Setup"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Install the DataDrift package\n",
+"\n",
+"Install the azureml-contrib-datadrift, azureml-opendatasets and lightgbm packages before running this notebook.\n",
+"```\n",
+"pip install azureml-contrib-datadrift\n",
+"pip install lightgbm\n",
+"```"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Import Dependencies"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"import json\n",
+"import os\n",
+"import time\n",
+"from datetime import datetime, timedelta\n",
+"\n",
+"import numpy as np\n",
+"import pandas as pd\n",
+"import requests\n",
+"from azureml.contrib.datadrift import DataDriftDetector, AlertConfiguration\n",
+"from azureml.opendatasets import NoaaIsdWeather\n",
+"from azureml.core import Dataset, Workspace, Run\n",
+"from azureml.core.compute import AksCompute, ComputeTarget\n",
+"from azureml.core.conda_dependencies import CondaDependencies\n",
+"from azureml.core.experiment import Experiment\n",
+"from azureml.core.image import ContainerImage\n",
+"from azureml.core.model import Model\n",
+"from azureml.core.webservice import Webservice, AksWebservice\n",
+"from azureml.widgets import RunDetails\n",
+"from sklearn.externals import joblib\n",
+"from sklearn.model_selection import train_test_split\n"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"## Set up Configuraton and Create Azure ML Workspace\n",
+"\n",
+"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, go through the [configuration notebook](../../../configuration.ipynb) first if you haven't already to establish your connection to the AzureML Workspace."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"# Please type in your initials/alias. The prefix is prepended to the names of resources created by this notebook. \n",
+"prefix = \"dd\"\n",
+"\n",
+"# NOTE: Please do not change the model_name, as it's required by the score.py file\n",
+"model_name = \"driftmodel\"\n",
+"image_name = \"{}driftimage\".format(prefix)\n",
+"service_name = \"{}driftservice\".format(prefix)\n",
+"\n",
+"# optionally, set email address to receive an email alert for DataDrift\n",
+"email_address = \"\""
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"ws = Workspace.from_config()\n",
+"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Generate Train/Testing Data\n",
|
||||||
|
"\n",
|
||||||
|
"For this demo, we will use NOAA weather data from [Azure Open Datasets](https://azure.microsoft.com/services/open-datasets/). You may replace this step with your own dataset. "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"usaf_list = ['725724', '722149', '723090', '722159', '723910', '720279',\n",
|
||||||
|
" '725513', '725254', '726430', '720381', '723074', '726682',\n",
|
||||||
|
" '725486', '727883', '723177', '722075', '723086', '724053',\n",
|
||||||
|
" '725070', '722073', '726060', '725224', '725260', '724520',\n",
|
||||||
|
" '720305', '724020', '726510', '725126', '722523', '703333',\n",
|
||||||
|
" '722249', '722728', '725483', '722972', '724975', '742079',\n",
|
||||||
|
" '727468', '722193', '725624', '722030', '726380', '720309',\n",
|
||||||
|
" '722071', '720326', '725415', '724504', '725665', '725424',\n",
|
||||||
|
" '725066']\n",
|
||||||
|
"\n",
|
||||||
|
"columns = ['usaf', 'wban', 'datetime', 'latitude', 'longitude', 'elevation', 'windAngle', 'windSpeed', 'temperature', 'stationName', 'p_k']\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"def enrich_weather_noaa_data(noaa_df):\n",
|
||||||
|
" hours_in_day = 23\n",
|
||||||
|
" week_in_year = 52\n",
|
||||||
|
" \n",
|
||||||
|
" noaa_df[\"hour\"] = noaa_df[\"datetime\"].dt.hour\n",
|
||||||
|
" noaa_df[\"weekofyear\"] = noaa_df[\"datetime\"].dt.week\n",
|
||||||
|
" \n",
|
||||||
|
" noaa_df[\"sine_weekofyear\"] = noaa_df['datetime'].transform(lambda x: np.sin((2*np.pi*x.dt.week-1)/week_in_year))\n",
|
||||||
|
" noaa_df[\"cosine_weekofyear\"] = noaa_df['datetime'].transform(lambda x: np.cos((2*np.pi*x.dt.week-1)/week_in_year))\n",
|
||||||
|
"\n",
|
||||||
|
" noaa_df[\"sine_hourofday\"] = noaa_df['datetime'].transform(lambda x: np.sin(2*np.pi*x.dt.hour/hours_in_day))\n",
|
||||||
|
" noaa_df[\"cosine_hourofday\"] = noaa_df['datetime'].transform(lambda x: np.cos(2*np.pi*x.dt.hour/hours_in_day))\n",
|
||||||
|
" \n",
|
||||||
|
" return noaa_df\n",
|
||||||
|
"\n",
|
||||||
|
"def add_window_col(input_df):\n",
|
||||||
|
" shift_interval = pd.Timedelta('-7 days') # your X days interval\n",
|
||||||
|
" df_shifted = input_df.copy()\n",
|
||||||
|
" df_shifted['datetime'] = df_shifted['datetime'] - shift_interval\n",
|
||||||
|
" df_shifted.drop(list(input_df.columns.difference(['datetime', 'usaf', 'wban', 'sine_hourofday', 'temperature'])), axis=1, inplace=True)\n",
|
||||||
|
"\n",
|
||||||
|
" # merge, keeping only observations where -1 lag is present\n",
|
||||||
|
" df2 = pd.merge(input_df,\n",
|
||||||
|
" df_shifted,\n",
|
||||||
|
" on=['datetime', 'usaf', 'wban', 'sine_hourofday'],\n",
|
||||||
|
" how='inner', # use 'left' to keep observations without lags\n",
|
||||||
|
" suffixes=['', '-7'])\n",
|
||||||
|
" return df2\n",
|
||||||
|
"\n",
|
||||||
|
"def get_noaa_data(start_time, end_time, cols, station_list):\n",
|
||||||
|
" isd = NoaaIsdWeather(start_time, end_time, cols=cols)\n",
|
||||||
|
" # Read into Pandas data frame.\n",
|
||||||
|
" noaa_df = isd.to_pandas_dataframe()\n",
|
||||||
|
" noaa_df = noaa_df.rename(columns={\"stationName\": \"station_name\"})\n",
|
||||||
|
" \n",
|
||||||
|
" df_filtered = noaa_df[noaa_df[\"usaf\"].isin(station_list)]\n",
|
||||||
|
" df_filtered.reset_index(drop=True)\n",
|
||||||
|
" \n",
|
||||||
|
" # Enrich with time features\n",
|
||||||
|
" df_enriched = enrich_weather_noaa_data(df_filtered)\n",
|
||||||
|
" \n",
|
||||||
|
" return df_enriched\n",
|
||||||
|
"\n",
|
||||||
|
"def get_featurized_noaa_df(start_time, end_time, cols, station_list):\n",
|
||||||
|
" df_1 = get_noaa_data(start_time - timedelta(days=7), start_time - timedelta(seconds=1), cols, station_list)\n",
|
||||||
|
" df_2 = get_noaa_data(start_time, end_time, cols, station_list)\n",
|
||||||
|
" noaa_df = pd.concat([df_1, df_2])\n",
|
||||||
|
" \n",
|
||||||
|
" print(\"Adding window feature\")\n",
|
||||||
|
" df_window = add_window_col(noaa_df)\n",
|
||||||
|
" \n",
|
||||||
|
" cat_columns = df_window.dtypes == object\n",
|
||||||
|
" cat_columns = cat_columns[cat_columns == True]\n",
|
||||||
|
" \n",
|
||||||
|
" print(\"Encoding categorical columns\")\n",
|
||||||
|
" df_encoded = pd.get_dummies(df_window, columns=cat_columns.keys().tolist())\n",
|
||||||
|
" \n",
|
||||||
|
" print(\"Dropping unnecessary columns\")\n",
|
||||||
|
" df_featurized = df_encoded.drop(['windAngle', 'windSpeed', 'datetime', 'elevation'], axis=1).dropna().drop_duplicates()\n",
|
||||||
|
" \n",
|
||||||
|
" return df_featurized"
|
||||||
|
]
|
||||||
|
},
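The sine/cosine pairs above are a standard cyclical encoding: they map hour 23 and hour 0 (or week 52 and week 1) to nearby points on a circle instead of opposite ends of a numeric range. A tiny standalone illustration of the same transform used in `enrich_weather_noaa_data`:

```python
import numpy as np

hours_in_day = 23
hours = np.array([0, 6, 12, 18, 23])

# Same transform that enrich_weather_noaa_data applies to the datetime column.
sine = np.sin(2 * np.pi * hours / hours_in_day)
cosine = np.cos(2 * np.pi * hours / hours_in_day)

# hour 0 -> (0.0, 1.0) and hour 23 -> (0.0, 1.0): adjacent on the circle,
# even though the raw values 0 and 23 are far apart numerically.
for h, s, c in zip(hours, sine.round(2), cosine.round(2)):
    print(h, s, c)
```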
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Train model on Jan 1 - 14, 2009 data\n",
|
||||||
|
"df = get_featurized_noaa_df(datetime(2009, 1, 1), datetime(2009, 1, 14, 23, 59, 59), columns, usaf_list)\n",
|
||||||
|
"df.head()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"label = \"temperature\"\n",
|
||||||
|
"x_df = df.drop(label, axis=1)\n",
|
||||||
|
"y_df = df[[label]]\n",
|
||||||
|
"x_train, x_test, y_train, y_test = train_test_split(df, y_df, test_size=0.2, random_state=223)\n",
|
||||||
|
"print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)\n",
|
||||||
|
"\n",
|
||||||
|
"training_dir = 'outputs/training'\n",
|
||||||
|
"training_file = \"training.csv\"\n",
|
||||||
|
"\n",
|
||||||
|
"# Generate training dataframe to register as Training Dataset\n",
|
||||||
|
"os.makedirs(training_dir, exist_ok=True)\n",
|
||||||
|
"training_df = pd.merge(x_train.drop(label, axis=1), y_train, left_index=True, right_index=True)\n",
|
||||||
|
"training_df.to_csv(training_dir + \"/\" + training_file)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Create/Register Training Dataset"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"dataset_name = \"dataset\"\n",
|
||||||
|
"name_suffix = datetime.utcnow().strftime(\"%Y-%m-%d-%H-%M-%S\")\n",
|
||||||
|
"snapshot_name = \"snapshot-{}\".format(name_suffix)\n",
|
||||||
|
"\n",
|
||||||
|
"dstore = ws.get_default_datastore()\n",
|
||||||
|
"dstore.upload(training_dir, \"data/training\", show_progress=True)\n",
|
||||||
|
"dpath = dstore.path(\"data/training/training.csv\")\n",
|
||||||
|
"trainingDataset = Dataset.auto_read_files(dpath, include_path=True)\n",
|
||||||
|
"trainingDataset = trainingDataset.register(workspace=ws, name=dataset_name, description=\"dset\", exist_ok=True)\n",
|
||||||
|
"\n",
|
||||||
|
"datasets = [(Dataset.Scenario.TRAINING, trainingDataset)]\n",
|
||||||
|
"print(\"dataset registration done.\\n\")\n",
|
||||||
|
"datasets"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Train and Save Model"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import lightgbm as lgb\n",
|
||||||
|
"\n",
|
||||||
|
"train = lgb.Dataset(data=x_train, \n",
|
||||||
|
" label=y_train)\n",
|
||||||
|
"\n",
|
||||||
|
"test = lgb.Dataset(data=x_test, \n",
|
||||||
|
" label=y_test,\n",
|
||||||
|
" reference=train)\n",
|
||||||
|
"\n",
|
||||||
|
"params = {'learning_rate' : 0.1,\n",
|
||||||
|
" 'boosting' : 'gbdt',\n",
|
||||||
|
" 'metric' : 'rmse',\n",
|
||||||
|
" 'feature_fraction' : 1,\n",
|
||||||
|
" 'bagging_fraction' : 1,\n",
|
||||||
|
" 'max_depth': 6,\n",
|
||||||
|
" 'num_leaves' : 31,\n",
|
||||||
|
" 'objective' : 'regression',\n",
|
||||||
|
" 'bagging_freq' : 1,\n",
|
||||||
|
" \"verbose\": -1,\n",
|
||||||
|
" 'min_data_per_leaf': 100}\n",
|
||||||
|
"\n",
|
||||||
|
"model = lgb.train(params, \n",
|
||||||
|
" num_boost_round=500,\n",
|
||||||
|
" train_set=train,\n",
|
||||||
|
" valid_sets=[train, test],\n",
|
||||||
|
" verbose_eval=50,\n",
|
||||||
|
" early_stopping_rounds=25)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"model_file = 'outputs/{}.pkl'.format(model_name)\n",
|
||||||
|
"\n",
|
||||||
|
"os.makedirs('outputs', exist_ok=True)\n",
|
||||||
|
"joblib.dump(model, model_file)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Register Model"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"model = Model.register(model_path=model_file,\n",
|
||||||
|
" model_name=model_name,\n",
|
||||||
|
" workspace=ws,\n",
|
||||||
|
" datasets=datasets)\n",
|
||||||
|
"\n",
|
||||||
|
"print(model_name, image_name, service_name, model)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Deploy Model To AKS"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": []
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Prepare Environment"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn', 'joblib', 'lightgbm', 'pandas'],\n",
|
||||||
|
" pip_packages=['azureml-monitoring', 'azureml-sdk[automl]'])\n",
|
||||||
|
"\n",
|
||||||
|
"with open(\"myenv.yml\",\"w\") as f:\n",
|
||||||
|
" f.write(myenv.serialize_to_string())"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Create Image"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Image creation may take up to 15 minutes.\n",
|
||||||
|
"\n",
|
||||||
|
"image_name = image_name + str(model.version)\n",
|
||||||
|
"\n",
|
||||||
|
"if not image_name in ws.images:\n",
|
||||||
|
" # Use the score.py defined in this directory as the execution script\n",
|
||||||
|
" # NOTE: The Model Data Collector must be enabled in the execution script for DataDrift to run correctly\n",
|
||||||
|
" image_config = ContainerImage.image_configuration(execution_script=\"score.py\",\n",
|
||||||
|
" runtime=\"python\",\n",
|
||||||
|
" conda_file=\"myenv.yml\",\n",
|
||||||
|
" description=\"Image with weather dataset model\")\n",
|
||||||
|
" image = ContainerImage.create(name=image_name,\n",
|
||||||
|
" models=[model],\n",
|
||||||
|
" image_config=image_config,\n",
|
||||||
|
" workspace=ws)\n",
|
||||||
|
"\n",
|
||||||
|
" image.wait_for_creation(show_output=True)\n",
|
||||||
|
"else:\n",
|
||||||
|
" image = ws.images[image_name]"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Create Compute Target"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"aks_name = 'dd-demo-e2e'\n",
|
||||||
|
"prov_config = AksCompute.provisioning_configuration()\n",
|
||||||
|
"\n",
|
||||||
|
"if not aks_name in ws.compute_targets:\n",
|
||||||
|
" aks_target = ComputeTarget.create(workspace=ws,\n",
|
||||||
|
" name=aks_name,\n",
|
||||||
|
" provisioning_configuration=prov_config)\n",
|
||||||
|
"\n",
|
||||||
|
" aks_target.wait_for_completion(show_output=True)\n",
|
||||||
|
" print(aks_target.provisioning_state)\n",
|
||||||
|
" print(aks_target.provisioning_errors)\n",
|
||||||
|
"else:\n",
|
||||||
|
" aks_target=ws.compute_targets[aks_name]"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Deploy Service"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"aks_service_name = service_name\n",
|
||||||
|
"\n",
|
||||||
|
"if not aks_service_name in ws.webservices:\n",
|
||||||
|
" aks_config = AksWebservice.deploy_configuration(collect_model_data=True, enable_app_insights=True)\n",
|
||||||
|
" aks_service = Webservice.deploy_from_image(workspace=ws,\n",
|
||||||
|
" name=aks_service_name,\n",
|
||||||
|
" image=image,\n",
|
||||||
|
" deployment_config=aks_config,\n",
|
||||||
|
" deployment_target=aks_target)\n",
|
||||||
|
" aks_service.wait_for_deployment(show_output=True)\n",
|
||||||
|
" print(aks_service.state)\n",
|
||||||
|
"else:\n",
|
||||||
|
" aks_service = ws.webservices[aks_service_name]"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Run DataDrift Analysis"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Send Scoring Data to Service"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Download Scoring Data"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Score Model on March 15, 2016 data\n",
|
||||||
|
"scoring_df = get_noaa_data(datetime(2016, 3, 15) - timedelta(days=7), datetime(2016, 3, 16), columns, usaf_list)\n",
|
||||||
|
"# Add the window feature column\n",
|
||||||
|
"scoring_df = add_window_col(scoring_df)\n",
|
||||||
|
"\n",
|
||||||
|
"# Drop features not used by the model\n",
|
||||||
|
"print(\"Dropping unnecessary columns\")\n",
|
||||||
|
"scoring_df = scoring_df.drop(['windAngle', 'windSpeed', 'datetime', 'elevation'], axis=1).dropna()\n",
|
||||||
|
"scoring_df.head()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# One Hot Encode the scoring dataset to match the training dataset schema\n",
|
||||||
|
"columns_dict = model.datasets[\"training\"][0].get_profile().columns\n",
|
||||||
|
"extra_cols = ('Path', 'Column1')\n",
|
||||||
|
"for k in extra_cols:\n",
|
||||||
|
" columns_dict.pop(k, None)\n",
|
||||||
|
"training_columns = list(columns_dict.keys())\n",
|
||||||
|
"\n",
|
||||||
|
"categorical_columns = scoring_df.dtypes == object\n",
|
||||||
|
"categorical_columns = categorical_columns[categorical_columns == True]\n",
|
||||||
|
"\n",
|
||||||
|
"test_df = pd.get_dummies(scoring_df[categorical_columns.keys().tolist()])\n",
|
||||||
|
"encoded_df = scoring_df.join(test_df)\n",
|
||||||
|
"\n",
|
||||||
|
"# Populate missing OHE columns with 0 values to match traning dataset schema\n",
|
||||||
|
"difference = list(set(training_columns) - set(encoded_df.columns.tolist()))\n",
|
||||||
|
"for col in difference:\n",
|
||||||
|
" encoded_df[col] = 0\n",
|
||||||
|
"encoded_df.head()"
|
||||||
|
]
|
||||||
|
},
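The loop above back-fills the missing one-hot columns one at a time. A slightly more compact alternative, sketched here and not part of the notebook, is a single `DataFrame.reindex` call. It keeps the existing column order and the raw categorical columns that score.py still expects.

```python
# Hedged alternative to the back-fill loop above: add, in one call, every training
# column the scoring frame is missing, without disturbing the existing columns.
missing_cols = [c for c in training_columns if c not in encoded_df.columns]
encoded_df = encoded_df.reindex(columns=encoded_df.columns.tolist() + missing_cols, fill_value=0)
encoded_df.head()
```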
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Serialize dataframe to list of row dictionaries\n",
|
||||||
|
"encoded_dict = encoded_df.to_dict('records')"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Submit Scoring Data to Service"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%time\n",
|
||||||
|
"\n",
|
||||||
|
"# retreive the API keys. AML generates two keys.\n",
|
||||||
|
"key1, key2 = aks_service.get_keys()\n",
|
||||||
|
"\n",
|
||||||
|
"total_count = len(scoring_df)\n",
|
||||||
|
"i = 0\n",
|
||||||
|
"load = []\n",
|
||||||
|
"for row in encoded_dict:\n",
|
||||||
|
" load.append(row)\n",
|
||||||
|
" i = i + 1\n",
|
||||||
|
" if i % 100 == 0:\n",
|
||||||
|
" payload = json.dumps({\"data\": load})\n",
|
||||||
|
" \n",
|
||||||
|
" # construct raw HTTP request and send to the service\n",
|
||||||
|
" payload_binary = bytes(payload,encoding = 'utf8')\n",
|
||||||
|
" headers = {'Content-Type':'application/json', 'Authorization': 'Bearer ' + key1}\n",
|
||||||
|
" resp = requests.post(aks_service.scoring_uri, payload_binary, headers=headers)\n",
|
||||||
|
" \n",
|
||||||
|
" print(\"prediction:\", resp.content, \"Progress: {}/{}\".format(i, total_count)) \n",
|
||||||
|
"\n",
|
||||||
|
" load = []\n",
|
||||||
|
" time.sleep(3)"
|
||||||
|
]
|
||||||
|
},
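Before looping over the full scoring set, a quick smoke test through the SDK helper can confirm the service responds at all; a minimal sketch using the `encoded_dict` rows prepared above (the `Webservice.run` call handles authentication for you):

```python
import json

# Hedged smoke test: score the first five rows through the SDK instead of raw HTTP.
sample_payload = json.dumps({"data": encoded_dict[:5]})
print(aks_service.run(input_data=sample_payload))
```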
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"We need to wait up to 10 minutes for the Model Data Collector to dump the model input and inference data to storage in the Workspace, where it's used by the DataDriftDetector job."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"time.sleep(600)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Configure DataDrift"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"services = [service_name]\n",
|
||||||
|
"start = datetime.now() - timedelta(days=2)\n",
|
||||||
|
"end = datetime(year=2020, month=1, day=22, hour=15, minute=16)\n",
|
||||||
|
"feature_list = ['usaf', 'wban', 'latitude', 'longitude', 'station_name', 'p_k', 'sine_hourofday', 'cosine_hourofday', 'temperature-7']\n",
|
||||||
|
"alert_config = AlertConfiguration([email_address]) if email_address else None\n",
|
||||||
|
"\n",
|
||||||
|
"# there will be an exception indicating using get() method if DataDrift object already exist\n",
|
||||||
|
"try:\n",
|
||||||
|
" datadrift = DataDriftDetector.create(ws, model.name, model.version, services, frequency=\"Day\", alert_config=alert_config)\n",
|
||||||
|
"except KeyError:\n",
|
||||||
|
" datadrift = DataDriftDetector.get(ws, model.name, model.version)\n",
|
||||||
|
" \n",
|
||||||
|
"print(\"Details of DataDrift Object:\\n{}\".format(datadrift))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Run an Adhoc DataDriftDetector Run"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"target_date = datetime.today()\n",
|
||||||
|
"run = datadrift.run(target_date, services, feature_list=feature_list, create_compute_target=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"exp = Experiment(ws, datadrift._id)\n",
|
||||||
|
"dd_run = Run(experiment=exp, run_id=run)\n",
|
||||||
|
"RunDetails(dd_run).show()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Get Drift Analysis Results"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"children = list(dd_run.get_children())\n",
|
||||||
|
"for child in children:\n",
|
||||||
|
" child.wait_for_completion()\n",
|
||||||
|
"\n",
|
||||||
|
"drift_metrics = datadrift.get_output(start_time=start, end_time=end)\n",
|
||||||
|
"drift_metrics"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Show all drift figures, one per serivice.\n",
|
||||||
|
"# If setting with_details is False (by default), only drift will be shown; if it's True, all details will be shown.\n",
|
||||||
|
"\n",
|
||||||
|
"drift_figures = datadrift.show(with_details=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Enable DataDrift Schedule"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"datadrift.enable_schedule()"
|
||||||
|
]
|
||||||
|
}
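Once the schedule is enabled, the detector runs at the configured frequency until it is turned off again. If you are only experimenting, remember to stop it and delete the demo service to avoid ongoing cost; a minimal clean-up sketch, assuming the objects defined above:

```python
# Hedged clean-up sketch: stop the scheduled drift runs and remove the demo service.
datadrift.disable_schedule()
aks_service.delete()
```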
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"authors": [
|
||||||
|
{
|
||||||
|
"name": "rafarmah"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3.6",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python36"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.6.6"
|
||||||
|
},
|
||||||
|
"notice": "Copyright (c) Microsoft Corporation. All rights reserved. Licensed under the MIT License."
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 2
|
||||||
|
}
|
||||||
@@ -0,0 +1,8 @@
|
|||||||
|
name: azure-ml-datadrift
|
||||||
|
dependencies:
|
||||||
|
- pip:
|
||||||
|
- azureml-sdk
|
||||||
|
- azureml-contrib-datadrift
|
||||||
|
- azureml-opendatasets
|
||||||
|
- lightgbm
|
||||||
|
- azureml-widgets
|
||||||
58
how-to-use-azureml/monitor-models/data-drift/score.py
Normal file
@@ -0,0 +1,58 @@
|
|||||||
|
import pickle
|
||||||
|
import json
|
||||||
|
import numpy
|
||||||
|
import azureml.train.automl
|
||||||
|
from sklearn.externals import joblib
|
||||||
|
from sklearn.linear_model import Ridge
|
||||||
|
from azureml.core.model import Model
|
||||||
|
from azureml.core.run import Run
|
||||||
|
from azureml.monitoring import ModelDataCollector
|
||||||
|
import time
|
||||||
|
import pandas as pd
|
||||||
|
|
||||||
|
|
||||||
|
def init():
|
||||||
|
global model, inputs_dc, prediction_dc, feature_names, categorical_features
|
||||||
|
|
||||||
|
print("Model is initialized" + time.strftime("%H:%M:%S"))
|
||||||
|
model_path = Model.get_model_path(model_name="driftmodel")
|
||||||
|
model = joblib.load(model_path)
|
||||||
|
|
||||||
|
feature_names = ["usaf", "wban", "latitude", "longitude", "station_name", "p_k",
|
||||||
|
"sine_weekofyear", "cosine_weekofyear", "sine_hourofday", "cosine_hourofday",
|
||||||
|
"temperature-7"]
|
||||||
|
|
||||||
|
categorical_features = ["usaf", "wban", "p_k", "station_name"]
|
||||||
|
|
||||||
|
inputs_dc = ModelDataCollector(model_name="driftmodel",
|
||||||
|
identifier="inputs",
|
||||||
|
feature_names=feature_names)
|
||||||
|
|
||||||
|
prediction_dc = ModelDataCollector("driftmodel",
|
||||||
|
identifier="predictions",
|
||||||
|
feature_names=["temperature"])
|
||||||
|
|
||||||
|
|
||||||
|
def run(raw_data):
|
||||||
|
global inputs_dc, prediction_dc
|
||||||
|
|
||||||
|
try:
|
||||||
|
data = json.loads(raw_data)["data"]
|
||||||
|
data = pd.DataFrame(data)
|
||||||
|
|
||||||
|
# Remove the categorical features as the model expects OHE values
|
||||||
|
input_data = data.drop(categorical_features, axis=1)
|
||||||
|
|
||||||
|
result = model.predict(input_data)
|
||||||
|
|
||||||
|
# Collect the non-OHE dataframe
|
||||||
|
collected_df = data[feature_names]
|
||||||
|
|
||||||
|
inputs_dc.collect(collected_df.values)
|
||||||
|
prediction_dc.collect(result)
|
||||||
|
return result.tolist()
|
||||||
|
except Exception as e:
|
||||||
|
error = str(e)
|
||||||
|
|
||||||
|
print(error + time.strftime("%H:%M:%S"))
|
||||||
|
return error
|
||||||
@@ -153,7 +153,11 @@
|
|||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {
|
||||||
|
"tags": [
|
||||||
|
"tensorboard-export-sample"
|
||||||
|
]
|
||||||
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"# Export Run History to Tensorboard logs\n",
|
"# Export Run History to Tensorboard logs\n",
|
||||||
|
|||||||
@@ -227,7 +227,11 @@
|
|||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {
|
||||||
|
"tags": [
|
||||||
|
"tensorboard-sample"
|
||||||
|
]
|
||||||
|
},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"from azureml.tensorboard import Tensorboard\n",
|
"from azureml.tensorboard import Tensorboard\n",
|
||||||
|
|||||||
@@ -1,5 +1,6 @@
|
|||||||
|
|
||||||
import argparse
|
import argparse
|
||||||
|
import os
|
||||||
|
|
||||||
import numpy as np
|
import numpy as np
|
||||||
|
|
||||||
@@ -131,6 +132,8 @@ def main():
|
|||||||
|
|
||||||
run.log("Accuracy", np.float(val_accuracy))
|
run.log("Accuracy", np.float(val_accuracy))
|
||||||
|
|
||||||
|
serializers.save_npz(os.path.join(args.output_dir, 'model.npz'), model)
|
||||||
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
if __name__ == '__main__':
|
||||||
main()
|
main()
|
||||||
|
|||||||
@@ -0,0 +1,45 @@
|
|||||||
|
import numpy as np
|
||||||
|
import os
|
||||||
|
import json
|
||||||
|
|
||||||
|
from chainer import serializers, using_config, Variable, datasets
|
||||||
|
import chainer.functions as F
|
||||||
|
import chainer.links as L
|
||||||
|
from chainer import Chain
|
||||||
|
|
||||||
|
from azureml.core.model import Model
|
||||||
|
|
||||||
|
|
||||||
|
class MyNetwork(Chain):
|
||||||
|
|
||||||
|
def __init__(self, n_mid_units=100, n_out=10):
|
||||||
|
super(MyNetwork, self).__init__()
|
||||||
|
with self.init_scope():
|
||||||
|
self.l1 = L.Linear(None, n_mid_units)
|
||||||
|
self.l2 = L.Linear(n_mid_units, n_mid_units)
|
||||||
|
self.l3 = L.Linear(n_mid_units, n_out)
|
||||||
|
|
||||||
|
def forward(self, x):
|
||||||
|
h = F.relu(self.l1(x))
|
||||||
|
h = F.relu(self.l2(h))
|
||||||
|
return self.l3(h)
|
||||||
|
|
||||||
|
|
||||||
|
def init():
|
||||||
|
global model
|
||||||
|
|
||||||
|
model_root = Model.get_model_path('chainer-dnn-mnist')
|
||||||
|
|
||||||
|
# Load our saved artifacts
|
||||||
|
model = MyNetwork()
|
||||||
|
serializers.load_npz(model_root, model)
|
||||||
|
|
||||||
|
|
||||||
|
def run(input_data):
|
||||||
|
i = np.array(json.loads(input_data)['data'])
|
||||||
|
|
||||||
|
_, test = datasets.get_mnist()
|
||||||
|
x = Variable(np.asarray([test[i][0]]))
|
||||||
|
y = model(x)
|
||||||
|
|
||||||
|
return np.ndarray.tolist(y.data.argmax(axis=1))
|
||||||
@@ -45,6 +45,16 @@
|
|||||||
"print(\"SDK version:\", azureml.core.VERSION)"
|
"print(\"SDK version:\", azureml.core.VERSION)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"!jupyter nbextension install --py --user azureml.widgets\n",
|
||||||
|
"!jupyter nbextension enable --py --user azureml.widgets"
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -121,6 +131,7 @@
|
|||||||
"except ComputeTargetException:\n",
|
"except ComputeTargetException:\n",
|
||||||
" print('Creating a new compute target...')\n",
|
" print('Creating a new compute target...')\n",
|
||||||
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', \n",
|
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', \n",
|
||||||
|
" min_nodes=2,\n",
|
||||||
" max_nodes=4)\n",
|
" max_nodes=4)\n",
|
||||||
"\n",
|
"\n",
|
||||||
" # create the cluster\n",
|
" # create the cluster\n",
|
||||||
@@ -206,7 +217,8 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"import shutil\n",
|
"import shutil\n",
|
||||||
"\n",
|
"\n",
|
||||||
"shutil.copy('chainer_mnist.py', project_folder)"
|
"shutil.copy('chainer_mnist.py', project_folder)\n",
|
||||||
|
"shutil.copy('chainer_score.py', project_folder)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -353,6 +365,7 @@
|
|||||||
"hyperdrive_config = HyperDriveConfig(estimator=estimator,\n",
|
"hyperdrive_config = HyperDriveConfig(estimator=estimator,\n",
|
||||||
" hyperparameter_sampling=param_sampling, \n",
|
" hyperparameter_sampling=param_sampling, \n",
|
||||||
" primary_metric_name='Accuracy',\n",
|
" primary_metric_name='Accuracy',\n",
|
||||||
|
" policy=BanditPolicy(evaluation_interval=1, slack_factor=0.1, delay_evaluation=3),\n",
|
||||||
" primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,\n",
|
" primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,\n",
|
||||||
" max_total_runs=8,\n",
|
" max_total_runs=8,\n",
|
||||||
" max_concurrent_runs=4)\n"
|
" max_concurrent_runs=4)\n"
|
||||||
@@ -398,14 +411,344 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"run.wait_for_completion(show_output=True)"
|
"hyperdrive_run.wait_for_completion(show_output=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Find and register best model\n",
|
||||||
|
"When all jobs finish, we can find out the one that has the highest accuracy."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"best_run = hyperdrive_run.get_best_run_by_primary_metric()\n",
|
||||||
|
"print(best_run.get_details()['runDefinition']['arguments'])"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Now, let's list the model files uploaded during the run."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"print(best_run.get_file_names())"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"We can then register the folder (and all files in it) as a model named `chainer-dnn-mnist` under the workspace for deployment"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"model = best_run.register_model(model_name='chainer-dnn-mnist', model_path='outputs/model.npz')"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Deploy the model in ACI\n",
|
||||||
|
"Now, we are ready to deploy the model as a web service running in Azure Container Instance, [ACI](https://azure.microsoft.com/en-us/services/container-instances/). Azure Machine Learning accomplishes this by constructing a Docker image with the scoring logic and model baked in.\n",
|
||||||
|
"\n",
|
||||||
|
"### Create scoring script\n",
|
||||||
|
"First, we will create a scoring script that will be invoked by the web service call.\n",
|
||||||
|
"+ Now that the scoring script must have two required functions, `init()` and `run(input_data)`.\n",
|
||||||
|
" + In `init()`, you typically load the model into a global object. This function is executed only once when the Docker contianer is started.\n",
|
||||||
|
" + In `run(input_data)`, the model is used to predict a value based on the input data. The input and output to `run` uses NPZ as the serialization and de-serialization format because it is the preferred format for Chainer, but you are not limited to it.\n",
|
||||||
|
" \n",
|
||||||
|
"Refer to the scoring script `chainer_score.py` for this tutorial. Our web service will use this file to predict. When writing your own scoring script, don't forget to test it locally first before you go and deploy the web service."
|
||||||
|
]
|
||||||
|
},
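A quick way to exercise `chainer_score.py` locally before deploying is to download the registered model into the folder layout that `Model.get_model_path` searches outside a service and then call `init()`/`run()` directly. This is a rough sketch only: it assumes chainer is installed locally, that the registered model is version 1, and that `azureml-models/<name>/<version>/` is where the local lookup happens.

```python
# Hedged local smoke test for chainer_score.py (run it before deploying).
import json

# Place model.npz where Model.get_model_path() can find it locally (version 1 assumed).
model.download(target_dir='azureml-models/chainer-dnn-mnist/1', exist_ok=True)

import chainer_score            # the scoring script in this folder
chainer_score.init()            # loads model.npz once, as the service would
print(chainer_score.run(json.dumps({'data': [0]})))   # score MNIST test image index 0
```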
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"shutil.copy('chainer_score.py', project_folder)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Create myenv.yml\n",
|
||||||
|
"We also need to create an environment file so that Azure Machine Learning can install the necessary packages in the Docker image which are required by your scoring script. In this case, we need to specify conda packages `numpy` and `chainer`."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core.runconfig import CondaDependencies\n",
|
||||||
|
"\n",
|
||||||
|
"cd = CondaDependencies.create()\n",
|
||||||
|
"cd.add_conda_package('numpy')\n",
|
||||||
|
"cd.add_conda_package('chainer')\n",
|
||||||
|
"cd.save_to_file(base_directory='./', conda_file_path='myenv.yml')\n",
|
||||||
|
"\n",
|
||||||
|
"print(cd.serialize_to_string())"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Deploy to ACI\n",
|
||||||
|
"We are almost ready to deploy. Create a deployment configuration and specify the number of CPUs and gigabytes of RAM needed for your ACI container."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core.webservice import AciWebservice\n",
|
||||||
|
"\n",
|
||||||
|
"aciconfig = AciWebservice.deploy_configuration(cpu_cores=1,\n",
|
||||||
|
" auth_enabled=True, # this flag generates API keys to secure access\n",
|
||||||
|
" memory_gb=1,\n",
|
||||||
|
" tags={'name': 'mnist', 'framework': 'Chainer'},\n",
|
||||||
|
" description='Chainer DNN with MNIST')"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"**Deployment Process**\n",
|
||||||
|
"\n",
|
||||||
|
"Now we can deploy. **This cell will run for about 7-8 minutes.** Behind the scenes, it will do the following:\n",
|
||||||
|
"\n",
|
||||||
|
"1. **Build Docker image**\n",
|
||||||
|
"Build a Docker image using the scoring file (chainer_score.py), the environment file (myenv.yml), and the model object.\n",
|
||||||
|
"2. **Register image**\n",
|
||||||
|
"Register that image under the workspace.\n",
|
||||||
|
"3. **Ship to ACI**\n",
|
||||||
|
"And finally ship the image to the ACI infrastructure, start up a container in ACI using that image, and expose an HTTP endpoint to accept REST client calls."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core.image import ContainerImage\n",
|
||||||
|
"\n",
|
||||||
|
"imgconfig = ContainerImage.image_configuration(execution_script=\"chainer_score.py\", \n",
|
||||||
|
" runtime=\"python\", \n",
|
||||||
|
" conda_file=\"myenv.yml\")"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%%time\n",
|
||||||
|
"from azureml.core.webservice import Webservice\n",
|
||||||
|
"\n",
|
||||||
|
"service = Webservice.deploy_from_model(workspace=ws,\n",
|
||||||
|
" name='chainer-mnist-1',\n",
|
||||||
|
" deployment_config=aciconfig,\n",
|
||||||
|
" models=[model],\n",
|
||||||
|
" image_config=imgconfig)\n",
|
||||||
|
"\n",
|
||||||
|
"service.wait_for_deployment(show_output=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"print(service.get_logs())"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"print(service.scoring_uri)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"**Tip: If something goes wrong with the deployment, the first thing to look at is the logs from the service by running the following command:** `print(service.get_logs())`"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"This is the scoring web service endpoint: `print(service.scoring_uri)`"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Test the deployed model\n",
|
||||||
|
"Let's test the deployed model. Pick a random sample from the test set, and send it to the web service hosted in ACI for a prediction. Note, here we are using the an HTTP request to invoke the service.\n",
|
||||||
|
"\n",
|
||||||
|
"We can retrieve the API keys used for accessing the HTTP endpoint and construct a raw HTTP request to send to the service. Don't forget to add key to the HTTP header."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# retreive the API keys. two keys were generated.\n",
|
||||||
|
"key1, Key2 = service.get_keys()\n",
|
||||||
|
"print(key1)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"%matplotlib inline\n",
|
||||||
|
"import matplotlib.pyplot as plt\n",
|
||||||
|
"import urllib\n",
|
||||||
|
"import gzip\n",
|
||||||
|
"import numpy as np\n",
|
||||||
|
"import struct\n",
|
||||||
|
"import requests\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"# load compressed MNIST gz files and return numpy arrays\n",
|
||||||
|
"def load_data(filename, label=False):\n",
|
||||||
|
" with gzip.open(filename) as gz:\n",
|
||||||
|
" struct.unpack('I', gz.read(4))\n",
|
||||||
|
" n_items = struct.unpack('>I', gz.read(4))\n",
|
||||||
|
" if not label:\n",
|
||||||
|
" n_rows = struct.unpack('>I', gz.read(4))[0]\n",
|
||||||
|
" n_cols = struct.unpack('>I', gz.read(4))[0]\n",
|
||||||
|
" res = np.frombuffer(gz.read(n_items[0] * n_rows * n_cols), dtype=np.uint8)\n",
|
||||||
|
" res = res.reshape(n_items[0], n_rows * n_cols)\n",
|
||||||
|
" else:\n",
|
||||||
|
" res = np.frombuffer(gz.read(n_items[0]), dtype=np.uint8)\n",
|
||||||
|
" res = res.reshape(n_items[0], 1)\n",
|
||||||
|
" return res\n",
|
||||||
|
"\n",
|
||||||
|
"os.makedirs('./data/mnist', exist_ok=True)\n",
|
||||||
|
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz', filename = './data/mnist/test-images.gz')\n",
|
||||||
|
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz', filename = './data/mnist/test-labels.gz')\n",
|
||||||
|
"\n",
|
||||||
|
"X_test = load_data('./data/mnist/test-images.gz', False)\n",
|
||||||
|
"y_test = load_data('./data/mnist/test-labels.gz', True).reshape(-1)\n",
|
||||||
|
"\n",
|
||||||
|
"\n",
|
||||||
|
"# send a random row from the test set to score\n",
|
||||||
|
"random_index = np.random.randint(0, len(X_test)-1)\n",
|
||||||
|
"input_data = \"{\\\"data\\\": [\" + str(random_index) + \"]}\"\n",
|
||||||
|
"\n",
|
||||||
|
"headers = {'Content-Type':'application/json', 'Authorization': 'Bearer ' + key1}\n",
|
||||||
|
"\n",
|
||||||
|
"# send sample to service for scoring\n",
|
||||||
|
"resp = requests.post(service.scoring_uri, input_data, headers=headers)\n",
|
||||||
|
"\n",
|
||||||
|
"print(\"label:\", y_test[random_index])\n",
|
||||||
|
"print(\"prediction:\", resp.text[1])\n",
|
||||||
|
"\n",
|
||||||
|
"plt.imshow(X_test[random_index].reshape((28,28)), cmap='gray')\n",
|
||||||
|
"plt.axis('off')\n",
|
||||||
|
"plt.show()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Let's look at the workspace after the web service was deployed. You should see\n",
|
||||||
|
"\n",
|
||||||
|
" + a registered model named 'chainer-dnn-mnist' and with the id 'chainer-dnn-mnist:1'\n",
|
||||||
|
" + an image called 'chainer-mnist-svc' and with a docker image location pointing to your workspace's Azure Container Registry (ACR)\n",
|
||||||
|
" + a webservice called 'chainer-mnist-svc' with some scoring URL"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"models = ws.models\n",
|
||||||
|
"for name, model in models.items():\n",
|
||||||
|
" print(\"Model: {}, ID: {}\".format(name, model.id))\n",
|
||||||
|
" \n",
|
||||||
|
"images = ws.images\n",
|
||||||
|
"for name, image in images.items():\n",
|
||||||
|
" print(\"Image: {}, location: {}\".format(name, image.image_location))\n",
|
||||||
|
" \n",
|
||||||
|
"webservices = ws.webservices\n",
|
||||||
|
"for name, webservice in webservices.items():\n",
|
||||||
|
" print(\"Webservice: {}, scoring URI: {}\".format(name, webservice.scoring_uri))"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Clean up"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"You can delete the ACI deployment with a simple delete API call."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"service.delete()"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
"authors": [
|
"authors": [
|
||||||
{
|
{
|
||||||
"name": "ninhu"
|
"name": "dipeck"
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"kernelspec": {
|
||||||
@@ -424,7 +767,8 @@
|
|||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.6"
|
"version": "3.6.6"
|
||||||
}
|
},
|
||||||
|
"msauthor": "dipeck"
|
||||||
},
|
},
|
||||||
"nbformat": 4,
|
"nbformat": 4,
|
||||||
"nbformat_minor": 2
|
"nbformat_minor": 2
|
||||||
|
|||||||
@@ -4,4 +4,9 @@ dependencies:
|
|||||||
- azureml-sdk
|
- azureml-sdk
|
||||||
- azureml-widgets
|
- azureml-widgets
|
||||||
- numpy
|
- numpy
|
||||||
- pytest
|
- matplotlib
|
||||||
|
- json
|
||||||
|
- urllib
|
||||||
|
- gzip
|
||||||
|
- struct
|
||||||
|
- requests
|
||||||
|
|||||||
@@ -11,7 +11,7 @@ from azureml.core.model import Model
|
|||||||
|
|
||||||
def init():
|
def init():
|
||||||
global model
|
global model
|
||||||
model_path = Model.get_model_path('pytorch-hymenoptera')
|
model_path = Model.get_model_path('pytorch-birds')
|
||||||
model = torch.load(model_path, map_location=lambda storage, loc: storage)
|
model = torch.load(model_path, map_location=lambda storage, loc: storage)
|
||||||
model.eval()
|
model.eval()
|
||||||
|
|
||||||
@@ -22,7 +22,7 @@ def run(input_data):
|
|||||||
# get prediction
|
# get prediction
|
||||||
with torch.no_grad():
|
with torch.no_grad():
|
||||||
output = model(input_data)
|
output = model(input_data)
|
||||||
classes = ['ants', 'bees']
|
classes = ['chicken', 'turkey']
|
||||||
softmax = nn.Softmax(dim=1)
|
softmax = nn.Softmax(dim=1)
|
||||||
pred_probs = softmax(output).numpy()[0]
|
pred_probs = softmax(output).numpy()[0]
|
||||||
index = torch.argmax(output, 1)
|
index = torch.argmax(output, 1)
|
||||||
|
|||||||
@@ -165,8 +165,8 @@ def download_data():
|
|||||||
import urllib
|
import urllib
|
||||||
from zipfile import ZipFile
|
from zipfile import ZipFile
|
||||||
# download data
|
# download data
|
||||||
data_file = './hymenoptera_data.zip'
|
data_file = './fowl_data.zip'
|
||||||
download_url = 'https://download.pytorch.org/tutorial/hymenoptera_data.zip'
|
download_url = 'https://msdocsdatasets.blob.core.windows.net/pytorchfowl/fowl_data.zip'
|
||||||
urllib.request.urlretrieve(download_url, filename=data_file)
|
urllib.request.urlretrieve(download_url, filename=data_file)
|
||||||
|
|
||||||
# extract files
|
# extract files
|
||||||
|
|||||||
Binary image file changed (123 KiB → 1.6 MiB); not shown.
@@ -24,7 +24,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"In this tutorial, you will train, hyperparameter tune, and deploy a PyTorch model using the Azure Machine Learning (Azure ML) Python SDK.\n",
|
"In this tutorial, you will train, hyperparameter tune, and deploy a PyTorch model using the Azure Machine Learning (Azure ML) Python SDK.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"This tutorial will train an image classification model using transfer learning, based on PyTorch's [Transfer Learning tutorial](https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html). The model is trained to classify ants and bees by first using a pretrained ResNet18 model that has been trained on the [ImageNet](http://image-net.org/index) dataset."
|
"This tutorial will train an image classification model using transfer learning, based on PyTorch's [Transfer Learning tutorial](https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html). The model is trained to classify chickens and turkeys by first using a pretrained ResNet18 model that has been trained on the [ImageNet](http://image-net.org/index) dataset."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -165,7 +165,7 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"import os\n",
|
"import os\n",
|
||||||
"\n",
|
"\n",
|
||||||
"project_folder = './pytorch-hymenoptera'\n",
|
"project_folder = './pytorch-birds'\n",
|
||||||
"os.makedirs(project_folder, exist_ok=True)"
|
"os.makedirs(project_folder, exist_ok=True)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -174,7 +174,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"### Download training data\n",
|
"### Download training data\n",
|
||||||
"The dataset we will use (located [here](https://download.pytorch.org/tutorial/hymenoptera_data.zip) as a zip file) consists of about 120 training images each for ants and bees, with 75 validation images for each class. [Hymenoptera](https://en.wikipedia.org/wiki/Hymenoptera) is the order of insects that includes ants and bees. We will download and extract the dataset as part of our training script `pytorch_train.py`"
|
"The dataset we will use (located on a public blob [here](https://msdocsdatasets.blob.core.windows.net/pytorchfowl/fowl_data.zip) as a zip file) consists of about 120 training images each for turkeys and chickens, with 100 validation images for each class. The images are a subset of the [Open Images v5 Dataset](https://storage.googleapis.com/openimages/web/index.html). We will download and extract the dataset as part of our training script `pytorch_train.py`"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -235,7 +235,7 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"from azureml.core import Experiment\n",
|
"from azureml.core import Experiment\n",
|
||||||
"\n",
|
"\n",
|
||||||
"experiment_name = 'pytorch-hymenoptera'\n",
|
"experiment_name = 'pytorch-birds'\n",
|
||||||
"experiment = Experiment(ws, name=experiment_name)"
|
"experiment = Experiment(ws, name=experiment_name)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -273,7 +273,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"The `script_params` parameter is a dictionary containing the command-line arguments to your training script `entry_script`. Please note the following:\n",
|
"The `script_params` parameter is a dictionary containing the command-line arguments to your training script `entry_script`. Please note the following:\n",
|
||||||
"- We passed our training data reference `ds_data` to our script's `--data_dir` argument. This will 1) mount our datastore on the remote compute and 2) provide the path to the training data `hymenoptera_data` on our datastore.\n",
|
"- We passed our training data reference `ds_data` to our script's `--data_dir` argument. This will 1) mount our datastore on the remote compute and 2) provide the path to the training data `fowl_data` on our datastore.\n",
|
||||||
"- We specified the output directory as `./outputs`. The `outputs` directory is specially treated by Azure ML in that all the content in this directory gets uploaded to your workspace as part of your run history. The files written to this directory are therefore accessible even once your remote run is over. In this tutorial, we will save our trained model to this output directory.\n",
|
"- We specified the output directory as `./outputs`. The `outputs` directory is specially treated by Azure ML in that all the content in this directory gets uploaded to your workspace as part of your run history. The files written to this directory are therefore accessible even once your remote run is over. In this tutorial, we will save our trained model to this output directory.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"To leverage the Azure VM's GPU for training, we set `use_gpu=True`."
|
"To leverage the Azure VM's GPU for training, we set `use_gpu=True`."
|
||||||
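For readers skimming the diff, the estimator cell this markdown describes looks roughly like the sketch below. `ds_data`, `compute_target`, and `project_folder` come from earlier cells that are not part of this diff, and the exact argument list of `pytorch_train.py` is an assumption.

```python
# Hedged sketch of the estimator cell the markdown above refers to.
from azureml.train.dnn import PyTorch

script_params = {
    '--data_dir': ds_data,        # datastore reference mounted on the remote compute
    '--output_dir': './outputs'   # contents of ./outputs are uploaded to run history
}

estimator = PyTorch(source_directory=project_folder,
                    script_params=script_params,
                    compute_target=compute_target,
                    entry_script='pytorch_train.py',
                    use_gpu=True)
```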
@@ -481,7 +481,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"model = best_run.register_model(model_name = 'pytorch-hymenoptera', model_path = 'outputs/model.pt')\n",
|
"model = best_run.register_model(model_name = 'pytorch-birds', model_path = 'outputs/model.pt')\n",
|
||||||
"print(model.name, model.id, model.version, sep = '\\t')"
|
"print(model.name, model.id, model.version, sep = '\\t')"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -503,7 +503,7 @@
|
|||||||
"* `init()`: In this function, you typically load the model into a `global` object. This function is executed only once when the Docker container is started. \n",
|
"* `init()`: In this function, you typically load the model into a `global` object. This function is executed only once when the Docker container is started. \n",
|
||||||
"* `run(input_data)`: In this function, the model is used to predict a value based on the input data. The input and output typically use JSON as serialization and deserialization format, but you are not limited to that.\n",
|
"* `run(input_data)`: In this function, the model is used to predict a value based on the input data. The input and output typically use JSON as serialization and deserialization format, but you are not limited to that.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Refer to the scoring script `pytorch_score.py` for this tutorial. Our web service will use this file to predict whether an image is an ant or a bee. When writing your own scoring script, don't forget to test it locally first before you go and deploy the web service."
|
"Refer to the scoring script `pytorch_score.py` for this tutorial. Our web service will use this file to predict whether an image is a chicken or a turkey. When writing your own scoring script, don't forget to test it locally first before you go and deploy the web service."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -549,7 +549,7 @@
|
|||||||
"image_config = ContainerImage.image_configuration(execution_script='pytorch_score.py', \n",
|
"image_config = ContainerImage.image_configuration(execution_script='pytorch_score.py', \n",
|
||||||
" runtime='python', \n",
|
" runtime='python', \n",
|
||||||
" conda_file='myenv.yml',\n",
|
" conda_file='myenv.yml',\n",
|
||||||
" description='Image with hymenoptera model')"
|
" description='Image with bird model')"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -570,8 +570,8 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, \n",
|
"aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, \n",
|
||||||
" memory_gb=1, \n",
|
" memory_gb=1, \n",
|
||||||
" tags={'data': 'hymenoptera', 'method':'transfer learning', 'framework':'pytorch'},\n",
|
" tags={'data': 'birds', 'method':'transfer learning', 'framework':'pytorch'},\n",
|
||||||
" description='Classify ants/bees using transfer learning with PyTorch')"
|
" description='Classify turkey/chickens using transfer learning with PyTorch')"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -591,7 +591,7 @@
|
|||||||
"%%time\n",
|
"%%time\n",
|
||||||
"from azureml.core.webservice import Webservice\n",
|
"from azureml.core.webservice import Webservice\n",
|
||||||
"\n",
|
"\n",
|
||||||
"service_name = 'aci-hymenoptera'\n",
|
"service_name = 'aci-birds'\n",
|
||||||
"service = Webservice.deploy_from_model(workspace=ws,\n",
|
"service = Webservice.deploy_from_model(workspace=ws,\n",
|
||||||
" name=service_name,\n",
|
" name=service_name,\n",
|
||||||
" models=[model],\n",
|
" models=[model],\n",
|
||||||
@@ -659,6 +659,7 @@
|
|||||||
"from PIL import Image\n",
|
"from PIL import Image\n",
|
||||||
"import matplotlib.pyplot as plt\n",
|
"import matplotlib.pyplot as plt\n",
|
||||||
"\n",
|
"\n",
|
||||||
|
"%matplotlib inline\n",
|
||||||
"plt.imshow(Image.open('test_img.jpg'))"
|
"plt.imshow(Image.open('test_img.jpg'))"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
|||||||
@@ -0,0 +1,123 @@
|
|||||||
|
# Copyright (c) Microsoft Corporation. All rights reserved.
|
||||||
|
# Licensed under the MIT License.
|
||||||
|
|
||||||
|
import numpy as np
|
||||||
|
import argparse
|
||||||
|
import os
|
||||||
|
import re
|
||||||
|
import tensorflow as tf
|
||||||
|
|
||||||
|
from azureml.core import Run
|
||||||
|
from utils import load_data
|
||||||
|
|
||||||
|
print("TensorFlow version:", tf.VERSION)
|
||||||
|
|
||||||
|
parser = argparse.ArgumentParser()
|
||||||
|
parser.add_argument('--data-folder', type=str, dest='data_folder', help='data folder mounting point')
|
||||||
|
|
||||||
|
parser.add_argument('--resume-from', type=str, default=None,
|
||||||
|
help='location of the model or checkpoint files from where to resume the training')
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
|
||||||
|
previous_model_location = args.resume_from
|
||||||
|
# You can also use environment variable to get the model/checkpoint files location
|
||||||
|
# previous_model_location = os.path.expandvars(os.getenv("AZUREML_DATAREFERENCE_MODEL_LOCATION", None))
|
||||||
|
|
||||||
|
data_folder = os.path.join(args.data_folder, 'mnist')
|
||||||
|
|
||||||
|
print('training dataset is stored here:', data_folder)
|
||||||
|
|
||||||
|
X_train = load_data(os.path.join(data_folder, 'train-images.gz'), False) / 255.0
|
||||||
|
X_test = load_data(os.path.join(data_folder, 'test-images.gz'), False) / 255.0
|
||||||
|
|
||||||
|
y_train = load_data(os.path.join(data_folder, 'train-labels.gz'), True).reshape(-1)
|
||||||
|
y_test = load_data(os.path.join(data_folder, 'test-labels.gz'), True).reshape(-1)
|
||||||
|
|
||||||
|
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape, sep='\n')
|
||||||
|
training_set_size = X_train.shape[0]
|
||||||
|
|
||||||
|
n_inputs = 28 * 28
|
||||||
|
n_h1 = 100
|
||||||
|
n_h2 = 100
|
||||||
|
n_outputs = 10
|
||||||
|
learning_rate = 0.01
|
||||||
|
n_epochs = 20
|
||||||
|
batch_size = 50
|
||||||
|
|
||||||
|
with tf.name_scope('network'):
|
||||||
|
# construct the DNN
|
||||||
|
X = tf.placeholder(tf.float32, shape=(None, n_inputs), name='X')
|
||||||
|
y = tf.placeholder(tf.int64, shape=(None), name='y')
|
||||||
|
h1 = tf.layers.dense(X, n_h1, activation=tf.nn.relu, name='h1')
|
||||||
|
h2 = tf.layers.dense(h1, n_h2, activation=tf.nn.relu, name='h2')
|
||||||
|
output = tf.layers.dense(h2, n_outputs, name='output')
|
||||||
|
|
||||||
|
with tf.name_scope('train'):
|
||||||
|
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=output)
|
||||||
|
loss = tf.reduce_mean(cross_entropy, name='loss')
|
||||||
|
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
|
||||||
|
train_op = optimizer.minimize(loss)
|
||||||
|
|
||||||
|
with tf.name_scope('eval'):
|
||||||
|
correct = tf.nn.in_top_k(output, y, 1)
|
||||||
|
acc_op = tf.reduce_mean(tf.cast(correct, tf.float32))
|
||||||
|
|
||||||
|
init = tf.global_variables_initializer()
|
||||||
|
saver = tf.train.Saver()
|
||||||
|
|
||||||
|
# start an Azure ML run
|
||||||
|
run = Run.get_context()
|
||||||
|
|
||||||
|
with tf.Session() as sess:
|
||||||
|
start_epoch = 0
|
||||||
|
if previous_model_location:
|
||||||
|
checkpoint_file_path = tf.train.latest_checkpoint(previous_model_location)
|
||||||
|
saver.restore(sess, checkpoint_file_path)
|
||||||
|
checkpoint_filename = os.path.basename(checkpoint_file_path)
|
||||||
|
num_found = re.search(r'\d+', checkpoint_filename)
|
||||||
|
if num_found:
|
||||||
|
start_epoch = int(num_found.group(0))
|
||||||
|
print("Resuming from epoch {}".format(str(start_epoch)))
|
||||||
|
else:
|
||||||
|
init.run()
|
||||||
|
|
||||||
|
for epoch in range(start_epoch, n_epochs):
|
||||||
|
|
||||||
|
# randomly shuffle training set
|
||||||
|
indices = np.random.permutation(training_set_size)
|
||||||
|
X_train = X_train[indices]
|
||||||
|
y_train = y_train[indices]
|
||||||
|
|
||||||
|
# batch index
|
||||||
|
b_start = 0
|
||||||
|
b_end = b_start + batch_size
|
||||||
|
for _ in range(training_set_size // batch_size):
|
||||||
|
# get a batch
|
||||||
|
X_batch, y_batch = X_train[b_start: b_end], y_train[b_start: b_end]
|
||||||
|
|
||||||
|
# update batch index for the next batch
|
||||||
|
b_start = b_start + batch_size
|
||||||
|
b_end = min(b_start + batch_size, training_set_size)
|
||||||
|
|
||||||
|
# train
|
||||||
|
sess.run(train_op, feed_dict={X: X_batch, y: y_batch})
|
||||||
|
# evaluate training set
|
||||||
|
acc_train = acc_op.eval(feed_dict={X: X_batch, y: y_batch})
|
||||||
|
# evaluate validation set
|
||||||
|
acc_val = acc_op.eval(feed_dict={X: X_test, y: y_test})
|
||||||
|
|
||||||
|
# log accuracies
|
||||||
|
run.log('training_acc', np.float(acc_train))
|
||||||
|
run.log('validation_acc', np.float(acc_val))
|
||||||
|
print(epoch, '-- Training accuracy:', acc_train, '\b Validation accuracy:', acc_val)
|
||||||
|
y_hat = np.argmax(output.eval(feed_dict={X: X_test}), axis=1)
|
||||||
|
|
||||||
|
if epoch % 5 == 0:
|
||||||
|
saver.save(sess, './outputs/', global_step=epoch)
|
||||||
|
|
||||||
|
# saving only half of the model and resuming again from same epoch
|
||||||
|
if not previous_model_location and epoch == 10:
|
||||||
|
break
|
||||||
|
|
||||||
|
run.log('final_acc', np.float(acc_val))
|
||||||
@@ -0,0 +1,487 @@
|
|||||||
|
{
|
||||||
|
"cells": [
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
||||||
|
"\n",
|
||||||
|
"Licensed under the MIT License."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
""
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Resuming Tensorflow training from previous run\n",
|
||||||
|
"In this tutorial, you will resume a mnist model in TensorFlow from a previously submitted run."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Prerequisites\n",
|
||||||
|
"* Understand the [architecture and terms](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture) introduced by Azure Machine Learning (AML)\n",
|
||||||
|
"* Go through the [configuration notebook](../../../configuration.ipynb) to:\n",
|
||||||
|
" * install the AML SDK\n",
|
||||||
|
" * create a workspace and its configuration file (`config.json`)\n",
|
||||||
|
"* Review the [tutorial](../train-hyperparameter-tune-deploy-with-tensorflow/train-hyperparameter-tune-deploy-with-tensorflow.ipynb) on single-node TensorFlow training using the SDK"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# Check core SDK version number\n",
|
||||||
|
"import azureml.core\n",
|
||||||
|
"\n",
|
||||||
|
"print(\"SDK version:\", azureml.core.VERSION)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Diagnostics\n",
|
||||||
|
"Opt-in diagnostics for better experience, quality, and security of future releases."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {
|
||||||
|
"tags": [
|
||||||
|
"Diagnostics"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.telemetry import set_diagnostics_collection\n",
|
||||||
|
"\n",
|
||||||
|
"set_diagnostics_collection(send_diagnostics=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Initialize workspace\n",
|
||||||
|
"Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. `Workspace.from_config()` creates a workspace object from the details stored in `config.json`."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core.workspace import Workspace\n",
|
||||||
|
"\n",
|
||||||
|
"ws = Workspace.from_config()\n",
|
||||||
|
"print('Workspace name: ' + ws.name, \n",
|
||||||
|
" 'Azure region: ' + ws.location, \n",
|
||||||
|
" 'Subscription id: ' + ws.subscription_id, \n",
|
||||||
|
" 'Resource group: ' + ws.resource_group, sep='\\n')"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Create or Attach existing AmlCompute\n",
|
||||||
|
"You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for training your model. In this tutorial, you create `AmlCompute` as your training compute resource.\n",
|
||||||
|
"\n",
|
||||||
|
"**Creation of AmlCompute takes approximately 5 minutes.** If the AmlCompute with that name is already in your workspace this code will skip the creation process.\n",
|
||||||
|
"\n",
|
||||||
|
"As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
||||||
|
"from azureml.core.compute_target import ComputeTargetException\n",
|
||||||
|
"\n",
|
||||||
|
"# choose a name for your cluster\n",
|
||||||
|
"cluster_name = \"gpu-cluster\"\n",
|
||||||
|
"\n",
|
||||||
|
"try:\n",
|
||||||
|
" compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
|
||||||
|
" print('Found existing compute target.')\n",
|
||||||
|
"except ComputeTargetException:\n",
|
||||||
|
" print('Creating a new compute target...')\n",
|
||||||
|
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', \n",
|
||||||
|
" max_nodes=4)\n",
|
||||||
|
"\n",
|
||||||
|
" # create the cluster\n",
|
||||||
|
" compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
|
||||||
|
"\n",
|
||||||
|
" compute_target.wait_for_completion(show_output=True)\n",
|
||||||
|
"\n",
|
||||||
|
"# use get_status() to get a detailed status for the current cluster. \n",
|
||||||
|
"print(compute_target.get_status().serialize())"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"The above code creates a GPU cluster. If you instead want to create a CPU cluster, provide a different VM size to the `vm_size` parameter, such as `STANDARD_D2_V2`."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Upload data to datastore\n",
|
||||||
|
"To make data accessible for remote training, AML provides a convenient way to do so via a [Datastore](https://docs.microsoft.com/azure/machine-learning/service/how-to-access-data). The datastore provides a mechanism for you to upload/download data to Azure Storage, and interact with it from your remote compute targets. \n",
|
||||||
|
"\n",
|
||||||
|
"If your data is already stored in Azure, or you download the data as part of your training script, you will not need to do this step. For this tutorial, although you can download the data in your training script, we will demonstrate how to upload the training data to a datastore and access it during training to illustrate the datastore functionality."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"First download the data from Yan LeCun's web site directly and save them in a data folder locally."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import os\n",
|
||||||
|
"import urllib\n",
|
||||||
|
"\n",
|
||||||
|
"os.makedirs('./data/mnist', exist_ok=True)\n",
|
||||||
|
"\n",
|
||||||
|
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz', filename = './data/mnist/train-images.gz')\n",
|
||||||
|
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz', filename = './data/mnist/train-labels.gz')\n",
|
||||||
|
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz', filename = './data/mnist/test-images.gz')\n",
|
||||||
|
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz', filename = './data/mnist/test-labels.gz')"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Each workspace is associated with a default datastore. In this tutorial, we will upload the training data to this default datastore."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"ds = ws.get_default_datastore()\n",
|
||||||
|
"print(ds.datastore_type, ds.account_name, ds.container_name)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Upload MNIST data to the default datastore."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"ds.upload(src_dir='./data/mnist', target_path='mnist', overwrite=True, show_progress=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"For convenience, let's get a reference to the datastore. In the next section, we can then pass this reference to our training script's `--data-folder` argument. "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"ds_data = ds.as_mount()\n",
|
||||||
|
"print(ds_data)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Train model on the remote compute"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Create a project directory\n",
|
||||||
|
"Create a directory that will contain all the necessary code from your local machine that you will need access to on the remote resource. This includes the training script, and any additional files your training script depends on."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"script_folder = './tf-resume-training'\n",
|
||||||
|
"os.makedirs(script_folder, exist_ok=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Copy the training script `tf_mnist_with_checkpoint.py` into this project directory."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import shutil\n",
|
||||||
|
"\n",
|
||||||
|
"# the training logic is in the tf_mnist_with_checkpoint.py file.\n",
|
||||||
|
"shutil.copy('./tf_mnist_with_checkpoint.py', script_folder)\n",
|
||||||
|
"\n",
|
||||||
|
"# the utils.py just helps loading data from the downloaded MNIST dataset into numpy arrays.\n",
|
||||||
|
"shutil.copy('./utils.py', script_folder)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Create an experiment\n",
|
||||||
|
"Create an [Experiment](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#experiment) to track all the runs in your workspace for this distributed TensorFlow tutorial. "
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.core import Experiment\n",
|
||||||
|
"\n",
|
||||||
|
"experiment_name = 'tf-resume-training'\n",
|
||||||
|
"experiment = Experiment(ws, name=experiment_name)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Create a TensorFlow estimator\n",
|
||||||
|
"The AML SDK's TensorFlow estimator enables you to easily submit TensorFlow training jobs for both single-node and distributed runs. For more information on the TensorFlow estimator, refer [here](https://docs.microsoft.com/azure/machine-learning/service/how-to-train-tensorflow).\n",
|
||||||
|
"\n",
|
||||||
|
"The TensorFlow estimator also takes a `framework_version` parameter -- if no version is provided, the estimator will default to the latest version supported by AzureML. Use `TensorFlow.get_supported_versions()` to get a list of all versions supported by your current SDK version or see the [SDK documentation](https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.dnn?view=azure-ml-py) for the versions supported in the most current release."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.train.dnn import TensorFlow\n",
|
||||||
|
"\n",
|
||||||
|
"script_params={\n",
|
||||||
|
" '--data-folder': ds_data\n",
|
||||||
|
"}\n",
|
||||||
|
"\n",
|
||||||
|
"estimator= TensorFlow(source_directory=script_folder,\n",
|
||||||
|
" compute_target=compute_target,\n",
|
||||||
|
" script_params=script_params,\n",
|
||||||
|
" entry_script='tf_mnist_with_checkpoint.py')"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"In the above code, we passed our training data reference `ds_data` to our script's `--data-folder` argument. This will 1) mount our datastore on the remote compute and 2) provide the path to the data zip file on our datastore."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Submit job\n",
|
||||||
|
"### Run your experiment by submitting your estimator object. Note that this call is asynchronous."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"run = experiment.submit(estimator)\n",
|
||||||
|
"print(run)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Monitor your run\n",
|
||||||
|
"You can monitor the progress of the run with a Jupyter widget. Like the run submission, the widget is asynchronous and provides live updates every 10-15 seconds until the job completes."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.widgets import RunDetails\n",
|
||||||
|
"RunDetails(run).show()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Alternatively, you can block until the script has completed training before running more code."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"run.wait_for_completion(show_output=True)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"# Now let's resume the training from the above run"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"First, we will get the DataPath to the outputs directory of the above run which\n",
|
||||||
|
"contains the checkpoint files and/or model"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"model_location = run._get_outputs_datapath()"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Now, we will create a new TensorFlow estimator and pass in the model location. On passing 'resume_from' parameter, a new entry in script_params is created with key as 'resume_from' and value as the model/checkpoint files location and the location gets automatically mounted on the compute target."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"from azureml.train.dnn import TensorFlow\n",
|
||||||
|
"\n",
|
||||||
|
"script_params={\n",
|
||||||
|
" '--data-folder': ds_data\n",
|
||||||
|
"}\n",
|
||||||
|
"\n",
|
||||||
|
"estimator2 = TensorFlow(source_directory=script_folder,\n",
|
||||||
|
" compute_target=compute_target,\n",
|
||||||
|
" script_params=script_params,\n",
|
||||||
|
" entry_script='tf_mnist_with_checkpoint.py',\n",
|
||||||
|
" resume_from=model_location)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Now you can submit the experiment and it should resume from previous run's checkpoint files."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"run2 = experiment.submit(estimator2)\n",
|
||||||
|
"print(run2)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"run2.wait_for_completion(show_output=True)"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"authors": [
|
||||||
|
{
|
||||||
|
"name": "hesuri"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3.6",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python36"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.6.6"
|
||||||
|
},
|
||||||
|
"msauthor": "hesuri"
|
||||||
|
},
|
||||||
|
"nbformat": 4,
|
||||||
|
"nbformat_minor": 2
|
||||||
|
}
|
||||||
@@ -0,0 +1,5 @@
|
|||||||
|
name: train-tensorflow-resume-training
|
||||||
|
dependencies:
|
||||||
|
- pip:
|
||||||
|
- azureml-sdk
|
||||||
|
- azureml-widgets
|
||||||
@@ -0,0 +1,27 @@
|
|||||||
|
# Copyright (c) Microsoft Corporation. All rights reserved.
|
||||||
|
# Licensed under the MIT License.
|
||||||
|
|
||||||
|
import gzip
|
||||||
|
import numpy as np
|
||||||
|
import struct
|
||||||
|
|
||||||
|
|
||||||
|
# load compressed MNIST gz files and return numpy arrays
|
||||||
|
def load_data(filename, label=False):
|
||||||
|
with gzip.open(filename) as gz:
|
||||||
|
struct.unpack('I', gz.read(4))
|
||||||
|
n_items = struct.unpack('>I', gz.read(4))
|
||||||
|
if not label:
|
||||||
|
n_rows = struct.unpack('>I', gz.read(4))[0]
|
||||||
|
n_cols = struct.unpack('>I', gz.read(4))[0]
|
||||||
|
res = np.frombuffer(gz.read(n_items[0] * n_rows * n_cols), dtype=np.uint8)
|
||||||
|
res = res.reshape(n_items[0], n_rows * n_cols)
|
||||||
|
else:
|
||||||
|
res = np.frombuffer(gz.read(n_items[0]), dtype=np.uint8)
|
||||||
|
res = res.reshape(n_items[0], 1)
|
||||||
|
return res
|
||||||
|
|
||||||
|
|
||||||
|
# one-hot encode a 1-D array
|
||||||
|
def one_hot_encode(array, num_of_classes):
|
||||||
|
return np.eye(num_of_classes)[array.reshape(-1)]
|
||||||
@@ -100,7 +100,7 @@
|
|||||||
"\n",
|
"\n",
|
||||||
"# Check core SDK version number\n",
|
"# Check core SDK version number\n",
|
||||||
"\n",
|
"\n",
|
||||||
"print(\"This notebook was created using SDK version 1.0.48\r\n, you are currently running version\", azureml.core.VERSION)"
|
"print(\"This notebook was created using SDK version 1.0.53, you are currently running version\", azureml.core.VERSION)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
|||||||
@@ -120,19 +120,42 @@
|
|||||||
"As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
|
"As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"If we could not find the cluster with the given name, then we will create a new cluster here. We will create an `AmlCompute` cluster of `STANDARD_D2_V2` CPU VMs. This process is broken down into 3 steps:\n",
|
||||||
|
"1. create the configuration (this step is local and only takes a second)\n",
|
||||||
|
"2. create the cluster (this step will take about **20 seconds**)\n",
|
||||||
|
"3. provision the VMs to bring the cluster to the initial size (of 1 in this case). This step will take about **3-5 minutes** and is providing only sparse output in the process. Please make sure to wait until the call returns before moving to the next cell"
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"from azureml.core.compute import ComputeTarget\n",
|
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
||||||
|
"from azureml.core.compute_target import ComputeTargetException\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# choose a name for your cluster\n",
|
"# choose a name for your cluster\n",
|
||||||
"cluster_name = \"cpu-cluster\"\n",
|
"cluster_name = \"cpu-cluster\"\n",
|
||||||
"\n",
|
"\n",
|
||||||
"compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
|
"try:\n",
|
||||||
"print('Found existing compute target.')\n",
|
" compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
|
||||||
|
" print('Found existing compute target')\n",
|
||||||
|
"except ComputeTargetException:\n",
|
||||||
|
" print('Creating a new compute target...')\n",
|
||||||
|
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', \n",
|
||||||
|
" max_nodes=4)\n",
|
||||||
|
"\n",
|
||||||
|
" # create the cluster\n",
|
||||||
|
" compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
|
||||||
|
"\n",
|
||||||
|
" # can poll for a minimum number of nodes and for a specific timeout. \n",
|
||||||
|
" # if no min node count is provided it uses the scale settings for the cluster\n",
|
||||||
|
" compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
|
||||||
"\n",
|
"\n",
|
||||||
"# use get_status() to get a detailed status for the current cluster. \n",
|
"# use get_status() to get a detailed status for the current cluster. \n",
|
||||||
"print(compute_target.get_status().serialize())"
|
"print(compute_target.get_status().serialize())"
|
||||||
@@ -142,7 +165,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"The above code retrieves an existing CPU compute target. Scikit-learn does not support GPU computing."
|
"The above code retrieves a CPU compute target. Scikit-learn does not support GPU computing."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
@@ -289,7 +312,7 @@
|
|||||||
" script_params=script_params,\n",
|
" script_params=script_params,\n",
|
||||||
" compute_target=compute_target,\n",
|
" compute_target=compute_target,\n",
|
||||||
" entry_script='train_iris.py',\n",
|
" entry_script='train_iris.py',\n",
|
||||||
" pip_packages=['joblib']\n",
|
" pip_packages=['joblib==0.13.2']\n",
|
||||||
" )"
|
" )"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -507,7 +530,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"model = best_run.register_model(model_name='sklearn-iris', model_path='model.joblib')"
|
"model = best_run.register_model(model_name='sklearn-iris', model_path='outputs/model.joblib')"
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
|
|||||||
@@ -1,6 +1,7 @@
|
|||||||
# Modified from https://www.geeksforgeeks.org/multiclass-classification-using-scikit-learn/
|
# Modified from https://www.geeksforgeeks.org/multiclass-classification-using-scikit-learn/
|
||||||
|
|
||||||
import argparse
|
import argparse
|
||||||
|
import os
|
||||||
|
|
||||||
# importing necessary libraries
|
# importing necessary libraries
|
||||||
import numpy as np
|
import numpy as np
|
||||||
@@ -50,8 +51,9 @@ def main():
|
|||||||
cm = confusion_matrix(y_test, svm_predictions)
|
cm = confusion_matrix(y_test, svm_predictions)
|
||||||
print(cm)
|
print(cm)
|
||||||
|
|
||||||
# save model
|
os.makedirs('outputs', exist_ok=True)
|
||||||
joblib.dump(svm_model_linear, 'model.joblib')
|
# files saved in the "outputs" folder are automatically uploaded into run history
|
||||||
|
joblib.dump(svm_model_linear, 'outputs/model.joblib')
|
||||||
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
if __name__ == '__main__':
|
||||||
|
|||||||
@@ -102,7 +102,7 @@
|
|||||||
"source": [
|
"source": [
|
||||||
"import azureml.core\n",
|
"import azureml.core\n",
|
||||||
"\n",
|
"\n",
|
||||||
"print(\"This notebook was created using version 1.0.48\r\n of the Azure ML SDK\")\n",
|
"print(\"This notebook was created using version 1.0.53 of the Azure ML SDK\")\n",
|
||||||
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
|
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
|||||||
@@ -1,385 +1,385 @@
|
|||||||
{
|
{
|
||||||
"cells": [
|
"cells": [
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Copyright (c) Microsoft Corporation. All rights reserved."
|
"Copyright (c) Microsoft Corporation. All rights reserved."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
""
|
""
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"# Tutorial: Train your first model"
|
"# Tutorial: Train your first model"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"This tutorial is **part two of a two-part tutorial series**. In the previous tutorial, you created a workspace and chose a development environment. In this tutorial, you learn the foundational design patterns in Azure Machine Learning service, and train a simple scikit-learn model based on the diabetes data set. After completing this tutorial, you will have the practical knowledge of the SDK to scale up to developing more-complex experiments and workflows. \n",
|
"This tutorial is **part two of a two-part tutorial series**. In the previous tutorial, you created a workspace and chose a development environment. In this tutorial, you learn the foundational design patterns in Azure Machine Learning service, and train a simple scikit-learn model based on the diabetes data set. After completing this tutorial, you will have the practical knowledge of the SDK to scale up to developing more-complex experiments and workflows. \n",
|
||||||
"\n",
|
"\n",
|
||||||
"In this tutorial, you learn the following tasks:\n",
|
"In this tutorial, you learn the following tasks:\n",
|
||||||
"\n",
|
"\n",
|
||||||
"> * Connect your workspace and create an experiment \n",
|
"> * Connect your workspace and create an experiment \n",
|
||||||
"> * Load data and train a scikit-learn model\n",
|
"> * Load data and train a scikit-learn model\n",
|
||||||
"> * View training results in the portal\n",
|
"> * View training results in the portal\n",
|
||||||
"> * Retrieve the best model"
|
"> * Retrieve the best model"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"## Prerequisites\n",
|
"## Prerequisites\n",
|
||||||
"\n",
|
"\n",
|
||||||
"The only prerequisite is to run the previous tutorial, Setup environment and workspace."
|
"The only prerequisite is to run the previous tutorial, Setup environment and workspace."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"## Connect workspace and create experiment"
|
"## Connect workspace and create experiment"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Import the `Workspace` class, and load your subscription information from the file `config.json` using the function `from_config().` This looks for the JSON file in the current directory by default, but you can also specify a path parameter to point to the file using `from_config(path=\"your/file/path\")`. If you are running this notebook in a cloud notebook server in your workspace, the file is automatically in the root directory.\n",
|
"Import the `Workspace` class, and load your subscription information from the file `config.json` using the function `from_config().` This looks for the JSON file in the current directory by default, but you can also specify a path parameter to point to the file using `from_config(path=\"your/file/path\")`. If you are running this notebook in a cloud notebook server in your workspace, the file is automatically in the root directory.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"If the following code asks for additional authentication, simply paste the link in a browser and enter the authentication token."
|
"If the following code asks for additional authentication, simply paste the link in a browser and enter the authentication token."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"from azureml.core import Workspace\n",
|
"from azureml.core import Workspace\n",
|
||||||
"ws = Workspace.from_config()"
|
"ws = Workspace.from_config()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Now create an experiment in your workspace. An experiment is another foundational cloud resource that represents a collection of trials (individual model runs). In this tutorial you use the experiment to create runs and track your model training in the Azure Portal. Parameters include your workspace reference, and a string name for the experiment."
|
"Now create an experiment in your workspace. An experiment is another foundational cloud resource that represents a collection of trials (individual model runs). In this tutorial you use the experiment to create runs and track your model training in the Azure Portal. Parameters include your workspace reference, and a string name for the experiment."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"from azureml.core import Experiment\n",
|
"from azureml.core import Experiment\n",
|
||||||
"experiment = Experiment(workspace=ws, name=\"diabetes-experiment\")"
|
"experiment = Experiment(workspace=ws, name=\"diabetes-experiment\")"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"## Load data and prepare for training"
|
"## Load data and prepare for training"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"For this tutorial, you use the diabetes data set, which is a pre-normalized data set included in scikit-learn. This data set uses features like age, gender, and BMI to predict diabetes disease progression. Load the data from the `load_diabetes()` static function, and split it into training and test sets using `train_test_split()`. This function segregates the data so the model has unseen data to use for testing following training."
|
"For this tutorial, you use the diabetes data set, which is a pre-normalized data set included in scikit-learn. This data set uses features like age, gender, and BMI to predict diabetes disease progression. Load the data from the `load_diabetes()` static function, and split it into training and test sets using `train_test_split()`. This function segregates the data so the model has unseen data to use for testing following training."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"from sklearn.datasets import load_diabetes\n",
|
"from sklearn.datasets import load_diabetes\n",
|
||||||
"from sklearn.model_selection import train_test_split\n",
|
"from sklearn.model_selection import train_test_split\n",
|
||||||
"\n",
|
"\n",
|
||||||
"X, y = load_diabetes(return_X_y = True)\n",
|
"X, y = load_diabetes(return_X_y = True)\n",
|
||||||
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=66)"
|
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=66)"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"## Train a model"
|
"## Train a model"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Training a simple scikit-learn model can easily be done locally for small-scale training, but when training many iterations with dozens of different feature permutations and hyperparameter settings, it is easy to lose track of what models you've trained and how you trained them. The following design pattern shows how to leverage the SDK to easily keep track of your training in the cloud.\n",
|
"Training a simple scikit-learn model can easily be done locally for small-scale training, but when training many iterations with dozens of different feature permutations and hyperparameter settings, it is easy to lose track of what models you've trained and how you trained them. The following design pattern shows how to leverage the SDK to easily keep track of your training in the cloud.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Build a script that trains ridge models in a loop through different hyperparameter alpha values."
|
"Build a script that trains ridge models in a loop through different hyperparameter alpha values."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"from sklearn.linear_model import Ridge\n",
|
"from sklearn.linear_model import Ridge\n",
|
||||||
"from sklearn.metrics import mean_squared_error\n",
|
"from sklearn.metrics import mean_squared_error\n",
|
||||||
"from sklearn.externals import joblib\n",
|
"from sklearn.externals import joblib\n",
|
||||||
"import math\n",
|
"import math\n",
|
||||||
"\n",
|
"\n",
|
||||||
"alphas = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]\n",
|
"alphas = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]\n",
|
||||||
"\n",
|
"\n",
|
||||||
"for alpha in alphas:\n",
|
"for alpha in alphas:\n",
|
||||||
" run = experiment.start_logging()\n",
|
" run = experiment.start_logging()\n",
|
||||||
" run.log(\"alpha_value\", alpha)\n",
|
" run.log(\"alpha_value\", alpha)\n",
|
||||||
" \n",
|
" \n",
|
||||||
" model = Ridge(alpha=alpha)\n",
|
" model = Ridge(alpha=alpha)\n",
|
||||||
" model.fit(X=X_train, y=y_train)\n",
|
" model.fit(X=X_train, y=y_train)\n",
|
||||||
" y_pred = model.predict(X=X_test)\n",
|
" y_pred = model.predict(X=X_test)\n",
|
||||||
" rmse = math.sqrt(mean_squared_error(y_true=y_test, y_pred=y_pred))\n",
|
" rmse = math.sqrt(mean_squared_error(y_true=y_test, y_pred=y_pred))\n",
|
||||||
" run.log(\"rmse\", rmse)\n",
|
" run.log(\"rmse\", rmse)\n",
|
||||||
" \n",
|
" \n",
|
||||||
" model_name = \"model_alpha_\" + str(alpha) + \".pkl\"\n",
|
" model_name = \"model_alpha_\" + str(alpha) + \".pkl\"\n",
|
||||||
" filename = \"outputs/\" + model_name\n",
|
" filename = \"outputs/\" + model_name\n",
|
||||||
" \n",
|
" \n",
|
||||||
" joblib.dump(value=model, filename=filename)\n",
|
" joblib.dump(value=model, filename=filename)\n",
|
||||||
" run.upload_file(name=model_name, path_or_stream=filename)\n",
|
" run.upload_file(name=model_name, path_or_stream=filename)\n",
|
||||||
" run.complete()"
|
" run.complete()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"The above code accomplishes the following:\n",
|
"The above code accomplishes the following:\n",
|
||||||
"\n",
|
"\n",
|
||||||
"1. For each alpha hyperparameter value in the `alphas` array, a new run is created within the experiment. The alpha value is logged to differentiate between each run.\n",
|
"1. For each alpha hyperparameter value in the `alphas` array, a new run is created within the experiment. The alpha value is logged to differentiate between each run.\n",
|
||||||
"1. In each run, a Ridge model is instantiated, trained, and used to run predictions. The root-mean-squared-error is calculated for the actual versus predicted values, and then logged to the run. At this point the run has metadata attached for both the alpha value and the rmse accuracy.\n",
|
"1. In each run, a Ridge model is instantiated, trained, and used to run predictions. The root-mean-squared-error is calculated for the actual versus predicted values, and then logged to the run. At this point the run has metadata attached for both the alpha value and the rmse accuracy.\n",
|
||||||
"1. Next, the model for each run is serialized and uploaded to the run. This allows you to download the model file from the run in the portal.\n",
|
"1. Next, the model for each run is serialized and uploaded to the run. This allows you to download the model file from the run in the portal.\n",
|
||||||
"1. At the end of each iteration the run is completed by calling `run.complete()`.\n",
|
"1. At the end of each iteration the run is completed by calling `run.complete()`.\n",
|
||||||
"\n"
|
"\n"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"After the training has completed, call the `experiment` variable to fetch a link to the experiment in the portal."
|
"After the training has completed, call the `experiment` variable to fetch a link to the experiment in the portal."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"experiment"
|
"experiment"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"## View training results in portal"
|
"## View training results in portal"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Following the **Link to Azure Portal** takes you to the main experiment page. Here you see all the individual runs in the experiment. Any custom-logged values (`alpha_value` and `rmse`, in this case) become fields for each run, and also become available for the charts and tiles at the top of the experiment page. To add a logged metric to a chart or tile, hover over it, click the edit button, and find your custom-logged metric.\n",
|
"Following the **Link to Azure Portal** takes you to the main experiment page. Here you see all the individual runs in the experiment. Any custom-logged values (`alpha_value` and `rmse`, in this case) become fields for each run, and also become available for the charts and tiles at the top of the experiment page. To add a logged metric to a chart or tile, hover over it, click the edit button, and find your custom-logged metric.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"When training models at scale over hundreds and thousands of runs, this page makes it easy to see every model you trained, specifically how they were trained, and how your unique metrics have changed over time."
|
"When training models at scale over hundreds and thousands of runs, this page makes it easy to see every model you trained, specifically how they were trained, and how your unique metrics have changed over time."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
""
|
""
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Clicking on a run number link in the `RUN NUMBER` column takes you to the page for each individual run. The default tab **Details** shows you more-detailed information on each run. Navigate to the **Outputs** tab, and you see the `.pkl` file for the model that was uploaded to the run during each training iteration. Here you can download the model file, rather than having to retrain it manually."
|
"Clicking on a run number link in the `RUN NUMBER` column takes you to the page for each individual run. The default tab **Details** shows you more-detailed information on each run. Navigate to the **Outputs** tab, and you see the `.pkl` file for the model that was uploaded to the run during each training iteration. Here you can download the model file, rather than having to retrain it manually."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
""
|
""
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"## Get the best model"
|
"## Get the best model"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"In addition to being able to download model files from the experiment in the portal, you can also download them programmatically. The following code iterates through each run in the experiment, and accesses both the logged run metrics and the run details (which contains the run_id). This keeps track of the best run, in this case the run with the lowest root-mean-squared-error."
|
"In addition to being able to download model files from the experiment in the portal, you can also download them programmatically. The following code iterates through each run in the experiment, and accesses both the logged run metrics and the run details (which contains the run_id). This keeps track of the best run, in this case the run with the lowest root-mean-squared-error."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"minimum_rmse_runid = None\n",
|
"minimum_rmse_runid = None\n",
|
||||||
"minimum_rmse = None\n",
|
"minimum_rmse = None\n",
|
||||||
"\n",
|
"\n",
|
||||||
"for run in experiment.get_runs():\n",
|
"for run in experiment.get_runs():\n",
|
||||||
" run_metrics = run.get_metrics()\n",
|
" run_metrics = run.get_metrics()\n",
|
||||||
" run_details = run.get_details()\n",
|
" run_details = run.get_details()\n",
|
||||||
" # each logged metric becomes a key in this returned dict\n",
|
" # each logged metric becomes a key in this returned dict\n",
|
||||||
" run_rmse = run_metrics[\"rmse\"]\n",
|
" run_rmse = run_metrics[\"rmse\"]\n",
|
||||||
" run_id = run_details[\"runId\"]\n",
|
" run_id = run_details[\"runId\"]\n",
|
||||||
" \n",
|
" \n",
|
||||||
" if minimum_rmse is None:\n",
|
" if minimum_rmse is None:\n",
|
||||||
" minimum_rmse = run_rmse\n",
|
" minimum_rmse = run_rmse\n",
|
||||||
" minimum_rmse_runid = run_id\n",
|
" minimum_rmse_runid = run_id\n",
|
||||||
" else:\n",
|
" else:\n",
|
||||||
" if run_rmse < minimum_rmse:\n",
|
" if run_rmse < minimum_rmse:\n",
|
||||||
" minimum_rmse = run_rmse\n",
|
" minimum_rmse = run_rmse\n",
|
||||||
" minimum_rmse_runid = run_id\n",
|
" minimum_rmse_runid = run_id\n",
|
||||||
"\n",
|
"\n",
|
||||||
"print(\"Best run_id: \" + minimum_rmse_runid)\n",
|
"print(\"Best run_id: \" + minimum_rmse_runid)\n",
|
||||||
"print(\"Best run_id rmse: \" + str(minimum_rmse)) "
|
"print(\"Best run_id rmse: \" + str(minimum_rmse)) "
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Use the best run id to fetch the individual run using the `Run` constructor along with the experiment object. Then call `get_file_names()` to see all the files available for download from this run. In this case, you only uploaded one file for each run during training."
|
"Use the best run id to fetch the individual run using the `Run` constructor along with the experiment object. Then call `get_file_names()` to see all the files available for download from this run. In this case, you only uploaded one file for each run during training."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"from azureml.core import Run\n",
|
"from azureml.core import Run\n",
|
||||||
"best_run = Run(experiment=experiment, run_id=minimum_rmse_runid)\n",
|
"best_run = Run(experiment=experiment, run_id=minimum_rmse_runid)\n",
|
||||||
"print(best_run.get_file_names())"
|
"print(best_run.get_file_names())"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Call `download()` on the run object, specifying the model file name to download. By default this function downloads to the current directory."
|
"Call `download()` on the run object, specifying the model file name to download. By default this function downloads to the current directory."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"best_run.download_file(name=\"model_alpha_0.1.pkl\")"
|
"best_run.download_file(name=\"model_alpha_0.1.pkl\")"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
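For reference, `Run.download_file` also accepts an `output_file_path` argument when you want the file somewhere other than the current directory. The sketch below is not part of the original notebook; it assumes the model was serialized with joblib during training and that the `joblib` package is available in your environment.

```python
# Hypothetical usage sketch: download the pickled model to an explicit path and
# load it back for local inspection. Assumes the file was written with joblib.
import joblib

best_run.download_file(name="model_alpha_0.1.pkl",
                       output_file_path="./downloaded_model.pkl")
model = joblib.load("./downloaded_model.pkl")
print(model)
```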
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"## Clean up resources\n",
|
"## Clean up resources\n",
|
||||||
"\n",
|
"\n",
|
||||||
"Do not complete this section if you plan on running other Azure Machine Learning service tutorials.\n",
|
"Do not complete this section if you plan on running other Azure Machine Learning service tutorials.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"### Stop the notebook VM\n",
|
"### Stop the notebook VM\n",
|
||||||
"\n",
|
"\n",
|
||||||
"If you used a cloud notebook server, stop the VM when you are not using it to reduce cost.\n",
|
"If you used a cloud notebook server, stop the VM when you are not using it to reduce cost.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"1. In your workspace, select **Notebook VMs**.\n",
|
"1. In your workspace, select **Notebook VMs**.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"1. From the list, select the VM.\n",
|
"1. From the list, select the VM.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"1. Select **Stop**.\n",
|
"1. Select **Stop**.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"1. When you're ready to use the server again, select **Start**.\n",
|
"1. When you're ready to use the server again, select **Start**.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"### Delete everything\n",
|
"### Delete everything\n",
|
||||||
"\n",
|
"\n",
|
||||||
"If you don't plan to use the resources you created, delete them, so you don't incur any charges:\n",
|
"If you don't plan to use the resources you created, delete them, so you don't incur any charges:\n",
|
||||||
"\n",
|
"\n",
|
||||||
"1. In the Azure portal, select **Resource groups** on the far left.\n",
|
"1. In the Azure portal, select **Resource groups** on the far left.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"1. From the list, select the resource group you created.\n",
|
"1. From the list, select the resource group you created.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"1. Select **Delete resource group**.\n",
|
"1. Select **Delete resource group**.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"1. Enter the resource group name. Then select **Delete**.\n",
|
"1. Enter the resource group name. Then select **Delete**.\n",
|
||||||
"\n",
|
"\n",
|
||||||
"You can also keep the resource group but delete a single workspace. Display the workspace properties and select **Delete**."
|
"You can also keep the resource group but delete a single workspace. Display the workspace properties and select **Delete**."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"## Next steps\n",
|
"## Next steps\n",
|
||||||
"\n",
|
"\n",
|
||||||
"In this tutorial, you did the following tasks:\n",
|
"In this tutorial, you did the following tasks:\n",
|
||||||
"\n",
|
"\n",
|
||||||
"> * Connected your workspace and created an experiment\n",
|
"> * Connected your workspace and created an experiment\n",
|
||||||
"> * Loaded data and trained scikit-learn models\n",
|
"> * Loaded data and trained scikit-learn models\n",
|
||||||
"> * Viewed training results in the portal and retrieved models\n",
|
"> * Viewed training results in the portal and retrieved models\n",
|
||||||
"\n",
|
"\n",
|
||||||
"[Deploy your model](https://docs.microsoft.com/azure/machine-learning/service/tutorial-deploy-models-with-aml) with Azure Machine Learning.\n",
|
"[Deploy your model](https://docs.microsoft.com/azure/machine-learning/service/tutorial-deploy-models-with-aml) with Azure Machine Learning.\n",
|
||||||
"Learn how to develop [automated machine learning](https://docs.microsoft.com/azure/machine-learning/service/tutorial-auto-train-models) experiments."
|
"Learn how to develop [automated machine learning](https://docs.microsoft.com/azure/machine-learning/service/tutorial-auto-train-models) experiments."
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "trbye"
|
|
||||||
}
|
|
||||||
],
|
],
|
||||||
"kernelspec": {
|
"metadata": {
|
||||||
"display_name": "Python 3.6",
|
"authors": [
|
||||||
"language": "python",
|
{
|
||||||
"name": "python36"
|
"name": "trbye"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"kernelspec": {
|
||||||
|
"display_name": "Python 3.6",
|
||||||
|
"language": "python",
|
||||||
|
"name": "python36"
|
||||||
|
},
|
||||||
|
"language_info": {
|
||||||
|
"codemirror_mode": {
|
||||||
|
"name": "ipython",
|
||||||
|
"version": 3
|
||||||
|
},
|
||||||
|
"file_extension": ".py",
|
||||||
|
"mimetype": "text/x-python",
|
||||||
|
"name": "python",
|
||||||
|
"nbconvert_exporter": "python",
|
||||||
|
"pygments_lexer": "ipython3",
|
||||||
|
"version": "3.6.5"
|
||||||
|
},
|
||||||
|
"msauthor": "trbye"
|
||||||
},
|
},
|
||||||
"language_info": {
|
"nbformat": 4,
|
||||||
"codemirror_mode": {
|
"nbformat_minor": 2
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.6.5"
|
|
||||||
},
|
|
||||||
"msauthor": "trbye"
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 2
|
|
||||||
}
|
}
|
||||||
45
work-with-data/dataprep/data/ADLSgen2-datapreptest.crt
Normal file
45
work-with-data/dataprep/data/ADLSgen2-datapreptest.crt
Normal file
@@ -0,0 +1,45 @@
|
|||||||
|
-----BEGIN PRIVATE KEY-----
|
||||||
|
MIIEvgIBADANBgkqhkiG9w0BAQEFAASCBKgwggSkAgEAAoIBAQC/C0oc6vvF1UEc
|
||||||
|
y9JeGDXdtKynG11wTTIHIokFhNinHNSpJBLmNWFyFkqzvjJCPR4kWuqw4IXhCS3L
|
||||||
|
VoqRmT680SvUFFF6HnEaa75Bc1YSACn1ZsHuCRGrqO9BaTgt3mM0sRYC67+f+W0E
|
||||||
|
tA+k+EA0XnTtDdEBX3RLzvaYAR4yijEHIBQeeNemPYK4msW6Xw67ib1xn59blX4Z
|
||||||
|
a4Z85FjrekmoTl9493bFj6znDTX6wpKsPF7WLEF9S+oD/Lg4EHBi9BfefFxQpGZ9
|
||||||
|
FQHToFKyz1tA2iaY/9LjCtJcincMkuXt3KuQA4Nv2GiTzz4+FEy1pOqHnyNL2tFR
|
||||||
|
1G5n04BHAgMBAAECggEAAqcXeltQ76hMZSf3XdMcPF3b394jaAHKZgr2uBrmHzvp
|
||||||
|
QAf+MzAekET6+I/1hrHujzar95TGhx9ngWFMP0VPd7O31hQKJZXyoBlK5QHC+jEC
|
||||||
|
ZCPvIW0Cz81itRfO7eQeoIas9ZFscb4240/Uv8eqrI97NCdy9X/rz3mqNuYdEzqN
|
||||||
|
2v9XlwE/Fyx79O1PQqzPRiQt3n4ss9NO169y7X99KUZtYiZAiyBBGS8wYdaGF69G
|
||||||
|
URZ3qwoUE+nByZdeRfFLLTy+UDCOwQZV+0V4p0J++YLqQAac340A1F4D60qzMHnv
|
||||||
|
KVKnMc+RrYYVFOZU+USRlphSl3Ws5j0u94CiLitK4QKBgQDivJVHNmk1JleI/MPF
|
||||||
|
bx/YT5gzcVRFhGxkGso12JrQiFPs05JmoRFaqNBDNoZYDn2ggUrMwZVfPI5C6+7U
|
||||||
|
tCe2vrjVpvcAO9reK1u4N9ohpUpkocxWQy0nNHlrorDTZnyKreRtPC87W8xpiwl4
|
||||||
|
R/+nMgGd8vex7tGfchpThj8ZeQKBgQDXs2sgpE8vmnZBWrXAuGD8M9VnfcALEjwL
|
||||||
|
Fi3NR+XCr8jHkeIJVbSI2/asWsBGg8v6gV6Cdx9KV9r+fHDzdocS85X4P7crP83A
|
||||||
|
IX2rTT6Hsmc170SzCDa2jJJyLHQ6qtXBS9ZW8/dPFc1fiBf0NcmTLrRoNg5N8Px6
|
||||||
|
Qt0T51q3vwKBgQCYAfhOetMD2AW9iEAzwDFoUsxmSKdHx+TnI/LHMMVx4sPpNVqk
|
||||||
|
RX2d+ylMtmRQ6r4cejHMnkfnRnDVutkubu1lHe5LBpn35Sjx472k/oTWI7uBRdv5
|
||||||
|
RSYjb5GrsLG9uKrsSnKnLT85G20qoRUjN5nU3LiqzPZ0qviMXfH6ZzkseQKBgQCT
|
||||||
|
ft6MTY7QUGD4w5xxEiNPkeolgHmnmGpyclITg0x7WlSDEyBrna17wF3m8Y91KH58
|
||||||
|
56XGtMoyvezEBDgAY1ZuAR7VyEvqSRDahow2bPWLONUWrmxduAohvfIOHJPF4jeU
|
||||||
|
m9UPVHgSHih3YMpwda9G87LtZ7lUVqtutvYRvCvuZQKBgAypo514DZW7Y9lMCgkR
|
||||||
|
GpJLKCWFR0Sl9bQXI7N5nAG0YFz5ZhdA1PjS2tj+OKyWR6wekbv3g0CyVXT4XYsi
|
||||||
|
tKRu9PR2OUQLPv/h2qLAeSOYdScfWoOU5tlb4tkLoUNmj5/N9VpqbvLdDh6hPWQL
|
||||||
|
o4s+29QYKEoNmOrcZ6oRkRP8
|
||||||
|
-----END PRIVATE KEY-----
|
||||||
|
-----BEGIN CERTIFICATE-----
|
||||||
|
MIICoTCCAYkCAgPoMA0GCSqGSIb3DQEBBQUAMBQxEjAQBgNVBAMMCUNMSS1Mb2dp
|
||||||
|
bjAiGA8yMDE5MDUwMzIwMDIwOVoYDzIwMjAwNTAzMjAwMjExWjAUMRIwEAYDVQQD
|
||||||
|
DAlDTEktTG9naW4wggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQC/C0oc
|
||||||
|
6vvF1UEcy9JeGDXdtKynG11wTTIHIokFhNinHNSpJBLmNWFyFkqzvjJCPR4kWuqw
|
||||||
|
4IXhCS3LVoqRmT680SvUFFF6HnEaa75Bc1YSACn1ZsHuCRGrqO9BaTgt3mM0sRYC
|
||||||
|
67+f+W0EtA+k+EA0XnTtDdEBX3RLzvaYAR4yijEHIBQeeNemPYK4msW6Xw67ib1x
|
||||||
|
n59blX4Za4Z85FjrekmoTl9493bFj6znDTX6wpKsPF7WLEF9S+oD/Lg4EHBi9Bfe
|
||||||
|
fFxQpGZ9FQHToFKyz1tA2iaY/9LjCtJcincMkuXt3KuQA4Nv2GiTzz4+FEy1pOqH
|
||||||
|
nyNL2tFR1G5n04BHAgMBAAEwDQYJKoZIhvcNAQEFBQADggEBAGz3pOgNPESr+QoO
|
||||||
|
OVCgSS6VtWlmrAcxl5JaiNBFpBGAqfvbfRe1eZY7Rn6fuw1jc3pPBVzNTf8Plel+
|
||||||
|
DcuLzDLJAEag2GpRE+Xg57DNSwPqP6jZfHRE/ufLwIRLcNG9wRUwqlBvdAu1Kign
|
||||||
|
nlTZvTEAwxlQdvmIIT1XrTLZ+OwtVXcgrf0vInmueZKz/UDqsSDPY+d426S9eOWt
|
||||||
|
60h2WgXPU3QvBYfA6Yd2ReeP3+SHwBd4/1ByNFWBytcI9ow3pp2JznU366dfX4IQ
|
||||||
|
Q0iOTvHzXbfPmtsxqho6+hBbLvXVNWJMg8e22Pp/TyXYqeV5V09k18EgCnuA/9Gd
|
||||||
|
kKDVROA=
|
||||||
|
-----END CERTIFICATE-----
|
||||||
@@ -222,7 +222,7 @@
|
|||||||
"name": "python",
|
"name": "python",
|
||||||
"nbconvert_exporter": "python",
|
"nbconvert_exporter": "python",
|
||||||
"pygments_lexer": "ipython3",
|
"pygments_lexer": "ipython3",
|
||||||
"version": "3.6.8"
|
"version": "3.6.4"
|
||||||
},
|
},
|
||||||
"notice": "Copyright (c) Microsoft Corporation. All rights reserved. Licensed under the MIT License."
|
"notice": "Copyright (c) Microsoft Corporation. All rights reserved. Licensed under the MIT License."
|
||||||
},
|
},
|
||||||
|
|||||||
@@ -47,6 +47,7 @@
|
|||||||
"[Read PostgreSQL](#postgresql)<br>\n",
|
"[Read PostgreSQL](#postgresql)<br>\n",
|
||||||
"[Read From Azure Blob](#azure-blob)<br>\n",
|
"[Read From Azure Blob](#azure-blob)<br>\n",
|
||||||
"[Read From ADLS](#adls)<br>\n",
|
"[Read From ADLS](#adls)<br>\n",
|
||||||
|
"[Read From ADLSGen2](#adlsgen2)<br>\n",
|
||||||
"[Read Pandas DataFrame](#pandas-df)<br>"
|
"[Read Pandas DataFrame](#pandas-df)<br>"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -315,6 +316,25 @@
|
|||||||
"df"
|
"df"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"You can see in the results that the FBI Code column now contains some NaN values where before, when calling head, it didn't. By default, `to_pandas_dataframe` attempts to coalesce columns into a single type for better performance and lower memory overhead. This specific column has a mixutre of both numbers and strings and the strings were replaced with NaN values.\n",
|
||||||
|
"\n",
|
||||||
|
"If you wish to keep the mixed-type column in the Pandas DataFrame, you can set the `extended_types` argument to True when calling `to_pandas_dataframe`."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"df = dflow_skipped_rows.to_pandas_dataframe(extended_types=True)\n",
|
||||||
|
"df"
|
||||||
|
]
|
||||||
|
},
|
||||||
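As a hedged follow-up (not part of the original notebook), you can confirm that the mixed types were preserved by counting the Python types present in the column; the column name below is taken from the prose above and may differ in your data.

```python
# Count how many values in the mixed-type column are strings vs. numbers after
# reading with extended_types=True. "FBI Code" is assumed from the text above.
print(df["FBI Code"].map(type).value_counts())
```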
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -635,7 +655,7 @@
|
|||||||
"metadata": {},
|
"metadata": {},
|
||||||
"outputs": [],
|
"outputs": [],
|
||||||
"source": [
|
"source": [
|
||||||
"df = dflow.to_pandas_dataframe()\n",
|
"df = dflow.to_pandas_dataframe(extended_types=True)\n",
|
||||||
"df.dtypes"
|
"df.dtypes"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
@@ -751,7 +771,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"There are two ways the Data Prep API can acquire the necessary OAuth token to access Azure DataLake Storage:\n",
|
"Data Prep currently supports both ADLS and ADLSGen2. There are two ways the Data Prep API can acquire the necessary OAuth token to access Azure DataLake Storage:\n",
|
||||||
"1. Retrieve the access token from a recent login session of the user's [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest) login.\n",
|
"1. Retrieve the access token from a recent login session of the user's [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest) login.\n",
|
||||||
"2. Use a ServicePrincipal (SP) and a certificate as a secret."
|
"2. Use a ServicePrincipal (SP) and a certificate as a secret."
|
||||||
]
|
]
|
||||||
@@ -883,6 +903,70 @@
|
|||||||
"dflow.to_pandas_dataframe().head()"
|
"dflow.to_pandas_dataframe().head()"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"<a id=\"adlsgen2\"></a>"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"## Read from ADLSGen2"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Please refer to the Read for ADLS section above to get details of how to register a Service Principal and obtain an OAuth access token.[ADLS](http://localhost:8888/notebooks/notebooks/how-to-guides/data-ingestion.ipynb#adls)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Configure ADLSGen2 Account for ServicePrincipal"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"certThumbprint = '23:66:84:6B:3A:14:9E:B1:17:CA:EE:E3:BB:2C:21:2D:20:B0:DF:F2'\n",
|
||||||
|
"certificate = ''\n",
|
||||||
|
"with open('../data/ADLSgen2-datapreptest.crt', 'rt', encoding='utf-8') as crtFile:\n",
|
||||||
|
" certificate = crtFile.read()\n",
|
||||||
|
"\n",
|
||||||
|
"servicePrincipalAppId = \"127a58c3-f307-46a1-969e-a6b63da3f411\""
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"### Acquire an OAuth Access Token for ADLSGen2"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import adal\n",
|
||||||
|
"from azureml.dataprep.api.datasources import ADLSGen2\n",
|
||||||
|
"\n",
|
||||||
|
"ctx = adal.AuthenticationContext('https://login.microsoftonline.com/72f988bf-86f1-41af-91ab-2d7cd011db47')\n",
|
||||||
|
"token = ctx.acquire_token_with_client_certificate('https://storage.azure.com/', servicePrincipalAppId, certificate, certThumbprint)\n",
|
||||||
|
"dflow = dprep.read_csv(path = ADLSGen2(path='https://adlsgen2datapreptest.dfs.core.windows.net/datapreptest/people.csv', accessToken=token['accessToken']))\n",
|
||||||
|
"dflow.to_pandas_dataframe().head()"
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
@@ -923,7 +1007,24 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"After loading in the data you can now do `read_pandas_dataframe`."
|
"After loading in the data you can now do `read_pandas_dataframe`. If you only need to consume the Dataflow created from the current environment, you can read the DataFrame in memory."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"dflow_df = dprep.read_pandas_dataframe(df, in_memory=True)\n",
|
||||||
|
"dflow_df.head(5)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"However, if you intend to use this Dataflow past the end of your current Python session (such as by saving the Dataflow to a file), you can provide a cache directory where the contents of the DataFrame will be stored so they can be retrieved later."
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
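A minimal sketch of that second case follows. It is not part of the original notebook and assumes that `read_pandas_dataframe` exposes a `temp_folder` argument naming the cache directory; check the Data Prep SDK version you have installed.

```python
# Hypothetical sketch: back the Dataflow with an on-disk cache so it remains
# usable after this Python session ends. temp_folder is assumed to be the
# parameter that names the cache directory.
dflow_cached = dprep.read_pandas_dataframe(df, temp_folder="./dataflow_cache")
dflow_cached.head(5)
```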
{
|
{
|
||||||
|
|||||||
@@ -183,6 +183,37 @@
|
|||||||
"dflow_adls = dprep.read_csv(path=DataPath(datastore, path_on_datastore='/input/crime0-10.csv'))\n",
|
"dflow_adls = dprep.read_csv(path=DataPath(datastore, path_on_datastore='/input/crime0-10.csv'))\n",
|
||||||
"dflow_adls.head(5)"
|
"dflow_adls.head(5)"
|
||||||
]
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "markdown",
|
||||||
|
"metadata": {},
|
||||||
|
"source": [
|
||||||
|
"Now you can read all the files in the `dataprep_adlsgen2` datastore which references an ADLSGen2 Storage account."
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# read a file from ADLSGen2\n",
|
||||||
|
"datastore = Datastore(workspace=workspace, name='adlsgen2')\n",
|
||||||
|
"dflow_adlsgen2 = dprep.read_csv(path=DataPath(datastore, path_on_datastore='/testfolder/peopletest.csv'))\n",
|
||||||
|
"dflow_adlsgen2.head(5)"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": null,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"# read all files from ADLSGen2 directory\n",
|
||||||
|
"datastore = Datastore(workspace=workspace, name='adlsgen2')\n",
|
||||||
|
"dflow_adlsgen2 = dprep.read_csv(path=DataPath(datastore, path_on_datastore='/testfolder/testdir'))\n",
|
||||||
|
"dflow_adlsgen2.head()"
|
||||||
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
"metadata": {
|
"metadata": {
|
||||||
|
|||||||
@@ -186,7 +186,7 @@
|
|||||||
"cell_type": "markdown",
|
"cell_type": "markdown",
|
||||||
"metadata": {},
|
"metadata": {},
|
||||||
"source": [
|
"source": [
|
||||||
"Now we have successfully split the data into useful columns through examples. "
|
"Now we have successfully split the data into useful columns through examples."
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
],
|
],
|
||||||
|
|||||||