Merge pull request #333 from rastala/master

version 1.0.30
2025-12-21 18:15:13 -05:00 · 2019-04-22 15:40:11 -04:00 · 2019-04-22 15:39:18 -04:00 · 2019-04-18 16:19:51 -04:00 · 2019-04-18 16:18:33 -04:00 · 2019-04-18 10:01:50 -04:00
35 changed files with 2441 additions and 443 deletions
--- a/Dockerfiles/1.0.23/Dockerfile
+++ b/Dockerfiles/1.0.23/Dockerfile
@@ -0,0 +1,29 @@
 FROM continuumio/miniconda:4.5.11
 # install git
 RUN apt-get update && apt-get upgrade -y && apt-get install -y git
 # create a new conda environment named azureml
 RUN conda create -n azureml -y -q Python=3.6
 # install additional packages used by sample notebooks. this is optional
 RUN ["/bin/bash", "-c", "source activate azureml && conda install -y tqdm cython matplotlib scikit-learn"]
 # install azureml-sdk components
 RUN ["/bin/bash", "-c", "source activate azureml && pip install azureml-sdk[notebooks]==1.0.23"]
 # clone Azure ML GitHub sample notebooks
 RUN cd /home && git clone -b "azureml-sdk-1.0.23" --single-branch https://github.com/Azure/MachineLearningNotebooks.git
 # generate jupyter configuration file
 RUN ["/bin/bash", "-c", "source activate azureml && mkdir ~/.jupyter && cd ~/.jupyter && jupyter notebook --generate-config"]
 # set an empty token for Jupyter to remove authentication.
 # this is NOT recommended for production environments
 RUN echo "c.NotebookApp.token = ''" >> ~/.jupyter/jupyter_notebook_config.py
 # open up port 8887 on the container
 EXPOSE 8887
 # start Jupyter notebook server on port 8887 when the container starts
 CMD /bin/bash -c "cd /home/MachineLearningNotebooks && source activate azureml && jupyter notebook --port 8887 --no-browser --ip 0.0.0.0 --allow-root"
--- a/configuration.ipynb
+++ b/configuration.ipynb
@@ -96,7 +96,7 @@
      "source": [
        "import azureml.core\n",
        "\n",
-        "print(\"This notebook was created using version 1.0.21 of the Azure ML SDK\")\n",
+        "print(\"This notebook was created using version 1.0.23 of the Azure ML SDK\")\n",
        "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
      ]
    },
--- a/how-to-use-azureml/automated-machine-learning/README.md
+++ b/how-to-use-azureml/automated-machine-learning/README.md
@@ -211,10 +211,18 @@ The main code of the file must be indented so that it is under this condition.
 <a name="troubleshooting"></a>
 # Troubleshooting
 ## automl_setup fails
-1. On windows, make sure that you are running automl_setup from an Anconda Prompt window rather than a regular cmd window.  You can launch the "Anaconda Prompt" window by hitting the Start button and typing "Anaconda Prompt".  If you don't see the application "Anaconda Prompt", you might not have conda or mini conda installed.  In that case, you can install it [here](https://conda.io/miniconda.html)
+1. On Windows, make sure that you are running automl_setup from an Anaconda Prompt window rather than a regular cmd window.  You can launch the "Anaconda Prompt" window by hitting the Start button and typing "Anaconda Prompt".  If you don't see the application "Anaconda Prompt", you might not have conda or Miniconda installed.  In that case, you can install it [here](https://conda.io/miniconda.html)
 2. Check that you have conda 64-bit installed rather than 32-bit.  You can check this with the command `conda info`.  The `platform` should be `win-64` for Windows or `osx-64` for Mac.
 3. Check that you have conda 4.4.10 or later.  You can check the version with the command `conda -V`.  If you have a previous version installed, you can update it using the command: `conda update conda`.
-4. Pass a new name as the first parameter to automl_setup so that it creates a new conda environment. You can view existing conda environments using `conda env list` and remove them with `conda env remove -n <environmentname>`. 
+4. On Linux, if the error is `gcc: error trying to exec 'cc1plus': execvp: No such file or directory`, install build essentials using the command `sudo apt-get install build-essential`.
 5. Pass a new name as the first parameter to automl_setup so that it creates a new conda environment. You can view existing conda environments using `conda env list` and remove them with `conda env remove -n <environmentname>`. 
 ## automl_setup_linux.sh fails
 If automl_setup_linux.sh fails on Ubuntu Linux with the error: `unable to execute 'gcc': No such file or directory`
 1. Make sure that outbound ports 53 and 80 are enabled.  On an Azure VM, you can do this from the Azure Portal by selecting the VM and clicking on Networking.
 2. Run the command: `sudo apt-get update`
 3. Run the command: `sudo apt-get install build-essential --fix-missing`
 4. Run `automl_setup_linux.sh` again.
 ## configuration.ipynb fails
 1) For local conda, make sure that you have successfully run automl_setup first.
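Note on the conda version check in the troubleshooting list above: a minimal sketch of how the `conda -V` comparison against 4.4.10 could be automated. The helper names (`parse_conda_version`, `conda_is_recent_enough`, `check_installed_conda`) are hypothetical and not part of automl_setup; the sketch assumes the command reports a string such as `conda 4.5.11` and that a `conda` executable is on PATH.

```python
import re
import subprocess

# Minimum conda version named in the troubleshooting list.
MIN_CONDA = (4, 4, 10)


def parse_conda_version(output):
    """Extract a (major, minor, patch) tuple from `conda -V` style output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("no version number found in: %r" % output)
    return tuple(int(part) for part in match.groups())


def conda_is_recent_enough(output, minimum=MIN_CONDA):
    """Return True when the reported version is at least `minimum`."""
    # Tuples compare element by element, so (4, 5, 11) >= (4, 4, 10).
    return parse_conda_version(output) >= minimum


def check_installed_conda():
    """Run `conda -V` and compare the result (assumes conda is on PATH)."""
    result = subprocess.run(
        ["conda", "-V"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # read both streams as one string
        universal_newlines=True,
    )
    return conda_is_recent_enough(result.stdout)
```

If `check_installed_conda()` returns `False`, the README's advice applies: run `conda update conda` before retrying automl_setup.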
--- a/how-to-use-azureml/automated-machine-learning/classification-with-deployment/auto-ml-classification-with-deployment.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/classification-with-deployment/auto-ml-classification-with-deployment.ipynb
@@ -302,7 +302,8 @@
      "source": [
        "from azureml.core.conda_dependencies import CondaDependencies\n",
        "\n",
-        "myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn'], pip_packages=['azureml-sdk[automl]'])\n",
+        "myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn','py-xgboost<=0.80'],\n",
        "                                 pip_packages=['azureml-sdk[automl]'])\n",
        "\n",
        "conda_env_file_name = 'myenv.yml'\n",
        "myenv.save_to_file('.', conda_env_file_name)"
--- a/how-to-use-azureml/automated-machine-learning/classification/auto-ml-classification.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/classification/auto-ml-classification.ipynb
@@ -72,6 +72,32 @@
        "from azureml.train.automl import AutoMLConfig"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Accessing the Azure ML workspace requires authentication with Azure.\n",
        "\n",
        "The default authentication is interactive authentication using the default tenant.  Executing the `ws = Workspace.from_config()` line in the cell below will prompt for authentication the first time that it is run.\n",
        "\n",
        "If you have multiple Azure tenants, you can specify the tenant by replacing the `ws = Workspace.from_config()` line in the cell below with the following:\n",
        "\n",
        "```\n",
        "from azureml.core.authentication import InteractiveLoginAuthentication\n",
        "auth = InteractiveLoginAuthentication(tenant_id = 'mytenantid')\n",
        "ws = Workspace.from_config(auth = auth)\n",
        "```\n",
        "\n",
        "If you need to run in an environment where interactive login is not possible, you can use Service Principal authentication by replacing the `ws = Workspace.from_config()` line in the cell below with the following:\n",
        "\n",
        "```\n",
        "from azureml.core.authentication import ServicePrincipalAuthentication\n",
        "auth = ServicePrincipalAuthentication('mytenantid', 'myappid', 'mypassword')\n",
        "ws = Workspace.from_config(auth = auth)\n",
        "```\n",
        "For more details, see [aka.ms/aml-notebook-auth](http://aka.ms/aml-notebook-auth)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
@@ -133,11 +159,10 @@
        "|-|-|\n",
        "|**task**|classification or regression|\n",
        "|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics: <br><i>accuracy</i><br><i>AUC_weighted</i><br><i>average_precision_score_weighted</i><br><i>norm_macro_recall</i><br><i>precision_score_weighted</i>|\n",
        "|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n",
        "|**iterations**|Number of iterations. In each iteration AutoML trains a specific pipeline with the data.|\n",
        "|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n",
        "|**y**|(sparse) array-like, shape = [n_samples, ], Multi-class targets.|\n",
-        "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder.|"
+        "|**n_cross_validations**|Number of cross validation splits.|\n",
        "|<i>Exit Criteria [optional]</i><br><br>iterations<br>experiment_timeout_minutes|An optional duration parameter that says how long AutoML should be run.<br>This could be either number of iterations or number of minutes AutoML is allowed to run. <br><br><i>iterations</i> number of algorithm iterations to run<br><i>experiment_timeout_minutes</i> is the number of minutes that AutoML should run<br><br>By default, this is set to stop whenever AutoML determines that progress in scores is not being made|"
      ]
    },
    {
@@ -147,14 +172,10 @@
      "outputs": [],
      "source": [
        "automl_config = AutoMLConfig(task = 'classification',\n",
        "                             debug_log = 'automl_errors.log',\n",
        "                             primary_metric = 'AUC_weighted',\n",
        "                             iteration_timeout_minutes = 60,\n",
        "                             iterations = 25,\n",
        "                             verbosity = logging.INFO,\n",
        "                             X = X_train, \n",
        "                             y = y_train,\n",
-        "                             path = project_folder)"
+        "                             n_cross_validations = 3)"
      ]
    },
    {
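Note on `n_cross_validations = 3` in the hunk above: the parameter is documented only as "Number of cross validation splits." As a rough illustration of what a split count means, here is a plain k-fold index splitter. It is a generic sketch, not AutoML's implementation; whether AutoML shuffles or stratifies is not stated in this change, and `k_fold_indices` is a hypothetical helper.

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train_indices, validation_indices) for each of n_splits folds."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        validation = list(range(start, stop))
        # Every sample outside the validation fold is used for training.
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


# With three splits, each sample is held out for validation exactly once.
for train_idx, valid_idx in k_fold_indices(n_samples=10, n_splits=3):
    print(len(train_idx), len(valid_idx))
```

With cross validation, the training data `X` and `y` is partitioned into folds, so no separate validation set needs to be passed.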
--- a/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand/auto-ml-forecasting-energy-demand.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand/auto-ml-forecasting-energy-demand.ipynb
@@ -37,7 +37,8 @@
        "2. Instantiating AutoMLConfig with new task type \"forecasting\" for timeseries data training, and other timeseries related settings: for this dataset we use the basic one: \"time_column_name\" \n",
        "3. Training the Model using local compute\n",
        "4. Exploring the results\n",
-        "5. Testing the fitted model"
+        "5. Viewing the engineered names for featurized data and featurization summary for all raw features\n",
        "6. Testing the fitted model"
      ]
    },
    {
@@ -126,7 +127,7 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "### Split the data to train and test\n",
+        "### Get the train data\n",
        "\n"
      ]
    },
@@ -172,14 +173,10 @@
      "metadata": {},
      "outputs": [],
      "source": [
-        "X_train = train[train['timeStamp'] < '2017-01-01']\n",
+        "X_train = train\n",
        "X_valid = train[train['timeStamp'] >= '2017-01-01']\n",
        "y_train = X_train.pop('demand').values\n",
        "y_valid = X_valid.pop('demand').values\n",
        "print(X_train.shape)\n",
-        "print(y_train.shape)\n",
+        "print(y_train.shape)"
        "print(X_valid.shape)\n",
        "print(y_valid.shape)"
      ]
    },
    {
@@ -198,8 +195,7 @@
        "|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n",
        "|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n",
        "|**y**|(sparse) array-like, shape = [n_samples, ], targets values.|\n",
-        "|**X_valid**|Data used to evaluate a model in a iteration. (sparse) array-like, shape = [n_samples, n_features]|\n",
+        "|**n_cross_validations**|Number of cross validation splits.|\n",
        "|**y_valid**|Data used to evaluate a model in a iteration. (sparse) array-like, shape = [n_samples, ], targets values.|\n",
        "|**path**|Relative path to the project folder.  AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder. "
      ]
    },
@@ -222,8 +218,7 @@
        "                             iteration_timeout_minutes = 5,\n",
        "                             X = X_train,\n",
        "                             y = y_train,\n",
-        "                             X_valid = X_valid,\n",
+        "                             n_cross_validations = 2,\n",
        "                             y_valid = y_valid,\n",
        "                             path=project_folder,\n",
        "                             verbosity = logging.INFO,\n",
        "                            **automl_settings)"
@@ -273,6 +268,45 @@
        "fitted_model.steps"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### View the engineered names for featurized data\n",
        "Below we display the engineered feature names generated for the featurized data using the time-series featurization."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fitted_model.named_steps['timeseriestransformer'].get_engineered_feature_names()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### View the featurization summary\n",
        "Below we display the featurization that was performed on different raw features in the user data. For each raw feature in the user data, the following information is displayed:\n",
        "- Raw feature name\n",
        "- Number of engineered features formed out of this raw feature\n",
        "- Type detected\n",
        "- If feature was dropped\n",
        "- List of feature transformations for the raw feature"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fitted_model.named_steps['timeseriestransformer'].get_featurization_summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
--- a/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb
@@ -36,7 +36,8 @@
        "1. Create an Experiment in an existing Workspace\n",
        "2. Instantiate an AutoMLConfig \n",
        "3. Find and train a forecasting model using local compute\n",
-        "4. Evaluate the performance of the model\n",
+        "4. Viewing the engineered names for featurized data and featurization summary for all raw features\n",
        "5. Evaluate the performance of the model\n",
        "\n",
        "The examples in the following code samples use the University of Chicago's Dominick's Finer Foods dataset to forecast orange juice sales. Dominick's was a grocery chain in the Chicago metropolitan area."
      ]
@@ -320,6 +321,45 @@
        "fitted_pipeline.steps"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### View the engineered names for featurized data\n",
        "Below we display the engineered feature names generated for the featurized data using the time-series featurization."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fitted_pipeline.named_steps['timeseriestransformer'].get_engineered_feature_names()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### View the featurization summary\n",
        "Below we display the featurization that was performed on different raw features in the user data. For each raw feature in the user data, the following information is displayed:\n",
        "- Raw feature name\n",
        "- Number of engineered features formed out of this raw feature\n",
        "- Type detected\n",
        "- If feature was dropped\n",
        "- List of feature transformations for the raw feature"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fitted_pipeline.named_steps['timeseriestransformer'].get_featurization_summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
--- a/how-to-use-azureml/automated-machine-learning/missing-data-blacklist-early-termination/auto-ml-missing-data-blacklist-early-termination.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/missing-data-blacklist-early-termination/auto-ml-missing-data-blacklist-early-termination.ipynb
@@ -37,8 +37,9 @@
        "In this notebook you will learn how to:\n",
        "1. Create an `Experiment` in an existing `Workspace`.\n",
        "2. Configure AutoML using `AutoMLConfig`.\n",
-        "4. Train the model.\n",
+        "3. Train the model.\n",
-        "5. Explore the results.\n",
+        "4. Explore the results.\n",
        "5. Viewing the engineered names for featurized data and featurization summary for all raw features.\n",
        "6. Test the best fitted model.\n",
        "\n",
        "In addition this notebook showcases the following features\n",
@@ -316,6 +317,45 @@
        "# best_run, fitted_model = local_run.get_output(iteration = iteration)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### View the engineered names for featurized data\n",
        "Below we display the engineered feature names generated for the featurized data using the preprocessing featurization."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fitted_model.named_steps['datatransformer'].get_engineered_feature_names()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### View the featurization summary\n",
        "Below we display the featurization that was performed on different raw features in the user data. For each raw feature in the user data, the following information is displayed:\n",
        "- Raw feature name\n",
        "- Number of engineered features formed out of this raw feature\n",
        "- Type detected\n",
        "- If feature was dropped\n",
        "- List of feature transformations for the raw feature"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fitted_model.named_steps['datatransformer'].get_featurization_summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
--- a/how-to-use-azureml/automated-machine-learning/model-explanation/auto-ml-model-explanation.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/model-explanation/auto-ml-model-explanation.ipynb
@@ -305,7 +305,7 @@
        "from azureml.train.automl.automlexplainer import explain_model\n",
        "\n",
        "shap_values, expected_values, overall_summary, overall_imp, per_class_summary, per_class_imp = \\\n",
-        "    explain_model(fitted_model, X_train, X_test)"
+        "    explain_model(fitted_model, X_train, X_test, features=features)"
      ]
    },
    {
--- a/how-to-use-azureml/automated-machine-learning/remote-attach/auto-ml-remote-attach.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/remote-attach/auto-ml-remote-attach.ipynb
@@ -40,7 +40,8 @@
        "3. Configure AutoML using `AutoMLConfig`.\n",
        "4. Train the model using the DSVM.\n",
        "5. Explore the results.\n",
-        "6. Test the best fitted model.\n",
+        "6. Viewing the engineered names for featurized data and featurization summary for all raw features.\n",
        "7. Test the best fitted model.\n",
        "\n",
        "In addition this notebook showcases the following features\n",
        "- **Parallel** executions for iterations\n",
@@ -160,6 +161,7 @@
      "source": [
        "from azureml.core.runconfig import RunConfiguration\n",
        "from azureml.core.conda_dependencies import CondaDependencies\n",
        "import pkg_resources\n",
        "\n",
        "# create a new RunConfig object\n",
        "conda_run_config = RunConfiguration(framework=\"python\")\n",
@@ -167,7 +169,9 @@
        "# Set compute target to the Linux DSVM\n",
        "conda_run_config.target = dsvm_compute\n",
        "\n",
-        "cd = CondaDependencies.create(pip_packages=['azureml-sdk[automl]'], conda_packages=['numpy','py-xgboost<=0.80'])\n",
+        "pandas_dependency = 'pandas==' + pkg_resources.get_distribution(\"pandas\").version\n",
        "\n",
        "cd = CondaDependencies.create(pip_packages=['azureml-sdk[automl]'], conda_packages=['numpy','py-xgboost<=0.80',pandas_dependency])\n",
        "conda_run_config.environment.python.conda_dependencies = cd"
      ]
    },
@@ -407,6 +411,45 @@
        "print(fitted_model)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### View the engineered names for featurized data\n",
        "Below we display the engineered feature names generated for the featurized data using the preprocessing featurization."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fitted_model.named_steps['datatransformer'].get_engineered_feature_names()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### View the featurization summary\n",
        "Below we display the featurization that was performed on different raw features in the user data. For each raw feature in the user data, the following information is displayed:\n",
        "- Raw feature name\n",
        "- Number of engineered features formed out of this raw feature\n",
        "- Type detected\n",
        "- If feature was dropped\n",
        "- List of feature transformations for the raw feature"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fitted_model.named_steps['datatransformer'].get_featurization_summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
--- a/how-to-use-azureml/automated-machine-learning/remote-execution-with-datastore/auto-ml-remote-execution-with-datastore.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/remote-execution-with-datastore/auto-ml-remote-execution-with-datastore.ipynb
@@ -245,6 +245,7 @@
      "source": [
        "from azureml.core.runconfig import RunConfiguration\n",
        "from azureml.core.conda_dependencies import CondaDependencies\n",
        "import pkg_resources\n",
        "\n",
        "# create a new RunConfig object\n",
        "conda_run_config = RunConfiguration(framework=\"python\")\n",
@@ -254,7 +255,9 @@
        "# set the data reference of the run configuration\n",
        "conda_run_config.data_references = {ds.name: dr}\n",
        "\n",
-        "cd = CondaDependencies.create(pip_packages=['azureml-sdk[automl]'], conda_packages=['numpy','py-xgboost<=0.80'])\n",
+        "pandas_dependency = 'pandas==' + pkg_resources.get_distribution(\"pandas\").version\n",
        "\n",
        "cd = CondaDependencies.create(pip_packages=['azureml-sdk[automl]'], conda_packages=['numpy','py-xgboost<=0.80',pandas_dependency])\n",
        "conda_run_config.environment.python.conda_dependencies = cd"
      ]
    },
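Note on the `pandas_dependency` line above: it pins the remote environment to the pandas version installed locally, via `pkg_resources`. A minimal sketch of the same idea using the standard library's `importlib.metadata` (Python 3.8+) follows. This is an alternative shown for illustration, not what the notebook uses; `pin_to_installed` and its `lookup` parameter are hypothetical names.

```python
from importlib import metadata


def pin_to_installed(package, lookup=metadata.version):
    """Return 'package==X.Y.Z' using the locally installed version.

    `lookup` is a hypothetical hook so the function can be exercised
    without the package being installed; by default it asks
    importlib.metadata, which raises PackageNotFoundError if the
    package is missing.
    """
    return "{}=={}".format(package, lookup(package))


# Example with a stand-in lookup instead of a real installation.
print(pin_to_installed("pandas", lookup=lambda name: "0.23.4"))
```

The resulting string can be passed in `conda_packages` alongside `'numpy'` and `'py-xgboost<=0.80'`, as the notebook does with `pandas_dependency`. Matching the local version presumably avoids a pandas mismatch between the local and remote environments, though the change itself does not state the motivation.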
--- a/how-to-use-azureml/azure-databricks/automl/automl-databricks-local-01.ipynb
+++ b/how-to-use-azureml/azure-databricks/automl/automl-databricks-local-01.ipynb
@@ -23,7 +23,8 @@
        "3. Configure Automated ML using `AutoMLConfig`.\n",
        "4. Train the model using Azure Databricks.\n",
        "5. Explore the results.\n",
-        "6. Test the best fitted model.\n",
+        "6. Viewing the engineered names for featurized data and featurization summary for all raw features.\n",
        "7. Test the best fitted model.\n",
        "\n",
        "Before running this notebook, please follow the <a href=\"https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/azure-databricks\" target=\"_blank\">readme for using Automated ML on Azure Databricks</a> for installing necessary libraries to your cluster."
      ]
@@ -556,6 +557,45 @@
        "print(fitted_model)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### View the engineered names for featurized data\n",
        "Below we display the engineered feature names generated for the featurized data using the preprocessing featurization."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fitted_model.named_steps['datatransformer'].get_engineered_feature_names()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### View the featurization summary\n",
        "Below we display the featurization that was performed on different raw features in the user data. For each raw feature in the user data, the following information is displayed:\n",
        "- Raw feature name\n",
        "- Number of engineered features formed out of this raw feature\n",
        "- Type detected\n",
        "- If feature was dropped\n",
        "- List of feature transformations for the raw feature"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "fitted_model.named_steps['datatransformer'].get_featurization_summary()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
--- a/how-to-use-azureml/azure-hdi/README.md
+++ b/how-to-use-azureml/azure-hdi/README.md
@@ -0,0 +1,53 @@
 **Azure HDInsight**
 Azure HDInsight is a fully managed cloud Hadoop & Spark offering that gives
 optimized open-source analytic clusters for Spark, Hive, MapReduce, HBase,
 Storm, and Kafka. HDInsight Spark clusters provide kernels that you can use with
 the Jupyter notebook on [Apache Spark](https://spark.apache.org/) for testing
 your applications. 
 How Azure HDInsight works with Azure Machine Learning service
 -   You can train a model using Spark clusters and deploy the model to ACI/AKS
    from within Azure HDInsight.
 -   You can also use [automated machine
    learning](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-automated-ml) capabilities
    integrated within Azure HDInsight.
 You can use Azure HDInsight as a compute target from an [Azure Machine Learning
 pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-ml-pipelines).
 **Set up your HDInsight cluster**
 Create [HDInsight
 cluster](https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-provision-linux-clusters)
 **Quick create: Basic cluster setup**
 This article walks you through setup in the [Azure
 portal](https://portal.azure.com/), where you can create an HDInsight cluster
 using *Quick create* or *Custom*.
 ![hdinsight create options custom quick create](media/0a235b34c0b881117e51dc31a232dbe1.png)
 Follow instructions on the screen to do a basic cluster setup. Details are
 provided below for:
 -   [Resource group
    name](https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-provision-linux-clusters#resource-group-name)
 -   [Cluster types and
    configuration](https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-provision-linux-clusters#cluster-types)
    (Cluster must be Spark 2.3 (HDI 3.6) or greater)
 -   Cluster login and SSH username
 -   [Location](https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-provision-linux-clusters#location)
 **Import the sample HDI notebook in Jupyter**
 **Important links:**
 Create HDI cluster:
 <https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-provision-linux-clusters>
--- a/how-to-use-azureml/azure-hdi/automl_hdi_local_classification.ipynb
+++ b/how-to-use-azureml/azure-hdi/automl_hdi_local_classification.ipynb
@@ -0,0 +1,624 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright (c) Microsoft Corporation. All rights reserved.\n",
    "\n",
    "Licensed under the MIT License."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Automated ML on Azure HDInsight\n",
    "\n",
    "In this example we use scikit-learn's <a href=\"http://scikit-learn.org/stable/datasets/index.html#optical-recognition-of-handwritten-digits-dataset\" target=\"_blank\">digit dataset</a> to showcase how you can use AutoML for a simple classification problem.\n",
    "\n",
    "In this notebook you will learn how to:\n",
    "1. Create Azure Machine Learning Workspace object and initialize your notebook directory to easily reload this object from a configuration file.\n",
    "2. Create an `Experiment` in an existing `Workspace`.\n",
    "3. Configure Automated ML using `AutoMLConfig`.\n",
    "4. Train the model using Azure HDInsight.\n",
    "5. Explore the results.\n",
    "6. Test the best fitted model.\n",
    "\n",
    "Before running this notebook, please follow the readme for using Automated ML on Azure HDI for installing necessary libraries to your cluster."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Check the Azure ML Core SDK Version to Validate Your Installation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import azureml.core\n",
    "import pandas as pd\n",
    "from azureml.core.authentication import ServicePrincipalAuthentication\n",
    "from azureml.core.workspace import Workspace\n",
    "from azureml.core.experiment import Experiment\n",
    "from azureml.train.automl import AutoMLConfig\n",
    "from azureml.train.automl.run import AutoMLRun\n",
    "import logging\n",
    "\n",
    "print(\"SDK Version:\", azureml.core.VERSION)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initialize an Azure ML Workspace\n",
    "### What is an Azure ML Workspace and Why Do I Need One?\n",
    "\n",
    "An Azure ML workspace is an Azure resource that organizes and coordinates the actions of many other Azure resources to assist in executing and sharing machine learning workflows.  In particular, an Azure ML workspace coordinates storage, databases, and compute resources providing added functionality for machine learning experimentation, operationalization, and the monitoring of operationalized models.\n",
    "\n",
    "\n",
    "### What do I Need?\n",
    "\n",
    "To create or access an Azure ML workspace, you will need to import the Azure ML library and specify the following information:\n",
    "* A name for your workspace. You can choose one.\n",
    "* Your subscription id. Use the `id` value from the `az account show` command output above.\n",
    "* The resource group name. The resource group organizes Azure resources and provides a default region for the resources in the group. The resource group will be created if it doesn't exist. Resource groups can be created and viewed in the [Azure portal](https://portal.azure.com)\n",
    "* Supported regions include `eastus2`, `eastus`,`westcentralus`, `southeastasia`, `westeurope`, `australiaeast`, `westus2`, `southcentralus`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import azureml.core\n",
    "import pandas as pd\n",
    "from azureml.core.authentication import ServicePrincipalAuthentication\n",
    "from azureml.core.workspace import Workspace\n",
    "from azureml.core.experiment import Experiment\n",
    "from azureml.train.automl import AutoMLConfig\n",
    "from azureml.train.automl.run import AutoMLRun\n",
    "import logging\n",
    "\n",
    "subscription_id = \"<Your SubscriptionId>\" #you should be owner or contributor\n",
    "resource_group = \"<Resource group - new or existing>\" #you should be owner or contributor\n",
    "workspace_name = \"<workspace to be created>\" #your workspace name\n",
    "workspace_region = \"<azureregion>\" #your region\n",
    "\n",
    "\n",
    "tenant_id = \"<tenant_id>\"\n",
    "app_id = \"<app_id>\"\n",
    "app_key = \"<app_key>\"\n",
    "\n",
    "auth_sp = ServicePrincipalAuthentication(tenant_id = tenant_id,\n",
    "                                         service_principal_id = app_id,\n",
    "                                         service_principal_password = app_key)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating a Workspace\n",
    "If you already have access to an Azure ML workspace you want to use, you can skip this cell.  Otherwise, this cell will create an Azure ML workspace for you in the specified subscription, provided you have the correct permissions for the given `subscription_id`.\n",
    "\n",
    "This will fail when:\n",
    "1. The workspace already exists.\n",
    "2. You do not have permission to create a workspace in the resource group.\n",
    "3. You are not a subscription owner or contributor and no Azure ML workspaces have ever been created in this subscription.\n",
    "\n",
    "If workspace creation fails for any reason other than already existing, please work with your IT administrator to provide you with the appropriate permissions or to provision the required resources.\n",
    "\n",
    "**Note:** Creation of a new workspace can take several minutes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "##TESTONLY\n",
    "# Import the Workspace class and check the Azure ML SDK version.\n",
    "from azureml.core import Workspace\n",
    "\n",
    "ws = Workspace.create(name = workspace_name,\n",
    "                      subscription_id = subscription_id,\n",
    "                      resource_group = resource_group, \n",
    "                      location = workspace_region,\n",
    "                      auth = auth_sp,\n",
    "                      exist_ok=True)\n",
    "ws.get_details()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Configuring Your Local Environment\n",
    "You can validate that you have access to the specified workspace and write a configuration file to the default configuration location, `./aml_config/config.json`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from azureml.core import Workspace\n",
    "\n",
    "ws = Workspace(workspace_name = workspace_name,\n",
    "               subscription_id = subscription_id,\n",
    "               resource_group = resource_group,\n",
    "               auth = auth_sp)\n",
    "\n",
    "# Persist the subscription id, resource group name, and workspace name in aml_config/config.json.\n",
    "ws.write_config()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create a Folder to Host Sample Projects\n",
    "Finally, create a folder where all the sample projects will be hosted."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "sample_projects_folder = './sample_projects'\n",
    "\n",
    "if not os.path.isdir(sample_projects_folder):\n",
    "    os.mkdir(sample_projects_folder)\n",
    "    \n",
    "print('Sample projects will be created in {}.'.format(sample_projects_folder))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create an Experiment\n",
    "\n",
    "As part of the setup you have already created an Azure ML `Workspace` object. For Automated ML you will need to create an `Experiment` object, which is a named object in a `Workspace` used to run experiments."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import logging\n",
    "import os\n",
    "import random\n",
    "import time\n",
    "\n",
    "from matplotlib import pyplot as plt\n",
    "from matplotlib.pyplot import imshow\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "import azureml.core\n",
    "from azureml.core.experiment import Experiment\n",
    "from azureml.core.workspace import Workspace\n",
    "from azureml.train.automl import AutoMLConfig\n",
    "from azureml.train.automl.run import AutoMLRun"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Choose a name for the experiment and specify the project folder.\n",
    "experiment_name = 'automl-local-classification-hdi'\n",
    "project_folder = './sample_projects/automl-local-classification-hdi'\n",
    "\n",
    "experiment = Experiment(ws, experiment_name)\n",
    "\n",
    "output = {}\n",
    "output['SDK version'] = azureml.core.VERSION\n",
    "output['Subscription ID'] = ws.subscription_id\n",
    "output['Workspace Name'] = ws.name\n",
    "output['Resource Group'] = ws.resource_group\n",
    "output['Location'] = ws.location\n",
    "output['Project Directory'] = project_folder\n",
    "output['Experiment Name'] = experiment.name\n",
    "pd.set_option('display.max_colwidth', -1)\n",
    "pd.DataFrame(data = output, index = ['']).T"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Diagnostics\n",
    "\n",
    "Opt-in diagnostics for better experience, quality, and security of future releases."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from azureml.telemetry import set_diagnostics_collection\n",
    "set_diagnostics_collection(send_diagnostics = True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Registering Datastore"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Datastore is the way to save connection information to a storage service (e.g. Azure Blob, Azure Data Lake, Azure SQL) information to your workspace so you can access them without exposing credentials in your code. The first thing you will need to do is register a datastore, you can refer to our [python SDK documentation](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.datastore.datastore?view=azure-ml-py) on how to register datastores. __Note: for best security practices, please do not check in code that contains registering datastores with secrets into your source control__\n",
    "\n",
    "The code below registers a datastore pointing to a publicly readable blob container."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from azureml.core import Datastore\n",
    "\n",
    "datastore_name = 'demo_training'\n",
    "container_name = 'digits' \n",
    "account_name = 'automlpublicdatasets'\n",
    "Datastore.register_azure_blob_container(\n",
    "    workspace = ws, \n",
    "    datastore_name = datastore_name, \n",
    "    container_name = container_name, \n",
    "    account_name = account_name,\n",
    "     overwrite = True\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below is an example on how to register a private blob container\n",
    "```python\n",
    "datastore = Datastore.register_azure_blob_container(\n",
    "    workspace = ws, \n",
    "    datastore_name = 'example_datastore', \n",
    "    container_name = 'example-container', \n",
    "    account_name = 'storageaccount',\n",
    "    account_key = 'accountkey'\n",
    ")\n",
    "```\n",
    "The example below shows how  to register an Azure Data Lake store. Please make sure you have granted the necessary permissions for the service principal to access the data lake.\n",
    "```python\n",
    "datastore = Datastore.register_azure_data_lake(\n",
    "    workspace = ws,\n",
    "    datastore_name = 'example_datastore',\n",
    "    store_name = 'adlsstore',\n",
    "    tenant_id = 'tenant-id-of-service-principal',\n",
    "    client_id = 'client-id-of-service-principal',\n",
    "    client_secret = 'client-secret-of-service-principal'\n",
    ")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Training Data Using DataPrep"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Automated ML takes a Dataflow as input.\n",
    "\n",
    "If you are familiar with Pandas and have done your data preparation work in Pandas already, you can use the `read_pandas_dataframe` method in dprep to convert the DataFrame to a Dataflow.\n",
    "```python\n",
    "df = pd.read_csv(...)\n",
    "# apply some transforms\n",
    "dprep.read_pandas_dataframe(df, temp_folder='/path/accessible/by/both/driver/and/worker')\n",
    "```\n",
    "\n",
    "If you just need to ingest data without doing any preparation, you can directly use AzureML Data Prep (Data Prep) to do so. The code below demonstrates this scenario. Data Prep also has data preparation capabilities, we have many [sample notebooks](https://github.com/Microsoft/AMLDataPrepDocs) demonstrating the capabilities.\n",
    "\n",
    "You will get the datastore you registered previously and pass it to Data Prep for reading. The data comes from the digits dataset: `sklearn.datasets.load_digits()`. `DataPath` points to a specific location within a datastore. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import azureml.dataprep as dprep\n",
    "from azureml.data.datapath import DataPath\n",
    "\n",
    "datastore = Datastore.get(workspace = ws, datastore_name = datastore_name)\n",
    "\n",
    "X_train = dprep.read_csv(datastore.path('X.csv'))\n",
    "y_train = dprep.read_csv(datastore.path('y.csv')).to_long(dprep.ColumnSelector(term='.*', use_regex = True))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Review the Data Preparation Result\n",
    "You can peek the result of a Dataflow at any range using `skip(i)` and `head(j)`. Doing so evaluates only j records for all the steps in the Dataflow, which makes it fast even against large datasets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train.get_profile()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_train.get_profile()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Configure AutoML\n",
    "\n",
    "Instantiate an `AutoMLConfig` object to specify the settings and data used to run the experiment.\n",
    "\n",
    "|Property|Description|\n",
    "|-|-|\n",
    "|**task**|classification or regression|\n",
    "|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics: <br><i>accuracy</i><br><i>AUC_weighted</i><br><i>average_precision_score_weighted</i><br><i>norm_macro_recall</i><br><i>precision_score_weighted</i>|\n",
    "|**primary_metric**|This is the metric that you want to optimize. Regression supports the following primary metrics: <br><i>spearman_correlation</i><br><i>normalized_root_mean_squared_error</i><br><i>r2_score</i><br><i>normalized_mean_absolute_error</i>|\n",
    "|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n",
    "|**iterations**|Number of iterations. In each iteration AutoML trains a specific pipeline with the data.|\n",
    "|**n_cross_validations**|Number of cross validation splits.|\n",
    "|**spark_context**|Spark Context object. for HDInsight, use spark_context=sc|\n",
    "|**max_concurrent_iterations**|Maximum number of iterations to execute in parallel. This should be <= number of worker nodes in your Azure HDInsight cluster.|\n",
    "|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n",
    "|**y**|(sparse) array-like, shape = [n_samples, ], [n_samples, n_classes]<br>Multi-class targets. An indicator matrix turns on multilabel classification. This should be an array of integers.|\n",
    "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder.|\n",
    "|**preprocess**|set this to True to enable pre-processing of data eg. string to numeric using one-hot encoding|\n",
    "|**exit_score**|Target score for experiment. It is associated with the metric. eg. exit_score=0.995 will exit experiment after that|"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "automl_config = AutoMLConfig(task = 'classification',\n",
    "                             debug_log = 'automl_errors.log',\n",
    "                             primary_metric = 'AUC_weighted',\n",
    "                             iteration_timeout_minutes = 10,\n",
    "                             iterations = 3,\n",
    "                             preprocess = True,\n",
    "                             n_cross_validations = 10,\n",
    "                             max_concurrent_iterations = 2, #change it based on number of worker nodes\n",
    "                             verbosity = logging.INFO,\n",
    "                             spark_context=sc, #HDI /spark related\n",
    "                             X = X_train, \n",
    "                             y = y_train,\n",
    "                             path = project_folder)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train the Models\n",
    "\n",
    "Call the `submit` method on the experiment object and pass the run configuration. Execution of local runs is synchronous. Depending on the data and the number of iterations this can run for a while."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "local_run = experiment.submit(automl_config, show_output = True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Explore the Results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following will show the child runs and waits for the parent run to complete."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Retrieve All Child Runs after the experiment is completed (in portal)\n",
    "You can also use SDK methods to fetch all the child runs and see individual metrics that we log."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "children = list(local_run.get_children())\n",
    "metricslist = {}\n",
    "for run in children:\n",
    "    properties = run.get_properties()\n",
    "    metrics = {k: v for k, v in run.get_metrics().items() if isinstance(v, float)}    \n",
    "    metricslist[int(properties['iteration'])] = metrics\n",
    "\n",
    "rundata = pd.DataFrame(metricslist).sort_index(1)\n",
    "rundata"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Retrieve the Best Model after the above run is complete \n",
    "\n",
    "Below we select the best pipeline from our iterations. The `get_output` method returns the best run and the fitted model. The Model includes the pipeline and any pre-processing.  Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "best_run, fitted_model = local_run.get_output()\n",
    "print(best_run)\n",
    "print(fitted_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Best Model Based on Any Other Metric after the above run is complete based on the child run\n",
    "Show the run and the model that has the smallest `log_loss` value:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lookup_metric = \"log_loss\"\n",
    "best_run, fitted_model = local_run.get_output(metric = lookup_metric)\n",
    "print(best_run)\n",
    "print(fitted_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Test the Best Fitted Model\n",
    "\n",
    "#### Load Test Data - you can split the dataset beforehand & pass Train dataset to AutoML and use Test dataset to evaluate the best model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "blob_location = \"https://{}.blob.core.windows.net/{}\".format(account_name, container_name)\n",
    "X_test = pd.read_csv(\"{}./X_valid.csv\".format(blob_location), header=0)\n",
    "y_test = pd.read_csv(\"{}/y_valid.csv\".format(blob_location), header=0)\n",
    "images  = pd.read_csv(\"{}/images.csv\".format(blob_location), header=None)\n",
    "images = np.reshape(images.values, (100,8,8))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Testing Our Best Fitted Model\n",
    "We will try to predict digits and see how our model works. This is just an example to show you."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Randomly select digits and test.\n",
    "for index in np.random.choice(len(y_test), 2, replace = False):\n",
    "    print(index)\n",
    "    predicted = fitted_model.predict(X_test[index:index + 1])[0]\n",
    "    label = y_test.values[index]\n",
    "    title = \"Label value = %d  Predicted value = %d \" % (label, predicted)\n",
    "    fig = plt.figure(3, figsize = (5,5))\n",
    "    ax1 = fig.add_axes((0,0,.8,.8))\n",
    "    ax1.set_title(title)\n",
    "    plt.imshow(images[index], cmap = plt.cm.gray_r, interpolation = 'nearest')\n",
    "    display(fig)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When deploying an automated ML trained model, please specify _pippackages=['azureml-sdk[automl]']_ in your CondaDependencies.\n",
    "\n",
    "Please refer to only the **Deploy** section in this notebook - <a href=\"https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/classification-with-deployment\" target=\"_blank\">Deployment of Automated ML trained model</a>"
   ]
  }
 ],
 "metadata": {
  "authors": [
   {
    "name": "savitam"
   },
   {
    "name": "sasum"
   }
  ],
  "kernelspec": {
   "display_name": "Python 3.6",
   "language": "Python",
   "name": "Python36"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "python",
    "version": 3
   },
   "mimetype": "text/x-python",
   "name": "pyspark3",
   "pygments_lexer": "python3"
  },
  "name": "auto-ml-classification-local-adb",
  "notebookId": 587284549713154
 },
 "nbformat": 4,
 "nbformat_minor": 1
 }
--- a/how-to-use-azureml/deployment/onnx/README.md
+++ b/how-to-use-azureml/deployment/onnx/README.md
@@ -6,15 +6,18 @@ These tutorials show how to create and deploy Open Neural Network eXchange ([ONN
 0. [Configure your Azure Machine Learning Workspace](../../../configuration.ipynb)
-#### Obtain models from the [ONNX Model Zoo](https://github.com/onnx/models) and deploy with ONNX Runtime Inference
+#### Obtain pretrained models from the [ONNX Model Zoo](https://github.com/onnx/models) and deploy with ONNX Runtime
-1. [Handwritten Digit Classification (MNIST)](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/onnx/onnx-inference-mnist-deploy.ipynb)
+1. [MNIST - Handwritten Digit Classification with ONNX Runtime](onnx-inference-mnist-deploy.ipynb)
-2. [Facial Expression Recognition (Emotion FER+)](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/onnx/onnx-inference-facial-expression-recognition-deploy.ipynb)
+2. [Emotion FER+ - Facial Expression Recognition with ONNX Runtime](onnx-inference-facial-expression-recognition-deploy.ipynb)
 #### Train model on Azure ML, convert to ONNX, and deploy with ONNX Runtime
 3. [MNIST - Train using PyTorch and deploy with ONNX Runtime](onnx-train-pytorch-aml-deploy-mnist.ipynb)
 #### Demo Notebooks from Microsoft Ignite 2018
 Note that the following notebooks do not have evaluation sections for the models since they were deployed as part of a live demo. You can find the respective pre-processing and post-processing code linked from the ONNX Model Zoo Github pages ([ResNet](https://github.com/onnx/models/tree/master/models/image_classification/resnet), [TinyYoloV2](https://github.com/onnx/models/tree/master/tiny_yolov2)), or experiment with the ONNX models by [running them in the browser](https://microsoft.github.io/onnxjs-demo/#/).
-3. [Image Recognition (ResNet50)](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/onnx/onnx-modelzoo-aml-deploy-resnet50.ipynb)
+4. [ResNet50 - Image Recognition with ONNX Runtime](onnx-modelzoo-aml-deploy-resnet50.ipynb)
-4. [Convert Core ML Model to ONNX and deploy - Real Time Object Detection (TinyYOLO)](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/onnx/onnx-convert-aml-deploy-tinyyolo.ipynb) 
+5. [TinyYoloV2 - Convert from CoreML and deploy with ONNX Runtime](onnx-convert-aml-deploy-tinyyolo.ipynb)
 ## Documentation
 - [ONNX Runtime Python API Documentation](http://aka.ms/onnxruntime-python)
--- a/how-to-use-azureml/deployment/onnx/mnist.py
+++ b/how-to-use-azureml/deployment/onnx/mnist.py
@@ -0,0 +1,124 @@
 # This is a modified version of https://github.com/pytorch/examples/blob/master/mnist/main.py which is
 # licensed under BSD 3-Clause (https://github.com/pytorch/examples/blob/master/LICENSE)
 from __future__ import print_function
 import argparse
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
 import torch.optim as optim
 from torchvision import datasets, transforms
 import os
 class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)
    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
 def train(args, model, device, train_loader, optimizer, epoch, output_dir):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % args.log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
 def test(args, model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.max(1, keepdim=True)[1]  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()
    test_loss /= len(test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
 def main():
    # Training settings
    parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
    parser.add_argument('--batch-size', type=int, default=64, metavar='N',
                        help='input batch size for training (default: 64)')
    parser.add_argument('--test-batch-size', type=int, default=1000, metavar='N',
                        help='input batch size for testing (default: 1000)')
    parser.add_argument('--epochs', type=int, default=5, metavar='N',
                        help='number of epochs to train (default: 5)')
    parser.add_argument('--lr', type=float, default=0.01, metavar='LR',
                        help='learning rate (default: 0.01)')
    parser.add_argument('--momentum', type=float, default=0.5, metavar='M',
                        help='SGD momentum (default: 0.5)')
    parser.add_argument('--no-cuda', action='store_true', default=False,
                        help='disables CUDA training')
    parser.add_argument('--seed', type=int, default=1, metavar='S',
                        help='random seed (default: 1)')
    parser.add_argument('--log-interval', type=int, default=10, metavar='N',
                        help='how many batches to wait before logging training status')
    parser.add_argument('--output-dir', type=str, default='outputs')
    args = parser.parse_args()
    use_cuda = not args.no_cuda and torch.cuda.is_available()
    torch.manual_seed(args.seed)
    device = torch.device("cuda" if use_cuda else "cpu")
    output_dir = args.output_dir
    os.makedirs(output_dir, exist_ok=True)
    kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
    train_loader = torch.utils.data.DataLoader(
        datasets.MNIST('data', train=True, download=True,
                       transform=transforms.Compose([transforms.ToTensor(),
                                                    transforms.Normalize((0.1307,), (0.3081,))])
                       ),
        batch_size=args.batch_size, shuffle=True, **kwargs)
    test_loader = torch.utils.data.DataLoader(
        datasets.MNIST('data', train=False,
                       transform=transforms.Compose([transforms.ToTensor(),
                                                    transforms.Normalize((0.1307,), (0.3081,))])
                       ),
        batch_size=args.test_batch_size, shuffle=True, **kwargs)
    model = Net().to(device)
    optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum)
    for epoch in range(1, args.epochs + 1):
        train(args, model, device, train_loader, optimizer, epoch, output_dir)
        test(args, model, device, test_loader)
    # save model
    dummy_input = torch.randn(1, 1, 28, 28, device=device)
    model_path = os.path.join(output_dir, 'mnist.onnx')
    torch.onnx.export(model, dummy_input, model_path)
 if __name__ == '__main__':
    main()
--- a/how-to-use-azureml/deployment/onnx/onnx-train-pytorch-aml-deploy-mnist.ipynb
+++ b/how-to-use-azureml/deployment/onnx/onnx-train-pytorch-aml-deploy-mnist.ipynb
--- a/how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb
+++ b/how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb
@@ -216,6 +216,56 @@
        "                                  provisioning_configuration = prov_config)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Create AKS Cluster in an existing virtual network (optional)\n",
        "See code snippet below. Check the documentation [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-enable-virtual-network#use-azure-kubernetes-service) for more details."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "'''\n",
        "from azureml.core.compute import ComputeTarget, AksCompute\n",
        "\n",
        "# Create the compute configuration and set virtual network information\n",
        "config = AksCompute.provisioning_configuration(location=\"eastus2\")\n",
        "config.vnet_resourcegroup_name = \"mygroup\"\n",
        "config.vnet_name = \"mynetwork\"\n",
        "config.subnet_name = \"default\"\n",
        "config.service_cidr = \"10.0.0.0/16\"\n",
        "config.dns_service_ip = \"10.0.0.10\"\n",
        "config.docker_bridge_cidr = \"172.17.0.1/16\"\n",
        "\n",
        "# Create the compute target\n",
        "aks_target = ComputeTarget.create(workspace = ws,\n",
        "                                  name = \"myaks\",\n",
        "                                  provisioning_configuration = config)\n",
        "'''"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Enable SSL on the AKS Cluster (optional)\n",
        "See code snippet below. Check the documentation [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-secure-web-service) for more details"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# provisioning_config = AksCompute.provisioning_configuration(ssl_cert_pem_file=\"cert.pem\", ssl_key_pem_file=\"key.pem\", ssl_cname=\"www.contoso.com\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
@@ -295,8 +345,9 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "# Test the web service\n",
+        "# Test the web service using run method\n",
-        "We test the web sevice by passing data."
+        "We test the web sevice by passing data.\n",
        "Run() method retrieves API keys behind the scenes to make sure that call is authenticated."
      ]
    },
    {
@@ -318,6 +369,57 @@
        "print(prediction)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Test the web service using raw HTTP request (optional)\n",
        "Alternatively you can construct a raw HTTP request and send it to the service. In this case you need to explicitly pass the HTTP header. This process is shown in the next 2 cells."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# retreive the API keys. AML generates two keys.\n",
        "'''\n",
        "key1, Key2 = aks_service.get_keys()\n",
        "print(key1)\n",
        "'''"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# construct raw HTTP request and send to the service\n",
        "'''\n",
        "%%time\n",
        "\n",
        "import requests\n",
        "\n",
        "import json\n",
        "\n",
        "test_sample = json.dumps({'data': [\n",
        "    [1,2,3,4,5,6,7,8,9,10], \n",
        "    [10,9,8,7,6,5,4,3,2,1]\n",
        "]})\n",
        "test_sample = bytes(test_sample,encoding = 'utf8')\n",
        "\n",
        "# Don't forget to add key to the HTTP header.\n",
        "headers = {'Content-Type':'application/json', 'Authorization': 'Bearer ' + key1}\n",
        "\n",
        "resp = requests.post(aks_service.scoring_uri, test_sample, headers=headers)\n",
        "\n",
        "\n",
        "print(\"prediction:\", resp.text)\n",
        "'''"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-getting-started.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-getting-started.ipynb
@@ -303,7 +303,7 @@
        "\n",
        "The following code will create a PythonScriptStep to be executed in the Azure Machine Learning Compute we created above using train.py, one of the files already made available in the project folder.\n",
        "\n",
-        "A **PythonScriptStep** is a basic, built-in step to run a Python Script on a compute target. It takes a script name and optionally other parameters like arguments for the script, compute target, inputs and outputs. If no compute target is specified, default compute target for the workspace is used."
+        "A **PythonScriptStep** is a basic, built-in step to run a Python script on a compute target. It takes a script name and, optionally, other parameters such as arguments for the script, compute target, inputs, and outputs. If no compute target is specified, the default compute target for the workspace is used. You can also use a [**RunConfiguration**](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.runconfiguration?view=azure-ml-py) to specify requirements for the PythonScriptStep, such as conda dependencies and a Docker image."
      ]
    },
    {
@@ -369,10 +369,34 @@
        "                         compute_target=aml_compute, \n",
        "                         source_directory=project_folder)\n",
        "\n",
        "# Use a RunConfiguration to specify some additional requirements for this step.\n",
        "from azureml.core.runconfig import RunConfiguration\n",
        "from azureml.core.conda_dependencies import CondaDependencies\n",
        "from azureml.core.runconfig import DEFAULT_CPU_IMAGE\n",
        "\n",
        "# create a new runconfig object\n",
        "run_config = RunConfiguration()\n",
        "\n",
        "# enable Docker \n",
        "run_config.environment.docker.enabled = True\n",
        "\n",
        "# set Docker base image to the default CPU-based image\n",
        "run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE\n",
        "\n",
        "# use conda_dependencies.yml to create a conda environment in the Docker image for execution\n",
        "run_config.environment.python.user_managed_dependencies = False\n",
        "\n",
        "# auto-prepare the Docker image when used for execution (if it is not already prepared)\n",
        "run_config.auto_prepare_environment = True\n",
        "\n",
        "# specify CondaDependencies obj\n",
        "run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'])\n",
        "\n",
        "step3 = PythonScriptStep(name=\"extract_step\",\n",
        "                         script_name=\"extract.py\", \n",
        "                         compute_target=aml_compute, \n",
-        "                         source_directory=project_folder)\n",
+        "                         source_directory=project_folder,\n",
        "                         runconfig=run_config)\n",
        "\n",
        "# list of steps to run\n",
        "steps = [step1, step2, step3]\n",
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-parameter-tuning-with-hyperdrive.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-parameter-tuning-with-hyperdrive.ipynb
@@ -36,7 +36,7 @@
        "from azureml.exceptions import ComputeTargetException\n",
        "from azureml.data.data_reference import DataReference\n",
        "from azureml.pipeline.steps import HyperDriveStep\n",
-        "from azureml.pipeline.core import Pipeline\n",
+        "from azureml.pipeline.core import Pipeline, PipelineData\n",
        "from azureml.train.dnn import TensorFlow\n",
        "from azureml.train.hyperdrive import *\n",
        "\n",
@@ -310,11 +310,17 @@
      "metadata": {},
      "outputs": [],
      "source": [
        "metrics_output_name = 'metrics_output'\n",
        "metrics_data = PipelineData(name='metrics_data',\n",
        "                            datastore=ds,\n",
        "                            pipeline_output_name=metrics_output_name)\n",
        "\n",
        "hd_step = HyperDriveStep(\n",
        "    name=\"hyperdrive_module\",\n",
        "    hyperdrive_run_config=hd_config,\n",
        "    estimator_entry_script_arguments=['--data-folder', data_folder],\n",
-        "    inputs=[data_folder])"
+        "    inputs=[data_folder],\n",
        "    metrics_output=metrics_data)"
      ]
    },
    {
@@ -366,6 +372,40 @@
      "source": [
        "pipeline_run.wait_for_completion()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Retrieve the metrics\n",
        "The outputs of the above run can be used as inputs to other steps in the pipeline. In this tutorial, we simply display the resulting metrics."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "metrics_output = pipeline_run.get_pipeline_output(metrics_output_name)\n",
        "num_file_downloaded = metrics_output.download('.', show_progress=True)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import pandas as pd\n",
        "import json\n",
        "with open(metrics_output._path_on_datastore) as f:\n",
        "    metrics_output_result = f.read()\n",
        "    \n",
        "deserialized_metrics_output = json.loads(metrics_output_result)\n",
        "df = pd.DataFrame(deserialized_metrics_output)\n",
        "df"
      ]
    }
  ],
  "metadata": {
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-publish-and-run-using-rest-endpoint.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-publish-and-run-using-rest-endpoint.ipynb
@@ -33,7 +33,7 @@
      "outputs": [],
      "source": [
        "import azureml.core\n",
-        "from azureml.core import Workspace, Datastore\n",
+        "from azureml.core import Workspace, Datastore, Experiment\n",
        "from azureml.core.compute import AmlCompute\n",
        "from azureml.core.compute import ComputeTarget\n",
        "\n",
@@ -55,10 +55,7 @@
        "print(\"Default datastore's name: {}\".format(def_file_store.name))\n",
        "\n",
        "def_blob_store = Datastore(ws, \"workspaceblobstore\")\n",
-        "print(\"Blobstore's name: {}\".format(def_blob_store.name))\n",
+        "print(\"Blobstore's name: {}\".format(def_blob_store.name))"
        "\n",
        "# project folder\n",
        "project_folder = '.'"
      ]
    },
    {
@@ -160,7 +157,7 @@
        "    inputs=[blob_input_data],\n",
        "    outputs=[processed_data1],\n",
        "    compute_target=aml_compute, \n",
-        "    source_directory=project_folder\n",
+        "    source_directory='.'\n",
        ")\n",
        "print(\"trainStep created\")"
      ]
@@ -191,7 +188,7 @@
        "    inputs=[processed_data1],\n",
        "    outputs=[processed_data2],\n",
        "    compute_target=aml_compute, \n",
-        "    source_directory=project_folder)\n",
+        "    source_directory='.')\n",
        "print(\"extractStep created\")"
      ]
    },
@@ -252,7 +249,7 @@
        "    inputs=[processed_data1, processed_data2],\n",
        "    outputs=[processed_data3],    \n",
        "    compute_target=aml_compute, \n",
-        "    source_directory=project_folder)\n",
+        "    source_directory='.')\n",
        "print(\"compareStep created\")"
      ]
    },
@@ -270,10 +267,7 @@
      "outputs": [],
      "source": [
        "pipeline1 = Pipeline(workspace=ws, steps=[compareStep])\n",
-        "print (\"Pipeline is built\")\n",
+        "print (\"Pipeline is built\")"
        "\n",
        "pipeline1.validate()\n",
        "print(\"Simple validation complete\") "
      ]
    },
    {
@@ -290,10 +284,38 @@
      "metadata": {},
      "outputs": [],
      "source": [
-        "published_pipeline1 = pipeline1.publish(name=\"My_New_Pipeline\", description=\"My Published Pipeline Description\")\n",
+        "published_pipeline1 = pipeline1.publish(name=\"My_New_Pipeline\", description=\"My Published Pipeline Description\", continue_on_step_failure=True)\n",
        "published_pipeline1"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Note: the continue_on_step_failure parameter specifies whether the execution of steps in the Pipeline will continue if one step fails. The default value is False, meaning when one step fails, the Pipeline execution will stop, canceling any running steps."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Publish the pipeline from a submitted PipelineRun\n",
        "It is also possible to publish a pipeline from a submitted PipelineRun."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# submit a pipeline run\n",
        "pipeline_run1 = Experiment(ws, 'Pipeline_experiment').submit(pipeline1)\n",
        "# publish a pipeline from the submitted pipeline run\n",
        "published_pipeline2 = pipeline_run1.publish_pipeline(name=\"My_New_Pipeline2\", description=\"My Published Pipeline Description\", version=\"0.1\", continue_on_step_failure=True)\n",
        "published_pipeline2"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
@@ -325,7 +347,8 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "### Run published pipeline using its REST endpoint"
+        "### Run published pipeline using its REST endpoint\n",
        "[This notebook](https://aka.ms/pl-restep-auth) shows how to authenticate to an AML workspace."
      ]
    },
    {
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-setup-schedule-for-a-published-pipeline.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-setup-schedule-for-a-published-pipeline.ipynb
@@ -107,15 +107,11 @@
      "source": [
        "from azureml.pipeline.steps import PythonScriptStep\n",
        "\n",
        "\n",
        "# project folder\n",
        "project_folder = 'scripts'\n",
        "\n",
        "trainStep = PythonScriptStep(\n",
        "    name=\"Training_Step\",\n",
        "    script_name=\"train.py\", \n",
        "    compute_target=aml_compute_target, \n",
-        "    source_directory=project_folder\n",
+        "    source_directory='.'\n",
        ")\n",
        "print(\"TrainStep created\")"
      ]
@@ -136,9 +132,7 @@
        "from azureml.pipeline.core import Pipeline\n",
        "\n",
        "pipeline1 = Pipeline(workspace=ws, steps=[trainStep])\n",
-        "print (\"Pipeline is built\")\n",
+        "print (\"Pipeline is built\")"
        "\n",
        "pipeline1.validate()"
      ]
    },
    {
@@ -255,10 +249,11 @@
        "schedules = Schedule.get_all(ws, pipeline_id=pub_pipeline_id)\n",
        "\n",
        "# We will iterate through the list of schedules and \n",
-        "# use the last ID in the list for further operations: \n",
+        "# use the last recurrence schedule in the list for further operations: \n",
        "print(\"Found these schedules for the pipeline id {}:\".format(pub_pipeline_id))\n",
        "for schedule in schedules: \n",
        "    print(schedule.id)\n",
        "    if schedule.recurrence is not None:\n",
        "        schedule_id = schedule.id\n",
        "\n",
        "print(\"Schedule id to be used for schedule operations: {}\".format(schedule_id))"
@@ -380,7 +375,8 @@
      "metadata": {},
      "source": [
        "### Create a schedule for the pipeline using a Datastore\n",
-        "This schedule will run when additions or modifications are made to Blobs in the Datastore container.\n",
+        "This schedule will run when additions or modifications are made to Blobs in the Datastore.\n",
        "By default, the Datastore container is monitored for changes. Use the path_on_datastore parameter to instead specify a path on the Datastore to monitor for changes. Changes made to subfolders in the container/path will not trigger the schedule.\n",
        "Note: Only Blob Datastores are supported."
      ]
    },
@@ -400,6 +396,7 @@
        "                           datastore=datastore,\n",
        "                           wait_for_provisioning=True,\n",
        "                           description=\"Schedule Run\")\n",
        "                          #path_on_datastore=\"file/path\") use path_on_datastore to specify a specific folder to monitor for changes.\n",
        "\n",
        "# You may want to make sure that the schedule is provisioned properly\n",
        "# before making any further changes to the schedule\n",
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-with-automated-machine-learning-step.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-with-automated-machine-learning-step.ipynb
@@ -215,7 +215,9 @@
        "conda_run_config.environment.docker.enabled = True\n",
        "conda_run_config.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE\n",
        "\n",
-        "cd = CondaDependencies.create(pip_packages=['azureml-sdk[automl]'], conda_packages=['numpy'], pin_sdk_version=False)\n",
+        "cd = CondaDependencies.create(pip_packages=['azureml-sdk[automl]'], \n",
        "                              conda_packages=['numpy', 'py-xgboost'], \n",
        "                              pin_sdk_version=False)\n",
        "conda_run_config.environment.python.conda_dependencies = cd\n",
        "\n",
        "print('run config is ready')"
@@ -297,6 +299,27 @@
        "## Define AutoMLStep"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.pipeline.core import PipelineData, TrainingOutput\n",
        "\n",
        "metrics_output_name = 'metrics_output'\n",
        "best_model_output_name = 'best_model_output'\n",
        "\n",
        "metrics_data = PipelineData(name='metrics_data',\n",
        "                            datastore=ds,\n",
        "                            pipeline_output_name=metrics_output_name,\n",
        "                            training_output=TrainingOutput(type='Metrics'))\n",
        "model_data = PipelineData(name='model_data',\n",
        "                          datastore=ds,\n",
        "                          pipeline_output_name=best_model_output_name,\n",
        "                          training_output=TrainingOutput(type='Model'))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
@@ -308,6 +331,7 @@
        "    experiment=experiment,\n",
        "    automl_config=automl_config,\n",
        "    inputs=[input_data],\n",
        "    outputs=[metrics_data, model_data],\n",
        "    allow_reuse=True)"
      ]
    },
@@ -358,8 +382,8 @@
      "source": [
        "## Examine Results\n",
        "\n",
-        "#### Loading executed runs\n",
+        "### Retrieve the metrics of all child runs\n",
-        "In case you need to load a previously executed run, enable the cell below and replace the `run_id` value."
+        "The outputs of the above run can be used as inputs to other steps in the pipeline. In this tutorial, we examine the outputs by retrieving the output data and running some tests."
      ]
    },
    {
@@ -368,24 +392,8 @@
      "metadata": {},
      "outputs": [],
      "source": [
-        "from azureml.train.automl.run import AutoMLRun\n",
+        "metrics_output = pipeline_run.get_pipeline_output(metrics_output_name)\n",
-        "\n",
+        "num_file_downloaded = metrics_output.download('.', show_progress=True)"
        "# only one step exists in this pipeline\n",
        "run_id = None\n",
        "step_runs = pipeline_run.get_children()\n",
        "for run in step_runs:\n",
        "    run_id=run._run_id\n",
        "    \n",
        "automl_run = AutoMLRun(experiment = experiment, run_id=run_id)\n",
        "automl_run"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### Retrieve All Child Runs\n",
        "You can also use SDK methods to fetch all the child runs and see individual metrics that we log."
      ]
    },
    {
@@ -394,24 +402,20 @@
      "metadata": {},
      "outputs": [],
      "source": [
-        "children = list(automl_run.get_children())\n",
+        "import json\n",
-        "metricslist = {}\n",
+        "with open(metrics_output._path_on_datastore) as f:  \n",
-        "for run in children:\n",
+        "    metrics_output_result = f.read()\n",
        "    properties = run.get_properties()\n",
        "    metrics = {k: v for k, v in run.get_metrics().items() if isinstance(v, float)}\n",
        "    metricslist[int(properties['iteration'])] = metrics\n",
        "    \n",
-        "rundata = pd.DataFrame(metricslist).sort_index(1)\n",
+        "deserialized_metrics_output = json.loads(metrics_output_result)\n",
-        "rundata"
+        "df = pd.DataFrame(deserialized_metrics_output)\n",
        "df"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "### Retrieve the Best Model\n",
+        "### Retrieve the Best Model"
        "\n",
        "Below we select the best pipeline from our iterations. The `get_output` method returns the best run and the fitted model. The Model includes the pipeline and any pre-processing.  Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*."
      ]
    },
    {
@@ -420,17 +424,8 @@
      "metadata": {},
      "outputs": [],
      "source": [
-        "best_run, fitted_model = automl_run.get_output()\n",
+        "best_model_output = pipeline_run.get_pipeline_output(best_model_output_name)\n",
-        "print(best_run)\n",
+        "num_file_downloaded = best_model_output.download('.', show_progress=True)"
        "print(fitted_model)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### Best Model Based on Any Other Metric\n",
        "Show the run and the model which has the smallest `log_loss` value:"
      ]
    },
    {
@@ -439,39 +434,19 @@
      "metadata": {},
      "outputs": [],
      "source": [
-        "lookup_metric = \"log_loss\"\n",
+        " import pickle\n",
        "best_run, fitted_model = automl_run.get_output(metric = lookup_metric)\n",
        "print(best_run)\n",
        "print(fitted_model)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### Model from a Specific Iteration\n",
        "Show the run and the model from the third iteration:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "iteration = 3\n",
        "third_run, third_model = automl_run.get_output(iteration=iteration)\n",
        "print(third_run)\n",
        "print(third_model)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Test the Model\n",
        "\n",
-        "### Load Test Data"
+        " with open(best_model_output._path_on_datastore, \"rb\" ) as f:\n",
        "     best_model = pickle.load(f)\n",
        " best_model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Test the Model\n",
        "#### Load Test Data"
      ]
    },
    {
@@ -490,7 +465,7 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "### Testing Our Best Fitted Model"
+        "#### Testing Best Model"
      ]
    },
    {
@@ -502,7 +477,7 @@
        "# Randomly select digits and test.\n",
        "for index in np.random.choice(len(y_test), 3, replace = False):\n",
        "   print(index)\n",
-        "    predicted = fitted_model.predict(X_test[index:index + 1])[0]\n",
+        "   predicted = best_model.predict(X_test[index:index + 1])[0]\n",
        "   label = y_test[index]\n",
        "   title = \"Label value = %d  Predicted value = %d \" % (label, predicted)\n",
        "   fig = plt.figure(1, figsize=(3,3))\n",
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-with-data-dependency-steps.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-with-data-dependency-steps.ipynb
@@ -83,10 +83,10 @@
      "metadata": {},
      "outputs": [],
      "source": [
-        "# project folder\n",
+        "# source directory\n",
-        "project_folder = '.'\n",
+        "source_directory = '.'\n",
        "    \n",
-        "print('Sample projects will be created in {}.'.format(project_folder))"
+        "print('Sample scripts will be created in {} directory.'.format(source_directory))"
      ]
    },
    {
@@ -259,6 +259,44 @@
        "**Open `train.py` in the local machine and examine the arguments, inputs, and outputs for the script. That will give you a good sense of why the script argument names used below are important.** "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### Specify conda dependencies and a base docker image through a RunConfiguration\n",
        "\n",
        "This step uses a Docker image and scikit-learn. Use a [**RunConfiguration**](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.runconfiguration?view=azure-ml-py) to specify these requirements, and pass it when creating the PythonScriptStep. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.runconfig import RunConfiguration\n",
        "from azureml.core.conda_dependencies import CondaDependencies\n",
        "from azureml.core.runconfig import DEFAULT_CPU_IMAGE\n",
        "\n",
        "# create a new runconfig object\n",
        "run_config = RunConfiguration()\n",
        "\n",
        "# enable Docker \n",
        "run_config.environment.docker.enabled = True\n",
        "\n",
        "# set Docker base image to the default CPU-based image\n",
        "run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE\n",
        "\n",
        "# use conda_dependencies.yml to create a conda environment in the Docker image for execution\n",
        "run_config.environment.python.user_managed_dependencies = False\n",
        "\n",
        "# auto-prepare the Docker image when used for execution (if it is not already prepared)\n",
        "run_config.auto_prepare_environment = True\n",
        "\n",
        "# specify CondaDependencies obj\n",
        "run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'])"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
@@ -273,7 +311,8 @@
        "    inputs=[blob_input_data],\n",
        "    outputs=[processed_data1],\n",
        "    compute_target=aml_compute, \n",
-        "    source_directory=project_folder\n",
+        "    source_directory=source_directory,\n",
        "    runconfig=run_config\n",
        ")\n",
        "print(\"trainStep created\")"
      ]
@@ -304,7 +343,7 @@
        "    inputs=[processed_data1],\n",
        "    outputs=[processed_data2],\n",
        "    compute_target=aml_compute, \n",
-        "    source_directory=project_folder)\n",
+        "    source_directory=source_directory)\n",
        "print(\"extractStep created\")"
      ]
    },
@@ -312,8 +351,10 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "#### Define a Step that consumes multiple intermediate data and produces intermediate data\n",
+        "#### Define a Step that consumes intermediate data and existing data and produces intermediate data\n",
-        "In this step, we define a step that consumes multiple intermediate data and produces intermediate data.\n",
+        "In this step, we define a step that consumes multiple data types and produces intermediate data.\n",
        "\n",
        "This step uses the output generated from the previous step as well as existing data on a DataStore. The location of the existing data is specified using a [**PipelineParameter**](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.pipelineparameter?view=azure-ml-py) and a [**DataPath**](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.datapath.datapath?view=azure-ml-py). Using a PipelineParameter enables easy modification of the data location when the Pipeline is published and resubmitted.\n",
        "\n",
        "**Open `compare.py` in the local machine and examine the arguments, inputs, and outputs for the script. That will give you a good sense of why the script argument names used below are important.**"
      ]
@@ -324,16 +365,31 @@
      "metadata": {},
      "outputs": [],
      "source": [
-        "# Now define step6 that takes two inputs (both intermediate data), and produce an output\n",
+        "# Reference the data uploaded to blob storage using a PipelineParameter and a DataPath\n",
        "from azureml.pipeline.core import PipelineParameter\n",
        "from azureml.data.datapath import DataPath, DataPathComputeBinding\n",
        "\n",
        "datapath = DataPath(datastore=def_blob_store, path_on_datastore='20newsgroups/20news.pkl')\n",
        "datapath_param = PipelineParameter(name=\"compare_data\", default_value=datapath)\n",
        "data_parameter1 = (datapath_param, DataPathComputeBinding(mode='mount'))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Now define the compare step which takes two inputs and produces an output\n",
        "processed_data3 = PipelineData(\"processed_data3\", datastore=def_blob_store)\n",
        "\n",
        "compareStep = PythonScriptStep(\n",
        "    script_name=\"compare.py\",\n",
-        "    arguments=[\"--compare_data1\", processed_data1, \"--compare_data2\", processed_data2, \"--output_compare\", processed_data3],\n",
+        "    arguments=[\"--compare_data1\", data_parameter1, \"--compare_data2\", processed_data2, \"--output_compare\", processed_data3],\n",
-        "    inputs=[processed_data1, processed_data2],\n",
+        "    inputs=[data_parameter1, processed_data2],\n",
        "    outputs=[processed_data3],    \n",
        "    compute_target=aml_compute, \n",
-        "    source_directory=project_folder)\n",
+        "    source_directory=source_directory)\n",
        "print(\"compareStep created\")"
      ]
    },
@@ -351,10 +407,7 @@
      "outputs": [],
      "source": [
        "pipeline1 = Pipeline(workspace=ws, steps=[compareStep])\n",
-        "print (\"Pipeline is built\")\n",
+        "print (\"Pipeline is built\")"
        "\n",
        "pipeline1.validate()\n",
        "print(\"Simple validation complete\") "
      ]
    },
    {
--- a/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.ipynb
@@ -508,7 +508,8 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "### Get AAD token"
+        "### Get AAD token\n",
        "[This notebook](https://aka.ms/pl-restep-auth) shows how to authenticate to an AML workspace."
      ]
    },
    {
--- a/how-to-use-azureml/machine-learning-pipelines/pipeline-style-transfer/pipeline-style-transfer.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/pipeline-style-transfer/pipeline-style-transfer.ipynb
@@ -492,7 +492,8 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "## Get AAD token"
+        "## Get AAD token\n",
        "[This notebook](https://aka.ms/pl-restep-auth) shows how to authenticate to an AML workspace."
      ]
    },
    {
--- a/how-to-use-azureml/training-with-deep-learning/distributed-chainer/distributed-chainer.ipynb
+++ b/how-to-use-azureml/training-with-deep-learning/distributed-chainer/distributed-chainer.ipynb
@@ -220,14 +220,14 @@
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.runconfig import MpiConfiguration\n",
        "from azureml.train.dnn import Chainer\n",
        "\n",
        "estimator = Chainer(source_directory=project_folder,\n",
        "                    compute_target=compute_target,\n",
        "                    entry_script='train_mnist.py',\n",
        "                    node_count=2,\n",
-        "                    process_count_per_node=1,\n",
+        "                    distributed_training=MpiConfiguration(),\n",
        "                    distributed_backend='mpi',\n",
        "                    use_gpu=True)"
      ]
    },
--- a/how-to-use-azureml/training-with-deep-learning/distributed-pytorch-with-horovod/distributed-pytorch-with-horovod.ipynb
+++ b/how-to-use-azureml/training-with-deep-learning/distributed-pytorch-with-horovod/distributed-pytorch-with-horovod.ipynb
@@ -233,14 +233,14 @@
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.runconfig import MpiConfiguration\n",
        "from azureml.train.dnn import PyTorch\n",
        "\n",
        "estimator = PyTorch(source_directory=project_folder,\n",
        "                    compute_target=compute_target,\n",
        "                    entry_script='pytorch_horovod_mnist.py',\n",
        "                    node_count=2,\n",
-        "                    process_count_per_node=1,\n",
+        "                    distributed_training=MpiConfiguration(),\n",
        "                    distributed_backend='mpi',\n",
        "                    use_gpu=True)"
      ]
    },
--- a/how-to-use-azureml/training-with-deep-learning/distributed-tensorflow-with-horovod/distributed-tensorflow-with-horovod.ipynb
+++ b/how-to-use-azureml/training-with-deep-learning/distributed-tensorflow-with-horovod/distributed-tensorflow-with-horovod.ipynb
@@ -296,6 +296,7 @@
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.runconfig import MpiConfiguration\n",
        "from azureml.train.dnn import TensorFlow\n",
        "\n",
        "script_params={\n",
@@ -307,9 +308,7 @@
        "                      script_params=script_params,\n",
        "                      entry_script='tf_horovod_word2vec.py',\n",
        "                      node_count=2,\n",
-        "                      process_count_per_node=1,\n",
+        "                      distributed_training=MpiConfiguration(),\n",
        "                      distributed_backend='mpi',\n",
        "                      use_gpu=True, \n",
        "                      framework_version='1.12')"
      ]
    },
--- a/how-to-use-azureml/training-with-deep-learning/distributed-tensorflow-with-parameter-server/distributed-tensorflow-with-parameter-server.ipynb
+++ b/how-to-use-azureml/training-with-deep-learning/distributed-tensorflow-with-parameter-server/distributed-tensorflow-with-parameter-server.ipynb
@@ -26,7 +26,7 @@
        "* Go through the [configuration notebook](../../../configuration.ipynb) to:\n",
        "    * install the AML SDK\n",
        "    * create a workspace and its configuration file (`config.json`)\n",
-        "* Review the [tutorial](https://aka.ms/aml-notebook-hyperdrive) on single-node TensorFlow training using the SDK"
+        "* Review the [tutorial](../train-hyperparameter-tune-deploy-with-tensorflow/train-hyperparameter-tune-deploy-with-tensorflow.ipynb) on single-node TensorFlow training using the SDK"
      ]
    },
    {
@@ -208,6 +208,7 @@
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.runconfig import TensorflowConfiguration\n",
        "from azureml.train.dnn import TensorFlow\n",
        "\n",
        "script_params={\n",
@@ -215,14 +216,15 @@
        "    '--train_steps': 500\n",
        "}\n",
        "\n",
        "distributed_training = TensorflowConfiguration()\n",
        "distributed_training.worker_count = 2\n",
        "\n",
        "estimator = TensorFlow(source_directory=project_folder,\n",
        "                       compute_target=compute_target,\n",
        "                       script_params=script_params,\n",
        "                       entry_script='tf_mnist_replica.py',\n",
        "                       node_count=2,\n",
-        "                       worker_count=2,\n",
+        "                       distributed_training=distributed_training,\n",
        "                       parameter_server_count=1,   \n",
        "                       distributed_backend='ps',\n",
        "                       use_gpu=True)"
      ]
    },
--- a/how-to-use-azureml/training/logging-api/logging-api.ipynb
+++ b/how-to-use-azureml/training/logging-api/logging-api.ipynb
@@ -54,7 +54,7 @@
        "\n",
        "The experiment's Run History report page automatically creates a report that can be customized to show the KPIs, charts, and column sets that are interesting to you. \n",
        "\n",
-        "| ![Run Details](./img/run_details.PNG) | ![Run History](./img/run_history.png) |\n",
+        "| ![Run Details](./img/run_details.PNG) | ![Run History](./img/run_history.PNG) |\n",
        "|:--:|:--:|\n",
        "| *Run Details* | *Run History* |\n",
        "\n",
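Review note on the `logging-api.ipynb` hunk above: the only change is the case of the image extension (`run_history.png` to `run_history.PNG`). On a case-sensitive file system these are two different names, so a reference with the wrong case would not resolve. The following is a minimal sketch, assuming the images sit next to the notebook in an `img` folder, of how such mismatches could be detected. `IMAGE_REF` and `find_case_mismatches` are hypothetical names introduced for illustration; they are not part of this repository or the Azure ML SDK.

```python
import re
from pathlib import Path

# Matches Markdown image references such as ![Run History](./img/run_history.PNG)
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def find_case_mismatches(markdown_text, base_dir):
    """Return (referenced_path, on_disk_name) pairs that differ only by letter case."""
    mismatches = []
    for ref in IMAGE_REF.findall(markdown_text):
        if "://" in ref:
            continue  # remote URL, nothing to check on disk
        target = Path(base_dir) / ref
        parent = target.parent
        if not parent.is_dir():
            continue  # the referenced folder does not exist; out of scope here
        names = [p.name for p in parent.iterdir()]
        if target.name in names:
            continue  # exact, case-sensitive match
        for name in names:
            if name.lower() == target.name.lower():
                mismatches.append((ref, name))
    return mismatches
```

Called with the Markdown text of a cell and the notebook's folder, the function returns an empty list when every local image reference matches a file name exactly. Comparing against the directory listing rather than calling `Path.exists()` keeps the check meaningful on case-insensitive file systems as well.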
--- a/images/python36.png
+++ b/images/python36.png
--- a/images/yt_cover.png
+++ b/images/yt_cover.png
--- a/index.html
+++ b/index.html
@@ -0,0 +1,52 @@
 <!DOCTYPE html>
 <html>
    <head>
        <meta name="google-site-verification" content="fkZxAt5AEHiB_Wom2R_25VTmNyj19J8lZlfTREsaEN4" />
        <title>Azure Machine Learning</title>
        </head>
 <body>
 <h1 id="azure-machine-learning-service-example-notebooks">Azure Machine Learning service example notebooks</h1>
 <p>This repository contains example notebooks demonstrating the <a href="https://azure.microsoft.com/en-us/services/machine-learning-service/">Azure Machine Learning</a> Python SDK, which allows you to build, train, deploy and manage machine learning solutions using Azure. The AML SDK lets you choose between local and cloud compute resources while managing and maintaining the complete data science workflow from the cloud.</p>
 <div class="figure">
 <img src="https://raw.githubusercontent.com/MicrosoftDocs/azure-docs/master/articles/machine-learning/service/media/overview-what-is-azure-ml/aml.png" alt="Azure ML workflow" /><p class="caption">Azure ML workflow</p>
 </div>
 <h2 id="quick-installation">Quick installation</h2>
 <pre class="sh"><code>pip install azureml-sdk</code></pre>
 <p>Read more detailed instructions on <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/NBSETUP.md">how to set up your environment</a> using Azure Notebook service, your own Jupyter notebook server, or Docker.</p>
 <h2 id="how-to-navigate-and-use-the-example-notebooks">How to navigate and use the example notebooks?</h2>
 <p>You should always run the <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb">Configuration</a> notebook first when setting up a notebook library on a new machine or in a new environment. It configures your notebook library to connect to an Azure Machine Learning workspace, and sets up your workspace and compute to be used by many of the other examples.</p>
 <p>If you want to...</p>
 <ul>
 <li>...try out and explore Azure ML, start with image classification tutorials: <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/tutorials/img-classification-part1-training.ipynb">Part 1 (Training)</a> and <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/tutorials/img-classification-part2-deploy.ipynb">Part 2 (Deployment)</a>.</li>
 <li>...prepare your data and do automated machine learning, start with regression tutorials: <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/tutorials/regression-part1-data-prep.ipynb">Part 1 (Data Prep)</a> and <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/tutorials/regression-part2-automated-ml.ipynb">Part 2 (Automated ML)</a>.</li>
 <li>...learn about experimentation and tracking run history, first <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb">train within Notebook</a>, then try <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training/train-on-remote-vm/train-on-remote-vm.ipynb">training on remote VM</a> and <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training/logging-api/logging-api.ipynb">using logging APIs</a>.</li>
 <li>...train deep learning models at scale, first learn about <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb">Machine Learning Compute</a>, and then try <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-pytorch/train-hyperparameter-tune-deploy-with-pytorch.ipynb">distributed hyperparameter tuning</a> and <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training-with-deep-learning/distributed-pytorch-with-horovod/distributed-pytorch-with-horovod.ipynb">distributed training</a>.</li>
 <li>...deploy models as a realtime scoring service, first learn the basics by <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb">training within Notebook and deploying to Azure Container Instance</a>, then learn how to <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb">register and manage models, and create Docker images</a>, and <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb">production deploy models on Azure Kubernetes Cluster</a>.</li>
 <li>...deploy models as a batch scoring service, first <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb">train a model within Notebook</a>, learn how to <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb">register and manage models</a>, then <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb">create Machine Learning Compute for scoring compute</a>, and <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/pipeline-mpi-batch-prediction.ipynb">use Machine Learning Pipelines to deploy your model</a>.</li>
 <li>...monitor your deployed models, learn about using <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/enable-app-insights-in-production-service/enable-app-insights-in-production-service.ipynb">App Insights</a> and <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/enable-data-collection-for-models-in-aks/enable-data-collection-for-models-in-aks.ipynb">model data collection</a>.</li>
 </ul>
 <h2 id="tutorials">Tutorials</h2>
 <p>The <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/tutorials">Tutorials</a> folder contains notebooks for the tutorials described in the <a href="https://aka.ms/aml-docs">Azure Machine Learning documentation</a>.</p>
 <h2 id="how-to-use-azure-ml">How to use Azure ML</h2>
 <p>The <a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml">How to use Azure ML</a> folder contains specific examples demonstrating the features of the Azure Machine Learning SDK.</p>
 <ul>
 <li><a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training">Training</a> - Examples of how to build models using Azure ML's logging and execution capabilities on local and remote compute targets</li>
 <li><a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/training-with-deep-learning">Training with Deep Learning</a> - Examples demonstrating how to build deep learning models using estimators and parameter sweeps</li>
 <li><a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/manage-azureml-service">Manage Azure ML Service</a> - Examples of how to perform tasks, such as authenticating against the Azure ML service in different ways</li>
 <li><a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning">Automated Machine Learning</a> - Examples using Automated Machine Learning to automatically generate optimal machine learning pipelines and models</li>
 <li><a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines">Machine Learning Pipelines</a> - Examples showing how to create and use reusable pipelines for training and batch scoring</li>
 <li><a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment">Deployment</a> - Examples showing how to deploy and manage machine learning models and solutions</li>
 <li><a href="https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/azure-databricks">Azure Databricks</a> - Examples showing how to use Azure ML with Azure Databricks</li>
 </ul>
 <h2 id="projects-using-azure-machine-learning">Projects using Azure Machine Learning</h2>
 <p>Visit the following repos to see projects contributed by Azure ML users:</p>
 <ul>
 <li><a href="https://github.com/Microsoft/AzureML-BERT">Fine tune natural language processing models using Azure Machine Learning service</a></li>
 <li><a href="https://github.com/amynic/azureml-sdk-fashion">Fashion MNIST with Azure ML SDK</a></li>
 </ul>
 </body>
 </html>
Author	SHA1	Message	Date
Roope Astala	644729e5db	Merge pull request #333 from rastala/master version 1.0.30	2019-04-22 15:40:11 -04:00
Roope Astala	e2b1b3fcaa	version 1.0.30	2019-04-22 15:39:18 -04:00
Roope Astala	dc692589a9	Merge pull request #326 from rastala/master update aks notebook	2019-04-18 16:19:51 -04:00
Roope Astala	624b4595b5	update aks notebook	2019-04-18 16:18:33 -04:00
Roope Astala	0ed85c33c2	Delete release.json	2019-04-18 10:01:50 -04:00
Roope Astala	5b01de605f	Merge pull request #318 from savitamittal1/hdinotebook Sample HDI notebook	2019-04-18 10:01:26 -04:00
Savitam	c351ac988a	Sample HDI notebook sample HDI notebook	2019-04-15 12:35:34 -07:00
Josée Martens	759ec3934c	Delete yt_cover.png	2019-04-15 12:06:25 -05:00
Josée Martens	b499b88a85	Delete python36.png	2019-04-15 12:06:16 -05:00
Josée Martens	5f4edac3c1	Update NBSETUP.md	2019-04-15 12:00:31 -05:00
Josée Martens	edfce0d936	Update README.md	2019-04-12 17:28:16 -05:00
Josée Martens	1516c7fc24	Update README.md testing for search	2019-04-12 17:19:55 -05:00
Roope Astala	389fb668ce	Add files via upload	2019-04-10 11:12:55 -04:00
Josée Martens	647d5e72a5	Merge pull request #307 from Azure/vizhur-patch-2 Create googled8147fb6c0788258.html	2019-04-09 15:21:51 -05:00
vizhur	43ac4c84bb	Create googled8147fb6c0788258.html	2019-04-09 16:19:47 -04:00
Roope Astala	8a1a82b50a	Merge pull request #303 from rastala/master dockerfile and missing config update	2019-04-08 15:38:13 -04:00
Roope Astala	72f386298c	dockerfile and missing config update	2019-04-08 15:37:48 -04:00