add cell metadata

Update train-within-notebook.ipynb
2025-12-20 09:37:04 -05:00 · 2020-02-04 11:32:41 -06:00 · 2020-02-04 11:06:41 -06:00 · 2020-02-04 09:13:56 -06:00 · 2020-01-31 15:19:58 -05:00 · 2020-01-23 15:46:43 -08:00
67 changed files with 834 additions and 2853 deletions
--- a/README.md
+++ b/README.md
@@ -20,8 +20,8 @@ If you want to...
 * ...try out and explore Azure ML, start with image classification tutorials: [Part 1 (Training)](./tutorials/img-classification-part1-training.ipynb) and [Part 2 (Deployment)](./tutorials/img-classification-part2-deploy.ipynb).
 * ...learn about experimentation and tracking run history, first [train within Notebook](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), then try [training on remote VM](./how-to-use-azureml/training/train-on-remote-vm/train-on-remote-vm.ipynb) and [using logging APIs](./how-to-use-azureml/training/logging-api/logging-api.ipynb).
 * ...train deep learning models at scale, first learn about [Machine Learning Compute](./how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb), and then try [distributed hyperparameter tuning](./how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-pytorch/train-hyperparameter-tune-deploy-with-pytorch.ipynb) and [distributed training](./how-to-use-azureml/training-with-deep-learning/distributed-pytorch-with-horovod/distributed-pytorch-with-horovod.ipynb).
- * ...deploy models as a realtime scoring service, first learn the basics by [training within Notebook and deploying to Azure Container Instance](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), then learn how to [register and manage models, and create Docker images](./how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb), and [production deploy models on Azure Kubernetes Cluster](./how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb).
+ * ...deploy models as a realtime scoring service, first learn the basics by [training within Notebook and deploying to Azure Container Instance](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), then learn how to [production deploy models on Azure Kubernetes Cluster](./how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb).
- * ...deploy models as a batch scoring service, first [train a model within Notebook](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), learn how to [register and manage models](./how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb), then [create Machine Learning Compute for scoring compute](./how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb), and [use Machine Learning Pipelines to deploy your model](https://aka.ms/pl-batch-scoring).
+ * ...deploy models as a batch scoring service, first [train a model within Notebook](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), then [create Machine Learning Compute for scoring compute](./how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb), and [use Machine Learning Pipelines to deploy your model](https://aka.ms/pl-batch-scoring).
 * ...monitor your deployed models, learn about using [App Insights](./how-to-use-azureml/deployment/enable-app-insights-in-production-service/enable-app-insights-in-production-service.ipynb).
 ## Tutorials
--- a/configuration.ipynb
+++ b/configuration.ipynb
@@ -103,7 +103,7 @@
      "source": [
        "import azureml.core\n",
        "\n",
-        "print(\"This notebook was created using version 1.0.83 of the Azure ML SDK\")\n",
+        "print(\"This notebook was created using version 1.0.85 of the Azure ML SDK\")\n",
        "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
      ]
    },
--- a/how-to-use-azureml/README.md
+++ b/how-to-use-azureml/README.md
@@ -9,7 +9,6 @@ As a pre-requisite, run the [configuration Notebook](../configuration.ipynb) not
 * [train-on-amlcompute](./training/train-on-amlcompute): Use a 1-n node Azure ML managed compute cluster for remote runs on Azure CPU or GPU infrastructure.
 * [train-on-remote-vm](./training/train-on-remote-vm): Use Data Science Virtual Machine as a target for remote runs.
 * [logging-api](./track-and-monitor-experiments/logging-api): Learn about the details of logging metrics to run history.
-* [register-model-create-image-deploy-service](./deployment/register-model-create-image-deploy-service): Learn about the details of model management.
 * [production-deploy-to-aks](./deployment/production-deploy-to-aks) Deploy a model to production at scale on Azure Kubernetes Service.
 * [enable-app-insights-in-production-service](./deployment/enable-app-insights-in-production-service) Learn how to use App Insights with production web service.
--- a/how-to-use-azureml/automated-machine-learning/README.md
+++ b/how-to-use-azureml/automated-machine-learning/README.md
@@ -197,6 +197,17 @@ If automl_setup_linux.sh fails on Ubuntu Linux with the error: `unable to execut
 4) Check that the region is one of the supported regions: `eastus2`, `eastus`, `westcentralus`, `southeastasia`, `westeurope`, `australiaeast`, `westus2`, `southcentralus`
 5) Check that you have access to the region using the Azure Portal.
+## import AutoMLConfig fails after upgrade from before 1.0.76 to 1.0.76 or later
+There were package changes in automated machine learning version 1.0.76, which require the previous version to be uninstalled before upgrading to the new version.
+If you have manually upgraded from a version of automated machine learning before 1.0.76 to 1.0.76 or later, you may get the error:
+`ImportError: cannot import name 'AutoMLConfig'`
+This can be resolved by running:
+`pip uninstall azureml-train-automl` and then
+`pip install azureml-train-automl`
+The automl_setup.cmd script does this automatically.
 ## workspace.from_config fails
 If the call `ws = Workspace.from_config()` fails:
 1) Make sure that you have run the `configuration.ipynb` notebook successfully.
--- a/how-to-use-azureml/automated-machine-learning/automl_env.yml
+++ b/how-to-use-azureml/automated-machine-learning/automl_env.yml
@@ -2,7 +2,7 @@ name: azure_automl
 dependencies:
  # The python interpreter version.
  # Currently Azure ML only supports 3.5.2 and later.
-- pip
+- pip<=19.3.1
 - python>=3.5.2,<3.6.8
 - nb_conda
 - matplotlib==2.1.0
--- a/how-to-use-azureml/automated-machine-learning/automl_env_mac.yml
+++ b/how-to-use-azureml/automated-machine-learning/automl_env_mac.yml
@@ -2,7 +2,7 @@ name: azure_automl
 dependencies:
  # The python interpreter version.
  # Currently Azure ML only supports 3.5.2 and later.
-- pip
+- pip<=19.3.1
 - nomkl
 - python>=3.5.2,<3.6.8
 - nb_conda
--- a/how-to-use-azureml/automated-machine-learning/classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.ipynb
@@ -92,6 +92,32 @@
        "from azureml.explain.model._internal.explanation_client import ExplanationClient"
      ]
    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "Accessing the Azure ML workspace requires authentication with Azure.\n",
+        "\n",
+        "The default authentication is interactive authentication using the default tenant.  Executing the `ws = Workspace.from_config()` line in the cell below will prompt for authentication the first time that it is run.\n",
+        "\n",
+        "If you have multiple Azure tenants, you can specify the tenant by replacing the `ws = Workspace.from_config()` line in the cell below with the following:\n",
+        "\n",
+        "```\n",
+        "from azureml.core.authentication import InteractiveLoginAuthentication\n",
+        "auth = InteractiveLoginAuthentication(tenant_id = 'mytenantid')\n",
+        "ws = Workspace.from_config(auth = auth)\n",
+        "```\n",
+        "\n",
+        "If you need to run in an environment where interactive login is not possible, you can use Service Principal authentication by replacing the `ws = Workspace.from_config()` line in the cell below with the following:\n",
+        "\n",
+        "```\n",
+        "from azureml.core.authentication import ServicePrincipalAuthentication\n",
+        "auth = ServicePrincipalAuthentication('mytenantid', 'myappid', 'mypassword')\n",
+        "ws = Workspace.from_config(auth = auth)\n",
+        "```\n",
+        "For more details, see [aka.ms/aml-notebook-auth](http://aka.ms/aml-notebook-auth)"
+      ]
+    },
    {
      "cell_type": "code",
      "execution_count": null,
--- a/how-to-use-azureml/automated-machine-learning/classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.yml
+++ b/how-to-use-azureml/automated-machine-learning/classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.yml
@@ -2,3 +2,10 @@ name: auto-ml-classification-bank-marketing-all-features
 dependencies:
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
  - interpret
  - onnxruntime==1.0.0
  - azureml-explain-model
  - azureml-contrib-interpret
--- a/how-to-use-azureml/automated-machine-learning/classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb
@@ -210,7 +210,6 @@
        "automl_settings = {\n",
        "    \"n_cross_validations\": 3,\n",
        "    \"primary_metric\": 'average_precision_score_weighted',\n",
-        "    \"preprocess\": True,\n",
        "    \"enable_early_stopping\": True,\n",
        "    \"max_concurrent_iterations\": 2, # This is a limit for testing purpose, please increase it as per cluster size\n",
        "    \"experiment_timeout_hours\": 0.2, # This is a time limit for testing purposes, remove it for real use cases, this will drastically limit ability to find the best model possible\n",
--- a/how-to-use-azureml/automated-machine-learning/classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.yml
+++ b/how-to-use-azureml/automated-machine-learning/classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.yml
@@ -2,3 +2,8 @@ name: auto-ml-classification-credit-card-fraud
 dependencies:
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
  - interpret
  - azureml-explain-model
--- a/how-to-use-azureml/automated-machine-learning/classification-text-dnn/auto-ml-classification-text-dnn.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/classification-text-dnn/auto-ml-classification-text-dnn.ipynb
@@ -275,7 +275,6 @@
        "automl_settings = {\n",
        "    \"experiment_timeout_minutes\": 20,\n",
        "    \"primary_metric\": 'accuracy',\n",
-        "    \"preprocess\": True,\n",
        "    \"max_concurrent_iterations\": 4, \n",
        "    \"max_cores_per_iteration\": -1,\n",
        "    \"enable_dnn\": True,\n",
--- a/how-to-use-azureml/automated-machine-learning/classification-text-dnn/auto-ml-classification-text-dnn.yml
+++ b/how-to-use-azureml/automated-machine-learning/classification-text-dnn/auto-ml-classification-text-dnn.yml
@@ -2,3 +2,7 @@ name: auto-ml-classification-text-dnn
 dependencies:
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
  - azureml-train
--- a/how-to-use-azureml/automated-machine-learning/continuous-retraining/auto-ml-continuous-retraining.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/continuous-retraining/auto-ml-continuous-retraining.ipynb
@@ -350,7 +350,6 @@
        "    \"experiment_timeout_hours\": 0.2,\n",
        "    \"n_cross_validations\": 3,\n",
        "    \"primary_metric\": 'r2_score',\n",
-        "    \"preprocess\": True,\n",
        "    \"max_concurrent_iterations\": 3,\n",
        "    \"max_cores_per_iteration\": -1,\n",
        "    \"verbosity\": logging.INFO,\n",
--- a/how-to-use-azureml/automated-machine-learning/continuous-retraining/auto-ml-continuous-retraining.yml
+++ b/how-to-use-azureml/automated-machine-learning/continuous-retraining/auto-ml-continuous-retraining.yml
@@ -2,3 +2,7 @@ name: auto-ml-continuous-retraining
 dependencies:
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
  - azureml-pipeline
--- a/how-to-use-azureml/automated-machine-learning/forecasting-beer-remote/auto-ml-forecasting-beer-remote.yml
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-beer-remote/auto-ml-forecasting-beer-remote.yml
@@ -1,4 +1,10 @@
 name: auto-ml-forecasting-beer-remote
 dependencies:
 - fbprophet==0.5
 - py-xgboost<=0.80
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
  - azureml-train
--- a/how-to-use-azureml/automated-machine-learning/forecasting-beer-remote/helper.py
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-beer-remote/helper.py
@@ -76,9 +76,12 @@ def get_result_df(remote_run):
 def run_inference(test_experiment, compute_target, script_folder, train_run,
                  test_dataset, lookback_dataset, max_horizon,
                  target_column_name, time_column_name, freq):
-    train_run.download_file('outputs/model.pkl', 'inference/model.pkl')
-    train_run.download_file('outputs/conda_env_v_1_0_0.yml',
-                            'inference/condafile.yml')
+    model_base_name = 'model.pkl'
+    if 'model_data_location' in train_run.properties:
+        model_location = train_run.properties['model_data_location']
+        _, model_base_name = model_location.rsplit('/', 1)
+    train_run.download_file('outputs/{}'.format(model_base_name), 'inference/{}'.format(model_base_name))
+    train_run.download_file('outputs/conda_env_v_1_0_0.yml', 'inference/condafile.yml')
    inference_env = Environment("myenv")
    inference_env.docker.enabled = True
@@ -91,7 +94,8 @@ def run_inference(test_experiment, compute_target, script_folder, train_run,
                        '--max_horizon': max_horizon,
                        '--target_column_name': target_column_name,
                        '--time_column_name': time_column_name,
-                        '--frequency': freq
+                        '--frequency': freq,
+                        '--model_path': model_base_name
                    },
                    inputs=[test_dataset.as_named_input('test_data'),
                            lookback_dataset.as_named_input('lookback_data')],
--- a/how-to-use-azureml/automated-machine-learning/forecasting-beer-remote/infer.py
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-beer-remote/infer.py
@@ -232,6 +232,9 @@ parser.add_argument(
 parser.add_argument(
    '--frequency', type=str, dest='freq',
    help='Frequency of prediction')
+parser.add_argument(
+    '--model_path', type=str, dest='model_path',
+    default='model.pkl', help='Filename of model to be loaded')
 args = parser.parse_args()
@@ -239,6 +242,7 @@ max_horizon = args.max_horizon
 target_column_name = args.target_column_name
 time_column_name = args.time_column_name
 freq = args.freq
+model_path = args.model_path
 print('args passed are: ')
@@ -246,6 +250,7 @@ print(max_horizon)
 print(target_column_name)
 print(time_column_name)
 print(freq)
+print(model_path)
 run = Run.get_context()
 # get input dataset by name
@@ -267,7 +272,8 @@ X_lookback_df = lookback_dataset.drop_columns(columns=[target_column_name])
 y_lookback_df = lookback_dataset.with_timestamp_columns(
    None).keep_columns(columns=[target_column_name])
-fitted_model = joblib.load('model.pkl')
+fitted_model = joblib.load(model_path)
 if hasattr(fitted_model, 'get_lookback'):
    lookback = fitted_model.get_lookback()
--- a/how-to-use-azureml/automated-machine-learning/forecasting-bike-share/auto-ml-forecasting-bike-share.yml
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-bike-share/auto-ml-forecasting-bike-share.yml
@@ -1,4 +1,9 @@
 name: auto-ml-forecasting-bike-share
 dependencies:
 - fbprophet==0.5
 - py-xgboost<=0.80
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
--- a/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand/auto-ml-forecasting-energy-demand.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand/auto-ml-forecasting-energy-demand.ipynb
@@ -253,7 +253,7 @@
      "source": [
        "# split into train based on time\n",
        "train = dataset.time_before(datetime(2017, 8, 8, 5), include_boundary=True)\n",
-        "train.to_pandas_dataframe().sort_values(time_column_name).tail(5).reset_index(drop=True)"
+        "train.to_pandas_dataframe().reset_index(drop=True).sort_values(time_column_name).tail(5)"
      ]
    },
    {
@@ -264,7 +264,7 @@
      "source": [
        "# split into test based on time\n",
        "test = dataset.time_between(datetime(2017, 8, 8, 6), datetime(2017, 8, 10, 5))\n",
-        "test.to_pandas_dataframe().head(5).reset_index(drop=True)"
+        "test.to_pandas_dataframe().reset_index(drop=True).head(5)"
      ]
    },
    {
--- a/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand/auto-ml-forecasting-energy-demand.yml
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-energy-demand/auto-ml-forecasting-energy-demand.yml
@@ -2,3 +2,9 @@ name: auto-ml-forecasting-energy-demand
 dependencies:
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
  - interpret
  - azureml-explain-model
  - azureml-contrib-interpret
--- a/how-to-use-azureml/automated-machine-learning/forecasting-grouping/auto-ml-forecasting-grouping.yml
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-grouping/auto-ml-forecasting-grouping.yml
@@ -2,3 +2,7 @@ name: auto-ml-forecasting-grouping
 dependencies:
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
  - azureml-pipeline
--- a/how-to-use-azureml/automated-machine-learning/forecasting-grouping/deploy/deploy.py
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-grouping/deploy/deploy.py
@@ -1,9 +1,9 @@
 import argparse
 import json
-from azureml.core import Run, Model, Workspace
+from azureml.core import Run, Model
 from azureml.core.conda_dependencies import CondaDependencies
 from azureml.core.model import InferenceConfig
 from azureml.core.environment import Environment
 from azureml.core.webservice import AciWebservice
@@ -39,6 +39,8 @@ print(model_list)
 run = Run.get_context()
 ws = run.experiment.workspace
+myenv = Environment.from_conda_specification(name="env", file_path=conda_env_file_name)
 deployment_config = AciWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=2,
@@ -46,11 +48,7 @@ deployment_config = AciWebservice.deploy_configuration(
    description='grouping demo aci deployment'
 )
-inference_config = InferenceConfig(
-    entry_script=script_file_name,
-    runtime='python',
-    conda_file=conda_env_file_name
-)
+inference_config = InferenceConfig(entry_script=script_file_name, environment=myenv)
 models = []
 for model_name in model_list:
--- a/how-to-use-azureml/automated-machine-learning/forecasting-high-frequency/automl-forecasting-function.yml
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-high-frequency/automl-forecasting-function.yml
@@ -1,4 +1,9 @@
 name: automl-forecasting-function
 dependencies:
 - fbprophet==0.5
 - py-xgboost<=0.80
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
--- a/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb
@@ -631,9 +631,7 @@
      "outputs": [],
      "source": [
        "import json\n",
        "# The request data frame needs to have y_query column which corresponds to query.\n",
        "X_query = X_test.copy()\n",
        "X_query['y_query'] = np.NaN\n",
        "# We have to convert datetime to string, because Timestamps cannot be serialized to JSON.\n",
        "X_query[time_column_name] = X_query[time_column_name].astype(str)\n",
        "# The Service object accept the complex dictionary, which is internally converted to JSON string.\n",
--- a/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.yml
+++ b/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.yml
@@ -1,4 +1,9 @@
 name: auto-ml-forecasting-orange-juice-sales
 dependencies:
 - fbprophet==0.5
 - py-xgboost<=0.80
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
--- a/how-to-use-azureml/automated-machine-learning/local-run-classification-credit-card-fraud/auto-ml-classification-credit-card-fraud-local.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/local-run-classification-credit-card-fraud/auto-ml-classification-credit-card-fraud-local.ipynb
@@ -155,7 +155,6 @@
        "automl_settings = {\n",
        "    \"n_cross_validations\": 3,\n",
        "    \"primary_metric\": 'average_precision_score_weighted',\n",
-        "    \"preprocess\": True,\n",
        "    \"experiment_timeout_hours\": 0.2, # This is a time limit for testing purposes, remove it for real use cases, this will drastically limit ability to find the best model possible\n",
        "    \"verbosity\": logging.INFO,\n",
        "    \"enable_stack_ensemble\": False\n",
--- a/how-to-use-azureml/automated-machine-learning/local-run-classification-credit-card-fraud/auto-ml-classification-credit-card-fraud-local.yml
+++ b/how-to-use-azureml/automated-machine-learning/local-run-classification-credit-card-fraud/auto-ml-classification-credit-card-fraud-local.yml
@@ -2,3 +2,8 @@ name: auto-ml-classification-credit-card-fraud-local
 dependencies:
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
  - interpret
  - azureml-explain-model
--- a/how-to-use-azureml/automated-machine-learning/regression-hardware-performance-explanation-and-featurization/auto-ml-regression-hardware-performance-explanation-and-featurization.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/regression-hardware-performance-explanation-and-featurization/auto-ml-regression-hardware-performance-explanation-and-featurization.ipynb
@@ -208,7 +208,7 @@
        "|**primary_metric**|This is the metric that you want to optimize. Regression supports the following primary metrics: <br><i>spearman_correlation</i><br><i>normalized_root_mean_squared_error</i><br><i>r2_score</i><br><i>normalized_mean_absolute_error</i>|\n",
        "|**experiment_timeout_hours**| Maximum amount of time in hours that all iterations combined can take before the experiment terminates.|\n",
        "|**enable_early_stopping**| Flag to enable early termination if the score is not improving in the short term.|\n",
-        "|**featurization**| 'auto' / 'off' / FeaturizationConfig Indicator for whether featurization step should be done automatically or not, or whether customized featurization should be used. Note: If the input data is sparse, featurization cannot be turned on.|\n",
+        "|**featurization**| 'auto' / 'off' / FeaturizationConfig Indicator for whether featurization step should be done automatically or not, or whether customized featurization should be used. Setting this enables AutoML to perform featurization on the input to handle *missing data*, and to perform some common *feature extraction*. Note: If the input data is sparse, featurization cannot be turned on.|\n",
        "|**n_cross_validations**|Number of cross validation splits.|\n",
        "|**training_data**|(sparse) array-like, shape = [n_samples, n_features]|\n",
        "|**label_column_name**|(sparse) array-like, shape = [n_samples, ], targets values.|"
@@ -244,7 +244,7 @@
      "source": [
        "featurization_config = FeaturizationConfig()\n",
        "featurization_config.blocked_transformers = ['LabelEncoder']\n",
-        "#featurization_config.drop_columns = ['ERP', 'MMIN']\n",
+        "#featurization_config.drop_columns = ['MMIN']\n",
        "featurization_config.add_column_purpose('MYCT', 'Numeric')\n",
        "featurization_config.add_column_purpose('VendorName', 'CategoricalHash')\n",
        "#default strategy mean, add transformer param for 3 columns\n",
--- a/how-to-use-azureml/automated-machine-learning/regression-hardware-performance-explanation-and-featurization/auto-ml-regression-hardware-performance-explanation-and-featurization.yml
+++ b/how-to-use-azureml/automated-machine-learning/regression-hardware-performance-explanation-and-featurization/auto-ml-regression-hardware-performance-explanation-and-featurization.yml
@@ -2,3 +2,10 @@ name: auto-ml-regression-hardware-performance-explanation-and-featurization
 dependencies:
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
  - interpret
  - azureml-explain-model
  - azureml-explain-model
  - azureml-contrib-interpret
--- a/how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.ipynb
+++ b/how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.ipynb
@@ -198,7 +198,6 @@
        "automl_settings = {\n",
        "    \"n_cross_validations\": 3,\n",
        "    \"primary_metric\": 'r2_score',\n",
-        "    \"preprocess\": True,\n",
        "    \"enable_early_stopping\": True, \n",
        "    \"experiment_timeout_hours\": 0.3, #for real scenarios we recommend a timeout of at least one hour \n",
        "    \"max_concurrent_iterations\": 4,\n",
--- a/how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.yml
+++ b/how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.yml
@@ -2,3 +2,6 @@ name: auto-ml-regression
 dependencies:
 - pip:
  - azureml-sdk
  - azureml-train-automl
  - azureml-widgets
  - matplotlib
--- a/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aci-04.ipynb
+++ b/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aci-04.ipynb
@@ -11,6 +11,13 @@
        "Licensed under the MIT License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Register Azure Databricks trained model and deploy it to ACI\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
@@ -265,6 +272,15 @@
        "myservice.delete()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Deploying to other types of computes\n",
        "\n",
        "In order to learn how to deploy to other types of compute targets, such as AKS, please take a look at the set of notebooks in the [deployment](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/deployment) folder."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
--- a/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aks-05.ipynb
+++ b/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aks-05.ipynb
@@ -1,312 +0,0 @@
 {
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Azure ML & Azure Databricks notebooks by Parashar Shah.\n",
        "\n",
        "Copyright (c) Microsoft Corporation. All rights reserved.\n",
        "\n",
        "Licensed under the MIT License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "This notebook uses image from ACI notebook for deploying to AKS."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import azureml.core\n",
        "\n",
        "# Check core SDK version number\n",
        "print(\"SDK version:\", azureml.core.VERSION)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Set auth to be used by workspace related APIs.\n",
        "# For automation or CI/CD ServicePrincipalAuthentication can be used.\n",
        "# https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.authentication.serviceprincipalauthentication?view=azure-ml-py\n",
        "auth = None"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core import Workspace\n",
        "\n",
        "ws = Workspace.from_config(auth = auth)\n",
        "print('Workspace name: ' + ws.name, \n",
        "      'Azure region: ' + ws.location, \n",
        "      'Subscription id: ' + ws.subscription_id, \n",
        "      'Resource group: ' + ws.resource_group, sep = '\\n')"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "#Register the model\n",
        "import os\n",
        "from azureml.core.model import Model\n",
        "\n",
        "model_name = \"AdultCensus_runHistory_aks.mml\" # \n",
        "model_name_dbfs = os.path.join(\"/dbfs\", model_name)\n",
        "\n",
        "print(\"copy model from dbfs to local\")\n",
        "model_local = \"file:\" + os.getcwd() + \"/\" + model_name\n",
        "dbutils.fs.cp(model_name, model_local, True)\n",
        "\n",
        "mymodel = Model.register(model_path = model_name, # this points to a local file\n",
        "                       model_name = model_name, # this is the name the model is registered as, am using same name for both path and name.                 \n",
        "                       description = \"ADB trained model by Parashar\",\n",
        "                       workspace = ws)\n",
        "\n",
        "print(mymodel.name, mymodel.description, mymodel.version)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "#%%writefile score_sparkml.py\n",
        "score_sparkml = \"\"\"\n",
        " \n",
        "import json\n",
        " \n",
        "def init():\n",
        "    # One-time initialization of PySpark and predictive model\n",
        "    import pyspark\n",
        "    from azureml.core.model import Model\n",
        "    from pyspark.ml import PipelineModel\n",
        " \n",
        "    global trainedModel\n",
        "    global spark\n",
        " \n",
        "    spark = pyspark.sql.SparkSession.builder.appName(\"ADB and AML notebook by Parashar\").getOrCreate()\n",
        "    model_name = \"{model_name}\" #interpolated\n",
        "    model_path = Model.get_model_path(model_name)\n",
        "    trainedModel = PipelineModel.load(model_path)\n",
        "    \n",
        "def run(input_json):\n",
        "    if isinstance(trainedModel, Exception):\n",
        "        return json.dumps({{\"trainedModel\":str(trainedModel)}})\n",
        "      \n",
        "    try:\n",
        "        sc = spark.sparkContext\n",
        "        input_list = json.loads(input_json)\n",
        "        input_rdd = sc.parallelize(input_list)\n",
        "        input_df = spark.read.json(input_rdd)\n",
        "    \n",
        "        # Compute prediction\n",
        "        prediction = trainedModel.transform(input_df)\n",
        "        #result = prediction.first().prediction\n",
        "        predictions = prediction.collect()\n",
        " \n",
        "        #Get each scored result\n",
        "        preds = [str(x['prediction']) for x in predictions]\n",
        "        result = \",\".join(preds)\n",
        "        # you can return any data type as long as it is JSON-serializable\n",
        "        return result.tolist()\n",
        "    except Exception as e:\n",
        "        result = str(e)\n",
        "        return result\n",
        "    \n",
        "\"\"\".format(model_name=model_name)\n",
        " \n",
        "exec(score_sparkml)\n",
        " \n",
        "with open(\"score_sparkml.py\", \"w\") as file:\n",
        "    file.write(score_sparkml)"
      ]
    },
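    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The scoring script above is built with `str.format`, so literal braces inside the script (such as the dictionary returned from `run`) are doubled, while `{model_name}` is substituted. A minimal sketch of that pattern, assuming a hypothetical template and the illustrative name `my_model` (neither is part of this notebook):\n",
        "\n",
        "```python\n",
        "# Hypothetical template: {{ and }} become literal braces, {model_name} is filled in.\n",
        "template = '''\n",
        "def describe(value):\n",
        "    return {{'model': '{model_name}', 'value': value}}\n",
        "'''\n",
        "\n",
        "script = template.format(model_name='my_model')  # 'my_model' is an illustrative name\n",
        "\n",
        "namespace = {}\n",
        "exec(script, namespace)  # defines describe() inside namespace\n",
        "print(namespace['describe'](1))\n",
        "```\n",
        "\n",
        "Keeping the script in a string lets the same text be checked with `exec` and written to a file for deployment."
      ]
    },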
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.conda_dependencies import CondaDependencies \n",
        "\n",
        "myacienv = CondaDependencies.create(conda_packages=['scikit-learn','numpy','pandas']) #showing how to add libs as an eg. - not needed for this model.\n",
        "\n",
        "with open(\"mydeployenv.yml\",\"w\") as f:\n",
        "    f.write(myacienv.serialize_to_string())"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "#create AKS compute\n",
        "#it may take 20-25 minutes to create a new cluster\n",
        "\n",
        "from azureml.core.compute import AksCompute, ComputeTarget\n",
        "from azureml.core.compute_target import ComputeTargetException\n",
        "\n",
        "aks_name = 'ps-aks-demo2' \n",
        "\n",
        "try:\n",
        "    aks_target = ComputeTarget(workspace=ws, name=aks_name)\n",
        "    print('Found existing cluster, use it.')\n",
        "except ComputeTargetException:\n",
        "    # Use the default configuration (can also provide parameters to customize)\n",
        "    prov_config = AksCompute.provisioning_configuration()\n",
        "    \n",
        "    # Create the cluster\n",
        "    aks_target = ComputeTarget.create(workspace = ws, \n",
        "                                  name = aks_name, \n",
        "                                  provisioning_configuration = prov_config)\n",
        "\n",
        "aks_target.wait_for_completion(show_output = True)\n",
        "\n",
        "print(aks_target.provisioning_state)\n",
        "print(aks_target.provisioning_errors)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "#deploy to AKS\n",
        "from azureml.core.webservice import AksWebservice, Webservice\n",
        "from azureml.exceptions import WebserviceException\n",
        "from azureml.core.model import InferenceConfig\n",
        "\n",
        "aks_config = AksWebservice.deploy_configuration(enable_app_insights=True)\n",
        "\n",
        "service_name = 'ps-aks-service'\n",
        "\n",
        "# Remove any existing service under the same name.\n",
        "try:\n",
        "    Webservice(ws, service_name).delete()\n",
        "except WebserviceException:\n",
        "    pass\n",
        "\n",
        "inference_config = InferenceConfig(runtime = 'spark-py', \n",
        "                                   entry_script ='score_sparkml.py',\n",
        "                                   conda_file ='mydeployenv.yml')\n",
        "\n",
        "aks_service = Model.deploy(ws, service_name, [mymodel], inference_config, aks_config, aks_target)\n",
        "aks_service.wait_for_deployment(show_output=True)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "aks_service.deployment_status"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "#for using the Web HTTP API \n",
        "print(aks_service.scoring_uri)\n",
        "print(aks_service.get_keys())"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import json\n",
        "\n",
        "#get the some sample data\n",
        "test_data_path = \"AdultCensusIncomeTest\"\n",
        "test = spark.read.parquet(test_data_path).limit(5)\n",
        "\n",
        "test_json = json.dumps(test.toJSON().collect())\n",
        "\n",
        "print(test_json)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "#using data defined above predict if income is >50K (1) or <=50K (0)\n",
        "aks_service.run(input_data=test_json)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "#comment to not delete the web service\n",
        "aks_service.delete()\n",
        "#model.delete()\n",
        "aks_target.delete() "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aks-existingimage-05.png)"
      ]
    }
  ],
  "metadata": {
    "authors": [
      {
        "name": "pasha"
      }
    ],
    "kernelspec": {
      "display_name": "Python 3.6",
      "language": "python",
      "name": "python36"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.8"
    },
    "name": "deploy-to-aks-existingimage-05",
    "notebookId": 1030695628045968
  },
  "nbformat": 4,
  "nbformat_minor": 1
 }
--- a/how-to-use-azureml/deployment/deploy-to-cloud/model-register-and-deploy.ipynb
+++ b/how-to-use-azureml/deployment/deploy-to-cloud/model-register-and-deploy.ipynb
@@ -20,7 +20,7 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "# Register model and deploy as webservice\n",
+        "# Register model and deploy as webservice in ACI\n",
        "\n",
        "Following this notebook, you will:\n",
        "\n",
@@ -45,6 +45,7 @@
      "source": [
        "import azureml.core\n",
        "\n",
        "\n",
        "# Check core SDK version number.\n",
        "print('SDK version:', azureml.core.VERSION)"
      ]
@@ -70,6 +71,7 @@
      "source": [
        "from azureml.core import Workspace\n",
        "\n",
        "\n",
        "ws = Workspace.from_config()\n",
        "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\\n')"
      ]
@@ -91,6 +93,7 @@
      "source": [
        "from azureml.core import Dataset\n",
        "\n",
        "\n",
        "datastore = ws.get_default_datastore()\n",
        "datastore.upload_files(files=['./features.csv', './labels.csv'],\n",
        "                       target_path='sklearn_regression/',\n",
@@ -125,6 +128,7 @@
        "from azureml.core import Model\n",
        "from azureml.core.resource_configuration import ResourceConfiguration\n",
        "\n",
        "\n",
        "model = Model.register(workspace=ws,\n",
        "                       model_name='my-sklearn-model',                # Name of the registered model in your workspace.\n",
        "                       model_path='./sklearn_regression_model.pkl',  # Local file to upload and register as a model.\n",
@@ -159,6 +163,8 @@
        "\n",
        "The Azure Machine Learning service provides a default environment for supported model frameworks, including scikit-learn, based on the metadata you provided when registering your model. This is the easiest way to deploy your model.\n",
        "\n",
        "Even when you deploy your model to ACI with a default environment you can still customize the deploy configuration (i.e. the number of cores and amount of memory made available for the deployment) using the [AciWebservice.deploy_configuration()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.webservice.aci.aciwebservice#deploy-configuration-cpu-cores-none--memory-gb-none--tags-none--properties-none--description-none--location-none--auth-enabled-none--ssl-enabled-none--enable-app-insights-none--ssl-cert-pem-file-none--ssl-key-pem-file-none--ssl-cname-none--dns-name-label-none--). Look at the \"Use a custom environment\" section of this notebook for more information on deploy configuration.\n",
        "\n",
        "**Note**: This step can take several minutes."
      ]
    },
@@ -171,6 +177,7 @@
        "from azureml.core import Webservice\n",
        "from azureml.exceptions import WebserviceException\n",
        "\n",
        "\n",
        "service_name = 'my-sklearn-service'\n",
        "\n",
        "# Remove any existing service under the same name.\n",
@@ -198,6 +205,7 @@
      "source": [
        "import json\n",
        "\n",
        "\n",
        "input_payload = json.dumps({\n",
        "    'data': [\n",
        "        [ 0.03807591,  0.05068012,  0.06169621, 0.02187235, -0.0442235,\n",
@@ -231,9 +239,9 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "### Use a custom environment (for all models)\n",
+        "### Use a custom environment\n",
        "\n",
-        "If you want more control over how your model is run, if it uses another framework, or if it has special runtime requirements, you can instead specify your own environment and scoring method.\n",
+        "If you want more control over how your model is run, if it uses another framework, or if it has special runtime requirements, you can instead specify your own environment and scoring method. Custom environments can be used for any model you want to deploy.\n",
        "\n",
        "Specify the model's runtime environment by creating an [Environment](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.environment%28class%29?view=azure-ml-py) object and providing the [CondaDependencies](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.conda_dependencies.condadependencies?view=azure-ml-py) needed by your model."
      ]
@@ -247,6 +255,7 @@
        "from azureml.core import Environment\n",
        "from azureml.core.conda_dependencies import CondaDependencies\n",
        "\n",
        "\n",
        "environment = Environment('my-sklearn-environment')\n",
        "environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[\n",
        "    'azureml-defaults',\n",
@@ -278,7 +287,7 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "Deploy your model in the custom environment by providing an [InferenceConfig](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.inferenceconfig?view=azure-ml-py) object to [Model.deploy()](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#deploy-workspace--name--models--inference-config--deployment-config-none--deployment-target-none-).\n",
+        "Deploy your model in the custom environment by providing an [InferenceConfig](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.inferenceconfig?view=azure-ml-py) object to [Model.deploy()](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#deploy-workspace--name--models--inference-config--deployment-config-none--deployment-target-none-). In this case we are also using the [AciWebservice.deploy_configuration()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.webservice.aci.aciwebservice#deploy-configuration-cpu-cores-none--memory-gb-none--tags-none--properties-none--description-none--location-none--auth-enabled-none--ssl-enabled-none--enable-app-insights-none--ssl-cert-pem-file-none--ssl-key-pem-file-none--ssl-cname-none--dns-name-label-none--) method to generate a custom deploy configuration.\n",
        "\n",
        "**Note**: This step can take several minutes."
      ]
@@ -288,15 +297,18 @@
      "execution_count": null,
      "metadata": {
        "tags": [
-          "azuremlexception-remarks-sample"
+          "azuremlexception-remarks-sample",
          "sample-aciwebservice-deploy-config"
        ]
      },
      "outputs": [],
      "source": [
        "from azureml.core import Webservice\n",
        "from azureml.core.model import InferenceConfig\n",
        "from azureml.core.webservice import AciWebservice\n",
        "from azureml.exceptions import WebserviceException\n",
        "\n",
        "\n",
        "service_name = 'my-custom-env-service'\n",
        "\n",
        "# Remove any existing service under the same name.\n",
@@ -305,11 +317,14 @@
        "except WebserviceException:\n",
        "    pass\n",
        "\n",
-        "inference_config = InferenceConfig(entry_script='score.py',\n",
+        "inference_config = InferenceConfig(entry_script='score.py', environment=environment)\n",
-        "                                   source_directory='.',\n",
+        "aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)\n",
        "                                   environment=environment)\n",
        "\n",
-        "service = Model.deploy(ws, service_name, [model], inference_config)\n",
+        "service = Model.deploy(workspace=ws,\n",
        "                       name=service_name,\n",
        "                       models=[model],\n",
        "                       inference_config=inference_config,\n",
        "                       deployment_config=aci_config)\n",
        "service.wait_for_deployment(show_output=True)"
      ]
    },
@@ -328,6 +343,7 @@
      "source": [
        "import json\n",
        "\n",
        "\n",
        "input_payload = json.dumps({\n",
        "    'data': [\n",
        "        [ 0.03807591,  0.05068012,  0.06169621, 0.02187235, -0.0442235,\n",
--- a/how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb
+++ b/how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb
@@ -318,7 +318,11 @@
    {
      "cell_type": "code",
      "execution_count": null,
-      "metadata": {},
+      "metadata": {
        "tags": [
          "sample-deploy-to-aks"
        ]
      },
      "outputs": [],
      "source": [
        "# Set the web service configuration (using default here)\n",
@@ -331,7 +335,11 @@
    {
      "cell_type": "code",
      "execution_count": null,
-      "metadata": {},
+      "metadata": {
        "tags": [
          "sample-deploy-to-aks"
        ]
      },
      "outputs": [],
      "source": [
        "%%time\n",
--- a/how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb
+++ b/how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb
@@ -1,457 +0,0 @@
 {
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Copyright (c) Microsoft Corporation. All rights reserved.\n",
        "\n",
        "Licensed under the MIT License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.png)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Register Model, Create Image and Deploy Service\n",
        "\n",
        "This example shows how to deploy a web service in step-by-step fashion:\n",
        "\n",
        " 1. Register model\n",
        " 2. Query versions of models and select one to deploy\n",
        " 3. Create Docker image\n",
        " 4. Query versions of images\n",
        " 5. Deploy the image as web service\n",
        " \n",
        "**IMPORTANT**:\n",
        " * This notebook requires you to first complete [train-within-notebook](../../training/train-within-notebook/train-within-notebook.ipynb) example\n",
        " \n",
        "The train-within-notebook example taught you how to deploy a web service directly from model in one step. This Notebook shows a more advanced approach that gives you more control over model versions and Docker image versions.  "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Prerequisites\n",
        "If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the [configuration](../../../configuration.ipynb) Notebook first if you haven't."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Check core SDK version number\n",
        "import azureml.core\n",
        "\n",
        "print(\"SDK version:\", azureml.core.VERSION)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Initialize Workspace\n",
        "\n",
        "Initialize a workspace object from persisted configuration."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "create workspace"
        ]
      },
      "outputs": [],
      "source": [
        "from azureml.core import Workspace\n",
        "\n",
        "ws = Workspace.from_config()\n",
        "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Register Model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You can add tags and descriptions to your models. Note you need to have a `sklearn_linreg_model.pkl` file in the current directory. This file is generated by the 01 notebook. The below call registers that file as a model with the same name `sklearn_linreg_model.pkl` in the workspace.\n",
        "\n",
        "Using tags, you can track useful information such as the name and version of the machine learning library used to train the model. Note that tags must be alphanumeric."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "register model from file"
        ]
      },
      "outputs": [],
      "source": [
        "from azureml.core.model import Model\n",
        "import sklearn\n",
        "\n",
        "library_version = \"sklearn\"+sklearn.__version__.replace(\".\",\"x\")\n",
        "\n",
        "model = Model.register(model_path = \"sklearn_regression_model.pkl\",\n",
        "                       model_name = \"sklearn_regression_model.pkl\",\n",
        "                       tags = {'area': \"diabetes\", 'type': \"regression\", 'version': library_version},\n",
        "                       description = \"Ridge regression model to predict diabetes\",\n",
        "                       workspace = ws)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You can explore the registered models within your workspace and query by tag. Models are versioned. If you call the register_model command many times with same model name, you will get multiple versions of the model with increasing version numbers."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "register model from file"
        ]
      },
      "outputs": [],
      "source": [
        "regression_models = Model.list(workspace=ws, tags=['area'])\n",
        "for m in regression_models:\n",
        "    print(\"Name:\", m.name,\"\\tVersion:\", m.version, \"\\tDescription:\", m.description, m.tags)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You can pick a specific model to deploy"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "print(model.name, model.description, model.version, sep = '\\t')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Create Docker Image"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Show `score.py`. Note that the `sklearn_regression_model.pkl` in the `get_model_path` call is referring to a model named `sklearn_linreg_model.pkl` registered under the workspace. It is NOT referenceing the local file."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "%%writefile score.py\n",
        "import os\n",
        "import pickle\n",
        "import json\n",
        "import numpy\n",
        "from sklearn.externals import joblib\n",
        "from sklearn.linear_model import Ridge\n",
        "\n",
        "def init():\n",
        "    global model\n",
        "    # AZUREML_MODEL_DIR is an environment variable created during deployment.\n",
        "    # It is the path to the model folder (./azureml-models/$MODEL_NAME/$VERSION)\n",
        "    # For multiple models, it points to the folder containing all deployed models (./azureml-models)\n",
        "    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'sklearn_regression_model.pkl')\n",
        "    # deserialize the model file back into a sklearn model\n",
        "    model = joblib.load(model_path)\n",
        "\n",
        "# note you can pass in multiple rows for scoring\n",
        "def run(raw_data):\n",
        "    try:\n",
        "        data = json.loads(raw_data)['data']\n",
        "        data = numpy.array(data)\n",
        "        result = model.predict(data)\n",
        "        # you can return any datatype as long as it is JSON-serializable\n",
        "        return result.tolist()\n",
        "    except Exception as e:\n",
        "        error = str(e)\n",
        "        return error"
      ]
    },
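    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The `run` function in `score.py` takes a JSON string with a `data` field and returns a JSON-serializable list, or the error message as a string. A minimal local sketch of that contract, assuming a hypothetical `StubModel` in place of the deserialized scikit-learn model (the stub and its row-sum prediction are illustrative only):\n",
        "\n",
        "```python\n",
        "import json\n",
        "\n",
        "# Hypothetical stand-in for the model loaded in init(); not part of score.py.\n",
        "class StubModel:\n",
        "    def predict(self, rows):\n",
        "        # Return one number per input row (here, the row sum).\n",
        "        return [float(sum(row)) for row in rows]\n",
        "\n",
        "model = StubModel()\n",
        "\n",
        "def run(raw_data):\n",
        "    # Same shape as score.py: parse the 'data' field, predict, return a list.\n",
        "    try:\n",
        "        data = json.loads(raw_data)['data']\n",
        "        return list(model.predict(data))\n",
        "    except Exception as e:\n",
        "        return str(e)\n",
        "\n",
        "payload = json.dumps({'data': [[1, 2, 3], [4, 5, 6]]})\n",
        "print(run(payload))           # one prediction per input row\n",
        "print(run('not valid json'))  # the error message comes back as a string\n",
        "```\n",
        "\n",
        "Checking the input and output shapes this way can surface payload mistakes before an image is built."
      ]
    },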
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.conda_dependencies import CondaDependencies \n",
        "\n",
        "myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn'])\n",
        "\n",
        "with open(\"myenv.yml\",\"w\") as f:\n",
        "    f.write(myenv.serialize_to_string())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Note that following command can take few minutes. \n",
        "\n",
        "You can add tags and descriptions to images. Also, an image can contain multiple models."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "create image",
          "sample-image-create"
        ]
      },
      "outputs": [],
      "source": [
        "from azureml.core.image import Image, ContainerImage\n",
        "\n",
        "image_config = ContainerImage.image_configuration(runtime= \"python\",\n",
        "                                 execution_script=\"score.py\",\n",
        "                                 conda_file=\"myenv.yml\",\n",
        "                                 tags = {'area': \"diabetes\", 'type': \"regression\"},\n",
        "                                 description = \"Image with ridge regression model\")\n",
        "\n",
        "image = Image.create(name = \"myimage1\",\n",
        "                     # this is the model object. note you can pass in 0-n models via this list-type parameter\n",
        "                     # in case you need to reference multiple models, or none at all, in your scoring script.\n",
        "                     models = [model],\n",
        "                     image_config = image_config, \n",
        "                     workspace = ws)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "create image"
        ]
      },
      "outputs": [],
      "source": [
        "image.wait_for_creation(show_output = True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### Use a custom Docker image\n",
        "\n",
        "You can also specify a custom Docker image to be used as base image if you don't want to use the default base image provided by Azure ML. Please make sure the custom Docker image has Ubuntu >= 16.04, Conda >= 4.5.\\* and Python(3.5.\\* or 3.6.\\*).\n",
        "\n",
        "Only Supported for `ContainerImage`(from azureml.core.image) with `python` runtime.\n",
        "```python\n",
        "# use an image available in public Container Registry without authentication\n",
        "image_config.base_image = \"mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda\"\n",
        "\n",
        "# or, use an image available in a private Container Registry\n",
        "image_config.base_image = \"myregistry.azurecr.io/mycustomimage:1.0\"\n",
        "image_config.base_image_registry.address = \"myregistry.azurecr.io\"\n",
        "image_config.base_image_registry.username = \"username\"\n",
        "image_config.base_image_registry.password = \"password\"\n",
        "\n",
        "# or, use an image built during training.\n",
        "image_config.base_image = run.properties[\"AzureML.DerivedImageName\"]\n",
        "```\n",
        "You can get the address of training image from the properties of a Run object. Only new runs submitted with azureml-sdk>=1.0.22 to AMLCompute targets will have the 'AzureML.DerivedImageName' property. Instructions on how to get a Run can be found in [manage-runs](../../training/manage-runs/manage-runs.ipynb). \n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "List images by tag and find out the detailed build log for debugging."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "create image"
        ]
      },
      "outputs": [],
      "source": [
        "for i in Image.list(workspace = ws,tags = [\"area\"]):\n",
        "    print('{}(v.{} [{}]) stored at {} with build log {}'.format(i.name, i.version, i.creation_state, i.image_location, i.image_build_log_uri))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Deploy image as web service on Azure Container Instance\n",
        "\n",
        "Note that the service creation can take few minutes."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "deploy service",
          "aci",
          "sample-aciwebservice-deploy-config"
        ]
      },
      "outputs": [],
      "source": [
        "from azureml.core.webservice import AciWebservice\n",
        "\n",
        "aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1, \n",
        "                                               memory_gb = 1, \n",
        "                                               tags = {'area': \"diabetes\", 'type': \"regression\"}, \n",
        "                                               description = 'Predict diabetes using regression model')"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "deploy service",
          "aci",
          "sample-aciwebservice-deploy-from-image"
        ]
      },
      "outputs": [],
      "source": [
        "from azureml.core.webservice import Webservice\n",
        "\n",
        "aci_service_name = 'my-aci-service-2'\n",
        "print(aci_service_name)\n",
        "aci_service = Webservice.deploy_from_image(deployment_config = aciconfig,\n",
        "                                           image = image,\n",
        "                                           name = aci_service_name,\n",
        "                                           workspace = ws)\n",
        "aci_service.wait_for_deployment(True)\n",
        "print(aci_service.state)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Test web service"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Call the web service with some dummy input data to get a prediction."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "deploy service",
          "aci"
        ]
      },
      "outputs": [],
      "source": [
        "import json\n",
        "\n",
        "test_sample = json.dumps({'data': [\n",
        "    [1,2,3,4,5,6,7,8,9,10], \n",
        "    [10,9,8,7,6,5,4,3,2,1]\n",
        "]})\n",
        "test_sample = bytes(test_sample,encoding = 'utf8')\n",
        "\n",
        "prediction = aci_service.run(input_data=test_sample)\n",
        "print(prediction)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Delete ACI to clean up"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "deploy service",
          "aci"
        ]
      },
      "outputs": [],
      "source": [
        "aci_service.delete()"
      ]
    }
  ],
  "metadata": {
    "authors": [
      {
        "name": "aashishb"
      }
    ],
    "kernelspec": {
      "display_name": "Python 3.6",
      "language": "python",
      "name": "python36"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.6"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
 }
--- a/how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.yml
+++ b/how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.yml
@@ -1,8 +0,0 @@
 name: register-model-create-image-deploy-service
 dependencies:
 - pip:
  - azureml-sdk
  - matplotlib
  - tqdm
  - scipy
  - sklearn
--- a/how-to-use-azureml/deployment/register-model-create-image-deploy-service/sklearn_regression_model.pkl
+++ b/how-to-use-azureml/deployment/register-model-create-image-deploy-service/sklearn_regression_model.pkl
--- a/how-to-use-azureml/deployment/spark/iris.model/data/_SUCCESS
+++ b/how-to-use-azureml/deployment/spark/iris.model/data/_SUCCESS
--- a/how-to-use-azureml/deployment/spark/iris.model/data/part-00000-dabcf097-2b45-4b28-bbca-6c17889ddcbf-c000.snappy.parquet
+++ b/how-to-use-azureml/deployment/spark/iris.model/data/part-00000-dabcf097-2b45-4b28-bbca-6c17889ddcbf-c000.snappy.parquet
--- a/how-to-use-azureml/deployment/spark/iris.model/metadata/_SUCCESS
+++ b/how-to-use-azureml/deployment/spark/iris.model/metadata/_SUCCESS
--- a/how-to-use-azureml/deployment/spark/iris.model/metadata/part-00000
+++ b/how-to-use-azureml/deployment/spark/iris.model/metadata/part-00000
@@ -0,0 +1 @@
 {"class":"org.apache.spark.ml.classification.LogisticRegressionModel","timestamp":1570147252329,"sparkVersion":"2.4.0","uid":"LogisticRegression_5df3978caaf3","paramMap":{"regParam":0.01},"defaultParamMap":{"aggregationDepth":2,"threshold":0.5,"rawPredictionCol":"rawPrediction","featuresCol":"features","labelCol":"label","predictionCol":"prediction","family":"auto","regParam":0.0,"tol":1.0E-6,"probabilityCol":"probability","standardization":true,"elasticNetParam":0.0,"maxIter":100,"fitIntercept":true}}
--- a/how-to-use-azureml/deployment/spark/model-register-and-deploy-spark.ipynb
+++ b/how-to-use-azureml/deployment/spark/model-register-and-deploy-spark.ipynb
@@ -0,0 +1,343 @@
 {
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Copyright (c) Microsoft Corporation. All rights reserved.\n",
        "\n",
        "Licensed under the MIT License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/deployment/deploy-to-cloud/model-register-and-deploy.png)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Register Spark Model and deploy as Webservice\n",
        "\n",
        "This example shows how to deploy a Webservice in step-by-step fashion:\n",
        "\n",
        " 1. Register Spark Model\n",
        " 2. Deploy Spark Model as Webservice"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Prerequisites\n",
        "If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the [configuration](../../../configuration.ipynb) Notebook first if you haven't."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Check core SDK version number\n",
        "import azureml.core\n",
        "\n",
        "print(\"SDK version:\", azureml.core.VERSION)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Initialize Workspace\n",
        "\n",
        "Initialize a workspace object from persisted configuration."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "create workspace"
        ]
      },
      "outputs": [],
      "source": [
        "from azureml.core import Workspace\n",
        "\n",
        "ws = Workspace.from_config()\n",
        "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\\n')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Register Model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You can add tags and descriptions to your Models. Note you need to have a `iris.model` file in the current directory. This model file is generated using [train in spark](../training/train-in-spark/train-in-spark.ipynb) notebook. The below call registers that file as a Model with the same name `iris.model` in the workspace.\n",
        "\n",
        "Using tags, you can track useful information such as the name and version of the machine learning library used to train the model. Note that tags must be alphanumeric."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "register model from file"
        ]
      },
      "outputs": [],
      "source": [
        "from azureml.core.model import Model\n",
        "\n",
        "model = Model.register(model_path=\"iris.model\",\n",
        "                       model_name=\"iris.model\",\n",
        "                       tags={'type': \"regression\"},\n",
        "                       description=\"Logistic regression model to predict iris species\",\n",
        "                       workspace=ws)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Fetch Environment"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You can now create and/or use an Environment object when deploying a Webservice. The Environment can have been previously registered with your Workspace, or it will be registered with it as a part of the Webservice deployment.\n",
        "\n",
        "In this notebook, we will be using 'AzureML-PySpark-MmlSpark-0.15', a curated environment.\n",
        "\n",
        "More information can be found in our [using environments notebook](../training/using-environments/using-environments.ipynb)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core import Environment\n",
        "\n",
        "env = Environment.get(ws, name='AzureML-PySpark-MmlSpark-0.15')\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Create Inference Configuration\n",
        "\n",
        "There is now support for a source directory, you can upload an entire folder from your local machine as dependencies for the Webservice.\n",
        "Note: in that case, your entry_script is relative path to the source_directory path.\n",
        "\n",
        "Sample code for using a source directory:\n",
        "\n",
        "```python\n",
        "inference_config = InferenceConfig(source_directory=\"C:/abc\",\n",
        "                                   entry_script=\"x/y/score.py\",\n",
        "                                   environment=environment)\n",
        "```\n",
        "\n",
        " - source_directory = holds source path as string, this entire folder gets added in image so its really easy to access any files within this folder or subfolder\n",
        " - entry_script = contains logic specific to initializing your model and running predictions\n",
        " - environment = An environment object to use for the deployment. Doesn't have to be registered"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "create image"
        ]
      },
      "outputs": [],
      "source": [
        "from azureml.core.model import InferenceConfig\n",
        "\n",
        "inference_config = InferenceConfig(entry_script=\"score.py\", environment=env)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Deploy Model as Webservice on Azure Container Instance\n",
        "\n",
        "Note that the service creation can take few minutes."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "azuremlexception-remarks-sample"
        ]
      },
      "outputs": [],
      "source": [
        "from azureml.core.webservice import AciWebservice, Webservice\n",
        "from azureml.exceptions import WebserviceException\n",
        "\n",
        "deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)\n",
        "aci_service_name = 'aciservice1'\n",
        "\n",
        "try:\n",
        "    # if you want to get existing service below is the command\n",
        "    # since aci name needs to be unique in subscription deleting existing aci if any\n",
        "    # we use aci_service_name to create azure aci\n",
        "    service = Webservice(ws, name=aci_service_name)\n",
        "    if service:\n",
        "        service.delete()\n",
        "except WebserviceException as e:\n",
        "    print()\n",
        "\n",
        "service = Model.deploy(ws, aci_service_name, [model], inference_config, deployment_config)\n",
        "\n",
        "service.wait_for_deployment(True)\n",
        "print(service.state)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### Test web service"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import json\n",
        "test_sample = json.dumps({'features':{'type':1,'values':[4.3,3.0,1.1,0.1]},'label':2.0})\n",
        "\n",
        "test_sample_encoded = bytes(test_sample, encoding='utf8')\n",
        "prediction = service.run(input_data=test_sample_encoded)\n",
        "print(prediction)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### Delete ACI to clean up"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "deploy service",
          "aci"
        ]
      },
      "outputs": [],
      "source": [
        "service.delete()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Model Profiling\n",
        "\n",
        "You can also take advantage of the profiling feature to estimate CPU and memory requirements for models.\n",
        "\n",
        "```python\n",
        "profile = Model.profile(ws, \"profilename\", [model], inference_config, test_sample)\n",
        "profile.wait_for_profiling(True)\n",
        "profiling_results = profile.get_results()\n",
        "print(profiling_results)\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Model Packaging\n",
        "\n",
        "If you want to build a Docker image that encapsulates your model and its dependencies, you can use the model packaging option. The output image will be pushed to your workspace's ACR.\n",
        "\n",
        "You must include an Environment object in your inference configuration to use `Model.package()`.\n",
        "\n",
        "```python\n",
        "package = Model.package(ws, [model], inference_config)\n",
        "package.wait_for_creation(show_output=True)  # Or show_output=False to hide the Docker build logs.\n",
        "package.pull()\n",
        "```\n",
        "\n",
        "Instead of a fully-built image, you can also generate a Dockerfile and download all the assets needed to build an image on top of your Environment.\n",
        "\n",
        "```python\n",
        "package = Model.package(ws, [model], inference_config, generate_dockerfile=True)\n",
        "package.wait_for_creation(show_output=True)\n",
        "package.save(\"./local_context_dir\")\n",
        "```"
      ]
    }
  ],
  "metadata": {
    "authors": [
      {
        "name": "aashishb"
      }
    ],
    "category": "deployment",
    "compute": [
      "None"
    ],
    "datasets": [
      "Iris"
    ],
    "deployment": [
      "Azure Container Instance"
    ],
    "exclude_from_index": false,
    "framework": [
      "PySpark"
    ],
    "friendly_name": "Register Spark model and deploy as webservice",
    "kernelspec": {
      "display_name": "Python 3.6",
      "language": "python",
      "name": "python36"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.2"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
 }
--- a/how-to-use-azureml/deployment/spark/model-register-and-deploy-spark.yml
+++ b/how-to-use-azureml/deployment/spark/model-register-and-deploy-spark.yml
@@ -0,0 +1,4 @@
 name: model-register-and-deploy-spark
 dependencies:
 - pip:
  - azureml-sdk
--- a/how-to-use-azureml/deployment/spark/score.py
+++ b/how-to-use-azureml/deployment/spark/score.py
@@ -0,0 +1,37 @@
 import traceback
 from pyspark.ml.linalg import VectorUDT
 from azureml.core.model import Model
 from pyspark.ml.classification import LogisticRegressionModel
 from pyspark.sql.types import StructType, StructField
 from pyspark.sql.types import DoubleType
 from pyspark.sql import SQLContext
 from pyspark import SparkContext
 sc = SparkContext.getOrCreate()
 sqlContext = SQLContext(sc)
 spark = sqlContext.sparkSession
 input_schema = StructType([StructField("features", VectorUDT()), StructField("label", DoubleType())])
 reader = spark.read
 reader.schema(input_schema)
 def init():
    global model
    # note here "iris.model" is the name of the model registered under the workspace
    # this call should return the path to the registered Spark model directory on the local disk.
    model_path = Model.get_model_path('iris.model')
    # Load the model file back into a LogisticRegression model
    model = LogisticRegressionModel.load(model_path)
 def run(data):
    try:
        input_df = reader.json(sc.parallelize([data]))
        result = model.transform(input_df)
        # you can return any datatype as long as it is JSON-serializable
        return result.collect()[0]['prediction']
    except Exception as e:
        traceback.print_exc()
        error = str(e)
        return error
--- a/how-to-use-azureml/machine-learning-pipelines/README.md
+++ b/how-to-use-azureml/machine-learning-pipelines/README.md
@@ -43,5 +43,7 @@ In this directory, there are two types of notebooks:
 1. [pipeline-batch-scoring.ipynb](https://aka.ms/pl-batch-score): This notebook demonstrates how to run a batch scoring job using Azure Machine Learning pipelines.
 2. [pipeline-style-transfer.ipynb](https://aka.ms/pl-style-trans): This notebook demonstrates a multi-step pipeline that uses GPU compute. This sample also showcases how to use conda dependencies using runconfig when using Pipelines.
 3. [nyc-taxi-data-regression-model-building.ipynb](https://aka.ms/pl-nyctaxi-tutorial): This notebook is an AzureML Pipelines version of the previously published two part sample.
 4. [file-dataset-image-inference-mnist.ipynb](https://aka.ms/pl-pr-filedata): This notebook demonstrates how to use ParallelRunStep to process unstructured data (file dataset).
 5. [tabular-dataset-inference-iris.ipynb](https://aka.ms/pl-pr-tabulardata): This notebook demonstrates how to use ParallelRunStep to process structured data (tabular dataset).
 ![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/README.png)
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-parameter-tuning-with-hyperdrive.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-parameter-tuning-with-hyperdrive.ipynb
@@ -620,14 +620,13 @@
      "outputs": [],
      "source": [
        "%%time\n",
-        "from azureml.core.model import InferenceConfig\n",
+        "from azureml.core.environment import Environment\n",
        "from azureml.core.model import Model, InferenceConfig\n",
        "from azureml.core.webservice import AciWebservice\n",
        "from azureml.core.webservice import Webservice\n",
        "from azureml.core.model import Model\n",
        "\n",
-        "inference_config = InferenceConfig(runtime = \"python\", \n",
+        "\n",
-        "                                   entry_script = \"score.py\",\n",
+        "myenv = Environment.from_conda_specification(name=\"env\", file_path=\"myenv.yml\")\n",
-        "                                   conda_file = \"myenv.yml\")\n",
+        "inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)\n",
        "\n",
        "aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, \n",
        "                                               memory_gb=1, \n",
--- a/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-with-notebook-runner-step.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-with-notebook-runner-step.ipynb
@@ -326,7 +326,7 @@
        "\n",
        "Once we have the steps (or steps collection), we can build the [pipeline](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.pipeline.pipeline?view=azure-ml-py). By deafult, all these steps will run in **parallel** once we submit the pipeline for run.\n",
        "\n",
-        "A pipeline is created with a list of steps and a workspace. Submit a pipeline using [submit](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.experiment(class)?view=azure-ml-py#submit-config--tags-none----kwargs-). When submit is called, a [PipelineRun](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.pipelinerun?view=azure-ml-py) is created which in turn creates [StepRun](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.steprun?view=azure-ml-py) objects for each step in the workflow."
+        "A pipeline is created with a list of steps and a workspace. Submit a pipeline using [submit](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.pipeline.pipeline?view=azure-ml-py#submit-experiment-name--pipeline-parameters-none--continue-on-step-failure-false--regenerate-outputs-false--parent-run-id-none----kwargs-). When submit is called, a [PipelineRun](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.pipelinerun?view=azure-ml-py) is created which in turn creates [StepRun](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.steprun?view=azure-ml-py) objects for each step in the workflow."
      ]
    },
    {
@@ -336,9 +336,7 @@
      "outputs": [],
      "source": [
        "pipeline1 = Pipeline(workspace=ws, steps=[notebook_runner_step])\n",
-        "\n",
+        "print(\"Pipeline creation complete\")"
        "pipeline1.validate()\n",
        "print(\"Pipeline validation complete\")"
      ]
    },
    {
@@ -375,8 +373,7 @@
      "outputs": [],
      "source": [
        "pipeline_run1.wait_for_completion()\n",
-        " Retrieve the step runs by name `train.py`\n",
+        "train_step = pipeline_run1.find_step_run('training_notebook_step') # Retrieve the step runs by name `train.py`\n",
        "train_step = pipeline_run1.find_step_run('training_notebook_step')\n",
        "\n",
        "if train_step:\n",
        "    train_step_obj = train_step[0] # since we have only one step by name `training_notebook_step`\n",
@@ -420,7 +417,7 @@
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
-      "version": "3.7.3"
+      "version": "3.6.7"
    },
    "order_index": 12,
    "star_tag": [
--- a/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/batch_scoring.py
+++ b/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/batch_scoring.py
@@ -1,119 +0,0 @@
 # Copyright (c) Microsoft. All rights reserved.
 # Licensed under the MIT license.
 import os
 import argparse
 import datetime
 import time
 import tensorflow as tf
 from math import ceil
 import numpy as np
 import shutil
 from tensorflow.contrib.slim.python.slim.nets import inception_v3
 from azureml.core.model import Model
 slim = tf.contrib.slim
 parser = argparse.ArgumentParser(description="Start a tensorflow model serving")
 parser.add_argument('--model_name', dest="model_name", required=True)
 parser.add_argument('--label_dir', dest="label_dir", required=True)
 parser.add_argument('--dataset_path', dest="dataset_path", required=True)
 parser.add_argument('--output_dir', dest="output_dir", required=True)
 parser.add_argument('--batch_size', dest="batch_size", type=int, required=True)
 args = parser.parse_args()
 image_size = 299
 num_channel = 3
 # create output directory if it does not exist
 os.makedirs(args.output_dir, exist_ok=True)
 def get_class_label_dict(label_file):
    label = []
    proto_as_ascii_lines = tf.gfile.GFile(label_file).readlines()
    for l in proto_as_ascii_lines:
        label.append(l.rstrip())
    return label
 class DataIterator:
    def __init__(self, data_dir):
        self.file_paths = []
        image_list = os.listdir(data_dir)
        # total_size = len(image_list)
        self.file_paths = [data_dir + '/' + file_name.rstrip() for file_name in image_list]
        self.labels = [1 for file_name in self.file_paths]
    @property
    def size(self):
        return len(self.labels)
    def input_pipeline(self, batch_size):
        images_tensor = tf.convert_to_tensor(self.file_paths, dtype=tf.string)
        labels_tensor = tf.convert_to_tensor(self.labels, dtype=tf.int64)
        input_queue = tf.train.slice_input_producer([images_tensor, labels_tensor], shuffle=False)
        labels = input_queue[1]
        images_content = tf.read_file(input_queue[0])
        image_reader = tf.image.decode_jpeg(images_content, channels=num_channel, name="jpeg_reader")
        float_caster = tf.cast(image_reader, tf.float32)
        new_size = tf.constant([image_size, image_size], dtype=tf.int32)
        images = tf.image.resize_images(float_caster, new_size)
        images = tf.divide(tf.subtract(images, [0]), [255])
        image_batch, label_batch = tf.train.batch([images, labels], batch_size=batch_size, capacity=5 * batch_size)
        return image_batch
 def main(_):
    # start_time = datetime.datetime.now()
    label_file_name = os.path.join(args.label_dir, "labels.txt")
    label_dict = get_class_label_dict(label_file_name)
    classes_num = len(label_dict)
    test_feeder = DataIterator(data_dir=args.dataset_path)
    total_size = len(test_feeder.labels)
    count = 0
    # get model from model registry
    model_path = Model.get_model_path(args.model_name)
    with tf.Session() as sess:
        test_images = test_feeder.input_pipeline(batch_size=args.batch_size)
        with slim.arg_scope(inception_v3.inception_v3_arg_scope()):
            input_images = tf.placeholder(tf.float32, [args.batch_size, image_size, image_size, num_channel])
            logits, _ = inception_v3.inception_v3(input_images,
                                                  num_classes=classes_num,
                                                  is_training=False)
            probabilities = tf.argmax(logits, 1)
        sess.run(tf.global_variables_initializer())
        sess.run(tf.local_variables_initializer())
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        saver = tf.train.Saver()
        saver.restore(sess, model_path)
        out_filename = os.path.join(args.output_dir, "result-labels.txt")
        with open(out_filename, "w") as result_file:
            i = 0
            while count < total_size and not coord.should_stop():
                test_images_batch = sess.run(test_images)
                file_names_batch = test_feeder.file_paths[i * args.batch_size:
                                                          min(test_feeder.size, (i + 1) * args.batch_size)]
                results = sess.run(probabilities, feed_dict={input_images: test_images_batch})
                new_add = min(args.batch_size, total_size - count)
                count += new_add
                i += 1
                for j in range(new_add):
                    result_file.write(os.path.basename(file_names_batch[j]) + ": " + label_dict[results[j]] + "\n")
                result_file.flush()
            coord.request_stop()
            coord.join(threads)
        # copy the file to artifacts
        shutil.copy(out_filename, "./outputs/")
        # Move the processed data out of the blob so that the next run can process the data.
 if __name__ == "__main__":
    tf.app.run()
--- a/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.ipynb
+++ b/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.ipynb
@@ -1,630 +0,0 @@
 {
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Copyright (c) Microsoft Corporation. All rights reserved.  \n",
        "Licensed under the MIT License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.png)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "**Note**: Azure Machine Learning recently released ParallelRunStep for public preview, this will allow for parallelization of your workload across many compute nodes without the difficulty of orchestrating worker pools and queues. See the [batch inference notebooks](../../../contrib/batch_inferencing/) for examples on how to get started."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Using Azure Machine Learning Pipelines for batch prediction\n",
        "\n",
        "In this notebook we will demonstrate how to run a batch scoring job using Azure Machine Learning pipelines. Our example job will be to take an already-trained image classification model, and run that model on some unlabeled images. The image classification model that we'll use is the __[Inception-V3 model](https://arxiv.org/abs/1512.00567)__  and we'll run this model on unlabeled images from the __[ImageNet](http://image-net.org/)__ dataset. \n",
        "\n",
        "The outline of this notebook is as follows:\n",
        "\n",
        "- Register the pretrained inception model into the model registry. \n",
        "- Store the dataset images in a blob container.\n",
        "- Use the registered model to do batch scoring on the images in the data blob container."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Prerequisites\n",
        "If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the configuration Notebook located at https://github.com/Azure/MachineLearningNotebooks first if you haven't. This sets you up with a working config file that has information on your workspace, subscription id, etc. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core import Experiment\n",
        "from azureml.core.compute import AmlCompute, ComputeTarget\n",
        "from azureml.core.datastore import Datastore\n",
        "from azureml.core.runconfig import CondaDependencies, RunConfiguration\n",
        "from azureml.data.data_reference import DataReference\n",
        "from azureml.pipeline.core import Pipeline, PipelineData\n",
        "from azureml.pipeline.steps import PythonScriptStep"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import os\n",
        "from azureml.core import Workspace\n",
        "\n",
        "ws = Workspace.from_config()\n",
        "print('Workspace name: ' + ws.name, \n",
        "      'Azure region: ' + ws.location, \n",
        "      'Subscription id: ' + ws.subscription_id, \n",
        "      'Resource group: ' + ws.resource_group, sep = '\\n')\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Set up machine learning resources"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Set up datastores\n",
        "First, let\u00e2\u20ac\u2122s access the datastore that has the model, labels, and images. \n",
        "\n",
        "### Create a datastore that points to a blob container containing sample images\n",
        "\n",
        "We have created a public blob container `sampledata` on an account named `pipelinedata`, containing images from the ImageNet evaluation set. In the next step, we create a datastore with the name `images_datastore`, which points to this container. In the call to `register_azure_blob_container` below, setting the `overwrite` flag to `True` overwrites any datastore that was created previously with that name. \n",
        "\n",
        "This step can be changed to point to your blob container by providing your own `datastore_name`, `container_name`, and `account_name`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "account_name = \"pipelinedata\"\n",
        "datastore_name=\"images_datastore\"\n",
        "container_name=\"sampledata\"\n",
        "\n",
        "batchscore_blob = Datastore.register_azure_blob_container(ws, \n",
        "                      datastore_name=datastore_name, \n",
        "                      container_name= container_name, \n",
        "                      account_name=account_name, \n",
        "                      overwrite=True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Next, let\u00e2\u20ac\u2122s specify the default datastore for the outputs."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "def_data_store = ws.get_default_datastore()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Configure data references\n",
        "Now you need to add references to the data, as inputs to the appropriate pipeline steps in your pipeline. A data source in a pipeline is represented by a DataReference object. The DataReference object points to data that lives in, or is accessible from, a datastore. We need DataReference objects corresponding to the following: the directory containing the input images, the directory in which the pretrained model is stored, the directory containing the labels, and the output directory."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "input_images = DataReference(datastore=batchscore_blob, \n",
        "                             data_reference_name=\"input_images\",\n",
        "                             path_on_datastore=\"batchscoring/images\",\n",
        "                             mode=\"download\"\n",
        "                            )\n",
        "model_dir = DataReference(datastore=batchscore_blob, \n",
        "                          data_reference_name=\"input_model\",\n",
        "                          path_on_datastore=\"batchscoring/models\",\n",
        "                          mode=\"download\"                          \n",
        "                         )\n",
        "label_dir = DataReference(datastore=batchscore_blob, \n",
        "                          data_reference_name=\"input_labels\",\n",
        "                          path_on_datastore=\"batchscoring/labels\",\n",
        "                          mode=\"download\"                          \n",
        "                         )\n",
        "output_dir = PipelineData(name=\"scores\", \n",
        "                          datastore=def_data_store, \n",
        "                          output_path_on_compute=\"batchscoring/results\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Create and attach Compute targets\n",
        "Use the below code to create and attach Compute targets. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# choose a name for your cluster\n",
        "aml_compute_name = os.environ.get(\"AML_COMPUTE_NAME\", \"gpu-cluster\")\n",
        "cluster_min_nodes = os.environ.get(\"AML_COMPUTE_MIN_NODES\", 0)\n",
        "cluster_max_nodes = os.environ.get(\"AML_COMPUTE_MAX_NODES\", 1)\n",
        "vm_size = os.environ.get(\"AML_COMPUTE_SKU\", \"STANDARD_NC6\")\n",
        "\n",
        "\n",
        "if aml_compute_name in ws.compute_targets:\n",
        "    compute_target = ws.compute_targets[aml_compute_name]\n",
        "    if compute_target and type(compute_target) is AmlCompute:\n",
        "        print('found compute target. just use it. ' + aml_compute_name)\n",
        "else:\n",
        "    print('creating a new compute target...')\n",
        "    provisioning_config = AmlCompute.provisioning_configuration(vm_size = vm_size, # NC6 is GPU-enabled\n",
        "                                                                vm_priority = 'lowpriority', # optional\n",
        "                                                                min_nodes = cluster_min_nodes, \n",
        "                                                                max_nodes = cluster_max_nodes)\n",
        "\n",
        "    # create the cluster\n",
        "    compute_target = ComputeTarget.create(ws, aml_compute_name, provisioning_config)\n",
        "    \n",
        "    # can poll for a minimum number of nodes and for a specific timeout. \n",
        "    # if no min node count is provided it will use the scale settings for the cluster\n",
        "    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
        "    \n",
        "     # For a more detailed view of current Azure Machine Learning Compute  status, use get_status()\n",
        "    print(compute_target.get_status().serialize())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Prepare the Model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Download the Model\n",
        "\n",
        "Download and extract the model from http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz to `\"models\"`"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# create directory for model\n",
        "model_dir = 'models'\n",
        "if not os.path.isdir(model_dir):\n",
        "    os.mkdir(model_dir)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import tarfile\n",
        "import urllib.request\n",
        "\n",
        "url=\"http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz\"\n",
        "response = urllib.request.urlretrieve(url, \"model.tar.gz\")\n",
        "tar = tarfile.open(\"model.tar.gz\", \"r:gz\")\n",
        "tar.extractall(model_dir)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Register the model with Workspace"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import shutil\n",
        "from azureml.core.model import Model\n",
        "\n",
        "# register downloaded model \n",
        "model = Model.register(model_path = \"models/inception_v3.ckpt\",\n",
        "                       model_name = \"inception\", # this is the name the model is registered as\n",
        "                       tags = {'pretrained': \"inception\"},\n",
        "                       description = \"Imagenet trained tensorflow inception\",\n",
        "                       workspace = ws)\n",
        "# remove the downloaded dir after registration if you wish\n",
        "shutil.rmtree(\"models\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Write your scoring script"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To do the scoring, we use a batch scoring script `batch_scoring.py`, which is located in the same directory that this notebook is in. You can take a look at this script to see how you might modify it for your custom batch scoring task.\n",
        "\n",
        "The python script `batch_scoring.py` takes input images, applies the image classification model to these images, and outputs a classification result to a results file.\n",
        "\n",
        "The script `batch_scoring.py` takes the following parameters:\n",
        "\n",
        "- `--model_name`: the name of the model being used, which is expected to be in the `model_dir` directory\n",
        "- `--label_dir` : the directory holding the `labels.txt` file \n",
        "- `--dataset_path`: the directory containing the input images\n",
        "- `--output_dir` : the script will run the model on the data and output a `results-label.txt` to this directory\n",
        "- `--batch_size` : the batch size used in running the model.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Build and run the batch scoring pipeline\n",
        "You have everything you need to build the pipeline. Let\u00e2\u20ac\u2122s put all these together."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "###  Specify the environment to run the script\n",
        "Specify the conda dependencies for your script. You will need this object when you create the pipeline step later on."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.runconfig import DEFAULT_GPU_IMAGE\n",
        "\n",
        "cd = CondaDependencies.create(pip_packages=[\"tensorflow-gpu==1.13.1\", \"azureml-defaults\"])\n",
        "\n",
        "# Runconfig\n",
        "amlcompute_run_config = RunConfiguration(conda_dependencies=cd)\n",
        "amlcompute_run_config.environment.docker.enabled = True\n",
        "amlcompute_run_config.environment.docker.base_image = DEFAULT_GPU_IMAGE\n",
        "amlcompute_run_config.environment.spark.precache_packages = False"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Specify the parameters for your pipeline\n",
        "A subset of the parameters to the python script can be given as input when we re-run a `PublishedPipeline`. In the current example, we define `batch_size` taken by the script as such parameter."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.pipeline.core.graph import PipelineParameter\n",
        "batch_size_param = PipelineParameter(name=\"param_batch_size\", default_value=20)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Create the pipeline step\n",
        "Create the pipeline step using the script, environment configuration, and parameters. Specify the compute target you already attached to your workspace as the target of execution of the script. We will use PythonScriptStep to create the pipeline step."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "inception_model_name = \"inception_v3.ckpt\"\n",
        "\n",
        "batch_score_step = PythonScriptStep(\n",
        "    name=\"batch_scoring\",\n",
        "    script_name=\"batch_scoring.py\",\n",
        "    arguments=[\"--dataset_path\", input_images, \n",
        "               \"--model_name\", \"inception\",\n",
        "               \"--label_dir\", label_dir, \n",
        "               \"--output_dir\", output_dir, \n",
        "               \"--batch_size\", batch_size_param],\n",
        "    compute_target=compute_target,\n",
        "    inputs=[input_images, label_dir],\n",
        "    outputs=[output_dir],\n",
        "    runconfig=amlcompute_run_config\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Run the pipeline\n",
        "At this point you can run the pipeline and examine the output it produced. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "pipelineparameterssample"
        ]
      },
      "outputs": [],
      "source": [
        "pipeline = Pipeline(workspace=ws, steps=[batch_score_step])\n",
        "pipeline_run = Experiment(ws, 'batch_scoring').submit(pipeline, pipeline_parameters={\"param_batch_size\": 20})"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Monitor the run"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.widgets import RunDetails\n",
        "RunDetails(pipeline_run).show()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "pipeline_run.wait_for_completion(show_output=True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Download and review output"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "step_run = list(pipeline_run.get_children())[0]\n",
        "step_run.download_file(\"./outputs/result-labels.txt\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import pandas as pd\n",
        "df = pd.read_csv(\"result-labels.txt\", delimiter=\":\", header=None)\n",
        "df.columns = [\"Filename\", \"Prediction\"]\n",
        "df.head()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Publish a pipeline and rerun using a REST call"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Create a published pipeline\n",
        "Once you are satisfied with the outcome of the run, you can publish the pipeline to run it with different input values later. When you publish a pipeline, you will get a REST endpoint that accepts invoking of the pipeline with the set of parameters you have already incorporated above using PipelineParameter."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "published_pipeline = pipeline_run.publish_pipeline(\n",
        "    name=\"Inception_v3_scoring\", description=\"Batch scoring using Inception v3 model\", version=\"1.0\")\n",
        "\n",
        "published_pipeline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Get published pipeline\n",
        "\n",
        "You can get the published pipeline using **pipeline id**.\n",
        "\n",
        "To get all the published pipelines for a given workspace(ws): \n",
        "```css\n",
        "all_pub_pipelines = PublishedPipeline.get_all(ws)\n",
        "```"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.pipeline.core import PublishedPipeline\n",
        "\n",
        "pipeline_id = published_pipeline.id # use your published pipeline id\n",
        "published_pipeline = PublishedPipeline.get(ws, pipeline_id)\n",
        "\n",
        "published_pipeline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Rerun the pipeline using the REST endpoint"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Get AAD token\n",
        "[This notebook](https://aka.ms/pl-restep-auth) shows how to authenticate to AML workspace."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.authentication import InteractiveLoginAuthentication\n",
        "import requests\n",
        "\n",
        "auth = InteractiveLoginAuthentication()\n",
        "aad_token = auth.get_authentication_header()\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Run published pipeline"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "rest_endpoint = published_pipeline.endpoint\n",
        "# specify batch size when running the pipeline\n",
        "response = requests.post(rest_endpoint, \n",
        "                         headers=aad_token, \n",
        "                         json={\"ExperimentName\": \"batch_scoring\",\n",
        "                               \"ParameterAssignments\": {\"param_batch_size\": 50}})"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "try:\n",
        "    response.raise_for_status()\n",
        "except Exception:    \n",
        "    raise Exception('Received bad response from the endpoint: {}\\n'\n",
        "                    'Response Code: {}\\n'\n",
        "                    'Headers: {}\\n'\n",
        "                    'Content: {}'.format(rest_endpoint, response.status_code, response.headers, response.content))\n",
        "\n",
        "run_id = response.json().get('Id')\n",
        "print('Submitted pipeline run: ', run_id)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Monitor the new run"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.pipeline.core.run import PipelineRun\n",
        "published_pipeline_run = PipelineRun(ws.experiments[\"batch_scoring\"], run_id)\n",
        "\n",
        "RunDetails(published_pipeline_run).show()"
      ]
    }
  ],
  "metadata": {
    "authors": [
      {
        "name": "sanpil"
      }
    ],
    "kernelspec": {
      "display_name": "Python 3.6",
      "language": "python",
      "name": "python36"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.7"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
 }
--- a/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.yml
+++ b/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.yml
@@ -1,7 +0,0 @@
 name: pipeline-batch-scoring
 dependencies:
 - pip:
  - azureml-sdk
  - azureml-widgets
  - pandas
  - requests
--- a/how-to-use-azureml/track-and-monitor-experiments/logging-api/logging-api.ipynb
+++ b/how-to-use-azureml/track-and-monitor-experiments/logging-api/logging-api.ipynb
@@ -100,7 +100,7 @@
        "\n",
        "# Check core SDK version number\n",
        "\n",
-        "print(\"This notebook was created using SDK version 1.0.83, you are currently running version\", azureml.core.VERSION)"
+        "print(\"This notebook was created using SDK version 1.0.85, you are currently running version\", azureml.core.VERSION)"
      ]
    },
    {
--- a/how-to-use-azureml/track-and-monitor-experiments/using-mlflow/deploy-model/deploy-model.ipynb
+++ b/how-to-use-azureml/track-and-monitor-experiments/using-mlflow/deploy-model/deploy-model.ipynb
@@ -1,342 +0,0 @@
 {
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Copyright (c) Microsoft Corporation. All rights reserved.\n",
        "\n",
        "Licensed under the MIT License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/using-mlflow/deploy-model/deploy-model.png)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Deploy Model as Azure Machine Learning Web Service using MLflow\n",
        "\n",
        "This example shows you how to use mlflow together with Azure Machine Learning services for deploying a model as a web service. You'll learn how to:\n",
        "\n",
        " 1. Retrieve a previously trained scikit-learn model\n",
        " 2. Create a Docker image from the model\n",
        " 3. Deploy the model as a web service on Azure Container Instance\n",
        " 4. Make a scoring request against the web service.\n",
        "\n",
        "## Prerequisites and Set-up\n",
        "\n",
        "This notebook requires you to first complete the [Use MLflow with Azure Machine Learning for Local Training Run](../train-local/train-local.ipnyb) or [Use MLflow with Azure Machine Learning for Remote Training Run](../train-remote/train-remote.ipnyb) notebook, so as to have an experiment run with uploaded model in your Azure Machine Learning Workspace.\n",
        "\n",
        "Also install following packages if you haven't already\n",
        "\n",
        "```\n",
        "pip install azureml-mlflow pandas\n",
        "```\n",
        "\n",
        "Then, import necessary packages:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import mlflow\n",
        "import azureml.mlflow\n",
        "import azureml.core\n",
        "from azureml.core import Workspace\n",
        "\n",
        "# Check core SDK version number\n",
        "print(\"SDK version:\", azureml.core.VERSION)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Connect to workspace and set MLflow tracking URI\n",
        "\n",
        "Setting the tracking URI is required for retrieving the model and creating an image using the MLflow APIs."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "ws = Workspace.from_config()\n",
        "\n",
        "mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Retrieve model from previous run\n",
        "\n",
        "Let's retrieve the experiment from training notebook, and list the runs within that experiment."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "experiment_name = \"experiment-with-mlflow\"\n",
        "exp = ws.experiments[experiment_name]\n",
        "\n",
        "runs = list(exp.get_runs())\n",
        "runs"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Then, let's select the most recent training run and find its ID. You also need to specify the path in run history where the model was saved. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "runid = runs[0].id\n",
        "model_save_path = \"model\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Create Docker image\n",
        "\n",
        "To create a Docker image with Azure Machine Learning for Model Management, use ```mlflow.azureml.build_image``` method. Specify the model path, your workspace, run ID and other parameters.\n",
        "\n",
        "MLflow automatically recognizes the model framework as scikit-learn, and creates the scoring logic and includes library dependencies for you.\n",
        "\n",
        "Note that the image creation can take several minutes."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import mlflow.azureml\n",
        "\n",
        "azure_image, azure_model = mlflow.azureml.build_image(model_uri=\"runs:/{}/{}\".format(runid, model_save_path),\n",
        "                                                      workspace=ws,\n",
        "                                                      model_name='diabetes-sklearn-model',\n",
        "                                                      image_name='diabetes-sklearn-image',\n",
        "                                                      synchronous=True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Deploy web service\n",
        "\n",
        "Let's use Azure Machine Learning SDK to deploy the image as a web service. \n",
        "\n",
        "First, specify the deployment configuration. Azure Container Instance is a suitable choice for a quick dev-test deployment, while Azure Kubernetes Service is suitable for scalable production deployments.\n",
        "\n",
        "Then, deploy the image using Azure Machine Learning SDK's ```deploy_from_image``` method.\n",
        "\n",
        "Note that the deployment can take several minutes."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.webservice import AciWebservice, Webservice\n",
        "\n",
        "\n",
        "aci_config = AciWebservice.deploy_configuration(cpu_cores=1, \n",
        "                                                memory_gb=1, \n",
        "                                                tags={\"method\" : \"sklearn\"}, \n",
        "                                                description='Diabetes model',\n",
        "                                                location='eastus2')\n",
        "\n",
        "\n",
        "# Deploy the image to Azure Container Instances (ACI) for real-time serving\n",
        "webservice = Webservice.deploy_from_image(\n",
        "    image=azure_image, workspace=ws, name=\"diabetes-model-1\", deployment_config=aci_config)\n",
        "\n",
        "\n",
        "webservice.wait_for_deployment(show_output=True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Make a scoring request\n",
        "\n",
        "Let's take the first few rows of test data and score them using the web service"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "test_rows = [\n",
        "    [0.01991321,  0.05068012,  0.10480869,  0.07007254, -0.03596778,\n",
        "     -0.0266789 , -0.02499266, -0.00259226,  0.00371174,  0.04034337],\n",
        "    [-0.01277963, -0.04464164,  0.06061839,  0.05285819,  0.04796534,\n",
        "     0.02937467, -0.01762938,  0.03430886,  0.0702113 ,  0.00720652],\n",
        "    [ 0.03807591,  0.05068012,  0.00888341,  0.04252958, -0.04284755,\n",
        "     -0.02104223, -0.03971921, -0.00259226, -0.01811827,  0.00720652]]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "MLflow-based web service for scikit-learn model requires the data to be converted to Pandas DataFrame, and then serialized as JSON. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import json\n",
        "import pandas as pd\n",
        "\n",
        "test_rows_as_json = pd.DataFrame(test_rows).to_json(orient=\"split\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Let's pass the conveted and serialized data to web service to get the predictions."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "predictions = webservice.run(test_rows_as_json)\n",
        "\n",
        "print(predictions)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You can use the web service's scoring URI to make a raw HTTP request"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "webservice.scoring_uri"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You can diagnose the web service using ```get_logs``` method."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "webservice.get_logs()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Next Steps\n",
        "\n",
        "Learn about [model management and inference in Azure Machine Learning service](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-model-management-and-deployment)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "authors": [
      {
        "name": "shipatel"
      }
    ],
    "category": "deployment",
    "compute": [
      "None"
    ],
    "datasets": [
      "Diabetes"
    ],
    "deployment": [
      "Azure Container Instance"
    ],
    "exclude_from_index": false,
    "framework": [
      "Scikit-learn"
    ],
    "friendly_name": "Deploy a model as a web service using MLflow",
    "index_order": 4,
    "kernelspec": {
      "display_name": "Python 3.6",
      "language": "python",
      "name": "python36"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.4"
    },
    "tags": [
      "None"
    ],
    "task": "Use MLflow with AML"
  },
  "nbformat": 4,
  "nbformat_minor": 2
 }
--- a/how-to-use-azureml/track-and-monitor-experiments/using-mlflow/deploy-model/deploy-model.yml
+++ b/how-to-use-azureml/track-and-monitor-experiments/using-mlflow/deploy-model/deploy-model.yml
@@ -1,8 +0,0 @@
 name: deploy-model
 dependencies:
 - scikit-learn
 - matplotlib
 - pip:
  - azureml-sdk
  - azureml-mlflow
  - pandas
--- a/how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-deploy-pytorch/scripts/train.py
+++ b/how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-deploy-pytorch/scripts/train.py
@@ -1,150 +0,0 @@
 # Copyright (c) 2017, PyTorch Team
 # All rights reserved
 # Licensed under BSD 3-Clause License.
 # This example is based on PyTorch MNIST example:
 # https://github.com/pytorch/examples/blob/master/mnist/main.py
 import mlflow
 import mlflow.pytorch
 from mlflow.utils.environment import _mlflow_conda_env
 import warnings
 import cloudpickle
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
 import torch.optim as optim
 import torchvision
 from torchvision import datasets, transforms
 class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4 * 4 * 50, 500)
        self.fc2 = nn.Linear(500, 10)
    def forward(self, x):
        # Added the view for reshaping score requests
        x = x.view(-1, 1, 28, 28)
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4 * 4 * 50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
 def train(args, model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % args.log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
            # Use MLflow logging
            mlflow.log_metric("epoch_loss", loss.item())
 def test(args, model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            # sum up batch loss
            test_loss += F.nll_loss(output, target, reduction="sum").item()
            # get the index of the max log-probability
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()
    test_loss /= len(test_loader.dataset)
    print("\n")
    print("Test set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n".format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
    # Use MLflow logging
    mlflow.log_metric("average_loss", test_loss)
 class Args(object):
    pass
 # Training settings
 args = Args()
 setattr(args, 'batch_size', 64)
 setattr(args, 'test_batch_size', 1000)
 setattr(args, 'epochs', 3)  # Higher number for better convergence
 setattr(args, 'lr', 0.01)
 setattr(args, 'momentum', 0.5)
 setattr(args, 'no_cuda', True)
 setattr(args, 'seed', 1)
 setattr(args, 'log_interval', 10)
 setattr(args, 'save_model', True)
 use_cuda = not args.no_cuda and torch.cuda.is_available()
 torch.manual_seed(args.seed)
 device = torch.device("cuda" if use_cuda else "cpu")
 kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
 train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=args.batch_size, shuffle=True, **kwargs)
 test_loader = torch.utils.data.DataLoader(
    datasets.MNIST(
        '../data',
        train=False,
        transform=transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.1307,), (0.3081,))])),
    batch_size=args.test_batch_size, shuffle=True, **kwargs)
 def driver():
    warnings.filterwarnings("ignore")
    # Dependencies for deploying the model
    pytorch_index = "https://download.pytorch.org/whl/"
    pytorch_version = "cpu/torch-1.1.0-cp36-cp36m-linux_x86_64.whl"
    deps = [
        "cloudpickle=={}".format(cloudpickle.__version__),
        pytorch_index + pytorch_version,
        "torchvision=={}".format(torchvision.__version__),
        "Pillow=={}".format("6.0.0")
    ]
    with mlflow.start_run() as run:
        model = Net().to(device)
        optimizer = optim.SGD(
            model.parameters(),
            lr=args.lr,
            momentum=args.momentum)
        for epoch in range(1, args.epochs + 1):
            train(args, model, device, train_loader, optimizer, epoch)
            test(args, model, device, test_loader)
        # Log model to run history using MLflow
        if args.save_model:
            model_env = _mlflow_conda_env(additional_pip_deps=deps)
            mlflow.pytorch.log_model(model, "model", conda_env=model_env)
    return run
 if __name__ == "__main__":
    driver()
--- a/how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-deploy-pytorch/train-and-deploy-pytorch.ipynb
+++ b/how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-deploy-pytorch/train-and-deploy-pytorch.ipynb
@@ -1,501 +0,0 @@
 {
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Copyright (c) Microsoft Corporation. All rights reserved.\n",
        "\n",
        "Licensed under the MIT License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/using-mlflow/train-deploy-pytorch/train-deploy-pytorch.png)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Use MLflow with Azure Machine Learning to Train and Deploy PyTorch Image Classifier\n",
        "\n",
        "This example shows you how to use MLflow together with Azure Machine Learning services for tracking the metrics and artifacts while training a PyTorch model to classify MNIST digit images, and then deploy the model as a web service. You'll learn how to:\n",
        "\n",
        " 1. Set up MLflow tracking URI so as to use Azure ML\n",
        " 2. Create experiment\n",
        " 3. Instrument your model with MLflow tracking\n",
        " 4. Train a PyTorch model locally\n",
        " 5. Train a model on GPU compute on Azure\n",
        " 6. View your experiment within your Azure ML Workspace in Azure Portal\n",
        " 7. Create a Docker image from the trained model\n",
        " 8. Deploy the model as a web service on Azure Container Instance\n",
        " 9. Call the model to make predictions\n",
        " \n",
        "### Pre-requisites\n",
        " \n",
        "Make sure you have completed the [Configuration](../../../configuration.ipynb) notebook to set up your Azure Machine Learning workspace and ensure other common prerequisites are met.\n",
        "\n",
        "Also, install the azureml-mlflow package using ```pip install azureml-mlflow```. Note that azureml-mlflow installs the mlflow package itself as a dependency, if you haven't installed it previously.\n",
        "\n",
        "### Set-up\n",
        "\n",
        "Import packages and check versions of Azure ML SDK and MLflow installed on your computer. Then connect to your Workspace."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import sys, os\n",
        "import mlflow\n",
        "import mlflow.azureml\n",
        "import mlflow.sklearn\n",
        "\n",
        "import azureml.core\n",
        "from azureml.core import Workspace\n",
        "\n",
        "\n",
        "print(\"SDK version:\", azureml.core.VERSION)\n",
        "print(\"MLflow version:\", mlflow.version.VERSION)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "ws = Workspace.from_config()\n",
        "ws.get_details()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Set tracking URI\n",
        "\n",
        "Set the MLFlow tracking URI to point to your Azure ML Workspace. The subsequent logging calls from MLFlow APIs will go to Azure ML services and will be tracked under your Workspace."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Create Experiment\n",
        "\n",
        "In both MLflow and Azure ML, training runs are grouped into experiments. Let's create one for our experimentation."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "experiment_name = \"pytorch-with-mlflow\"\n",
        "mlflow.set_experiment(experiment_name)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Train model locally while logging metrics and artifacts\n",
        "\n",
        "The ```scripts/train.py``` program contains the code to load the image dataset, and train and test the model. Within this program, the train.driver function wraps the end-to-end workflow.\n",
        "\n",
        "Within the driver, ```mlflow.start_run``` starts MLflow tracking. Then, ```mlflow.log_metric``` calls track the convergence of the neural network training iterations. Finally, ```mlflow.pytorch.log_model``` logs the trained model to the run history in a framework-aware manner.\n",
        "\n",
        "Let's add the program to the search path, import it as a module, and then invoke the driver function. Note that the training can take a few minutes."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "lib_path = os.path.abspath(\"scripts\")\n",
        "sys.path.append(lib_path)\n",
        "\n",
        "import train"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "run = train.driver()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You can view the metrics of the run in the Azure Portal."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "print(azureml.mlflow.get_portal_url(run))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Train model on GPU compute on Azure\n",
        "\n",
        "Next, let's run the same script on GPU-enabled compute for faster training. If you've completed the [Configuration](../../../configuration.ipynb) notebook, you should have a GPU cluster named \"gpu-cluster\" available in your workspace. Otherwise, follow the instructions in the notebook to create one. For simplicity, this example uses a single process on a single VM to train the model.\n",
        "\n",
        "Create a PyTorch estimator to specify the training configuration: script, compute as well as additional packages needed. To enable MLflow tracking, include ```azureml-mlflow``` as pip package. The low-level specifications for the training run are encapsulated in the estimator instance."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.train.dnn import PyTorch\n",
        "\n",
        "pt = PyTorch(source_directory=\"./scripts\", \n",
        "             entry_script = \"train.py\", \n",
        "             compute_target = \"gpu-cluster\", \n",
        "             node_count = 1, \n",
        "             process_count_per_node = 1, \n",
        "             use_gpu=True,\n",
        "             pip_packages = [\"azureml-mlflow\", \"Pillow==6.0.0\"])\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Get a reference to the experiment you created previously, but this time as an Azure Machine Learning experiment object.\n",
        "\n",
        "Then, use the ```Experiment.submit``` method to start the remote training run. Note that the first training run often takes longer because Azure Machine Learning service builds the Docker image for executing the script. Subsequent runs will be faster because the cached image is used."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core import Experiment\n",
        "\n",
        "exp = Experiment(ws, experiment_name)\n",
        "run = exp.submit(pt)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You can monitor the run and its metrics on Azure Portal."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "run"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You can also wait for the run to complete."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "run.wait_for_completion(show_output=True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Deploy model as web service\n",
        "\n",
        "To deploy a web service, first create a Docker image, and then deploy that Docker image on inferencing compute.\n",
        "\n",
        "The ```mlflow.azureml.build_image``` function builds a Docker image from the saved PyTorch model in a framework-aware manner. It automatically creates the PyTorch-specific inferencing wrapper code and specifies package dependencies for you."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "run.get_file_names()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Then build a docker image using *runs:/&lt;run.id&gt;/model* as the model_uri path.\n",
        "\n",
        "Note that the image building can take several minutes."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "model_path = \"model\"\n",
        "\n",
        "\n",
        "azure_image, azure_model = mlflow.azureml.build_image(model_uri='runs:/{}/{}'.format(run.id, model_path),\n",
        "                                                      workspace=ws,\n",
        "                                                      model_name='pytorch_mnist',\n",
        "                                                      image_name='pytorch-mnist-img',\n",
        "                                                      synchronous=True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Then, deploy the Docker image to Azure Container Instance: a serverless compute capable of running a single container. You can tag and add descriptions to help keep track of your web service. \n",
        "\n",
        "[Other inferencing compute choices](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where) include Azure Kubernetes Service which provides scalable endpoint suitable for production use.\n",
        "\n",
        "Note that the service deployment can take several minutes."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.webservice import AciWebservice, Webservice\n",
        "\n",
        "aci_config = AciWebservice.deploy_configuration(cpu_cores=2, \n",
        "                                                memory_gb=5, \n",
        "                                                tags={\"data\": \"MNIST\",  \"method\" : \"pytorch\"}, \n",
        "                                                description=\"Predict using webservice\")\n",
        "\n",
        "\n",
        "# Deploy the image to Azure Container Instances (ACI) for real-time serving\n",
        "webservice = Webservice.deploy_from_image(\n",
        "    image=azure_image, workspace=ws, name=\"pytorch-mnist-1\", deployment_config=aci_config)\n",
        "\n",
        "\n",
        "webservice.wait_for_deployment()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Once the deployment has completed you can check the scoring URI of the web service."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "print(\"Scoring URI is: {}\".format(webservice.scoring_uri))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "In case of a service creation issue, you can use ```webservice.get_logs()``` to get logs to debug."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Make predictions using web service\n",
        "\n",
        "To make predictions with the web service, first create a test data set of normalized PyTorch tensors. \n",
        "\n",
        "Then, let's define a utility function that takes a random image and converts it into a format and shape suitable as input to the PyTorch inferencing end-point. The conversion is done as follows: \n",
        "\n",
        " 1. Select a random (image, label) tuple.\n",
        " 2. Take the image and convert the tensor to a NumPy array.\n",
        " 3. Reshape the array into a 1 x 1 x N array.\n",
        "    * 1 image in batch, 1 color channel, N = 784 pixels for MNIST images\n",
        "    * Note also ```x = x.view(-1, 1, 28, 28)``` in net definition in ```train.py``` program to shape incoming scoring requests.\n",
        " 4. Convert the NumPy array to a list to make it into a built-in type.\n",
        " 5. Create a dictionary {\"data\": &lt;list&gt;} that can be converted to a JSON string for web service requests."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from torchvision import datasets, transforms\n",
        "import random\n",
        "import numpy as np\n",
        "\n",
        "test_data = datasets.MNIST('../data', train=False, transform=transforms.Compose([\n",
        "                       transforms.ToTensor(),\n",
        "                       transforms.Normalize((0.1307,), (0.3081,))]))\n",
        "\n",
        "\n",
        "def get_random_image():\n",
        "    image_idx = random.randrange(len(test_data))  # randrange excludes the upper bound\n",
        "    image_as_tensor = test_data[image_idx][0]\n",
        "    return {\"data\": elem for elem in image_as_tensor.numpy().reshape(1,1,-1).tolist()}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Then, invoke the web service using a random test image. Convert the dictionary containing the image to JSON string before passing it to web service.\n",
        "\n",
        "The response contains the raw scores for each label, with greater value indicating higher probability. Sort the labels and select the one with greatest score to get the prediction. Let's also plot the image sent to web service for comparison purposes."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "%matplotlib inline\n",
        "\n",
        "import json\n",
        "import matplotlib.pyplot as plt\n",
        "\n",
        "test_image = get_random_image()\n",
        "\n",
        "response = webservice.run(json.dumps(test_image))\n",
        "\n",
        "response = sorted(response[0].items(), key = lambda x: x[1], reverse = True)\n",
        "\n",
        "\n",
        "print(\"Predicted label:\", response[0][0])\n",
        "plt.imshow(np.array(test_image[\"data\"]).reshape(28,28), cmap = \"gray\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You can also call the web service by sending a raw POST request to its scoring URI."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import requests\n",
        "\n",
        "response = requests.post(url=webservice.scoring_uri, data=json.dumps(test_image),headers={\"Content-type\": \"application/json\"})\n",
        "print(response.text)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": []
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "authors": [
      {
        "name": "shipatel"
      }
    ],
    "category": "tutorial",
    "celltoolbar": "Edit Metadata",
    "compute": [
      "AML Compute"
    ],
    "datasets": [
      "MNIST"
    ],
    "deployment": [
      "Azure Container Instance"
    ],
    "exclude_from_index": false,
    "framework": [
      "PyTorch"
    ],
    "friendly_name": "Use MLflow with Azure Machine Learning for training and deployment",
    "index_order": 6,
    "kernelspec": {
      "display_name": "Python 3.6",
      "language": "python",
      "name": "python36"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.3"
    },
    "name": "mlflow-sparksummit-pytorch",
    "notebookId": 2495374963457641,
    "tags": [
      "None"
    ],
    "task": "Use MLflow with Azure Machine Learning to train and deploy a PyTorch image classifier model"
  },
  "nbformat": 4,
  "nbformat_minor": 1
 }
--- a/how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-deploy-pytorch/train-and-deploy-pytorch.yml
+++ b/how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-deploy-pytorch/train-and-deploy-pytorch.yml
@@ -1,8 +0,0 @@
 name: train-and-deploy-pytorch
 dependencies:
 - matplotlib
 - pip:
  - azureml-sdk
  - azureml-mlflow
  - https://download.pytorch.org/whl/cpu/torch-1.1.0-cp35-cp35m-win_amd64.whl
  - https://download.pytorch.org/whl/cpu/torchvision-0.3.0-cp35-cp35m-win_amd64.whl
--- a/how-to-use-azureml/training/train-on-local/train-on-local.ipynb
+++ b/how-to-use-azureml/training/train-on-local/train-on-local.ipynb
@@ -167,7 +167,7 @@
    {
      "cell_type": "code",
      "execution_count": null,
-      "metadata": {},
+      "metadata": {"name":"user_managed_env"},
      "outputs": [],
      "source": [
        "from azureml.core import Environment\n",
@@ -192,7 +192,7 @@
    {
      "cell_type": "code",
      "execution_count": null,
-      "metadata": {},
+      "metadata": {"name":"src"},
      "outputs": [],
      "source": [
        "from azureml.core import ScriptRunConfig\n",
@@ -224,7 +224,7 @@
    {
      "cell_type": "code",
      "execution_count": null,
-      "metadata": {},
+      "metadata": {"name":"run"},
      "outputs": [],
      "source": [
        "run"
--- a/how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb
+++ b/how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb
@@ -80,7 +80,8 @@
      "metadata": {
        "tags": [
          "install"
-        ]
+        ],
        "name": "load_ws"
      },
      "outputs": [],
      "source": [
@@ -113,7 +114,7 @@
    {
      "cell_type": "code",
      "execution_count": null,
-      "metadata": {},
+      "metadata": {"name": "load_data"},
      "outputs": [],
      "source": [
        "from sklearn.datasets import load_diabetes\n",
@@ -155,7 +156,8 @@
        "tags": [
          "local run",
          "outputs upload"
-        ]
+        ],
        "name": "create_experiment"
      },
      "outputs": [],
      "source": [
--- a/how-to-use-azureml/work-with-data/dataset-api-change-notice.md
+++ b/how-to-use-azureml/work-with-data/dataset-api-change-notice.md
@@ -18,7 +18,7 @@ Methods to be deprecated|Replacement in the new version|
 [Dataset.from_parquet_files()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py#from-parquet-files-path--include-path-false--partition-format-none-)|[Dataset.Tabular.from_parquet_files()](https://docs.microsoft.com/python/api/azureml-core/azureml.data.dataset_factory.tabulardatasetfactory?view=azure-ml-py#from-parquet-files-path--validate-true--include-path-false--set-column-types-none-)
 [Dataset.from_sql_query()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py#from-sql-query-data-source--query-)|[Dataset.Tabular.from_sql_query()](https://docs.microsoft.com/python/api/azureml-core/azureml.data.dataset_factory.tabulardatasetfactory?view=azure-ml-py#from-sql-query-query--validate-true--set-column-types-none-)
 [Dataset.from_excel_files()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py#from-excel-files-path--sheet-name-none--use-column-headers-false--skip-rows-0--include-path-false--infer-column-types-true--partition-format-none-)|We will support creating a TabularDataset from Excel files in a future release.
-[Dataset.from_json_files()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py#from-json-files-path--encoding--fileencoding-utf8--0---flatten-nested-arrays-false--include-path-false--partition-format-none-)| We will support creating a TabularDataset from json files in a future release.
+[Dataset.from_json_files()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py#from-json-files-path--encoding--fileencoding-utf8--0---flatten-nested-arrays-false--include-path-false--partition-format-none-)| [Dataset.Tabular.from_json_lines_files](https://docs.microsoft.com/python/api/azureml-core/azureml.data.dataset_factory.tabulardatasetfactory?view=azure-ml-py#from-json-lines-files-path--validate-true--include-path-false--set-column-types-none--partition-format-none-)
 [Dataset.to_pandas_dataframe()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py#to-pandas-dataframe--)|[TabularDataset.to_pandas_dataframe()](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py#to-pandas-dataframe--)
 [Dataset.to_spark_dataframe()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py#to-spark-dataframe--)|[TabularDataset.to_spark_dataframe()](https://docs.microsoft.com/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py#to-spark-dataframe--)
 [Dataset.head(3)](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py#head-count-)|[TabularDataset.take(3).to_pandas_dataframe()](https://docs.microsoft.com/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py#take-count-)
@@ -29,27 +29,13 @@ Methods to be deprecated|Replacement in the new version|
 ## Why should I use the new Dataset API if I'm only dealing with tabular data?
 The current Dataset will be kept around for backward compatibility, but we strongly encourage you to move to TabularDataset for the new capabilities listed below: 
 - You are able to version and track the new typed Datasets. [Learn How](https://aka.ms/azureml/howto/versiondata)
 - You are able to use TabularDatasets as automated ML input. [Learn How](https://aka.ms/automl-dataset)
- You are able to version the new typed Datasets. [Learn How](https://aka.ms/azureml/howto/createdatasets)
+- You are able to use the new typed Datasets as ScriptRun, Estimator, HyperDrive input. [Learn How](https://aka.ms/train-with-datasets)
- You will be able to use the new typed Datasets as ScriptRun, Estimator, HyperDrive input.
+- You are able to use the new typed Datasets in Azure Machine Learning Pipelines. [Learn How](https://aka.ms/pl-datasets)
 - You will be able to use the new typed Datasets in Azure Machine Learning Pipelines.
 - You will be able to track the lineage of new typed Datasets for model reproducibility.
 ## How to migrate registered Datasets to new typed Datasets?
-If you have registered Datasets created using the old API, you can easily migrate these old Datasets to the new typed Datasets using the following code.
+We handled the migration for you. All legacy datasets are migrated to new typed Datasets automatically. To use registered datasets, simply call [Dataset.get_by_name](https://docs.microsoft.com/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py#get-by-name-workspace--name--version--latest--).
 ```Python
 from azureml.core.workspace import Workspace
 from azureml.core.dataset import Dataset
 # get existing workspace
 workspace = Workspace.from_config()
 # This method will convert old Dataset without type to either a TabularDataset or a FileDataset object automatically.
 new_ds = Dataset.get_by_name(workspace, 'old_ds_name')
 # register the new typed Dataset with the workspace
 new_ds.register(workspace, 'new_ds_name')
 ```
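
As a rough sketch of what using a migrated dataset might look like, the snippet below fetches a registered dataset by name and previews its first rows. It combines `Dataset.get_by_name` with the `take(3).to_pandas_dataframe()` replacement for `Dataset.head(3)` listed in the table above. The helper name `preview_registered_dataset` is hypothetical, and the sketch assumes that a workspace `config.json` is available locally and that the registered dataset is tabular.

```Python
def preview_registered_dataset(name, rows=3):
    """Return the first `rows` records of a registered dataset as a pandas DataFrame.

    Hypothetical helper for illustration; assumes the dataset is a TabularDataset.
    """
    # Imported lazily so this helper can be defined without the Azure ML SDK installed.
    from azureml.core import Dataset, Workspace

    # Assumes a config.json describing the workspace is discoverable from the working directory.
    workspace = Workspace.from_config()

    # Look up the registered dataset by name (the latest version is returned by default).
    dataset = Dataset.get_by_name(workspace, name)

    # take(n) limits the dataset to its first n records before materializing it.
    return dataset.take(rows).to_pandas_dataframe()
```

Calling `preview_registered_dataset('old_ds_name')` would then return a small DataFrame, provided the name matches a registered tabular dataset in your workspace.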
 ## How to provide feedback?
 If you have any feedback about our product, or if there is any missing capability that is essential for you to use new Dataset API, please email us at [AskAzureMLData@microsoft.com](mailto:AskAzureMLData@microsoft.com).
--- a/index.md
+++ b/index.md
@@ -12,7 +12,6 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
 | [Using Azure ML environments](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/using-environments/using-environments.ipynb) | Creating and registering environments | None | Local | None | None | None |
 | [Estimators in AML with hyperparameter tuning](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training-with-deep-learning/how-to-use-estimator/how-to-use-estimator.ipynb) | Use the Estimator pattern in Azure Machine Learning SDK | None | AML Compute | None | None | None |
 ## Tutorials
 |Title| Task | Dataset | Training Compute | Deployment Target | ML Framework | Tags |
@@ -33,7 +32,6 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
 | [Automated ML run with basic edition features.](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.ipynb) | Classification | Bankmarketing | AML | ACI | None | featurization, explainability, remote_run, AutomatedML |
 | [Classification of credit card fraudulent transactions using Automated ML](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb) | Classification | Creditcard | AML Compute | None | None | remote_run, AutomatedML |
 | [Automated ML run with featurization and model explainability.](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/regression-hardware-performance-explanation-and-featurization/auto-ml-regression-hardware-performance-explanation-and-featurization.ipynb) | Regression | MachineData | AML | ACI | None | featurization, explainability, remote_run, AutomatedML |
 | [Use MLflow with Azure Machine Learning for training and deployment](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-deploy-pytorch/train-and-deploy-pytorch.ipynb) | Use MLflow with Azure Machine Learning to train and deploy a PyTorch image classifier model | MNIST | AML Compute | Azure Container Instance | PyTorch | None |
 | :star:[Azure Machine Learning Pipeline with DataTranferStep](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-data-transfer.ipynb) | Demonstrates the use of DataTranferStep | Custom | ADF | None | Azure ML | None |
 | [Getting Started with Azure Machine Learning Pipelines](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-getting-started.ipynb) | Getting Started notebook for ANML Pipelines | Custom | AML Compute | None | Azure ML | None |
 | [Azure Machine Learning Pipeline with AzureBatchStep](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-azurebatch-to-run-a-windows-executable.ipynb) | Demonstrates the use of AzureBatchStep | Custom | Azure Batch | None | Azure ML | None |
@@ -51,7 +49,6 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
 | :star:[Azure Machine Learning Pipelines with Data Dependency](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-with-data-dependency-steps.ipynb) | Demonstrates how to construct a Pipeline with data dependency between steps | Custom | AML Compute | None | Azure ML | None |
 | [How to use run a notebook as a step in AML Pipelines](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-with-notebook-runner-step.ipynb) | Demonstrates the use of NotebookRunnerStep | Custom | AML Compute | None | Azure ML | None |
 ## Training
 |Title| Task | Dataset | Training Compute | Deployment Target | ML Framework | Tags |
@@ -79,7 +76,6 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
 | [Use MLflow with AML for a remote training run](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-remote/train-remote.ipynb) | Use MLflow tracking APIs together with AML for storing your metrics and artifacts | Diabetes | AML Compute | None | None | None |
 ## Deployment
@@ -91,9 +87,8 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
 | :star:[Deploy models to AKS using controlled roll out](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/deploy-with-controlled-rollout/deploy-aks-with-controlled-rollout.ipynb) | Deploy a model with Azure Machine Learning | Diabetes | None | Azure Kubernetes Service | Scikit-learn | None |
 | [Train MNIST in PyTorch, convert, and deploy with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-train-pytorch-aml-deploy-mnist.ipynb) | Image Classification | MNIST | AML Compute | Azure Container Instance | ONNX | ONNX Converter |
 | [Deploy ResNet50 with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-modelzoo-aml-deploy-resnet50.ipynb) | Image Classification | ImageNet | Local | Azure Container Instance | ONNX | ONNX Model Zoo |
 | [Deploy a model as a web service using MLflow](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/using-mlflow/deploy-model/deploy-model.ipynb) | Use MLflow with AML | Diabetes | None | Azure Container Instance | Scikit-learn | None |
 | :star:[Convert and deploy TinyYolo with ONNX Runtime](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-convert-aml-deploy-tinyyolo.ipynb) | Object Detection | PASCAL VOC | local | Azure Container Instance | ONNX | ONNX Converter |
-
+| [Register Spark model and deploy as webservice](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/spark/model-register-and-deploy-spark.ipynb) |  | Iris | None | Azure Container Instance | PySpark |  |
 ## Other Notebooks
@@ -110,7 +105,6 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
 | [auto-ml-regression](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/regression/auto-ml-regression.ipynb) |  |  |  |  |  |  |
 | [build-model-run-history-03](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/amlsdk/build-model-run-history-03.ipynb) |  |  |  |  |  |  |
 | [deploy-to-aci-04](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aci-04.ipynb) |  |  |  |  |  |  |
 | [deploy-to-aks-05](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aks-05.ipynb) |  |  |  |  |  |  |
 | [ingest-data-02](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/amlsdk/ingest-data-02.ipynb) |  |  |  |  |  |  |
 | [installation-and-configuration-01](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/amlsdk/installation-and-configuration-01.ipynb) |  |  |  |  |  |  |
 | [automl-databricks-local-01](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/azure-databricks/automl/automl-databricks-local-01.ipynb) |  |  |  |  |  |  |
@@ -124,7 +118,6 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
 | [enable-app-insights-in-production-service](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/enable-app-insights-in-production-service/enable-app-insights-in-production-service.ipynb) |  |  |  |  |  |  |
 | [onnx-model-register-and-deploy](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-model-register-and-deploy.ipynb) |  |  |  |  |  |  |
 | [production-deploy-to-aks](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb) |  |  |  |  |  |  |
 | [register-model-create-image-deploy-service](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb) |  |  |  |  |  |  |
 | [tensorflow-model-register-and-deploy](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/tensorflow/tensorflow-model-register-and-deploy.ipynb) |  |  |  |  |  |  |
 | [explain-model-on-amlcompute](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/azure-integration/remote-explanation/explain-model-on-amlcompute.ipynb) |  |  |  |  |  |  |
 | [save-retrieve-explanations-run-history](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/azure-integration/run-history/save-retrieve-explanations-run-history.ipynb) |  |  |  |  |  |  |
@@ -132,7 +125,6 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
 | [train-explain-model-on-amlcompute-and-deploy](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/azure-integration/scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb) |  |  |  |  |  |  |
 | [training_notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/notebook_runner/training_notebook.ipynb) |  |  |  |  |  |  |
 | [nyc-taxi-data-regression-model-building](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/nyc-taxi-data-regression-model-building/nyc-taxi-data-regression-model-building.ipynb) |  |  |  |  |  |  |
 | [pipeline-batch-scoring](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.ipynb) |  |  |  |  |  |  |
 | [authentication-in-azureml](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/manage-azureml-service/authentication-in-azureml/authentication-in-azureml.ipynb) |  |  |  |  |  |  |
 | [Logging APIs](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/track-and-monitor-experiments/logging-api/logging-api.ipynb) | Logging APIs and analyzing results | None | None | None | None | None |
 | [distributed-cntk-with-custom-docker](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training-with-deep-learning/distributed-cntk-with-custom-docker/distributed-cntk-with-custom-docker.ipynb) |  |  |  |  |  |  |
@@ -142,5 +134,4 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
 | [img-classification-part2-deploy](https://github.com/Azure/MachineLearningNotebooks/blob/master//tutorials/img-classification-part2-deploy.ipynb) |  |  |  |  |  |  |
 | [regression-automated-ml](https://github.com/Azure/MachineLearningNotebooks/blob/master//tutorials/regression-automated-ml.ipynb) |  |  |  |  |  |  |
 | [tutorial-1st-experiment-sdk-train](https://github.com/Azure/MachineLearningNotebooks/blob/master//tutorials/tutorial-1st-experiment-sdk-train.ipynb) |  |  |  |  |  |  |
-| [tutorial-pipeline-batch-scoring-classification](https://github.com/Azure/MachineLearningNotebooks/blob/master//tutorials/tutorial-pipeline-batch-scoring-classification.ipynb) |  |  |  |  |  |  |
+| [tutorial-pipeline-batch-scoring-classification](https://github.com/Azure/MachineLearningNotebooks/blob/master//tutorials/machine-learning-pipelines-advanced/tutorial-pipeline-batch-scoring-classification.ipynb) |  |  |  |  |  |  |
--- a/setup-environment/configuration.ipynb
+++ b/setup-environment/configuration.ipynb
@@ -102,7 +102,7 @@
      "source": [
        "import azureml.core\n",
        "\n",
-        "print(\"This notebook was created using version 1.0.83 of the Azure ML SDK\")\n",
+        "print(\"This notebook was created using version 1.0.85 of the Azure ML SDK\")\n",
        "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
      ]
    },
--- a/tutorials/img-classification-part2-deploy.ipynb
+++ b/tutorials/img-classification-part2-deploy.ipynb
@@ -62,18 +62,7 @@
        "                        model_name=model_name,\n",
        "                        tags={\"data\": \"mnist\", \"model\": \"classification\"},\n",
        "                        description=\"Mnist handwriting recognition\",\n",
-        "                        workspace=ws)\n",
+        "                        workspace=ws)"
        "\n",
        "# download test data\n",
        "import os\n",
        "import urllib.request\n",
        "\n",
        "data_folder = os.path.join(os.getcwd(), 'data')\n",
        "os.makedirs(data_folder, exist_ok = True)\n",
        "\n",
        "\n",
        "urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz', filename=os.path.join(data_folder, 'test-images.gz'))\n",
        "urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz', filename=os.path.join(data_folder, 'test-labels.gz'))"
      ]
    },
    {
@@ -150,10 +139,42 @@
        "## Test model locally\n",
        "\n",
        "Before deploying, make sure your model is working locally by:\n",
        "* Downloading the test data if you haven't already\n",
        "* Loading test data\n",
        "* Predicting test data\n",
-        "* Examining the confusion matrix\n",
+        "* Examining the confusion matrix"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Download test data\n",
        "If you haven't already, download the test data to the **./data/** directory."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# download test data\n",
        "import os\n",
        "import urllib.request\n",
        "\n",
        "data_folder = os.path.join(os.getcwd(), 'data')\n",
        "os.makedirs(data_folder, exist_ok = True)\n",
        "\n",
        "\n",
        "urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz', filename=os.path.join(data_folder, 'test-images.gz'))\n",
        "urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz', filename=os.path.join(data_folder, 'test-labels.gz'))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Load test data\n",
        "\n",
        "Load the test data from the **./data/** directory created during the training tutorial."
@@ -190,10 +211,11 @@
      "outputs": [],
      "source": [
        "import pickle\n",
-        "from sklearn.externals import joblib\n",
+        "import joblib\n",
        "\n",
        "clf = joblib.load( os.path.join(os.getcwd(), 'sklearn_mnist_model.pkl'))\n",
-        "y_hat = clf.predict(X_test)"
+        "y_hat = clf.predict(X_test)\n",
        "print(y_hat)"
      ]
    },
    {
@@ -286,8 +308,7 @@
        "import numpy as np\n",
        "import os\n",
        "import pickle\n",
-        "from sklearn.externals import joblib\n",
+        "import joblib\n",
        "from sklearn.linear_model import LogisticRegression\n",
        "\n",
        "def init():\n",
        "    global model\n",
@@ -327,7 +348,7 @@
        "from azureml.core.conda_dependencies import CondaDependencies \n",
        "\n",
        "myenv = CondaDependencies()\n",
-        "myenv.add_conda_package(\"scikit-learn\")\n",
+        "myenv.add_pip_package(\"scikit-learn\")\n",
        "myenv.add_pip_package(\"azureml-defaults\")\n",
        "\n",
        "with open(\"myenv.yml\",\"w\") as f:\n",
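The hunks above replace `from sklearn.externals import joblib` with `import joblib`. Where it is unclear which of the two is importable in a given environment, one option is a fallback import. The following is a minimal sketch using only the standard library's `importlib`; the helper name `import_first_available` is hypothetical and is not part of joblib, scikit-learn, or the Azure ML SDK.

```python
import importlib


def import_first_available(module_names):
    """Return the first module in module_names that imports successfully."""
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append("%s: %s" % (name, exc))
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))


if __name__ == "__main__":
    try:
        # Prefer the standalone package; fall back to the older vendored path.
        joblib = import_first_available(["joblib", "sklearn.externals.joblib"])
        print("using", joblib.__name__)
    except ImportError as exc:
        print(exc)
```

The notebook itself only needs the plain `import joblib` shown in the diff; the fallback is useful mainly when the same cell has to run against both older and newer scikit-learn installations.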
--- a/tutorials/machine-learning-pipelines-advanced/scripts/batch_scoring.py
+++ b/tutorials/machine-learning-pipelines-advanced/scripts/batch_scoring.py
@@ -0,0 +1,83 @@
 # Copyright (c) Microsoft. All rights reserved.
 # Licensed under the MIT license.
 import os
 import argparse
 import datetime
 import time
 import tensorflow as tf
 from math import ceil
 import numpy as np
 import shutil
 from tensorflow.contrib.slim.python.slim.nets import inception_v3
 from azureml.core import Run
 from azureml.core.model import Model
 from azureml.core.dataset import Dataset
 slim = tf.contrib.slim
 image_size = 299
 num_channel = 3
 def get_class_label_dict():
    label = []
    proto_as_ascii_lines = tf.gfile.GFile("labels.txt").readlines()
    for l in proto_as_ascii_lines:
        label.append(l.rstrip())
    return label
 def init():
    global g_tf_sess, probabilities, label_dict, input_images
    parser = argparse.ArgumentParser(description="Start a tensorflow model serving")
    parser.add_argument('--model_name', dest="model_name", required=True)
    parser.add_argument('--labels_name', dest="labels_name", required=True)
    args, _ = parser.parse_known_args()
    workspace = Run.get_context(allow_offline=False).experiment.workspace
    label_ds = Dataset.get_by_name(workspace=workspace, name=args.labels_name)
    label_ds.download(target_path='.', overwrite=True)
    label_dict = get_class_label_dict()
    classes_num = len(label_dict)
    with slim.arg_scope(inception_v3.inception_v3_arg_scope()):
        input_images = tf.placeholder(tf.float32, [1, image_size, image_size, num_channel])
        logits, _ = inception_v3.inception_v3(input_images,
                                              num_classes=classes_num,
                                              is_training=False)
        probabilities = tf.argmax(logits, 1)
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    g_tf_sess = tf.Session(config=config)
    g_tf_sess.run(tf.global_variables_initializer())
    g_tf_sess.run(tf.local_variables_initializer())
    model_path = Model.get_model_path(args.model_name)
    saver = tf.train.Saver()
    saver.restore(g_tf_sess, model_path)
 def file_to_tensor(file_path):
    image_string = tf.read_file(file_path)
    image = tf.image.decode_image(image_string, channels=3)
    image.set_shape([None, None, None])
    image = tf.image.resize_images(image, [image_size, image_size])
    image = tf.divide(tf.subtract(image, [0]), [255])
    image.set_shape([image_size, image_size, num_channel])
    return image
 def run(mini_batch):
    result_list = []
    for file_path in mini_batch:
        test_image = file_to_tensor(file_path)
        out = g_tf_sess.run(test_image)
        result = g_tf_sess.run(probabilities, feed_dict={input_images: [out]})
        result_list.append(os.path.basename(file_path) + ": " + label_dict[result[0]])
    return result_list
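The entry script above has an `init()` / `run(mini_batch)` shape: `init()` prepares shared state (labels, graph, session) and `run()` receives a list of file paths and returns one result string per input. The following is a framework-free sketch of that contract, not the scoring script itself. It assumes a hypothetical `classify` function standing in for the TensorFlow session call and a hypothetical `chunk` helper that mimics how a mini-batch size of 20 would split the inputs; the actual scheduling is done by ParallelRunStep and may differ.

```python
import os

# Hypothetical stand-in for the state that init() would load once.
_labels = None


def init():
    """Load shared state once (here: a tiny label list instead of a model)."""
    global _labels
    _labels = ["cat", "dog", "bird"]


def classify(file_path):
    """Hypothetical classifier: derives a label index from the file name length."""
    return len(os.path.basename(file_path)) % len(_labels)


def run(mini_batch):
    """Score each file path in the mini-batch and return one string per input."""
    results = []
    for file_path in mini_batch:
        label = _labels[classify(file_path)]
        results.append(os.path.basename(file_path) + ": " + label)
    return results


def chunk(items, size):
    """Split items into consecutive mini-batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


if __name__ == "__main__":
    init()
    paths = ["/data/img_%03d.jpg" % n for n in range(45)]
    for batch in chunk(paths, 20):
        print(len(run(batch)))
```

Returning exactly one entry per input path keeps the output easy to reconcile with the input set, which matters when results from many mini-batches are appended into a single file.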
--- a/tutorials/machine-learning-pipelines-advanced/tutorial-pipeline-batch-scoring-classification.ipynb
+++ b/tutorials/machine-learning-pipelines-advanced/tutorial-pipeline-batch-scoring-classification.ipynb
@@ -15,19 +15,16 @@
        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.png)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "**Note**: Azure Machine Learning recently released ParallelRunStep for public preview. It parallelizes your workload across many compute nodes without the difficulty of orchestrating worker pools and queues. See the [batch inference notebooks](../contrib/batch_inferencing/) for examples of how to get started."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Use Azure Machine Learning Pipelines for batch prediction\n",
        "\n",
        "## Note\n",
        "This notebook uses public preview functionality (ParallelRunStep). Please install the `azureml-contrib-pipeline-steps` package before running this notebook.\n",
        "\n",
        "\n",
        "In this tutorial, you use Azure Machine Learning service pipelines to run a batch scoring image classification job. The example job uses the pre-trained [Inception-V3](https://arxiv.org/abs/1512.00567) CNN (convolutional neural network) TensorFlow model to classify unlabeled images. Machine learning pipelines optimize your workflow with speed, portability, and reuse so you can focus on your expertise, machine learning, rather than on infrastructure and automation. After building and publishing a pipeline, you can configure a REST endpoint to enable triggering the pipeline from any HTTP library on any platform.\n",
        "\n",
        "\n",
@@ -37,6 +34,7 @@
        "> * Create data objects to fetch and output data\n",
        "> * Download, prepare, and register the model to your workspace\n",
        "> * Provision compute targets and create a scoring script\n",
        "> * Use ParallelRunStep to do batch scoring\n",
        "> * Build, run, and publish a pipeline\n",
        "> * Enable a REST endpoint for the pipeline\n",
        "\n",
@@ -111,14 +109,14 @@
      "source": [
        "## Create data objects\n",
        "\n",
-        "When building pipelines, `DataReference` objects are used for reading data from workspace datastores, and `PipelineData` objects are used for transferring intermediate data between pipeline steps.\n",
+        "When building pipelines, `Dataset` objects are used for reading data from workspace datastores, and `PipelineData` objects are used for transferring intermediate data between pipeline steps.\n",
        "\n",
        "This batch scoring example only uses one pipeline step, but in use-cases with multiple steps, the typical flow will include:\n",
        "\n",
-        "1. Using `DataReference` objects as **inputs** to fetch raw data, performing some transformations, then **outputting** a `PipelineData` object.\n",
+        "1. Using `Dataset` objects as **inputs** to fetch raw data, performing some transformations, then **outputting** a `PipelineData` object.\n",
        "1. Using the previous step's `PipelineData` **output object** as an *input object*, repeated for subsequent steps.\n",
        "\n",
-        "For this scenario you create `DataReference` objects corresponding to the datastore directories for both the input images and the classification labels (y-test values). You also create a `PipelineData` object for the batch scoring output data."
+        "For this scenario you create `Dataset` objects corresponding to the datastore directories for both the input images and the classification labels (y-test values). You also create a `PipelineData` object for the batch scoring output data."
      ]
    },
    {
@@ -127,21 +125,11 @@
      "metadata": {},
      "outputs": [],
      "source": [
-        "from azureml.data.data_reference import DataReference\n",
+        "from azureml.core.dataset import Dataset\n",
        "from azureml.pipeline.core import PipelineData\n",
        "\n",
-        "input_images = DataReference(datastore=batchscore_blob, \n",
+        "input_images = Dataset.File.from_files((batchscore_blob, \"batchscoring/images/\"))\n",
-        "                             data_reference_name=\"input_images\",\n",
+        "label_ds = Dataset.File.from_files((batchscore_blob, \"batchscoring/labels/*.txt\"))\n",
        "                             path_on_datastore=\"batchscoring/images\",\n",
        "                             mode=\"download\"\n",
        "                            )\n",
        "\n",
        "label_dir = DataReference(datastore=batchscore_blob, \n",
        "                          data_reference_name=\"input_labels\",\n",
        "                          path_on_datastore=\"batchscoring/labels\",\n",
        "                          mode=\"download\"                          \n",
        "                         )\n",
        "\n",
        "output_dir = PipelineData(name=\"scores\", \n",
        "                          datastore=def_data_store, \n",
        "                          output_path_on_compute=\"batchscoring/results\")"
@@ -150,6 +138,25 @@
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Next, we need to register the datasets with the workspace."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "input_images = input_images.register(workspace = ws, name = \"input_images\")\n",
        "label_ds = label_ds.register(workspace = ws, name = \"label_ds\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "## Download and register the model"
      ]
@@ -192,13 +199,17 @@
      "metadata": {},
      "outputs": [],
      "source": [
        "import shutil\n",
        "from azureml.core.model import Model\n",
-        " \n",
+        "\n",
        "# register downloaded model \n",
        "model = Model.register(model_path=\"models/inception_v3.ckpt\",\n",
        "                       model_name=\"inception\",\n",
        "                       tags={\"pretrained\": \"inception\"},\n",
        "                       description=\"Imagenet trained tensorflow inception\",\n",
-        "                       workspace=ws)"
+        "                       workspace=ws)\n",
        "# remove the downloaded dir after registration if you wish\n",
        "shutil.rmtree(\"models\")"
      ]
    },
    {
@@ -244,142 +255,16 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "To do the scoring, you create a batch scoring script `batch_scoring.py`, and write it to the current directory. The script takes input images, applies the classification model, and outputs the predictions to a results file.\n",
+        "To do the scoring, you create a batch scoring script `batch_scoring.py`, and write it to the current directory. The script takes a minibatch of input images, applies the classification model, and outputs the predictions to a results file.\n",
        "\n",
-        "The script `batch_scoring.py` takes the following parameters, which get passed from the `PythonScriptStep` that you create later:\n",
+        "The script `batch_scoring.py` takes the following parameters, which get passed from the `ParallelRunStep` that you create later:\n",
        "\n",
        "- `--model_name`: the name of the model being used\n",
-        "- `--label_dir` : the directory holding the `labels.txt` file \n",
+        "- `--labels_name` : the name of the `Dataset` holding the `labels.txt` file \n",
        "- `--dataset_path`: the directory containing the input images\n",
        "- `--output_dir` : the script will run the model on the data and output a `results-label.txt` to this directory\n",
        "- `--batch_size` : the batch size used in running the model\n",
        "\n",
        "The pipelines infrastructure uses the `ArgumentParser` class to pass parameters into pipeline steps. For example, in `batch_scoring.py` the first argument `--model_name` is given the property identifier `model_name`. In the `init()` function, this property is accessed using `Model.get_model_path(args.model_name)`."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "%%writefile batch_scoring.py\n",
        "\n",
        "import os\n",
        "import argparse\n",
        "import datetime\n",
        "import time\n",
        "import tensorflow as tf\n",
        "from math import ceil\n",
        "import numpy as np\n",
        "import shutil\n",
        "from tensorflow.contrib.slim.python.slim.nets import inception_v3\n",
        "from azureml.core.model import Model\n",
        "\n",
        "slim = tf.contrib.slim\n",
        "\n",
        "parser = argparse.ArgumentParser(description=\"Start a tensorflow model serving\")\n",
        "parser.add_argument('--model_name', dest=\"model_name\", required=True)\n",
        "parser.add_argument('--label_dir', dest=\"label_dir\", required=True)\n",
        "parser.add_argument('--dataset_path', dest=\"dataset_path\", required=True)\n",
        "parser.add_argument('--output_dir', dest=\"output_dir\", required=True)\n",
        "parser.add_argument('--batch_size', dest=\"batch_size\", type=int, required=True)\n",
        "\n",
        "args = parser.parse_args()\n",
        "\n",
        "image_size = 299\n",
        "num_channel = 3\n",
        "\n",
        "# create output directory if it does not exist\n",
        "os.makedirs(args.output_dir, exist_ok=True)\n",
        "\n",
        "\n",
        "def get_class_label_dict(label_file):\n",
        "    label = []\n",
        "    proto_as_ascii_lines = tf.gfile.GFile(label_file).readlines()\n",
        "    for l in proto_as_ascii_lines:\n",
        "        label.append(l.rstrip())\n",
        "    return label\n",
        "\n",
        "\n",
        "class DataIterator:\n",
        "    def __init__(self, data_dir):\n",
        "        self.file_paths = []\n",
        "        image_list = os.listdir(data_dir)\n",
        "        self.file_paths = [data_dir + '/' + file_name.rstrip() for file_name in image_list]\n",
        "\n",
        "        self.labels = [1 for file_name in self.file_paths]\n",
        "\n",
        "    @property\n",
        "    def size(self):\n",
        "        return len(self.labels)\n",
        "\n",
        "    def input_pipeline(self, batch_size):\n",
        "        images_tensor = tf.convert_to_tensor(self.file_paths, dtype=tf.string)\n",
        "        labels_tensor = tf.convert_to_tensor(self.labels, dtype=tf.int64)\n",
        "        input_queue = tf.train.slice_input_producer([images_tensor, labels_tensor], shuffle=False)\n",
        "        labels = input_queue[1]\n",
        "        images_content = tf.read_file(input_queue[0])\n",
        "\n",
        "        image_reader = tf.image.decode_jpeg(images_content, channels=num_channel, name=\"jpeg_reader\")\n",
        "        float_caster = tf.cast(image_reader, tf.float32)\n",
        "        new_size = tf.constant([image_size, image_size], dtype=tf.int32)\n",
        "        images = tf.image.resize_images(float_caster, new_size)\n",
        "        images = tf.divide(tf.subtract(images, [0]), [255])\n",
        "\n",
        "        image_batch, label_batch = tf.train.batch([images, labels], batch_size=batch_size, capacity=5 * batch_size)\n",
        "        return image_batch\n",
        "\n",
        "\n",
        "def main(_):\n",
        "    label_file_name = os.path.join(args.label_dir, \"labels.txt\")\n",
        "    label_dict = get_class_label_dict(label_file_name)\n",
        "    classes_num = len(label_dict)\n",
        "    test_feeder = DataIterator(data_dir=args.dataset_path)\n",
        "    total_size = len(test_feeder.labels)\n",
        "    count = 0\n",
        "    \n",
        "    # get model from model registry\n",
        "    model_path = Model.get_model_path(args.model_name)\n",
        "    \n",
        "    with tf.Session() as sess:\n",
        "        test_images = test_feeder.input_pipeline(batch_size=args.batch_size)\n",
        "        with slim.arg_scope(inception_v3.inception_v3_arg_scope()):\n",
        "            input_images = tf.placeholder(tf.float32, [args.batch_size, image_size, image_size, num_channel])\n",
        "            logits, _ = inception_v3.inception_v3(input_images,\n",
        "                                                  num_classes=classes_num,\n",
        "                                                  is_training=False)\n",
        "            probabilities = tf.argmax(logits, 1)\n",
        "\n",
        "        sess.run(tf.global_variables_initializer())\n",
        "        sess.run(tf.local_variables_initializer())\n",
        "        coord = tf.train.Coordinator()\n",
        "        threads = tf.train.start_queue_runners(sess=sess, coord=coord)\n",
        "        saver = tf.train.Saver()\n",
        "        saver.restore(sess, model_path)\n",
        "        out_filename = os.path.join(args.output_dir, \"result-labels.txt\")\n",
        "        with open(out_filename, \"w\") as result_file:\n",
        "            i = 0\n",
        "            while count < total_size and not coord.should_stop():\n",
        "                test_images_batch = sess.run(test_images)\n",
        "                file_names_batch = test_feeder.file_paths[i * args.batch_size:\n",
        "                                                          min(test_feeder.size, (i + 1) * args.batch_size)]\n",
        "                results = sess.run(probabilities, feed_dict={input_images: test_images_batch})\n",
        "                new_add = min(args.batch_size, total_size - count)\n",
        "                count += new_add\n",
        "                i += 1\n",
        "                for j in range(new_add):\n",
        "                    result_file.write(os.path.basename(file_names_batch[j]) + \": \" + label_dict[results[j]] + \"\\n\")\n",
        "                result_file.flush()\n",
        "            coord.request_stop()\n",
        "            coord.join(threads)\n",
        "\n",
        "        shutil.copy(out_filename, \"./outputs/\")\n",
        "\n",
        "if __name__ == \"__main__\":\n",
        "    tf.app.run()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
@@ -407,26 +292,23 @@
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core import Environment\n",
        "from azureml.core.conda_dependencies import CondaDependencies\n",
        "from azureml.core.runconfig import DEFAULT_GPU_IMAGE\n",
        "from azureml.core.runconfig import CondaDependencies, RunConfiguration\n",
        "\n",
        "cd = CondaDependencies.create(pip_packages=[\"tensorflow-gpu==1.13.1\", \"azureml-defaults\"])\n",
        "\n",
-        "amlcompute_run_config = RunConfiguration(conda_dependencies=cd)\n",
+        "env = Environment(name=\"parallelenv\")\n",
-        "amlcompute_run_config.environment.docker.enabled = True\n",
+        "env.python.conda_dependencies=cd\n",
-        "amlcompute_run_config.environment.docker.base_image = DEFAULT_GPU_IMAGE\n",
+        "env.docker.base_image = DEFAULT_GPU_IMAGE"
        "amlcompute_run_config.environment.spark.precache_packages = False"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "### Parameterize the pipeline\n",
+        "### Create the configuration to wrap the inference script\n",
-        "\n",
+        "Create the pipeline step using the script, environment configuration, and parameters. Specify the compute target you already attached to your workspace as the target of execution of the script. We will use ParallelRunStep to create the pipeline step."
        "Define a custom parameter for the pipeline to control the batch size. After the pipeline has been published and exposed via a REST endpoint, any configured parameters are also exposed and can be specified in the JSON payload when rerunning the pipeline with an HTTP request.\n",
        "\n",
        "Create a `PipelineParameter` object to enable this behavior, and define a name and default value."
      ]
    },
    {
@@ -435,8 +317,19 @@
      "metadata": {},
      "outputs": [],
      "source": [
-        "from azureml.pipeline.core.graph import PipelineParameter\n",
+        "from azureml.contrib.pipeline.steps import ParallelRunConfig\n",
-        "batch_size_param = PipelineParameter(name=\"param_batch_size\", default_value=20)"
+        "\n",
        "parallel_run_config = ParallelRunConfig(\n",
        "    environment=env,\n",
        "    entry_script=\"batch_scoring.py\",\n",
        "    source_directory=\"scripts\",\n",
        "    output_action=\"append_row\",\n",
        "    mini_batch_size=\"20\",\n",
        "    error_threshold=1,\n",
        "    compute_target=compute_target,\n",
        "    process_count_per_node=2,\n",
        "    node_count=1\n",
        ")"
      ]
    },
    {
@@ -452,7 +345,7 @@
        "* input and output data, and any custom parameters\n",
        "* reference to a script or SDK-logic to run during the step\n",
        "\n",
-        "There are multiple classes that inherit from the parent class [`PipelineStep`](https://docs.microsoft.com/python/api/azureml-pipeline-core/azureml.pipeline.core.builder.pipelinestep?view=azure-ml-py) to assist with building a step using certain frameworks and stacks. In this example, you use the [`PythonScriptStep`](https://docs.microsoft.com/python/api/azureml-pipeline-steps/azureml.pipeline.steps.python_script_step.pythonscriptstep?view=azure-ml-py) class to define your step logic using a custom python script. Note that if an argument to your script is either an input to the step or output of the step, it must be defined **both** in the `arguments` array, **as well as** in either the `input` or `output` parameter, respectively. \n",
+        "There are multiple classes that inherit from the parent class [`PipelineStep`](https://docs.microsoft.com/python/api/azureml-pipeline-core/azureml.pipeline.core.builder.pipelinestep?view=azure-ml-py) to assist with building a step using certain frameworks and stacks. In this example, you use the [`ParallelRunStep`](https://docs.microsoft.com/en-us/python/api/azureml-contrib-pipeline-steps/azureml.contrib.pipeline.steps.parallelrunstep?view=azure-ml-py) class to define your step logic using a scoring script. \n",
        "\n",
        "An object reference in the `outputs` array becomes available as an **input** for a subsequent pipeline step, for scenarios where there is more than one step."
      ]
@@ -463,20 +356,20 @@
      "metadata": {},
      "outputs": [],
      "source": [
-        "from azureml.pipeline.steps import PythonScriptStep\n",
+        "from azureml.contrib.pipeline.steps import ParallelRunStep\n",
        "from datetime import datetime\n",
        "\n",
-        "batch_score_step = PythonScriptStep(\n",
+        "parallel_step_name = \"batchscoring-\" + datetime.now().strftime(\"%Y%m%d%H%M\")\n",
-        "    name=\"batch_scoring\",\n",
+        "\n",
-        "    script_name=\"batch_scoring.py\",\n",
+        "batch_score_step = ParallelRunStep(\n",
-        "    arguments=[\"--dataset_path\", input_images, \n",
+        "    name=parallel_step_name,\n",
-        "               \"--model_name\", \"inception\",\n",
+        "    inputs=[input_images.as_named_input(\"input_images\")],\n",
-        "               \"--label_dir\", label_dir, \n",
+        "    output=output_dir,\n",
-        "               \"--output_dir\", output_dir, \n",
+        "    models=[model],\n",
-        "               \"--batch_size\", batch_size_param],\n",
+        "    arguments=[\"--model_name\", \"inception\",\n",
-        "    compute_target=compute_target,\n",
+        "               \"--labels_name\", \"label_ds\"],\n",
-        "    inputs=[input_images, label_dir],\n",
+        "    parallel_run_config=parallel_run_config,\n",
-        "    outputs=[output_dir],\n",
+        "    allow_reuse=False\n",
        "    runconfig=amlcompute_run_config\n",
        ")"
      ]
    },
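Review note on the hunk above: the new step name is a fixed prefix plus a minute-resolution timestamp. Below is a minimal sketch of that expression in isolation. `make_step_name` is a hypothetical helper introduced only for illustration; it is not part of the Azure ML SDK, and the notebook itself builds the string inline.

```python
from datetime import datetime


def make_step_name(prefix="batchscoring-", now=None):
    """Hypothetical helper mirroring the naming expression in the diff."""
    # Fall back to the current local time, as the notebook does.
    now = now or datetime.now()
    # %Y%m%d%H%M gives year, month, day, hour, minute with no separators.
    return prefix + now.strftime("%Y%m%d%H%M")


print(make_step_name(now=datetime(2020, 2, 4, 11, 32)))  # batchscoring-202002041132
```

With this format, two steps created within the same minute would get the same name, which may or may not matter for this tutorial.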
@@ -510,7 +403,7 @@
        "from azureml.pipeline.core import Pipeline\n",
        "\n",
        "pipeline = Pipeline(workspace=ws, steps=[batch_score_step])\n",
-        "pipeline_run = Experiment(ws, 'batch_scoring').submit(pipeline, pipeline_parameters={\"param_batch_size\": 20})\n",
+        "pipeline_run = Experiment(ws, \"batch_scoring\").submit(pipeline)\n",
        "pipeline_run.wait_for_completion(show_output=True)"
      ]
    },
@@ -534,14 +427,20 @@
      "metadata": {},
      "outputs": [],
      "source": [
        "batch_run = next(pipeline_run.get_children())\n",
        "batch_output = batch_run.get_output_data(\"scores\")\n",
        "batch_output.download(local_path=\"inception_results\")\n",
        "\n",
        "import pandas as pd\n",
        "for root, dirs, files in os.walk(\"inception_results\"):\n",
        "    for file in files:\n",
        "        if file.endswith(\"parallel_run_step.txt\"):\n",
        "            result_file = os.path.join(root,file)\n",
        "\n",
-        "step_run = list(pipeline_run.get_children())[0]\n",
+        "df = pd.read_csv(result_file, delimiter=\":\", header=None)\n",
        "step_run.download_file(\"./outputs/result-labels.txt\")\n",
        "\n",
        "df = pd.read_csv(\"result-labels.txt\", delimiter=\":\", header=None)\n",
        "df.columns = [\"Filename\", \"Prediction\"]\n",
-        "df.head(10)"
+        "print(\"Prediction has \", df.shape[0], \" rows\")\n",
        "df.head(10) "
      ]
    },
    {
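Review note on the hunk above: the cell walks the download directory for a file ending in `parallel_run_step.txt` and then reads it as colon-delimited text. Below is a standard-library sketch of the same two steps. The helper names are hypothetical, and the `filename: prediction` line format is an assumption inferred from the `delimiter=":"` argument and the `["Filename", "Prediction"]` column names in the diff.

```python
import csv
import os


def find_result_file(root_dir, suffix="parallel_run_step.txt"):
    """Return the first file under root_dir whose name ends with suffix, or None."""
    for root, _dirs, files in os.walk(root_dir):
        for name in sorted(files):
            if name.endswith(suffix):
                return os.path.join(root, name)
    return None


def read_predictions(path):
    """Parse colon-delimited lines into (filename, prediction) tuples."""
    rows = []
    with open(path, newline="") as handle:
        for record in csv.reader(handle, delimiter=":"):
            # Skip blank or malformed lines that lack a second field.
            if len(record) >= 2:
                rows.append((record[0].strip(), record[1].strip()))
    return rows
```

Unlike the notebook cell, which keeps the last match and leaves `result_file` undefined when nothing matches, this sketch returns the first match or `None`, so a missing output file can be handled explicitly.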
@@ -599,7 +498,7 @@
      "cell_type": "markdown",
      "metadata": {},
      "source": [
-        "Get the REST url from the `endpoint` property of the published pipeline object. You can also find the REST url in your workspace in the portal. Build an HTTP POST request to the endpoint, specifying your authentication header. Additionally, add a JSON payload object with the experiment name and the batch size parameter. As a reminder, the `param_batch_size` is passed through to your `batch_scoring.py` script because you defined it as a `PipelineParameter` object in the step configuration.\n",
+        "Get the REST URL from the `endpoint` property of the published pipeline object. You can also find the REST URL in your workspace in the portal. Build an HTTP POST request to the endpoint, specifying your authentication header. Additionally, add a JSON payload object with the experiment name and the `process_count_per_node` parameter. As a reminder, `process_count_per_node` is passed through to `ParallelRunStep` because you defined it as a `PipelineParameter` object in the step configuration.\n",
        "\n",
        "Make the request to trigger the run. Access the `Id` key from the response dict to get the value of the run id."
      ]
@@ -616,8 +515,25 @@
        "response = requests.post(rest_endpoint, \n",
        "                         headers=auth_header, \n",
        "                         json={\"ExperimentName\": \"batch_scoring\",\n",
-        "                               \"ParameterAssignments\": {\"param_batch_size\": 50}})\n",
+        "                               \"ParameterAssignments\": {\"process_count_per_node\": 6}})"
-        "run_id = response.json()[\"Id\"]"
+      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "try:\n",
        "    response.raise_for_status()\n",
        "except Exception:    \n",
        "    raise Exception(\"Received bad response from the endpoint: {}\\n\"\n",
        "                    \"Response Code: {}\\n\"\n",
        "                    \"Headers: {}\\n\"\n",
        "                    \"Content: {}\".format(rest_endpoint, response.status_code, response.headers, response.content))\n",
        "\n",
        "run_id = response.json().get('Id')\n",
        "print('Submitted pipeline run: ', run_id)"
      ]
    },
    {
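Review note on the hunk above: the new cell calls `response.raise_for_status()` and then reads the run id with `.get('Id')`, which yields `None` without complaint if the key is absent. Below is a minimal sketch of a stricter variant. `extract_run_id` and `FakeResponse` are hypothetical names; `FakeResponse` only stands in for `requests.Response` so the example runs without network access.

```python
class FakeResponse:
    """Stand-in for requests.Response, for illustration only."""

    def __init__(self, payload, status_code=200):
        self._payload = payload
        self.status_code = status_code

    def raise_for_status(self):
        # Mimic the real method: raise on 4xx/5xx status codes.
        if self.status_code >= 400:
            raise RuntimeError("HTTP error {}".format(self.status_code))

    def json(self):
        return self._payload


def extract_run_id(response):
    """Return the 'Id' value from a response, failing loudly if it is missing."""
    response.raise_for_status()
    run_id = response.json().get("Id")
    if run_id is None:
        raise ValueError("response JSON has no 'Id' key")
    return run_id


print(extract_run_id(FakeResponse({"Id": "abc123"})))  # abc123
```

In the notebook, the object passed in would be the result of `requests.post(...)`; whether a missing `Id` should raise or merely print is a design choice left to the author.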
@@ -652,7 +568,8 @@
        "\n",
        "If you used a cloud notebook server, stop the VM when you are not using it to reduce cost.\n",
        "\n",
-        "1. In your workspace, select **Notebook VMs**.\n",
+        "1. In your workspace, select **Compute**.\n",
        "1. Select the **Notebook VMs** tab in the compute page.\n",
        "1. From the list, select the VM.\n",
        "1. Select **Stop**.\n",
        "1. When you're ready to use the server again, select **Start**.\n",
@@ -683,19 +600,16 @@
        "\n",
        "See the [how-to](https://docs.microsoft.com/azure/machine-learning/service/how-to-create-your-first-pipeline?view=azure-devops) for additional detail on building pipelines with the machine learning SDK."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "authors": [
      {
-        "name": "sanpil"
+        "name": [
          "sanpil",
          "trmccorm",
          "pansav"
        ]
      }
    ],
    "kernelspec": {
--- a/tutorials/machine-learning-pipelines-advanced/tutorial-pipeline-batch-scoring-classification.yml
+++ b/tutorials/machine-learning-pipelines-advanced/tutorial-pipeline-batch-scoring-classification.yml
@@ -3,7 +3,7 @@ dependencies:
 - pip:
  - azureml-sdk
  - azureml-pipeline-core
-  - azureml-pipeline-steps
+  - azureml-contrib-pipeline-steps
  - pandas
  - requests
  - azureml-widgets
--- a/tutorials/sklearn_mnist_model.pkl
+++ b/tutorials/sklearn_mnist_model.pkl
Author	SHA1	Message	Date
Sheri Gilley	98d24243bd	add cell metadata	2020-02-04 11:32:41 -06:00
Sheri Gilley	3ee5a4c2b2	Update train-within-notebook.ipynb	2020-02-04 11:06:41 -06:00
Sheri Gilley	fd60846887	Update train-within-notebook.ipynb	2020-02-04 09:13:56 -06:00
Harneet Virk	e895d7c2bf	update samples - test (#758 ) Co-authored-by: vizhur <vizhur@live.com>	2020-01-31 15:19:58 -05:00
Shané Winner	3588eb9665	Update index.md	2020-01-23 15:46:43 -08:00
Harneet Virk	a09e726f31	update samples - test (#748 ) Co-authored-by: vizhur <vizhur@live.com>	2020-01-23 16:50:29 -05:00
Shané Winner	4fb1d9ee5b	Update index.md	2020-01-22 11:38:24 -08:00
Harneet Virk	b05ff80e9d	update samples from Release-169 as a part of 1.0.85 SDK release (#742 ) Co-authored-by: vizhur <vizhur@live.com>	2020-01-21 18:00:15 -05:00
		`@@ -0,0 +1 @@`
							{"class":"org.apache.spark.ml.classification.LogisticRegressionModel","timestamp":1570147252329,"sparkVersion":"2.4.0","uid":"LogisticRegression_5df3978caaf3","paramMap":{"regParam":0.01},"defaultParamMap":{"aggregationDepth":2,"threshold":0.5,"rawPredictionCol":"rawPrediction","featuresCol":"features","labelCol":"label","predictionCol":"prediction","family":"auto","regParam":0.0,"tol":1.0E-6,"probabilityCol":"probability","standardization":true,"elasticNetParam":0.0,"maxIter":100,"fitIntercept":true}}