mirror of
https://github.com/Azure/MachineLearningNotebooks.git
synced 2025-12-19 17:17:04 -05:00
951 lines
36 KiB
Plaintext
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License.\n",
"\n",
"# Batch Predictions for an Image Classification model trained using AutoML\n",
"In this notebook, we go over how you can use [Azure Machine Learning pipelines](https://docs.microsoft.com/en-us/azure/machine-learning/tutorial-pipeline-batch-scoring-classification) to run a batch scoring image classification job.\n",
"\n",
"**Please note:** For this notebook you can use an existing image classification model trained using AutoML for Images or use the simple model training we included below for convenience. For detailed instructions on how to train an image classification model with AutoML, please refer to the official [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-image-models) and to the [image classification multiclass notebook](https://github.com/Azure/azureml-examples/blob/main/python-sdk/tutorials/automl-with-azureml/image-classification-multiclass/auto-ml-image-classification-multiclass.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Important:** This feature is currently in public preview. This preview version is provided without a service-level agreement. Certain features might not be supported or might have constrained capabilities. For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/en-us/support/legal/preview-supplemental-terms/)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Environment Setup\n",
"Please follow the [\"Setup a new conda environment\"](https://github.com/Azure/azureml-examples/tree/main/python-sdk/tutorials/automl-with-azureml#3-setup-a-new-conda-environment) instructions to get started."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import azureml.core\n",
"\n",
"print(\"This notebook was created using version 1.35.0 of the Azure ML SDK.\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK.\")\n",
"assert (\n",
"    azureml.core.VERSION >= \"1.35\"\n",
"), \"Please upgrade the Azure ML SDK by running '!pip install --upgrade azureml-sdk' then restart the kernel.\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## You will perform the following tasks:\n",
"\n",
"* Register a Model already trained using AutoML for Image Classification.\n",
"* Create an Inference Dataset.\n",
"* Provision compute targets and create a Batch Scoring script.\n",
"* Use ParallelRunStep to do batch scoring.\n",
"* Build, run, and publish a pipeline.\n",
"* Enable a REST endpoint for the pipeline."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Workspace setup\n",
"\n",
"An [Azure ML Workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-azure-machine-learning-architecture#workspace) is an Azure resource that organizes and coordinates the actions of many other Azure resources to assist in executing and sharing machine learning workflows. In particular, an Azure ML Workspace coordinates storage, databases, and compute resources providing added functionality for machine learning experimentation, deployment, inference, and the monitoring of deployed models.\n",
"\n",
"Create an Azure ML Workspace within your Azure subscription or load an existing workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.workspace import Workspace\n",
"\n",
"ws = Workspace.from_config()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Workspace default datastore is used to store inference input images and outputs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def_data_store = ws.get_default_datastore()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Compute target setup\n",
"You will need to provide a [Compute Target](https://docs.microsoft.com/en-us/azure/machine-learning/concept-azure-machine-learning-architecture#computes) that will be used for your AutoML model training. AutoML models for image tasks require [GPU SKUs](https://docs.microsoft.com/en-us/azure/virtual-machines/sizes-gpu) such as the ones from the NC, NCv2, NCv3, ND, NDv2 and NCasT4 series. We recommend using the NCsv3-series (with v100 GPUs) for faster training. Using a compute target with a multi-GPU VM SKU will leverage the multiple GPUs to speed up training. Additionally, setting up a compute target with multiple nodes will allow for faster model training by leveraging parallelism, when tuning hyperparameters for your model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.compute import AmlCompute, ComputeTarget\n",
"\n",
"cluster_name = \"gpu-cluster-nc6\"\n",
"\n",
"try:\n",
"    compute_target = ws.compute_targets[cluster_name]\n",
"    print(\"Found existing compute target.\")\n",
"except KeyError:\n",
"    print(\"Creating a new compute target...\")\n",
"    compute_config = AmlCompute.provisioning_configuration(\n",
"        vm_size=\"Standard_NC6\",\n",
"        idle_seconds_before_scaledown=600,\n",
"        min_nodes=0,\n",
"        max_nodes=4,\n",
"    )\n",
"    compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
"# Can poll for a minimum number of nodes and for a specific timeout.\n",
"# If no min_node_count is provided, it will use the scale settings for the cluster.\n",
"compute_target.wait_for_completion(\n",
"    show_output=True, min_node_count=None, timeout_in_minutes=20\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Train an Image Classification model\n",
"\n",
"In this section we will do a quick model train to use for the batch scoring. For a detailed example of how to train an image classification model, please refer to the official [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-image-models) or to the [image classification multiclass notebook](https://github.com/Azure/azureml-examples/blob/main/python-sdk/tutorials/automl-with-azureml/image-classification-multiclass/auto-ml-image-classification-multiclass.ipynb). If you already have a model trained in the same workspace, you can skip to section [\"Create data objects\"](#Create-data-objects)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Experiment Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Experiment\n",
"\n",
"experiment_name = \"automl-image-batchscoring\"\n",
"experiment = Experiment(ws, name=experiment_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Download dataset with input Training Data\n",
"\n",
"All images in this notebook are hosted in [this repository](https://github.com/microsoft/computervision-recipes) and are made available under the [MIT license](https://github.com/microsoft/computervision-recipes/blob/master/LICENSE)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import urllib\n",
"from zipfile import ZipFile\n",
"\n",
"# download data\n",
"download_url = \"https://cvbp-secondary.z19.web.core.windows.net/datasets/image_classification/fridgeObjects.zip\"\n",
"data_file = \"./fridgeObjects.zip\"\n",
"urllib.request.urlretrieve(download_url, filename=data_file)\n",
"\n",
"# extract files\n",
"with ZipFile(data_file, \"r\") as zip:\n",
"    print(\"extracting files...\")\n",
"    zip.extractall()\n",
"    print(\"done\")\n",
"# delete zip file\n",
"os.remove(data_file)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Convert the downloaded data to JSONL"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import os\n",
"\n",
"src = \"./fridgeObjects/\"\n",
"train_validation_ratio = 5\n",
"\n",
"# Retrieving the default datastore that was automatically created when we set up the workspace\n",
"workspaceblobstore = ws.get_default_datastore().name\n",
"\n",
"# Path to the training and validation files\n",
"train_annotations_file = os.path.join(src, \"train_annotations.jsonl\")\n",
"validation_annotations_file = os.path.join(src, \"validation_annotations.jsonl\")\n",
"\n",
"# sample json line dictionary\n",
"json_line_sample = {\n",
"    \"image_url\": \"AmlDatastore://\"\n",
"    + workspaceblobstore\n",
"    + \"/\"\n",
"    + os.path.basename(os.path.dirname(src)),\n",
"    \"label\": \"\",\n",
"}\n",
"\n",
"index = 0\n",
"# Scan each subdirectory and generate a jsonl line per image\n",
"with open(train_annotations_file, \"w\") as train_f:\n",
"    with open(validation_annotations_file, \"w\") as validation_f:\n",
"        for className in os.listdir(src):\n",
"            subDir = src + className\n",
"            if not os.path.isdir(subDir):\n",
"                continue\n",
"            # Scan each subdirectory\n",
"            print(\"Parsing \" + subDir)\n",
"            for image in os.listdir(subDir):\n",
"                json_line = dict(json_line_sample)\n",
"                json_line[\"image_url\"] += f\"/{className}/{image}\"\n",
"                json_line[\"label\"] = className\n",
"\n",
"                if index % train_validation_ratio == 0:\n",
"                    # validation annotation\n",
"                    validation_f.write(json.dumps(json_line) + \"\\n\")\n",
"                else:\n",
"                    # train annotation\n",
"                    train_f.write(json.dumps(json_line) + \"\\n\")\n",
"                index += 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Upload the JSONL file and images to Datastore"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Retrieving the default datastore that was automatically created when we set up the workspace\n",
"ds = ws.get_default_datastore()\n",
"ds.upload(src_dir=\"./fridgeObjects\", target_path=\"fridgeObjects\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Create and register datasets in workspace"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Dataset\n",
"from azureml.data import DataType\n",
"\n",
"# get existing training dataset\n",
"training_dataset_name = \"fridgeObjectsTrainingDataset\"\n",
"if training_dataset_name in ws.datasets:\n",
"    training_dataset = ws.datasets.get(training_dataset_name)\n",
"    print(\"Found the training dataset\", training_dataset_name)\n",
"else:\n",
"    # create training dataset\n",
"    training_dataset = Dataset.Tabular.from_json_lines_files(\n",
"        path=ds.path(\"fridgeObjects/train_annotations.jsonl\"),\n",
"        set_column_types={\"image_url\": DataType.to_stream(ds.workspace)},\n",
"    )\n",
"    training_dataset = training_dataset.register(\n",
"        workspace=ws, name=training_dataset_name\n",
"    )\n",
"# get existing validation dataset\n",
"validation_dataset_name = \"fridgeObjectsValidationDataset\"\n",
"if validation_dataset_name in ws.datasets:\n",
"    validation_dataset = ws.datasets.get(validation_dataset_name)\n",
"    print(\"Found the validation dataset\", validation_dataset_name)\n",
"else:\n",
"    # create validation dataset\n",
"    validation_dataset = Dataset.Tabular.from_json_lines_files(\n",
"        path=ds.path(\"fridgeObjects/validation_annotations.jsonl\"),\n",
"        set_column_types={\"image_url\": DataType.to_stream(ds.workspace)},\n",
"    )\n",
"    validation_dataset = validation_dataset.register(\n",
"        workspace=ws, name=validation_dataset_name\n",
"    )\n",
"print(\"Training dataset name: \" + training_dataset.name)\n",
"print(\"Validation dataset name: \" + validation_dataset.name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Submit one training run with default hyperparameters"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.automl.core.shared.constants import ImageTask\n",
"from azureml.train.automl import AutoMLImageConfig\n",
"from azureml.train.hyperdrive import GridParameterSampling, choice\n",
"\n",
"image_config_vit = AutoMLImageConfig(\n",
"    task=ImageTask.IMAGE_CLASSIFICATION,\n",
"    compute_target=compute_target,\n",
"    training_data=training_dataset,\n",
"    validation_data=validation_dataset,\n",
"    hyperparameter_sampling=GridParameterSampling({\"model_name\": choice(\"vitb16r224\")}),\n",
"    iterations=1,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"automl_image_run = experiment.submit(image_config_vit)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"automl_image_run.wait_for_completion(wait_post_processing=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create data objects\n",
"\n",
"When building pipelines, `Dataset` objects are used for reading data from workspace datastores, and `PipelineData` objects are used for transferring intermediate data between pipeline steps.\n",
"\n",
"This batch scoring example only uses one pipeline step, but in use-cases with multiple steps, the typical flow will include:\n",
"\n",
"1. Using `Dataset` objects as inputs to fetch raw data, performing some transformations, then outputting a `PipelineData` object. \n",
"1. Using the previous step's `PipelineData` **output object** as an **input object**, repeated for subsequent steps.\n",
"\n",
"For this scenario you create `Dataset` objects corresponding to the datastore directories for the input images. You also create a `PipelineData` object for the batch scoring output data. An object reference in the `outputs` array becomes available as an **input** for a subsequent pipeline step, for scenarios where there is more than one step. In this case we are just going to build a single step pipeline.\n",
"\n",
"It is assumed that an image classification training run was already performed in this workspace and the files are already in the datastore. If this is not the case, please refer to the [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-image-models) to learn how to train an image classification model with AutoML.\n",
"\n",
"All images in this notebook are hosted in [this repository](https://github.com/microsoft/computervision-recipes) and are made available under the [MIT license](https://github.com/microsoft/computervision-recipes/blob/master/LICENSE)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.dataset import Dataset\n",
"from azureml.pipeline.core import PipelineData\n",
"\n",
"input_images = Dataset.File.from_files((def_data_store, \"fridgeObjects/**/*.jpg\"))\n",
"\n",
"output_dir = PipelineData(name=\"scores\", datastore=def_data_store)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we need to register the input datasets for batch scoring with the workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"input_images = input_images.register(\n",
"    workspace=ws, name=\"fridgeObjects_scoring_images\", create_new_version=True\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Retrieve the environment and metrics from the training run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.experiment import Experiment\n",
"from azureml.core import Run\n",
"\n",
"experiment_name = \"automl-image-batchscoring\"\n",
"# If your model was not trained with this notebook, replace the id below\n",
"# with the run id of the child training run (i.e., the one ending with _HD_0)\n",
"training_run_id = automl_image_run.id + \"_HD_0\"\n",
"exp = Experiment(ws, experiment_name)\n",
"training_run = Run(exp, training_run_id)\n",
"\n",
"# The below will give only the requested metric\n",
"metrics = training_run.get_metrics(\"accuracy\")\n",
"best_metric = max(metrics[\"accuracy\"])\n",
"print(\"best_metric:\", best_metric)\n",
"\n",
"# Retrieve the training environment\n",
"env = training_run.get_environment()\n",
"print(env)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Register model with metric and environment tags\n",
"\n",
"Now you register the model to your workspace, which allows you to easily retrieve it in the pipeline process. In the `register()` static function, the `model_name` parameter is the key you use to locate your model throughout the SDK.\n",
"Tag the model with the metrics and the environment used to train the model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.model import Model\n",
"\n",
"tags = dict()\n",
"tags[\"accuracy\"] = best_metric\n",
"tags[\"env_name\"] = env.name\n",
"tags[\"env_version\"] = env.version\n",
"\n",
"model_name = \"fridgeObjectsClassifier\"\n",
"model = training_run.register_model(\n",
"    model_name=model_name, model_path=\"train_artifacts\", tags=tags\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# List the latest models with this name from the workspace\n",
"models = Model.list(ws, name=model_name, latest=True)\n",
"for m in models:\n",
"    print(m.name)\n",
"    print(m.tags)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Write a scoring script"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To do the scoring, you create a batch scoring script `batch_scoring.py`, and write it to the scripts folder in the current directory. The script takes a minibatch of input images, applies the classification model, and outputs the predictions to a results file.\n",
"\n",
"The script `batch_scoring.py` takes the following parameters, which get passed from the `ParallelRunStep` that you create later:\n",
"\n",
"- `--model_name`: the name of the model being used\n",
"\n",
"While creating the batch scoring script, refer to the scoring scripts generated under the outputs folder of the AutoML training runs. This helps identify the right model settings to use in the `init` method of the batch scoring script when loading the model.\n",
"Note: The batch scoring script we generate in the subsequent step is different from the scoring script generated by the training runs. We refer to those scripts only to identify the right model settings to use in the batch scoring script.\n",
"\n",
""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# View the batch scoring script. Use the model settings as appropriate for your model.\n",
"with open(\"./scripts/batch_scoring.py\", \"r\") as f:\n",
"    print(f.read())"
]
},
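{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a reference, `ParallelRunStep` entry scripts follow a simple contract: an `init()` function that runs once per worker process, and a `run(mini_batch)` function that receives a list of file paths and returns one result per file (collected via the `append_row` output action). The cell below is only a minimal sketch of that contract; the model loading and scoring details are placeholders, so rely on the actual `batch_scoring.py` under `./scripts` (shown above) for the real model settings."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch of the ParallelRunStep entry-script contract only;\n",
"# the working script with the correct model settings is ./scripts/batch_scoring.py.\n",
"import argparse\n",
"import json\n",
"\n",
"\n",
"def init():\n",
"    # Runs once per worker process; parse the arguments passed by the step\n",
"    global model_path\n",
"    from azureml.core.model import Model\n",
"\n",
"    parser = argparse.ArgumentParser()\n",
"    parser.add_argument(\"--model_name\", dest=\"model_name\", required=True)\n",
"    args, _ = parser.parse_known_args()\n",
"    # Resolve the registered model folder on the compute node\n",
"    model_path = Model.get_model_path(args.model_name)\n",
"    # ... load the model from model_path here (model-specific) ...\n",
"\n",
"\n",
"def run(mini_batch):\n",
"    # Called once per mini-batch with a list of file paths; return one row per file\n",
"    results = []\n",
"    for image_file in mini_batch:\n",
"        # ... score image_file with the loaded model; a real script would\n",
"        # also include probs and labels in the row ...\n",
"        results.append(json.dumps({\"filename\": image_file}))\n",
"    return results"
]
},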
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build and run the pipeline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create the parallel-run configuration to wrap the inference script\n",
"Create the pipeline run configuration specifying the script, environment configuration, and parameters. Specify the compute target you already attached to your workspace as the target of execution of the script. This will set the run configuration of the ParallelRunStep we will define next.\n",
"\n",
"Refer to this [page](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/machine-learning-pipelines/parallel-run) for more details on the ParallelRunStep of Azure Machine Learning Pipelines."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.pipeline.steps import ParallelRunConfig\n",
"\n",
"parallel_run_config = ParallelRunConfig(\n",
"    environment=env,\n",
"    entry_script=\"batch_scoring.py\",\n",
"    source_directory=\"scripts\",\n",
"    output_action=\"append_row\",\n",
"    append_row_file_name=\"parallel_run_step.txt\",\n",
"    mini_batch_size=\"20\",  # Num files to process in one call\n",
"    error_threshold=1,\n",
"    compute_target=compute_target,\n",
"    process_count_per_node=2,\n",
"    node_count=1,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create the pipeline step\n",
"\n",
"A pipeline step is an object that encapsulates everything you need for running a pipeline including:\n",
"\n",
"* environment and dependency settings\n",
"* the compute resource to run the pipeline on\n",
"* input and output data, and any custom parameters\n",
"* reference to a script to run during the step\n",
"\n",
"There are multiple classes that inherit from the parent class [`PipelineStep`](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-steps/?view=azure-ml-py) to assist with building a step using certain frameworks and stacks. In this example, you use the [`ParallelRunStep`](https://docs.microsoft.com/en-us/python/api/azureml-contrib-pipeline-steps/azureml.contrib.pipeline.steps.parallelrunstep?view=azure-ml-py) class to define your step logic using a scoring script. `ParallelRunStep` executes the script in a distributed fashion.\n",
"\n",
"The pipelines infrastructure uses the `ArgumentParser` class to pass parameters into pipeline steps. For example, in the code below the first argument `--model_name` is given the property identifier `model_name`. In the `main()` function, this property is accessed using `Model.get_model_path(args.model_name)`.\n",
"\n",
"Note: The pipeline in this tutorial only has one step and writes the output to a file, but for multi-step pipelines, you also use `ArgumentParser` to define a directory to write output data for input to subsequent steps. See the [notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/nyc-taxi-data-regression-model-building/nyc-taxi-data-regression-model-building.ipynb) for an example of passing data between multiple pipeline steps using the `ArgumentParser` design pattern."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.pipeline.steps import ParallelRunStep\n",
"from datetime import datetime\n",
"\n",
"parallel_step_name = \"batchscoring-\" + datetime.now().strftime(\"%Y%m%d%H%M\")\n",
"\n",
"arguments = [\"--model_name\", model_name]\n",
"\n",
"# Specify inference batch_size, otherwise uses default value. (This is different from the mini_batch_size above)\n",
"# NOTE: Large batch sizes may result in OOM errors.\n",
"# arguments = arguments + [\"--batch_size\", \"20\"]\n",
"\n",
"batch_score_step = ParallelRunStep(\n",
"    name=parallel_step_name,\n",
"    inputs=[input_images.as_named_input(\"input_images\")],\n",
"    output=output_dir,\n",
"    arguments=arguments,\n",
"    parallel_run_config=parallel_run_config,\n",
"    allow_reuse=False,\n",
")"
]
},
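{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optionally, a step argument can be exposed as a `PipelineParameter` so it can be overridden per run, for example from the REST payload used later in this notebook. The cell below is an illustrative sketch and is not wired into the step above; the `batch_size` name here is only an example."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative only: expose an argument as a PipelineParameter so callers can\n",
"# override it per run. Not used by the batch_score_step defined above.\n",
"from azureml.pipeline.core import PipelineParameter\n",
"\n",
"batch_size_param = PipelineParameter(name=\"batch_size\", default_value=20)\n",
"# It could then be appended to the step arguments, e.g.:\n",
"# arguments = [\"--model_name\", model_name, \"--batch_size\", batch_size_param]"
]
},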
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For a list of all classes for different step types, see the [steps package](https://docs.microsoft.com/python/api/azureml-pipeline-steps/azureml.pipeline.steps?view=azure-ml-py)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Run the pipeline\n",
"\n",
"Now you run the pipeline. First create a `Pipeline` object with your workspace reference and the pipeline step you created. The `steps` parameter is an array of steps, and in this case, there is only one step for batch scoring. To build pipelines with multiple steps, you place the steps in order in this array.\n",
"\n",
"Next use the `Experiment.submit()` function to submit the pipeline for execution. The `wait_for_completion` function will output logs during the pipeline build process, which allows you to see current progress.\n",
"\n",
"Note: The first pipeline run takes roughly **15 minutes**, as all dependencies must be downloaded, a Docker image is created, and the Python environment is provisioned/created. Running it again takes significantly less time as those resources are reused. However, total run time depends on the workload of your scripts and processes running in each pipeline step."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Experiment\n",
"from azureml.pipeline.core import Pipeline\n",
"\n",
"pipeline = Pipeline(workspace=ws, steps=[batch_score_step])\n",
"pipeline_run = Experiment(ws, \"batch_scoring_automl_image\").submit(pipeline)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# This will output information about the pipeline run, including a link to the details page in the portal.\n",
"pipeline_run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Wait for the run to complete and show the output log in the console\n",
"pipeline_run.wait_for_completion(show_output=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download and review output"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import tempfile\n",
"import os\n",
"\n",
"batch_run = pipeline_run.find_step_run(batch_score_step.name)[0]\n",
"batch_output = batch_run.get_output_data(output_dir.name)\n",
"\n",
"target_dir = tempfile.mkdtemp()\n",
"batch_output.download(local_path=target_dir)\n",
"result_file = os.path.join(\n",
"    target_dir, batch_output.path_on_datastore, parallel_run_config.append_row_file_name\n",
")\n",
"print(result_file)\n",
"\n",
"# Print the first five lines of the output\n",
"with open(result_file) as f:\n",
"    for x in range(5):\n",
"        print(next(f))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Choose a random file for visualization"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import random\n",
"import json\n",
"\n",
"with open(result_file, \"r\") as f:\n",
"    contents = f.readlines()\n",
"rand_file = contents[random.randrange(len(contents))]\n",
"prediction = json.loads(rand_file)\n",
"print(prediction[\"filename\"])\n",
"print(prediction[\"probs\"])\n",
"print(prediction[\"labels\"])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Download the image file from the datastore\n",
"path = (\n",
"    \"fridgeObjects\"\n",
"    + \"/\"\n",
"    + prediction[\"filename\"].split(\"/\")[-2]\n",
"    + \"/\"\n",
"    + prediction[\"filename\"].split(\"/\")[-1]\n",
")\n",
"path_on_datastore = def_data_store.path(path)\n",
"single_image_ds = Dataset.File.from_files(path=path_on_datastore, validate=False)\n",
"image = single_image_ds.download()[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.image as mpimg\n",
"from PIL import Image\n",
"import numpy as np\n",
"import json\n",
"\n",
"IMAGE_SIZE = (18, 12)\n",
"plt.figure(figsize=IMAGE_SIZE)\n",
"img_np = mpimg.imread(image)\n",
"img = Image.fromarray(img_np.astype(\"uint8\"), \"RGB\")\n",
"x, y = img.size\n",
"\n",
"fig, ax = plt.subplots(1, figsize=(15, 15))\n",
"# Display the image\n",
"ax.imshow(img_np)\n",
"\n",
"label_index = np.argmax(prediction[\"probs\"])\n",
"label = prediction[\"labels\"][label_index]\n",
"conf_score = prediction[\"probs\"][label_index]\n",
"\n",
"display_text = \"{} ({})\".format(label, round(conf_score, 3))\n",
"print(display_text)\n",
"\n",
"color = \"red\"\n",
"plt.text(30, 30, display_text, color=color, fontsize=30)\n",
"\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Publish and run from REST endpoint"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run the following code to publish the pipeline to your workspace. In your workspace in the portal, you can see metadata for the pipeline including run history and durations. You can also run the pipeline manually from the portal.\n",
"\n",
"Additionally, publishing the pipeline enables a REST endpoint to rerun the pipeline from any HTTP library on any platform."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"published_pipeline = pipeline_run.publish_pipeline(\n",
"    name=\"automl-image-batch-scoring\",\n",
"    description=\"Batch scoring using Automl for Image\",\n",
"    version=\"1.0\",\n",
")\n",
"\n",
"published_pipeline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To run the pipeline from the REST endpoint, you first need an OAuth2 Bearer-type authentication header. This example uses interactive authentication for illustration purposes, but for most production scenarios requiring automated or headless authentication, use service principal authentication as [described in this notebook](https://aka.ms/pl-restep-auth).\n",
"\n",
"Service principal authentication involves creating an **App Registration** in **Azure Active Directory**, generating a client secret, and then granting your service principal **role access** to your machine learning workspace. You then use the [`ServicePrincipalAuthentication`](https://docs.microsoft.com/python/api/azureml-core/azureml.core.authentication.serviceprincipalauthentication?view=azure-ml-py) class to manage your auth flow.\n",
"\n",
"Both `InteractiveLoginAuthentication` and `ServicePrincipalAuthentication` inherit from `AbstractAuthentication`, and in both cases you use the `get_authentication_header()` function in the same way to fetch the header."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.authentication import InteractiveLoginAuthentication\n",
"\n",
"interactive_auth = InteractiveLoginAuthentication()\n",
"auth_header = interactive_auth.get_authentication_header()"
]
},
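{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, a minimal sketch of the service principal flow described above. The tenant id, client id, and client secret values are illustrative placeholders, so the cell is left commented out; uncomment it and substitute your own values to use it in place of interactive authentication."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# from azureml.core.authentication import ServicePrincipalAuthentication\n",
"\n",
"# Illustrative placeholders -- substitute your own service principal details\n",
"# sp_auth = ServicePrincipalAuthentication(\n",
"#     tenant_id=\"<tenant-id>\",\n",
"#     service_principal_id=\"<client-id>\",\n",
"#     service_principal_password=\"<client-secret>\",\n",
"# )\n",
"# auth_header = sp_auth.get_authentication_header()"
]
},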
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Get the REST url from the `endpoint` property of the published pipeline object. You can also find the REST url in your workspace in the portal. Build an HTTP POST request to the endpoint, specifying your authentication header. Additionally, add a JSON payload object with the experiment name and the batch size parameter. As a reminder, `process_count_per_node` is passed through to `ParallelRunStep` because it is defined as a `PipelineParameter` object in the step configuration.\n",
"\n",
"Make the request to trigger the run. Access the `Id` key from the response dictionary to get the value of the run id."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"\n",
"rest_endpoint = published_pipeline.endpoint\n",
"response = requests.post(\n",
"    rest_endpoint,\n",
"    headers=auth_header,\n",
"    json={\n",
"        \"ExperimentName\": \"batch_scoring\",\n",
"        \"ParameterAssignments\": {\"process_count_per_node\": 2},\n",
"    },\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
"    response.raise_for_status()\n",
"except Exception:\n",
"    raise Exception(\n",
"        \"Received bad response from the endpoint: {}\\n\"\n",
"        \"Response Code: {}\\n\"\n",
"        \"Headers: {}\\n\"\n",
"        \"Content: {}\".format(\n",
"            rest_endpoint, response.status_code, response.headers, response.content\n",
"        )\n",
"    )\n",
"run_id = response.json().get(\"Id\")\n",
"print(\"Submitted pipeline run: \", run_id)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use the run id to monitor the status of the new run. This will take another 10-15 min to run and will look similar to the previous pipeline run, so if you don't need to see another pipeline run, you can skip watching the full output."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.pipeline.core.run import PipelineRun\n",
"\n",
"published_pipeline_run = PipelineRun(ws.experiments[\"batch_scoring\"], run_id)\n",
"published_pipeline_run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Wait for the run to complete and stream the output log to the console\n",
"published_pipeline_run.wait_for_completion(show_output=True)"
]
}
],
"metadata": {
"authors": [
{
"name": [
"sanpil",
"trmccorm",
"pansav"
]
}
],
"categories": [
"tutorials"
],
"kernelspec": {
"display_name": "Python 3.6 - AzureML",
"language": "python",
"name": "python3-azureml"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
},
"metadata": {
"interpreter": {
"hash": "0f25b6eb4724eea488a4edd67dd290abce7d142c09986fc811384b5aebc0585a"
}
},
"msauthor": "trbye"
},
"nbformat": 4,
"nbformat_minor": 4
}