update samples - test

This commit is contained in:
vizhur
2020-01-31 20:05:43 +00:00
parent 3588eb9665
commit fc5fa6530c
7 changed files with 197 additions and 962 deletions

View File

@@ -0,0 +1,83 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.
import os
import argparse
import datetime
import time
import tensorflow as tf
from math import ceil
import numpy as np
import shutil
from tensorflow.contrib.slim.python.slim.nets import inception_v3
from azureml.core import Run
from azureml.core.model import Model
from azureml.core.dataset import Dataset
slim = tf.contrib.slim
image_size = 299
num_channel = 3
def get_class_label_dict():
label = []
proto_as_ascii_lines = tf.gfile.GFile("labels.txt").readlines()
for l in proto_as_ascii_lines:
label.append(l.rstrip())
return label
def init():
global g_tf_sess, probabilities, label_dict, input_images
parser = argparse.ArgumentParser(description="Start a tensorflow model serving")
parser.add_argument('--model_name', dest="model_name", required=True)
parser.add_argument('--labels_name', dest="labels_name", required=True)
args, _ = parser.parse_known_args()
workspace = Run.get_context(allow_offline=False).experiment.workspace
label_ds = Dataset.get_by_name(workspace=workspace, name=args.labels_name)
label_ds.download(target_path='.', overwrite=True)
label_dict = get_class_label_dict()
classes_num = len(label_dict)
with slim.arg_scope(inception_v3.inception_v3_arg_scope()):
input_images = tf.placeholder(tf.float32, [1, image_size, image_size, num_channel])
logits, _ = inception_v3.inception_v3(input_images,
num_classes=classes_num,
is_training=False)
probabilities = tf.argmax(logits, 1)
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
g_tf_sess = tf.Session(config=config)
g_tf_sess.run(tf.global_variables_initializer())
g_tf_sess.run(tf.local_variables_initializer())
model_path = Model.get_model_path(args.model_name)
saver = tf.train.Saver()
saver.restore(g_tf_sess, model_path)
def file_to_tensor(file_path):
image_string = tf.read_file(file_path)
image = tf.image.decode_image(image_string, channels=3)
image.set_shape([None, None, None])
image = tf.image.resize_images(image, [image_size, image_size])
image = tf.divide(tf.subtract(image, [0]), [255])
image.set_shape([image_size, image_size, num_channel])
return image
def run(mini_batch):
result_list = []
for file_path in mini_batch:
test_image = file_to_tensor(file_path)
out = g_tf_sess.run(test_image)
result = g_tf_sess.run(probabilities, feed_dict={input_images: [out]})
result_list.append(os.path.basename(file_path) + ": " + label_dict[result[0]])
return result_list

View File

@@ -15,19 +15,16 @@
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note**: Azure Machine Learning recently released ParallelRunStep for public preview, this will allow for parallelization of your workload across many compute nodes without the difficulty of orchestrating worker pools and queues. See the [batch inference notebooks](../contrib/batch_inferencing/) for examples on how to get started."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Use Azure Machine Learning Pipelines for batch prediction\n",
"\n",
"## Note\n",
"This notebook uses public preview functionality (ParallelRunStep). Please install azureml-contrib-pipeline-steps package before running this notebook.\n",
"\n",
"\n",
"In this tutorial, you use Azure Machine Learning service pipelines to run a batch scoring image classification job. The example job uses the pre-trained [Inception-V3](https://arxiv.org/abs/1512.00567) CNN (convolutional neural network) Tensorflow model to classify unlabeled images. Machine learning pipelines optimize your workflow with speed, portability, and reuse so you can focus on your expertise, machine learning, rather than on infrastructure and automation. After building and publishing a pipeline, you can configure a REST endpoint to enable triggering the pipeline from any HTTP library on any platform.\n",
"\n",
"\n",
@@ -37,6 +34,7 @@
"> * Create data objects to fetch and output data\n",
"> * Download, prepare, and register the model to your workspace\n",
"> * Provision compute targets and create a scoring script\n",
"> * Use ParallelRunStep to do batch scoring\n",
"> * Build, run, and publish a pipeline\n",
"> * Enable a REST endpoint for the pipeline\n",
"\n",
@@ -111,14 +109,14 @@
"source": [
"## Create data objects\n",
"\n",
"When building pipelines, `DataReference` objects are used for reading data from workspace datastores, and `PipelineData` objects are used for transferring intermediate data between pipeline steps.\n",
"When building pipelines, `Dataset` objects are used for reading data from workspace datastores, and `PipelineData` objects are used for transferring intermediate data between pipeline steps.\n",
"\n",
"This batch scoring example only uses one pipeline step, but in use-cases with multiple steps, the typical flow will include:\n",
"\n",
"1. Using `DataReference` objects as **inputs** to fetch raw data, performing some transformations, then **outputting** a `PipelineData` object.\n",
"1. Using `Dataset` objects as **inputs** to fetch raw data, performing some transformations, then **outputting** a `PipelineData` object.\n",
"1. Use the previous step's `PipelineData` **output object** as an *input object*, repeated for subsequent steps.\n",
"\n",
"For this scenario you create `DataReference` objects corresponding to the datastore directories for both the input images and the classification labels (y-test values). You also create a `PipelineData` object for the batch scoring output data."
"For this scenario you create `Dataset` objects corresponding to the datastore directories for both the input images and the classification labels (y-test values). You also create a `PipelineData` object for the batch scoring output data."
]
},
{
@@ -127,21 +125,11 @@
"metadata": {},
"outputs": [],
"source": [
"from azureml.data.data_reference import DataReference\n",
"from azureml.core.dataset import Dataset\n",
"from azureml.pipeline.core import PipelineData\n",
"\n",
"input_images = DataReference(datastore=batchscore_blob, \n",
" data_reference_name=\"input_images\",\n",
" path_on_datastore=\"batchscoring/images\",\n",
" mode=\"download\"\n",
" )\n",
"\n",
"label_dir = DataReference(datastore=batchscore_blob, \n",
" data_reference_name=\"input_labels\",\n",
" path_on_datastore=\"batchscoring/labels\",\n",
" mode=\"download\" \n",
" )\n",
"\n",
"input_images = Dataset.File.from_files((batchscore_blob, \"batchscoring/images/\"))\n",
"label_ds = Dataset.File.from_files((batchscore_blob, \"batchscoring/labels/*.txt\"))\n",
"output_dir = PipelineData(name=\"scores\", \n",
" datastore=def_data_store, \n",
" output_path_on_compute=\"batchscoring/results\")"
@@ -150,6 +138,25 @@
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we need to register the datasets with the workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"input_images = input_images.register(workspace = ws, name = \"input_images\")\n",
"label_ds = label_ds.register(workspace = ws, name = \"label_ds\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"## Download and register the model"
]
@@ -192,13 +199,17 @@
"metadata": {},
"outputs": [],
"source": [
"import shutil\n",
"from azureml.core.model import Model\n",
" \n",
"\n",
"# register downloaded model \n",
"model = Model.register(model_path=\"models/inception_v3.ckpt\",\n",
" model_name=\"inception\",\n",
" tags={\"pretrained\": \"inception\"},\n",
" description=\"Imagenet trained tensorflow inception\",\n",
" workspace=ws)"
" workspace=ws)\n",
"# remove the downloaded dir after registration if you wish\n",
"shutil.rmtree(\"models\")"
]
},
{
@@ -244,142 +255,16 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To do the scoring, you create a batch scoring script `batch_scoring.py`, and write it to the current directory. The script takes input images, applies the classification model, and outputs the predictions to a results file.\n",
"To do the scoring, you create a batch scoring script `batch_scoring.py`, and write it to the current directory. The script takes a minibatch of input images, applies the classification model, and outputs the predictions to a results file.\n",
"\n",
"The script `batch_scoring.py` takes the following parameters, which get passed from the `PythonScriptStep` that you create later:\n",
"The script `batch_scoring.py` takes the following parameters, which get passed from the `ParallelRunStep` that you create later:\n",
"\n",
"- `--model_name`: the name of the model being used\n",
"- `--label_dir` : the directory holding the `labels.txt` file \n",
"- `--dataset_path`: the directory containing the input images\n",
"- `--output_dir` : the script will run the model on the data and output a `results-label.txt` to this directory\n",
"- `--batch_size` : the batch size used in running the model\n",
"- `--labels_name` : the name of the `Dataset` holding the `labels.txt` file \n",
"\n",
"The pipelines infrastructure uses the `ArgumentParser` class to pass parameters into pipeline steps. For example, in the code below the first argument `--model_name` is given the property identifier `model_name`. In the `main()` function, this property is accessed using `Model.get_model_path(args.model_name)`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile batch_scoring.py\n",
"\n",
"import os\n",
"import argparse\n",
"import datetime\n",
"import time\n",
"import tensorflow as tf\n",
"from math import ceil\n",
"import numpy as np\n",
"import shutil\n",
"from tensorflow.contrib.slim.python.slim.nets import inception_v3\n",
"from azureml.core.model import Model\n",
"\n",
"slim = tf.contrib.slim\n",
"\n",
"parser = argparse.ArgumentParser(description=\"Start a tensorflow model serving\")\n",
"parser.add_argument('--model_name', dest=\"model_name\", required=True)\n",
"parser.add_argument('--label_dir', dest=\"label_dir\", required=True)\n",
"parser.add_argument('--dataset_path', dest=\"dataset_path\", required=True)\n",
"parser.add_argument('--output_dir', dest=\"output_dir\", required=True)\n",
"parser.add_argument('--batch_size', dest=\"batch_size\", type=int, required=True)\n",
"\n",
"args = parser.parse_args()\n",
"\n",
"image_size = 299\n",
"num_channel = 3\n",
"\n",
"# create output directory if it does not exist\n",
"os.makedirs(args.output_dir, exist_ok=True)\n",
"\n",
"\n",
"def get_class_label_dict(label_file):\n",
" label = []\n",
" proto_as_ascii_lines = tf.gfile.GFile(label_file).readlines()\n",
" for l in proto_as_ascii_lines:\n",
" label.append(l.rstrip())\n",
" return label\n",
"\n",
"\n",
"class DataIterator:\n",
" def __init__(self, data_dir):\n",
" self.file_paths = []\n",
" image_list = os.listdir(data_dir)\n",
" self.file_paths = [data_dir + '/' + file_name.rstrip() for file_name in image_list]\n",
"\n",
" self.labels = [1 for file_name in self.file_paths]\n",
"\n",
" @property\n",
" def size(self):\n",
" return len(self.labels)\n",
"\n",
" def input_pipeline(self, batch_size):\n",
" images_tensor = tf.convert_to_tensor(self.file_paths, dtype=tf.string)\n",
" labels_tensor = tf.convert_to_tensor(self.labels, dtype=tf.int64)\n",
" input_queue = tf.train.slice_input_producer([images_tensor, labels_tensor], shuffle=False)\n",
" labels = input_queue[1]\n",
" images_content = tf.read_file(input_queue[0])\n",
"\n",
" image_reader = tf.image.decode_jpeg(images_content, channels=num_channel, name=\"jpeg_reader\")\n",
" float_caster = tf.cast(image_reader, tf.float32)\n",
" new_size = tf.constant([image_size, image_size], dtype=tf.int32)\n",
" images = tf.image.resize_images(float_caster, new_size)\n",
" images = tf.divide(tf.subtract(images, [0]), [255])\n",
"\n",
" image_batch, label_batch = tf.train.batch([images, labels], batch_size=batch_size, capacity=5 * batch_size)\n",
" return image_batch\n",
"\n",
"\n",
"def main(_):\n",
" label_file_name = os.path.join(args.label_dir, \"labels.txt\")\n",
" label_dict = get_class_label_dict(label_file_name)\n",
" classes_num = len(label_dict)\n",
" test_feeder = DataIterator(data_dir=args.dataset_path)\n",
" total_size = len(test_feeder.labels)\n",
" count = 0\n",
" \n",
" # get model from model registry\n",
" model_path = Model.get_model_path(args.model_name)\n",
" \n",
" with tf.Session() as sess:\n",
" test_images = test_feeder.input_pipeline(batch_size=args.batch_size)\n",
" with slim.arg_scope(inception_v3.inception_v3_arg_scope()):\n",
" input_images = tf.placeholder(tf.float32, [args.batch_size, image_size, image_size, num_channel])\n",
" logits, _ = inception_v3.inception_v3(input_images,\n",
" num_classes=classes_num,\n",
" is_training=False)\n",
" probabilities = tf.argmax(logits, 1)\n",
"\n",
" sess.run(tf.global_variables_initializer())\n",
" sess.run(tf.local_variables_initializer())\n",
" coord = tf.train.Coordinator()\n",
" threads = tf.train.start_queue_runners(sess=sess, coord=coord)\n",
" saver = tf.train.Saver()\n",
" saver.restore(sess, model_path)\n",
" out_filename = os.path.join(args.output_dir, \"result-labels.txt\")\n",
" with open(out_filename, \"w\") as result_file:\n",
" i = 0\n",
" while count < total_size and not coord.should_stop():\n",
" test_images_batch = sess.run(test_images)\n",
" file_names_batch = test_feeder.file_paths[i * args.batch_size:\n",
" min(test_feeder.size, (i + 1) * args.batch_size)]\n",
" results = sess.run(probabilities, feed_dict={input_images: test_images_batch})\n",
" new_add = min(args.batch_size, total_size - count)\n",
" count += new_add\n",
" i += 1\n",
" for j in range(new_add):\n",
" result_file.write(os.path.basename(file_names_batch[j]) + \": \" + label_dict[results[j]] + \"\\n\")\n",
" result_file.flush()\n",
" coord.request_stop()\n",
" coord.join(threads)\n",
"\n",
" shutil.copy(out_filename, \"./outputs/\")\n",
"\n",
"if __name__ == \"__main__\":\n",
" tf.app.run()"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -407,26 +292,23 @@
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Environment\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"from azureml.core.runconfig import DEFAULT_GPU_IMAGE\n",
"from azureml.core.runconfig import CondaDependencies, RunConfiguration\n",
"\n",
"cd = CondaDependencies.create(pip_packages=[\"tensorflow-gpu==1.13.1\", \"azureml-defaults\"])\n",
"\n",
"amlcompute_run_config = RunConfiguration(conda_dependencies=cd)\n",
"amlcompute_run_config.environment.docker.enabled = True\n",
"amlcompute_run_config.environment.docker.base_image = DEFAULT_GPU_IMAGE\n",
"amlcompute_run_config.environment.spark.precache_packages = False"
"env = Environment(name=\"parallelenv\")\n",
"env.python.conda_dependencies=cd\n",
"env.docker.base_image = DEFAULT_GPU_IMAGE"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Parameterize the pipeline\n",
"\n",
"Define a custom parameter for the pipeline to control the batch size. After the pipeline has been published and exposed via a REST endpoint, any configured parameters are also exposed and can be specified in the JSON payload when rerunning the pipeline with an HTTP request.\n",
"\n",
"Create a `PipelineParameter` object to enable this behavior, and define a name and default value."
"### Create the configuration to wrap the inference script\n",
"Create the pipeline step using the script, environment configuration, and parameters. Specify the compute target you already attached to your workspace as the target of execution of the script. We will use PythonScriptStep to create the pipeline step."
]
},
{
@@ -435,8 +317,19 @@
"metadata": {},
"outputs": [],
"source": [
"from azureml.pipeline.core.graph import PipelineParameter\n",
"batch_size_param = PipelineParameter(name=\"param_batch_size\", default_value=20)"
"from azureml.contrib.pipeline.steps import ParallelRunConfig\n",
"\n",
"parallel_run_config = ParallelRunConfig(\n",
" environment=env,\n",
" entry_script=\"batch_scoring.py\",\n",
" source_directory=\"scripts\",\n",
" output_action=\"append_row\",\n",
" mini_batch_size=\"20\",\n",
" error_threshold=1,\n",
" compute_target=compute_target,\n",
" process_count_per_node=2,\n",
" node_count=1\n",
")"
]
},
{
@@ -452,7 +345,7 @@
"* input and output data, and any custom parameters\n",
"* reference to a script or SDK-logic to run during the step\n",
"\n",
"There are multiple classes that inherit from the parent class [`PipelineStep`](https://docs.microsoft.com/python/api/azureml-pipeline-core/azureml.pipeline.core.builder.pipelinestep?view=azure-ml-py) to assist with building a step using certain frameworks and stacks. In this example, you use the [`PythonScriptStep`](https://docs.microsoft.com/python/api/azureml-pipeline-steps/azureml.pipeline.steps.python_script_step.pythonscriptstep?view=azure-ml-py) class to define your step logic using a custom python script. Note that if an argument to your script is either an input to the step or output of the step, it must be defined **both** in the `arguments` array, **as well as** in either the `input` or `output` parameter, respectively. \n",
"There are multiple classes that inherit from the parent class [`PipelineStep`](https://docs.microsoft.com/python/api/azureml-pipeline-core/azureml.pipeline.core.builder.pipelinestep?view=azure-ml-py) to assist with building a step using certain frameworks and stacks. In this example, you use the [`ParallelRunStep`](https://docs.microsoft.com/en-us/python/api/azureml-contrib-pipeline-steps/azureml.contrib.pipeline.steps.parallelrunstep?view=azure-ml-py) class to define your step logic using a scoring script. \n",
"\n",
"An object reference in the `outputs` array becomes available as an **input** for a subsequent pipeline step, for scenarios where there is more than one step."
]
@@ -463,20 +356,20 @@
"metadata": {},
"outputs": [],
"source": [
"from azureml.pipeline.steps import PythonScriptStep\n",
"from azureml.contrib.pipeline.steps import ParallelRunStep\n",
"from datetime import datetime\n",
"\n",
"batch_score_step = PythonScriptStep(\n",
" name=\"batch_scoring\",\n",
" script_name=\"batch_scoring.py\",\n",
" arguments=[\"--dataset_path\", input_images, \n",
" \"--model_name\", \"inception\",\n",
" \"--label_dir\", label_dir, \n",
" \"--output_dir\", output_dir, \n",
" \"--batch_size\", batch_size_param],\n",
" compute_target=compute_target,\n",
" inputs=[input_images, label_dir],\n",
" outputs=[output_dir],\n",
" runconfig=amlcompute_run_config\n",
"parallel_step_name = \"batchscoring-\" + datetime.now().strftime(\"%Y%m%d%H%M\")\n",
"\n",
"batch_score_step = ParallelRunStep(\n",
" name=parallel_step_name,\n",
" inputs=[input_images.as_named_input(\"input_images\")],\n",
" output=output_dir,\n",
" models=[model],\n",
" arguments=[\"--model_name\", \"inception\",\n",
" \"--labels_name\", \"label_ds\"],\n",
" parallel_run_config=parallel_run_config,\n",
" allow_reuse=False\n",
")"
]
},
@@ -510,7 +403,7 @@
"from azureml.pipeline.core import Pipeline\n",
"\n",
"pipeline = Pipeline(workspace=ws, steps=[batch_score_step])\n",
"pipeline_run = Experiment(ws, 'batch_scoring').submit(pipeline, pipeline_parameters={\"param_batch_size\": 20})\n",
"pipeline_run = Experiment(ws, \"batch_scoring\").submit(pipeline)\n",
"pipeline_run.wait_for_completion(show_output=True)"
]
},
@@ -534,14 +427,20 @@
"metadata": {},
"outputs": [],
"source": [
"batch_run = next(pipeline_run.get_children())\n",
"batch_output = batch_run.get_output_data(\"scores\")\n",
"batch_output.download(local_path=\"inception_results\")\n",
"\n",
"import pandas as pd\n",
"for root, dirs, files in os.walk(\"inception_results\"):\n",
" for file in files:\n",
" if file.endswith(\"parallel_run_step.txt\"):\n",
" result_file = os.path.join(root,file)\n",
"\n",
"step_run = list(pipeline_run.get_children())[0]\n",
"step_run.download_file(\"./outputs/result-labels.txt\")\n",
"\n",
"df = pd.read_csv(\"result-labels.txt\", delimiter=\":\", header=None)\n",
"df = pd.read_csv(result_file, delimiter=\":\", header=None)\n",
"df.columns = [\"Filename\", \"Prediction\"]\n",
"df.head(10)"
"print(\"Prediction has \", df.shape[0], \" rows\")\n",
"df.head(10) "
]
},
{
@@ -599,7 +498,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Get the REST url from the `endpoint` property of the published pipeline object. You can also find the REST url in your workspace in the portal. Build an HTTP POST request to the endpoint, specifying your authentication header. Additionally, add a JSON payload object with the experiment name and the batch size parameter. As a reminder, the `param_batch_size` is passed through to your `batch_scoring.py` script because you defined it as a `PipelineParameter` object in the step configuration.\n",
"Get the REST url from the `endpoint` property of the published pipeline object. You can also find the REST url in your workspace in the portal. Build an HTTP POST request to the endpoint, specifying your authentication header. Additionally, add a JSON payload object with the experiment name and the batch size parameter. As a reminder, the `process_count_per_node` is passed through to `ParallelRunStep` because you defined it is defined as a `PipelineParameter` object in the step configuration.\n",
"\n",
"Make the request to trigger the run. Access the `Id` key from the response dict to get the value of the run id."
]
@@ -616,8 +515,25 @@
"response = requests.post(rest_endpoint, \n",
" headers=auth_header, \n",
" json={\"ExperimentName\": \"batch_scoring\",\n",
" \"ParameterAssignments\": {\"param_batch_size\": 50}})\n",
"run_id = response.json()[\"Id\"]"
" \"ParameterAssignments\": {\"process_count_per_node\": 6}})"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" response.raise_for_status()\n",
"except Exception: \n",
" raise Exception(\"Received bad response from the endpoint: {}\\n\"\n",
" \"Response Code: {}\\n\"\n",
" \"Headers: {}\\n\"\n",
" \"Content: {}\".format(rest_endpoint, response.status_code, response.headers, response.content))\n",
"\n",
"run_id = response.json().get('Id')\n",
"print('Submitted pipeline run: ', run_id)"
]
},
{
@@ -652,7 +568,8 @@
"\n",
"If you used a cloud notebook server, stop the VM when you are not using it to reduce cost.\n",
"\n",
"1. In your workspace, select **Notebook VMs**.\n",
"1. In your workspace, select **Compute**.\n",
"1. Select the **Notebook VMs** tab in the compute page.\n",
"1. From the list, select the VM.\n",
"1. Select **Stop**.\n",
"1. When you're ready to use the server again, select **Start**.\n",
@@ -683,19 +600,16 @@
"\n",
"See the [how-to](https://docs.microsoft.com/azure/machine-learning/service/how-to-create-your-first-pipeline?view=azure-devops) for additional detail on building pipelines with the machine learning SDK."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"authors": [
{
"name": "sanpil"
"name": [
"sanpil",
"trmccorm",
"pansav"
]
}
],
"kernelspec": {

View File

@@ -3,7 +3,7 @@ dependencies:
- pip:
- azureml-sdk
- azureml-pipeline-core
- azureml-pipeline-steps
- azureml-contrib-pipeline-steps
- pandas
- requests
- azureml-widgets