{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Azure ML Hardware Accelerated Models Quickstart"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This tutorial shows you how to deploy an image recognition service based on the ResNet50 classifier using the Azure Machine Learning Accelerated Models service. You can find more information about the service in our [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-accelerate-with-fpgas), [API reference](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel?view=azure-ml-py), or [forum](https://aka.ms/aml-forum).\n",
"\n",
"We will use an accelerated ResNet50 featurizer running on an FPGA. The Accelerated Models service handles translating deep neural networks (DNNs) into an FPGA program.\n",
"\n",
"For more information about using models other than ResNet50, see the [README](./README.md).\n",
"\n",
"The steps covered in this notebook are:\n",
"1. [Set up environment](#set-up-environment)\n",
"2. [Construct model](#construct-model)\n",
"    * Image Preprocessing\n",
"    * Featurizer (ResNet50)\n",
"    * Classifier\n",
"    * Save Model\n",
"3. [Register Model](#register-model)\n",
"4. [Convert into Accelerated Model](#convert-model)\n",
"5. [Create Image](#create-image)\n",
"6. [Deploy](#deploy-image)\n",
"7. [Test service](#test-service)\n",
"8. [Clean-up](#clean-up)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"set-up-environment\"></a>\n",
"## 1. Set up environment"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import tensorflow as tf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Retrieve Workspace\n",
"If you haven't created a Workspace, please follow [this notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) to do so. If you have, run the code block below to retrieve it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Workspace\n",
"\n",
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"construct-model\"></a>\n",
"## 2. Construct model\n",
"\n",
"There are three parts to the model we are deploying: pre-processing, a ResNet50 featurizer, and a classifier trained on the ImageNet dataset. We will save this complete TensorFlow model graph locally before registering it to your Azure ML Workspace.\n",
"\n",
"### 2.a. Image preprocessing\n",
"We'd like our service to accept JPEG images as input. However, the input to ResNet50 is a tensor, so we need code that decodes JPEG images and does the preprocessing required by ResNet50. The Accelerated Models service can execute TensorFlow graphs as part of the service, and we'll use that ability to do the image preprocessing. The code below defines a TensorFlow graph that preprocesses an array of JPEG images (as strings) and produces a tensor that is ready to be featurized by ResNet50.\n",
"\n",
"**Note:** Expect to see TF deprecation warnings until we port our SDK over to use TensorFlow 2.0."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Input images as a two-dimensional tensor containing an arbitrary number of images represented as strings\n",
"import azureml.accel.models.utils as utils\n",
"tf.reset_default_graph()\n",
"\n",
"in_images = tf.placeholder(tf.string)\n",
"image_tensors = utils.preprocess_array(in_images)\n",
"print(image_tensors.shape)"
]
},
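{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an optional sanity check, you can run just this preprocessing graph on a local JPEG and inspect the resulting tensor shape. This is a sketch that assumes the sample image `./snowleopardgaze.jpg` (used again in step 7) is present next to this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional: run only the preprocessing graph on one JPEG to verify the output shape\n",
"with tf.Session() as sess:\n",
"    with open(\"./snowleopardgaze.jpg\", \"rb\") as f:\n",
"        preprocessed = sess.run(image_tensors, feed_dict={in_images: [f.read()]})\n",
"print(preprocessed.shape)  # expect a 4-D batch tensor, e.g. (1, 224, 224, 3)"
]
},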
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2.b. Featurizer\n",
"We use ResNet50 as a featurizer. In this step we initialize the model, which downloads a TensorFlow checkpoint of the quantized ResNet50."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.accel.models import QuantizedResnet50\n",
"save_path = os.path.expanduser('~/models')\n",
"model_graph = QuantizedResnet50(save_path, is_frozen = True)\n",
"feature_tensor = model_graph.import_graph_def(image_tensors)\n",
"print(model_graph.version)\n",
"print(feature_tensor.name)\n",
"print(feature_tensor.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2.c. Classifier\n",
"The model we downloaded includes a classifier that takes the output of ResNet50 and predicts the image's class. This classifier was trained on the ImageNet dataset, and we are going to use it for our service. The next [notebook](./accelerated-models-training.ipynb) shows how to train a classifier for a different dataset. The input to the classifier is a tensor matching the output of our ResNet50 featurizer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"classifier_output = model_graph.get_default_classifier(feature_tensor)\n",
"print(classifier_output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2.d. Save Model\n",
"Now that we have loaded all three parts of the TensorFlow graph (preprocessor, ResNet50 featurizer, and classifier), we can save the graph and associated variables to a directory, which we can then register as an Azure ML Model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# model_name must be lowercase\n",
"model_name = \"resnet50\"\n",
"model_save_path = os.path.join(save_path, model_name)\n",
"print(\"Saving model in {}\".format(model_save_path))\n",
"\n",
"with tf.Session() as sess:\n",
"    model_graph.restore_weights(sess)\n",
"    tf.saved_model.simple_save(sess, model_save_path,\n",
"                               inputs={'images': in_images},\n",
"                               outputs={'output_alias': classifier_output})"
]
},
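{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optionally, you can verify the saved graph's inputs and outputs with TensorFlow's `saved_model_cli` tool, which ships with TensorFlow. This is just a sketch of a sanity check; `tf.saved_model.simple_save` writes the `serve` tag and `serving_default` signature used below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Inspect the SavedModel signature that simple_save produced\n",
"!saved_model_cli show --dir {model_save_path} --tag_set serve --signature_def serving_default"
]
},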
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2.e. Important! Save names of input and output tensors\n",
"\n",
"The input and output tensors created during the preprocessing and classifier steps are also going to be used when **converting the model** to an Accelerated Model that can run on FPGAs and when **making an inference request**. It is very important to save this information! You can see our defaults for all the models in the [README](./README.md).\n",
"\n",
"By default for ResNet50, these are the values you should see when running the cell below:\n",
"* input_tensors = \"Placeholder:0\"\n",
"* output_tensors = \"classifier/resnet_v1_50/predictions/Softmax:0\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"register model from file"
]
},
"outputs": [],
"source": [
"input_tensors = in_images.name\n",
"output_tensors = classifier_output.name\n",
"\n",
"print(input_tensors)\n",
"print(output_tensors)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"register-model\"></a>\n",
"## 3. Register Model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can add tags and descriptions to your models. Using tags, you can track useful information such as the name and version of the machine learning library used to train the model. Note that tags must be alphanumeric."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"register model from file"
]
},
"outputs": [],
"source": [
"from azureml.core.model import Model\n",
"\n",
"registered_model = Model.register(workspace = ws,\n",
"                                  model_path = model_save_path,\n",
"                                  model_name = model_name)\n",
"\n",
"print(\"Successfully registered: \", registered_model.name, registered_model.description, registered_model.version, sep = '\\t')"
]
},
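{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch of the tagging described above, a registration call with tags and a description might look like the following. The tag keys and values here are purely illustrative; running it would register a new version of the model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative only: register with tags and a description (tag values are examples)\n",
"registered_model = Model.register(workspace = ws,\n",
"                                  model_path = model_save_path,\n",
"                                  model_name = model_name,\n",
"                                  tags = {\"framework\": \"tensorflow\", \"architecture\": \"resnet50\"},\n",
"                                  description = \"Quantized ResNet50 featurizer with ImageNet classifier\")"
]
},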
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"convert-model\"></a>\n",
"## 4. Convert Model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For conversion, you need to provide the names of the input and output tensors, which you saved in step 2.e. above.\n",
"\n",
"**Note**: Conversion may take a while; for FPGA models it averages about 1-3 minutes, depending on the model type."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"register model from file"
]
},
"outputs": [],
"source": [
"from azureml.accel import AccelOnnxConverter\n",
"\n",
"convert_request = AccelOnnxConverter.convert_tf_model(ws, registered_model, input_tensors, output_tensors)\n",
"\n",
"if convert_request.wait_for_completion(show_output = False):\n",
"    # If the above call succeeded, get the converted model\n",
"    converted_model = convert_request.result\n",
"    print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version,\n",
"          converted_model.id, converted_model.created_time, '\\n')\n",
"else:\n",
"    print(\"Model conversion failed. Showing output.\")\n",
"    convert_request.wait_for_completion(show_output = True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"create-image\"></a>\n",
"## 5. Package the model into an Image"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can add tags and descriptions to your image. Note that an image for an FPGA model can only contain a **single** model.\n",
"\n",
"**Note**: The following command can take a few minutes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.image import Image\n",
"from azureml.accel import AccelContainerImage\n",
"\n",
"image_config = AccelContainerImage.image_configuration()\n",
"# Image name must be lowercase\n",
"image_name = \"{}-image\".format(model_name)\n",
"\n",
"image = Image.create(name = image_name,\n",
"                     models = [converted_model],\n",
"                     image_config = image_config,\n",
"                     workspace = ws)\n",
"image.wait_for_creation(show_output = False)"
]
},
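{
"cell_type": "markdown",
"metadata": {},
"source": [
"If image creation fails or you want to inspect the build, the Image object exposes its state and a link to the build log. A small sketch using those attributes:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check the image's creation state and where the build log is stored\n",
"print(image.creation_state)\n",
"print(image.image_build_log_uri)"
]
},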
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"deploy-image\"></a>\n",
"## 6. Deploy\n",
"Once you have an Azure ML Accelerated Image in your Workspace, you can deploy it to one of two destinations: a Data Box Edge machine or an AKS cluster.\n",
"\n",
"### 6.a. Data Box Edge Machine using IoT Hub\n",
"See the sample [here](https://github.com/Azure-Samples/aml-real-time-ai/) for using the Azure IoT CLI extension to deploy your Docker image to your Data Box Edge machine.\n",
"\n",
"### 6.b. Azure Kubernetes Service (AKS) using Azure ML Service\n",
"We are going to create an AKS cluster with FPGA-enabled machines, then deploy our service to it. For more information, see the [AKS official docs](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where#aks).\n",
"\n",
"#### Create AKS ComputeTarget"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.compute import AksCompute, ComputeTarget\n",
"\n",
"# Uses the specific FPGA-enabled VM (sku: Standard_PB6s)\n",
"# Standard_PB6s are available in: eastus, westus2, westeurope, southeastasia\n",
"prov_config = AksCompute.provisioning_configuration(vm_size = \"Standard_PB6s\",\n",
"                                                    agent_count = 1,\n",
"                                                    location = \"eastus\")\n",
"\n",
"aks_name = 'my-aks-pb6'\n",
"# Create the cluster\n",
"aks_target = ComputeTarget.create(workspace = ws,\n",
"                                  name = aks_name,\n",
"                                  provisioning_configuration = prov_config)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Provisioning an AKS cluster might take a while (about 15 minutes), and we want to wait until it's successfully provisioned before we deploy a service to it. If you interrupt this cell, provisioning of the cluster will continue. You can also check the status in your Workspace under Compute."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"aks_target.wait_for_completion(show_output = True)\n",
"print(aks_target.provisioning_state)\n",
"print(aks_target.provisioning_errors)"
]
},
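{
"cell_type": "markdown",
"metadata": {},
"source": [
"If your session was interrupted while the cluster was provisioning, you can re-attach to the existing cluster by name instead of creating a new one. A minimal sketch:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Retrieve an already-provisioned AKS cluster from the workspace by name\n",
"from azureml.core.compute import AksCompute\n",
"\n",
"aks_target = AksCompute(ws, aks_name)"
]
},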
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Deploy AccelContainerImage to AKS ComputeTarget"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"from azureml.core.webservice import Webservice, AksWebservice\n",
"\n",
"# Set the web service configuration (for creating a test service, we don't want autoscale enabled)\n",
"# Authentication is enabled by default, but for testing we specify False\n",
"aks_config = AksWebservice.deploy_configuration(autoscale_enabled = False,\n",
"                                                num_replicas = 1,\n",
"                                                auth_enabled = False)\n",
"\n",
"aks_service_name = 'my-aks-service-1'\n",
"\n",
"aks_service = Webservice.deploy_from_image(workspace = ws,\n",
"                                           name = aks_service_name,\n",
"                                           image = image,\n",
"                                           deployment_config = aks_config,\n",
"                                           deployment_target = aks_target)\n",
"aks_service.wait_for_deployment(show_output = True)"
]
},
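{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once deployment finishes, it can be useful to confirm that the service is healthy and note its scoring endpoint. A quick check using standard Webservice attributes:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Confirm the service is up and print its scoring endpoint\n",
"print(aks_service.state)\n",
"print(aks_service.scoring_uri)"
]
},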
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"test-service\"></a>\n",
"## 7. Test the service"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 7.a. Create Client\n",
"The image supports gRPC and the TensorFlow Serving \"predict\" API. We will create a PredictionClient from the Webservice object that can call into the Docker image to get predictions. If you do not have the Webservice object, you can also create a [PredictionClient](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel.predictionclient?view=azure-ml-py) directly.\n",
"\n",
"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice, see the documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys, and use either key as an argument to PredictionClient(..., access_token=key).\n",
"\n",
"**WARNING:** If you are running on Azure Notebooks free compute, you will not be able to make outgoing calls to your service. Try locating your client on a different machine to consume it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Using the gRPC client in the AzureML Accelerated Models SDK\n",
"from azureml.accel import client_from_service\n",
"\n",
"# Initialize the AzureML Accelerated Models client\n",
"client = client_from_service(aks_service)"
]
},
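{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you do not have the Webservice object in scope (for example, when consuming the service from another machine), you can construct the client directly. The sketch below parses the host from the scoring URI and assumes the standard gRPC ports; check the [PredictionClient reference](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel.predictionclient?view=azure-ml-py) for the exact arguments, and pass access_token only if you enabled auth."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: build a PredictionClient without the Webservice object\n",
"from azureml.accel import PredictionClient\n",
"\n",
"address = aks_service.scoring_uri\n",
"ssl_enabled = address.startswith(\"https\")\n",
"# Strip the scheme and any trailing slash to get the bare host\n",
"address = address[address.find('/') + 2:].strip('/')\n",
"port = 443 if ssl_enabled else 80\n",
"client = PredictionClient(address = address, port = port, use_ssl = ssl_enabled)"
]
},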
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can adapt the client [code](https://github.com/Azure/aml-real-time-ai/blob/master/pythonlib/amlrealtimeai/client.py) to meet your needs. There is also an example C# [client](https://github.com/Azure/aml-real-time-ai/blob/master/sample-clients/csharp).\n",
"\n",
"The service provides an API that is compatible with TensorFlow Serving. There are instructions to download a sample client [here](https://www.tensorflow.org/serving/setup)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 7.b. Serve the model\n",
"To interpret the results, we need a mapping from class IDs to the human-readable ImageNet class names."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"classes_entries = requests.get(\"https://raw.githubusercontent.com/Lasagne/Recipes/master/examples/resnet50/imagenet_classes.txt\").text.splitlines()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Score image with input and output tensor names\n",
"results = client.score_file(path=\"./snowleopardgaze.jpg\",\n",
"                            input_name=input_tensors,\n",
"                            outputs=output_tensors)\n",
"\n",
"# map results [class_id] => [confidence]\n",
"results = enumerate(results)\n",
"# sort results by confidence\n",
"sorted_results = sorted(results, key=lambda x: x[1], reverse=True)\n",
"# print top 5 results\n",
"for top in sorted_results[:5]:\n",
"    print(classes_entries[top[0]], 'confidence:', top[1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"clean-up\"></a>\n",
"## 8. Clean-up\n",
"Run the cell below to delete your webservice, image, and model (they must be deleted in that order). In the [next notebook](./accelerated-models-training.ipynb) you will learn how to train a classifier on a new dataset using transfer learning and fine-tune the weights."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"aks_service.delete()\n",
"aks_target.delete()\n",
"image.delete()\n",
"registered_model.delete()\n",
"converted_model.delete()"
]
}
],
"metadata": {
"authors": [
{
"name": "coverste"
},
{
"name": "paledger"
},
{
"name": "aibhalla"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}