MachineLearningNotebooks/how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb

{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Copyright (c) Microsoft Corporation. All rights reserved.\n",
        "\n",
        "Licensed under the MIT License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.png)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Deploying a web service to Azure Kubernetes Service (AKS)\n",
        "This notebook shows the steps for deploying a service: registering a model, creating an image, provisioning a cluster (one time action), and deploying a service to it. \n",
        "We then test and delete the service, image and model."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core import Workspace\n",
        "from azureml.core.compute import AksCompute, ComputeTarget\n",
        "from azureml.core.webservice import Webservice, AksWebservice\n",
        "from azureml.core.model import Model"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import azureml.core\n",
        "print(azureml.core.VERSION)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Get workspace\n",
        "Load existing workspace from the config file info."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.workspace import Workspace\n",
        "\n",
        "ws = Workspace.from_config()\n",
        "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Register the model\n",
        "Register an existing trained model, add descirption and tags."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "#Register the model\n",
        "from azureml.core.model import Model\n",
        "model = Model.register(model_path = \"sklearn_regression_model.pkl\", # this points to a local file\n",
        "                       model_name = \"sklearn_regression_model.pkl\", # this is the name the model is registered as\n",
        "                       tags = {'area': \"diabetes\", 'type': \"regression\"},\n",
        "                       description = \"Ridge regression model to predict diabetes\",\n",
        "                       workspace = ws)\n",
        "\n",
        "print(model.name, model.description, model.version)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Create the Environment\n",
        "Create an environment that the model will be deployed with"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core import Environment\n",
        "from azureml.core.conda_dependencies import CondaDependencies \n",
        "\n",
        "conda_deps = CondaDependencies.create(conda_packages=['numpy','scikit-learn'], pip_packages=['azureml-defaults'])\n",
        "myenv = Environment(name='myenv')\n",
        "myenv.python.conda_dependencies = conda_deps"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "#### Use a custom Docker image\n",
        "\n",
        "You can also specify a custom Docker image to be used as base image if you don't want to use the default base image provided by Azure ML. Please make sure the custom Docker image has Ubuntu >= 16.04, Conda >= 4.5.\\* and Python(3.5.\\* or 3.6.\\*).\n",
        "\n",
        "Only supported with `python` runtime.\n",
        "```python\n",
        "# use an image available in public Container Registry without authentication\n",
        "myenv.docker.base_image = \"mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda\"\n",
        "\n",
        "# or, use an image available in a private Container Registry\n",
        "myenv.docker.base_image = \"myregistry.azurecr.io/mycustomimage:1.0\"\n",
        "myenv.docker.base_image_registry.address = \"myregistry.azurecr.io\"\n",
        "myenv.docker.base_image_registry.username = \"username\"\n",
        "myenv.docker.base_image_registry.password = \"password\"\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Write the Entry Script\n",
        "Write the script that will be used to predict on your model"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "%%writefile score.py\n",
        "import os\n",
        "import pickle\n",
        "import json\n",
        "import numpy\n",
        "from sklearn.externals import joblib\n",
        "from sklearn.linear_model import Ridge\n",
        "\n",
        "def init():\n",
        "    global model\n",
        "    # AZUREML_MODEL_DIR is an environment variable created during deployment.\n",
        "    # It is the path to the model folder (./azureml-models/$MODEL_NAME/$VERSION)\n",
        "    # For multiple models, it points to the folder containing all deployed models (./azureml-models)\n",
        "    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'sklearn_regression_model.pkl')\n",
        "    # deserialize the model file back into a sklearn model\n",
        "    model = joblib.load(model_path)\n",
        "\n",
        "# note you can pass in multiple rows for scoring\n",
        "def run(raw_data):\n",
        "    try:\n",
        "        data = json.loads(raw_data)['data']\n",
        "        data = numpy.array(data)\n",
        "        result = model.predict(data)\n",
        "        # you can return any data type as long as it is JSON-serializable\n",
        "        return result.tolist()\n",
        "    except Exception as e:\n",
        "        error = str(e)\n",
        "        return error"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Create the InferenceConfig\n",
        "Create the inference config that will be used when deploying the model"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.model import InferenceConfig\n",
        "\n",
        "inf_config = InferenceConfig(entry_script='score.py', environment=myenv)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Model Profiling\n",
        "\n",
        "Profile your model to understand how much CPU and memory the service, created as a result of its deployment, will need. Profiling returns information such as CPU usage, memory usage, and response latency. It also provides a CPU and memory recommendation based on the resource usage. You can profile your model (or more precisely the service built based on your model) on any CPU and/or memory combination where 0.1 <= CPU <= 3.5 and 0.1GB <= memory <= 15GB. If you do not provide a CPU and/or memory requirement, we will test it on the default configuration of 3.5 CPU and 15GB memory.\n",
        "\n",
        "In order to profile your model you will need:\n",
        "- a registered model\n",
        "- an entry script\n",
        "- an inference configuration\n",
        "- a single column tabular dataset, where each row contains a string representing sample request data sent to the service.\n",
        "\n",
        "Please, note that profiling is a long running operation and can take up to 25 minutes depending on the size of the dataset.\n",
        "\n",
        "At this point we only support profiling of services that expect their request data to be a string, for example: string serialized json, text, string serialized image, etc. The content of each row of the dataset (string) will be put into the body of the HTTP request and sent to the service encapsulating the model for scoring.\n",
        "\n",
        "Below is an example of how you can construct an input dataset to profile a service which expects its incoming requests to contain serialized json. In this case we created a dataset based one hundred instances of the same request data. In real world scenarios however, we suggest that you use larger datasets with various inputs, especially if your model resource usage/behavior is input dependent."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You may want to register datasets using the register() method to your workspace so they can be shared with others, reused and referred to by name in your script.\n",
        "You can try get the dataset first to see if it's already registered."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import json\n",
        "from azureml.core import Datastore\n",
        "from azureml.core.dataset import Dataset\n",
        "from azureml.data import dataset_type_definitions\n",
        "\n",
        "dataset_name='sample_request_data'\n",
        "\n",
        "dataset_registered = False\n",
        "try:\n",
        "    sample_request_data = Dataset.get_by_name(workspace = ws, name = dataset_name)\n",
        "    dataset_registered = True\n",
        "except:\n",
        "    print(\"The dataset {} is not registered in workspace yet.\".format(dataset_name))\n",
        "\n",
        "if not dataset_registered:\n",
        "    input_json = {'data': [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],\n",
        "                        [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]]}\n",
        "    # create a string that can be put in the body of the request\n",
        "    serialized_input_json = json.dumps(input_json)\n",
        "    dataset_content = []\n",
        "    for i in range(100):\n",
        "        dataset_content.append(serialized_input_json)\n",
        "    sample_request_data = '\\n'.join(dataset_content)\n",
        "    file_name = \"{}.txt\".format(dataset_name)\n",
        "    f = open(file_name, 'w')\n",
        "    f.write(sample_request_data)\n",
        "    f.close()\n",
        "\n",
        "    # upload the txt file created above to the Datastore and create a dataset from it\n",
        "    data_store = Datastore.get_default(ws)\n",
        "    data_store.upload_files(['./' + file_name], target_path='sample_request_data')\n",
        "    datastore_path = [(data_store, 'sample_request_data' +'/' + file_name)]\n",
        "    sample_request_data = Dataset.Tabular.from_delimited_files(\n",
        "        datastore_path,\n",
        "        separator='\\n',\n",
        "        infer_column_types=True,\n",
        "        header=dataset_type_definitions.PromoteHeadersBehavior.NO_HEADERS)\n",
        "    sample_request_data = sample_request_data.register(workspace=ws,\n",
        "                                                    name=dataset_name,\n",
        "                                                    create_new_version=True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Now that we have an input dataset we are ready to go ahead with profiling. In this case we are testing the previously introduced sklearn regression model on 1 CPU and 0.5 GB memory. The memory usage and recommendation presented in the result is measured in Gigabytes. The CPU usage and recommendation is measured in CPU cores."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from datetime import datetime\n",
        "from azureml.core import Environment\n",
        "from azureml.core.conda_dependencies import CondaDependencies\n",
        "from azureml.core.model import Model, InferenceConfig\n",
        "\n",
        "\n",
        "environment = Environment('my-sklearn-environment')\n",
        "environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[\n",
        "    'azureml-defaults',\n",
        "    'inference-schema[numpy-support]',\n",
        "    'joblib',\n",
        "    'numpy',\n",
        "    'scikit-learn'\n",
        "])\n",
        "inference_config = InferenceConfig(entry_script='score.py', environment=environment)\n",
        "# if cpu and memory_in_gb parameters are not provided\n",
        "# the model will be profiled on default configuration of\n",
        "# 3.5CPU and 15GB memory\n",
        "profile = Model.profile(ws,\n",
        "            'sklearn-%s' % datetime.now().strftime('%m%d%Y-%H%M%S'),\n",
        "            [model],\n",
        "            inference_config,\n",
        "            input_dataset=sample_request_data,\n",
        "            cpu=1.0,\n",
        "            memory_in_gb=0.5)\n",
        "\n",
        "# profiling is a long running operation and may take up to 25 min\n",
        "profile.wait_for_completion(True)\n",
        "details = profile.get_details()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Provision the AKS Cluster\n",
        "This is a one time setup. You can reuse this cluster for multiple deployments after it has been created. If you delete the cluster or the resource group that contains it, then you would have to recreate it."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Use the default configuration (can also provide parameters to customize)\n",
        "prov_config = AksCompute.provisioning_configuration()\n",
        "\n",
        "aks_name = 'my-aks-9' \n",
        "# Create the cluster\n",
        "aks_target = ComputeTarget.create(workspace = ws, \n",
        "                                  name = aks_name, \n",
        "                                  provisioning_configuration = prov_config)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Create AKS Cluster in an existing virtual network (optional)\n",
        "See code snippet below. Check the documentation [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-enable-virtual-network#use-azure-kubernetes-service) for more details."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# from azureml.core.compute import ComputeTarget, AksCompute\n",
        "\n",
        "# # Create the compute configuration and set virtual network information\n",
        "# config = AksCompute.provisioning_configuration(location=\"eastus2\")\n",
        "# config.vnet_resourcegroup_name = \"mygroup\"\n",
        "# config.vnet_name = \"mynetwork\"\n",
        "# config.subnet_name = \"default\"\n",
        "# config.service_cidr = \"10.0.0.0/16\"\n",
        "# config.dns_service_ip = \"10.0.0.10\"\n",
        "# config.docker_bridge_cidr = \"172.17.0.1/16\"\n",
        "\n",
        "# # Create the compute target\n",
        "# aks_target = ComputeTarget.create(workspace = ws,\n",
        "#                                   name = \"myaks\",\n",
        "#                                   provisioning_configuration = config)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Enable SSL on the AKS Cluster (optional)\n",
        "See code snippet below. Check the documentation [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-secure-web-service) for more details"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# provisioning_config = AksCompute.provisioning_configuration(ssl_cert_pem_file=\"cert.pem\", ssl_key_pem_file=\"key.pem\", ssl_cname=\"www.contoso.com\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "%%time\n",
        "aks_target.wait_for_completion(show_output = True)\n",
        "print(aks_target.provisioning_state)\n",
        "print(aks_target.provisioning_errors)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Optional step: Attach existing AKS cluster\n",
        "\n",
        "If you have existing AKS cluster in your Azure subscription, you can attach it to the Workspace."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# # Use the default configuration (can also provide parameters to customize)\n",
        "# resource_id = '/subscriptions/92c76a2f-0e1c-4216-b65e-abf7a3f34c1e/resourcegroups/raymondsdk0604/providers/Microsoft.ContainerService/managedClusters/my-aks-0605d37425356b7d01'\n",
        "\n",
        "# create_name='my-existing-aks' \n",
        "# # Create the cluster\n",
        "# attach_config = AksCompute.attach_configuration(resource_id=resource_id)\n",
        "# aks_target = ComputeTarget.attach(workspace=ws, name=create_name, attach_configuration=attach_config)\n",
        "# # Wait for the operation to complete\n",
        "# aks_target.wait_for_completion(True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Deploy web service to AKS"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "sample-deploy-to-aks"
        ]
      },
      "outputs": [],
      "source": [
        "# Set the web service configuration (using default here)\n",
        "aks_config = AksWebservice.deploy_configuration()\n",
        "\n",
        "# # Enable token auth and disable (key) auth on the webservice\n",
        "# aks_config = AksWebservice.deploy_configuration(token_auth_enabled=True, auth_enabled=False)\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": [
          "sample-deploy-to-aks"
        ]
      },
      "outputs": [],
      "source": [
        "%%time\n",
        "aks_service_name ='aks-service-1'\n",
        "\n",
        "aks_service = Model.deploy(workspace=ws,\n",
        "                           name=aks_service_name,\n",
        "                           models=[model],\n",
        "                           inference_config=inf_config,\n",
        "                           deployment_config=aks_config,\n",
        "                           deployment_target=aks_target)\n",
        "\n",
        "aks_service.wait_for_deployment(show_output = True)\n",
        "print(aks_service.state)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Test the web service using run method\n",
        "We test the web sevice by passing data.\n",
        "Run() method retrieves API keys behind the scenes to make sure that call is authenticated."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "%%time\n",
        "import json\n",
        "\n",
        "test_sample = json.dumps({'data': [\n",
        "    [1,2,3,4,5,6,7,8,9,10], \n",
        "    [10,9,8,7,6,5,4,3,2,1]\n",
        "]})\n",
        "test_sample = bytes(test_sample,encoding = 'utf8')\n",
        "\n",
        "prediction = aks_service.run(input_data = test_sample)\n",
        "print(prediction)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Test the web service using raw HTTP request (optional)\n",
        "Alternatively you can construct a raw HTTP request and send it to the service. In this case you need to explicitly pass the HTTP header. This process is shown in the next 2 cells."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# # if (key) auth is enabled, retrieve the API keys. AML generates two keys.\n",
        "# key1, Key2 = aks_service.get_keys()\n",
        "# print(key1)\n",
        "\n",
        "# # if token auth is enabled, retrieve the token.\n",
        "# access_token, refresh_after = aks_service.get_token()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# construct raw HTTP request and send to the service\n",
        "# %%time\n",
        "\n",
        "# import requests\n",
        "\n",
        "# import json\n",
        "\n",
        "# test_sample = json.dumps({'data': [\n",
        "#     [1,2,3,4,5,6,7,8,9,10], \n",
        "#     [10,9,8,7,6,5,4,3,2,1]\n",
        "# ]})\n",
        "# test_sample = bytes(test_sample,encoding = 'utf8')\n",
        "\n",
        "# # If (key) auth is enabled, don't forget to add key to the HTTP header.\n",
        "# headers = {'Content-Type':'application/json', 'Authorization': 'Bearer ' + key1}\n",
        "\n",
        "# # If token auth is enabled, don't forget to add token to the HTTP header.\n",
        "# headers = {'Content-Type':'application/json', 'Authorization': 'Bearer ' + access_token}\n",
        "\n",
        "# resp = requests.post(aks_service.scoring_uri, test_sample, headers=headers)\n",
        "\n",
        "\n",
        "# print(\"prediction:\", resp.text)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Clean up\n",
        "Delete the service, image and model."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "%%time\n",
        "aks_service.delete()\n",
        "model.delete()"
      ]
    }
  ],
  "metadata": {
    "authors": [
      {
        "name": "vaidyas"
      }
    ],
    "kernelspec": {
      "display_name": "Python 3.6",
      "language": "python",
      "name": "python36"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.6"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}