{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Copyright (c) Microsoft Corporation. All rights reserved.\n", "\n", "Licensed under the MIT License." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/automated-machine-learning/classification-with-deployment/auto-ml-classification-with-deployment.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Automated Machine Learning\n", "_**Classification with Deployment**_\n", "\n", "## Contents\n", "1. [Introduction](#Introduction)\n", "1. [Setup](#Setup)\n", "1. [Train](#Train)\n", "1. [Deploy](#Deploy)\n", "1. [Test](#Test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "\n", "In this example we use the scikit learn's [digit dataset](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) to showcase how you can use AutoML for a simple classification problem and deploy it to an Azure Container Instance (ACI).\n", "\n", "Make sure you have executed the [configuration](../../../configuration.ipynb) before running this notebook.\n", "\n", "An Enterprise workspace is required to complete this notebook.\n", "In this notebook you will learn how to:\n", "1. Create an experiment using an existing workspace.\n", "2. Configure AutoML using `AutoMLConfig`.\n", "3. Train the model using local compute.\n", "4. Explore the results.\n", "5. Register the model.\n", "6. Create a container image.\n", "7. Create an Azure Container Instance (ACI) service.\n", "8. Test the ACI service." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "\n", "As part of the setup you have already created an Azure ML `Workspace` object. For AutoML you will need to create an `Experiment` object, which is a named object in a `Workspace` used to run experiments." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Train\n", "\n", ".....This step (or The following steps) requires an Enterprise workspace to gain access to the features.\n", "Instantiate a AutoMLConfig object. This defines the settings and data used to run the experiment.\n", "\n", "|Property|Description|\n", "|-|-|\n", "|**task**|classification or regression|\n", "|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics:
"|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n", "|**iterations**|Number of iterations. In each iteration AutoML trains a specific pipeline with the data.|\n", "|**n_cross_validations**|Number of cross-validation splits.|\n", "|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n", "|**y**|(sparse) array-like, shape = [n_samples, ], Multi-class targets.|\n", "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder.|" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import logging\n", "\n", "from sklearn import datasets\n", "from azureml.train.automl import AutoMLConfig\n", "\n", "# experiment_name and project_folder are assumed to have been defined during Setup\n", "digits = datasets.load_digits()\n", "X_train = digits.data[10:,:]\n", "y_train = digits.target[10:]\n", "\n", "automl_config = AutoMLConfig(task = 'classification',\n", "                             name = experiment_name,\n", "                             debug_log = 'automl_errors.log',\n", "                             primary_metric = 'AUC_weighted',\n", "                             iteration_timeout_minutes = 20,\n", "                             iterations = 10,\n", "                             verbosity = logging.INFO,\n", "                             X = X_train,\n", "                             y = y_train,\n", "                             path = project_folder)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Call the `submit` method on the experiment object and pass the run configuration. Execution of local runs is synchronous. Depending on the data and the number of iterations this can run for a while.\n", "In this example, we specify `show_output = True` to print currently running iterations to the console." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "local_run = experiment.submit(automl_config, show_output = True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "local_run" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deploy\n", "\n", "### Retrieve the Best Model\n", "\n", "Below we select the best pipeline from our iterations. The `get_output` method on `local_run` returns the best run and the fitted model for the last invocation. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "best_run, fitted_model = local_run.get_output()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Register the Fitted Model for Deployment\n", "If neither `metric` nor `iteration` is specified in the `register_model` call, the iteration with the best primary metric is registered." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "description = 'AutoML Model'\n", "tags = None\n", "model = local_run.register_model(description = description, tags = tags)\n", "\n", "print(local_run.model_id) # This will be written to the script file later in the notebook."
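, "\n",
"# --- Added illustration (not part of the original notebook) ---\n",
"# The Model object returned by register_model can be inspected to confirm the registration:\n",
"print(model.name, model.version)\n",
"\n",
"# As noted in the markdown above, register_model also accepts a `metric` or `iteration`\n",
"# argument to register the model from a specific iteration instead of the best one, e.g.:\n",
"# model = local_run.register_model(iteration = 3)"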
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create Scoring Script" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%writefile score.py\n", "import pickle\n", "import json\n", "import numpy\n", "import azureml.train.automl\n", "from sklearn.externals import joblib\n", "from azureml.core.model import Model\n", "\n", "\n", "def init():\n", " global model\n", " model_path = Model.get_model_path(model_name = '<>') # this name is model.id of model that we want to deploy\n", " # deserialize the model file back into a sklearn model\n", " model = joblib.load(model_path)\n", "\n", "def run(rawdata):\n", " try:\n", " data = json.loads(rawdata)['data']\n", " data = numpy.array(data)\n", " result = model.predict(data)\n", " except Exception as e:\n", " result = str(e)\n", " return json.dumps({\"error\": result})\n", " return json.dumps({\"result\":result.tolist()})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a YAML File for the Environment" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To ensure the fit results are consistent with the training results, the SDK dependency versions need to be the same as the environment that trains the model. The following cells create a file, myenv.yml, which specifies the dependencies from the run." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "experiment = Experiment(ws, experiment_name)\n", "ml_run = AutoMLRun(experiment = experiment, run_id = local_run.id)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dependencies = ml_run.get_run_sdk_dependencies(iteration = 7)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for p in ['azureml-train-automl', 'azureml-sdk', 'azureml-core']:\n", " print('{}\\t{}'.format(p, dependencies[p]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from azureml.core.conda_dependencies import CondaDependencies\n", "\n", "myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn','py-xgboost<=0.80'],\n", " pip_packages=['azureml-sdk[automl]'])\n", "\n", "conda_env_file_name = 'myenv.yml'\n", "myenv.save_to_file('.', conda_env_file_name)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Substitute the actual version number in the environment file.\n", "# This is not strictly needed in this notebook because the model should have been generated using the current SDK version.\n", "# However, we include this in case this code is used on an experiment from a previous SDK version.\n", "\n", "with open(conda_env_file_name, 'r') as cefr:\n", " content = cefr.read()\n", "\n", "with open(conda_env_file_name, 'w') as cefw:\n", " cefw.write(content.replace(azureml.core.VERSION, dependencies['azureml-sdk']))\n", "\n", "# Substitute the actual model id in the script file.\n", "\n", "script_file_name = 'score.py'\n", "\n", "with open(script_file_name, 'r') as cefr:\n", " content = cefr.read()\n", "\n", "with open(script_file_name, 'w') as cefw:\n", " cefw.write(content.replace('<>', local_run.model_id))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a Container Image" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from azureml.core.image import Image, ContainerImage\n", "\n", "image_config = 
"                                                   execution_script = script_file_name,\n", "                                                   conda_file = conda_env_file_name,\n", "                                                   tags = {'area': \"digits\", 'type': \"automl_classification\"},\n", "                                                   description = \"Image for AutoML classification sample\")\n", "\n", "image = Image.create(name = \"automlsampleimage\",\n", "                     # this is the model object\n", "                     models = [model],\n", "                     image_config = image_config,\n", "                     workspace = ws)\n", "\n", "image.wait_for_creation(show_output = True)\n", "\n", "if image.creation_state == 'Failed':\n", "    print(\"Image build log at: \" + image.image_build_log_uri)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Deploy the Image as a Web Service on Azure Container Instance" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from azureml.core.webservice import AciWebservice\n", "\n", "aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1,\n", "                                               memory_gb = 1,\n", "                                               tags = {'area': \"digits\", 'type': \"automl_classification\"},\n", "                                               description = 'sample service for AutoML classification')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from azureml.core.webservice import Webservice\n", "\n", "aci_service_name = 'automl-sample-01'\n", "print(aci_service_name)\n", "aci_service = Webservice.deploy_from_image(deployment_config = aciconfig,\n", "                                           image = image,\n", "                                           name = aci_service_name,\n", "                                           workspace = ws)\n", "aci_service.wait_for_deployment(True)\n", "print(aci_service.state)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Delete a Web Service" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#aci_service.delete()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get Logs from a Deployed Web Service" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#aci_service.get_logs()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from sklearn import datasets\n", "\n", "# Randomly select digits and test\n", "digits = datasets.load_digits()\n", "X_test = digits.data[:10, :]\n", "y_test = digits.target[:10]\n", "images = digits.images[:10]\n", "\n", "for index in np.random.choice(len(y_test), 3, replace = False):\n", "    print(index)\n", "    test_sample = json.dumps({'data': X_test[index:index + 1].tolist()})\n", "    predicted = aci_service.run(input_data = test_sample)\n", "    label = y_test[index]\n", "    predictedDict = json.loads(predicted)\n", "    title = \"Label value = %d Predicted value = %s\" % (label, predictedDict['result'][0])\n", "    fig = plt.figure(1, figsize = (3,3))\n", "    ax1 = fig.add_axes((0,0,.8,.8))\n", "    ax1.set_title(title)\n", "    plt.imshow(images[index], cmap = plt.cm.gray_r, interpolation = 'nearest')\n", "    plt.show()" ] } ], "metadata": { "authors": [ { "name": "shwinne" } ], "kernelspec": { "display_name": "Python 3.6", "language": "python", "name": "python36" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" } }, "nbformat": 4, "nbformat_minor": 2 }