{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# AutoML 03: Remote Execution using Batch AI\n",
"\n",
"In this example we use scikit-learn's [digits dataset](http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) to showcase how you can use AutoML for a simple classification problem.\n",
"\n",
"Make sure you have executed the [setup](setup.ipynb) before running this notebook.\n",
"\n",
"In this notebook you will see how to:\n",
"1. Create an Experiment in an existing Workspace\n",
"2. Attach an existing Batch AI compute to the Workspace\n",
"3. Instantiate an AutoMLConfig\n",
"4. Train the model using Batch AI\n",
"5. Explore the results\n",
"6. Test the fitted model\n",
"\n",
"In addition, this notebook showcases the following features:\n",
"- **Parallel** execution of iterations\n",
"- **Asynchronous** tracking of progress\n",
"- **Cancelling** individual iterations or the entire run\n",
"- Retrieving models for any iteration or logged metric\n",
"- Specifying AutoML settings as **kwargs**\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Experiment\n",
"\n",
"As part of the setup you have already created a <b>Workspace</b>. For AutoML you will need to create an <b>Experiment</b>. An <b>Experiment</b> is a named object in a <b>Workspace</b> that is used to run experiments."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import logging\n",
"import os\n",
"import random\n",
"\n",
"from matplotlib import pyplot as plt\n",
"from matplotlib.pyplot import imshow\n",
"import numpy as np\n",
"import pandas as pd\n",
"from sklearn import datasets\n",
"\n",
"import azureml.core\n",
"from azureml.core.experiment import Experiment\n",
"from azureml.core.workspace import Workspace\n",
"from azureml.train.automl import AutoMLConfig\n",
"from azureml.train.automl.run import AutoMLRun"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ws = Workspace.from_config()\n",
"\n",
"# choose a name for the run history container in the workspace\n",
"experiment_name = 'automl-remote-batchai'\n",
"# project folder\n",
"project_folder = './sample_projects/automl-remote-batchai'\n",
"\n",
"experiment = Experiment(ws, experiment_name)\n",
"\n",
"output = {}\n",
"output['SDK version'] = azureml.core.VERSION\n",
"output['Subscription ID'] = ws.subscription_id\n",
"output['Workspace Name'] = ws.name\n",
"output['Resource Group'] = ws.resource_group\n",
"output['Location'] = ws.location\n",
"output['Project Directory'] = project_folder\n",
"output['Experiment Name'] = experiment.name\n",
"pd.set_option('display.max_colwidth', -1)\n",
"pd.DataFrame(data = output, index = ['']).T"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Diagnostics\n",
"\n",
"Opt in to diagnostics for a better experience, and for improved quality and security of future releases."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.telemetry import set_diagnostics_collection\n",
"set_diagnostics_collection(send_diagnostics=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Batch AI Cluster\n",
"The cluster is created as Machine Learning Compute and will appear under your workspace.\n",
"\n",
"<b>Note</b>: Cluster creation can take more than 10 minutes; please be patient.\n",
"\n",
"As with other Azure services, there are limits on certain resources (for example, Batch AI cluster size) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.compute import BatchAiCompute\n",
"from azureml.core.compute import ComputeTarget\n",
"\n",
"# choose a name for your cluster\n",
"batchai_cluster_name = ws.name + \"cpu\"\n",
"\n",
"found = False\n",
"# see if this compute target already exists in the workspace\n",
"for ct in ws.compute_targets():\n",
"    print(ct.name, ct.type)\n",
"    if (ct.name == batchai_cluster_name and ct.type == 'BatchAI'):\n",
"        found = True\n",
"        print('Found existing compute target, using it.')\n",
"        compute_target = ct\n",
"        break\n",
"\n",
"if not found:\n",
"    print('Creating a new compute target...')\n",
"    provisioning_config = BatchAiCompute.provisioning_configuration(\n",
"        vm_size = \"STANDARD_D2_V2\", # for GPU, use \"STANDARD_NC6\"\n",
"        # vm_priority = 'lowpriority', # optional\n",
"        autoscale_enabled = True,\n",
"        cluster_min_nodes = 1,\n",
"        cluster_max_nodes = 4)\n",
"\n",
"    # create the cluster\n",
"    compute_target = ComputeTarget.create(ws, batchai_cluster_name, provisioning_config)\n",
"\n",
"    # you can poll for a minimum number of nodes and for a specific timeout;\n",
"    # if no min node count is provided it will use the scale settings of the cluster\n",
"    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
"\n",
"    # for a more detailed view of the current Batch AI cluster status, use the 'status' property"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Get Data File\n",
"For remote executions you need to author a get_data.py file containing a get_data() function. This file must be in the root directory of the project. It can encapsulate code that reads data from blob storage or from local disk."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if not os.path.exists(project_folder):\n",
"    os.makedirs(project_folder)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile $project_folder/get_data.py\n",
"\n",
"from sklearn import datasets\n",
"from scipy import sparse\n",
"import numpy as np\n",
"\n",
"def get_data():\n",
"    digits = datasets.load_digits()\n",
"    X_digits = digits.data\n",
"    y_digits = digits.target\n",
"\n",
"    return { \"X\" : X_digits, \"y\" : y_digits }"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Instantiate AutoML <a class=\"anchor\" id=\"Instatiate-AutoML-Remote-DSVM\"></a>\n",
"\n",
"You can specify automl_settings as **kwargs** as well. Also note that you can use the get_data() semantics for local executions too.\n",
"\n",
"<i>Note: For remote DSVM and Batch AI executions you cannot pass NumPy arrays directly to the fit method.</i>\n",
"\n",
"|Property|Description|\n",
"|-|-|\n",
"|**primary_metric**|This is the metric that you want to optimize.<br> Classification supports the following primary metrics: <br><i>accuracy</i><br><i>AUC_weighted</i><br><i>balanced_accuracy</i><br><i>average_precision_score_weighted</i><br><i>precision_score_weighted</i>|\n",
"|**max_time_sec**|Time limit in seconds for each iteration.|\n",
"|**iterations**|Number of iterations. In each iteration AutoML trains a specific pipeline on the data.|\n",
"|**n_cross_validations**|Number of cross-validation splits.|\n",
"|**concurrent_iterations**|Maximum number of iterations executed in parallel. This should not exceed the maximum number of nodes in the cluster.|"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"automl_settings = {\n",
"    \"max_time_sec\": 120,\n",
"    \"iterations\": 20,\n",
"    \"n_cross_validations\": 5,\n",
"    \"primary_metric\": 'AUC_weighted',\n",
"    \"preprocess\": False,\n",
"    \"concurrent_iterations\": 5,\n",
"    \"verbosity\": logging.INFO\n",
"}\n",
"\n",
"automl_config = AutoMLConfig(\n",
"    task = 'classification',\n",
"    debug_log = 'automl_errors.log',\n",
"    path = project_folder,\n",
"    compute_target = compute_target,\n",
"    data_script = project_folder + \"/get_data.py\",\n",
"    **automl_settings)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"remote_run = experiment.submit(automl_config, show_output=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exploring the Results\n",
"\n",
"#### Loading executed runs\n",
"If you need to load a previously executed run given its run ID, convert the cell below to a code cell and run it."
]
},
{
"cell_type": "raw",
"metadata": {},
"source": [
"remote_run = AutoMLRun(experiment=experiment, run_id='AutoML_5db13491-c92a-4f1d-b622-8ab8d973a058')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Widget for monitoring runs\n",
"\n",
"The widget will show \"loading\" until the first iteration completes; then an auto-updating graph and table appear. The widget refreshes once per minute, so you should see the graph update as child runs complete.\n",
"\n",
"You can click on a pipeline to see run properties and output logs. Logs are also available on the compute nodes under /tmp/azureml_run/{iterationid}/azureml-logs\n",
"\n",
"NOTE: The widget displays a link at the bottom. This links to a web UI where you can explore the individual run details."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"remote_run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.train.widgets import RunDetails\n",
"RunDetails(remote_run).show()"
]
},
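{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you prefer to check progress without the widget, you can also poll the parent run directly. The cell below is a minimal sketch that assumes only the generic run methods get_status() and get_details() exposed by Azure ML run objects."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A minimal sketch of monitoring the run without the widget.\n",
"# get_status() returns the current state (e.g. 'Running' or 'Completed');\n",
"# get_details() returns a dictionary that includes the run status.\n",
"print(remote_run.get_status())\n",
"details = remote_run.get_details()\n",
"print(details.get('status'))"
]
},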
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# wait until the run finishes\n",
"remote_run.wait_for_completion(show_output = True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Retrieve All Child Runs\n",
"You can also use SDK methods to fetch all the child runs and see the individual metrics that we log."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"children = list(remote_run.get_children())\n",
"metricslist = {}\n",
"for run in children:\n",
"    properties = run.get_properties()\n",
"    metrics = {k: v for k, v in run.get_metrics().items() if isinstance(v, float)}\n",
"    metricslist[int(properties['iteration'])] = metrics\n",
"\n",
"rundata = pd.DataFrame(metricslist).sort_index(axis=1)\n",
"rundata"
]
},
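{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick visual check, you can plot one of the logged metrics across iterations using the rundata table above. The cell below is a minimal sketch that assumes the primary metric is logged under the name 'AUC_weighted', matching the primary_metric set earlier."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A minimal sketch: plot the primary metric across iterations.\n",
"# Assumes the metric is logged under the name 'AUC_weighted', matching primary_metric above.\n",
"rundata.loc['AUC_weighted'].plot(marker = 'o')\n",
"plt.xlabel('iteration')\n",
"plt.ylabel('AUC_weighted')\n",
"plt.show()"
]
},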
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Cancelling Runs\n",
"\n",
"You can cancel ongoing remote runs using the *cancel()* and *cancel_iteration()* functions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Cancel the ongoing experiment and stop scheduling new iterations\n",
"# remote_run.cancel()\n",
"\n",
"# Cancel iteration 1 and move on to iteration 2\n",
"# remote_run.cancel_iteration(1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Retrieve the Best Model\n",
"\n",
"Below we select the best pipeline from our iterations. The *get_output* method on *remote_run* returns the best run and the fitted model. There are overloads on *get_output* that allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"best_run, fitted_model = remote_run.get_output()\n",
"print(best_run)\n",
"print(fitted_model)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Best Model based on any other metric\n",
"Show the run and model that have the smallest `log_loss` value."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"lookup_metric = \"log_loss\"\n",
"best_run, fitted_model = remote_run.get_output(metric = lookup_metric)\n",
"print(best_run)\n",
"print(fitted_model)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Model from a specific iteration\n",
"Show the run and model from the 3rd iteration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"iteration = 3\n",
"third_run, third_model = remote_run.get_output(iteration=iteration)\n",
"print(third_run)\n",
"print(third_model)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Register fitted model for deployment"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"description = 'AutoML Model'\n",
"tags = None\n",
"remote_run.register_model(description=description, tags=tags)\n",
"remote_run.model_id # Use this id to deploy the model as a web service in Azure"
]
},
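{
"cell_type": "markdown",
"metadata": {},
"source": [
"To confirm the registration, you can list the models registered in the workspace and look for the ID printed above. The cell below is a minimal sketch that assumes only the Model.list() helper from azureml.core.model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A minimal sketch: list registered models in the workspace to confirm the registration.\n",
"# Model.list(ws) is assumed to return the registered models with name and version attributes.\n",
"from azureml.core.model import Model\n",
"\n",
"for m in Model.list(ws):\n",
"    print(m.name, m.version)"
]
},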
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Testing the Fitted Model <a class=\"anchor\" id=\"Testing-the-Fitted-Model-Remote-DSVM\"></a>\n",
"\n",
"#### Load Test Data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"digits = datasets.load_digits()\n",
"X_digits = digits.data[:10, :]\n",
"y_digits = digits.target[:10]\n",
"images = digits.images[:10]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Testing our best pipeline"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Randomly select digits and test\n",
"for index in np.random.choice(len(y_digits), 2):\n",
"    print(index)\n",
"    predicted = fitted_model.predict(X_digits[index:index + 1])[0]\n",
"    label = y_digits[index]\n",
"    title = \"Label value = %d  Predicted value = %d\" % (label, predicted)\n",
"    fig = plt.figure(1, figsize=(3, 3))\n",
"    ax1 = fig.add_axes((0, 0, .8, .8))\n",
"    ax1.set_title(title)\n",
"    plt.imshow(images[index], cmap=plt.cm.gray_r, interpolation='nearest')\n",
"    plt.show()"
]
},
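{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an additional sanity check, the cell below scores the fitted model on the small test slice loaded above, using scikit-learn's accuracy_score. This is only a minimal sketch: these 10 samples were also part of the training data, so treat it as a smoke test rather than a proper evaluation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A minimal sketch: compute accuracy on the 10-sample slice loaded above.\n",
"# Note: these samples were also seen during training, so this is only a smoke test.\n",
"from sklearn.metrics import accuracy_score\n",
"\n",
"y_pred = fitted_model.predict(X_digits)\n",
"print(\"Accuracy on the 10-sample slice: {:.3f}\".format(accuracy_score(y_digits, y_pred)))"
]
},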
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}