{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/training-with-deep-learning/how-to-use-estimator/how-to-use-estimator.png)"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbpresent": {
"id": "bf74d2e9-2708-49b1-934b-e0ede342f475"
}
},
"source": [
"# How to use Estimator in Azure ML\n",
"\n",
"## Introduction\n",
"This tutorial shows how to use the Estimator pattern in Azure Machine Learning SDK. Estimator is a convenient object in Azure Machine Learning that wraps run configuration information to help simplify the tasks of specifying how a script is executed.\n",
"\n",
"\n",
"## Prerequisite:\n",
"* Understand the [architecture and terms](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture) introduced by Azure Machine Learning\n",
"* If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, go through the [configuration notebook](../../../configuration.ipynb) to:\n",
" * install the AML SDK\n",
" * create a workspace and its configuration file (`config.json`)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's get started. First let's import some Python libraries."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbpresent": {
"id": "edaa7f2f-2439-4148-b57a-8c794c0945ec"
}
},
"outputs": [],
"source": [
"import azureml.core\n",
"from azureml.core import Workspace\n",
"\n",
"# check core SDK version number\n",
"print(\"Azure ML SDK Version: \", azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize workspace\n",
"Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. `Workspace.from_config()` creates a workspace object from the details stored in `config.json`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ws = Workspace.from_config()\n",
"print('Workspace name: ' + ws.name, \n",
" 'Azure region: ' + ws.location, \n",
" 'Subscription id: ' + ws.subscription_id, \n",
" 'Resource group: ' + ws.resource_group, sep = '\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbpresent": {
"id": "59f52294-4a25-4c92-bab8-3b07f0f44d15"
}
},
"source": [
"## Create an Azure ML experiment\n",
"Let's create an experiment named \"estimator-test\". The script runs will be recorded under this experiment in Azure."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbpresent": {
"id": "bc70f780-c240-4779-96f3-bc5ef9a37d59"
}
},
"outputs": [],
"source": [
"from azureml.core import Experiment\n",
"\n",
"exp = Experiment(workspace=ws, name='estimator-test')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get default AmlCompute\n",
"You can create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for training your model. In this tutorial, you use default `AmlCompute` as your training compute resource."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cpu_cluster = ws.get_default_compute_target(\"CPU\")\n",
"\n",
"# use get_status() to get a detailed status for the current cluster. \n",
"print(cpu_cluster.get_status().serialize())"
]
},
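{
"cell_type": "markdown",
"metadata": {},
"source": [
"If the cell above failed because your workspace has no default compute target (for example, when you are not using a Notebook VM), you can create an `AmlCompute` cluster yourself. The sketch below uses an example cluster name and VM size; adjust them to your needs."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.compute import AmlCompute, ComputeTarget\n",
"\n",
"# sketch only: 'cpu-cluster' and STANDARD_D2_V2 are example values, not requirements\n",
"if cpu_cluster is None:\n",
"    provisioning_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', max_nodes=4)\n",
"    cpu_cluster = ComputeTarget.create(ws, 'cpu-cluster', provisioning_config)\n",
"    cpu_cluster.wait_for_completion(show_output=True)"
]
},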
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that you have retrieved the compute target, let's see what the workspace's `compute_targets` property returns."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"compute_targets = ws.compute_targets\n",
"for name, ct in compute_targets.items():\n",
" print(name, ct.type, ct.provisioning_state)"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbpresent": {
"id": "2039d2d5-aca6-4f25-a12f-df9ae6529cae"
}
},
"source": [
"## Use a simple script\n",
"We have already created a simple \"hello world\" script. This is the script that we will submit through the estimator pattern. It prints a hello-world message, and if Azure ML SDK is installed, it will also logs an array of values ([Fibonacci numbers](https://en.wikipedia.org/wiki/Fibonacci_number)). The script takes as input the number of Fibonacci numbers in the sequence to log."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open('./dummy_train.py', 'r') as f:\n",
" print(f.read())"
]
},
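{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, a minimal script along these lines could look like the sketch below. This is only an illustration of the pattern; the `dummy_train.py` file printed above is the authoritative version. It parses a `--numbers-in-sequence` argument and, when the Azure ML SDK is importable, logs a metric named `Fibonacci numbers`, which is the metric name used later for hyperparameter tuning.\n",
"\n",
"```python\n",
"import argparse\n",
"\n",
"parser = argparse.ArgumentParser()\n",
"parser.add_argument('--numbers-in-sequence', type=int, default=10)\n",
"args = parser.parse_args()\n",
"\n",
"print('Hello, world!')\n",
"\n",
"try:\n",
"    from azureml.core import Run\n",
"    run = Run.get_context()\n",
"    a, b = 0, 1\n",
"    for _ in range(args.numbers_in_sequence):\n",
"        run.log('Fibonacci numbers', a)  # metric name referenced by the HyperDrive config later\n",
"        a, b = b, a + b\n",
"except ImportError:\n",
"    print('Azure ML SDK not installed; skipping metric logging.')\n",
"```"
]
},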
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create A Generic Estimator"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First we import the Estimator class and also a widget to visualize a run."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.train.estimator import Estimator\n",
"from azureml.widgets import RunDetails"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The simplest estimator is to submit the current folder to the local computer. Estimator by default will attempt to use Docker-based execution. Let's turn that off for now. It then builds a conda environment locally, installs Azure ML SDK in it, and runs your script."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# use a conda environment, don't use Docker, on local computer\n",
"script_params = {\n",
" '--numbers-in-sequence': 10\n",
"}\n",
"est = Estimator(source_directory='.', script_params=script_params, compute_target='local', entry_script='dummy_train.py', use_docker=False)\n",
"run = exp.submit(est)\n",
"RunDetails(run).show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also enable Docker and let estimator pick the default CPU image supplied by Azure ML for execution. You can target an AmlCompute cluster (or any other supported compute target types)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# use a conda environment on default Docker image in an AmlCompute cluster\n",
"script_params = {\n",
" '--numbers-in-sequence': 10\n",
"}\n",
"est = Estimator(source_directory='.', script_params=script_params, compute_target=cpu_cluster, entry_script='dummy_train.py', use_docker=True)\n",
"run = exp.submit(est)\n",
"RunDetails(run).show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can customize the conda environment by adding conda and/or pip packages."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# add a conda package\n",
"script_params = {\n",
" '--numbers-in-sequence': 10\n",
"}\n",
"est = Estimator(source_directory='.', \n",
" script_params=script_params, \n",
" compute_target='local', \n",
" entry_script='dummy_train.py', \n",
" use_docker=False, \n",
" conda_packages=['scikit-learn'])\n",
"run = exp.submit(est)\n",
"RunDetails(run).show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also specify a custom Docker image for exeution. In this case, you probably want to tell the system not to build a new conda environment for you. Instead, you can specify the path to an existing Python environment in the custom Docker image.\n",
"\n",
"**Note**: since the below example points to the preinstalled Python environment in the miniconda3 image maintained by continuum.io on Docker Hub where Azure ML SDK is not present, the logging metric code is not triggered. But a run history record is still recorded. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# use a custom Docker image\n",
"from azureml.core.container_registry import ContainerRegistry\n",
"\n",
"# this is an image available in Docker Hub\n",
"image_name = 'continuumio/miniconda3'\n",
"\n",
"# you can also point to an image in a private ACR\n",
"image_registry_details = ContainerRegistry()\n",
"image_registry_details.address = \"myregistry.azurecr.io\"\n",
"image_registry_details.username = \"username\"\n",
"image_registry_details.password = \"password\"\n",
"\n",
"# don't let the system build a new conda environment\n",
"user_managed_dependencies = True\n",
"\n",
"# submit to a local Docker container. if you don't have Docker engine running locally, you can set compute_target to cpu_cluster.\n",
"script_params = {\n",
" '--numbers-in-sequence': 10\n",
"}\n",
"est = Estimator(source_directory='.', \n",
" script_params=script_params, \n",
" compute_target='local', \n",
" entry_script='dummy_train.py',\n",
" custom_docker_image=image_name,\n",
" # uncomment below line to use your private ACR\n",
" #image_registry_details=image_registry_details,\n",
" user_managed=user_managed_dependencies\n",
" )\n",
"\n",
"run = exp.submit(est)\n",
"RunDetails(run).show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note: if you need to cancel a run, you can follow [these instructions](https://aka.ms/aml-docs-cancel-run)."
]
},
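{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch, assuming `run` still refers to the run submitted above, you can also cancel it programmatically:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# cancel the run only if it is still executing\n",
"if run.get_status() not in ['Completed', 'Failed', 'Canceled']:\n",
"    run.cancel()"
]
},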
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Intelligent hyperparameter tuning\n",
"\n",
"The simple \"hello world\" script above lets the user fix the value of a parameter for the number of Fibonacci numbers in the sequence to log. Similarly, when training models, you can fix values of parameters of the training algorithm itself. E.g. the learning rate, the number of layers, the number of nodes in each layer in a neural network, etc. These adjustable parameters that govern the training process are referred to as the hyperparameters of the model. The goal of hyperparameter tuning is to search across various hyperparameter configurations and find the configuration that results in the best performance.\n",
"\n",
"\n",
"To demonstrate how Azure Machine Learning can help you automate the process of hyperarameter tuning, we will launch multiple runs with different values for numbers in the sequence. First let's define the parameter space using random sampling."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.train.hyperdrive import RandomParameterSampling, BanditPolicy, HyperDriveConfig, PrimaryMetricGoal\n",
"from azureml.train.hyperdrive import choice\n",
"\n",
"ps = RandomParameterSampling(\n",
" {\n",
" '--numbers-in-sequence': choice(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we will create a new estimator without the above numbers-in-sequence parameter since that will be passed in later. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"est = Estimator(source_directory='.', script_params={}, compute_target=cpu_cluster, entry_script='dummy_train.py', use_docker=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we will look at training metrics and early termination policies. When training a model, users are interested in logging and optimizing certain metrics of the model e.g. maximize the accuracy of the model, or minimize loss. This metric is logged by the training script for each run. In our simple script above, we are logging Fibonacci numbers in a sequence. But a training script could just as easily log other metrics like accuracy or loss, which can be used to evaluate the performance of a given training run.\n",
"\n",
"The intelligent hyperparameter tuning capability in Azure Machine Learning automatically terminates poorly performing runs using an early termination policy. Early termination reduces wastage of compute resources and instead uses these resources for exploring other hyperparameter configurations. In this example, we use the BanditPolicy. This basically states to check the job every 2 iterations. If the primary metric (defined later) falls outside of the top 10% range, Azure ML will terminate the training run. This saves us from continuing to explore hyperparameters that don't show promise of helping reach our target metric."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"policy = BanditPolicy(evaluation_interval=2, slack_factor=0.1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we are ready to configure a run configuration object for hyperparameter tuning. We need to call out the primary metric that we want the experiment to optimize. The name of the primary metric needs to exactly match the name of the metric logged by the training script and we specify that we are looking to maximize this value. Next, we control the resource budget for the experiment by setting the maximum total number of training runs to 10. We also set the maximum number of training runs to run concurrently at 4, which is the same as the number of nodes in our computer cluster."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hdc = HyperDriveConfig(estimator=est, \n",
" hyperparameter_sampling=ps, \n",
" policy=policy, \n",
" primary_metric_name='Fibonacci numbers', \n",
" primary_metric_goal=PrimaryMetricGoal.MAXIMIZE, \n",
" max_total_runs=10,\n",
" max_concurrent_runs=4)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, let's launch the hyperparameter tuning job."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hdr = exp.submit(config=hdc)\n",
"RunDetails(hdr).show()"
]
},
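{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optionally, you can block until the tuning run finishes. The sketch below uses the standard `wait_for_completion` method on the submitted run."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# stream logs and wait until all child runs are done\n",
"hdr.wait_for_completion(show_output=True)"
]
},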
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When all the runs complete, we can find the run with the best performance"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"best_run = hdr.get_best_run_by_primary_metric()"
]
},
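{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, you can inspect the metrics logged by the best run. This is a minimal sketch using the standard `Run.get_metrics` method; note that `get_best_run_by_primary_metric` can return `None` if no runs have reported the primary metric yet."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if best_run is not None:\n",
"    print('Best run id:', best_run.id)\n",
"    print('Logged metrics:', best_run.get_metrics())"
]
},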
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can register the model from the best run and use it to deploy a web service that can be used for Inferencing. Details on how how you can do this can be found in the sample folders for the ohter types of estimators.\n"
]
},
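{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch, assuming your training script saved a model file to the `outputs` folder (the `dummy_train.py` script in this sample does not), registration from the best run could look like the commented example below; the model name and path are hypothetical."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# sketch only: dummy_train.py does not save a model, so the path below is hypothetical.\n",
"# Assuming your training script wrote a model to the 'outputs' folder, you could register it like this:\n",
"#\n",
"# model = best_run.register_model(model_name='my-model', model_path='outputs/model.pkl')\n",
"# print(model.name, model.version)"
]
},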
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Next Steps\n",
"Now you can proceed to explore the other types of estimators, such as TensorFlow estimator, PyTorch estimator, etc. in the sample folder."
]
}
],
"metadata": {
"authors": [
{
"name": "maxluk"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
},
"msauthor": "minxia"
},
"nbformat": 4,
"nbformat_minor": 2
}