Files
MachineLearningNotebooks/tutorials/get-started-day1/day1-part2-hello-world.ipynb

204 lines
10 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/tutorials/get-started-day1/day1-part2-hello-world.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Tutorial: \"Hello World\" (Part 2 of 4)\n",
"\n",
"---\n",
"## Introduction\n",
"In **part 2 of this get started series**, you will submit a trivial \"hello world\" python script to the cloud by:\n",
"\n",
"- Running Python code in the cloud with Azure Machine Learning SDK\n",
"- Switching between debugging locally on a compute instance.\n",
"- Submitting remote runs in the cloud\n",
"- Monitoring and recording runs in the Azure Machine Learning studio\n",
"\n",
"This notebook follows the steps provided on the [Python (day 1) - \"hello world\" documentation page](https://aka.ms/day1aml). This tutorial is part of a **four-part tutorial series** in which you learn the fundamentals of Azure Machine Learning and complete simple jobs-based machine learning tasks in the Azure cloud. It builds off the work you completed in [Tutorial part 1: set up an Azure Machine Learning compute cluster](day1-part1-setup.ipynb).\n",
"\n",
"## Pre-requisites\n",
"\n",
"- Complete [Tutorial part 1: set up an Azure Machine Learning compute cluster](day1-part1-setup.ipynb) if you don't already have an Azure Machine Learning compute cluster.\n",
"- Familiarity with Python and Machine Learning concepts.\n",
"- If you are using a compute instance in Azure Machine Learning to run this notebook series, you are all set. Otherwise, please follow the [Configure a development environment for Azure Machine Learning](https://docs.microsoft.com/azure/machine-learning/how-to-configure-environment)\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Your code\n",
"\n",
"In the `code/hello` subdirectory you will find a trivial python script [hello.py](code/hello/hello.py) that has the following code:\n",
"\n",
"```Python\n",
"# code/hello/hello.py\n",
"print(\"hello world!\")\n",
"```\n",
"\n",
"In this tutorial you are going to submit this trivial python script to an Azure Machine Learning Compute Cluster."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Test in your development environment\n",
"\n",
"You can test your code works on a compute instance or locally (for example, a laptop), which has the benefit of interactive debugging of code:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"!python code/hello/hello.py"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Submit your code to Azure Machine Learning\n",
"\n",
"Below you create a __*control script*__ this is where you specify _how_ your code is submitted to Azure Machine Learning. The code you submit to Azure Machine Learning (in this case `hello.py`) does not need anything specific to Azure Machine Learning - it can be any valid Python code. It is only the control script that is Azure Machine Learning specific.\n",
"\n",
"The code below will show a Jupyter widget that tracks the progress of your run, and displays logs.\n",
"\n",
"> <span style=\"color:purple; font-weight:bold\">! NOTE <br>\n",
"> The very first run will take 5-10minutes to complete. This is because in the background a docker image is built in the cloud, the compute cluster is resized from 0 to 1 node, and the docker image is downloaded to the compute. Subsequent runs are much quicker (~15 seconds) as the docker image is cached on the compute - you can test this by resubmitting the code below after the first run has completed.</span>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"remote run",
"batchai",
"configure run",
"use notebook widget"
]
},
"outputs": [],
"source": [
"from azureml.core import Workspace, Experiment, ScriptRunConfig\n",
"from azureml.widgets import RunDetails\n",
"\n",
"ws = Workspace.from_config()\n",
"experiment = Experiment(workspace=ws, name='day1-experiment-hello')\n",
"\n",
"config = ScriptRunConfig(source_directory='./code/hello', script='hello.py', compute_target='cpu-cluster')\n",
"\n",
"run = experiment.submit(config)\n",
"RunDetails(run).show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Understanding the control code\n",
"\n",
"| Code |Description | \n",
"|---|---|\n",
"| `ws = Workspace.from_config()` | [Workspace](https://docs.microsoft.com/python/api/azureml-core/azureml.core.workspace.workspace?view=azure-ml-py&preserve-view=true) connects to your Azure Machine Learning workspace, so that you can communicate with your Azure Machine Learning resources. |\n",
"| `experiment = Experiment( ... )` | [Experiment](https://docs.microsoft.com/python/api/azureml-core/azureml.core.experiment.experiment?view=azure-ml-py&preserve-view=true) provides a simple way to organize multiple runs under a single name. <br>Later you can see how experiments make it easy to compare metrics between dozens of runs. |\n",
"| `config = ScriptRunConfig( ... )` | [ScriptRunConfig](https://docs.microsoft.com/python/api/azureml-core/azureml.core.scriptrunconfig?view=azure-ml-py&preserve-view=true) wraps your `hello.py` code and passes it to your workspace.<br> As the name suggests, you can use this class to _configure_ how you want your _script_ to _run_ in Azure Machine Learning. <br>Also specifies what compute target the script will run on. <br>In this code, the target is the compute cluster you created in the [setup tutorial](tutorial-1st-experiment-sdk-setup-local.md). |\n",
"| `run = experiment.submit(config)` | Submits your script. This submission is called a [Run](https://docs.microsoft.com/python/api/azureml-core/azureml.core.run(class)?view=azure-ml-py&preserve-view=true). <br>A run encapsulates a single execution of your code. Use a run to monitor the script progress, capture the output,<br> analyze the results, visualize metrics and more. |\n",
"| `aml_url = run.get_portal_url()` | The `run` object provides a handle on the execution of your code. Monitor its progress from <br> the Azure Machine Learning Studio with the URL that is printed from the python script. |\n",
"|`RunDetails(run).show()` | There is an Azure Machine Learning widget that shows the progress of your job along with streaming the log files.\n",
"\n",
"## View the logs\n",
"\n",
"The widget has a dropdown box titled **Output logs** select `70_driver_log.txt`, which shows the following standard output: \n",
"\n",
"```\n",
" 1: [2020-08-04T22:15:44.407305] Entering context manager injector.\n",
" 2: [context_manager_injector.py] Command line Options: Namespace(inject=['ProjectPythonPath:context_managers.ProjectPythonPath', 'RunHistory:context_managers.RunHistory', 'TrackUserError:context_managers.TrackUserError', 'UserExceptions:context_managers.UserExceptions'], invocation=['hello.py'])\n",
" 3: Starting the daemon thread to refresh tokens in background for process with pid = 31263\n",
" 4: Entering Run History Context Manager.\n",
" 5: Preparing to call script [ hello.py ] with arguments: []\n",
" 6: After variable expansion, calling script [ hello.py ] with arguments: []\n",
" 7:\n",
" 8: Hello world!\n",
" 9: Starting the daemon thread to refresh tokens in background for process with pid = 31263\n",
"10:\n",
"11:\n",
"12: The experiment completed successfully. Finalizing run...\n",
"13: Logging experiment finalizing status in history service.\n",
"14: [2020-08-04T22:15:46.541334] TimeoutHandler __init__\n",
"15: [2020-08-04T22:15:46.541396] TimeoutHandler __enter__\n",
"16: Cleaning up all outstanding Run operations, waiting 300.0 seconds\n",
"17: 1 items cleaning up...\n",
"18: Cleanup took 0.1812913417816162 seconds\n",
"19: [2020-08-04T22:15:47.040203] TimeoutHandler __exit__\n",
"```\n",
"\n",
"On line 8 above, you see the \"Hello world!\" output. The 70_driver_log.txt file contains the standard output from run and can be useful when debugging remote runs in the cloud. You can also view the run by clicking on the **Click here to see the run in Azure Machine Learning studio** link in the wdiget.\n",
"\n",
"## Next steps\n",
"\n",
"In this tutorial, you took a simple \"hello world\" script and ran it on Azure. You saw how to connect to your Azure Machine Learning workspace, create an Experiment, and submit your `hello.py` code to the cloud.\n",
"\n",
"In the [next tutorial](day1-part3-train-model.ipynb), you build on these learnings by running something more interesting than `print(\"Hello world!\")`.\n"
]
}
],
"metadata": {
"authors": [
{
"name": "samkemp"
}
],
"celltoolbar": "Edit Metadata",
"kernel_info": {
"name": "python3-azureml"
},
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
},
"notice": "Copyright (c) Microsoft Corporation. All rights reserved. Licensed under the MIT License.",
"nteract": {
"version": "nteract-front-end@1.0.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}