Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.


# 02. Train locally
_**Train a model locally: Directly on your machine and within a Docker container**_

---


## Table of contents
1. [Introduction](#intro)
1. [Pre-requisites](#pre-reqs)
1. [Initialize Workspace](#init)
1. [Create An Experiment](#exp)
1. [View training and auxiliary scripts](#view)
1. [Configure & Run](#config-run)
    1. User-managed environment
        1. Set the environment up
        1. Submit the script to run in the user-managed environment
        1. Get run history details
    1. System-managed environment
        1. Set the environment up
        1. Submit the script to run in the system-managed environment
        1. Get run history details
    1. Docker-based execution
        1. Set the environment up
        1. Submit the script to run in the Docker environment
        1. Get run history details
        1. Use a custom Docker image
1. [Query run metrics](#query)

---

## 1. Introduction 

In this notebook, we will learn how to:

* Connect to our AML workspace
* Create or load a workspace
* Configure & execute a local run in:
    - a user-managed Python environment
    - a system-managed Python environment
    - a Docker environment
* Query run metrics to find the best model trained in the run
* Register that model for operationalization

## 2. Pre-requisites 
In this notebook, we assume that you have already set up your Azure Machine Learning workspace. If you have not, go through the [configuration notebook](../../../configuration.ipynb) first. At the end, you should have a configuration file that contains the subscription ID, resource group, and name of your workspace.
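
As a quick sanity check, the sketch below verifies that a configuration file contains the three values listed above. It assumes the file is a JSON document with the keys `subscription_id`, `resource_group` and `workspace_name`; the helper name `missing_config_keys` and the sample values are illustrative only.

```python
import json

# Keys the workspace configuration file is assumed to contain.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def missing_config_keys(config_text):
    """Return the required keys that are absent or empty in a JSON config string."""
    config = json.loads(config_text)
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Illustrative content only; a real file holds your own identifiers.
sample = json.dumps({
    "subscription_id": "00000000-0000-0000-0000-000000000000",
    "resource_group": "my-resource-group",
    "workspace_name": "my-workspace",
})
print(missing_config_keys(sample))
```

An empty list means all three values are present. To check a real file, read its text with `open(...).read()` and pass it to the helper.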

In [None]:
# Check core SDK version number
import azureml.core

print("SDK version:", azureml.core.VERSION)

## 3. Initialize Workspace 

Initialize your workspace object from the configuration file.

In [None]:
from azureml.core.workspace import Workspace

ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\n')

## 4. Create An Experiment 
An experiment is a logical container in an Azure ML Workspace. It contains a series of trials called `Runs`. As such, it hosts run records such as run metrics, logs, and other output artifacts from your experiments.

In [None]:
from azureml.core import Experiment
experiment_name = 'train-on-local'
exp = Experiment(workspace=ws, name=experiment_name)

## 5. View training and auxiliary scripts 

For convenience, we have already created the training script (`train.py`) and a supporting library (`mylib.py`) for you. Take a few minutes to examine both files.

In [None]:
with open('./train.py', 'r') as f:
 print(f.read())

In [None]:
with open('./mylib.py', 'r') as f:
 print(f.read())

## 6. Configure & Run 
### 6.A User-managed environment

#### 6.A.a Set the environment up
When using a user-managed environment, you are responsible for ensuring that all the necessary packages are available in the Python environment you choose to run the script in.

In [None]:
from azureml.core import Environment

# Edit a run configuration property on the fly.
user_managed_env = Environment("user-managed-env")

user_managed_env.python.user_managed_dependencies = True

# You can choose a specific Python environment by pointing to a Python path 
#user_managed_env.python.interpreter_path = '/home/johndoe/miniconda3/envs/myenv/bin/python'
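
Because a user-managed environment leaves package installation to you, it can help to check up front that the modules your script imports are available. The sketch below is one way to do that with the standard library; `missing_packages` is a hypothetical helper, and it inspects the interpreter running this notebook, which may differ from the one you select through `interpreter_path`.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# `sklearn` is the import name of scikit-learn; adjust the list to your script.
print(missing_packages(["json", "sklearn"]))
```

Any name printed here would need to be installed in the chosen environment before submitting the run. Only top-level module names are supported by this sketch.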

#### 6.A.b Submit the script to run in the user-managed environment
However you manage your environment, you submit the run through the `ScriptRunConfig` class. It lets you configure the run by pointing to the `train.py` script and to the source directory, which also contains the `mylib.py` file. Together, these inputs define what is executed in the run. Once the run is configured, you submit it to your experiment.

In [None]:
from azureml.core import ScriptRunConfig

src = ScriptRunConfig(source_directory='./', script='train.py')
src.run_config.environment = user_managed_env

In [None]:
run = exp.submit(src)

#### 6.A.c Get run history details

While all calculations were run on your machine (cf. below), by using a `run` you also captured the results of your calculations into your run and experiment. You can then see them on the Azure portal, through the link displayed as output of the following cell.

**Note**: The recording of the computation results into your run was made possible by the `run.log()` commands in the `train.py` file.
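
To illustrate how repeated logging calls become the lists of values queried later in this notebook, here is a minimal stand-in. `FakeRun` is a hypothetical class written for this sketch, not part of the Azure ML SDK, and the alpha and MSE values are made up. It only mimics the behavior where logging the same metric name several times yields a list of values.

```python
from collections import defaultdict


class FakeRun:
    """Hypothetical stand-in for a run object; not the Azure ML SDK class."""

    def __init__(self):
        self._metrics = defaultdict(list)

    def log(self, name, value):
        # Each call appends one value under the given metric name.
        self._metrics[name].append(value)

    def get_metrics(self):
        # A single value is returned as a scalar, repeated values as a list.
        return {
            name: values[0] if len(values) == 1 else list(values)
            for name, values in self._metrics.items()
        }


fake_run = FakeRun()
# Made-up (alpha, mse) pairs standing in for a training loop.
for alpha, mse in [(0.1, 3.5), (0.2, 3.2), (0.3, 3.4)]:
    fake_run.log("alpha", alpha)
    fake_run.log("mse", mse)

print(fake_run.get_metrics())
```

Because both metrics are logged once per iteration, the two lists stay aligned by position, which is what makes it possible to look up the alpha that produced a given MSE.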

In [None]:
run

Note: if you need to cancel a run, you can follow [these instructions](https://aka.ms/aml-docs-cancel-run).

Block further execution until the run finishes.

In [None]:
run.wait_for_completion(show_output=True)

**Note:** All these calculations were run on your local machine, in the conda environment you defined above. You can find the results in:
- `~/.azureml/envs/azureml_xxxx` for the conda environment you just created
- `~/AppData/Local/Temp/azureml_runs/train-on-local_xxxx` for the machine learning models you trained (this path may differ depending on the platform you use). This folder also contains:
    - Logs (under `azureml_logs/`)
    - Output pickled files (under `outputs/`)
    - The configuration files (credentials, local and Docker image setups)
    - The `train.py` and `mylib.py` scripts
    - The current notebook

Take a few minutes to examine the output of the cell above. It shows the content of some of the log files, and extra information on the conda environment used.

### 6.B System-managed environment
#### 6.B.a Set the environment up
Now, instead of managing the setup of the environment yourself, you can ask the system to build a new conda environment for you. The environment is built once, and will be reused in subsequent executions as long as the conda dependencies remain unchanged.
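
The "build once, reuse while unchanged" idea can be pictured as a cache keyed on the dependency list. The sketch below is purely conceptual: the function names and the hashing scheme are assumptions made for illustration and do not describe how Azure ML identifies or stores environments internally.

```python
import hashlib
import json


def dependency_fingerprint(conda_packages, pip_packages=()):
    """Return a stable hash for a set of dependencies (order-insensitive)."""
    payload = json.dumps(
        {"conda": sorted(conda_packages), "pip": sorted(pip_packages)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


built_environments = {}  # fingerprint -> label of an already "built" environment


def get_or_build(conda_packages):
    """Reuse a cached environment when the dependencies are unchanged."""
    key = dependency_fingerprint(conda_packages)
    if key not in built_environments:
        # A real build would create the conda environment here.
        built_environments[key] = "env-" + key[:8]
    return built_environments[key]


first = get_or_build(["scikit-learn"])
second = get_or_build(["scikit-learn"])           # same dependencies: reused
third = get_or_build(["scikit-learn", "pandas"])  # changed dependencies: new build
print(first == second, first == third)
```

The first two calls share one cached entry, while adding a package produces a different fingerprint and therefore a fresh build.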

In [None]:
from azureml.core.conda_dependencies import CondaDependencies

system_managed_env = Environment("system-managed-env")

system_managed_env.python.user_managed_dependencies = False

# Specify conda dependencies with scikit-learn
cd = CondaDependencies.create(conda_packages=['scikit-learn'])
system_managed_env.python.conda_dependencies = cd

#### 6.B.b Submit the script to run in the system-managed environment
A new conda environment is built based on the conda dependencies object. If you are running this for the first time, this might take up to 5 minutes.

The commands used to execute the run are then the same as the ones you used above.

In [None]:
src.run_config.environment = system_managed_env
run = exp.submit(src)

#### 6.B.c Get run history details

In [None]:
run

In [None]:
run.wait_for_completion(show_output=True)

### 6.C Docker-based execution
In this section, you will train the same models, but in a Docker container on your local machine. This requires the Docker engine to be installed locally. If you don't have it yet, follow the instructions below.

#### How to install Docker

- [Linux](https://docs.docker.com/install/linux/docker-ce/ubuntu/)
- [macOS](https://docs.docker.com/docker-for-mac/install/)
- [Windows](https://docs.docker.com/docker-for-windows/install/)

 In case of issues, troubleshooting documentation can be found [here](https://docs.docker.com/docker-for-windows/troubleshoot/#running-docker-for-windows-in-nested-virtualization-scenarios). Additionally, you can follow the steps below, if Virtualization is not enabled on your machine:
 - Go to Task Manager > Performance
 - Check that Virtualization is enabled
 - If it is not, go to `Start > Settings > Update and security > Recovery > Advanced Startup - Restart now > Troubleshoot > Advanced options > UEFI firmware settings - restart`
 - In the BIOS, go to `Advanced > System options > Click the "Virtualization Technology (VTx)" only > Save > Exit > Save all changes` -- This will restart the machine

**Notes**: 
- If your kernel is already running in a Docker container, such as **Azure Notebooks**, this mode will **NOT** work.
- If you use a GPU base image, it needs to be used on Microsoft Azure Services such as ACI, AML Compute, Azure VMs, or AKS.

You can also ask the system to pull down a Docker image and execute your scripts in it.

#### 6.C.a Set the environment up

In the cell below, you will configure your run to execute in a Docker container. It will:
- run on a CPU
- contain a conda environment in which the scikit-learn library will be installed.

As before, you will finish configuring your run by pointing to the `train.py` and `mylib.py` files.

In [None]:
docker_env = Environment("docker-env")

docker_env.python.user_managed_dependencies = False
docker_env.docker.enabled = True

# use the default CPU-based Docker image from Azure ML
print(docker_env.docker.base_image)

# Specify conda dependencies with scikit-learn
docker_env.python.conda_dependencies = cd

#### 6.C.b Submit the script to run in the Docker environment

The run is now configured and ready to be executed in a Docker container. If you are running this for the first time, the Docker container will get created, as well as the conda environment inside it. This will take several minutes. Once all this is generated, however, this conda environment will be reused as long as you don't change the conda dependencies.

In [None]:
import subprocess

src.run_config.environment = docker_env

# Check if Docker is installed and Linux containers are enabled
if subprocess.run("docker -v", shell=True).returncode == 0:
    out = subprocess.check_output("docker system info", shell=True).decode('ascii')
    if "OSType: linux" not in out:
        print("Switch Docker engine to use Linux containers.")
    else:
        run = exp.submit(src)
else:
    print("Docker engine is not installed.")
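
The Docker check can also be split into small functions, which makes the parsing step testable without a Docker installation. This is a sketch under stated assumptions: `uses_linux_containers` and `docker_ready` are hypothetical helper names, and the sample strings are abbreviated, illustrative excerpts of `docker system info` output.

```python
import shutil
import subprocess


def uses_linux_containers(docker_info_text):
    """Return True if `docker system info` output reports a Linux engine."""
    return any(
        line.strip() == "OSType: linux"
        for line in docker_info_text.splitlines()
    )


def docker_ready():
    """Return True if the docker CLI is on PATH and runs Linux containers."""
    if shutil.which("docker") is None:
        return False
    result = subprocess.run(
        ["docker", "system", "info"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    if result.returncode != 0:
        return False
    return uses_linux_containers(result.stdout.decode("utf-8", errors="replace"))


# Abbreviated, illustrative snippets of `docker system info` output.
print(uses_linux_containers(" Kernel Version: 5.15\n OSType: linux\n"))
print(uses_linux_containers(" OSType: windows\n"))
```

With these helpers, the submission step could be guarded by `if docker_ready():` before calling `exp.submit(src)`. Passing the command as a list avoids going through a shell.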

##### Potential issue on Windows and how to solve it

If you are using a Windows machine, the creation of the Docker image may fail, and you may see the following error message
`docker: Error response from daemon: Drive has not been shared. Failed to launch docker container. Check that docker is running and that C:\ on Windows and /tmp elsewhere is shared.`

This is because the process above tries to create a Linux-based (i.e. non-Windows) Docker image and needs access to a shared drive. To fix this, you can:
- Open the Docker user interface
- Navigate to Settings > Shared drives
- Select C (or both C and D, if you have one)
- Apply

When this is done, you can try and re-run the command above.



#### 6.C.c Get run history details

In [None]:
# Get run history details
run

In [None]:
run.wait_for_completion(show_output=True)

The results obtained here should be the same as those obtained before. However, take a look at the "Execution summary" section in the output of the cell above and look for "docker". There, you should see the "enabled" field set to True. Compare this to the two prior runs, where "enabled" was set to False.

#### 6.C.d Use a custom Docker image

You can also specify a custom Docker image, if you don't want to use the default image provided by Azure ML.

You can either pull a public image from Docker Hub, such as Anaconda's Miniconda image:
```python
# Use an image available in Docker Hub without authentication
docker_env.docker.base_image = "continuumio/miniconda3"
```

Or one of the images you may already have created:
```python
# or, use an image available in your private Azure Container Registry
docker_env.docker.base_image = "mycustomimage:1.0"
docker_env.docker.base_image_registry.address = "myregistry.azurecr.io"
docker_env.docker.base_image_registry.username = "username"
docker_env.docker.base_image_registry.password = "password"
```
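
Hard-coding a registry password in a notebook makes it easy to leak. As a hedged alternative, the sketch below reads the credentials from environment variables. The variable names `ACR_USERNAME` and `ACR_PASSWORD` and the helper `registry_credentials` are hypothetical choices made for this example.

```python
import os


def registry_credentials(env=None):
    """Read registry credentials from environment variables.

    ACR_USERNAME and ACR_PASSWORD are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("ACR_USERNAME", "ACR_PASSWORD") if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {"username": env["ACR_USERNAME"], "password": env["ACR_PASSWORD"]}


# Demonstration with an in-memory mapping instead of the real environment.
creds = registry_credentials({"ACR_USERNAME": "username", "ACR_PASSWORD": "password"})
print(creds["username"])
```

The returned values can then be assigned to the `base_image_registry.username` and `base_image_registry.password` fields shown above, so that the secrets never appear in the notebook itself.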

##### Where to find my Docker image name and registry credentials
If you do not know the name of your Docker image or container registry, or how to access the username and password needed above, proceed as follows:
- Docker image name:
    - In the portal, under your resource group, click on your current workspace
    - Click on Experiments
    - Click on Images
    - Click on the image of your choice
    - Copy the "ID" string
    - In this notebook, replace "mycustomimage:1.0" with that ID string
- Username and password:
    - In the portal, under your resource group, click on the container registry associated with your workspace
        - If you have several and don't know which one you need, click on your workspace, go to Overview and click on the "Registry" name on the upper right of the screen
    - There, go to "Access keys"
    - Copy the username and one of the passwords
    - In this notebook, replace "username" and "password" with these values

In either case, add the lines above to the environment configuration cell in section 6.C.a, so that they override the default base image.

When you are using your custom Docker image, you might already have your Python environment properly set up. In that case, you can skip specifying conda dependencies, and just use the `user_managed_dependencies` option instead:
```python
docker_env.python.user_managed_dependencies = True
# path to the Python environment in the custom Docker image
docker_env.python.interpreter_path = '/opt/conda/bin/python'
```

## 7. Query run metrics 

Once your run has completed, you can now extract the metrics you captured by using the `get_metrics` method. As shown in the `train.py` file, these metrics are "alpha" and "mse".

In [None]:
# Get all metrics logged in the run
metrics = run.get_metrics()
metrics

Let's find the model that has the lowest MSE value logged.

In [None]:
import numpy as np

best_alpha = metrics['alpha'][np.argmin(metrics['mse'])]

print('When alpha is {1:0.2f}, we have min MSE {0:0.2f}.'.format(
 min(metrics['mse']), 
 best_alpha
))
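
The selection logic can be checked on a small hand-made dictionary, without a completed run. The values below are made up and only mirror the shape of the metrics dictionary; the sketch relies on the two lists being aligned by position and uses the standard library only.

```python
# Made-up metrics with the same shape as the dictionary used above.
sample_metrics = {
    "alpha": [0.1, 0.2, 0.3, 0.4, 0.5],
    "mse": [3.40, 3.35, 3.31, 3.29, 3.30],
}

# Index of the smallest MSE, then the alpha logged at the same position.
best_index = min(
    range(len(sample_metrics["mse"])),
    key=sample_metrics["mse"].__getitem__,
)
best_sample_alpha = sample_metrics["alpha"][best_index]

print("alpha={:.2f}, mse={:.2f}".format(
    best_sample_alpha, sample_metrics["mse"][best_index]
))
```

`np.argmin` performs the same index lookup; the pure-Python form is shown only to make the step explicit.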

Let's compare it to the others by plotting MSE against alpha.

In [None]:
%matplotlib inline

import matplotlib.pyplot as plt

plt.plot(metrics['alpha'], metrics['mse'], marker='o')
plt.ylabel("MSE")
plt.xlabel("Alpha")

You can also list all the files that are associated with this run record.

In [None]:
run.get_file_names()

From the results obtained above, `ridge_0.40.pkl` is the best performing model. You can now register that particular model with the workspace. Once you have done so, go back to the portal and click on "Models". You should see it there.

In [None]:
# Supply a model name, and the full path to the serialized model file.
model = run.register_model(model_name='best_ridge_model', model_path='./outputs/ridge_0.40.pkl')
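
Rather than typing the file name by hand, the path can be derived from the selected alpha. This sketch assumes the training script names its files `ridge_<alpha with two decimals>.pkl`, as suggested by `ridge_0.40.pkl`; `model_path_for` is a hypothetical helper.

```python
def model_path_for(alpha, output_dir="outputs"):
    """Build the path of the pickled model for a given alpha (assumed naming)."""
    return "{}/ridge_{:0.2f}.pkl".format(output_dir, alpha)


print(model_path_for(0.4))
```

The result could be passed as `model_path` when registering the model, for example with the best alpha found earlier, provided the naming assumption holds for your script.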

In [None]:
print("Registered model:\n --> Name: {}\n --> Version: {}\n --> URL: {}".format(model.name, model.version, model.url))

You can now deploy your model by following [this example](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb).