Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# 02. Train locally
* Create or load workspace.
* Create scripts locally.
* Create `train.py` in a folder, along with a `mylib.py` file.
* Configure & execute a local run in a user-managed Python environment.
* Configure & execute a local run in a system-managed Python environment.
* Configure & execute a local run in a Docker environment.
* Query run metrics to find the best model.
* Register model for operationalization.

## Prerequisites
If you haven't already, go through the [configuration notebook](../../../configuration.ipynb) first.

In [None]:
# Check core SDK version number
import azureml.core

print("SDK version:", azureml.core.VERSION)

## Initialize Workspace

Initialize a workspace object from persisted configuration.

In [None]:
from azureml.core.workspace import Workspace

ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\n')

## Create An Experiment
An **Experiment** is a logical container in an Azure ML Workspace. It hosts run records, which can include run metrics and output artifacts from your experiments.

In [None]:
from azureml.core import Experiment
experiment_name = 'train-on-local'
exp = Experiment(workspace=ws, name=experiment_name)

## View `train.py`

`train.py` is already created for you.

In [None]:
with open('./train.py', 'r') as f:
 print(f.read())

Note that `train.py` also references a `mylib.py` file.

In [None]:
with open('./mylib.py', 'r') as f:
 print(f.read())

## Configure & Run
### User-managed environment
Below, we use a user-managed run, which means you are responsible for ensuring that all the necessary packages are available in the Python environment you choose to run the script in.

In [None]:
from azureml.core.runconfig import RunConfiguration

# Edit a run configuration property on the fly.
run_config_user_managed = RunConfiguration()

run_config_user_managed.environment.python.user_managed_dependencies = True

# You can choose a specific Python environment by pointing to a Python path
#run_config_user_managed.environment.python.interpreter_path = '/home/johndoe/miniconda3/envs/myenv/bin/python'
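Because a user-managed run relies on packages that already exist in the chosen interpreter, it can help to confirm they are importable before submitting. The following is a minimal sketch using only the standard library. The helper name `missing_packages` and the module list are assumptions for illustration; adjust the list to whatever `train.py` and `mylib.py` actually import.

```python
import importlib.util
import sys


def missing_packages(module_names):
    """Return the top-level module names the current interpreter cannot find."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Hypothetical list of modules; replace with the real imports of your script.
required = ["numpy", "sklearn", "joblib"]

missing = missing_packages(required)
if missing:
    print("Missing from", sys.executable, ":", ", ".join(missing))
else:
    print("All required modules are importable from", sys.executable)
```

Run this check with the same interpreter that the run configuration points to, since a different interpreter may see a different set of packages.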

#### Submit script to run in the user-managed environment
Note that the whole script folder is submitted for execution, including the `mylib.py` file.

In [None]:
from azureml.core import ScriptRunConfig

src = ScriptRunConfig(source_directory='./', script='train.py', run_config=run_config_user_managed)
run = exp.submit(src)

#### Get run history details

In [None]:
run

Note: if you need to cancel a run, you can follow [these instructions](https://aka.ms/aml-docs-cancel-run).

Block and wait until the run finishes.

In [None]:
run.wait_for_completion(show_output=True)

### System-managed environment
You can also ask the system to build a new conda environment and execute your scripts in it. The environment is built once and will be reused in subsequent executions as long as the conda dependencies remain unchanged. 
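The reuse rule described above (build once, reuse while the dependencies stay the same) can be pictured as a cache keyed by a fingerprint of the dependency list. The sketch below is purely illustrative and is not how Azure ML is implemented; `environment_key`, `get_or_build`, and the `built_environments` dictionary are hypothetical names.

```python
import hashlib
import json

# Hypothetical cache: dependency fingerprint -> environment name.
built_environments = {}


def environment_key(conda_packages, pip_packages=()):
    """Build a stable fingerprint for a dependency specification."""
    spec = {"conda": sorted(conda_packages), "pip": sorted(pip_packages)}
    payload = json.dumps(spec, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def get_or_build(conda_packages):
    """Return (environment name, was_built) for the given dependencies."""
    key = environment_key(conda_packages)
    if key not in built_environments:
        # Stand-in for the slow, one-time environment build.
        built_environments[key] = "env_" + key[:8]
        return built_environments[key], True
    return built_environments[key], False


print(get_or_build(["scikit-learn"]))           # first call: built
print(get_or_build(["scikit-learn"]))           # same dependencies: reused
print(get_or_build(["scikit-learn", "pandas"])) # changed dependencies: built again
```

The point of the sketch is only that any change to the dependency list produces a different key, which is why editing the conda dependencies triggers a fresh build.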

In [None]:
from azureml.core.conda_dependencies import CondaDependencies

run_config_system_managed = RunConfiguration()

run_config_system_managed.environment.python.user_managed_dependencies = False
run_config_system_managed.auto_prepare_environment = True

# Specify conda dependencies with scikit-learn
cd = CondaDependencies.create(conda_packages=['scikit-learn'])
run_config_system_managed.environment.python.conda_dependencies = cd

#### Submit script to run in the system-managed environment
A new conda environment is built based on the conda dependencies object. If you are running this for the first time, this might take up to 5 minutes. The conda environment is then reused as long as you don't change the conda dependencies.

In [None]:
src = ScriptRunConfig(source_directory="./", script='train.py', run_config=run_config_system_managed)
run = exp.submit(src)

#### Get run history details

In [None]:
run

Block and wait until the run finishes.

In [None]:
run.wait_for_completion(show_output = True)

### Docker-based execution
**IMPORTANT**: You must have Docker engine installed locally in order to use this execution mode. If your kernel is already running in a Docker container, such as **Azure Notebooks**, this mode will **NOT** work.

NOTE: The GPU base image must be used only on Microsoft Azure services, such as ACI, AML Compute, Azure VMs, and AKS.

You can also ask the system to pull down a Docker image and execute your scripts in it.

In [None]:
run_config_docker = RunConfiguration()
run_config_docker.environment.python.user_managed_dependencies = False
run_config_docker.auto_prepare_environment = True
run_config_docker.environment.docker.enabled = True

# use the default CPU-based Docker image from Azure ML
run_config_docker.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE

# Specify conda dependencies with scikit-learn
cd = CondaDependencies.create(conda_packages=['scikit-learn'])
run_config_docker.environment.python.conda_dependencies = cd

src = ScriptRunConfig(source_directory="./", script='train.py', run_config=run_config_docker)

#### Submit script to run in the Docker environment
A new conda environment is built inside the Docker image, based on the conda dependencies object. If you are running this for the first time, this might take up to 5 minutes. The environment is then reused as long as you don't change the conda dependencies.

In [None]:
import subprocess

# Check if Docker is installed and Linux containers are enabled
if subprocess.run("docker -v", shell=True).returncode == 0:
    out = subprocess.check_output("docker system info", shell=True, encoding="ascii").split("\n")
    # Output lines are indented, so match on substrings rather than whole lines
    if not any("OSType: linux" in line for line in out):
        print("Switch Docker engine to use Linux containers.")
    else:
        run = exp.submit(src)
else:
    print("Docker engine not installed.")

In [None]:
# Get run history details
run

In [None]:
run.wait_for_completion(show_output=True)

#### Use a custom Docker image

You can also specify a custom Docker image if you don't want to use the default image provided by Azure ML.

```python
# use an image available in Docker Hub without authentication
run_config_docker.environment.docker.base_image = "continuumio/miniconda3"

# or, use an image available in a private Azure Container Registry
run_config_docker.environment.docker.base_image = "mycustomimage:1.0"
run_config_docker.environment.docker.base_image_registry.address = "myregistry.azurecr.io"
run_config_docker.environment.docker.base_image_registry.username = "username"
run_config_docker.environment.docker.base_image_registry.password = "password"
```

When you use a custom Docker image, you might already have a properly set up Python environment inside the image. In that case, you can skip specifying conda dependencies and use the `user_managed_dependencies` option instead:
```python
run_config_docker.environment.python.user_managed_dependencies = True
# path to the Python environment in the custom Docker image
run_config_docker.environment.python.interpreter_path = '/opt/conda/bin/python'
```

## Query run metrics

In [None]:
# get all metrics logged in the run
metrics = run.get_metrics()

Let's find the model that has the lowest MSE value logged.

In [None]:
import numpy as np

best_alpha = metrics['alpha'][np.argmin(metrics['mse'])]

print('When alpha is {1:0.2f}, we have min MSE {0:0.2f}.'.format(
 min(metrics['mse']), 
 best_alpha
))
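The selection above works because a metric logged several times under the same name comes back as a list, so `metrics['alpha']` and `metrics['mse']` line up index by index. The same lookup can be sketched without NumPy. The values below are hypothetical and only stand in for a real `run.get_metrics()` result; `best_by_lowest` is an illustrative helper, not part of the SDK.

```python
# Hypothetical values, shaped like the dictionary returned by run.get_metrics()
# when the script logs 'alpha' and 'mse' once per trained model.
sample_metrics = {
    "alpha": [0.1, 0.2, 0.3, 0.4, 0.5],
    "mse": [3420.5, 3390.1, 3355.7, 3295.2, 3310.8],
}


def best_by_lowest(metrics, param="alpha", score="mse"):
    """Return the (param value, score) pair that has the lowest score."""
    best_score, best_param = min(zip(metrics[score], metrics[param]))
    return best_param, best_score


alpha, mse = best_by_lowest(sample_metrics)
print("When alpha is {0:0.2f}, we have min MSE {1:0.2f}.".format(alpha, mse))
```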

You can also list all the files that are associated with this run record.

In [None]:
run.get_file_names()

We know from the earlier queries that `ridge_0.40.pkl` is the best-performing model, so let's register it with the workspace.

In [None]:
# supply a model name, and the full path to the serialized model file.
model = run.register_model(model_name='best_ridge_model', model_path='./outputs/ridge_0.40.pkl')
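The model path above is hard-coded. If the best alpha differs on your run, the file name can be derived from it instead. This is a sketch under two assumptions: that the script names its outputs `ridge_<alpha>.pkl` with two decimal places (inferred from `ridge_0.40.pkl`), and that the run's file listing uses `outputs/...` relative paths. The helper `model_file_for_alpha` and the sample listing are hypothetical.

```python
def model_file_for_alpha(alpha, file_names):
    """Return the serialized model path for `alpha`, or None if it is absent."""
    # Assumed naming pattern, inferred from 'ridge_0.40.pkl'.
    candidate = "outputs/ridge_{0:0.2f}.pkl".format(alpha)
    return candidate if candidate in file_names else None


# Hypothetical listing, standing in for the result of run.get_file_names().
sample_files = [
    "outputs/ridge_0.30.pkl",
    "outputs/ridge_0.40.pkl",
    "outputs/ridge_0.50.pkl",
]

print(model_file_for_alpha(0.4, sample_files))
```

Checking the candidate against the file listing first avoids attempting to register a file that the run never produced.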

In [None]:
print(model.name, model.version, model.url)

Now you can deploy this model following the example in the 01 notebook.