Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# 01. Train in the Notebook & Deploy Model to ACI

* Load workspace
* Train a simple regression model directly in the Notebook python kernel
* Record run history
* Find the best model in run history and download it.
* Deploy the model as an Azure Container Instance (ACI)

## Prerequisites
1. If you haven't already, work through the [00. Installation and Configuration](00.configuration.ipynb) Notebook first.

2. Install the following prerequisite libraries in your conda environment, then restart the notebook.
```shell
(myenv) $ conda install -y matplotlib tqdm scikit-learn
```

3. Check that ACI is registered for your Azure Subscription.  

In [None]:
!az provider show -n Microsoft.ContainerInstance -o table

If ACI is not registered, run the following command to register it. Note that you have to be a subscription owner, or this command will fail.

In [None]:
!az provider register -n Microsoft.ContainerInstance

## Validate Azure ML SDK installation and get version number for debugging purposes

In [None]:
from azureml.core import Experiment, Run, Workspace
import azureml.core

# Check core SDK version number
print("SDK version:", azureml.core.VERSION)

## Initialize Workspace

Initialize a workspace object from persisted configuration.

In [None]:
ws = Workspace.from_config()
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep='\n')
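
`Workspace.from_config()` reads the workspace details from a JSON file written during configuration. As a rough illustration, the sketch below writes and reads back a file of that shape using only the standard library. The three key names are the ones the SDK's `config.json` conventionally uses; the values, the temporary directory, and the `load_workspace_config` helper are hypothetical placeholders, not part of the SDK.

```python
import json
import os
import tempfile

# Placeholder values; a real file holds your own subscription and workspace details.
example_config = {
    "subscription_id": "00000000-0000-0000-0000-000000000000",
    "resource_group": "my-resource-group",
    "workspace_name": "my-workspace",
}


def load_workspace_config(path):
    """Hypothetical helper: read a workspace config file and check its keys."""
    with open(path, "r") as f:
        config = json.load(f)
    missing = {"subscription_id", "resource_group", "workspace_name"} - set(config)
    if missing:
        raise KeyError("config is missing keys: " + ", ".join(sorted(missing)))
    return config


# Write the example file to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, "config.json")
    with open(config_path, "w") as f:
        json.dump(example_config, f, indent=2)
    loaded_config = load_workspace_config(config_path)

print(loaded_config["workspace_name"])
```

If `Workspace.from_config()` fails, checking that the persisted file contains these three keys is a reasonable first debugging step.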

## Set experiment name
Choose a name for the experiment.

In [None]:
experiment_name = 'train-in-notebook'

## Start a training run in local Notebook

In [None]:
# load diabetes dataset, a well-known small dataset that comes with scikit-learn
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
# sklearn.externals.joblib was removed in scikit-learn 0.23; import joblib directly
import joblib

X, y = load_diabetes(return_X_y = True)
columns = ['age', 'gender', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
data = {
    "train":{"X": X_train, "y": y_train},        
    "test":{"X": X_test, "y": y_test}
}

### Train a simple Ridge model
Train a very simple Ridge regression model in scikit-learn, and save it as a pickle file.

In [None]:
reg = Ridge(alpha = 0.03)
reg.fit(X=data['train']['X'], y=data['train']['y'])
preds = reg.predict(data['test']['X'])
print('Mean Squared Error is', mean_squared_error(data['test']['y'], preds))
joblib.dump(value=reg, filename='model.pkl');

### Add experiment tracking
Now, let's add Azure ML experiment logging, and upload the persisted model into the run record as well.

In [None]:
experiment = Experiment(workspace=ws, name=experiment_name)
run = experiment.start_logging()

run.tag("Description","My first run!")
run.log('alpha', 0.03)
reg = Ridge(alpha=0.03)
reg.fit(data['train']['X'], data['train']['y'])
preds = reg.predict(data['test']['X'])
run.log('mse', mean_squared_error(data['test']['y'], preds))
joblib.dump(value=reg, filename='model.pkl')
run.upload_file(name='outputs/model.pkl', path_or_stream='./model.pkl')

run.complete()

We can browse to the recorded run. Please make sure you use Chrome to navigate the run history page.

In [None]:
run

### Simple parameter sweep
Sweep over alpha values of a scikit-learn Ridge model, and capture the metrics and trained model of each run in the Azure ML experiment.

In [None]:
import numpy as np
import os
from tqdm import tqdm

model_name = "model.pkl"

# list of numbers from 0 to 0.95 with a 0.05 interval (np.arange excludes the stop value)
alphas = np.arange(0.0, 1.0, 0.05)

# try each alpha value in a Ridge (regularized linear regression) model
for alpha in tqdm(alphas):
    # create one run per alpha value, each training a model with that alpha
    with experiment.start_logging() as run:
        # Use Ridge algorithm to build a regression model
        reg = Ridge(alpha=alpha)
        reg.fit(X=data["train"]["X"], y=data["train"]["y"])
        preds = reg.predict(X=data["test"]["X"])
        mse = mean_squared_error(y_true=data["test"]["y"], y_pred=preds)

        # log alpha, mean_squared_error and feature names in run history
        run.log(name="alpha", value=alpha)
        run.log(name="mse", value=mse)
        run.log_list(name="columns", value=columns)

        with open(model_name, "wb") as file:
            joblib.dump(value=reg, filename=file)
        
        # upload the serialized model into run history record
        run.upload_file(name="outputs/" + model_name, path_or_stream=model_name)

        # now delete the serialized model from local folder since it is already uploaded to run history 
        os.remove(path=model_name)
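
One detail of the sweep is worth spelling out: `np.arange` excludes its stop value, so the grid covers 0.0 through 0.95 in 20 steps and never reaches 1.0. The same grid can be sketched in plain Python; the `alpha_grid` name is only for illustration.

```python
# Build the same 20-point grid without NumPy: 0.0, 0.05, ..., 0.95.
# Rounding avoids floating-point noise such as 0.15000000000000002.
step = 0.05
alpha_grid = [round(i * step, 2) for i in range(20)]

print(len(alpha_grid))                # number of runs the sweep creates
print(alpha_grid[0], alpha_grid[-1])  # first and last alpha values
```

Each entry corresponds to one run in the experiment, so the sweep creates 20 runs in total.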

In [None]:
# now let's take a look at the experiment in Azure portal.
experiment

## Select best model from the experiment
Load all experiment run metrics recursively from the experiment into a dictionary object.

In [None]:
runs = {}
run_metrics = {}

for r in tqdm(experiment.get_runs()):
    metrics = r.get_metrics()
    if 'mse' in metrics.keys():
        runs[r.id] = r
        run_metrics[r.id] = metrics

Now find the run with the lowest Mean Squared Error value.

In [None]:
best_run_id = min(run_metrics, key = lambda k: run_metrics[k]['mse'])
best_run = runs[best_run_id]
print('Best run is:', best_run_id)
print('Metrics:', run_metrics[best_run_id])
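
The selection step relies on `min()` with a `key` function over a dictionary. A minimal sketch with made-up metrics (the run ids and numbers below are hypothetical, not real results) shows how it picks the run id with the smallest `mse`:

```python
# Hypothetical metrics keyed by run id, shaped like the run_metrics dictionary.
example_metrics = {
    "run_a": {"alpha": 0.10, "mse": 3372.6},
    "run_b": {"alpha": 0.40, "mse": 3295.7},
    "run_c": {"alpha": 0.80, "mse": 3340.2},
}

# min() iterates over the keys; the key function maps each run id to its MSE.
example_best_id = min(example_metrics, key=lambda k: example_metrics[k]["mse"])
print(example_best_id, example_metrics[example_best_id])
```

Because the comparison uses only the `mse` entry, any other logged metrics simply ride along in the dictionary.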

You can add tags to your runs to make them easier to catalog.

In [None]:
best_run.tag(key="Description", value="The best one")
best_run.get_tags()

### Plot MSE over alpha

Let's observe the best model visually by plotting the MSE values over alpha values:

In [None]:
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt

best_alpha = run_metrics[best_run_id]['alpha']
min_mse = run_metrics[best_run_id]['mse']

alpha_mse = np.array([(run_metrics[k]['alpha'], run_metrics[k]['mse']) for k in run_metrics.keys()])
sorted_alpha_mse = alpha_mse[alpha_mse[:,0].argsort()]

plt.plot(sorted_alpha_mse[:,0], sorted_alpha_mse[:,1], 'r--')
plt.plot(sorted_alpha_mse[:,0], sorted_alpha_mse[:,1], 'bo')

plt.xlabel('alpha', fontsize = 14)
plt.ylabel('mean squared error', fontsize = 14)
plt.title('MSE over alpha', fontsize = 16)

# plot arrow
plt.arrow(x = best_alpha, y = min_mse + 39, dx = 0, dy = -26, ls = '-', lw = 0.4,
          width = 0, head_width = .03, head_length = 8)

# plot "best run" text
plt.text(x = best_alpha - 0.08, y = min_mse + 50, s = 'Best Run', fontsize = 14)
plt.show()

## Register the best model

Find the model file saved in the run record of best run.

In [None]:
for f in best_run.get_file_names():
    print(f)

Now we can register this model in the model registry of the workspace

In [None]:
model = best_run.register_model(model_name='best_model', model_path='outputs/model.pkl')

Verify that the model has been registered properly. If you have done this several times, you'll see the version number increase automatically each time.

In [None]:
from azureml.core.model import Model
# Model.list needs the workspace to query
models = Model.list(workspace=ws, name='best_model')
for m in models:
    print(m.name, m.version)

You can also download the registered model. Afterwards, you should see a `model.pkl` file in the current directory. You can then use it for local testing if you'd like.

In [None]:
# remove the model file if it is already on disk
if os.path.isfile('model.pkl'): 
    os.remove('model.pkl')
# download the model
model.download(target_dir="./")

## Scoring script

Now we are ready to build a Docker image and deploy the model in it as a web service. The first step is creating the scoring script. For convenience, we have created the scoring script for you. It is printed below as text, but you can also run `%pfile ./score.py` in a cell to show the file.

The scoring script consists of two functions: `init`, which loads the model into memory when the container starts, and `run`, which makes a prediction each time the web service is called. Pay special attention to how the model is loaded in the `init()` function. When the Docker image is built for this model, the actual model file is downloaded and placed on disk, and the `get_model_path` function returns the local path where the model is placed.

In [None]:
with open('./score.py', 'r') as scoring_script:
    print(scoring_script.read())
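
Since the contents of `score.py` only appear when the cell is run, here is a rough, self-contained sketch of the two-function shape described in this section. It is not the actual file: the `StandInModel` class and the `scoring_model` variable are hypothetical placeholders so the example runs without Azure ML, whereas a real `init()` would load the pickled model from the path returned by `get_model_path`.

```python
import json


class StandInModel:
    """Hypothetical placeholder for the trained Ridge model."""

    def predict(self, rows):
        # Return the sum of each row's features as a fake prediction.
        return [sum(row) for row in rows]


scoring_model = None


def init():
    # Called once when the container starts: load the model into memory.
    # A real script would locate and unpickle the registered model here.
    global scoring_model
    scoring_model = StandInModel()


def run(raw_data):
    # Called on every scoring request: parse JSON, predict, return JSON.
    try:
        rows = json.loads(raw_data)["data"]
        return json.dumps({"result": scoring_model.predict(rows)})
    except Exception as e:
        # Report failures as JSON instead of raising inside the service.
        return json.dumps({"error": str(e)})


init()
print(run(json.dumps({"data": [[1, 2, 3], [4, 5, 6]]})))
```

Loading the model once in `init()` rather than inside `run()` keeps per-request latency low, since the file is read only at container start.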

## Create environment dependency file

We need an environment dependency file, `myenv.yml`, to specify which libraries the scoring script needs when the Docker image is built for web service deployment. We can create this file manually, or we can use the `CondaDependencies` API to generate it automatically.

In [None]:
from azureml.core.conda_dependencies import CondaDependencies 

myenv = CondaDependencies()
myenv.add_conda_package("scikit-learn")
print(myenv.serialize_to_string())

with open("myenv.yml","w") as f:
    f.write(myenv.serialize_to_string())

## Deploy web service into an Azure Container Instance
The deployment process takes the registered model and your scoring script and builds a Docker image. It then deploys the Docker image into Azure Container Instance as a running container with an HTTP endpoint ready for scoring calls. Read more about [Azure Container Instance](https://azure.microsoft.com/en-us/services/container-instances/).

Note that ACI is great for quick and cost-effective dev/test deployment scenarios. For production workloads, please use [Azure Kubernetes Service (AKS)](https://azure.microsoft.com/en-us/services/kubernetes-service/) instead. Follow the instructions in [this notebook](11.production-deploy-to-aks.ipynb) to see how that can be done from Azure ML.
 
**Note:** The web service creation can take 6-7 minutes.

In [None]:
from azureml.core.webservice import AciWebservice, Webservice

aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, 
                                               memory_gb=1, 
                                               tags={'sample name': 'AML 101'}, 
                                               description='This is a great example.')

Note that the `Webservice.deploy_from_model()` function below takes a model object registered under the workspace. It then bakes the model file into the Docker image so it can be looked up using the `Model.get_model_path()` function in `score.py`.

If you have a local model file instead of a registered model object, you can also use the `Webservice.deploy()` function, which registers the model and then deploys it.

In [None]:
from azureml.core.image import ContainerImage
image_config = ContainerImage.image_configuration(execution_script="score.py", 
                                    runtime="python", 
                                    conda_file="myenv.yml")

In [None]:
%%time
# this will take 5-10 minutes to finish
# you can also use "az container list" command to find the ACI being deployed
service = Webservice.deploy_from_model(name='my-aci-svc',
                                       deployment_config=aciconfig,
                                       models=[model],
                                       image_config=image_config,
                                       workspace=ws)

service.wait_for_deployment(show_output=True)


## Test web service

In [None]:
print('web service is hosted in ACI:', service.scoring_uri)

Use the `run` API to call the web service with one row of data to get a prediction.

In [None]:
import json
# score the first row from the test set.
test_samples = json.dumps({"data": X_test[0:1, :].tolist()})
service.run(input_data = test_samples)

Feed the entire test set and calculate the errors (residual values).

In [None]:
# score the entire test set.
test_samples = json.dumps({'data': X_test.tolist()})

result = json.loads(service.run(input_data = test_samples))['result']
residual = result - y_test
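
To make the residual calculation concrete, here is a tiny worked example with made-up numbers (not results from the service). Each residual is a prediction minus its ground-truth value, and the mean of the squared residuals gives the same kind of MSE metric that was logged during training.

```python
# Made-up predictions and ground-truth values, for illustration only.
example_preds = [150.0, 98.0, 210.0, 175.0]
example_truth = [140.0, 100.0, 200.0, 180.0]

# Residual = prediction - ground truth, matching `result - y_test`.
example_residuals = [p - t for p, t in zip(example_preds, example_truth)]

# Mean squared error is the average of the squared residuals.
example_mse = sum(r * r for r in example_residuals) / len(example_residuals)

print(example_residuals)  # [10.0, -2.0, 10.0, -5.0]
print(example_mse)        # 57.25
```

Positive residuals mean the model over-predicted; negative residuals mean it under-predicted.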

You can also send raw HTTP request to test the web service.

In [None]:
import requests
import json

# 2 rows of input data, each with 10 made-up numerical features
input_data = "{\"data\": [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]]}"

headers = {'Content-Type':'application/json'}

# for AKS deployment you'd need to include the service key in the header as well
# api_key = service.get_key()
# headers = {'Content-Type':'application/json',  'Authorization':('Bearer '+ api_key)} 

resp = requests.post(service.scoring_uri, input_data, headers = headers)
print(resp.text)
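
The hand-escaped JSON string in the request works, but it is easy to get wrong as the payload grows. One alternative, sketched here with the same made-up rows, is to build the body with `json.dumps`; the `rows`, `payload`, and `hand_written` names are only for illustration.

```python
import json

# Same two made-up rows as the hand-written request body.
rows = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]]
payload = json.dumps({"data": rows})

hand_written = "{\"data\": [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]]}"

# Both strings decode to the same JSON document.
print(json.loads(payload) == json.loads(hand_written))
```

The resulting `payload` string could then be passed to `requests.post` in place of `input_data`.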

## Residual graph
Plot a residual value graph to chart the errors on the entire test set. Observe the nice bell curve.

In [None]:
f, (a0, a1) = plt.subplots(1, 2, gridspec_kw={'width_ratios':[3, 1], 'wspace':0, 'hspace': 0})
f.suptitle('Residual Values', fontsize = 18)

f.set_figheight(6)
f.set_figwidth(14)

a0.plot(residual, 'bo', alpha=0.4);
a0.plot([0,90], [0,0], 'r', lw=2)
a0.set_ylabel('residual values', fontsize=14)
a0.set_xlabel('test data set', fontsize=14)

a1.hist(residual, orientation='horizontal', color='blue', bins=10, histtype='step');
a1.hist(residual, orientation='horizontal', color='blue', alpha=0.2, bins=10);
a1.set_yticklabels([])

plt.show()

## Delete ACI to clean up

Deleting ACI is super fast!

In [None]:
%%time
service.delete()