Mirror of https://github.com/Azure/MachineLearningNotebooks.git (synced 2025-12-23 02:52:39 -05:00)

Compare commits (11 commits):

* fdc3fe2a53
* 628b35912c
* 3f4cc22e94
* 18d7afb707
* cd35ca30d4
* 30eae0b46c
* f16951387f
* 0d8de29147
* 836354640c
* 6162e80972
* fe9fe3392d
@@ -1,12 +0,0 @@
## Use MLflow with Azure Machine Learning service (Preview)

[MLflow](https://mlflow.org/) is an open-source platform for tracking machine learning experiments and managing models. You can use MLflow logging APIs with Azure Machine Learning service: the metrics and artifacts are logged to your Azure ML Workspace.

Try out the sample notebooks:

* [Use MLflow with Azure Machine Learning for Local Training Run](./train-local/train-local.ipynb)
* [Use MLflow with Azure Machine Learning for Remote Training Run](./train-remote/train-remote.ipynb)
* [Deploy Model as Azure Machine Learning Web Service using MLflow](./deploy-model/deploy-model.ipynb)
* [Train and Deploy PyTorch Image Classifier](./train-deploy-pytorch/train-deploy-pytorch.ipynb)
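The core pattern is the same in every notebook. As a minimal hedged sketch (it assumes the `azureml-mlflow` package is installed and a workspace `config.json` is present locally), you point MLflow's tracking URI at your Azure ML Workspace and then log as usual:

```python
import mlflow
from azureml.core import Workspace

# Connect to the Azure ML Workspace described by config.json
ws = Workspace.from_config()

# Route subsequent MLflow tracking calls to the workspace
mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())

mlflow.set_experiment("experiment-with-mlflow")
with mlflow.start_run():
    mlflow.log_metric("example_metric", 1.0)  # appears in the workspace run history
```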
@@ -1,322 +0,0 @@
|
|||||||
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
|
||||||
"\n",
|
|
||||||
"Licensed under the MIT License."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Deploy Model as Azure Machine Learning Web Service using MLflow\n",
|
|
||||||
"\n",
|
|
||||||
"This example shows you how to use mlflow together with Azure Machine Learning services for deploying a model as a web service. You'll learn how to:\n",
|
|
||||||
"\n",
|
|
||||||
" 1. Retrieve a previously trained scikit-learn model\n",
|
|
||||||
" 2. Create a Docker image from the model\n",
|
|
||||||
" 3. Deploy the model as a web service on Azure Container Instance\n",
|
|
||||||
" 4. Make a scoring request against the web service.\n",
|
|
||||||
"\n",
|
|
||||||
"## Prerequisites and Set-up\n",
|
|
||||||
"\n",
"This notebook requires you to first complete the [Use MLflow with Azure Machine Learning for Local Training Run](../train-local/train-local.ipynb) or [Use MLflow with Azure Machine Learning for Remote Training Run](../train-remote/train-remote.ipynb) notebook, so that you have an experiment run with an uploaded model in your Azure Machine Learning Workspace.\n",
|
|
||||||
"\n",
"Also, install the following packages if you haven't already:\n",
|
|
||||||
"\n",
|
|
||||||
"```\n",
|
|
||||||
"pip install azureml-mlflow pandas\n",
|
|
||||||
"```\n",
|
|
||||||
"\n",
|
|
||||||
"Then, import necessary packages:"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import mlflow\n",
|
|
||||||
"import azureml.mlflow\n",
|
|
||||||
"import azureml.core\n",
|
|
||||||
"from azureml.core import Workspace\n",
|
|
||||||
"\n",
|
|
||||||
"# Check core SDK version number\n",
|
|
||||||
"print(\"SDK version:\", azureml.core.VERSION)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Connect to workspace and set MLflow tracking URI\n",
|
|
||||||
"\n",
|
|
||||||
"Setting the tracking URI is required for retrieving the model and creating an image using the MLflow APIs."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"ws = Workspace.from_config()\n",
|
|
||||||
"\n",
|
|
||||||
"mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Retrieve model from previous run\n",
|
|
||||||
"\n",
|
|
||||||
"Let's retrieve the experiment from training notebook, and list the runs within that experiment."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"experiment_name = \"experiment-with-mlflow\"\n",
|
|
||||||
"exp = ws.experiments[experiment_name]\n",
|
|
||||||
"\n",
|
|
||||||
"runs = list(exp.get_runs())\n",
|
|
||||||
"runs"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Then, let's select the most recent training run and find its ID. You also need to specify the path in run history where the model was saved. "
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"runid = runs[0].id\n",
|
|
||||||
"model_save_path = \"model\""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Create Docker image\n",
|
|
||||||
"\n",
"To create a Docker image with Azure Machine Learning for Model Management, use the ```mlflow.azureml.build_image``` method. Specify the model path, your workspace, run ID and other parameters.\n",
|
|
||||||
"\n",
"MLflow automatically recognizes the model framework as scikit-learn, creates the scoring logic, and includes the library dependencies for you.\n",
|
|
||||||
"\n",
|
|
||||||
"Note that the image creation can take several minutes."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import mlflow.azureml\n",
|
|
||||||
"\n",
|
|
||||||
"azure_image, azure_model = mlflow.azureml.build_image(model_uri=\"runs:/{}/{}\".format(runid, model_save_path),\n",
|
|
||||||
" workspace=ws,\n",
|
|
||||||
" model_name='diabetes-sklearn-model',\n",
|
|
||||||
" image_name='diabetes-sklearn-image',\n",
|
|
||||||
" synchronous=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Deploy web service\n",
|
|
||||||
"\n",
|
|
||||||
"Let's use Azure Machine Learning SDK to deploy the image as a web service. \n",
|
|
||||||
"\n",
|
|
||||||
"First, specify the deployment configuration. Azure Container Instance is a suitable choice for a quick dev-test deployment, while Azure Kubernetes Service is suitable for scalable production deployments.\n",
|
|
||||||
"\n",
|
|
||||||
"Then, deploy the image using Azure Machine Learning SDK's ```deploy_from_image``` method.\n",
|
|
||||||
"\n",
|
|
||||||
"Note that the deployment can take several minutes."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.webservice import AciWebservice, Webservice\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"aci_config = AciWebservice.deploy_configuration(cpu_cores=1, \n",
|
|
||||||
" memory_gb=1, \n",
|
|
||||||
" tags={\"method\" : \"sklearn\"}, \n",
|
|
||||||
" description='Diabetes model',\n",
|
|
||||||
" location='eastus2')\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"# Deploy the image to Azure Container Instances (ACI) for real-time serving\n",
|
|
||||||
"webservice = Webservice.deploy_from_image(\n",
|
|
||||||
" image=azure_image, workspace=ws, name=\"diabetes-model-1\", deployment_config=aci_config)\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"webservice.wait_for_deployment(show_output=True)"
|
|
||||||
]
|
|
||||||
},
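{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you need a scalable production endpoint instead of ACI, the same image can be deployed to Azure Kubernetes Service. The following is a hedged sketch only: it assumes an existing AKS compute target named \"aks-cluster\" in the workspace, and the names are illustrative.\n",
"\n",
"```python\n",
"from azureml.core.compute import AksCompute\n",
"from azureml.core.webservice import AksWebservice, Webservice\n",
"\n",
"aks_target = AksCompute(ws, \"aks-cluster\")\n",
"aks_config = AksWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)\n",
"\n",
"webservice = Webservice.deploy_from_image(image=azure_image, workspace=ws,\n",
"                                          name=\"diabetes-model-aks\",\n",
"                                          deployment_config=aks_config,\n",
"                                          deployment_target=aks_target)\n",
"webservice.wait_for_deployment(show_output=True)\n",
"```"
]
},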
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Make a scoring request\n",
|
|
||||||
"\n",
|
|
||||||
"Let's take the first few rows of test data and score them using the web service"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"test_rows = [\n",
|
|
||||||
" [0.01991321, 0.05068012, 0.10480869, 0.07007254, -0.03596778,\n",
|
|
||||||
" -0.0266789 , -0.02499266, -0.00259226, 0.00371174, 0.04034337],\n",
|
|
||||||
" [-0.01277963, -0.04464164, 0.06061839, 0.05285819, 0.04796534,\n",
|
|
||||||
" 0.02937467, -0.01762938, 0.03430886, 0.0702113 , 0.00720652],\n",
|
|
||||||
" [ 0.03807591, 0.05068012, 0.00888341, 0.04252958, -0.04284755,\n",
|
|
||||||
" -0.02104223, -0.03971921, -0.00259226, -0.01811827, 0.00720652]]"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
"The MLflow-based web service for a scikit-learn model requires the data to be converted to a Pandas DataFrame and then serialized as JSON."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import json\n",
|
|
||||||
"import pandas as pd\n",
|
|
||||||
"\n",
|
|
||||||
"test_rows_as_json = pd.DataFrame(test_rows).to_json(orient=\"split\")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
"Let's pass the converted and serialized data to the web service to get the predictions."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"predictions = webservice.run(test_rows_as_json)\n",
|
|
||||||
"\n",
|
|
||||||
"print(predictions)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"You can use the web service's scoring URI to make a raw HTTP request"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"webservice.scoring_uri"
|
|
||||||
]
|
|
||||||
},
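{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a hedged sketch of such a raw request, reusing the ```test_rows_as_json``` payload built above (and assuming the default unauthenticated ACI endpoint):\n",
"\n",
"```python\n",
"import requests\n",
"\n",
"resp = requests.post(url=webservice.scoring_uri,\n",
"                     data=test_rows_as_json,\n",
"                     headers={\"Content-type\": \"application/json\"})\n",
"print(resp.text)\n",
"```"
]
},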
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
"You can diagnose the web service using the ```get_logs``` method."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"webservice.get_logs()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Next Steps\n",
|
|
||||||
"\n",
|
|
||||||
"Learn about [model management and inference in Azure Machine Learning service](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-model-management-and-deployment)."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": []
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "rastala"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "python",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"codemirror_mode": {
|
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.6.4"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 2
|
|
||||||
}
|
|
||||||
@@ -1,8 +0,0 @@
name: deploy-model
dependencies:
  - scikit-learn
  - matplotlib
  - pip:
    - azureml-sdk
    - azureml-mlflow
    - pandas
@@ -1,150 +0,0 @@
# Copyright (c) 2017, PyTorch Team
# All rights reserved
# Licensed under BSD 3-Clause License.

# This example is based on PyTorch MNIST example:
# https://github.com/pytorch/examples/blob/master/mnist/main.py

import mlflow
import mlflow.pytorch
from mlflow.utils.environment import _mlflow_conda_env
import warnings
import cloudpickle
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
from torchvision import datasets, transforms


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4 * 4 * 50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        # Added the view for reshaping score requests
        x = x.view(-1, 1, 28, 28)
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4 * 4 * 50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)


def train(args, model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % args.log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
            # Use MLflow logging
            mlflow.log_metric("epoch_loss", loss.item())


def test(args, model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            # sum up batch loss
            test_loss += F.nll_loss(output, target, reduction="sum").item()
            # get the index of the max log-probability
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)
    print("\n")
    print("Test set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n".format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
    # Use MLflow logging
    mlflow.log_metric("average_loss", test_loss)


class Args(object):
    pass


# Training settings
args = Args()
setattr(args, 'batch_size', 64)
setattr(args, 'test_batch_size', 1000)
setattr(args, 'epochs', 3)  # Higher number for better convergence
setattr(args, 'lr', 0.01)
setattr(args, 'momentum', 0.5)
setattr(args, 'no_cuda', True)
setattr(args, 'seed', 1)
setattr(args, 'log_interval', 10)
setattr(args, 'save_model', True)

use_cuda = not args.no_cuda and torch.cuda.is_available()

torch.manual_seed(args.seed)

device = torch.device("cuda" if use_cuda else "cpu")

kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=args.batch_size, shuffle=True, **kwargs)
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST(
        '../data',
        train=False,
        transform=transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.1307,), (0.3081,))])),
    batch_size=args.test_batch_size, shuffle=True, **kwargs)


def driver():
    warnings.filterwarnings("ignore")
    # Dependencies for deploying the model
    pytorch_index = "https://download.pytorch.org/whl/"
    pytorch_version = "cpu/torch-1.1.0-cp36-cp36m-linux_x86_64.whl"
    deps = [
        "cloudpickle=={}".format(cloudpickle.__version__),
        pytorch_index + pytorch_version,
        "torchvision=={}".format(torchvision.__version__),
        "Pillow=={}".format("6.0.0")
    ]
    with mlflow.start_run() as run:
        model = Net().to(device)
        optimizer = optim.SGD(
            model.parameters(),
            lr=args.lr,
            momentum=args.momentum)
        for epoch in range(1, args.epochs + 1):
            train(args, model, device, train_loader, optimizer, epoch)
            test(args, model, device, test_loader)
        # Log model to run history using MLflow
        if args.save_model:
            model_env = _mlflow_conda_env(additional_pip_deps=deps)
            mlflow.pytorch.log_model(model, "model", conda_env=model_env)
    return run


if __name__ == "__main__":
    driver()
@@ -1,481 +0,0 @@
|
|||||||
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
|
||||||
"\n",
|
|
||||||
"Licensed under the MIT License."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Use MLflow with Azure Machine Learning to Train and Deploy PyTorch Image Classifier\n",
|
|
||||||
"\n",
|
|
||||||
"This example shows you how to use MLflow together with Azure Machine Learning services for tracking the metrics and artifacts while training a PyTorch model to classify MNIST digit images, and then deploy the model as a web service. You'll learn how to:\n",
|
|
||||||
"\n",
|
|
||||||
" 1. Set up MLflow tracking URI so as to use Azure ML\n",
|
|
||||||
" 2. Create experiment\n",
|
|
||||||
" 3. Instrument your model with MLflow tracking\n",
|
|
||||||
" 4. Train a PyTorch model locally\n",
|
|
||||||
" 5. Train a model on GPU compute on Azure\n",
|
|
||||||
" 6. View your experiment within your Azure ML Workspace in Azure Portal\n",
|
|
||||||
" 7. Create a Docker image from the trained model\n",
|
|
||||||
" 8. Deploy the model as a web service on Azure Container Instance\n",
|
|
||||||
" 9. Call the model to make predictions\n",
|
|
||||||
" \n",
|
|
||||||
"### Pre-requisites\n",
|
|
||||||
" \n",
"Make sure you have completed the [Configuration](../../../configuration.ipynb) notebook to set up your Azure Machine Learning workspace and ensure other common prerequisites are met.\n",
|
|
||||||
"\n",
"Also, install the azureml-mlflow package using ```pip install azureml-mlflow```. Note that azureml-mlflow installs the mlflow package itself as a dependency if you haven't installed it previously.\n",
|
|
||||||
"\n",
|
|
||||||
"### Set-up\n",
|
|
||||||
"\n",
|
|
||||||
"Import packages and check versions of Azure ML SDK and MLflow installed on your computer. Then connect to your Workspace."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import sys, os\n",
|
|
||||||
"import mlflow\n",
|
|
||||||
"import mlflow.azureml\n",
|
|
||||||
"import mlflow.sklearn\n",
|
|
||||||
"\n",
|
|
||||||
"import azureml.core\n",
|
|
||||||
"from azureml.core import Workspace\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"print(\"SDK version:\", azureml.core.VERSION)\n",
|
|
||||||
"print(\"MLflow version:\", mlflow.version.VERSION)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"ws = Workspace.from_config()\n",
|
|
||||||
"ws.get_details()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
"### Set tracking URI\n",
"\n",
"Set the MLflow tracking URI to point to your Azure ML Workspace. The subsequent logging calls from MLflow APIs will go to Azure ML services and will be tracked under your Workspace."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Create Experiment\n",
|
|
||||||
"\n",
|
|
||||||
"In both MLflow and Azure ML, training runs are grouped into experiments. Let's create one for our experimentation."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"experiment_name = \"pytorch-with-mlflow\"\n",
|
|
||||||
"mlflow.set_experiment(experiment_name)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Train model locally while logging metrics and artifacts\n",
|
|
||||||
"\n",
|
|
||||||
"The ```scripts/train.py``` program contains the code to load the image dataset, and train and test the model. Within this program, the train.driver function wraps the end-to-end workflow.\n",
|
|
||||||
"\n",
"Within the driver, ```mlflow.start_run``` starts MLflow tracking. Then, ```mlflow.log_metric``` functions are used to track the convergence of the neural network training iterations. Finally, ```mlflow.pytorch.log_model``` is used to save the trained model in a framework-aware manner.\n",
|
|
||||||
"\n",
"Let's add the program to the search path, import it as a module, and then invoke the driver function. Note that the training can take a few minutes."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"lib_path = os.path.abspath(\"scripts\")\n",
|
|
||||||
"sys.path.append(lib_path)\n",
|
|
||||||
"\n",
|
|
||||||
"import train"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"run = train.driver()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"You can view the metrics of the run at Azure Portal"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"print(azureml.mlflow.get_portal_url(run))"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Train model on GPU compute on Azure\n",
|
|
||||||
"\n",
"Next, let's run the same script on GPU-enabled compute for faster training. If you've completed the [Configuration](../../../configuration.ipynb) notebook, you should have a GPU cluster named \"gpu-cluster\" available in your workspace. Otherwise, follow the instructions in the notebook to create one. For simplicity, this example uses a single process on a single VM to train the model.\n",
|
|
||||||
"\n",
|
|
||||||
"Create a PyTorch estimator to specify the training configuration: script, compute as well as additional packages needed. To enable MLflow tracking, include ```azureml-mlflow``` as pip package. The low-level specifications for the training run are encapsulated in the estimator instance."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.train.dnn import PyTorch\n",
|
|
||||||
"\n",
|
|
||||||
"pt = PyTorch(source_directory=\"./scripts\", \n",
|
|
||||||
" entry_script = \"train.py\", \n",
|
|
||||||
" compute_target = \"gpu-cluster\", \n",
|
|
||||||
" node_count = 1, \n",
|
|
||||||
" process_count_per_node = 1, \n",
|
|
||||||
" use_gpu=True,\n",
|
|
||||||
" pip_packages = [\"azureml-mlflow\", \"Pillow==6.0.0\"])\n",
|
|
||||||
"\n"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Get a reference to the experiment you created previously, but this time, as Azure Machine Learning experiment object.\n",
|
|
||||||
"\n",
|
|
||||||
"Then, use ```Experiment.submit``` method to start the remote training run. Note that the first training run often takes longer as Azure Machine Learning service builds the Docker image for executing the script. Subsequent runs will be faster as cached image is used."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core import Experiment\n",
|
|
||||||
"\n",
|
|
||||||
"exp = Experiment(ws, experiment_name)\n",
|
|
||||||
"run = exp.submit(pt)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"You can monitor the run and its metrics on Azure Portal."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"run"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Also, you can wait for run to complete."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"run.wait_for_completion(show_output=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Deploy model as web service\n",
|
|
||||||
"\n",
|
|
||||||
"To deploy a web service, first create a Docker image, and then deploy that Docker image on inferencing compute.\n",
|
|
||||||
"\n",
"The ```mlflow.azureml.build_image``` function builds a Docker image from the saved PyTorch model in a framework-aware manner. It automatically creates the PyTorch-specific inferencing wrapper code and specifies the package dependencies for you."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"run.get_file_names()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Then build a docker image using *runs:/<run.id>/model* as the model_uri path.\n",
|
|
||||||
"\n",
|
|
||||||
"Note that the image building can take several minutes."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"model_path = \"model\"\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"azure_image, azure_model = mlflow.azureml.build_image(model_uri='runs:/{}/{}'.format(run.id, model_path),\n",
|
|
||||||
" workspace=ws,\n",
|
|
||||||
" model_name='pytorch_mnist',\n",
|
|
||||||
" image_name='pytorch-mnist-img',\n",
|
|
||||||
" synchronous=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Then, deploy the Docker image to Azure Container Instance: a serverless compute capable of running a single container. You can tag and add descriptions to help keep track of your web service. \n",
|
|
||||||
"\n",
|
|
||||||
"[Other inferencing compute choices](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where) include Azure Kubernetes Service which provides scalable endpoint suitable for production use.\n",
|
|
||||||
"\n",
|
|
||||||
"Note that the service deployment can take several minutes."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.webservice import AciWebservice, Webservice\n",
|
|
||||||
"\n",
|
|
||||||
"aci_config = AciWebservice.deploy_configuration(cpu_cores=2, \n",
|
|
||||||
" memory_gb=5, \n",
|
|
||||||
" tags={\"data\": \"MNIST\", \"method\" : \"pytorch\"}, \n",
|
|
||||||
" description=\"Predict using webservice\")\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"# Deploy the image to Azure Container Instances (ACI) for real-time serving\n",
|
|
||||||
"webservice = Webservice.deploy_from_image(\n",
|
|
||||||
" image=azure_image, workspace=ws, name=\"pytorch-mnist-1\", deployment_config=aci_config)\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"webservice.wait_for_deployment()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Once the deployment has completed you can check the scoring URI of the web service."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"print(\"Scoring URI is: {}\".format(webservice.scoring_uri))"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"In case of a service creation issue, you can use ```webservice.get_logs()``` to get logs to debug."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Make predictions using web service\n",
|
|
||||||
"\n",
"To test the web service, create a test data set of normalized PyTorch tensors.\n",
|
|
||||||
"\n",
"Then, let's define a utility function that takes a random image and converts it into a format and shape suitable as input to the PyTorch inferencing endpoint. The conversion is done by:\n",
|
|
||||||
"\n",
|
|
||||||
" 1. Select a random (image, label) tuple\n",
" 2. Take the image and convert the tensor to a NumPy array\n",
|
|
||||||
" 3. Reshape array into 1 x 1 x N array\n",
|
|
||||||
" * 1 image in batch, 1 color channel, N = 784 pixels for MNIST images\n",
|
|
||||||
" * Note also ```x = x.view(-1, 1, 28, 28)``` in net definition in ```train.py``` program to shape incoming scoring requests.\n",
" 4. Convert the NumPy array to a list to make it into a built-in type.\n",
" 5. Create a dictionary {\"data\": <list>} that can be converted to a JSON string for web service requests."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from torchvision import datasets, transforms\n",
|
|
||||||
"import random\n",
|
|
||||||
"import numpy as np\n",
|
|
||||||
"\n",
|
|
||||||
"test_data = datasets.MNIST('../data', train=False, transform=transforms.Compose([\n",
|
|
||||||
" transforms.ToTensor(),\n",
|
|
||||||
" transforms.Normalize((0.1307,), (0.3081,))]))\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"def get_random_image():\n",
" image_idx = random.randint(0, len(test_data) - 1)\n",
|
|
||||||
" image_as_tensor = test_data[image_idx][0]\n",
" return {\"data\": image_as_tensor.numpy().reshape(1, 1, -1).tolist()[0]}"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Then, invoke the web service using a random test image. Convert the dictionary containing the image to JSON string before passing it to web service.\n",
|
|
||||||
"\n",
|
|
||||||
"The response contains the raw scores for each label, with greater value indicating higher probability. Sort the labels and select the one with greatest score to get the prediction. Let's also plot the image sent to web service for comparison purposes."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"%matplotlib inline\n",
|
|
||||||
"\n",
|
|
||||||
"import json\n",
|
|
||||||
"import matplotlib.pyplot as plt\n",
|
|
||||||
"\n",
|
|
||||||
"test_image = get_random_image()\n",
|
|
||||||
"\n",
|
|
||||||
"response = webservice.run(json.dumps(test_image))\n",
|
|
||||||
"\n",
|
|
||||||
"response = sorted(response[0].items(), key = lambda x: x[1], reverse = True)\n",
|
|
||||||
"\n",
|
|
||||||
"\n",
|
|
||||||
"print(\"Predicted label:\", response[0][0])\n",
|
|
||||||
"plt.imshow(np.array(test_image[\"data\"]).reshape(28,28), cmap = \"gray\")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
"You can also call the web service by making a raw HTTP POST request against its scoring URI."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import requests\n",
|
|
||||||
"\n",
|
|
||||||
"response = requests.post(url=webservice.scoring_uri, data=json.dumps(test_image),headers={\"Content-type\": \"application/json\"})\n",
|
|
||||||
"print(response.text)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": []
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": []
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "roastala"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"celltoolbar": "Edit Metadata",
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "python",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"codemirror_mode": {
|
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.7.3"
|
|
||||||
},
|
|
||||||
"name": "mlflow-sparksummit-pytorch",
|
|
||||||
"notebookId": 2495374963457641
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 1
|
|
||||||
}
|
|
||||||
@@ -1,8 +0,0 @@
name: train-and-deploy-pytorch
dependencies:
  - matplotlib
  - pip:
    - azureml-sdk
    - azureml-mlflow
    - https://download.pytorch.org/whl/cpu/torch-1.1.0-cp35-cp35m-win_amd64.whl
    - https://download.pytorch.org/whl/cpu/torchvision-0.3.0-cp35-cp35m-win_amd64.whl
@@ -1,248 +0,0 @@
|
|||||||
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
|
||||||
"\n",
|
|
||||||
"Licensed under the MIT License."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Use MLflow with Azure Machine Learning for Local Training Run\n",
|
|
||||||
"\n",
"This example shows you how to use MLflow tracking APIs together with Azure Machine Learning services for storing your metrics and artifacts from a local notebook run. You'll learn how to:\n",
|
|
||||||
"\n",
|
|
||||||
" 1. Set up MLflow tracking URI so as to use Azure ML\n",
|
|
||||||
" 2. Create experiment\n",
|
|
||||||
" 3. Train a model on your local computer while logging metrics and artifacts\n",
|
|
||||||
" 4. View your experiment within your Azure ML Workspace in Azure Portal.\n",
|
|
||||||
"\n",
|
|
||||||
"## Prerequisites and Set-up\n",
|
|
||||||
"\n",
"Make sure you have completed the [Configuration](../../../configuration.ipynb) notebook to set up your Azure Machine Learning workspace and ensure other common prerequisites are met.\n",
|
|
||||||
"\n",
"Install the azureml-mlflow package before running this notebook. Note that mlflow itself gets installed as a dependency if you haven't installed it yet.\n",
|
|
||||||
"\n",
|
|
||||||
"```\n",
|
|
||||||
"pip install azureml-mlflow\n",
|
|
||||||
"```\n",
|
|
||||||
"\n",
|
|
||||||
"This example also uses scikit-learn and matplotlib packages. Install them:\n",
|
|
||||||
"```\n",
|
|
||||||
"pip install scikit-learn matplotlib\n",
|
|
||||||
"```\n",
|
|
||||||
"\n",
|
|
||||||
"Then, import necessary packages"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import mlflow\n",
|
|
||||||
"import mlflow.sklearn\n",
|
|
||||||
"import azureml.core\n",
|
|
||||||
"from azureml.core import Workspace\n",
|
|
||||||
"import matplotlib.pyplot as plt\n",
|
|
||||||
"\n",
|
|
||||||
"# Check core SDK version number\n",
|
|
||||||
"print(\"SDK version:\", azureml.core.VERSION)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Set tracking URI\n",
|
|
||||||
"\n",
|
|
||||||
"Set the MLflow tracking URI to point to your Azure ML Workspace. The subsequent logging calls from MLflow APIs will go to Azure ML services and will be tracked under your Workspace."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"ws = Workspace.from_config()\n",
|
|
||||||
"\n",
|
|
||||||
"mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Create Experiment\n",
|
|
||||||
"\n",
|
|
||||||
"In both MLflow and Azure ML, training runs are grouped into experiments. Let's create one for our experimentation."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"experiment_name = \"experiment-with-mlflow\"\n",
|
|
||||||
"mlflow.set_experiment(experiment_name)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Create training and test data set\n",
|
|
||||||
"\n",
"This example uses the diabetes dataset to build a simple regression model. Let's load the dataset and split it into training and test sets."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"import numpy as np\n",
|
|
||||||
"from sklearn.datasets import load_diabetes\n",
|
|
||||||
"from sklearn.linear_model import Ridge\n",
|
|
||||||
"from sklearn.metrics import mean_squared_error\n",
|
|
||||||
"from sklearn.model_selection import train_test_split\n",
|
|
||||||
"\n",
|
|
||||||
"X, y = load_diabetes(return_X_y = True)\n",
|
|
||||||
"columns = ['age', 'gender', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']\n",
|
|
||||||
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)\n",
|
|
||||||
"data = {\n",
|
|
||||||
" \"train\":{\"X\": X_train, \"y\": y_train}, \n",
|
|
||||||
" \"test\":{\"X\": X_test, \"y\": y_test}\n",
|
|
||||||
"}\n",
|
|
||||||
"\n",
|
|
||||||
"print (\"Data contains\", len(data['train']['X']), \"training samples and\",len(data['test']['X']), \"test samples\")"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Train while logging metrics and artifacts\n",
|
|
||||||
"\n",
"Next, start an MLflow run to train a scikit-learn regression model. Note that the training code has been instrumented using MLflow to:\n",
|
|
||||||
" * Log model hyperparameter alpha value\n",
|
|
||||||
" * Log mean squared error against test set\n",
|
|
||||||
" * Save the scikit-learn based regression model produced by training\n",
|
|
||||||
" * Save an image that shows actuals vs predictions against test set.\n",
|
|
||||||
" \n",
|
|
||||||
"These metrics and artifacts have been recorded to your Azure ML Workspace; in the next step you'll learn how to view them."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Create a run object in the experiment\n",
|
|
||||||
"model_save_path = \"model\"\n",
|
|
||||||
"\n",
|
|
||||||
"with mlflow.start_run() as run:\n",
|
|
||||||
" # Log the algorithm parameter alpha to the run\n",
|
|
||||||
" mlflow.log_metric('alpha', 0.03)\n",
|
|
||||||
" # Create, fit, and test the scikit-learn Ridge regression model\n",
|
|
||||||
" regression_model = Ridge(alpha=0.03)\n",
|
|
||||||
" regression_model.fit(data['train']['X'], data['train']['y'])\n",
|
|
||||||
" preds = regression_model.predict(data['test']['X'])\n",
|
|
||||||
"\n",
|
|
||||||
" # Log mean squared error\n",
|
|
||||||
" print('Mean Squared Error is', mean_squared_error(data['test']['y'], preds))\n",
|
|
||||||
" mlflow.log_metric('mse', mean_squared_error(data['test']['y'], preds))\n",
|
|
||||||
" \n",
|
|
||||||
" # Save the model to the outputs directory for capture\n",
|
|
||||||
" mlflow.sklearn.log_model(regression_model,model_save_path)\n",
|
|
||||||
" \n",
|
|
||||||
" # Plot actuals vs predictions and save the plot within the run\n",
|
|
||||||
" fig = plt.figure(1)\n",
|
|
||||||
" idx = np.argsort(data['test']['y'])\n",
|
|
||||||
" plt.plot(data['test']['y'][idx],preds[idx])\n",
|
|
||||||
" fig.savefig(\"actuals_vs_predictions.png\")\n",
|
|
||||||
" mlflow.log_artifact(\"actuals_vs_predictions.png\") "
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"You can open the report page for your experiment and runs within it from Azure Portal.\n",
|
|
||||||
"\n",
|
|
||||||
"Select one of the runs to view the metrics, and the plot you saved. The saved scikit-learn model appears under **outputs** tab."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"ws.experiments[experiment_name]"
|
|
||||||
]
|
|
||||||
},
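{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also pull the logged metrics back into the notebook programmatically. A small hedged sketch (it simply takes the most recent run of the experiment):\n",
"\n",
"```python\n",
"runs = list(ws.experiments[experiment_name].get_runs())\n",
"latest_run = runs[0]\n",
"print(latest_run.get_metrics())  # e.g. the 'alpha' and 'mse' values logged above\n",
"```"
]
},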
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Next steps\n",
|
|
||||||
"\n",
|
|
||||||
"Try out these notebooks to learn more about MLflow-Azure Machine Learning integration:\n",
|
|
||||||
"\n",
" * [Train a model using remote compute on Azure Cloud](../train-remote/train-remote.ipynb)\n",
|
|
||||||
" * [Deploy the model as a web service](../deploy-model/deploy-model.ipynb)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": []
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "rastala"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "python",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"codemirror_mode": {
|
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.6.4"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 2
|
|
||||||
}
|
|
||||||
@@ -1,7 +0,0 @@
name: train-local
dependencies:
  - scikit-learn
  - matplotlib
  - pip:
    - azureml-sdk
    - azureml-mlflow
@@ -1,318 +0,0 @@
|
|||||||
{
|
|
||||||
"cells": [
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
|
||||||
"\n",
|
|
||||||
"Licensed under the MIT License."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
""
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Use MLflow with Azure Machine Learning for Remote Training Run\n",
|
|
||||||
"\n",
"This example shows you how to use MLflow tracking APIs together with Azure Machine Learning services for storing your metrics and artifacts from a remote training run. You'll learn how to:\n",
|
|
||||||
"\n",
|
|
||||||
" 1. Set up MLflow tracking URI so as to use Azure ML\n",
|
|
||||||
" 2. Create experiment\n",
|
|
||||||
" 3. Train a model on Machine Learning Compute while logging metrics and artifacts\n",
|
|
||||||
" 4. View your experiment within your Azure ML Workspace in Azure Portal."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Prerequisites\n",
|
|
||||||
"\n",
"Make sure you have completed the [Configuration](../../../configuration.ipynb) notebook to set up your Azure Machine Learning workspace and ensure other common prerequisites are met."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"## Set-up\n",
|
|
||||||
"\n",
|
|
||||||
"Check Azure ML SDK version installed on your computer, and then connect to your Workspace."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"# Check core SDK version number\n",
|
|
||||||
"import azureml.core\n",
|
|
||||||
"from azureml.core import Workspace, Experiment\n",
|
|
||||||
"\n",
|
|
||||||
"print(\"SDK version:\", azureml.core.VERSION)\n",
|
|
||||||
"\n",
|
|
||||||
"ws = Workspace.from_config()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Let's also create a Machine Learning Compute cluster for submitting the remote run. "
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
|
||||||
"from azureml.core.compute_target import ComputeTargetException\n",
|
|
||||||
"\n",
|
|
||||||
"# Choose a name for your CPU cluster\n",
|
|
||||||
"cpu_cluster_name = \"cpu-cluster\"\n",
|
|
||||||
"\n",
|
|
||||||
"# Verify that cluster does not exist already\n",
|
|
||||||
"try:\n",
|
|
||||||
" cpu_cluster = ComputeTarget(workspace=ws, name=cpu_cluster_name)\n",
|
|
||||||
" print(\"Found existing cpu-cluster\")\n",
|
|
||||||
"except ComputeTargetException:\n",
|
|
||||||
" print(\"Creating new cpu-cluster\")\n",
|
|
||||||
" \n",
|
|
||||||
" # Specify the configuration for the new cluster\n",
|
|
||||||
" compute_config = AmlCompute.provisioning_configuration(vm_size=\"STANDARD_D2_V2\",\n",
|
|
||||||
" min_nodes=0,\n",
|
|
||||||
" max_nodes=1)\n",
|
|
||||||
"\n",
|
|
||||||
" # Create the cluster with the specified name and configuration\n",
|
|
||||||
" cpu_cluster = ComputeTarget.create(ws, cpu_cluster_name, compute_config)\n",
|
|
||||||
" \n",
" # Wait for the cluster creation to complete and show the output log\n",
|
|
||||||
" cpu_cluster.wait_for_completion(show_output=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Create Azure ML Experiment"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"The following steps show how to submit a training Python script to a cluster as an Azure ML run, while logging happens through MLflow APIs to your Azure ML Workspace. Let's first create an experiment to hold the training runs."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core import Experiment\n",
|
|
||||||
"\n",
|
|
||||||
"experiment_name = \"experiment-with-mlflow\"\n",
|
|
||||||
"exp = Experiment(workspace=ws, name=experiment_name)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Instrument remote training script using MLflow"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Let's use [*train_diabetes.py*](train_diabetes.py) to train a regression model against diabetes dataset as the example. Note that the training script uses mlflow.start_run() to start logging, and then logs metrics, saves the trained scikit-learn model, and saves a plot as an artifact.\n",
|
|
||||||
"\n",
"Run the following command to view the script file. Notice the mlflow logging statements, and also notice that the script doesn't have an explicit dependency on the azureml library."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"training_script = 'train_diabetes.py'\n",
|
|
||||||
"with open(training_script, 'r') as f:\n",
|
|
||||||
" print(f.read())"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Submit Run to Cluster \n",
|
|
||||||
"\n",
|
|
||||||
"Let's submit the run to cluster. When running on the remote cluster as submitted run, Azure ML sets the MLflow tracking URI to point to your Azure ML Workspace, so that the metrics and artifacts are automatically logged there.\n",
|
|
||||||
"\n",
|
|
||||||
"Note that you have to specify the packages your script depends on, including *azureml-mlflow* that implicitly enables the MLflow logging to Azure ML. \n",
|
|
||||||
"\n",
"First, create an environment with Docker enabled and the required package dependencies specified."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {
|
|
||||||
"tags": [
|
|
||||||
"mlflow"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core import Environment\n",
|
|
||||||
"from azureml.core.conda_dependencies import CondaDependencies\n",
|
|
||||||
"\n",
|
|
||||||
"env = Environment(name=\"mlflow-env\")\n",
|
|
||||||
"\n",
|
|
||||||
"env.docker.enabled = True\n",
|
|
||||||
"\n",
" # Specify the conda packages (scikit-learn, matplotlib) and pip packages (azureml-mlflow, numpy) the script needs\n",
|
|
||||||
"cd = CondaDependencies.create(\n",
|
|
||||||
" conda_packages=[\"scikit-learn\", \"matplotlib\"],\n",
|
|
||||||
" pip_packages=[\"azureml-mlflow\", \"numpy\"]\n",
|
|
||||||
" )\n",
|
|
||||||
"\n",
|
|
||||||
"env.python.conda_dependencies = cd"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"Next, specify a script run configuration that includes the training script, environment and CPU cluster created earlier."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"from azureml.core import ScriptRunConfig\n",
|
|
||||||
"\n",
|
|
||||||
"src = ScriptRunConfig(source_directory=\".\", script=training_script)\n",
|
|
||||||
"src.run_config.environment = env\n",
|
|
||||||
"src.run_config.target = cpu_cluster.name"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
"Finally, submit the run. Note that the first run typically takes longer, several minutes, as the Docker-based environment is built. Subsequent runs reuse the cached image and are faster."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"run = exp.submit(src)\n",
|
|
||||||
"run.wait_for_completion(show_output=True)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"You can navigate to your Azure ML Workspace at Azure Portal to view the run metrics and artifacts. "
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"run"
|
|
||||||
]
|
|
||||||
},
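{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small hedged sketch, you can also print a direct link to the run's page (assuming ```run``` is the submitted Azure ML run from the cell above):\n",
"\n",
"```python\n",
"print(run.get_portal_url())\n",
"```"
]
},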
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"You can also get the metrics and bring them to your local notebook, and view the details of the run."
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"run.get_metrics()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": [
|
|
||||||
"ws.get_details()"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "markdown",
|
|
||||||
"metadata": {},
|
|
||||||
"source": [
|
|
||||||
"### Next steps\n",
|
|
||||||
"\n",
|
|
||||||
" * [Deploy the model as a web service](../deploy-model/deploy-model.ipynb)\n",
|
|
||||||
" * [Learn more about Azure Machine Learning compute options](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-set-up-training-targets)"
|
|
||||||
]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"cell_type": "code",
|
|
||||||
"execution_count": null,
|
|
||||||
"metadata": {},
|
|
||||||
"outputs": [],
|
|
||||||
"source": []
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"metadata": {
|
|
||||||
"authors": [
|
|
||||||
{
|
|
||||||
"name": "rastala"
|
|
||||||
}
|
|
||||||
],
|
|
||||||
"kernelspec": {
|
|
||||||
"display_name": "Python 3.6",
|
|
||||||
"language": "python",
|
|
||||||
"name": "python36"
|
|
||||||
},
|
|
||||||
"language_info": {
|
|
||||||
"codemirror_mode": {
|
|
||||||
"name": "ipython",
|
|
||||||
"version": 3
|
|
||||||
},
|
|
||||||
"file_extension": ".py",
|
|
||||||
"mimetype": "text/x-python",
|
|
||||||
"name": "python",
|
|
||||||
"nbconvert_exporter": "python",
|
|
||||||
"pygments_lexer": "ipython3",
|
|
||||||
"version": "3.6.4"
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"nbformat": 4,
|
|
||||||
"nbformat_minor": 2
|
|
||||||
}
|
|
||||||
@@ -1,4 +0,0 @@
name: train-remote
dependencies:
  - pip:
    - azureml-sdk
@@ -1,46 +0,0 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
import mlflow
import mlflow.sklearn

import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

with mlflow.start_run():
    X, y = load_diabetes(return_X_y=True)
    columns = ['age', 'gender', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
    data = {
        "train": {"X": X_train, "y": y_train},
        "test": {"X": X_test, "y": y_test}}

    mlflow.log_metric("Training samples", len(data['train']['X']))
    mlflow.log_metric("Test samples", len(data['test']['X']))

    # Log the algorithm parameter alpha to the run
    mlflow.log_metric('alpha', 0.03)
    # Create, fit, and test the scikit-learn Ridge regression model
    regression_model = Ridge(alpha=0.03)
    regression_model.fit(data['train']['X'], data['train']['y'])
    preds = regression_model.predict(data['test']['X'])

    # Log mean squared error
    print('Mean Squared Error is', mean_squared_error(data['test']['y'], preds))
    mlflow.log_metric('mse', mean_squared_error(data['test']['y'], preds))

    # Save the model to the outputs directory for capture
    mlflow.sklearn.log_model(regression_model, "model")

    # Plot actuals vs predictions and save the plot within the run
    fig = plt.figure(1)
    idx = np.argsort(data['test']['y'])
    plt.plot(data['test']['y'][idx], preds[idx])
    fig.savefig("actuals_vs_predictions.png")
    mlflow.log_artifact("actuals_vs_predictions.png")