more updates

This commit is contained in:
Roope Astala
2018-10-12 14:43:18 -04:00
parent a4792d95ac
commit b3cc1b61a2
15 changed files with 151 additions and 3032 deletions

View File

@@ -1,473 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 02. Train locally\n",
"* Create or load workspace.\n",
"* Create scripts locally.\n",
"* Create `train.py` in a folder, along with a `my.lib` file.\n",
"* Configure & execute a local run in a user-managed Python environment.\n",
"* Configure & execute a local run in a system-managed Python environment.\n",
"* Configure & execute a local run in a Docker environment.\n",
"* Query run metrics to find the best model\n",
"* Register model for operationalization."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"Make sure you go through the [00. Installation and Configuration](00.configuration.ipynb) Notebook first if you haven't."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check core SDK version number\n",
"import azureml.core\n",
"\n",
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize Workspace\n",
"\n",
"Initialize a workspace object from persisted configuration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.workspace import Workspace\n",
"\n",
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create An Experiment\n",
"**Experiment** is a logical container in an Azure ML Workspace. It hosts run records which can include run metrics and output artifacts from your experiments."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Experiment\n",
"experiment_name = 'train-on-local'\n",
"exp = Experiment(workspace=ws, name=experiment_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## View `train.py`\n",
"\n",
"`train.py` is already created for you."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open('./train.py', 'r') as f:\n",
" print(f.read())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note `train.py` also references a `mylib.py` file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open('./mylib.py', 'r') as f:\n",
" print(f.read())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configure & Run\n",
"### User-managed environment\n",
"Below, we use a user-managed run, which means you are responsible to ensure all the necessary packages are available in the Python environment you choose to run the script."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.runconfig import RunConfiguration\n",
"\n",
"# Editing a run configuration property on-fly.\n",
"run_config_user_managed = RunConfiguration()\n",
"\n",
"run_config_user_managed.environment.python.user_managed_dependencies = True\n",
"\n",
"# You can choose a specific Python environment by pointing to a Python path \n",
"#run_config.environment.python.interpreter_path = '/home/johndoe/miniconda3/envs/sdk2/bin/python'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Submit script to run in the user-managed environment\n",
"Note whole script folder is submitted for execution, including the `mylib.py` file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import ScriptRunConfig\n",
"\n",
"src = ScriptRunConfig(source_directory='./', script='train.py', run_config=run_config_user_managed)\n",
"run = exp.submit(src)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Get run history details"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Block to wait till run finishes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run.wait_for_completion(show_output=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### System-managed environment\n",
"You can also ask the system to build a new conda environment and execute your scripts in it. The environment is built once and will be reused in subsequent executions as long as the conda dependencies remain unchanged. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.runconfig import RunConfiguration\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"\n",
"run_config_system_managed = RunConfiguration()\n",
"\n",
"run_config_system_managed.environment.python.user_managed_dependencies = False\n",
"run_config_system_managed.auto_prepare_environment = True\n",
"\n",
"# Specify conda dependencies with scikit-learn\n",
"cd = CondaDependencies.create(conda_packages=['scikit-learn'])\n",
"run_config_system_managed.environment.python.conda_dependencies = cd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Submit script to run in the system-managed environment\n",
"A new conda environment is built based on the conda dependencies object. If you are running this for the first time, this might take up to 5 mninutes. But this conda environment is reused so long as you don't change the conda dependencies."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"src = ScriptRunConfig(source_directory=\"./\", script='train.py', run_config=run_config_system_managed)\n",
"run = exp.submit(src)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Get run history details"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Block and wait till run finishes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run.wait_for_completion(show_output = True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Docker-based execution\n",
"**IMPORTANT**: You must have Docker engine installed locally in order to use this execution mode. If your kernel is already running in a Docker container, such as **Azure Notebooks**, this mode will **NOT** work.\n",
"\n",
"You can also ask the system to pull down a Docker image and execute your scripts in it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run_config_docker = RunConfiguration()\n",
"run_config_docker.environment.python.user_managed_dependencies = False\n",
"run_config_docker.auto_prepare_environment = True\n",
"run_config_docker.environment.docker.enabled = True\n",
"run_config_docker.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE\n",
"\n",
"# Specify conda dependencies with scikit-learn\n",
"cd = CondaDependencies.create(conda_packages=['scikit-learn'])\n",
"run_config_docker.environment.python.conda_dependencies = cd\n",
"\n",
"src = ScriptRunConfig(source_directory=\"./\", script='train.py', run_config=run_config_docker)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Submit script to run in the system-managed environment\n",
"A new conda environment is built based on the conda dependencies object. If you are running this for the first time, this might take up to 5 mninutes. But this conda environment is reused so long as you don't change the conda dependencies.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"# Check if Docker is installed\n",
"if os.system(\"docker -v\") == 0:\n",
" run = exp.submit(src)\n",
"else:\n",
" print(\"Docker engine not installed.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Get run history details\n",
"run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run.wait_for_completion(show_output=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Query run metrics"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"query history",
"get metrics"
]
},
"outputs": [],
"source": [
"# get all metris logged in the run\n",
"run.get_metrics()\n",
"metrics = run.get_metrics()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's find the model that has the lowest MSE value logged."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"best_alpha = metrics['alpha'][np.argmin(metrics['mse'])]\n",
"\n",
"print('When alpha is {1:0.2f}, we have min MSE {0:0.2f}.'.format(\n",
" min(metrics['mse']), \n",
" best_alpha\n",
"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also list all the files that are associated with this run record"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run.get_file_names()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We know the model `ridge_0.40.pkl` is the best performing model from the eariler queries. So let's register it with the workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# supply a model name, and the full path to the serialized model file.\n",
"model = run.register_model(model_name='best_ridge_model', model_path='./outputs/ridge_0.40.pkl')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(model.name, model.version, model.url)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now you can deploy this model following the example in the 01 notebook."
]
}
],
"metadata": {
"authors": [
{
"name": "roastala"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -1,325 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 03. Train on Azure Container Instance (EXPERIMENTAL)\n",
"\n",
"* Create Workspace\n",
"* Create Project\n",
"* Create `train.py` in the project folder.\n",
"* Configure an ACI (Azure Container Instance) run\n",
"* Execute in ACI"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"Make sure you go through the [00. Installation and Configuration](00.configuration.ipynb) Notebook first if you haven't."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check core SDK version number\n",
"import azureml.core\n",
"\n",
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize Workspace\n",
"\n",
"Initialize a workspace object from persisted configuration"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"create workspace"
]
},
"outputs": [],
"source": [
"from azureml.core import Workspace\n",
"\n",
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create An Experiment\n",
"\n",
"**Experiment** is a logical container in an Azure ML Workspace. It hosts run records which can include run metrics and output artifacts from your experiments."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Experiment\n",
"experiment_name = 'train-on-aci'\n",
"experiment = Experiment(workspace = ws, name = experiment_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a folder to store the training script."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"script_folder = './samples/train-on-aci'\n",
"os.makedirs(script_folder, exist_ok = True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Remote execution on ACI\n",
"\n",
"Use `%%writefile` magic to write training code to `train.py` file under the project folder."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile $script_folder/train.py\n",
"\n",
"import os\n",
"from sklearn.datasets import load_diabetes\n",
"from sklearn.linear_model import Ridge\n",
"from sklearn.metrics import mean_squared_error\n",
"from sklearn.model_selection import train_test_split\n",
"from azureml.core.run import Run\n",
"from sklearn.externals import joblib\n",
"\n",
"import numpy as np\n",
"\n",
"os.makedirs('./outputs', exist_ok=True)\n",
"\n",
"X, y = load_diabetes(return_X_y = True)\n",
"\n",
"run = Run.get_submitted_run()\n",
"\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)\n",
"data = {\"train\": {\"X\": X_train, \"y\": y_train},\n",
" \"test\": {\"X\": X_test, \"y\": y_test}}\n",
"\n",
"# list of numbers from 0.0 to 1.0 with a 0.05 interval\n",
"alphas = np.arange(0.0, 1.0, 0.05)\n",
"\n",
"for alpha in alphas:\n",
" # Use Ridge algorithm to create a regression model\n",
" reg = Ridge(alpha = alpha)\n",
" reg.fit(data[\"train\"][\"X\"], data[\"train\"][\"y\"])\n",
"\n",
" preds = reg.predict(data[\"test\"][\"X\"])\n",
" mse = mean_squared_error(preds, data[\"test\"][\"y\"])\n",
" run.log('alpha', alpha)\n",
" run.log('mse', mse)\n",
" \n",
" model_file_name = 'ridge_{0:.2f}.pkl'.format(alpha)\n",
" with open(model_file_name, \"wb\") as file:\n",
" joblib.dump(value = reg, filename = 'outputs/' + model_file_name)\n",
"\n",
" print('alpha is {0:.2f}, and mse is {1:0.2f}'.format(alpha, mse))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configure for using ACI\n",
"Linux-based ACI is available in `westus`, `eastus`, `westeurope`, `northeurope`, `westus2` and `southeastasia` regions. See details [here](https://docs.microsoft.com/en-us/azure/container-instances/container-instances-quotas#region-availability)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"configure run"
]
},
"outputs": [],
"source": [
"from azureml.core.runconfig import RunConfiguration\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"\n",
"# create a new runconfig object\n",
"run_config = RunConfiguration()\n",
"\n",
"# signal that you want to use ACI to execute script.\n",
"run_config.target = \"containerinstance\"\n",
"\n",
"# ACI container group is only supported in certain regions, which can be different than the region the Workspace is in.\n",
"run_config.container_instance.region = 'eastus'\n",
"\n",
"# set the ACI CPU and Memory \n",
"run_config.container_instance.cpu_cores = 1\n",
"run_config.container_instance.memory_gb = 2\n",
"\n",
"# enable Docker \n",
"run_config.environment.docker.enabled = True\n",
"\n",
"# set Docker base image to the default CPU-based image\n",
"run_config.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE\n",
"#run_config.environment.docker.base_image = 'microsoft/mmlspark:plus-0.9.9'\n",
"\n",
"# use conda_dependencies.yml to create a conda environment in the Docker image for execution\n",
"run_config.environment.python.user_managed_dependencies = False\n",
"\n",
"# auto-prepare the Docker image when used for execution (if it is not already prepared)\n",
"run_config.auto_prepare_environment = True\n",
"\n",
"# specify CondaDependencies obj\n",
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Submit the Experiment\n",
"Finally, run the training job on the ACI"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"remote run",
"aci"
]
},
"outputs": [],
"source": [
"%%time \n",
"from azureml.core.script_run_config import ScriptRunConfig\n",
"\n",
"script_run_config = ScriptRunConfig(source_directory = script_folder,\n",
" script= 'train.py',\n",
" run_config = run_config)\n",
"\n",
"run = experiment.submit(script_run_config)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"remote run",
"aci"
]
},
"outputs": [],
"source": [
"%%time\n",
"# Shows output of the run on stdout.\n",
"run.wait_for_completion(show_output = True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"query history"
]
},
"outputs": [],
"source": [
"# Show run details\n",
"run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"get metrics"
]
},
"outputs": [],
"source": [
"# get all metris logged in the run\n",
"run.get_metrics()\n",
"metrics = run.get_metrics()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"print('When alpha is {1:0.2f}, we have min MSE {0:0.2f}.'.format(\n",
" min(metrics['mse']), \n",
" metrics['alpha'][np.argmin(metrics['mse'])]\n",
"))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -1,321 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 04. Train in a remote VM (MLC managed DSVM)\n",
"* Create Workspace\n",
"* Create Project\n",
"* Create `train.py` file\n",
"* Create DSVM as Machine Learning Compute (MLC) resource\n",
"* Configure & execute a run in a conda environment in the default miniconda Docker container on DSVM"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"Make sure you go through the [00. Installation and Configuration](00.configuration.ipynb) Notebook first if you haven't."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check core SDK version number\n",
"import azureml.core\n",
"\n",
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize Workspace\n",
"\n",
"Initialize a workspace object from persisted configuration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Workspace\n",
"\n",
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Experiment\n",
"\n",
"**Experiment** is a logical container in an Azure ML Workspace. It hosts run records which can include run metrics and output artifacts from your experiments."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"experiment_name = 'train-on-remote-vm'\n",
"\n",
"from azureml.core import Experiment\n",
"\n",
"exp = Experiment(workspace = ws, name = experiment_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## View `train.py`\n",
"\n",
"For convenience, we created a training script for you. It is printed below as a text, but you can also run `%pfile ./train.py` in a cell to show the file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open('./train.py', 'r') as training_script:\n",
" print(training_script.read())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Linux DSVM as a compute target\n",
"\n",
"**Note**: If creation fails with a message about Marketplace purchase eligibilty, go to portal.azure.com, start creating DSVM there, and select \"Want to create programmatically\" to enable programmatic creation. Once you've enabled it, you can exit without actually creating VM.\n",
" \n",
"**Note**: By default SSH runs on port 22 and you don't need to specify it. But if for security reasons you switch to a different port (such as 5022), you can append the port number to the address like the example below. [Read more](../../documentation/sdk/ssh-issue.md) on this."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.compute import DsvmCompute\n",
"from azureml.core.compute_target import ComputeTargetException\n",
"\n",
"compute_target_name = 'mydsvm'\n",
"\n",
"try:\n",
" dsvm_compute = DsvmCompute(workspace = ws, name = compute_target_name)\n",
" print('found existing:', dsvm_compute.name)\n",
"except ComputeTargetException:\n",
" print('creating new.')\n",
" dsvm_config = DsvmCompute.provisioning_configuration(vm_size = \"Standard_D2_v2\")\n",
" dsvm_compute = DsvmCompute.create(ws, name = compute_target_name, provisioning_configuration = dsvm_config)\n",
" dsvm_compute.wait_for_completion(show_output = True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Attach an existing Linux DSVM as a compute target\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"'''\n",
" from azureml.core.compute import RemoteCompute \n",
" # if you want to connect using SSH key instead of username/password you can provide parameters private_key_file and private_key_passphrase \n",
" dsvm_compute = RemoteCompute.attach(ws,name=\"attach-from-sdk6\",username=<username>,address=<ipaddress>,ssh_port=22,password=<password>)\n",
"'''"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configure & Run"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure a Docker run with new conda environment on the VM\n",
"You can execute in a Docker container in the VM. If you choose this route, you don't need to install anything on the VM yourself. Azure ML execution service will take care of it for you."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.runconfig import RunConfiguration\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"\n",
"\n",
"# Load the \"cpu-dsvm.runconfig\" file (created by the above attach operation) in memory\n",
"run_config = RunConfiguration(framework = \"python\")\n",
"\n",
"# Set compute target to the Linux DSVM\n",
"run_config.target = compute_target_name\n",
"\n",
"# Use Docker in the remote VM\n",
"run_config.environment.docker.enabled = True\n",
"\n",
"# Use CPU base image from DockerHub\n",
"run_config.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE\n",
"print('Base Docker image is:', run_config.environment.docker.base_image)\n",
"\n",
"# Ask system to provision a new one based on the conda_dependencies.yml file\n",
"run_config.environment.python.user_managed_dependencies = False\n",
"\n",
"# Prepare the Docker and conda environment automatically when executingfor the first time.\n",
"run_config.prepare_environment = True\n",
"\n",
"# specify CondaDependencies obj\n",
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Submit the Experiment\n",
"Submit script to run in the Docker image in the remote VM. If you run this for the first time, the system will download the base image, layer in packages specified in the `conda_dependencies.yml` file on top of the base image, create a container and then execute the script in the container."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Run\n",
"from azureml.core import ScriptRunConfig\n",
"\n",
"src = ScriptRunConfig(source_directory = '.', script = 'train.py', run_config = run_config)\n",
"run = exp.submit(src)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### View run history details"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run.wait_for_completion(show_output = True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Find the best run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get all metris logged in the run\n",
"run.get_metrics()\n",
"metrics = run.get_metrics()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"print('When alpha is {1:0.2f}, we have min MSE {0:0.2f}.'.format(\n",
" min(metrics['mse']), \n",
" metrics['alpha'][np.argmin(metrics['mse'])]\n",
"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Clean up compute resource"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dsvm_compute.delete()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -1,257 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 05. Train in Spark\n",
"* Create Workspace\n",
"* Create Experiment\n",
"* Copy relevant files to the script folder\n",
"* Configure and Run"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"Make sure you go through the [00. Installation and Configuration](00.configuration.ipynb) Notebook first if you haven't."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check core SDK version number\n",
"import azureml.core\n",
"\n",
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize Workspace\n",
"\n",
"Initialize a workspace object from persisted configuration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Workspace\n",
"\n",
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Experiment\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"experiment_name = 'train-on-remote-vm'\n",
"\n",
"from azureml.core import Experiment\n",
"\n",
"exp = Experiment(workspace = ws, name = experiment_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## View `train-spark.py`\n",
"\n",
"For convenience, we created a training script for you. It is printed below as a text, but you can also run `%pfile ./train-spark.py` in a cell to show the file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open('train-spark.py', 'r') as training_script:\n",
" print(training_script.read())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configure & Run"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Attach an HDI cluster\n",
"To use HDI commpute target:\n",
" 1. Create an Spark for HDI cluster in Azure. Here is some [quick instructions](https://docs.microsoft.com/en-us/azure/machine-learning/desktop-workbench/how-to-create-dsvm-hdi). Make sure you use the Ubuntu flavor, NOT CentOS.\n",
" 2. Enter the IP address, username and password below"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.compute import HDInsightCompute\n",
"\n",
"try:\n",
" # if you want to connect using SSH key instead of username/password you can provide parameters private_key_file and private_key_passphrase\n",
" hdi_compute_new = HDInsightCompute.attach(ws, \n",
" name=\"hdi-attach\", \n",
" address=\"hdi-ignite-demo-ssh.azurehdinsight.net\", \n",
" ssh_port=22, \n",
" username='<username>', \n",
" password='<password>')\n",
"\n",
"except UserErrorException as e:\n",
" print(\"Caught = {}\".format(e.message))\n",
" print(\"Compute config already attached.\")\n",
" \n",
" \n",
"hdi_compute_new.wait_for_completion(show_output=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure HDI run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.runconfig import RunConfiguration\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"\n",
"\n",
"# Load the \"cpu-dsvm.runconfig\" file (created by the above attach operation) in memory\n",
"run_config = RunConfiguration(framework = \"python\")\n",
"\n",
"# Set compute target to the Linux DSVM\n",
"run_config.target = hdi_compute.name\n",
"\n",
"# Use Docker in the remote VM\n",
"# run_config.environment.docker.enabled = True\n",
"\n",
"# Use CPU base image from DockerHub\n",
"# run_config.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE\n",
"# print('Base Docker image is:', run_config.environment.docker.base_image)\n",
"\n",
"# Ask system to provision a new one based on the conda_dependencies.yml file\n",
"run_config.environment.python.user_managed_dependencies = False\n",
"\n",
"# Prepare the Docker and conda environment automatically when executingfor the first time.\n",
"# run_config.prepare_environment = True\n",
"\n",
"# specify CondaDependencies obj\n",
"# run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'])\n",
"# load the runconfig object from the \"myhdi.runconfig\" file generated by the attach operaton above."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Submit the script to HDI"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"script_run_config = ScriptRunConfig(source_directory = '.',\n",
" script= 'train-spark.py',\n",
" run_config = run_config)\n",
"run = experiment.submit(script_run_config)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get the URL of the run history web page\n",
"run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run.wait_for_completion(show_output = True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get all metris logged in the run\n",
"metrics = run.get_metrics()\n",
"print(metrics)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

27
onnx/README.md Normal file
View File

@@ -0,0 +1,27 @@
# ONNX on Azure Machine Learning
These tutorials show how to create and deploy [ONNX](http://onnx.ai) models using Azure Machine Learning and the [ONNX Runtime](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-build-deploy-onnx).
Once deployed as web services, you can ping the models with your own images to be analyzed!
## Tutorials
- [Obtain ONNX model from ONNX Model Zoo and deploy - ResNet50](https://github.com/Azure/MachineLearningNotebooks/blob/master/onnx/onnx-modelzoo-aml-deploy-resnet50.ipynb)
- [Convert ONNX model from CoreML and deploy - TinyYOLO](https://github.com/Azure/MachineLearningNotebooks/blob/master/onnx/onnx-convert-aml-deploy-tinyyolo.ipynb)
- [Train ONNX model in PyTorch and deploy - MNIST](https://github.com/Azure/MachineLearningNotebooks/blob/master/onnx/onnx-train-pytorch-aml-deploy-mnist.ipynb)
- [Handwritten Digit Classification (MNIST) using ONNX Runtime on AzureML](https://github.com/Azure/MachineLearningNotebooks/blob/master/onnx/onnx-inference-mnist.ipynb)
- [Facial Expression Recognition using ONNX Runtime on AzureML](https://github.com/Azure/MachineLearningNotebooks/blob/master/onnx/onnx-inference-emotion-recognition.ipynb)
## Documentation
- [ONNX Runtime Python API Documentation](http://aka.ms/onnxruntime-python)
- [Azure Machine Learning API Documentation](http://aka.ms/aml-docs)
## Related Articles
- [Building and Deploying ONNX Runtime Models](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-build-deploy-onnx)
- [Azure AI Making AI Real for Business](https://aka.ms/aml-blog-overview)
- [Whats new in Azure Machine Learning](https://aka.ms/aml-blog-whats-new)
## License
Copyright (c) Microsoft Corporation. All rights reserved.
Licensed under the MIT License.

124
onnx/mnist.py Normal file
View File

@@ -0,0 +1,124 @@
# This is a modified version of https://github.com/pytorch/examples/blob/master/mnist/main.py which is
# licensed under BSD 3-Clause (https://github.com/pytorch/examples/blob/master/LICENSE)
from __future__ import print_function
import argparse
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
import os
class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
self.conv2_drop = nn.Dropout2d()
self.fc1 = nn.Linear(320, 50)
self.fc2 = nn.Linear(50, 10)
def forward(self, x):
x = F.relu(F.max_pool2d(self.conv1(x), 2))
x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
x = x.view(-1, 320)
x = F.relu(self.fc1(x))
x = F.dropout(x, training=self.training)
x = self.fc2(x)
return F.log_softmax(x, dim=1)
def train(args, model, device, train_loader, optimizer, epoch, output_dir):
model.train()
for batch_idx, (data, target) in enumerate(train_loader):
data, target = data.to(device), target.to(device)
optimizer.zero_grad()
output = model(data)
loss = F.nll_loss(output, target)
loss.backward()
optimizer.step()
if batch_idx % args.log_interval == 0:
print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
epoch, batch_idx * len(data), len(train_loader.dataset),
100. * batch_idx / len(train_loader), loss.item()))
def test(args, model, device, test_loader):
model.eval()
test_loss = 0
correct = 0
with torch.no_grad():
for data, target in test_loader:
data, target = data.to(device), target.to(device)
output = model(data)
test_loss += F.nll_loss(output, target, size_average=False, reduce=True).item() # sum up batch loss
pred = output.max(1, keepdim=True)[1] # get the index of the max log-probability
correct += pred.eq(target.view_as(pred)).sum().item()
test_loss /= len(test_loader.dataset)
print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
test_loss, correct, len(test_loader.dataset),
100. * correct / len(test_loader.dataset)))
def main():
# Training settings
parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
parser.add_argument('--batch-size', type=int, default=64, metavar='N',
help='input batch size for training (default: 64)')
parser.add_argument('--test-batch-size', type=int, default=1000, metavar='N',
help='input batch size for testing (default: 1000)')
parser.add_argument('--epochs', type=int, default=10, metavar='N',
help='number of epochs to train (default: 10)')
parser.add_argument('--lr', type=float, default=0.01, metavar='LR',
help='learning rate (default: 0.01)')
parser.add_argument('--momentum', type=float, default=0.5, metavar='M',
help='SGD momentum (default: 0.5)')
parser.add_argument('--no-cuda', action='store_true', default=False,
help='disables CUDA training')
parser.add_argument('--seed', type=int, default=1, metavar='S',
help='random seed (default: 1)')
parser.add_argument('--log-interval', type=int, default=10, metavar='N',
help='how many batches to wait before logging training status')
parser.add_argument('--output-dir', type=str, default='outputs')
args = parser.parse_args()
use_cuda = not args.no_cuda and torch.cuda.is_available()
torch.manual_seed(args.seed)
device = torch.device("cuda" if use_cuda else "cpu")
output_dir = args.output_dir
os.makedirs(output_dir, exist_ok=True)
kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
train_loader = torch.utils.data.DataLoader(
datasets.MNIST('data', train=True, download=True,
transform=transforms.Compose([transforms.ToTensor(),
transforms.Normalize((0.1307,), (0.3081,))])
),
batch_size=args.batch_size, shuffle=True, **kwargs)
test_loader = torch.utils.data.DataLoader(
datasets.MNIST('data', train=False,
transform=transforms.Compose([transforms.ToTensor(),
transforms.Normalize((0.1307,), (0.3081,))])
),
batch_size=args.test_batch_size, shuffle=True, **kwargs)
model = Net().to(device)
optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum)
for epoch in range(1, args.epochs + 1):
train(args, model, device, train_loader, optimizer, epoch, output_dir)
test(args, model, device, test_loader)
# save model
dummy_input = torch.randn(1, 1, 28, 28, device=device)
model_path = os.path.join(output_dir, 'mnist.onnx')
torch.onnx.export(model, dummy_input, model_path)
if __name__ == '__main__':
main()

View File

@@ -1 +0,0 @@
{"cells":[{"cell_type":"markdown","source":["Azure ML & Azure Databricks notebooks by Parashar Shah.\n\nCopyright (c) Microsoft Corporation. All rights reserved.\n\nLicensed under the MIT License."],"metadata":{}},{"cell_type":"markdown","source":["Please ensure you have run this notebook before proceeding."],"metadata":{}},{"cell_type":"markdown","source":["Now we support installing AML SDK as library from GUI. When attaching a library follow this https://docs.databricks.com/user-guide/libraries.html and add the below string as your PyPi package (during private preview). You can select the option to attach the library to all clusters or just one cluster.\n\nProvide this full string to install the SDK:\n\nazureml-sdk[databricks]"],"metadata":{}},{"cell_type":"code","source":["import azureml.core\n\n# Check core SDK version number - based on build number of preview/master.\nprint(\"SDK version:\", azureml.core.VERSION)"],"metadata":{},"outputs":[],"execution_count":4},{"cell_type":"code","source":["subscription_id = \"<your-subscription-id>\"\nresource_group = \"<your-existing-resource-group>\"\nworkspace_name = \"<a-new-or-existing-workspace; it is unrelated to Databricks workspace>\"\nworkspace_region = \"<your-resource group-region>\""],"metadata":{},"outputs":[],"execution_count":5},{"cell_type":"code","source":["# import the Workspace class and check the azureml SDK version\n# exist_ok checks if workspace exists or not.\n\nfrom azureml.core import Workspace\n\nws = Workspace.create(name = workspace_name,\n subscription_id = subscription_id,\n resource_group = resource_group, \n location = workspace_region,\n exist_ok=True)\n\nws.get_details()"],"metadata":{},"outputs":[],"execution_count":6},{"cell_type":"code","source":["ws = Workspace(workspace_name = workspace_name,\n subscription_id = subscription_id,\n resource_group = resource_group)\n\n# persist the subscription id, resource group name, and workspace name in aml_config/config.json.\nws.write_config()"],"metadata":{},"outputs":[],"execution_count":7},{"cell_type":"code","source":["%sh\ncat /databricks/driver/aml_config/config.json"],"metadata":{},"outputs":[],"execution_count":8},{"cell_type":"code","source":["# import the Workspace class and check the azureml SDK version\nfrom azureml.core import Workspace\n\nws = Workspace.from_config()\nprint('Workspace name: ' + ws.name, \n 'Azure region: ' + ws.location, \n 'Subscription id: ' + ws.subscription_id, \n 'Resource group: ' + ws.resource_group, sep = '\\n')"],"metadata":{},"outputs":[],"execution_count":9},{"cell_type":"code","source":["dbutils.notebook.exit(\"success\")"],"metadata":{},"outputs":[],"execution_count":10},{"cell_type":"code","source":[""],"metadata":{},"outputs":[],"execution_count":11}],"metadata":{"name":"01.Installation_and_Configuration","notebookId":3874566296719377},"nbformat":4,"nbformat_minor":0}

View File

@@ -1 +0,0 @@
{"cells":[{"cell_type":"markdown","source":["Azure ML & Azure Databricks notebooks by Parashar Shah.\n\nCopyright (c) Microsoft Corporation. All rights reserved.\n\nLicensed under the MIT License."],"metadata":{}},{"cell_type":"markdown","source":["Please ensure you have run all previous notebooks in sequence before running this."],"metadata":{}},{"cell_type":"markdown","source":["#Data Ingestion"],"metadata":{}},{"cell_type":"code","source":["import os\nimport urllib"],"metadata":{},"outputs":[],"execution_count":4},{"cell_type":"code","source":["# Download AdultCensusIncome.csv from Azure CDN. This file has 32,561 rows.\nbasedataurl = \"https://amldockerdatasets.azureedge.net\"\ndatafile = \"AdultCensusIncome.csv\"\ndatafile_dbfs = os.path.join(\"/dbfs\", datafile)\n\nif os.path.isfile(datafile_dbfs):\n print(\"found {} at {}\".format(datafile, datafile_dbfs))\nelse:\n print(\"downloading {} to {}\".format(datafile, datafile_dbfs))\n urllib.request.urlretrieve(os.path.join(basedataurl, datafile), datafile_dbfs)"],"metadata":{},"outputs":[],"execution_count":5},{"cell_type":"code","source":["# Create a Spark dataframe out of the csv file.\ndata_all = sqlContext.read.format('csv').options(header='true', inferSchema='true', ignoreLeadingWhiteSpace='true', ignoreTrailingWhiteSpace='true').load(datafile)\nprint(\"({}, {})\".format(data_all.count(), len(data_all.columns)))\ndata_all.printSchema()"],"metadata":{},"outputs":[],"execution_count":6},{"cell_type":"code","source":["#renaming columns\ncolumns_new = [col.replace(\"-\", \"_\") for col in data_all.columns]\ndata_all = data_all.toDF(*columns_new)\ndata_all.printSchema()"],"metadata":{},"outputs":[],"execution_count":7},{"cell_type":"code","source":["display(data_all.limit(5))"],"metadata":{},"outputs":[],"execution_count":8},{"cell_type":"markdown","source":["#Data Preparation"],"metadata":{}},{"cell_type":"code","source":["# Choose feature columns and the label column.\nlabel = \"income\"\nxvals_all = set(data_all.columns) - {label}\n\n#dbutils.widgets.remove(\"xvars_multiselect\")\ndbutils.widgets.removeAll()\n\ndbutils.widgets.multiselect('xvars_multiselect', 'hours_per_week', xvals_all)\nxvars_multiselect = dbutils.widgets.get(\"xvars_multiselect\")\nxvars = xvars_multiselect.split(\",\")\n\nprint(\"label = {}\".format(label))\nprint(\"features = {}\".format(xvars))\n\ndata = data_all.select([*xvars, label])\n\n# Split data into train and test.\ntrain, test = data.randomSplit([0.75, 0.25], seed=123)\n\nprint(\"train ({}, {})\".format(train.count(), len(train.columns)))\nprint(\"test ({}, {})\".format(test.count(), len(test.columns)))"],"metadata":{},"outputs":[],"execution_count":10},{"cell_type":"markdown","source":["#Data Persistence"],"metadata":{}},{"cell_type":"code","source":["# Write the train and test data sets to intermediate storage\ntrain_data_path = \"AdultCensusIncomeTrain\"\ntest_data_path = \"AdultCensusIncomeTest\"\n\ntrain_data_path_dbfs = os.path.join(\"/dbfs\", \"AdultCensusIncomeTrain\")\ntest_data_path_dbfs = os.path.join(\"/dbfs\", \"AdultCensusIncomeTest\")\n\ntrain.write.mode('overwrite').parquet(train_data_path)\ntest.write.mode('overwrite').parquet(test_data_path)\nprint(\"train and test datasets saved to {} and {}\".format(train_data_path_dbfs, test_data_path_dbfs))"],"metadata":{},"outputs":[],"execution_count":12},{"cell_type":"code","source":["dbutils.notebook.exit(\"success\")"],"metadata":{},"outputs":[],"execution_count":13},{"cell_type":"code","source":[""],"metadata":{},"outputs":[],"execution_count":14}],"metadata":{"name":"02.Ingest_data","notebookId":3874566296719393},"nbformat":4,"nbformat_minor":0}

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -1 +0,0 @@
{"cells":[{"cell_type":"markdown","source":["Azure ML & Azure Databricks notebooks by Parashar Shah.\n\nCopyright (c) Microsoft Corporation. All rights reserved.\n\nLicensed under the MIT License."],"metadata":{}},{"cell_type":"markdown","source":["Please ensure you have run all previous notebooks in sequence before running this. This notebook uses image from ACI notebook for deploying to AKS."],"metadata":{}},{"cell_type":"code","source":["from azureml.core import Workspace\nimport azureml.core\n\n# Check core SDK version number\nprint(\"SDK version:\", azureml.core.VERSION)\n\n#'''\nws = Workspace.from_config()\nprint('Workspace name: ' + ws.name, \n 'Azure region: ' + ws.location, \n 'Subscription id: ' + ws.subscription_id, \n 'Resource group: ' + ws.resource_group, sep = '\\n')\n#'''"],"metadata":{},"outputs":[],"execution_count":3},{"cell_type":"code","source":["# List images by ws\n\nfrom azureml.core.image import ContainerImage\nfor i in ContainerImage.list(workspace = ws):\n print('{}(v.{} [{}]) stored at {} with build log {}'.format(i.name, i.version, i.creation_state, i.image_location, i.image_build_log_uri))"],"metadata":{},"outputs":[],"execution_count":4},{"cell_type":"code","source":["from azureml.core.image import Image\nmyimage = Image(workspace=ws, id=\"aciws:25\")"],"metadata":{},"outputs":[],"execution_count":5},{"cell_type":"code","source":["#create AKS compute\n#it may take 20-25 minutes to create a new cluster\n\nfrom azureml.core.compute import AksCompute, ComputeTarget\n\n# Use the default configuration (can also provide parameters to customize)\nprov_config = AksCompute.provisioning_configuration()\n\naks_name = 'ps-aks-clus2' \n\n# Create the cluster\naks_target = ComputeTarget.create(workspace = ws, \n name = aks_name, \n provisioning_configuration = prov_config)\n\naks_target.wait_for_completion(show_output = True)\n\nprint(aks_target.provisioning_state)\nprint(aks_target.provisioning_errors)"],"metadata":{},"outputs":[],"execution_count":6},{"cell_type":"code","source":["from azureml.core.webservice import Webservice\nhelp( Webservice.deploy_from_image)"],"metadata":{},"outputs":[],"execution_count":7},{"cell_type":"code","source":["from azureml.core.webservice import Webservice, AksWebservice\nfrom azureml.core.image import ContainerImage\n\n#Set the web service configuration (using default here)\naks_config = AksWebservice.deploy_configuration()\n\n#unique service name\nservice_name ='ps-aks-service'\n\n# Webservice creation using single command, there is a variant to use image directly as well.\naks_service = Webservice.deploy_from_image(\n workspace=ws, \n name=service_name,\n deployment_config = aks_config,\n image = myimage,\n deployment_target = aks_target\n )\n\naks_service.wait_for_deployment(show_output=True)"],"metadata":{},"outputs":[],"execution_count":8},{"cell_type":"code","source":["#for using the Web HTTP API \nprint(aks_service.scoring_uri)\nprint(aks_service.get_keys())"],"metadata":{},"outputs":[],"execution_count":9},{"cell_type":"code","source":["import json\n\n#get the some sample data\ntest_data_path = \"AdultCensusIncomeTest\"\ntest = spark.read.parquet(test_data_path).limit(5)\n\ntest_json = json.dumps(test.toJSON().collect())\n\nprint(test_json)"],"metadata":{},"outputs":[],"execution_count":10},{"cell_type":"code","source":["#using data defined above predict if income is >50K (1) or <=50K (0)\naks_service.run(input_data=test_json)"],"metadata":{},"outputs":[],"execution_count":11},{"cell_type":"code","source":["#comment to not delete the web service\naks_service.delete()\n#image.delete()\n#model.delete()\n#aks_target.delete()"],"metadata":{},"outputs":[],"execution_count":12},{"cell_type":"code","source":[""],"metadata":{},"outputs":[],"execution_count":13}],"metadata":{"name":"04.DeploytoACI","notebookId":3874566296719318},"nbformat":4,"nbformat_minor":0}

View File

@@ -1,26 +0,0 @@
# Azure Databricks - Azure ML SDK Sample Notebooks
**NOTE**: With the latest version of our AML SDK, there are some API changes due to which previous version of notebooks will not work.
Kindly use this v4 notebooks (updated Sep 18) if you had installed the AML SDK in your Databricks cluster please update to latest SDK version by installing azureml-sdk[databricks] as a library from GUI.
**NOTE**: Please create your Azure Databricks cluster as v4.x (high concurrency preferred) with **Python 3** (dropdown). We are extending it to more runtimes asap.
**NOTE**: Some packages like psutil upgrade libs that can cause a conflict, please install such packages by freezing lib version. Eg. "pstuil **cryptography==1.5 pyopenssl==16.0.0 ipython=2.2.0**" to avoid install error. This issue is related to Databricks and not related to AML SDK.
**NOTE**: You should at least have contributor access to your Azure subcription to run some of the notebooks.
The iPython Notebooks have to be run sequentially after making changes based on your subscription. The corresponding DBC archive contains all the notebooks and can be imported into your Databricks workspace. You can the run notebooks after importing .dbc instead of downloading individually.
This set of notebooks are related to Income prediction experiment based on this [dataset](https://archive.ics.uci.edu/ml/datasets/adult) and demonstrate how to data prep, train and operationalize a Spark ML model with Azure ML Python SDK from within Azure Databricks. For details on SDK concepts, please refer to [Private preview notebooks](https://github.com/Azure/ViennaDocs/tree/master/PrivatePreview/notebooks)
(Recommended) [Azure Databricks AML SDK notebooks](Databricks_AMLSDK_github.dbc) A single DBC package to import all notebooks in your Databricks workspace.
01. [Installation and Configuration](01.Installation_and_Configuration.ipynb): Install the Azure ML Python SDK and Initialize an Azure ML Workspace and save the Workspace configuration file.
02. [Ingest data](02.Ingest_data.ipynb): Download the Adult Census Income dataset and split it into train and test sets.
03. [Build model](03a.Build_model.ipynb): Train a binary classification model in Azure Databricks with a Spark ML Pipeline.
04. [Build model with Run History](03b.Build_model_runHistory.ipynb): Train model and also capture run history (tracking) with Azure ML Python SDK.
05. [Deploy to ACI](04.Deploy_to_ACI.ipynb): Deploy model to Azure Container Instance (ACI) with Azure ML Python SDK.
06. [Deploy to AKS](04.Deploy_to_AKS_existingImage.ipynb): Deploy model to Azure Kubernetis Service (AKS) with Azure ML Python SDK from an existing Image with model, conda and score file.
Copyright (c) Microsoft Corporation. All rights reserved.
All notebooks in this folder are licensed under the MIT License.