mirror of
https://github.com/Azure/MachineLearningNotebooks.git
synced 2025-12-19 17:17:04 -05:00
736 lines
28 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
|
"\n",
|
|
"Licensed under the MIT License."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
""
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Train and explain models remotely via Azure Machine Learning Compute\n",
|
|
"\n",
|
|
"\n",
|
|
"_**This notebook showcases how to use the Azure Machine Learning Interpretability SDK to train and explain a regression model remotely on an Azure Machine Leanrning Compute Target (AMLCompute).**_\n",
|
|
"\n",
|
|
"\n",
|
|
"\n",
|
|
"\n",
|
|
"## Table of Contents\n",
|
|
"\n",
|
|
"1. [Introduction](#Introduction)\n",
|
|
"1. [Setup](#Setup)\n",
|
|
" 1. Initialize a Workspace\n",
|
|
" 1. Create an Experiment\n",
|
|
" 1. Introduction to AmlCompute\n",
|
|
" 1. Submit an AmlCompute run in a few different ways\n",
|
|
" 1. Option 1: Provision as a run based compute target \n",
|
|
" 1. Option 2: Provision as a persistent compute target (Basic)\n",
|
|
" 1. Option 3: Provision as a persistent compute target (Advanced)\n",
|
|
"1. Additional operations to perform on AmlCompute\n",
|
|
"1. [Download model explanations from Azure Machine Learning Run History](#Download)\n",
|
|
"1. [Visualize explanations](#Visualize)\n",
|
|
"1. [Next steps](#Next)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Introduction\n",
|
|
"\n",
|
|
"This notebook showcases how to train and explain a regression model remotely via Azure Machine Learning Compute (AMLCompute), and download the calculated explanations locally for visualization.\n",
|
|
"It demonstrates the API calls that you need to make to submit a run for training and explaining a model to AMLCompute, download the compute explanations remotely, and visualizing the global and local explanations via a visualization dashboard that provides an interactive way of discovering patterns in model predictions and downloaded explanations.\n",
|
|
"\n",
|
|
"We will showcase one of the tabular data explainers: TabularExplainer (SHAP).\n",
|
|
"\n",
|
|
"Problem: Boston Housing Price Prediction with scikit-learn (train a model and run an explainer remotely via AMLCompute, and download and visualize the remotely-calculated explanations.)\n",
|
|
"\n",
|
|
"|  |\n",
|
|
"|:--:|\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Setup\n",
|
|
"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the [configuration notebook](../../../configuration.ipynb) first if you haven't.\n",
|
|
"\n",
|
|
"\n",
|
|
"If you are using Jupyter notebooks, the extensions should be installed automatically with the package.\n",
|
|
"If you are using Jupyter Labs run the following command:\n",
|
|
"```\n",
|
|
"(myenv) $ jupyter labextension install @jupyter-widgets/jupyterlab-manager\n",
|
|
"```\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Check core SDK version number\n",
|
|
"import azureml.core\n",
|
|
"\n",
|
|
"print(\"SDK version:\", azureml.core.VERSION)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Initialize a Workspace\n",
|
|
"\n",
|
|
"Initialize a workspace object from persisted configuration"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"tags": [
|
|
"create workspace"
|
|
]
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core import Workspace\n",
|
|
"\n",
|
|
"ws = Workspace.from_config()\n",
|
|
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\\n')"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Create An Experiment\n",
|
|
"\n",
|
|
"**Experiment** is a logical container in an Azure ML Workspace. It hosts run records which can include run metrics and output artifacts from your experiments."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core import Experiment\n",
|
|
"experiment_name = 'explainer-remote-run-on-amlcompute'\n",
|
|
"experiment = Experiment(workspace=ws, name=experiment_name)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Introduction to AmlCompute\n",
|
|
"\n",
|
|
"Azure Machine Learning Compute is managed compute infrastructure that allows the user to easily create single to multi-node compute of the appropriate VM Family. It is created **within your workspace region** and is a resource that can be used by other users in your workspace. It autoscales by default to the max_nodes, when a job is submitted, and executes in a containerized environment packaging the dependencies as specified by the user. \n",
|
|
"\n",
|
|
"Since it is managed compute, job scheduling and cluster management are handled internally by Azure Machine Learning service. \n",
|
|
"\n",
|
|
"For more information on Azure Machine Learning Compute, please read [this article](https://docs.microsoft.com/azure/machine-learning/service/how-to-set-up-training-targets#amlcompute)\n",
|
|
"\n",
|
|
"If you are an existing BatchAI customer who is migrating to Azure Machine Learning, please read [this article](https://aka.ms/batchai-retirement)\n",
|
|
"\n",
|
|
"**Note**: As with other Azure services, there are limits on certain resources (for eg. AmlCompute quota) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota.\n",
|
|
"\n",
|
|
"\n",
|
|
"The training script `train_explain.py` is already created for you. Let's have a look."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Submit an AmlCompute run in a few different ways\n",
|
|
"\n",
|
|
"First lets check which VM families are available in your region. Azure is a regional service and some specialized SKUs (especially GPUs) are only available in certain regions. Since AmlCompute is created in the region of your workspace, we will use the supported_vms () function to see if the VM family we want to use ('STANDARD_D2_V2') is supported.\n",
|
|
"\n",
|
|
"You can also pass a different region to check availability and then re-create your workspace in that region through the [configuration notebook](../../../configuration.ipynb)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
|
"\n",
|
|
"AmlCompute.supported_vmsizes(workspace=ws)\n",
|
|
"# AmlCompute.supported_vmsizes(workspace=ws, location='southcentralus')"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Create project directory\n",
|
|
"\n",
|
|
"Create a directory that will contain all the necessary code from your local machine that you will need access to on the remote resource. This includes the training script, and any additional files your training script depends on"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"import os\n",
|
|
"import shutil\n",
|
|
"\n",
|
|
"project_folder = './explainer-remote-run-on-amlcompute'\n",
|
|
"os.makedirs(project_folder, exist_ok=True)\n",
|
|
"shutil.copy('train_explain.py', project_folder)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Option 1: Provision as a run based compute target\n",
|
|
"\n",
|
|
"You can provision AmlCompute as a compute target at run-time. In this case, the compute is auto-created for your run, scales up to max_nodes that you specify, and then **deleted automatically** after the run completes."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core.runconfig import RunConfiguration\n",
|
|
"from azureml.core.conda_dependencies import CondaDependencies\n",
|
|
"from azureml.core.runconfig import DEFAULT_CPU_IMAGE\n",
|
|
"\n",
|
|
"# create a new runconfig object\n",
|
|
"run_config = RunConfiguration()\n",
|
|
"\n",
|
|
"# signal that you want to use AmlCompute to execute script.\n",
|
|
"run_config.target = \"amlcompute\"\n",
|
|
"\n",
|
|
"# AmlCompute will be created in the same region as workspace\n",
|
|
"# Set vm size for AmlCompute\n",
|
|
"run_config.amlcompute.vm_size = 'STANDARD_D2_V2'\n",
|
|
"\n",
|
|
"# enable Docker \n",
|
|
"run_config.environment.docker.enabled = True\n",
|
|
"\n",
|
|
"# set Docker base image to the default CPU-based image\n",
|
|
"run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE\n",
|
|
"\n",
|
|
"# use conda_dependencies.yml to create a conda environment in the Docker image for execution\n",
|
|
"run_config.environment.python.user_managed_dependencies = False\n",
|
|
"\n",
|
|
"azureml_pip_packages = [\n",
|
|
" 'azureml-defaults', 'azureml-contrib-explain-model', 'azureml-core', 'azureml-telemetry',\n",
|
|
" 'azureml-explain-model', 'sklearn-pandas', 'azureml-dataprep'\n",
|
|
"]\n",
|
|
"\n",
|
|
"# specify CondaDependencies obj\n",
|
|
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'],\n",
|
|
" pip_packages=azureml_pip_packages)\n",
|
|
"\n",
|
|
"# Now submit a run on AmlCompute\n",
|
|
"from azureml.core.script_run_config import ScriptRunConfig\n",
|
|
"\n",
|
|
"script_run_config = ScriptRunConfig(source_directory=project_folder,\n",
|
|
" script='train_explain.py',\n",
|
|
" run_config=run_config)\n",
|
|
"\n",
|
|
"run = experiment.submit(script_run_config)\n",
|
|
"\n",
|
|
"# Show run details\n",
|
|
"run"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Note: if you need to cancel a run, you can follow [these instructions](https://aka.ms/aml-docs-cancel-run)."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%time\n",
|
|
"# Shows output of the run on stdout.\n",
|
|
"run.wait_for_completion(show_output=True)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Option 2: Provision as a persistent compute target (Basic)\n",
|
|
"\n",
|
|
"You can provision a persistent AmlCompute resource by simply defining two parameters thanks to smart defaults. By default it autoscales from 0 nodes and provisions dedicated VMs to run your job in a container. This is useful when you want to continously re-use the same target, debug it between jobs or simply share the resource with other users of your workspace.\n",
|
|
"\n",
|
|
"* `vm_size`: VM family of the nodes provisioned by AmlCompute. Simply choose from the supported_vmsizes() above\n",
|
|
"* `max_nodes`: Maximum nodes to autoscale to while running a job on AmlCompute"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
|
"from azureml.core.compute_target import ComputeTargetException\n",
|
|
"\n",
|
|
"# Choose a name for your CPU cluster\n",
|
|
"cpu_cluster_name = \"cpu-cluster\"\n",
|
|
"\n",
|
|
"# Verify that cluster does not exist already\n",
|
|
"try:\n",
|
|
" cpu_cluster = ComputeTarget(workspace=ws, name=cpu_cluster_name)\n",
|
|
" print('Found existing cluster, use it.')\n",
|
|
"except ComputeTargetException:\n",
|
|
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2',\n",
|
|
" max_nodes=4)\n",
|
|
" cpu_cluster = ComputeTarget.create(ws, cpu_cluster_name, compute_config)\n",
|
|
"\n",
|
|
"cpu_cluster.wait_for_completion(show_output=True)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Configure & Run"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core.runconfig import RunConfiguration\n",
|
|
"from azureml.core.conda_dependencies import CondaDependencies\n",
|
|
"\n",
|
|
"# create a new RunConfig object\n",
|
|
"run_config = RunConfiguration(framework=\"python\")\n",
|
|
"\n",
|
|
"# Set compute target to AmlCompute target created in previous step\n",
|
|
"run_config.target = cpu_cluster.name\n",
|
|
"\n",
|
|
"# enable Docker \n",
|
|
"run_config.environment.docker.enabled = True\n",
|
|
"\n",
|
|
"azureml_pip_packages = [\n",
|
|
" 'azureml-defaults', 'azureml-contrib-explain-model', 'azureml-core', 'azureml-telemetry',\n",
|
|
" 'azureml-explain-model', 'azureml-dataprep'\n",
|
|
"]\n",
|
|
"\n",
|
|
"# specify CondaDependencies obj\n",
|
|
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'],\n",
|
|
" pip_packages=azureml_pip_packages)\n",
|
|
"\n",
|
|
"from azureml.core import Run\n",
|
|
"from azureml.core import ScriptRunConfig\n",
|
|
"\n",
|
|
"src = ScriptRunConfig(source_directory=project_folder, \n",
|
|
" script='train_explain.py', \n",
|
|
" run_config=run_config) \n",
|
|
"run = experiment.submit(config=src)\n",
|
|
"run"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%time\n",
|
|
"# Shows output of the run on stdout.\n",
|
|
"run.wait_for_completion(show_output=True)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"run.get_metrics()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Option 3: Provision as a persistent compute target (Advanced)\n",
|
|
"\n",
|
|
"You can also specify additional properties or change defaults while provisioning AmlCompute using a more advanced configuration. This is useful when you want a dedicated cluster of 4 nodes (for example you can set the min_nodes and max_nodes to 4), or want the compute to be within an existing VNet in your subscription.\n",
|
|
"\n",
|
|
"In addition to `vm_size` and `max_nodes`, you can specify:\n",
|
|
"* `min_nodes`: Minimum nodes (default 0 nodes) to downscale to while running a job on AmlCompute\n",
|
|
"* `vm_priority`: Choose between 'dedicated' (default) and 'lowpriority' VMs when provisioning AmlCompute. Low Priority VMs use Azure's excess capacity and are thus cheaper but risk your run being pre-empted\n",
|
|
"* `idle_seconds_before_scaledown`: Idle time (default 120 seconds) to wait after run completion before auto-scaling to min_nodes\n",
|
|
"* `vnet_resourcegroup_name`: Resource group of the **existing** VNet within which AmlCompute should be provisioned\n",
|
|
"* `vnet_name`: Name of VNet\n",
|
|
"* `subnet_name`: Name of SubNet within the VNet"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
|
"from azureml.core.compute_target import ComputeTargetException\n",
|
|
"\n",
|
|
"# Choose a name for your CPU cluster\n",
|
|
"cpu_cluster_name = \"cpu-cluster\"\n",
|
|
"\n",
|
|
"# Verify that cluster does not exist already\n",
|
|
"try:\n",
|
|
" cpu_cluster = ComputeTarget(workspace=ws, name=cpu_cluster_name)\n",
|
|
" print('Found existing cluster, use it.')\n",
|
|
"except ComputeTargetException:\n",
|
|
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2',\n",
|
|
" vm_priority='lowpriority',\n",
|
|
" min_nodes=2,\n",
|
|
" max_nodes=4,\n",
|
|
" idle_seconds_before_scaledown='300',\n",
|
|
" vnet_resourcegroup_name='<my-resource-group>',\n",
|
|
" vnet_name='<my-vnet-name>',\n",
|
|
" subnet_name='<my-subnet-name>')\n",
|
|
" cpu_cluster = ComputeTarget.create(ws, cpu_cluster_name, compute_config)\n",
|
|
"\n",
|
|
"cpu_cluster.wait_for_completion(show_output=True)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Configure & Run"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core.runconfig import RunConfiguration\n",
|
|
"from azureml.core.conda_dependencies import CondaDependencies\n",
|
|
"\n",
|
|
"# create a new RunConfig object\n",
|
|
"run_config = RunConfiguration(framework=\"python\")\n",
|
|
"\n",
|
|
"# Set compute target to AmlCompute target created in previous step\n",
|
|
"run_config.target = cpu_cluster.name\n",
|
|
"\n",
|
|
"# enable Docker \n",
|
|
"run_config.environment.docker.enabled = True\n",
|
|
"\n",
|
|
"azureml_pip_packages = [\n",
|
|
" 'azureml-defaults', 'azureml-contrib-explain-model', 'azureml-core', 'azureml-telemetry',\n",
|
|
" 'azureml-explain-model', 'azureml-dataprep'\n",
|
|
"]\n",
|
|
"\n",
|
|
"\n",
|
|
"\n",
|
|
"# specify CondaDependencies obj\n",
|
|
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'],\n",
|
|
" pip_packages=azureml_pip_packages)\n",
|
|
"\n",
|
|
"from azureml.core import Run\n",
|
|
"from azureml.core import ScriptRunConfig\n",
|
|
"\n",
|
|
"src = ScriptRunConfig(source_directory=project_folder, \n",
|
|
" script='train_explain.py', \n",
|
|
" run_config=run_config) \n",
|
|
"run = experiment.submit(config=src)\n",
|
|
"run"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%time\n",
|
|
"# Shows output of the run on stdout.\n",
|
|
"run.wait_for_completion(show_output=True)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"run.get_metrics()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.contrib.explain.model.explanation.explanation_client import ExplanationClient\n",
|
|
"\n",
|
|
"client = ExplanationClient.from_run(run)\n",
|
|
"# Get the top k (e.g., 4) most important features with their importance values\n",
|
|
"explanation = client.download_model_explanation(top_k=4)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Additional operations to perform on AmlCompute\n",
|
|
"\n",
|
|
"You can perform more operations on AmlCompute such as updating the node counts or deleting the compute. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Get_status () gets the latest status of the AmlCompute target\n",
|
|
"cpu_cluster.get_status().serialize()\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Update () takes in the min_nodes, max_nodes and idle_seconds_before_scaledown and updates the AmlCompute target\n",
|
|
"# cpu_cluster.update(min_nodes=1)\n",
|
|
"# cpu_cluster.update(max_nodes=10)\n",
|
|
"cpu_cluster.update(idle_seconds_before_scaledown=300)\n",
|
|
"# cpu_cluster.update(min_nodes=2, max_nodes=4, idle_seconds_before_scaledown=600)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Delete () is used to deprovision and delete the AmlCompute target. Useful if you want to re-use the compute name \n",
|
|
"# 'cpu-cluster' in this case but use a different VM family for instance.\n",
|
|
"\n",
|
|
"# cpu_cluster.delete()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Download \n",
|
|
"1. Download model explanation data."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.contrib.explain.model.explanation.explanation_client import ExplanationClient\n",
|
|
"\n",
|
|
"# Get model explanation data\n",
|
|
"client = ExplanationClient.from_run(run)\n",
|
|
"global_explanation = client.download_model_explanation()\n",
|
|
"local_importance_values = global_explanation.local_importance_values\n",
|
|
"expected_values = global_explanation.expected_values\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Or you can use the saved run.id to retrive the feature importance values\n",
|
|
"client = ExplanationClient.from_run_id(ws, experiment_name, run.id)\n",
|
|
"global_explanation = client.download_model_explanation()\n",
|
|
"local_importance_values = global_explanation.local_importance_values\n",
|
|
"expected_values = global_explanation.expected_values"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Get the top k (e.g., 4) most important features with their importance values\n",
|
|
"global_explanation_topk = client.download_model_explanation(top_k=4)\n",
|
|
"global_importance_values = global_explanation_topk.get_ranked_global_values()\n",
|
|
"global_importance_names = global_explanation_topk.get_ranked_global_names()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"print('global importance values: {}'.format(global_importance_values))\n",
|
|
"print('global importance names: {}'.format(global_importance_names))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"2. Download model file."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# retrieve model for visualization and deployment\n",
|
|
"from azureml.core.model import Model\n",
|
|
"from sklearn.externals import joblib\n",
|
|
"original_model = Model(ws, 'model_explain_model_on_amlcomp')\n",
|
|
"model_path = original_model.download(exist_ok=True)\n",
|
|
"original_model = joblib.load(model_path)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"3. Download test dataset."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# retrieve x_test for visualization\n",
|
|
"from sklearn.externals import joblib\n",
|
|
"x_test_path = './x_test_boston_housing.pkl'\n",
|
|
"run.download_file('x_test_boston_housing.pkl', output_file_path=x_test_path)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"x_test = joblib.load('x_test_boston_housing.pkl')"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Visualize\n",
|
|
"Load the visualization dashboard"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.contrib.explain.model.visualize import ExplanationDashboard"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"ExplanationDashboard(global_explanation, original_model, x_test)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Next\n",
|
|
"Learn about other use cases of the explain package on a:\n",
|
|
"1. [Training time: regression problem](../../tabular-data/explain-binary-classification-local.ipynb) \n",
|
|
"1. [Training time: binary classification problem](../../tabular-data/explain-binary-classification-local.ipynb)\n",
|
|
"1. [Training time: multiclass classification problem](../../tabular-data/explain-multiclass-classification-local.ipynb)\n",
|
|
"1. Explain models with engineered features:\n",
|
|
" 1. [Simple feature transformations](../../tabular-data/simple-feature-transformations-explain-local.ipynb)\n",
|
|
" 1. [Advanced feature transformations](../../tabular-data/advanced-feature-transformations-explain-local.ipynb)\n",
|
|
"1. [Save model explanations via Azure Machine Learning Run History](../run-history/save-retrieve-explanations-run-history.ipynb)\n",
|
|
"1. Inferencing time: deploy a classification model and explainer:\n",
|
|
" 1. [Deploy a locally-trained model and explainer](../scoring-time/train-explain-model-locally-and-deploy.ipynb)\n",
|
|
" 1. [Deploy a remotely-trained model and explainer](../scoring-time/train-explain-model-on-amlcompute-and-deploy.ipynb)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": []
|
|
}
|
|
],
|
|
"metadata": {
|
|
"authors": [
|
|
{
|
|
"name": "mesameki"
|
|
}
|
|
],
|
|
"kernelspec": {
|
|
"display_name": "Python 3.6",
|
|
"language": "python",
|
|
"name": "python36"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.6.8"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 2
|
|
} |