MachineLearningNotebooks/automl/07.auto-ml-exploring-previous-runs.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright (c) Microsoft Corporation. All rights reserved.\n",
    "\n",
    "Licensed under the MIT License."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# AutoML 07: Exploring previous runs\n",
    "\n",
    "In this example we present some examples on navigating previously executed runs. We also show how you can download a fitted model for any previous run.\n",
    "\n",
    "Make sure you have executed the [00.configuration](00.configuration.ipynb) before running this notebook.\n",
    "\n",
    "In this notebook you would see\n",
    "1. List all Experiments for the workspace\n",
    "2. List AutoML runs for an Experiment\n",
    "3. Get details for a AutoML Run. (Automl settings, run widget & all metrics)\n",
    "4. Download fitted pipeline for any iteration\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# List all AutoML Experiments in a Workspace"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import logging\n",
    "import os\n",
    "import random\n",
    "import re\n",
    "\n",
    "from matplotlib import pyplot as plt\n",
    "from matplotlib.pyplot import imshow\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from sklearn import datasets\n",
    "\n",
    "import azureml.core\n",
    "from azureml.core.experiment import Experiment\n",
    "from azureml.core.run import Run\n",
    "from azureml.core.workspace import Workspace\n",
    "from azureml.train.automl import AutoMLConfig\n",
    "from azureml.train.automl.run import AutoMLRun"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ws = Workspace.from_config()\n",
    "experiment_list = Experiment.list(workspace=ws)\n",
    "\n",
    "summary_df = pd.DataFrame(index = ['No of Runs'])\n",
    "pattern = re.compile('^AutoML_[^_]*$')\n",
    "for experiment in experiment_list:\n",
    "    all_runs = list(experiment.get_runs())\n",
    "    automl_runs = []\n",
    "    for run in all_runs:\n",
    "        if(pattern.match(run.id)):\n",
    "            automl_runs.append(run)    \n",
    "    summary_df[experiment.name] = [len(automl_runs)]\n",
    "    \n",
    "pd.set_option('display.max_colwidth', -1)\n",
    "summary_df.T"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Diagnostics\n",
    "\n",
    "Opt-in diagnostics for better experience, quality, and security of future releases"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from azureml.telemetry import set_diagnostics_collection\n",
    "set_diagnostics_collection(send_diagnostics=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# List AutoML runs for an Experiment\n",
    "You can set <i>Experiment</i> name with any experiment name from the result of the Experiment.list cell to load the AutoML runs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "experiment_name = 'automl-local-classification' # Replace this with any project name from previous cell\n",
    "\n",
    "proj = ws.experiments()[experiment_name]\n",
    "summary_df = pd.DataFrame(index = ['Type', 'Status', 'Primary Metric', 'Iterations', 'Compute', 'Name'])\n",
    "pattern = re.compile('^AutoML_[^_]*$')\n",
    "all_runs = list(proj.get_runs(properties={'azureml.runsource': 'automl'}))\n",
    "for run in all_runs:\n",
    "    if(pattern.match(run.id)):\n",
    "        properties = run.get_properties()\n",
    "        tags = run.get_tags()\n",
    "        amlsettings = eval(properties['RawAMLSettingsString'])\n",
    "        if 'iterations' in tags:\n",
    "            iterations = tags['iterations']\n",
    "        else:\n",
    "            iterations = properties['num_iterations']\n",
    "        summary_df[run.id] = [amlsettings['task_type'], run.get_details()['status'], properties['primary_metric'], iterations, properties['target'], amlsettings['name']]\n",
    "    \n",
    "from IPython.display import HTML\n",
    "projname_html = HTML(\"<h3>{}</h3>\".format(proj.name))\n",
    "\n",
    "from IPython.display import display\n",
    "display(projname_html)\n",
    "display(summary_df.T)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Get Details for a Auto ML Run\n",
    "\n",
    "Copy the project name and run id from the previous cell output to find more details on a particular run."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "run_id = '' # Filling your own run_id\n",
    "\n",
    "from azureml.train.widgets import RunDetails\n",
    "\n",
    "experiment = Experiment(ws, experiment_name)\n",
    "ml_run = AutoMLRun(experiment=experiment, run_id=run_id)\n",
    "\n",
    "summary_df = pd.DataFrame(index = ['Type', 'Status', 'Primary Metric', 'Iterations', 'Compute', 'Name', 'Start Time', 'End Time'])\n",
    "properties = ml_run.get_properties()\n",
    "tags = ml_run.get_tags()\n",
    "status = ml_run.get_details()\n",
    "amlsettings = eval(properties['RawAMLSettingsString'])\n",
    "if 'iterations' in tags:\n",
    "    iterations = tags['iterations']\n",
    "else:\n",
    "    iterations = properties['num_iterations']\n",
    "start_time = None\n",
    "if 'startTimeUtc' in status:\n",
    "    start_time = status['startTimeUtc']\n",
    "end_time = None\n",
    "if 'endTimeUtc' in status:\n",
    "    end_time = status['endTimeUtc']\n",
    "summary_df[ml_run.id] = [amlsettings['task_type'], status['status'], properties['primary_metric'], iterations, properties['target'], amlsettings['name'], start_time, end_time]\n",
    "display(HTML('<h3>Runtime Details</h3>'))\n",
    "display(summary_df)\n",
    "\n",
    "#settings_df = pd.DataFrame(data=amlsettings, index=[''])\n",
    "display(HTML('<h3>AutoML Settings</h3>'))\n",
    "display(amlsettings)\n",
    "\n",
    "display(HTML('<h3>Iterations</h3>'))\n",
    "RunDetails(ml_run).show() \n",
    "\n",
    "children = list(ml_run.get_children())\n",
    "metricslist = {}\n",
    "for run in children:\n",
    "    properties = run.get_properties()\n",
    "    metrics = {k: v for k, v in run.get_metrics().items() if isinstance(v, float)}    \n",
    "    metricslist[int(properties['iteration'])] = metrics\n",
    "\n",
    "rundata = pd.DataFrame(metricslist).sort_index(1)\n",
    "display(HTML('<h3>Metrics</h3>'))\n",
    "display(rundata)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Download fitted models"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Download best model for any given metric"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "metric = 'AUC_weighted' # Replace with a metric name\n",
    "best_run, fitted_model = ml_run.get_output(metric=metric)\n",
    "fitted_model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Download model for any given iteration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "iteration = 4 # Replace with an interation number\n",
    "best_run, fitted_model = ml_run.get_output(iteration=iteration)\n",
    "fitted_model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Register fitted model for deployment"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "description = 'AutoML Model'\n",
    "tags = None\n",
    "ml_run.register_model(description=description, tags=tags)\n",
    "ml_run.model_id # Use this id to deploy the model as a web service in Azure"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Register best model for any given metric"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "metric = 'AUC_weighted' # Replace with a metric name\n",
    "description = 'AutoML Model'\n",
    "tags = None\n",
    "ml_run.register_model(description=description, tags=tags, metric=metric)\n",
    "ml_run.model_id # Use this id to deploy the model as a web service in Azure"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Register model for any given iteration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "iteration = 4 # Replace with an interation number\n",
    "description = 'AutoML Model'\n",
    "tags = None\n",
    "ml_run.register_model(description=description, tags=tags, iteration=iteration)\n",
    "ml_run.model_id # Use this id to deploy the model as a web service in Azure"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.6",
   "language": "python",
   "name": "python36"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}