Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# Automated Machine Learning
_**Exploring Previous Runs**_

## Contents
1. [Introduction](#Introduction)
1. [Setup](#Setup)
1. [Explore](#Explore)
1. [Download](#Download)
1. [Register](#Register)

## Introduction
This notebook shows how to navigate previously executed runs and how to download the fitted model from any of them.

Make sure you have executed the [configuration](../../../configuration.ipynb) before running this notebook.

In this notebook you will learn how to:
1. List all experiments in a workspace.
2. List all AutoML runs in an experiment.
3. Get details for an AutoML run, including settings, run widget, and all metrics.
4. Download a fitted pipeline for any iteration.

## Setup

In [None]:
import pandas as pd
import json

from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.train.automl.run import AutoMLRun

In [None]:
ws = Workspace.from_config()

Opt-in diagnostics for better experience, quality, and security of future releases.

In [None]:
from azureml.telemetry import set_diagnostics_collection
set_diagnostics_collection(send_diagnostics = True)

## Explore

### List Experiments

In [None]:
experiment_list = Experiment.list(workspace=ws)

summary_df = pd.DataFrame(index = ['No of Runs'])
for experiment in experiment_list:
    automl_runs = list(experiment.get_runs(type='automl'))
    summary_df[experiment.name] = [len(automl_runs)]
    
# Show full column contents. Recent pandas versions expect None here; -1 is deprecated.
pd.set_option('display.max_colwidth', None)
summary_df.T

### List runs for an experiment
Set `experiment_name` to any experiment name listed in the output of the previous cell to load its AutoML runs.

In [None]:
experiment_name = 'automl-local-classification' # Replace this with any experiment name from the previous cell.

proj = ws.experiments[experiment_name]
summary_df = pd.DataFrame(index = ['Type', 'Status', 'Primary Metric', 'Iterations', 'Compute', 'Name'])
automl_runs = list(proj.get_runs(type='automl'))
automl_runs_project = []
for run in automl_runs:
    properties = run.get_properties()
    tags = run.get_tags()
    status = run.get_details()['status']  # Fetch the run details once and reuse the status.
    amlsettings = json.loads(properties['AMLSettingsJsonString'])
    # Prefer the 'iterations' tag; fall back to the 'num_iterations' property.
    if 'iterations' in tags:
        iterations = tags['iterations']
    else:
        iterations = properties['num_iterations']
    summary_df[run.id] = [amlsettings['task_type'], status, properties['primary_metric'],
                          iterations, properties['target'], amlsettings['name']]
    # Keep only completed runs for the cells that follow.
    if status == 'Completed':
        automl_runs_project.append(run.id)
    
from IPython.display import HTML
projname_html = HTML("<h3>{}</h3>".format(proj.name))

from IPython.display import display
display(projname_html)
display(summary_df.T)
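
The per-run lookups in the cell above can be tried without a workspace. The sketch below repeats them on plain dictionaries. The helper name `summarize_run` and all sample values are hypothetical; only the dictionary keys are taken from the cell above. Unlike that cell, the sketch returns `None` for the iteration count when neither source has it, rather than raising a `KeyError`.

```python
import json


def summarize_run(properties, tags, status):
    """Build one summary row from a run's properties, tags, and status string.

    Hypothetical helper: it mirrors the lookups in the cell above on plain dicts.
    """
    amlsettings = json.loads(properties['AMLSettingsJsonString'])
    # Prefer the 'iterations' tag; fall back to the 'num_iterations' property.
    iterations = tags.get('iterations', properties.get('num_iterations'))
    return {
        'Type': amlsettings['task_type'],
        'Status': status,
        'Primary Metric': properties['primary_metric'],
        'Iterations': iterations,
        'Compute': properties['target'],
        'Name': amlsettings['name'],
    }


# Made-up sample values, for illustration only.
sample_properties = {
    'AMLSettingsJsonString': json.dumps({'task_type': 'classification', 'name': 'demo'}),
    'primary_metric': 'AUC_weighted',
    'target': 'local',
    'num_iterations': '10',
}
row = summarize_run(sample_properties, tags={}, status='Completed')
print(row)
```

Passing `tags={'iterations': '5'}` instead makes the tag value take precedence over the property.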

### Get details for a run

Use the experiment name and a run id from the previous cell's output to get more details on a particular run.

In [None]:
run_id = automl_runs_project[0]  # Replace with any run id listed above.
assert (run_id in summary_df.keys()), "Run id not found! Please set run id to a value from above run ids"

from azureml.widgets import RunDetails

experiment = Experiment(ws, experiment_name)
ml_run = AutoMLRun(experiment = experiment, run_id = run_id)

summary_df = pd.DataFrame(index = ['Type', 'Status', 'Primary Metric', 'Iterations', 'Compute', 'Name', 'Start Time', 'End Time'])
properties = ml_run.get_properties()
tags = ml_run.get_tags()
status = ml_run.get_details()
amlsettings = json.loads(properties['AMLSettingsJsonString'])
if 'iterations' in tags:
    iterations = tags['iterations']
else:
    iterations = properties['num_iterations']
start_time = None
if 'startTimeUtc' in status:
    start_time = status['startTimeUtc']
end_time = None
if 'endTimeUtc' in status:
    end_time = status['endTimeUtc']
summary_df[ml_run.id] = [amlsettings['task_type'], status['status'], properties['primary_metric'], iterations, properties['target'], amlsettings['name'], start_time, end_time]
display(HTML('<h3>Runtime Details</h3>'))
display(summary_df)

#settings_df = pd.DataFrame(data = amlsettings, index = [''])
display(HTML('<h3>AutoML Settings</h3>'))
display(amlsettings)

display(HTML('<h3>Iterations</h3>'))
RunDetails(ml_run).show() 

children = list(ml_run.get_children())
metricslist = {}
for run in children:
    properties = run.get_properties()
    metrics = {k: v for k, v in run.get_metrics().items() if isinstance(v, float)}
    metricslist[int(properties['iteration'])] = metrics

rundata = pd.DataFrame(metricslist).sort_index(axis=1)  # Order the columns by iteration number.
display(HTML('<h3>Metrics</h3>'))
display(rundata)
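
The metrics step at the end of the cell above keeps only float-valued metrics for each child run and keys them by iteration number. A minimal sketch of that step on plain dictionaries follows. The function name `collect_float_metrics` and the sample data are hypothetical stand-ins for the child runs returned by the SDK.

```python
def collect_float_metrics(children):
    """Map iteration number -> {metric name: float value}.

    `children` is an iterable of (properties, metrics) pairs of plain dicts,
    a hypothetical stand-in for the child runs of an AutoML run.
    """
    table = {}
    for properties, metrics in children:
        # Drop non-float entries (for example, references to stored artifacts).
        floats = {k: v for k, v in metrics.items() if isinstance(v, float)}
        table[int(properties['iteration'])] = floats
    # Return the entries ordered by ascending iteration number.
    return dict(sorted(table.items()))


# Made-up sample data, for illustration only.
sample_children = [
    ({'iteration': '1'}, {'AUC_weighted': 0.91, 'confusion_matrix': 'not-a-float'}),
    ({'iteration': '0'}, {'AUC_weighted': 0.87, 'accuracy': 0.80}),
]
metrics_by_iteration = collect_float_metrics(sample_children)
print(metrics_by_iteration)
```

Passing the resulting dictionary to `pd.DataFrame` gives one column per iteration and one row per metric, which is the layout displayed above.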


## Download

### Download the Best Model for Any Given Metric

In [None]:
metric = 'AUC_weighted' # Replace with a metric name.
best_run, fitted_model = ml_run.get_output(metric = metric)
fitted_model

### Download the Model for Any Given Iteration

In [None]:
iteration = 1 # Replace with an iteration number.
best_run, fitted_model = ml_run.get_output(iteration = iteration)
fitted_model

## Register

### Register fitted model for deployment
If neither `metric` nor `iteration` is specified in the `register_model` call, the iteration with the best primary metric is registered.
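
The selection rule can be illustrated on a plain dictionary of per-iteration metrics. This is only a sketch of the rule as described here, not the SDK's implementation. The function `pick_iteration` and the sample values are hypothetical, "best" is assumed to mean the highest value, and letting `iteration` win when both arguments are given is a choice made for this sketch.

```python
def pick_iteration(metrics_by_iteration, primary_metric, metric=None, iteration=None):
    """Return the iteration number that the selection rule would choose.

    Hypothetical helper. Assumes a higher metric value is better.
    """
    # An explicit iteration is used as-is (a choice made for this sketch).
    if iteration is not None:
        return iteration
    # Otherwise rank by the requested metric, or by the primary metric by default.
    chosen = metric if metric is not None else primary_metric
    candidates = {i: m[chosen] for i, m in metrics_by_iteration.items() if chosen in m}
    if not candidates:
        raise ValueError("No iteration reports the metric '{}'".format(chosen))
    return max(candidates, key=candidates.get)


# Made-up sample data, for illustration only.
sample_metrics = {
    0: {'AUC_weighted': 0.87, 'accuracy': 0.93},
    1: {'AUC_weighted': 0.91, 'accuracy': 0.90},
}
default_choice = pick_iteration(sample_metrics, primary_metric='AUC_weighted')
by_metric = pick_iteration(sample_metrics, primary_metric='AUC_weighted', metric='accuracy')
by_iteration = pick_iteration(sample_metrics, primary_metric='AUC_weighted', iteration=0)
print(default_choice, by_metric, by_iteration)
```

The three calls correspond to the three registration cells in this section: no argument, `metric`, and `iteration`.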

In [None]:
description = 'AutoML Model'
tags = None
ml_run.register_model(description = description, tags = tags)
print(ml_run.model_id) # Use this id to deploy the model as a web service in Azure.

### Register the Best Model for Any Given Metric

In [None]:
metric = 'AUC_weighted' # Replace with a metric name.
description = 'AutoML Model'
tags = None
ml_run.register_model(description = description, tags = tags, metric = metric)
print(ml_run.model_id) # Use this id to deploy the model as a web service in Azure.

### Register the Model for Any Given Iteration

In [None]:
iteration = 1 # Replace with an iteration number.
description = 'AutoML Model'
tags = None
ml_run.register_model(description = description, tags = tags, iteration = iteration)
print(ml_run.model_id) # Use this id to deploy the model as a web service in Azure.