Compare commits

...

5 Commits

Author SHA1 Message Date
amlrelsa-ms
5189691f06 update samples from Release-56 as a part of SDK release 2020-06-22 18:11:40 +00:00
Harneet Virk
fb900916e3 Update README.md 2020-06-11 13:26:04 -07:00
Harneet Virk
738347f3da Merge pull request #996 from Azure/release_update/Release-55
update samples from Release-55 as a part of  SDK release
2020-06-08 15:31:35 -07:00
amlrelsa-ms
34a67c1f8b update samples from Release-55 as a part of SDK release 2020-06-08 22:28:25 +00:00
Harneet Virk
34898828be Merge pull request #992 from Azure/release_update/Release-54
update samples from Release-54 as a part of  SDK release
2020-06-02 14:42:02 -07:00
55 changed files with 2923 additions and 156 deletions

View File

@@ -103,7 +103,7 @@
"source": [ "source": [
"import azureml.core\n", "import azureml.core\n",
"\n", "\n",
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },

View File

@@ -0,0 +1,549 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved. \n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/contrib/fairness/fairlearn-azureml-mitigation.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Unfairness Mitigation with Fairlearn and Azure Machine Learning\n",
"**This notebook shows how to upload results from Fairlearn's GridSearch mitigation algorithm into a dashboard in Azure Machine Learning Studio**\n",
"\n",
"## Table of Contents\n",
"\n",
"1. [Introduction](#Introduction)\n",
"1. [Loading the Data](#LoadingData)\n",
"1. [Training an Unmitigated Model](#UnmitigatedModel)\n",
"1. [Mitigation with GridSearch](#Mitigation)\n",
"1. [Uploading a Fairness Dashboard to Azure](#AzureUpload)\n",
" 1. Registering models\n",
" 1. Computing Fairness Metrics\n",
" 1. Uploading to Azure\n",
"1. [Conclusion](#Conclusion)\n",
"\n",
"<a id=\"Introduction\"></a>\n",
"## Introduction\n",
"This notebook shows how to use [Fairlearn (an open source fairness assessment and unfairness mitigation package)](http://fairlearn.github.io) and Azure Machine Learning Studio for a binary classification problem. This example uses the well-known adult census dataset. For the purposes of this notebook, we shall treat this as a loan decision problem. We will pretend that the label indicates whether or not each individual repaid a loan in the past. We will use the data to train a predictor to predict whether previously unseen individuals will repay a loan or not. The assumption is that the model predictions are used to decide whether an individual should be offered a loan. Its purpose is purely illustrative of a workflow including a fairness dashboard - in particular, we do **not** include a full discussion of the detailed issues which arise when considering fairness in machine learning. For such discussions, please [refer to the Fairlearn website](http://fairlearn.github.io/).\n",
"\n",
"We will apply the [grid search algorithm](https://fairlearn.github.io/api_reference/fairlearn.reductions.html#fairlearn.reductions.GridSearch) from the Fairlearn package using a specific notion of fairness called Demographic Parity. This produces a set of models, and we will view these in a dashboard both locally and in the Azure Machine Learning Studio.\n",
"\n",
"### Setup\n",
"\n",
"To use this notebook, an Azure Machine Learning workspace is required.\n",
"Please see the [configuration notebook](../../configuration.ipynb) for information about creating one, if required.\n",
"This notebook also requires the following packages:\n",
"* `azureml-contrib-fairness`\n",
"* `fairlearn==0.4.6`\n",
"* `joblib`\n",
"* `shap`\n",
"\n",
"\n",
"<a id=\"LoadingData\"></a>\n",
"## Loading the Data\n",
"We use the well-known `adult` census dataset, which we load using `shap` (for convenience). We start with a fairly unremarkable set of imports:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from fairlearn.reductions import GridSearch, DemographicParity, ErrorRate\n",
"from fairlearn.widget import FairlearnDashboard\n",
"from sklearn import svm\n",
"from sklearn.preprocessing import LabelEncoder, StandardScaler\n",
"from sklearn.linear_model import LogisticRegression\n",
"import pandas as pd\n",
"import shap"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now load and inspect the data from the `shap` package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"X_raw, Y = shap.datasets.adult()\n",
"X_raw[\"Race\"].value_counts().to_dict()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are going to treat the sex of each individual as a protected attribute (where 0 indicates female and 1 indicates male), and in this particular case we are going separate this attribute out and drop it from the main data (this is not always the best option - see the [Fairlearn website](http://fairlearn.github.io/) for further discussion). We also separate out the Race column, but we will not perform any mitigation based on it. Finally, we perform some standard data preprocessing steps to convert the data into a format suitable for the ML algorithms"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"A = X_raw[['Sex','Race']]\n",
"X = X_raw.drop(labels=['Sex', 'Race'],axis = 1)\n",
"X = pd.get_dummies(X)\n",
"\n",
"\n",
"le = LabelEncoder()\n",
"Y = le.fit_transform(Y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With our data prepared, we can make the conventional split in to 'test' and 'train' subsets:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"X_train, X_test, Y_train, Y_test, A_train, A_test = train_test_split(X_raw, \n",
" Y, \n",
" A,\n",
" test_size = 0.2,\n",
" random_state=0,\n",
" stratify=Y)\n",
"\n",
"# Work around indexing issue\n",
"X_train = X_train.reset_index(drop=True)\n",
"A_train = A_train.reset_index(drop=True)\n",
"X_test = X_test.reset_index(drop=True)\n",
"A_test = A_test.reset_index(drop=True)\n",
"\n",
"# Improve labels\n",
"A_test.Sex.loc[(A_test['Sex'] == 0)] = 'female'\n",
"A_test.Sex.loc[(A_test['Sex'] == 1)] = 'male'\n",
"\n",
"\n",
"A_test.Race.loc[(A_test['Race'] == 0)] = 'Amer-Indian-Eskimo'\n",
"A_test.Race.loc[(A_test['Race'] == 1)] = 'Asian-Pac-Islander'\n",
"A_test.Race.loc[(A_test['Race'] == 2)] = 'Black'\n",
"A_test.Race.loc[(A_test['Race'] == 3)] = 'Other'\n",
"A_test.Race.loc[(A_test['Race'] == 4)] = 'White'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"UnmitigatedModel\"></a>\n",
"## Training an Unmitigated Model\n",
"\n",
"So we have a point of comparison, we first train a model (specifically, logistic regression from scikit-learn) on the raw data, without applying any mitigation algorithm:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"unmitigated_predictor = LogisticRegression(solver='liblinear', fit_intercept=True)\n",
"\n",
"unmitigated_predictor.fit(X_train, Y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can view this model in the fairness dashboard, and see the disparities which appear:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"FairlearnDashboard(sensitive_features=A_test, sensitive_feature_names=['Sex', 'Race'],\n",
" y_true=Y_test,\n",
" y_pred={\"unmitigated\": unmitigated_predictor.predict(X_test)})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Looking at the disparity in accuracy when we select 'Sex' as the sensitive feature, we see that males have an error rate about three times greater than the females. More interesting is the disparity in opportunitiy - males are offered loans at three times the rate of females.\n",
"\n",
"Despite the fact that we removed the feature from the training data, our predictor still discriminates based on sex. This demonstrates that simply ignoring a protected attribute when fitting a predictor rarely eliminates unfairness. There will generally be enough other features correlated with the removed attribute to lead to disparate impact."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"Mitigation\"></a>\n",
"## Mitigation with GridSearch\n",
"\n",
"The `GridSearch` class in `Fairlearn` implements a simplified version of the exponentiated gradient reduction of [Agarwal et al. 2018](https://arxiv.org/abs/1803.02453). The user supplies a standard ML estimator, which is treated as a blackbox - for this simple example, we shall use the logistic regression estimator from scikit-learn. `GridSearch` works by generating a sequence of relabellings and reweightings, and trains a predictor for each.\n",
"\n",
"For this example, we specify demographic parity (on the protected attribute of sex) as the fairness metric. Demographic parity requires that individuals are offered the opportunity (a loan in this example) independent of membership in the protected class (i.e., females and males should be offered loans at the same rate). *We are using this metric for the sake of simplicity* in this example; the appropriate fairness metric can only be selected after *careful examination of the broader context* in which the model is to be used."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sweep = GridSearch(LogisticRegression(solver='liblinear', fit_intercept=True),\n",
" constraints=DemographicParity(),\n",
" grid_size=71)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With our estimator created, we can fit it to the data. After `fit()` completes, we extract the full set of predictors from the `GridSearch` object.\n",
"\n",
"The following cell trains a many copies of the underlying estimator, and may take a minute or two to run:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sweep.fit(X_train, Y_train,\n",
" sensitive_features=A_train.Sex)\n",
"\n",
"predictors = sweep._predictors"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We could load these predictors into the Fairness dashboard now. However, the plot would be somewhat confusing due to their number. In this case, we are going to remove the predictors which are dominated in the error-disparity space by others from the sweep (note that the disparity will only be calculated for the protected attribute; other potentially protected attributes will *not* be mitigated). In general, one might not want to do this, since there may be other considerations beyond the strict optimisation of error and disparity (of the given protected attribute)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"errors, disparities = [], []\n",
"for m in predictors:\n",
" classifier = lambda X: m.predict(X)\n",
" \n",
" error = ErrorRate()\n",
" error.load_data(X_train, pd.Series(Y_train), sensitive_features=A_train.Sex)\n",
" disparity = DemographicParity()\n",
" disparity.load_data(X_train, pd.Series(Y_train), sensitive_features=A_train.Sex)\n",
" \n",
" errors.append(error.gamma(classifier)[0])\n",
" disparities.append(disparity.gamma(classifier).max())\n",
" \n",
"all_results = pd.DataFrame( {\"predictor\": predictors, \"error\": errors, \"disparity\": disparities})\n",
"\n",
"dominant_models_dict = dict()\n",
"base_name_format = \"census_gs_model_{0}\"\n",
"row_id = 0\n",
"for row in all_results.itertuples():\n",
" model_name = base_name_format.format(row_id)\n",
" errors_for_lower_or_eq_disparity = all_results[\"error\"][all_results[\"disparity\"]<=row.disparity]\n",
" if row.error <= errors_for_lower_or_eq_disparity.min():\n",
" dominant_models_dict[model_name] = row.predictor\n",
" row_id = row_id + 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can construct predictions for the dominant models (we include the unmitigated predictor as well, for comparison):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"predictions_dominant = {\"census_unmitigated\": unmitigated_predictor.predict(X_test)}\n",
"models_dominant = {\"census_unmitigated\": unmitigated_predictor}\n",
"for name, predictor in dominant_models_dict.items():\n",
" value = predictor.predict(X_test)\n",
" predictions_dominant[name] = value\n",
" models_dominant[name] = predictor"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These predictions may then be viewed in the fairness dashboard. We include the race column from the dataset, as an alternative basis for assessing the models. However, since we have not based our mitigation on it, the variation in the models with respect to race can be large."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"FairlearnDashboard(sensitive_features=A_test, \n",
" sensitive_feature_names=['Sex', 'Race'],\n",
" y_true=Y_test.tolist(),\n",
" y_pred=predictions_dominant)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When using sex as the sensitive feature, we see a Pareto front forming - the set of predictors which represent optimal tradeoffs between accuracy and disparity in predictions. In the ideal case, we would have a predictor at (1,0) - perfectly accurate and without any unfairness under demographic parity (with respect to the protected attribute \"sex\"). The Pareto front represents the closest we can come to this ideal based on our data and choice of estimator. Note the range of the axes - the disparity axis covers more values than the accuracy, so we can reduce disparity substantially for a small loss in accuracy. Finally, we also see that the unmitigated model is towards the top right of the plot, with high accuracy, but worst disparity.\n",
"\n",
"By clicking on individual models on the plot, we can inspect their metrics for disparity and accuracy in greater detail. In a real example, we would then pick the model which represented the best trade-off between accuracy and disparity given the relevant business constraints."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"AzureUpload\"></a>\n",
"## Uploading a Fairness Dashboard to Azure\n",
"\n",
"Uploading a fairness dashboard to Azure is a two stage process. The `FairlearnDashboard` invoked in the previous section relies on the underlying Python kernel to compute metrics on demand. This is obviously not available when the fairness dashboard is rendered in AzureML Studio. By default, the dashboard in Azure Machine Learning Studio also requires the models to be registered. The required stages are therefore:\n",
"1. Register the dominant models\n",
"1. Precompute all the required metrics\n",
"1. Upload to Azure\n",
"\n",
"Before that, we need to connect to Azure Machine Learning Studio:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Workspace, Experiment, Model\n",
"\n",
"ws = Workspace.from_config()\n",
"ws.get_details()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"RegisterModels\"></a>\n",
"### Registering Models\n",
"\n",
"The fairness dashboard is designed to integrate with registered models, so we need to do this for the models we want in the Studio portal. The assumption is that the names of the models specified in the dashboard dictionary correspond to the `id`s (i.e. `<name>:<version>` pairs) of registered models in the workspace. We register each of the models in the `models_dominant` dictionary into the workspace. For this, we have to save each model to a file, and then register that file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import joblib\n",
"import os\n",
"\n",
"os.makedirs('models', exist_ok=True)\n",
"def register_model(name, model):\n",
" print(\"Registering \", name)\n",
" model_path = \"models/{0}.pkl\".format(name)\n",
" joblib.dump(value=model, filename=model_path)\n",
" registered_model = Model.register(model_path=model_path,\n",
" model_name=name,\n",
" workspace=ws)\n",
" print(\"Registered \", registered_model.id)\n",
" return registered_model.id\n",
"\n",
"model_name_id_mapping = dict()\n",
"for name, model in models_dominant.items():\n",
" m_id = register_model(name, model)\n",
" model_name_id_mapping[name] = m_id"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, produce new predictions dictionaries, with the updated names:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"predictions_dominant_ids = dict()\n",
"for name, y_pred in predictions_dominant.items():\n",
" predictions_dominant_ids[model_name_id_mapping[name]] = y_pred"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"PrecomputeMetrics\"></a>\n",
"### Precomputing Metrics\n",
"\n",
"We create a _dashboard dictionary_ using Fairlearn's `metrics` package. The `_create_group_metric_set` method has arguments similar to the Dashboard constructor, except that the sensitive features are passed as a dictionary (to ensure that names are available), and we must specify the type of prediction. Note that we use the `predictions_dominant_ids` dictionary we just created:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sf = { 'sex': A_test.Sex, 'race': A_test.Race }\n",
"\n",
"from fairlearn.metrics._group_metric_set import _create_group_metric_set\n",
"\n",
"\n",
"dash_dict = _create_group_metric_set(y_true=Y_test,\n",
" predictions=predictions_dominant_ids,\n",
" sensitive_features=sf,\n",
" prediction_type='binary_classification')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"DashboardUpload\"></a>\n",
"### Uploading the Dashboard\n",
"\n",
"Now, we import our `contrib` package which contains the routine to perform the upload:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.contrib.fairness import upload_dashboard_dictionary, download_dashboard_by_upload_id"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can create an Experiment, then a Run, and upload our dashboard to it:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"exp = Experiment(ws, \"Test_Fairlearn_GridSearch_Census_Demo\")\n",
"print(exp)\n",
"\n",
"run = exp.start_logging()\n",
"try:\n",
" dashboard_title = \"Dominant Models from GridSearch\"\n",
" upload_id = upload_dashboard_dictionary(run,\n",
" dash_dict,\n",
" dashboard_name=dashboard_title)\n",
" print(\"\\nUploaded to id: {0}\\n\".format(upload_id))\n",
"\n",
" downloaded_dict = download_dashboard_by_upload_id(run, upload_id)\n",
"finally:\n",
" run.complete()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The dashboard can be viewed in the Run Details page.\n",
"\n",
"Finally, we can verify that the dashboard dictionary which we downloaded matches our upload:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(dash_dict == downloaded_dict)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"Conclusion\"></a>\n",
"## Conclusion\n",
"\n",
"In this notebook we have demonstrated how to use the `GridSearch` algorithm from Fairlearn to generate a collection of models, and then present them in the fairness dashboard in Azure Machine Learning Studio. Please remember that this notebook has not attempted to discuss the many considerations which should be part of any approach to unfairness mitigation. The [Fairlearn website](http://fairlearn.github.io/) provides that discussion"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"authors": [
{
"name": "riedgar"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.10"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,8 @@
name: fairlearn-azureml-mitigation
dependencies:
- pip:
- azureml-sdk
- azureml-contrib-fairness
- fairlearn==0.4.6
- joblib
- shap

View File

@@ -0,0 +1,494 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved. \n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/contrib/fairness/upload-fairness-dashboard.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Upload a Fairness Dashboard to Azure Machine Learning Studio\n",
"**This notebook shows how to generate and upload a fairness assessment dashboard from Fairlearn to AzureML Studio**\n",
"\n",
"## Table of Contents\n",
"\n",
"1. [Introduction](#Introduction)\n",
"1. [Loading the Data](#LoadingData)\n",
"1. [Processing the Data](#ProcessingData)\n",
"1. [Training Models](#TrainingModels)\n",
"1. [Logging in to AzureML](#LoginAzureML)\n",
"1. [Registering the Models](#RegisterModels)\n",
"1. [Using the Fairlearn Dashboard](#LocalDashboard)\n",
"1. [Uploading a Fairness Dashboard to Azure](#AzureUpload)\n",
" 1. Computing Fairness Metrics\n",
" 1. Uploading to Azure\n",
"1. [Conclusion](#Conclusion)\n",
" \n",
"\n",
"<a id=\"Introduction\"></a>\n",
"## Introduction\n",
"\n",
"In this notebook, we walk through a simple example of using the `azureml-contrib-fairness` package to upload a collection of fairness statistics for a fairness dashboard. It is an example of integrating the [open source Fairlearn package](https://www.github.com/fairlearn/fairlearn) with Azure Machine Learning. This is not an example of fairness analysis or mitigation - this notebook simply shows how to get a fairness dashboard into the Azure Machine Learning portal. We will load the data and train a couple of simple models. We will then use Fairlearn to generate data for a Fairness dashboard, which we can upload to Azure Machine Learning portal and view there.\n",
"\n",
"### Setup\n",
"\n",
"To use this notebook, an Azure Machine Learning workspace is required.\n",
"Please see the [configuration notebook](../../configuration.ipynb) for information about creating one, if required.\n",
"This notebook also requires the following packages:\n",
"* `azureml-contrib-fairness`\n",
"* `fairlearn==0.4.6`\n",
"* `joblib`\n",
"* `shap`\n",
"\n",
"\n",
"\n",
"\n",
"<a id=\"LoadingData\"></a>\n",
"## Loading the Data\n",
"We use the well-known `adult` census dataset, which we load using `shap` (for convenience). We start with a fairly unremarkable set of imports:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn import svm\n",
"from sklearn.preprocessing import LabelEncoder, StandardScaler\n",
"from sklearn.linear_model import LogisticRegression\n",
"import pandas as pd\n",
"import shap"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can load the data:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"X_raw, Y = shap.datasets.adult()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can take a look at some of the data. For example, the next cells shows the counts of the different races identified in the dataset:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(X_raw[\"Race\"].value_counts().to_dict())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"ProcessingData\"></a>\n",
"## Processing the Data\n",
"\n",
"With the data loaded, we process it for our needs. First, we extract the sensitive features of interest into `A` (conventionally used in the literature) and put the rest of the feature data into `X`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"A = X_raw[['Sex','Race']]\n",
"X = X_raw.drop(labels=['Sex', 'Race'],axis = 1)\n",
"X = pd.get_dummies(X)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we apply a standard set of scalings:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sc = StandardScaler()\n",
"X_scaled = sc.fit_transform(X)\n",
"X_scaled = pd.DataFrame(X_scaled, columns=X.columns)\n",
"\n",
"le = LabelEncoder()\n",
"Y = le.fit_transform(Y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we can then split our data into training and test sets, and also make the labels on our test portion of `A` human-readable:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"X_train, X_test, Y_train, Y_test, A_train, A_test = train_test_split(X_scaled, \n",
" Y, \n",
" A,\n",
" test_size = 0.2,\n",
" random_state=0,\n",
" stratify=Y)\n",
"\n",
"# Work around indexing issue\n",
"X_train = X_train.reset_index(drop=True)\n",
"A_train = A_train.reset_index(drop=True)\n",
"X_test = X_test.reset_index(drop=True)\n",
"A_test = A_test.reset_index(drop=True)\n",
"\n",
"# Improve labels\n",
"A_test.Sex.loc[(A_test['Sex'] == 0)] = 'female'\n",
"A_test.Sex.loc[(A_test['Sex'] == 1)] = 'male'\n",
"\n",
"\n",
"A_test.Race.loc[(A_test['Race'] == 0)] = 'Amer-Indian-Eskimo'\n",
"A_test.Race.loc[(A_test['Race'] == 1)] = 'Asian-Pac-Islander'\n",
"A_test.Race.loc[(A_test['Race'] == 2)] = 'Black'\n",
"A_test.Race.loc[(A_test['Race'] == 3)] = 'Other'\n",
"A_test.Race.loc[(A_test['Race'] == 4)] = 'White'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"TrainingModels\"></a>\n",
"## Training Models\n",
"\n",
"We now train a couple of different models on our data. The `adult` census dataset is a classification problem - the goal is to predict whether a particular individual exceeds an income threshold. For the purpose of generating a dashboard to upload, it is sufficient to train two basic classifiers. First, a logistic regression classifier:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"lr_predictor = LogisticRegression(solver='liblinear', fit_intercept=True)\n",
"\n",
"lr_predictor.fit(X_train, Y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And for comparison, a support vector classifier:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"svm_predictor = svm.SVC()\n",
"\n",
"svm_predictor.fit(X_train, Y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"LoginAzureML\"></a>\n",
"## Logging in to AzureML\n",
"\n",
"With our two classifiers trained, we can log into our AzureML workspace:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Workspace, Experiment, Model\n",
"\n",
"ws = Workspace.from_config()\n",
"ws.get_details()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"RegisterModels\"></a>\n",
"## Registering the Models\n",
"\n",
"Next, we register our models. By default, the subroutine which uploads the models checks that the names provided correspond to registered models in the workspace. We define a utility routine to do the registering:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import joblib\n",
"import os\n",
"\n",
"os.makedirs('models', exist_ok=True)\n",
"def register_model(name, model):\n",
" print(\"Registering \", name)\n",
" model_path = \"models/{0}.pkl\".format(name)\n",
" joblib.dump(value=model, filename=model_path)\n",
" registered_model = Model.register(model_path=model_path,\n",
" model_name=name,\n",
" workspace=ws)\n",
" print(\"Registered \", registered_model.id)\n",
" return registered_model.id"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we register the models. For convenience in subsequent method calls, we store the results in a dictionary, which maps the `id` of the registered model (a string in `name:version` format) to the predictor itself:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model_dict = {}\n",
"\n",
"lr_reg_id = register_model(\"fairness_linear_regression\", lr_predictor)\n",
"model_dict[lr_reg_id] = lr_predictor\n",
"svm_reg_id = register_model(\"fairness_svm\", svm_predictor)\n",
"model_dict[svm_reg_id] = svm_predictor"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"LocalDashboard\"></a>\n",
"## Using the Fairlearn Dashboard\n",
"\n",
"We can now examine the fairness of the two models we have training, both as a function of race and (binary) sex. Before uploading the dashboard to the AzureML portal, we will first instantiate a local instance of the Fairlearn dashboard.\n",
"\n",
"Regardless of the viewing location, the dashboard is based on three things - the true values, the model predictions and the sensitive feature values. The dashboard can use predictions from multiple models and multiple sensitive features if desired (as we are doing here).\n",
"\n",
"Our first step is to generate a dictionary mapping the `id` of the registered model to the corresponding array of predictions:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ys_pred = {}\n",
"for n, p in model_dict.items():\n",
" ys_pred[n] = p.predict(X_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can examine these predictions in a locally invoked Fairlearn dashboard. This can be compared to the dashboard uploaded to the portal (in the next section):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from fairlearn.widget import FairlearnDashboard\n",
"\n",
"FairlearnDashboard(sensitive_features=A_test, \n",
" sensitive_feature_names=['Sex', 'Race'],\n",
" y_true=Y_test.tolist(),\n",
" y_pred=ys_pred)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"AzureUpload\"></a>\n",
"## Uploading a Fairness Dashboard to Azure\n",
"\n",
"Uploading a fairness dashboard to Azure is a two stage process. The `FairlearnDashboard` invoked in the previous section relies on the underlying Python kernel to compute metrics on demand. This is obviously not available when the fairness dashboard is rendered in AzureML Studio. The required stages are therefore:\n",
"1. Precompute all the required metrics\n",
"1. Upload to Azure\n",
"\n",
"\n",
"### Computing Fairness Metrics\n",
"We use Fairlearn to create a dictionary which contains all the data required to display a dashboard. This includes both the raw data (true values, predicted values and sensitive features), and also the fairness metrics. The API is similar to that used to invoke the Dashboard locally. However, there are a few minor changes to the API, and the type of problem being examined (binary classification, regression etc.) needs to be specified explicitly:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sf = { 'Race': A_test.Race, 'Sex': A_test.Sex }\n",
"\n",
"from fairlearn.metrics._group_metric_set import _create_group_metric_set\n",
"\n",
"dash_dict = _create_group_metric_set(y_true=Y_test,\n",
" predictions=ys_pred,\n",
" sensitive_features=sf,\n",
" prediction_type='binary_classification')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `_create_group_metric_set()` method is currently underscored since its exact design is not yet final in Fairlearn."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Uploading to Azure\n",
"\n",
"We can now import the `azureml.contrib.fairness` package itself. We will round-trip the data, so there are two required subroutines:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.contrib.fairness import upload_dashboard_dictionary, download_dashboard_by_upload_id"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we can upload the generated dictionary to AzureML. The upload method requires a run, so we first create an experiment and a run. The uploaded dashboard can be seen on the corresponding Run Details page in AzureML Studio. For completeness, we also download the dashboard dictionary which we uploaded."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"exp = Experiment(ws, \"notebook-01\")\n",
"print(exp)\n",
"\n",
"run = exp.start_logging()\n",
"try:\n",
" dashboard_title = \"Sample notebook upload\"\n",
" upload_id = upload_dashboard_dictionary(run,\n",
" dash_dict,\n",
" dashboard_name=dashboard_title)\n",
" print(\"\\nUploaded to id: {0}\\n\".format(upload_id))\n",
"\n",
" downloaded_dict = download_dashboard_by_upload_id(run, upload_id)\n",
"finally:\n",
" run.complete()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we can verify that the dashboard dictionary which we downloaded matches our upload:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(dash_dict == downloaded_dict)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"Conclusion\"></a>\n",
"## Conclusion\n",
"\n",
"In this notebook we have demonstrated how to generate and upload a fairness dashboard to AzureML Studio. We have not discussed how to analyse the results and apply mitigations. Those topics will be covered elsewhere."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"authors": [
{
"name": "riedgar"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -0,0 +1,8 @@
name: upload-fairness-dashboard
dependencies:
- pip:
- azureml-sdk
- azureml-contrib-fairness
- fairlearn==0.4.6
- joblib
- shap

View File

@@ -105,7 +105,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },

View File

@@ -93,7 +93,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },

View File

@@ -97,7 +97,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },
@@ -491,8 +491,8 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"test_run = run_inference(test_experiment, compute_target, script_folder, best_dnn_run, test_dataset,\n", "test_run = run_inference(test_experiment, compute_target, script_folder, best_dnn_run,\n",
" target_column_name, model_name)" " train_dataset, test_dataset, target_column_name, model_name)"
] ]
}, },
{ {

View File

@@ -6,7 +6,7 @@ from azureml.core.run import Run
def run_inference(test_experiment, compute_target, script_folder, train_run, def run_inference(test_experiment, compute_target, script_folder, train_run,
test_dataset, target_column_name, model_name): train_dataset, test_dataset, target_column_name, model_name):
train_run.download_file('outputs/conda_env_v_1_0_0.yml', train_run.download_file('outputs/conda_env_v_1_0_0.yml',
'inference/condafile.yml') 'inference/condafile.yml')
@@ -22,7 +22,10 @@ def run_inference(test_experiment, compute_target, script_folder, train_run,
'--target_column_name': target_column_name, '--target_column_name': target_column_name,
'--model_name': model_name '--model_name': model_name
}, },
inputs=[test_dataset.as_named_input('test_data')], inputs=[
train_dataset.as_named_input('train_data'),
test_dataset.as_named_input('test_data')
],
compute_target=compute_target, compute_target=compute_target,
environment_definition=inference_env) environment_definition=inference_env)

View File

@@ -1,8 +1,11 @@
import numpy as np
import argparse import argparse
from azureml.core import Run
import numpy as np
from sklearn.externals import joblib from sklearn.externals import joblib
from azureml.automl.core.shared import constants, metrics
from azureml.automl.runtime.shared.score import scoring, constants
from azureml.core import Run
from azureml.core.model import Model from azureml.core.model import Model
@@ -29,22 +32,26 @@ model = joblib.load(model_path)
run = Run.get_context() run = Run.get_context()
# get input dataset by name # get input dataset by name
test_dataset = run.input_datasets['test_data'] test_dataset = run.input_datasets['test_data']
train_dataset = run.input_datasets['train_data']
X_test_df = test_dataset.drop_columns(columns=[target_column_name]) \ X_test_df = test_dataset.drop_columns(columns=[target_column_name]) \
.to_pandas_dataframe() .to_pandas_dataframe()
y_test_df = test_dataset.with_timestamp_columns(None) \ y_test_df = test_dataset.with_timestamp_columns(None) \
.keep_columns(columns=[target_column_name]) \ .keep_columns(columns=[target_column_name]) \
.to_pandas_dataframe() .to_pandas_dataframe()
y_train_df = test_dataset.with_timestamp_columns(None) \
.keep_columns(columns=[target_column_name]) \
.to_pandas_dataframe()
predicted = model.predict_proba(X_test_df) predicted = model.predict_proba(X_test_df)
# use automl metrics module # Use the AutoML scoring module
scores = metrics.compute_metrics_classification( class_labels = np.unique(np.concatenate((y_train_df.values, y_test_df.values)))
np.array(predicted), train_labels = model.classes_
np.array(y_test_df), classification_metrics = list(constants.CLASSIFICATION_SCALAR_SET)
class_labels=model.classes_, scores = scoring.score_classification(y_test_df.values, predicted,
metrics=list(constants.Metric.SCALAR_CLASSIFICATION_SET) classification_metrics,
) class_labels, train_labels)
print("scores:") print("scores:")
print(scores) print(scores)

View File

@@ -88,7 +88,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },

View File

@@ -114,7 +114,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },

View File

@@ -1,11 +1,14 @@
import pandas as pd
import numpy as np
import argparse import argparse
from azureml.core import Run
import numpy as np
import pandas as pd
from pandas.tseries.frequencies import to_offset
from sklearn.externals import joblib from sklearn.externals import joblib
from sklearn.metrics import mean_absolute_error, mean_squared_error from sklearn.metrics import mean_absolute_error, mean_squared_error
from azureml.automl.core.shared import constants, metrics
from pandas.tseries.frequencies import to_offset from azureml.automl.runtime.shared.score import scoring, constants
from azureml.core import Run
def align_outputs(y_predicted, X_trans, X_test, y_test, def align_outputs(y_predicted, X_trans, X_test, y_test,
@@ -299,12 +302,11 @@ print(df_all[target_column_name])
print("predicted values:::") print("predicted values:::")
print(df_all['predicted']) print(df_all['predicted'])
# use automl metrics module # Use the AutoML scoring module
scores = metrics.compute_metrics_regression( regression_metrics = list(constants.REGRESSION_SCALAR_SET)
df_all['predicted'], y_test = np.array(df_all[target_column_name])
df_all[target_column_name], y_pred = np.array(df_all['predicted'])
list(constants.Metric.SCALAR_REGRESSION_SET), scores = scoring.score_regression(y_test, y_pred, regression_metrics)
None, None, None)
print("scores:") print("scores:")
print(scores) print(scores)

View File

@@ -87,7 +87,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },

View File

@@ -97,7 +97,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },

View File

@@ -94,7 +94,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },

View File

@@ -82,7 +82,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },
@@ -336,7 +336,11 @@
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {}, "metadata": {
"tags": [
"sample-featurizationconfig-remarks"
]
},
"outputs": [], "outputs": [],
"source": [ "source": [
"featurization_config = FeaturizationConfig()\n", "featurization_config = FeaturizationConfig()\n",

View File

@@ -28,7 +28,8 @@
"1. [Setup](#Setup)\n", "1. [Setup](#Setup)\n",
"1. [Train](#Train)\n", "1. [Train](#Train)\n",
"1. [Results](#Results)\n", "1. [Results](#Results)\n",
"1. [Test](#Test)\n", "1. [Test](#Tests)\n",
"1. [Explanation](#Explanation)\n",
"1. [Acknowledgements](#Acknowledgements)" "1. [Acknowledgements](#Acknowledgements)"
] ]
}, },
@@ -49,9 +50,9 @@
"2. Configure AutoML using `AutoMLConfig`.\n", "2. Configure AutoML using `AutoMLConfig`.\n",
"3. Train the model.\n", "3. Train the model.\n",
"4. Explore the results.\n", "4. Explore the results.\n",
"5. Visualization model's feature importance in azure portal\n", "5. Test the fitted model.\n",
"6. Explore any model's explanation and explore feature importance in azure portal\n", "6. Explore any model's explanation and explore feature importance in azure portal.\n",
"7. Test the fitted model." "7. Create an AKS cluster, deploy the webservice of AutoML scoring model and the explainer model to the AKS and consume the web service."
] ]
}, },
{ {
@@ -95,7 +96,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },
@@ -255,9 +256,9 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Analyze results\n", "### Analyze results\n",
"\n", "\n",
"### Retrieve the Best Model\n", "#### Retrieve the Best Model\n",
"\n", "\n",
"Below we select the best pipeline from our iterations. The `get_output` method on `automl_classifier` returns the best run and the fitted model for the last invocation. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*." "Below we select the best pipeline from our iterations. The `get_output` method on `automl_classifier` returns the best run and the fitted model for the last invocation. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*."
] ]
@@ -284,9 +285,80 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Best Model 's explanation\n", "## Tests\n",
"Retrieve the explanation from the best_run which includes explanations for engineered features and raw features.\n",
"\n", "\n",
"Now that the model is trained, split the data in the same way the data was split for training (The difference here is the data is being split locally) and then run the test data through the trained model to get the predicted values."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# convert the test data to dataframe\n",
"X_test_df = validation_data.drop_columns(columns=[label_column_name]).to_pandas_dataframe()\n",
"y_test_df = validation_data.keep_columns(columns=[label_column_name], validate=True).to_pandas_dataframe()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# call the predict functions on the model\n",
"y_pred = fitted_model.predict(X_test_df)\n",
"y_pred"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Calculate metrics for the prediction\n",
"\n",
"Now visualize the data on a scatter plot to show what our truth (actual) values are compared to the predicted values \n",
"from the trained model that was returned."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.metrics import confusion_matrix\n",
"import numpy as np\n",
"import itertools\n",
"\n",
"cf =confusion_matrix(y_test_df.values,y_pred)\n",
"plt.imshow(cf,cmap=plt.cm.Blues,interpolation='nearest')\n",
"plt.colorbar()\n",
"plt.title('Confusion Matrix')\n",
"plt.xlabel('Predicted')\n",
"plt.ylabel('Actual')\n",
"class_labels = ['False','True']\n",
"tick_marks = np.arange(len(class_labels))\n",
"plt.xticks(tick_marks,class_labels)\n",
"plt.yticks([-0.5,0,1,1.5],['','False','True',''])\n",
"# plotting text value inside cells\n",
"thresh = cf.max() / 2.\n",
"for i,j in itertools.product(range(cf.shape[0]),range(cf.shape[1])):\n",
" plt.text(j,i,format(cf[i,j],'d'),horizontalalignment='center',color='white' if cf[i,j] >thresh else 'black')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Explanation\n",
"In this section, we will show how to compute model explanations and visualize the explanations using azureml-explain-model package. We will also show how to run the automl model and the explainer model through deploying an AKS web service.\n",
"\n",
"Besides retrieving an existing model explanation for an AutoML model, you can also explain your AutoML model with different test data. The following steps will allow you to compute and visualize engineered feature importance based on your test data.\n",
"\n",
"### Run the explanation\n",
"#### Download engineered feature importance from artifact store\n", "#### Download engineered feature importance from artifact store\n",
"You can use ExplanationClient to download the engineered feature explanations from the artifact store of the best_run. You can also use azure portal url to view the dash board visualization of the feature importance values of the engineered features." "You can use ExplanationClient to download the engineered feature explanations from the artifact store of the best_run. You can also use azure portal url to view the dash board visualization of the feature importance values of the engineered features."
] ]
@@ -303,14 +375,6 @@
"print(\"You can visualize the engineered explanations under the 'Explanations (preview)' tab in the AutoML run at:-\\n\" + best_run.get_portal_url())" "print(\"You can visualize the engineered explanations under the 'Explanations (preview)' tab in the AutoML run at:-\\n\" + best_run.get_portal_url())"
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Explanations\n",
"In this section, we will show how to compute model explanations and visualize the explanations using azureml-explain-model package. Besides retrieving an existing model explanation for an AutoML model, you can also explain your AutoML model with different test data. The following steps will allow you to compute and visualize engineered feature importance based on your test data."
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
@@ -403,6 +467,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Compute the engineered explanations\n",
"engineered_explanations = explainer.explain(['local', 'global'], eval_dataset=automl_explainer_setup_obj.X_test_transform)\n", "engineered_explanations = explainer.explain(['local', 'global'], eval_dataset=automl_explainer_setup_obj.X_test_transform)\n",
"print(engineered_explanations.get_feature_importance_dict())\n", "print(engineered_explanations.get_feature_importance_dict())\n",
"print(\"You can visualize the engineered explanations under the 'Explanations (preview)' tab in the AutoML run at:-\\n\" + automl_run.get_portal_url())" "print(\"You can visualize the engineered explanations under the 'Explanations (preview)' tab in the AutoML run at:-\\n\" + automl_run.get_portal_url())"
@@ -412,41 +477,37 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## Test the fitted model\n", "#### Initialize the scoring Explainer, save and upload it for later use in scoring explanation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.explain.model.scoring.scoring_explainer import TreeScoringExplainer\n",
"import joblib\n",
"\n", "\n",
"Now that the model is trained, split the data in the same way the data was split for training (The difference here is the data is being split locally) and then run the test data through the trained model to get the predicted values." "# Initialize the ScoringExplainer\n",
] "scoring_explainer = TreeScoringExplainer(explainer.explainer, feature_maps=[automl_explainer_setup_obj.feature_map])\n",
}, "\n",
{ "# Pickle scoring explainer locally to './scoring_explainer.pkl'\n",
"cell_type": "code", "scoring_explainer_file_name = 'scoring_explainer.pkl'\n",
"execution_count": null, "with open(scoring_explainer_file_name, 'wb') as stream:\n",
"metadata": {}, " joblib.dump(scoring_explainer, stream)\n",
"outputs": [], "\n",
"source": [ "# Upload the scoring explainer to the automl run\n",
"# convert the test data to dataframe\n", "automl_run.upload_file('outputs/scoring_explainer.pkl', scoring_explainer_file_name)"
"X_test_df = validation_data.drop_columns(columns=[label_column_name]).to_pandas_dataframe()\n",
"y_test_df = validation_data.keep_columns(columns=[label_column_name], validate=True).to_pandas_dataframe()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# call the predict functions on the model\n",
"y_pred = fitted_model.predict(X_test_df)\n",
"y_pred"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### Calculate metrics for the prediction\n", "### Deploying the scoring and explainer models to a web service to Azure Kubernetes Service (AKS)\n",
"\n", "\n",
"Now visualize the data on a scatter plot to show what our truth (actual) values are compared to the predicted values \n", "We use the TreeScoringExplainer from azureml.explain.model package to create the scoring explainer which will be used to compute the raw and engineered feature importances at the inference time. In the cell below, we register the AutoML model and the scoring explainer with the Model Management Service."
"from the trained model that was returned."
] ]
}, },
{ {
@@ -455,25 +516,238 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from sklearn.metrics import confusion_matrix\n", "# Register trained automl model present in the 'outputs' folder in the artifacts\n",
"import numpy as np\n", "original_model = automl_run.register_model(model_name='automl_model', \n",
"import itertools\n", " model_path='outputs/model.pkl')\n",
"scoring_explainer_model = automl_run.register_model(model_name='scoring_explainer',\n",
" model_path='outputs/scoring_explainer.pkl')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Create the conda dependencies for setting up the service\n",
"\n", "\n",
"cf =confusion_matrix(y_test_df.values,y_pred)\n", "We need to create the conda dependencies comprising of the azureml-explain-model, azureml-train-automl and azureml-defaults packages."
"plt.imshow(cf,cmap=plt.cm.Blues,interpolation='nearest')\n", ]
"plt.colorbar()\n", },
"plt.title('Confusion Matrix')\n", {
"plt.xlabel('Predicted')\n", "cell_type": "code",
"plt.ylabel('Actual')\n", "execution_count": null,
"class_labels = ['False','True']\n", "metadata": {},
"tick_marks = np.arange(len(class_labels))\n", "outputs": [],
"plt.xticks(tick_marks,class_labels)\n", "source": [
"plt.yticks([-0.5,0,1,1.5],['','False','True',''])\n", "from azureml.automl.core.shared import constants\n",
"# plotting text value inside cells\n", "from azureml.core.environment import Environment\n",
"thresh = cf.max() / 2.\n", "\n",
"for i,j in itertools.product(range(cf.shape[0]),range(cf.shape[1])):\n", "automl_run.download_file(constants.CONDA_ENV_FILE_PATH, 'myenv.yml')\n",
" plt.text(j,i,format(cf[i,j],'d'),horizontalalignment='center',color='white' if cf[i,j] >thresh else 'black')\n", "myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
"plt.show()" "myenv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Write the Entry Script\n",
"Write the script that will be used to predict on your model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile score.py\n",
"import numpy as np\n",
"import pandas as pd\n",
"import os\n",
"import pickle\n",
"import azureml.train.automl\n",
"import azureml.explain.model\n",
"from azureml.train.automl.runtime.automl_explain_utilities import AutoMLExplainerSetupClass, \\\n",
" automl_setup_model_explanations\n",
"import joblib\n",
"from azureml.core.model import Model\n",
"\n",
"\n",
"def init():\n",
" global automl_model\n",
" global scoring_explainer\n",
"\n",
" # Retrieve the path to the model file using the model name\n",
" # Assume original model is named original_prediction_model\n",
" automl_model_path = Model.get_model_path('automl_model')\n",
" scoring_explainer_path = Model.get_model_path('scoring_explainer')\n",
"\n",
" automl_model = joblib.load(automl_model_path)\n",
" scoring_explainer = joblib.load(scoring_explainer_path)\n",
"\n",
"\n",
"def run(raw_data):\n",
" data = pd.read_json(raw_data, orient='records') \n",
" # Make prediction\n",
" predictions = automl_model.predict(data)\n",
" # Setup for inferencing explanations\n",
" automl_explainer_setup_obj = automl_setup_model_explanations(automl_model,\n",
" X_test=data, task='classification')\n",
" # Retrieve model explanations for engineered explanations\n",
" engineered_local_importance_values = scoring_explainer.explain(automl_explainer_setup_obj.X_test_transform) \n",
" # You can return any data type as long as it is JSON-serializable\n",
" return {'predictions': predictions.tolist(),\n",
" 'engineered_local_importance_values': engineered_local_importance_values}\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Create the InferenceConfig \n",
"Create the inference config that will be used when deploying the model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.model import InferenceConfig\n",
"\n",
"inf_config = InferenceConfig(entry_script='score.py', environment=myenv)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Provision the AKS Cluster\n",
"This is a one time setup. You can reuse this cluster for multiple deployments after it has been created. If you delete the cluster or the resource group that contains it, then you would have to recreate it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.compute import ComputeTarget, AksCompute\n",
"from azureml.core.compute_target import ComputeTargetException\n",
"\n",
"# Choose a name for your cluster.\n",
"aks_name = 'scoring-explain'\n",
"\n",
"# Verify that cluster does not exist already\n",
"try:\n",
" aks_target = ComputeTarget(workspace=ws, name=aks_name)\n",
" print('Found existing cluster, use it.')\n",
"except ComputeTargetException:\n",
" prov_config = AksCompute.provisioning_configuration(vm_size='STANDARD_D3_V2')\n",
" aks_target = ComputeTarget.create(workspace=ws, \n",
" name=aks_name,\n",
" provisioning_configuration=prov_config)\n",
"aks_target.wait_for_completion(show_output=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Deploy web service to AKS"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Set the web service configuration (using default here)\n",
"from azureml.core.webservice import AksWebservice\n",
"from azureml.core.model import Model\n",
"\n",
"aks_config = AksWebservice.deploy_configuration()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"aks_service_name ='model-scoring-local-aks'\n",
"\n",
"aks_service = Model.deploy(workspace=ws,\n",
" name=aks_service_name,\n",
" models=[scoring_explainer_model, original_model],\n",
" inference_config=inf_config,\n",
" deployment_config=aks_config,\n",
" deployment_target=aks_target)\n",
"\n",
"aks_service.wait_for_deployment(show_output = True)\n",
"print(aks_service.state)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### View the service logs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"aks_service.get_logs()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Consume the web service using run method to do the scoring and explanation of scoring.\n",
"We test the web sevice by passing data. Run() method retrieves API keys behind the scenes to make sure that call is authenticated."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Serialize the first row of the test data into json\n",
"X_test_json = X_test_df[:1].to_json(orient='records')\n",
"print(X_test_json)\n",
"\n",
"# Call the service to get the predictions and the engineered and raw explanations\n",
"output = aks_service.run(X_test_json)\n",
"\n",
"# Print the predicted value\n",
"print('predictions:\\n{}\\n'.format(output['predictions']))\n",
"# Print the engineered feature importances for the predicted value\n",
"print('engineered_local_importance_values:\\n{}\\n'.format(output['engineered_local_importance_values']))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Clean up\n",
"Delete the service."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"aks_service.delete()"
] ]
}, },
{ {

View File

@@ -98,7 +98,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },
@@ -242,7 +242,11 @@
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {}, "metadata": {
"tags": [
"sample-featurizationconfig-remarks2"
]
},
"outputs": [], "outputs": [],
"source": [ "source": [
"featurization_config = FeaturizationConfig()\n", "featurization_config = FeaturizationConfig()\n",
@@ -260,7 +264,11 @@
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": null, "execution_count": null,
"metadata": {}, "metadata": {
"tags": [
"sample-featurizationconfig-remarks3"
]
},
"outputs": [], "outputs": [],
"source": [ "source": [
"automl_settings = {\n", "automl_settings = {\n",

View File

@@ -92,7 +92,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },

View File

@@ -108,9 +108,9 @@
"environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[\n", "environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[\n",
" 'azureml-defaults',\n", " 'azureml-defaults',\n",
" 'inference-schema[numpy-support]',\n", " 'inference-schema[numpy-support]',\n",
" 'joblib',\n",
" 'numpy',\n", " 'numpy',\n",
" 'scikit-learn'\n", " 'scikit-learn==0.19.1',\n",
" 'scipy'\n",
"])" "])"
] ]
}, },

View File

@@ -0,0 +1,354 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Deploying a web service to Azure Kubernetes Service (AKS)\n",
"This notebook shows the steps for deploying a service: registering a model, provisioning a cluster with ssl (one time action), and deploying a service to it. \n",
"We then test and delete the service, image and model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Workspace\n",
"from azureml.core.compute import AksCompute, ComputeTarget\n",
"from azureml.core.webservice import Webservice, AksWebservice\n",
"from azureml.core.model import Model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import azureml.core\n",
"print(azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Get workspace\n",
"Load existing workspace from the config file info."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.workspace import Workspace\n",
"\n",
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Register the model\n",
"Register an existing trained model, add descirption and tags."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Register the model\n",
"from azureml.core.model import Model\n",
"model = Model.register(model_path = \"sklearn_regression_model.pkl\", # this points to a local file\n",
" model_name = \"sklearn_model\", # this is the name the model is registered as\n",
" tags = {'area': \"diabetes\", 'type': \"regression\"},\n",
" description = \"Ridge regression model to predict diabetes\",\n",
" workspace = ws)\n",
"\n",
"print(model.name, model.description, model.version)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Create the Environment\n",
"Create an environment that the model will be deployed with"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Environment\n",
"from azureml.core.conda_dependencies import CondaDependencies \n",
"\n",
"conda_deps = CondaDependencies.create(conda_packages=['numpy', 'scikit-learn==0.19.1', 'scipy'], pip_packages=['azureml-defaults', 'inference-schema'])\n",
"myenv = Environment(name='myenv')\n",
"myenv.python.conda_dependencies = conda_deps"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Use a custom Docker image\n",
"\n",
"You can also specify a custom Docker image to be used as base image if you don't want to use the default base image provided by Azure ML. Please make sure the custom Docker image has Ubuntu >= 16.04, Conda >= 4.5.\\* and Python(3.5.\\* or 3.6.\\*).\n",
"\n",
"Only supported with `python` runtime.\n",
"```python\n",
"# use an image available in public Container Registry without authentication\n",
"myenv.docker.base_image = \"mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda\"\n",
"\n",
"# or, use an image available in a private Container Registry\n",
"myenv.docker.base_image = \"myregistry.azurecr.io/mycustomimage:1.0\"\n",
"myenv.docker.base_image_registry.address = \"myregistry.azurecr.io\"\n",
"myenv.docker.base_image_registry.username = \"username\"\n",
"myenv.docker.base_image_registry.password = \"password\"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Write the Entry Script\n",
"Write the script that will be used to predict on your model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile score.py\n",
"import os\n",
"import pickle\n",
"import json\n",
"import numpy\n",
"from sklearn.externals import joblib\n",
"from sklearn.linear_model import Ridge\n",
"from inference_schema.schema_decorators import input_schema, output_schema\n",
"from inference_schema.parameter_types.standard_py_parameter_type import StandardPythonParameterType\n",
"\n",
"def init():\n",
" global model\n",
" # AZUREML_MODEL_DIR is an environment variable created during deployment.\n",
" # It is the path to the model folder (./azureml-models/$MODEL_NAME/$VERSION)\n",
" # For multiple models, it points to the folder containing all deployed models (./azureml-models)\n",
" model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'sklearn_regression_model.pkl')\n",
" # deserialize the model file back into a sklearn model\n",
" model = joblib.load(model_path)\n",
"\n",
"\n",
"standard_sample_input = {'a': 10, 'b': 9, 'c': 8, 'd': 7, 'e': 6, 'f': 5, 'g': 4, 'h': 3, 'i': 2, 'j': 1 }\n",
"standard_sample_output = {'outcome': 1}\n",
"\n",
"@input_schema('param', StandardPythonParameterType(standard_sample_input))\n",
"@output_schema(StandardPythonParameterType(standard_sample_output))\n",
"def run(param):\n",
" try:\n",
" raw_data = [param['a'], param['b'], param['c'], param['d'], param['e'], param['f'], param['g'], param['h'], param['i'], param['j']]\n",
" data = numpy.array([raw_data])\n",
" result = model.predict(data)\n",
" return { 'outcome' : result[0] }\n",
" except Exception as e:\n",
" error = str(e)\n",
" return error"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Create the InferenceConfig\n",
"Create the inference config that will be used when deploying the model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.model import InferenceConfig\n",
"\n",
"inf_config = InferenceConfig(entry_script='score.py', environment=myenv)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Provision the AKS Cluster with SSL\n",
"This is a one time setup. You can reuse this cluster for multiple deployments after it has been created. If you delete the cluster or the resource group that contains it, then you would have to recreate it.\n",
"\n",
"See code snippet below. Check the documentation [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-secure-web-service) for more details"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Use the default configuration (can also provide parameters to customize)\n",
"\n",
"provisioning_config = AksCompute.provisioning_configuration()\n",
"# Leaf domain label generates a name using the formula\n",
"# \"<leaf-domain-label>######.<azure-region>.cloudapp.azure.net\"\n",
"# where \"######\" is a random series of characters\n",
"provisioning_config.enable_ssl(leaf_domain_label = \"contoso\")\n",
"\n",
"aks_name = 'my-aks-ssl-1' \n",
"# Create the cluster\n",
"aks_target = ComputeTarget.create(workspace = ws, \n",
" name = aks_name, \n",
" provisioning_configuration = provisioning_config)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"aks_target.wait_for_completion(show_output = True)\n",
"print(aks_target.provisioning_state)\n",
"print(aks_target.provisioning_errors)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Deploy web service to AKS"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"sample-deploy-to-aks"
]
},
"outputs": [],
"source": [
"%%time\n",
"\n",
"aks_config = AksWebservice.deploy_configuration()\n",
"\n",
"aks_service_name ='aks-service-ssl-1'\n",
"\n",
"aks_service = Model.deploy(workspace=ws,\n",
" name=aks_service_name,\n",
" models=[model],\n",
" inference_config=inf_config,\n",
" deployment_config=aks_config,\n",
" deployment_target=aks_target,\n",
" overwrite=True)\n",
"\n",
"aks_service.wait_for_deployment(show_output = True)\n",
"print(aks_service.state)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Test the web service using run method\n",
"We test the web sevice by passing data.\n",
"Run() method retrieves API keys behind the scenes to make sure that call is authenticated."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"import json\n",
"\n",
"standard_sample_input = json.dumps({'param': {'a': 10, 'b': 9, 'c': 8, 'd': 7, 'e': 6, 'f': 5, 'g': 4, 'h': 3, 'i': 2, 'j': 1 }})\n",
"\n",
"aks_service.run(input_data=standard_sample_input)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Clean up\n",
"Delete the service, image and model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"aks_service.delete()\n",
"model.delete()"
]
}
],
"metadata": {
"authors": [
{
"name": "vaidyas"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,8 @@
name: production-deploy-to-aks-ssl
dependencies:
- pip:
- azureml-sdk
- matplotlib
- tqdm
- scipy
- sklearn

View File

@@ -109,7 +109,7 @@
"from azureml.core import Environment\n", "from azureml.core import Environment\n",
"from azureml.core.conda_dependencies import CondaDependencies \n", "from azureml.core.conda_dependencies import CondaDependencies \n",
"\n", "\n",
"conda_deps = CondaDependencies.create(conda_packages=['numpy','scikit-learn'], pip_packages=['azureml-defaults'])\n", "conda_deps = CondaDependencies.create(conda_packages=['numpy','scikit-learn==0.19.1','scipy'], pip_packages=['azureml-defaults'])\n",
"myenv = Environment(name='myenv')\n", "myenv = Environment(name='myenv')\n",
"myenv.python.conda_dependencies = conda_deps" "myenv.python.conda_dependencies = conda_deps"
] ]
@@ -300,7 +300,8 @@
" 'inference-schema[numpy-support]',\n", " 'inference-schema[numpy-support]',\n",
" 'joblib',\n", " 'joblib',\n",
" 'numpy',\n", " 'numpy',\n",
" 'scikit-learn'\n", " 'scikit-learn==0.19.1',\n",
" 'scipy'\n",
"])\n", "])\n",
"inference_config = InferenceConfig(entry_script='score.py', environment=environment)\n", "inference_config = InferenceConfig(entry_script='score.py', environment=environment)\n",
"# if cpu and memory_in_gb parameters are not provided\n", "# if cpu and memory_in_gb parameters are not provided\n",

View File

@@ -261,6 +261,10 @@
"if pandas_ver:\n", "if pandas_ver:\n",
" pandas_dep = 'pandas=={}'.format(pandas_ver)\n", " pandas_dep = 'pandas=={}'.format(pandas_ver)\n",
"# specify CondaDependencies obj\n", "# specify CondaDependencies obj\n",
"# The CondaDependencies specifies the conda and pip packages that are installed in the environment\n",
"# the submitted job is run in. Note the remote environment(s) needs to be similar to the local\n",
"# environment, otherwise if a model is trained or deployed in a different environment this can\n",
"# cause errors. Please take extra care when specifying your dependencies in a production environment.\n",
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n", "run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n",
" pip_packages=azureml_pip_packages)\n", " pip_packages=azureml_pip_packages)\n",
"\n", "\n",
@@ -379,6 +383,10 @@
"if pandas_ver:\n", "if pandas_ver:\n",
" pandas_dep = 'pandas=={}'.format(pandas_ver)\n", " pandas_dep = 'pandas=={}'.format(pandas_ver)\n",
"# specify CondaDependencies obj\n", "# specify CondaDependencies obj\n",
"# The CondaDependencies specifies the conda and pip packages that are installed in the environment\n",
"# the submitted job is run in. Note the remote environment(s) needs to be similar to the local\n",
"# environment, otherwise if a model is trained or deployed in a different environment this can\n",
"# cause errors. Please take extra care when specifying your dependencies in a production environment.\n",
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n", "run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n",
" pip_packages=azureml_pip_packages)\n", " pip_packages=azureml_pip_packages)\n",
"\n", "\n",
@@ -509,6 +517,10 @@
"if pandas_ver:\n", "if pandas_ver:\n",
" pandas_dep = 'pandas=={}'.format(pandas_ver)\n", " pandas_dep = 'pandas=={}'.format(pandas_ver)\n",
"# specify CondaDependencies obj\n", "# specify CondaDependencies obj\n",
"# The CondaDependencies specifies the conda and pip packages that are installed in the environment\n",
"# the submitted job is run in. Note the remote environment(s) needs to be similar to the local\n",
"# environment, otherwise if a model is trained or deployed in a different environment this can\n",
"# cause errors. Please take extra care when specifying your dependencies in a production environment.\n",
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n", "run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n",
" pip_packages=azureml_pip_packages)\n", " pip_packages=azureml_pip_packages)\n",
"\n", "\n",

View File

@@ -346,6 +346,10 @@
"if pandas_ver:\n", "if pandas_ver:\n",
" pandas_dep = 'pandas=={}'.format(pandas_ver)\n", " pandas_dep = 'pandas=={}'.format(pandas_ver)\n",
"# specify CondaDependencies obj\n", "# specify CondaDependencies obj\n",
"# The CondaDependencies specifies the conda and pip packages that are installed in the environment\n",
"# the submitted job is run in. Note the remote environment(s) needs to be similar to the local\n",
"# environment, otherwise if a model is trained or deployed in a different environment this can\n",
"# cause errors. Please take extra care when specifying your dependencies in a production environment.\n",
"myenv = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n", "myenv = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n",
" pip_packages=['sklearn-pandas', 'pyyaml'] + azureml_pip_packages,\n", " pip_packages=['sklearn-pandas', 'pyyaml'] + azureml_pip_packages,\n",
" pin_sdk_version=False)\n", " pin_sdk_version=False)\n",
@@ -413,9 +417,9 @@
"headers = {'Content-Type':'application/json'}\n", "headers = {'Content-Type':'application/json'}\n",
"\n", "\n",
"# send request to service\n", "# send request to service\n",
"print(\"POST to url\", service.scoring_uri)\n",
"resp = requests.post(service.scoring_uri, sample_data, headers=headers)\n", "resp = requests.post(service.scoring_uri, sample_data, headers=headers)\n",
"\n", "\n",
"print(\"POST to url\", service.scoring_uri)\n",
"# can covert back to Python objects from json string if desired\n", "# can covert back to Python objects from json string if desired\n",
"print(\"prediction:\", resp.text)\n", "print(\"prediction:\", resp.text)\n",
"result = json.loads(resp.text)" "result = json.loads(resp.text)"

View File

@@ -264,6 +264,10 @@
"if pandas_ver:\n", "if pandas_ver:\n",
" pandas_dep = 'pandas=={}'.format(pandas_ver)\n", " pandas_dep = 'pandas=={}'.format(pandas_ver)\n",
"# specify CondaDependencies obj\n", "# specify CondaDependencies obj\n",
"# The CondaDependencies specifies the conda and pip packages that are installed in the environment\n",
"# the submitted job is run in. Note the remote environment(s) needs to be similar to the local\n",
"# environment, otherwise if a model is trained or deployed in a different environment this can\n",
"# cause errors. Please take extra care when specifying your dependencies in a production environment.\n",
"run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n", "run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n",
" pip_packages=['sklearn_pandas', 'pyyaml'] + azureml_pip_packages,\n", " pip_packages=['sklearn_pandas', 'pyyaml'] + azureml_pip_packages,\n",
" pin_sdk_version=False)\n", " pin_sdk_version=False)\n",
@@ -432,6 +436,10 @@
"if pandas_ver:\n", "if pandas_ver:\n",
" pandas_dep = 'pandas=={}'.format(pandas_ver)\n", " pandas_dep = 'pandas=={}'.format(pandas_ver)\n",
"# specify CondaDependencies obj\n", "# specify CondaDependencies obj\n",
"# The CondaDependencies specifies the conda and pip packages that are installed in the environment\n",
"# the submitted job is run in. Note the remote environment(s) needs to be similar to the local\n",
"# environment, otherwise if a model is trained or deployed in a different environment this can\n",
"# cause errors. Please take extra care when specifying your dependencies in a production environment.\n",
"myenv = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n", "myenv = CondaDependencies.create(conda_packages=[sklearn_dep, pandas_dep],\n",
" pip_packages=['sklearn-pandas', 'pyyaml'] + azureml_pip_packages,\n", " pip_packages=['sklearn-pandas', 'pyyaml'] + azureml_pip_packages,\n",
" pin_sdk_version=False)\n", " pin_sdk_version=False)\n",
@@ -495,9 +503,9 @@
"headers = {'Content-Type':'application/json'}\n", "headers = {'Content-Type':'application/json'}\n",
"\n", "\n",
"# send request to service\n", "# send request to service\n",
"print(\"POST to url\", service.scoring_uri)\n",
"resp = requests.post(service.scoring_uri, input_data, headers=headers)\n", "resp = requests.post(service.scoring_uri, input_data, headers=headers)\n",
"\n", "\n",
"print(\"POST to url\", service.scoring_uri)\n",
"# can covert back to Python objects from json string if desired\n", "# can covert back to Python objects from json string if desired\n",
"print(\"prediction:\", resp.text)" "print(\"prediction:\", resp.text)"
] ]

View File

@@ -716,7 +716,6 @@
"\n", "\n",
"trainWithAutomlStep = AutoMLStep(name='AutoML_Regression',\n", "trainWithAutomlStep = AutoMLStep(name='AutoML_Regression',\n",
" automl_config=automl_config,\n", " automl_config=automl_config,\n",
" passthru_automl_config=False,\n",
" allow_reuse=True)\n", " allow_reuse=True)\n",
"print(\"trainWithAutomlStep created.\")" "print(\"trainWithAutomlStep created.\")"
] ]

View File

@@ -13,7 +13,7 @@ def init():
global g_tf_sess global g_tf_sess
# pull down model from workspace # pull down model from workspace
model_path = Model.get_model_path("mnist") model_path = Model.get_model_path("mnist-prs")
# contruct graph to execute # contruct graph to execute
tf.reset_default_graph() tf.reset_default_graph()

View File

@@ -274,7 +274,7 @@
"\n", "\n",
"# register downloaded model \n", "# register downloaded model \n",
"model = Model.register(model_path = \"models/\",\n", "model = Model.register(model_path = \"models/\",\n",
" model_name = \"mnist\", # this is the name the model is registered as\n", " model_name = \"mnist-prs\", # this is the name the model is registered as\n",
" tags = {'pretrained': \"mnist\"},\n", " tags = {'pretrained': \"mnist\"},\n",
" description = \"Mnist trained tensorflow model\",\n", " description = \"Mnist trained tensorflow model\",\n",
" workspace = ws)" " workspace = ws)"
@@ -474,8 +474,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"path_on_datastore = mnist_data.path('mnist/0.png')\n", "path_on_datastore = mnist_data.path('mnist/0.png')\n",
"single_image_ds = Dataset.File.from_files(path=path_on_datastore, validate=False)\n", "single_image_ds = Dataset.File.from_files(path=path_on_datastore, validate=False)"
"single_image_ds._ensure_saved(ws)"
] ]
}, },
{ {

View File

@@ -227,7 +227,7 @@
"\n", "\n",
"# register downloaded model\n", "# register downloaded model\n",
"model = Model.register(model_path = \"iris_model.pkl/iris_model.pkl\",\n", "model = Model.register(model_path = \"iris_model.pkl/iris_model.pkl\",\n",
" model_name = \"iris\", # this is the name the model is registered as\n", " model_name = \"iris-prs\", # this is the name the model is registered as\n",
" tags = {'pretrained': \"iris\"},\n", " tags = {'pretrained': \"iris\"},\n",
" workspace = ws)" " workspace = ws)"
] ]
@@ -356,7 +356,7 @@
" inputs=[named_iris_ds],\n", " inputs=[named_iris_ds],\n",
" output=output_folder,\n", " output=output_folder,\n",
" parallel_run_config=parallel_run_config,\n", " parallel_run_config=parallel_run_config,\n",
" arguments=['--model_name', 'iris'],\n", " arguments=['--model_name', 'iris-prs'],\n",
" allow_reuse=True\n", " allow_reuse=True\n",
")" ")"
] ]
@@ -380,7 +380,7 @@
"\n", "\n",
"pipeline = Pipeline(workspace=ws, steps=[distributed_csv_iris_step])\n", "pipeline = Pipeline(workspace=ws, steps=[distributed_csv_iris_step])\n",
"\n", "\n",
"pipeline_run = Experiment(ws, 'iris').submit(pipeline)" "pipeline_run = Experiment(ws, 'iris-prs').submit(pipeline)"
] ]
}, },
{ {

View File

@@ -381,7 +381,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"run.cancel()" "run.wait_for_completion(show_output=True)"
] ]
}, },
{ {

View File

@@ -456,6 +456,24 @@
"monitor.enable_schedule()" "monitor.enable_schedule()"
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Delete the DataDriftDetector\n",
"\n",
"Invoking the `delete()` method on the object deletes the the drift monitor permanently and cannot be undone. You will no longer be able to find it in the UI and the `list()` or `get()` methods. The object on which delete() was called will have its state set to deleted and name suffixed with deleted. The baseline and target datasets and model data that was collected, if any, are not deleted. The compute is not deleted. The DataDrift schedule pipeline is disabled and archived."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"monitor.delete()"
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},

View File

@@ -7,4 +7,5 @@ dependencies:
- azureml-monitoring - azureml-monitoring
- scikit-learn - scikit-learn
- numpy - numpy
- packaging
- inference-schema[numpy-support] - inference-schema[numpy-support]

View File

@@ -4,7 +4,13 @@ import numpy as np
from azureml.monitoring import ModelDataCollector from azureml.monitoring import ModelDataCollector
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
from inference_schema.schema_decorators import input_schema, output_schema from inference_schema.schema_decorators import input_schema, output_schema
from sklearn.externals import joblib # sklearn.externals.joblib is removed in 0.23
from sklearn import __version__ as sklearnver
from packaging.version import Version
if Version(sklearnver) < Version("0.23.0"):
from sklearn.externals import joblib
else:
import joblib
def init(): def init():

View File

@@ -11,19 +11,29 @@ Taxonomies for products and languages: https://review.docs.microsoft.com/new-hop
This is an introduction to the [Azure Machine Learning](https://docs.microsoft.com/en-us/azure/machine-learning/service/) Reinforcement Learning (Public Preview) using the [Ray](https://github.com/ray-project/ray/) framework. This is an introduction to the [Azure Machine Learning](https://docs.microsoft.com/en-us/azure/machine-learning/service/) Reinforcement Learning (Public Preview) using the [Ray](https://github.com/ray-project/ray/) framework.
Using these samples, you will be able to do the following. ## What is reinforcement learning?
1. Use an Azure Machine Learning workspace, set up virtual network and create compute clusters for running Ray. Reinforcement learning is an approach to machine learning to train agents to make a sequence of decisions. This technique has gained popularity over the last few years as breakthroughs have been made to teach reinforcement learning agents to excel at complex tasks like playing video games. There are many practical real-world use cases as well, including robotics, chemistry, online recommendations, advertising and more.
2. Run some experiments to train a reinforcement learning agent using Ray and RLlib.
In reinforcement learning, the goal is to train an agent *policy* that outputs actions based on the agents observations of its environment. Actions result in further observations and *rewards* for taking the actions. In reinforcement learning, the full reward for policy actions may take many steps to obtain. Learning a policy involves many trial-and-error runs of the agent interacting with the environment and improving its policy.
## Reinforcement learning on Azure Machine Learning
Reinforcement learning support in Azure Machine Learning service enables data scientists to scale training to many powerful CPU or GPU enabled VMs using [Azure Machine Learning compute clusters](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-set-up-training-targets#amlcompute) which automatically provision, manage, and scale down these VMs to help manage your costs.
Using these samples, you will learn how to do the following.
1. Use an Azure Machine Learning workspace, set up virtual network and create compute clusters for distributed training.
2. Train reinforcement learning agents using Ray RLlib.
## Contents ## Contents
| File/folder | Description | | File/folder | Description |
|-------------------|--------------------------------------------| |-------------------|--------------------------------------------|
| [devenv_setup.ipynb](setup/devenv_setup.ipynb) | Notebook to setup development environment for Azure ML RL | | [devenv_setup.ipynb](setup/devenv_setup.ipynb) | Notebook to setup virtual network for using Azure Machine Learning. Needed for the Pong and Minecraft examples. |
| [cartpole_ci.ipynb](cartpole-on-compute-instance/cartpole_ci.ipynb) | Notebook to train a Cartpole playing agent on an Azure ML Compute Instance | | [cartpole_ci.ipynb](cartpole-on-compute-instance/cartpole_ci.ipynb) | Notebook to train a Cartpole playing agent on an Azure Machine Learning Compute Instance |
| [cartpole_sc.ipynb](cartpole-on-single-compute/cartpole_sc.ipynb) | Notebook to train a Cartpole playing agent on an Azure ML Compute Cluster (single node) | | [cartpole_sc.ipynb](cartpole-on-single-compute/cartpole_sc.ipynb) | Notebook to train a Cartpole playing agent on an Azure Machine Learning Compute Cluster (single node) |
| [pong_rllib.ipynb](atari-on-distributed-compute/pong_rllib.ipynb) | Notebook to train Pong agent using RLlib on multiple compute targets | | [pong_rllib.ipynb](atari-on-distributed-compute/pong_rllib.ipynb) | Notebook for distributed training of Pong agent using RLlib on multiple compute targets |
| [minecraft.ipynb](minecraft-on-distributed-compute/minecraft.ipynb) | Notebook to train an agent to navigate through a lava maze in the Minecraft game | | [minecraft.ipynb](minecraft-on-distributed-compute/minecraft.ipynb) | Notebook to train an agent to navigate through a lava maze in the Minecraft game |
## Prerequisites ## Prerequisites
@@ -32,9 +42,10 @@ To make use of these samples, you need the following.
* A Microsoft Azure subscription. * A Microsoft Azure subscription.
* A Microsoft Azure resource group. * A Microsoft Azure resource group.
* An Azure Machine Learning Workspace in the resource group. Please make sure that the VM sizes `STANDARD_NC6` and `STANDARD_D2_V2` are supported in the workspace's region. * An Azure Machine Learning Workspace in the resource group.
* A virtual network set up in the resource group. * Azure Machine Learning training compute. These samples use the VM sizes `STANDARD_NC6` and `STANDARD_D2_V2`. If these are not available in your region,
* A virtual network is needed for the examples training on multiple compute targets. you can replace them with other sizes.
* A virtual network set up in the resource group for samples that use multiple compute targets. The Cartpole examples do not need a virtual network.
* The [devenv_setup.ipynb](setup/devenv_setup.ipynb) notebook shows you how to create a virtual network. You can alternatively use an existing virtual network, make sure it's in the same region as workspace is. * The [devenv_setup.ipynb](setup/devenv_setup.ipynb) notebook shows you how to create a virtual network. You can alternatively use an existing virtual network, make sure it's in the same region as workspace is.
* Any network security group defined on the virtual network must allow network traffic on ports used by Azure infrastructure services. This is described in more detail in the [devenv_setup.ipynb](setup/devenv_setup.ipynb) notebook. * Any network security group defined on the virtual network must allow network traffic on ports used by Azure infrastructure services. This is described in more detail in the [devenv_setup.ipynb](setup/devenv_setup.ipynb) notebook.
@@ -43,10 +54,10 @@ To make use of these samples, you need the following.
You can run these samples in the following ways. You can run these samples in the following ways.
* On an Azure ML Compute Instance or Notebook VM. * On an Azure Machine Learning Compute Instance or Azure Data Science Virtual Machine (DSVM).
* On a workstation with Python and the Azure ML Python SDK installed. * On a workstation with Python and the Azure ML Python SDK installed.
### Azure ML Compute Instance or Notebook VM ### Compute Instance or DSVM
#### Update packages #### Update packages

View File

@@ -0,0 +1,237 @@
import sys
import csv
from azure.mgmt.network import NetworkManagementClient
def check_port_in_port_range(expected_port: str,
dest_port_range: str):
"""
Check if a port is within a port range
Port range maybe like *, 8080 or 8888-8889
"""
if dest_port_range == '*':
return True
dest_ports = dest_port_range.split('-')
if len(dest_ports) == 1 and \
int(dest_ports[0]) == int(expected_port):
return True
if len(dest_ports) == 2 and \
int(dest_ports[0]) <= int(expected_port) and \
int(dest_ports[1]) >= int(expected_port):
return True
return False
def check_port_in_destination_port_ranges(expected_port: str,
dest_port_ranges: list):
"""
Check if a port is within a given list of port ranges
i.e. check if port 8080 is in port ranges of 22,80,8080-8090,443
"""
for dest_port_range in dest_port_ranges:
if check_port_in_port_range(expected_port, dest_port_range) is True:
return True
return False
def check_ports_in_destination_port_ranges(expected_ports: list,
dest_port_ranges: list):
"""
Check if all ports in a given port list are within a given list
of port ranges
i.e. check if port 8080,8081 are in port ranges of 22,80,8080-8090,443
"""
for expected_port in expected_ports:
if check_port_in_destination_port_ranges(
expected_port, dest_port_ranges) is False:
return False
return True
def check_source_address_prefix(source_address_prefix: str):
"""Check if source address prefix is BatchNodeManagement or default"""
required_prefix = 'BatchNodeManagement'
default_prefix = 'default'
if source_address_prefix.lower() == required_prefix.lower() or \
source_address_prefix.lower() == default_prefix.lower():
return True
return False
def check_protocol(protocol: str):
"""Check if protocol is supported - Tcp/Any"""
required_protocol = 'Tcp'
any_protocol = 'Any'
if required_protocol.lower() == protocol.lower() or \
any_protocol.lower() == protocol.lower():
return True
return False
def check_direction(direction: str):
"""Check if port direction is inbound"""
required_direction = 'Inbound'
if required_direction.lower() == direction.lower():
return True
return False
def check_provisioning_state(provisioning_state: str):
"""Check if the provisioning state is succeeded"""
required_provisioning_state = 'Succeeded'
if required_provisioning_state.lower() == provisioning_state.lower():
return True
return False
def check_rule_for_Azure_ML(rule):
"""Check if the ports required for Azure Machine Learning are open"""
required_ports = ['29876', '29877']
if check_source_address_prefix(rule.source_address_prefix) is False:
return False
if check_protocol(rule.protocol) is False:
return False
if check_direction(rule.direction) is False:
return False
if check_provisioning_state(rule.provisioning_state) is False:
return False
if rule.destination_port_range is not None:
if check_ports_in_destination_port_ranges(
required_ports,
[rule.destination_port_range]) is False:
return False
else:
if check_ports_in_destination_port_ranges(
required_ports,
rule.destination_port_ranges) is False:
return False
return True
def check_vnet_security_rules(auth_object,
vnet_subscription_id,
vnet_resource_group,
vnet_name,
save_to_file=False):
"""
Check all the rules of virtual network if required ports for Azure Machine
Learning are open
"""
network_client = NetworkManagementClient(
auth_object,
vnet_subscription_id)
# get the vnet
vnet = network_client.virtual_networks.get(
resource_group_name=vnet_resource_group,
virtual_network_name=vnet_name)
vnet_location = vnet.location
vnet_info = []
if vnet.subnets is None or len(vnet.subnets) == 0:
print('WARNING: No subnet found for VNet:', vnet_name)
# for each subnet of the vnet
for subnet in vnet.subnets:
if subnet.network_security_group is None:
print('WARNING: No network security group found for subnet.',
'Subnet',
subnet.id.split("/")[-1])
else:
# get all the rules
network_security_group_name = \
subnet.network_security_group.id.split("/")[-1]
network_security_group_resource_group_name = \
subnet.network_security_group.id.split("/")[4]
network_security_group_subscription_id = \
subnet.network_security_group.id.split("/")[2]
security_rules = list(network_client.security_rules.list(
network_security_group_resource_group_name,
network_security_group_name))
rule_matched = None
for rule in security_rules:
rule_info = []
# add vnet details
rule_info.append(vnet_name)
rule_info.append(vnet_subscription_id)
rule_info.append(vnet_resource_group)
rule_info.append(vnet_location)
# add subnet details
rule_info.append(subnet.id.split("/")[-1])
rule_info.append(network_security_group_name)
rule_info.append(network_security_group_subscription_id)
rule_info.append(network_security_group_resource_group_name)
# add rule details
rule_info.append(rule.priority)
rule_info.append(rule.name)
rule_info.append(rule.source_address_prefix)
if rule.destination_port_range is not None:
rule_info.append(rule.destination_port_range)
else:
rule_info.append(rule.destination_port_ranges)
rule_info.append(rule.direction)
rule_info.append(rule.provisioning_state)
vnet_info.append(rule_info)
if check_rule_for_Azure_ML(rule) is True:
rule_matched = rule
if rule_matched is not None:
print("INFORMATION: Rule matched with required ports. Subnet:",
subnet.id.split("/")[-1], "Rule:", rule.name)
else:
print("WARNING: No rule matched with required ports. Subnet:",
subnet.id.split("/")[-1])
if save_to_file is True:
file_name = vnet_name + ".csv"
with open(file_name, mode='w') as vnet_rule_file:
vnet_rule_file_writer = csv.writer(
vnet_rule_file,
delimiter=',',
quotechar='"',
quoting=csv.QUOTE_MINIMAL)
header = ['VNet_Name', 'VNet_Subscription_ID',
'VNet_Resource_Group', 'VNet_Location',
'Subnet_Name', 'NSG_Name',
'NSG_Subscription_ID', 'NSG_Resource_Group',
'Rule_Priority', 'Rule_Name', 'Rule_Source',
'Rule_Destination_Ports', 'Rule_Direction',
'Rule_Provisioning_State']
vnet_rule_file_writer.writerow(header)
vnet_rule_file_writer.writerows(vnet_info)
print("INFORMATION: Network security group rules for your virtual \
network are saved in file", file_name)

View File

@@ -13,7 +13,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/tutorials/how-to-use-azureml/reinforcement-learning/atari-on-distributed-compute/pong_rllib.png)" "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/reinforcement-learning/atari-on-distributed-compute/pong_rllib.png)"
] ]
}, },
{ {
@@ -155,6 +155,24 @@
"vnet_name = 'your_vnet'" "vnet_name = 'your_vnet'"
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ensure that the virtual network is configured correctly with required ports open. It is possible that you have configured rules with broader range of ports that allows ports 29876-29877 to be opened. Kindly review your network security group rules. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from files.networkutils import *\n",
"\n",
"check_vnet_security_rules(ws._auth_object, ws.subscription_id, ws.resource_group, vnet_name, True)"
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},

View File

@@ -13,7 +13,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/tutorials/how-to-use-azureml/reinforcement-learning/cartpole-on-compute-instance/cartpole_ci.png)" "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/reinforcement-learning/cartpole-on-compute-instance/cartpole_ci.png)"
] ]
}, },
{ {
@@ -161,6 +161,9 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"from azureml.core.compute import ComputeTarget, ComputeInstance\n",
"from azureml.core.compute_target import ComputeTargetException\n",
"\n",
"# Load current compute instance info\n", "# Load current compute instance info\n",
"current_compute_instance = load_nbvm()\n", "current_compute_instance = load_nbvm()\n",
"\n", "\n",
@@ -169,17 +172,23 @@
" print(\"Current compute instance:\", current_compute_instance)\n", " print(\"Current compute instance:\", current_compute_instance)\n",
" instance_name = current_compute_instance['instance']\n", " instance_name = current_compute_instance['instance']\n",
"else:\n", "else:\n",
" instance_name = next(iter(ws.compute_targets))\n", " instance_name = \"cartpole-ci-stdd2v2\"\n",
" try:\n",
" instance = ComputeInstance(workspace=ws, name=instance_name)\n",
" print('Found existing instance, use it.')\n",
" except ComputeTargetException:\n",
" print(\"Creating new compute instance...\")\n",
" compute_config = ComputeInstance.provisioning_configuration(\n",
" vm_size='STANDARD_D2_V2'\n",
" )\n",
" instance = ComputeInstance.create(ws, instance_name, compute_config)\n",
" instance.wait_for_completion(show_output=True)\n",
" print(\"Instance name:\", instance_name)\n", " print(\"Instance name:\", instance_name)\n",
"\n", "\n",
"compute_target = ws.compute_targets[instance_name]\n", "compute_target = ws.compute_targets[instance_name]\n",
"\n", "\n",
"print(\"Compute target status:\")\n", "print(\"Compute target status:\")\n",
"try:\n", "print(compute_target.get_status().serialize())\n",
" print(compute_target.get_status().serialize())\n",
"except:\n",
" print(compute_target.get_status())\n",
"\n",
"print(\"Compute target size:\")\n", "print(\"Compute target size:\")\n",
"print(compute_target.size(ws))" "print(compute_target.size(ws))"
] ]
@@ -658,7 +667,11 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# To archive the created experiment:\n", "# To archive the created experiment:\n",
"#exp.archive()" "#exp.archive()\n",
"\n",
"# To delete created compute instance\n",
"if not current_compute_instance:\n",
" compute_target.delete()"
] ]
}, },
{ {

View File

@@ -4,3 +4,4 @@ dependencies:
- azureml-sdk - azureml-sdk
- azureml-contrib-reinforcementlearning - azureml-contrib-reinforcementlearning
- azureml-widgets - azureml-widgets
- azureml-dataprep

View File

@@ -13,7 +13,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/tutorials/how-to-use-azureml/reinforcement-learning/cartpole_on_single_compute/cartpole_sc.png)" "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/reinforcement-learning/cartpole_on_single_compute/cartpole_sc.png)"
] ]
}, },
{ {
@@ -170,7 +170,7 @@
"source": [ "source": [
"from azureml.core.experiment import Experiment\n", "from azureml.core.experiment import Experiment\n",
"\n", "\n",
"experiment_name = 'CartPole-v0-CC'\n", "experiment_name = 'CartPole-v0-SC'\n",
"exp = Experiment(workspace=ws, name=experiment_name)" "exp = Experiment(workspace=ws, name=experiment_name)"
] ]
}, },

View File

@@ -4,3 +4,4 @@ dependencies:
- azureml-sdk - azureml-sdk
- azureml-contrib-reinforcementlearning - azureml-contrib-reinforcementlearning
- azureml-widgets - azureml-widgets
- azureml-dataprep

View File

@@ -209,7 +209,7 @@
"# name of the Virtual Network subnet ('default' the default name)\n", "# name of the Virtual Network subnet ('default' the default name)\n",
"subnet_name = 'default'\n", "subnet_name = 'default'\n",
"\n", "\n",
"gpu_cluster_name = 'gpu-cluster-nc6'\n", "gpu_cluster_name = 'gpu-cl-nc6-vnet'\n",
"\n", "\n",
"try:\n", "try:\n",
" gpu_cluster = ComputeTarget(workspace=ws, name=gpu_cluster_name)\n", " gpu_cluster = ComputeTarget(workspace=ws, name=gpu_cluster_name)\n",
@@ -250,7 +250,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"cpu_cluster_name = 'cpu-cluster-d2'\n", "cpu_cluster_name = 'cpu-cl-d2-vnet'\n",
"\n", "\n",
"try:\n", "try:\n",
" cpu_cluster = ComputeTarget(workspace=ws, name=cpu_cluster_name)\n", " cpu_cluster = ComputeTarget(workspace=ws, name=cpu_cluster_name)\n",

View File

@@ -13,7 +13,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/tutorials/how-to-use-azureml/reinforcement-learning/setup/devenv_setup.png)" "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/reinforcement-learning/setup/devenv_setup.png)"
] ]
}, },
{ {
@@ -186,7 +186,7 @@
")\n", ")\n",
"\n", "\n",
"async_nsg_creation.wait() \n", "async_nsg_creation.wait() \n",
"print(\"Network security group created successfully: \", async_nsg_creation.result())\n", "print(\"Network security group created successfully:\", async_nsg_creation.result())\n",
"\n", "\n",
"network_security_group = network_client.network_security_groups.get(\n", "network_security_group = network_client.network_security_groups.get(\n",
" resource_group,\n", " resource_group,\n",
@@ -211,6 +211,25 @@
"async_subnet_creation.wait()\n", "async_subnet_creation.wait()\n",
"print(\"Subnet created successfully:\", async_subnet_creation.result())" "print(\"Subnet created successfully:\", async_subnet_creation.result())"
] ]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Review the virtual network security rules\n",
"Ensure that the virtual network is configured correctly with required ports open. It is possible that you have configured rules with broader range of ports that allows ports 29876-29877 to be opened. Kindly review your network security group rules. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from files.networkutils import *\n",
"\n",
"check_vnet_security_rules(ws._auth_object, ws.subscription_id, ws.resource_group, vnet_name, True)"
]
} }
], ],
"metadata": { "metadata": {

View File

@@ -0,0 +1,237 @@
import sys
import csv
from azure.mgmt.network import NetworkManagementClient
def check_port_in_port_range(expected_port: str,
dest_port_range: str):
"""
Check if a port is within a port range
Port range maybe like *, 8080 or 8888-8889
"""
if dest_port_range == '*':
return True
dest_ports = dest_port_range.split('-')
if len(dest_ports) == 1 and \
int(dest_ports[0]) == int(expected_port):
return True
if len(dest_ports) == 2 and \
int(dest_ports[0]) <= int(expected_port) and \
int(dest_ports[1]) >= int(expected_port):
return True
return False
def check_port_in_destination_port_ranges(expected_port: str,
dest_port_ranges: list):
"""
Check if a port is within a given list of port ranges
i.e. check if port 8080 is in port ranges of 22,80,8080-8090,443
"""
for dest_port_range in dest_port_ranges:
if check_port_in_port_range(expected_port, dest_port_range) is True:
return True
return False
def check_ports_in_destination_port_ranges(expected_ports: list,
dest_port_ranges: list):
"""
Check if all ports in a given port list are within a given list
of port ranges
i.e. check if port 8080,8081 are in port ranges of 22,80,8080-8090,443
"""
for expected_port in expected_ports:
if check_port_in_destination_port_ranges(
expected_port, dest_port_ranges) is False:
return False
return True
def check_source_address_prefix(source_address_prefix: str):
"""Check if source address prefix is BatchNodeManagement or default"""
required_prefix = 'BatchNodeManagement'
default_prefix = 'default'
if source_address_prefix.lower() == required_prefix.lower() or \
source_address_prefix.lower() == default_prefix.lower():
return True
return False
def check_protocol(protocol: str):
"""Check if protocol is supported - Tcp/Any"""
required_protocol = 'Tcp'
any_protocol = 'Any'
if required_protocol.lower() == protocol.lower() or \
any_protocol.lower() == protocol.lower():
return True
return False
def check_direction(direction: str):
"""Check if port direction is inbound"""
required_direction = 'Inbound'
if required_direction.lower() == direction.lower():
return True
return False
def check_provisioning_state(provisioning_state: str):
"""Check if the provisioning state is succeeded"""
required_provisioning_state = 'Succeeded'
if required_provisioning_state.lower() == provisioning_state.lower():
return True
return False
def check_rule_for_Azure_ML(rule):
"""Check if the ports required for Azure Machine Learning are open"""
required_ports = ['29876', '29877']
if check_source_address_prefix(rule.source_address_prefix) is False:
return False
if check_protocol(rule.protocol) is False:
return False
if check_direction(rule.direction) is False:
return False
if check_provisioning_state(rule.provisioning_state) is False:
return False
if rule.destination_port_range is not None:
if check_ports_in_destination_port_ranges(
required_ports,
[rule.destination_port_range]) is False:
return False
else:
if check_ports_in_destination_port_ranges(
required_ports,
rule.destination_port_ranges) is False:
return False
return True
def check_vnet_security_rules(auth_object,
vnet_subscription_id,
vnet_resource_group,
vnet_name,
save_to_file=False):
"""
Check all the rules of virtual network if required ports for Azure Machine
Learning are open
"""
network_client = NetworkManagementClient(
auth_object,
vnet_subscription_id)
# get the vnet
vnet = network_client.virtual_networks.get(
resource_group_name=vnet_resource_group,
virtual_network_name=vnet_name)
vnet_location = vnet.location
vnet_info = []
if vnet.subnets is None or len(vnet.subnets) == 0:
print('WARNING: No subnet found for VNet:', vnet_name)
# for each subnet of the vnet
for subnet in vnet.subnets:
if subnet.network_security_group is None:
print('WARNING: No network security group found for subnet.',
'Subnet',
subnet.id.split("/")[-1])
else:
# get all the rules
network_security_group_name = \
subnet.network_security_group.id.split("/")[-1]
network_security_group_resource_group_name = \
subnet.network_security_group.id.split("/")[4]
network_security_group_subscription_id = \
subnet.network_security_group.id.split("/")[2]
security_rules = list(network_client.security_rules.list(
network_security_group_resource_group_name,
network_security_group_name))
rule_matched = None
for rule in security_rules:
rule_info = []
# add vnet details
rule_info.append(vnet_name)
rule_info.append(vnet_subscription_id)
rule_info.append(vnet_resource_group)
rule_info.append(vnet_location)
# add subnet details
rule_info.append(subnet.id.split("/")[-1])
rule_info.append(network_security_group_name)
rule_info.append(network_security_group_subscription_id)
rule_info.append(network_security_group_resource_group_name)
# add rule details
rule_info.append(rule.priority)
rule_info.append(rule.name)
rule_info.append(rule.source_address_prefix)
if rule.destination_port_range is not None:
rule_info.append(rule.destination_port_range)
else:
rule_info.append(rule.destination_port_ranges)
rule_info.append(rule.direction)
rule_info.append(rule.provisioning_state)
vnet_info.append(rule_info)
if check_rule_for_Azure_ML(rule) is True:
rule_matched = rule
if rule_matched is not None:
print("INFORMATION: Rule matched with required ports. Subnet:",
subnet.id.split("/")[-1], "Rule:", rule.name)
else:
print("WARNING: No rule matched with required ports. Subnet:",
subnet.id.split("/")[-1])
if save_to_file is True:
file_name = vnet_name + ".csv"
with open(file_name, mode='w') as vnet_rule_file:
vnet_rule_file_writer = csv.writer(
vnet_rule_file,
delimiter=',',
quotechar='"',
quoting=csv.QUOTE_MINIMAL)
header = ['VNet_Name', 'VNet_Subscription_ID',
'VNet_Resource_Group', 'VNet_Location',
'Subnet_Name', 'NSG_Name',
'NSG_Subscription_ID', 'NSG_Resource_Group',
'Rule_Priority', 'Rule_Name', 'Rule_Source',
'Rule_Destination_Ports', 'Rule_Direction',
'Rule_Provisioning_State']
vnet_rule_file_writer.writerow(header)
vnet_rule_file_writer.writerows(vnet_info)
print("INFORMATION: Network security group rules for your virtual \
network are saved in file", file_name)

View File

@@ -100,7 +100,7 @@
"\n", "\n",
"# Check core SDK version number\n", "# Check core SDK version number\n",
"\n", "\n",
"print(\"This notebook was created using SDK version 1.6.0, you are currently running version\", azureml.core.VERSION)" "print(\"This notebook was created using SDK version 1.8.0, you are currently running version\", azureml.core.VERSION)"
] ]
}, },
{ {

View File

@@ -0,0 +1,374 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/training/train-on-amlcompute/train-on-computeinstance.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Train using Azure Machine Learning Compute Instance\n",
"\n",
"* Initialize Workspace\n",
"* Introduction to ComputeInstance\n",
"* Create an Experiment\n",
"* Submit ComputeInstance run\n",
"* Additional operations to perform on ComputeInstance"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"If you are using an Azure Machine Learning ComputeInstance, you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check core SDK version number\n",
"import azureml.core\n",
"\n",
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize Workspace\n",
"\n",
"Initialize a workspace object"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"create workspace"
]
},
"outputs": [],
"source": [
"from azureml.core import Workspace\n",
"\n",
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction to ComputeInstance\n",
"\n",
"\n",
"Azure Machine Learning compute instance is a fully-managed cloud-based workstation optimized for your machine learning development environment. It is created **within your workspace region**.\n",
"\n",
"For more information on ComputeInstance, please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/concept-compute-instance)\n",
"\n",
"**Note**: As with other Azure services, there are limits on certain resources (for eg. AmlCompute quota) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create ComputeInstance\n",
"First lets check which VM families are available in your region. Azure is a regional service and some specialized SKUs (especially GPUs) are only available in certain regions. Since ComputeInstance is created in the region of your workspace, we will use the supported_vms () function to see if the VM family we want to use ('STANDARD_D3_V2') is supported.\n",
"\n",
"You can also pass a different region to check availability and then re-create your workspace in that region through the [configuration notebook](../../../configuration.ipynb)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"msdoc": "how-to-auto-train-remote.md",
"name": "check_region"
},
"outputs": [],
"source": [
"from azureml.core.compute import ComputeTarget, ComputeInstance\n",
"\n",
"ComputeInstance.supported_vmsizes(workspace = ws)\n",
"# ComputeInstance.supported_vmsizes(workspace = ws, location='eastus')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"msdoc": "how-to-auto-train-remote.md",
"name": "create_instance"
},
"outputs": [],
"source": [
"import datetime\n",
"import time\n",
"\n",
"from azureml.core.compute import ComputeTarget, ComputeInstance\n",
"from azureml.core.compute_target import ComputeTargetException\n",
"\n",
"# Choose a name for your instance\n",
"compute_name = \"compute-instance\"\n",
"\n",
"# Verify that instance does not exist already\n",
"try:\n",
" instance = ComputeInstance(workspace=ws, name=compute_name)\n",
" print('Found existing instance, use it.')\n",
"except ComputeTargetException:\n",
" compute_config = ComputeInstance.provisioning_configuration(\n",
" vm_size='STANDARD_D3_V2',\n",
" ssh_public_access=False,\n",
" # vnet_resourcegroup_name='<my-resource-group>',\n",
" # vnet_name='<my-vnet-name>',\n",
" # subnet_name='default',\n",
" # admin_user_ssh_public_key='<my-sshkey>'\n",
" )\n",
" instance = ComputeInstance.create(ws, compute_name, compute_config)\n",
" instance.wait_for_completion(show_output=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create An Experiment\n",
"\n",
"**Experiment** is a logical container in an Azure ML Workspace. It hosts run records which can include run metrics and output artifacts from your experiments."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Experiment\n",
"experiment_name = 'train-on-computeinstance'\n",
"experiment = Experiment(workspace = ws, name = experiment_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Submit ComputeInstance run\n",
"The training script `train.py` is already created for you"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create environment\n",
"\n",
"Create Docker based environment with scikit-learn installed."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Environment\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"\n",
"myenv = Environment(\"myenv\")\n",
"\n",
"myenv.docker.enabled = True\n",
"myenv.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure & Run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import ScriptRunConfig\n",
"from azureml.core.runconfig import DEFAULT_CPU_IMAGE\n",
"\n",
"src = ScriptRunConfig(source_directory='', script='train.py')\n",
"\n",
"# Set compute target to the one created in previous step\n",
"src.run_config.target = instance\n",
"\n",
"# Set environment\n",
"src.run_config.environment = myenv\n",
" \n",
"run = experiment.submit(config=src)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note: if you need to cancel a run, you can follow [these instructions](https://aka.ms/aml-docs-cancel-run)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"# Shows output of the run on stdout.\n",
"run.wait_for_completion(show_output=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(run.get_metrics())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Additional operations to perform on ComputeInstance\n",
"\n",
"You can perform more operations on ComputeInstance such as get status, change the state or deleting the compute."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"msdoc": "how-to-auto-train-remote.md",
"name": "get_status"
},
"outputs": [],
"source": [
"# get_status() gets the latest status of the ComputeInstance target\n",
"instance.get_status()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"msdoc": "how-to-auto-train-remote.md",
"name": "stop"
},
"outputs": [],
"source": [
"# stop() is used to stop the ComputeInstance\n",
"# Stopping ComputeInstance will stop the billing meter and persist the state on the disk.\n",
"# Available Quota will not be changed with this operation.\n",
"instance.stop(wait_for_completion=True, show_output=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"msdoc": "how-to-auto-train-remote.md",
"name": "start"
},
"outputs": [],
"source": [
"# start() is used to start the ComputeInstance if it is in stopped state\n",
"instance.start(wait_for_completion=True, show_output=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# restart() is used to restart the ComputeInstance\n",
"instance.restart(wait_for_completion=True, show_output=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# delete() is used to delete the ComputeInstance target. Useful if you want to re-use the compute name \n",
"# instance.delete()"
]
}
],
"metadata": {
"authors": [
{
"name": "ramagott"
}
],
"category": "training",
"compute": [
"Compute Instance"
],
"datasets": [
"Diabetes"
],
"deployment": [
"None"
],
"exclude_from_index": false,
"framework": [
"None"
],
"friendly_name": "Train on Azure Machine Learning Compute Instance",
"index_order": 1,
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.7"
},
"tags": [
"None"
],
"task": "Submit a run on Azure Machine Learning Compute Instance."
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,6 @@
name: train-on-computeinstance
dependencies:
- scikit-learn
- pip:
- azureml-sdk
- azureml-widgets

View File

@@ -0,0 +1,44 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from azureml.core.run import Run
from sklearn.externals import joblib
import os
import numpy as np
os.makedirs('./outputs', exist_ok=True)
X, y = load_diabetes(return_X_y=True)
run = Run.get_context()
X_train, X_test, y_train, y_test = train_test_split(X, y,
test_size=0.2,
random_state=0)
data = {"train": {"X": X_train, "y": y_train},
"test": {"X": X_test, "y": y_test}}
# list of numbers from 0.0 to 1.0 with a 0.05 interval
alphas = np.arange(0.0, 1.0, 0.05)
for alpha in alphas:
# Use Ridge algorithm to create a regression model
reg = Ridge(alpha=alpha)
reg.fit(data["train"]["X"], data["train"]["y"])
preds = reg.predict(data["test"]["X"])
mse = mean_squared_error(preds, data["test"]["y"])
run.log('alpha', alpha)
run.log('mse', mse)
model_file_name = 'ridge_{0:.2f}.pkl'.format(alpha)
# save model in the outputs folder so it automatically get uploaded
with open(model_file_name, "wb") as file:
joblib.dump(value=reg, filename=os.path.join('./outputs/',
model_file_name))
print('alpha is {0:.2f}, and mse is {1:0.2f}'.format(alpha, mse))

View File

@@ -380,6 +380,24 @@
"#compute_target.delete()" "#compute_target.delete()"
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Delete the DataDriftDetector\n",
"\n",
"Invoking the `delete()` method on the object deletes the the drift monitor permanently and cannot be undone. You will no longer be able to find it in the UI and the `list()` or `get()` methods. The object on which delete() was called will have its state set to deleted and name suffixed with deleted. The baseline and target datasets and model data that was collected, if any, are not deleted. The compute is not deleted. The DataDrift schedule pipeline is disabled and archived."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"monitor.delete()"
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},

View File

@@ -227,7 +227,13 @@
"from azureml.core import Dataset, Run\n", "from azureml.core import Dataset, Run\n",
"from sklearn.model_selection import train_test_split\n", "from sklearn.model_selection import train_test_split\n",
"from sklearn.tree import DecisionTreeClassifier\n", "from sklearn.tree import DecisionTreeClassifier\n",
"from sklearn.externals import joblib\n", "# sklearn.externals.joblib is removed in 0.23\n",
"from sklearn import __version__ as sklearnver\n",
"from packaging.version import Version\n",
"if Version(sklearnver) < Version(\"0.23.0\"):\n",
" from sklearn.externals import joblib\n",
"else:\n",
" import joblib\n",
"\n", "\n",
"run = Run.get_context()\n", "run = Run.get_context()\n",
"# get input dataset by name\n", "# get input dataset by name\n",
@@ -291,7 +297,7 @@
" entry_script='train_iris.py', \n", " entry_script='train_iris.py', \n",
" # pass dataset object as an input with name 'titanic'\n", " # pass dataset object as an input with name 'titanic'\n",
" inputs=[dataset.as_named_input('iris')],\n", " inputs=[dataset.as_named_input('iris')],\n",
" pip_packages=['azureml-dataprep[fuse]'],\n", " pip_packages=['azureml-dataprep[fuse]', 'packaging'],\n",
" compute_target=compute_target) " " compute_target=compute_target) "
] ]
}, },
@@ -415,11 +421,17 @@
"import os\n", "import os\n",
"import glob\n", "import glob\n",
"\n", "\n",
"from azureml.core.run import Run\n",
"from sklearn.linear_model import Ridge\n", "from sklearn.linear_model import Ridge\n",
"from sklearn.metrics import mean_squared_error\n", "from sklearn.metrics import mean_squared_error\n",
"from sklearn.model_selection import train_test_split\n", "from sklearn.model_selection import train_test_split\n",
"from azureml.core.run import Run\n", "# sklearn.externals.joblib is removed in 0.23\n",
"from sklearn.externals import joblib\n", "from sklearn import __version__ as sklearnver\n",
"from packaging.version import Version\n",
"if Version(sklearnver) < Version(\"0.23.0\"):\n",
" from sklearn.externals import joblib\n",
"else:\n",
" import joblib\n",
"\n", "\n",
"import numpy as np\n", "import numpy as np\n",
"\n", "\n",
@@ -480,9 +492,10 @@
"from azureml.core.conda_dependencies import CondaDependencies\n", "from azureml.core.conda_dependencies import CondaDependencies\n",
"\n", "\n",
"conda_env = Environment('conda-env')\n", "conda_env = Environment('conda-env')\n",
"conda_env.python.conda_dependencies = CondaDependencies.create(pip_packages=['azureml-sdk',\n", "conda_env.python.conda_dependencies = CondaDependencies.create(pip_packages=['azureml-core',\n",
" 'azureml-dataprep[pandas,fuse]',\n", " 'azureml-dataprep[pandas,fuse]',\n",
" 'scikit-learn'])" " 'scikit-learn',\n",
" 'packaging'])"
] ]
}, },
{ {

View File

@@ -65,6 +65,7 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
| [Resuming a model](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/tensorflow/training/train-tensorflow-resume-training/train-tensorflow-resume-training.ipynb) | Resume a model in TensorFlow from a previously submitted run | MNIST | AML Compute | None | TensorFlow | None | | [Resuming a model](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/ml-frameworks/tensorflow/training/train-tensorflow-resume-training/train-tensorflow-resume-training.ipynb) | Resume a model in TensorFlow from a previously submitted run | MNIST | AML Compute | None | TensorFlow | None |
| [Training in Spark](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-in-spark/train-in-spark.ipynb) | Submiting a run on a spark cluster | None | HDI cluster | None | PySpark | None | | [Training in Spark](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-in-spark/train-in-spark.ipynb) | Submiting a run on a spark cluster | None | HDI cluster | None | PySpark | None |
| [Train on Azure Machine Learning Compute](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb) | Submit a run on Azure Machine Learning Compute. | Diabetes | AML Compute | None | None | None | | [Train on Azure Machine Learning Compute](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb) | Submit a run on Azure Machine Learning Compute. | Diabetes | AML Compute | None | None | None |
| [Train on Azure Machine Learning Compute Instance](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-computeinstance/train-on-computeinstance.ipynb) | Submit a run on Azure Machine Learning Compute Instance. | Diabetes | Compute Instance | None | None | None |
| [Train on local compute](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-local/train-on-local.ipynb) | Train a model locally | Diabetes | Local | None | None | None | | [Train on local compute](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-local/train-on-local.ipynb) | Train a model locally | Diabetes | Local | None | None | None |
| [Train in a remote Linux virtual machine](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-remote-vm/train-on-remote-vm.ipynb) | Configure and execute a run | Diabetes | Data Science Virtual Machine | None | None | None | | [Train in a remote Linux virtual machine](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training/train-on-remote-vm/train-on-remote-vm.ipynb) | Configure and execute a run | Diabetes | Data Science Virtual Machine | None | None | None |
| [Using Tensorboard](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training-with-deep-learning/export-run-history-to-tensorboard/export-run-history-to-tensorboard.ipynb) | Export the run history as Tensorboard logs | None | None | None | TensorFlow | None | | [Using Tensorboard](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/training-with-deep-learning/export-run-history-to-tensorboard/export-run-history-to-tensorboard.ipynb) | Export the run history as Tensorboard logs | None | None | None | TensorFlow | None |
@@ -95,6 +96,8 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
|:----|:-----|:-------:|:----------------:|:-----------------:|:------------:|:------------:| |:----|:-----|:-------:|:----------------:|:-----------------:|:------------:|:------------:|
| [DNN Text Featurization](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/classification-text-dnn/auto-ml-classification-text-dnn.ipynb) | Text featurization using DNNs for classification | None | AML Compute | None | None | None | | [DNN Text Featurization](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/classification-text-dnn/auto-ml-classification-text-dnn.ipynb) | Text featurization using DNNs for classification | None | AML Compute | None | None | None |
| [configuration](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) | | | | | | | | [configuration](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) | | | | | | |
| [fairlearn-azureml-mitigation](https://github.com/Azure/MachineLearningNotebooks/blob/master//contrib/fairness/fairlearn-azureml-mitigation.ipynb) | | | | | | |
| [upload-fairness-dashboard](https://github.com/Azure/MachineLearningNotebooks/blob/master//contrib/fairness/upload-fairness-dashboard.ipynb) | | | | | | |
| [lightgbm-example](https://github.com/Azure/MachineLearningNotebooks/blob/master//contrib/gbdt/lightgbm/lightgbm-example.ipynb) | | | | | | | | [lightgbm-example](https://github.com/Azure/MachineLearningNotebooks/blob/master//contrib/gbdt/lightgbm/lightgbm-example.ipynb) | | | | | | |
| [azure-ml-with-nvidia-rapids](https://github.com/Azure/MachineLearningNotebooks/blob/master//contrib/RAPIDS/azure-ml-with-nvidia-rapids.ipynb) | | | | | | | | [azure-ml-with-nvidia-rapids](https://github.com/Azure/MachineLearningNotebooks/blob/master//contrib/RAPIDS/azure-ml-with-nvidia-rapids.ipynb) | | | | | | |
| [auto-ml-continuous-retraining](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/continuous-retraining/auto-ml-continuous-retraining.ipynb) | | | | | | | | [auto-ml-continuous-retraining](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/automated-machine-learning/continuous-retraining/auto-ml-continuous-retraining.ipynb) | | | | | | |
@@ -112,6 +115,7 @@ Machine Learning notebook samples and encourage efficient retrieval of topics an
| [register-model-deploy-local-advanced](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/deploy-to-local/register-model-deploy-local-advanced.ipynb) | | | | | | | | [register-model-deploy-local-advanced](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/deploy-to-local/register-model-deploy-local-advanced.ipynb) | | | | | | |
| [enable-app-insights-in-production-service](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/enable-app-insights-in-production-service/enable-app-insights-in-production-service.ipynb) | | | | | | | | [enable-app-insights-in-production-service](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/enable-app-insights-in-production-service/enable-app-insights-in-production-service.ipynb) | | | | | | |
| [onnx-model-register-and-deploy](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-model-register-and-deploy.ipynb) | | | | | | | | [onnx-model-register-and-deploy](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/onnx/onnx-model-register-and-deploy.ipynb) | | | | | | |
| [production-deploy-to-aks-ssl](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks-ssl.ipynb) | | | | | | |
| [production-deploy-to-aks](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb) | | | | | | | | [production-deploy-to-aks](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb) | | | | | | |
| [production-deploy-to-aks-gpu](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/production-deploy-to-aks-gpu/production-deploy-to-aks-gpu.ipynb) | | | | | | | | [production-deploy-to-aks-gpu](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/deployment/production-deploy-to-aks-gpu/production-deploy-to-aks-gpu.ipynb) | | | | | | |
| [explain-model-on-amlcompute](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/azure-integration/remote-explanation/explain-model-on-amlcompute.ipynb) | | | | | | | | [explain-model-on-amlcompute](https://github.com/Azure/MachineLearningNotebooks/blob/master//how-to-use-azureml/explain-model/azure-integration/remote-explanation/explain-model-on-amlcompute.ipynb) | | | | | | |

View File

@@ -102,7 +102,7 @@
"source": [ "source": [
"import azureml.core\n", "import azureml.core\n",
"\n", "\n",
"print(\"This notebook was created using version 1.6.0 of the Azure ML SDK\")\n", "print(\"This notebook was created using version 1.8.0 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")" "print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
] ]
}, },

View File

@@ -217,6 +217,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"%%time\n", "%%time\n",
"import uuid\n",
"from azureml.core.webservice import Webservice\n", "from azureml.core.webservice import Webservice\n",
"from azureml.core.model import InferenceConfig\n", "from azureml.core.model import InferenceConfig\n",
"from azureml.core.environment import Environment\n", "from azureml.core.environment import Environment\n",
@@ -230,8 +231,9 @@
"myenv = Environment.get(workspace=ws, name=\"tutorial-env\", version=\"1\")\n", "myenv = Environment.get(workspace=ws, name=\"tutorial-env\", version=\"1\")\n",
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)\n", "inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)\n",
"\n", "\n",
"service_name = 'sklearn-mnist-svc-' + str(uuid.uuid4())[:4]\n",
"service = Model.deploy(workspace=ws, \n", "service = Model.deploy(workspace=ws, \n",
" name='sklearn-mnist-svc', \n", " name=service_name, \n",
" models=[model], \n", " models=[model], \n",
" inference_config=inference_config, \n", " inference_config=inference_config, \n",
" deployment_config=aciconfig)\n", " deployment_config=aciconfig)\n",

View File

@@ -272,6 +272,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"%%time\n", "%%time\n",
"import uuid\n",
"from azureml.core.webservice import Webservice\n", "from azureml.core.webservice import Webservice\n",
"from azureml.core.model import InferenceConfig\n", "from azureml.core.model import InferenceConfig\n",
"from azureml.core.environment import Environment\n", "from azureml.core.environment import Environment\n",
@@ -284,8 +285,9 @@
"myenv = Environment.get(workspace=ws, name=\"tutorial-env\")\n", "myenv = Environment.get(workspace=ws, name=\"tutorial-env\")\n",
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)\n", "inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)\n",
"\n", "\n",
"service_name = 'sklearn-mnist-svc-' + str(uuid.uuid4())[:4]\n",
"service = Model.deploy(workspace=ws, \n", "service = Model.deploy(workspace=ws, \n",
" name='sklearn-mnist-svc', \n", " name=service_name, \n",
" models=[model], \n", " models=[model], \n",
" inference_config=inference_config, \n", " inference_config=inference_config, \n",
" deployment_config=aciconfig)\n", " deployment_config=aciconfig)\n",
@@ -489,7 +491,7 @@
"import json\n", "import json\n",
"from azureml.core import Webservice\n", "from azureml.core import Webservice\n",
"\n", "\n",
"service = Webservice(ws, 'sklearn-mnist-svc')\n", "service = Webservice(ws, service_name)\n",
"\n", "\n",
"#pass the connection string for blob storage to give the server access to the uploaded public keys \n", "#pass the connection string for blob storage to give the server access to the uploaded public keys \n",
"conn_str_template = 'DefaultEndpointsProtocol={};AccountName={};AccountKey={};EndpointSuffix=core.windows.net'\n", "conn_str_template = 'DefaultEndpointsProtocol={};AccountName={};AccountKey={};EndpointSuffix=core.windows.net'\n",