From cf0490ab92c0b3c7d40aa2edec5157c26fecee68 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Shan=C3=A9=20Winner?= <43390034+swinner95@users.noreply.github.com> Date: Thu, 31 Oct 2019 12:24:08 -0700 Subject: [PATCH] Update auto-ml-classification-bank-marketing.ipynb --- ...uto-ml-classification-bank-marketing.ipynb | 713 +++--------------- 1 file changed, 103 insertions(+), 610 deletions(-) diff --git a/how-to-use-azureml/automated-machine-learning/classification-bank-marketing/auto-ml-classification-bank-marketing.ipynb b/how-to-use-azureml/automated-machine-learning/classification-bank-marketing/auto-ml-classification-bank-marketing.ipynb index 8827a394..1e005683 100644 --- a/how-to-use-azureml/automated-machine-learning/classification-bank-marketing/auto-ml-classification-bank-marketing.ipynb +++ b/how-to-use-azureml/automated-machine-learning/classification-bank-marketing/auto-ml-classification-bank-marketing.ipynb @@ -1,15 +1,6 @@ { "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Copyright (c) Microsoft Corporation. All rights reserved.\n", - "\n", - "Licensed under the MIT License." - ] - }, - { + { "cell_type": "markdown", "metadata": {}, "source": [ @@ -20,38 +11,40 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Automated Machine Learning\n", - "_**Classification with Deployment using a Bank Marketing Dataset**_\n", + "Copyright (c) Microsoft Corporation. All rights reserved.\n", "\n", - "## Contents\n", - "1. [Introduction](#Introduction)\n", - "1. [Setup](#Setup)\n", - "1. [Train](#Train)\n", - "1. [Results](#Results)\n", - "1. [Deploy](#Deploy)\n", - "1. [Test](#Test)\n", - "1. [Acknowledgements](#Acknowledgements)" + "Licensed under the MIT License." 
] - }, + }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## Introduction\n", + "# Unique Descriptive Title\n", + "_**Unique Subtitle**_\n", "\n", - "In this example we use the UCI Bank Marketing dataset to showcase how you can use AutoML for a classification problem and deploy it to an Azure Container Instance (ACI). The classification goal is to predict if the client will subscribe to a term deposit with the bank.\n", + "Introduction that describes, in customer-friendly language, what the reader will do and accomplish.\n", + "\n", + "## Contents\n", + "1. [Introduction](#Introduction)\n", + "1. [Prerequisites](#Prerequisites)\n", + "1. [Configuration and Setup](#Setup)\n", + "1. [Working with Data](#Working-with-Data)\n", + "1. [Training](#Training)\n", + "1. [Productionizing](#Productionizing)\n", + "1. [Model Monitoring](#Model-Monitoring)\n", + "1. [Clean up resources](#Clean-up-resources)\n", + "1. [Next Steps](#Next-Steps)\n", + "1. [Acknowledgements](#Acknowledgements)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Configuration\n", "\n", - "If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. \n", - "\n", - "In this notebook you will learn how to:\n", - "1. Create an experiment using an existing workspace.\n", - "2. Configure AutoML using `AutoMLConfig`.\n", - "3. Train the model using local compute.\n", - "4. Explore the results.\n", - "5. Register the model.\n", - "6. Create a container image.\n", - "7. Create an Azure Container Instance (ACI) service.\n", - "8. Test the ACI service." + "If you are using an Azure Machine Learning Compute Instance, you are all set. Otherwise, go through the [configuration](../../../configuration.ipynb) notebook first if you haven't already to establish your connection to the AzureML Workspace. 
\n", + "Please note that a Basic edition workspace is created by default in the configuration.ipynb file.\n" ] }, { @@ -60,8 +53,8 @@ "source": [ "## Setup\n", "\n", - "As part of the setup you have already created an Azure ML `Workspace` object. For AutoML you will need to create an `Experiment` object, which is a named object in a `Workspace` used to run experiments." - ] + "As part of the setup you have already created an Azure ML `Workspace` object....\n" + ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "import json\n", - "import logging\n", - "\n", - "from matplotlib import pyplot as plt\n", - "import numpy as np\n", - "import pandas as pd\n", - "import os\n", - "from sklearn import datasets\n", - "import azureml.dataprep as dprep\n", - "from sklearn.model_selection import train_test_split\n", - "\n", + "import os\n", "import azureml.core\n", - "from azureml.core.experiment import Experiment\n", - "from azureml.core.workspace import Workspace\n", - "from azureml.train.automl import AutoMLConfig\n", - "from azureml.train.automl.run import AutoMLRun" + "from azureml.core import Experiment, Run, Workspace\n", + "from azureml.core.authentication import ServicePrincipalAuthentication\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ - "ws = Workspace.from_config()\n", + "tenant_id = os.environ['TENANT_ID']\n", + "client_id = os.environ['CLIENT_ID']\n", + "run = Run.get_context()\n", + "secret_name = \"{0}-secret\".format(client_id)\n", + "secret = run.get_secret(name=secret_name)\n", + "sp_auth = ServicePrincipalAuthentication(tenant_id, client_id, secret)\n", + "ws = Workspace.from_config(auth=sp_auth)\n", "\n", - "# choose a name for experiment\n", - "experiment_name = 'automl-classification-bmarketing'\n", + "# choose a unique name for experiment\n", + "experiment_name = 'unique-name'\n", "# project folder\n", - "project_folder = './sample_projects/automl-classification-bankmarketing'\n", + "project_folder = './sample_projects/test'\n", "\n", "experiment=Experiment(ws, experiment_name)\n", "\n", - "output = {}\n", - "output['SDK version'] = 
azureml.core.VERSION\n", - "output['Subscription ID'] = ws.subscription_id\n", - "output['Workspace'] = ws.name\n", - "output['Resource Group'] = ws.resource_group\n", - "output['Location'] = ws.location\n", - "output['Project Directory'] = project_folder\n", - "output['Experiment Name'] = experiment.name\n", - "pd.set_option('display.max_colwidth', -1)\n", - "outputDf = pd.DataFrame(data = output, index = [''])\n", - "outputDf.T" ] }, { @@ -120,583 +92,77 @@ "metadata": {}, "source": [ "## Create or Attach existing AmlCompute\n", - "You will need to create a compute target for your AutoML run. In this tutorial, you create AmlCompute as your training compute resource.\n", + "You will need to create a compute target for your run. In this tutorial, you create AmlCompute as your training compute resource.\n", "#### Creation of AmlCompute takes approximately 5 minutes. \n", "If the AmlCompute with that name is already in your workspace this code will skip the creation process.\n", "As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read this article on the default limits and how to request more quota." 
] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "from azureml.core.compute import AmlCompute\n", - "from azureml.core.compute import ComputeTarget\n", + "# Working with Data\n", "\n", - "# Choose a name for your cluster.\n", - "amlcompute_cluster_name = \"automlcl\"\n", - "\n", - "found = False\n", - "# Check if this compute target already exists in the workspace.\n", - "cts = ws.compute_targets\n", - "if amlcompute_cluster_name in cts and cts[amlcompute_cluster_name].type == 'AmlCompute':\n", - "    found = True\n", - "    print('Found existing compute target.')\n", - "    compute_target = cts[amlcompute_cluster_name]\n", - "    \n", - "if not found:\n", - "    print('Creating a new compute target...')\n", - "    provisioning_config = AmlCompute.provisioning_configuration(vm_size = \"STANDARD_D2_V2\", # for GPU, use \"STANDARD_NC6\"\n", - "                                                                #vm_priority = 'lowpriority', # optional\n", - "                                                                max_nodes = 6)\n", - "\n", - "    # Create the cluster.\n", - "    compute_target = ComputeTarget.create(ws, amlcompute_cluster_name, provisioning_config)\n", - "    \n", - "    # Can poll for a minimum number of nodes and for a specific timeout.\n", - "    # If no min_node_count is provided, it will use the scale settings for the cluster.\n", - "    compute_target.wait_for_completion(show_output = True, min_node_count = None, timeout_in_minutes = 20)\n", - "    \n", - "    # For a more detailed view of current AmlCompute status, use get_status()." + "Here you would learn how to perform data labeling, use Open Datasets, etc.\n", + "To do this, first load....\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "# Data\n", + "## Training\n", "\n", - "Here load the data in the get_data() script to be utilized in azure compute. To do this first load all the necessary libraries and dependencies to set up paths for the data and to create the conda_Run_config."
+ "Here you would learn how to train a DNN using...\n" ] }, - { - "cell_type": "code", - "execution_count": null, + { + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "if not os.path.isdir('data'):\n", - "    os.mkdir('data')\n", - "    \n", - "if not os.path.exists(project_folder):\n", - "    os.makedirs(project_folder)" + "# Productionizing\n", + "\n", + "Here you would learn how to deploy your model to ACI to perform...\n" ] }, - { - "cell_type": "code", - "execution_count": null, + { + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "from azureml.core.runconfig import RunConfiguration\n", - "from azureml.core.conda_dependencies import CondaDependencies\n", - "import pkg_resources\n", + "# Model Monitoring\n", "\n", - "# create a new RunConfig object\n", - "conda_run_config = RunConfiguration(framework=\"python\")\n", + "Here you would learn how to detect data drift, etc.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ + "# Clean up resources\n", "\n", - "# Set compute target to AmlCompute\n", - "conda_run_config.target = compute_target\n", - "conda_run_config.environment.docker.enabled = True\n", - "conda_run_config.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE\n", + "Now, let's clean up the resources we created...\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ + "# Next Steps\n", "\n", - "dprep_dependency = 'azureml-dataprep==' + pkg_resources.get_distribution(\"azureml-dataprep\").version\n", - "\n", - "cd = CondaDependencies.create(pip_packages=['azureml-sdk[automl]', dprep_dependency], conda_packages=['numpy','py-xgboost<=0.80'])\n", - "conda_run_config.environment.python.conda_dependencies = cd" + "In this notebook, you’ve done x, y, z. 
You can learn more with these resources:\n", + "+ [SDK reference documentation for `MyClass`]()\n", + "+ [About this feature](https://docs.microsoft.com/azure/machine-learning/service/thisfeature)\n", ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "### Load Data\n", - "\n", - "Here we create the script to be run in azure comput for loading the data, we load the bank marketing dataset into X_train and y_train. Next X_train and y_train is returned for training the model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "data = \"https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/bankmarketing_train.csv\"\n", - "dflow = dprep.read_csv(data, infer_column_types=True)\n", - "dflow.get_profile()\n", - "X_train = dflow.drop_columns(columns=['y'])\n", - "y_train = dflow.keep_columns(columns=['y'], validate_column_exists=True)\n", - "dflow.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Train\n", - "\n", - "Instantiate a AutoMLConfig object. This defines the settings and data used to run the experiment.\n", - "\n", - "|Property|Description|\n", - "|-|-|\n", - "|**task**|classification or regression|\n", - "|**primary_metric**|This is the metric that you want to optimize. Classification supports the following primary metrics:
accuracy
AUC_weighted
average_precision_score_weighted
norm_macro_recall
precision_score_weighted|\n", - "|**iteration_timeout_minutes**|Time limit in minutes for each iteration.|\n", - "|**iterations**|Number of iterations. In each iteration AutoML trains a specific pipeline with the data.|\n", - "|**n_cross_validations**|Number of cross validation splits.|\n", - "|**X**|(sparse) array-like, shape = [n_samples, n_features]|\n", - "|**y**|(sparse) array-like, shape = [n_samples, ], Multi-class targets.|\n", - "|**path**|Relative path to the project folder. AutoML stores configuration files for the experiment under this folder. You can specify a new empty folder.|\n", - "\n", - "**_You can find more information about primary metrics_** [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-auto-train#primary-metric)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "automl_settings = {\n", - " \"iteration_timeout_minutes\": 5,\n", - " \"iterations\": 10,\n", - " \"n_cross_validations\": 2,\n", - " \"primary_metric\": 'AUC_weighted',\n", - " \"preprocess\": True,\n", - " \"max_concurrent_iterations\": 5,\n", - " \"verbosity\": logging.INFO,\n", - "}\n", - "\n", - "automl_config = AutoMLConfig(task = 'classification',\n", - " debug_log = 'automl_errors.log',\n", - " path = project_folder,\n", - " run_configuration=conda_run_config,\n", - " X = X_train,\n", - " y = y_train,\n", - " **automl_settings\n", - " )" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Call the `submit` method on the experiment object and pass the run configuration. Execution of local runs is synchronous. Depending on the data and the number of iterations this can run for a while.\n", - "In this example, we specify `show_output = True` to print currently running iterations to the console." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "remote_run = experiment.submit(automl_config, show_output = True)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "remote_run" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Results" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Widget for Monitoring Runs\n", - "\n", - "The widget will first report a \"loading\" status while running the first iteration. After completing the first iteration, an auto-updating graph and table will be shown. The widget will refresh once per minute, so you should see the graph update as child runs complete.\n", - "\n", - "**Note:** The widget displays a link at the bottom. Use this link to open a web interface to explore the individual run details" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from azureml.widgets import RunDetails\n", - "RunDetails(remote_run).show() " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Deploy\n", - "\n", - "### Retrieve the Best Model\n", - "\n", - "Below we select the best pipeline from our iterations. The `get_output` method on `automl_classifier` returns the best run and the fitted model for the last invocation. Overloads on `get_output` allow you to retrieve the best run and fitted model for *any* logged metric or for a particular *iteration*." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "best_run, fitted_model = remote_run.get_output()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Register the Fitted Model for Deployment\n", - "If neither `metric` nor `iteration` are specified in the `register_model` call, the iteration with the best primary metric is registered." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "description = 'AutoML Model trained on bank marketing data to predict if a client will subscribe to a term deposit'\n", - "tags = None\n", - "model = remote_run.register_model(description = description, tags = tags)\n", - "\n", - "print(remote_run.model_id) # This will be written to the script file later in the notebook." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Create Scoring Script\n", - "The scoring script is required to generate the image for deployment. It contains the code to do the predictions on input data." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%%writefile score.py\n", - "import pickle\n", - "import json\n", - "import numpy\n", - "import azureml.train.automl\n", - "from sklearn.externals import joblib\n", - "from azureml.core.model import Model\n", - "\n", - "\n", - "def init():\n", - " global model\n", - " model_path = Model.get_model_path(model_name = '<>') # this name is model.id of model that we want to deploy\n", - " # deserialize the model file back into a sklearn model\n", - " model = joblib.load(model_path)\n", - "\n", - "def run(rawdata):\n", - " try:\n", - " data = json.loads(rawdata)['data']\n", - " data = numpy.array(data)\n", - " result = model.predict(data)\n", - " except Exception as e:\n", - " result = str(e)\n", - " return json.dumps({\"error\": result})\n", - " return json.dumps({\"result\":result.tolist()})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Create a YAML File for the Environment" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To ensure the fit results are consistent with the training results, the SDK dependency versions need to be the same as the environment that trains the model. Details about retrieving the versions can be found in notebook [12.auto-ml-retrieve-the-training-sdk-versions](12.auto-ml-retrieve-the-training-sdk-versions.ipynb)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "dependencies = remote_run.get_run_sdk_dependencies(iteration = 1)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for p in ['azureml-train-automl', 'azureml-sdk', 'azureml-core']:\n", - " print('{}\\t{}'.format(p, dependencies[p]))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from azureml.core.conda_dependencies import CondaDependencies\n", - "\n", - "myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn','py-xgboost<=0.80'],\n", - " pip_packages=['azureml-sdk[automl]'])\n", - "\n", - "conda_env_file_name = 'myenv.yml'\n", - "myenv.save_to_file('.', conda_env_file_name)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Substitute the actual version number in the environment file.\n", - "# This is not strictly needed in this notebook because the model should have been generated using the current SDK version.\n", - "# However, we include this in case this code is used on an experiment from a previous SDK version.\n", - "\n", - "with open(conda_env_file_name, 'r') as cefr:\n", - " content = cefr.read()\n", - "\n", - "with open(conda_env_file_name, 'w') as cefw:\n", - " cefw.write(content.replace(azureml.core.VERSION, dependencies['azureml-sdk']))\n", - "\n", - "# Substitute the actual model id in the script file.\n", - "\n", - "script_file_name = 'score.py'\n", - "\n", - "with open(script_file_name, 'r') as cefr:\n", - " content = cefr.read()\n", - "\n", - "with open(script_file_name, 'w') as cefw:\n", - " cefw.write(content.replace('<>', remote_run.model_id))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Create a Container Image\n", - "\n", - "Next use Azure Container Instances for deploying models as 
a web service for quickly deploying and validating your model\n", - "or when testing a model that is under development." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from azureml.core.image import Image, ContainerImage\n", - "\n", - "image_config = ContainerImage.image_configuration(runtime= \"python\",\n", - " execution_script = script_file_name,\n", - " conda_file = conda_env_file_name,\n", - " tags = {'area': \"bmData\", 'type': \"automl_classification\"},\n", - " description = \"Image for automl classification sample\")\n", - "\n", - "image = Image.create(name = \"automlsampleimage\",\n", - " # this is the model object \n", - " models = [model],\n", - " image_config = image_config, \n", - " workspace = ws)\n", - "\n", - "image.wait_for_creation(show_output = True)\n", - "\n", - "if image.creation_state == 'Failed':\n", - " print(\"Image build log at: \" + image.image_build_log_uri)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Deploy the Image as a Web Service on Azure Container Instance\n", - "\n", - "Deploy an image that contains the model and other assets needed by the service." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from azureml.core.webservice import AciWebservice\n", - "\n", - "aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1, \n", - " memory_gb = 1, \n", - " tags = {'area': \"bmData\", 'type': \"automl_classification\"}, \n", - " description = 'sample service for Automl Classification')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from azureml.core.webservice import Webservice\n", - "\n", - "aci_service_name = 'automl-sample-bankmarketing'\n", - "print(aci_service_name)\n", - "aci_service = Webservice.deploy_from_image(deployment_config = aciconfig,\n", - " image = image,\n", - " name = aci_service_name,\n", - " workspace = ws)\n", - "aci_service.wait_for_deployment(True)\n", - "print(aci_service.state)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Delete a Web Service\n", - "\n", - "Deletes the specified web service." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#aci_service.delete()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Get Logs from a Deployed Web Service\n", - "\n", - "Gets logs from a deployed web service." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#aci_service.get_logs()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Test\n", - "\n", - "Now that the model is trained split our data in the same way the data was split for training (The difference here is the data is being split locally) and then run the test data through the trained model to get the predicted values." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load the bank marketing datasets.\n", - "from sklearn.datasets import load_diabetes\n", - "from sklearn.model_selection import train_test_split\n", - "from numpy import array" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "data = \"https://automlsamplenotebookdata.blob.core.windows.net/automl-sample-notebook-data/bankmarketing_validate.csv\"\n", - "dflow = dprep.read_csv(data, infer_column_types=True)\n", - "dflow.get_profile()\n", - "X_test = dflow.drop_columns(columns=['y'])\n", - "y_test = dflow.keep_columns(columns=['y'], validate_column_exists=True)\n", - "dflow.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "X_test = X_test.to_pandas_dataframe()\n", - "y_test = y_test.to_pandas_dataframe()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "y_pred = fitted_model.predict(X_test)\n", - "actual = array(y_test)\n", - "actual = actual[:,0]\n", - "print(y_pred.shape, \" \", actual.shape)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Calculate metrics for the prediction\n", - "\n", - "Now visualize the data on a scatter plot to show what our truth (actual) values are compared to the predicted values \n", - "from the trained model that was returned." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib notebook\n", - "test_pred = plt.scatter(actual, y_pred, color='b')\n", - "test_test = plt.scatter(actual, actual, color='g')\n", - "plt.legend((test_pred, test_test), ('prediction', 'truth'), loc='upper left', fontsize=8)\n", - "plt.show()" - ] - }, - { "cell_type": "markdown", "metadata": {}, "source": [ "## Acknowledgements" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "This Bank Marketing dataset is made available under the Creative Commons (CCO: Public Domain) License: https://creativecommons.org/publicdomain/zero/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: https://creativecommons.org/publicdomain/zero/1.0/ and is available at: https://www.kaggle.com/janiobachmann/bank-marketing-dataset .\n", + "This dataset is made available under the Creative Commons (CC0: Public Domain) License: https://creativecommons.org/publicdomain/zero/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: https://creativecommons.org/publicdomain/zero/1.0/, and it is available at: https://www.kaggle.com/janiobachmann/bank-marketing-dataset.\n", "\n", "_**Acknowledgements**_\n", - "This data set is originally available within the UCI Machine Learning Database: https://archive.ics.uci.edu/ml/datasets/bank+marketing\n", + "This dataset is originally available within the UCI Machine Learning Database: https://archive.ics.uci.edu/ml/datasets/bank+marketing\n", "\n", "[Moro et al., 2014] S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. 
Decision Support Systems, Elsevier, 62:22-31, June 2014" ] @@ -705,9 +171,30 @@ "metadata": { "authors": [ { - "name": "v-rasav" + "name": "YOUR ALIAS" } + ], + "category": "tutorial", + "compute": [ + "AML Compute" ], + "datasets": [ + "MNIST" + ], + "deployment": [ + "AKS" + ], + "exclude_from_index": false, + "framework": [ + "PyTorch" + ], + "friendly_name": "How to use ModuleStep with AML Pipelines", + "order_index": 14, + "star_tag": [], + "tags": [ + "Pipeline Builder" + ], + "task": "Demonstrates the use of ModuleStep", "kernelspec": { "display_name": "Python 3.6", "language": "python", @@ -728,4 +217,4 @@ }, "nbformat": 4, "nbformat_minor": 2 -} \ No newline at end of file +}
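The notebook-level `metadata` block this patch introduces must parse as a single JSON object (no stray closing braces, lowercase booleans, no trailing commas) for `nbformat` to load the file. A quick stand-alone sanity check of that fragment — the field values are taken verbatim from the patch, and `YOUR ALIAS` is the template placeholder, not a real author name:

```python
import json

# The notebook-level metadata fields introduced by this patch, written out
# as one well-formed JSON object.
fragment = """
{
  "authors": [{"name": "YOUR ALIAS"}],
  "category": "tutorial",
  "compute": ["AML Compute"],
  "datasets": ["MNIST"],
  "deployment": ["AKS"],
  "exclude_from_index": false,
  "framework": ["PyTorch"],
  "friendly_name": "How to use ModuleStep with AML Pipelines",
  "order_index": 14,
  "star_tag": [],
  "tags": ["Pipeline Builder"],
  "task": "Demonstrates the use of ModuleStep"
}
"""

# json.loads raises ValueError on any of the structural mistakes above,
# so a successful parse is the whole test.
metadata = json.loads(fragment)
print(metadata["friendly_name"])  # How to use ModuleStep with AML Pipelines
```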