MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-pipeline-drafts.ipynb

{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Copyright (c) Microsoft Corporation. All rights reserved.  \n",
        "Licensed under the MIT License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-how-to-use-pipeline-drafts.png)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# How to Use Pipeline Drafts\n",
        "In this notebook, we will show you how you can use Pipeline Drafts. Pipeline Drafts are mutable pipelines which can be used to submit runs and create Published Pipelines."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Prerequisites and AML Basics\n",
        "If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the [configuration Notebook](https://aka.ms/pl-config) first if you haven't. This sets you up with a working config file that has information on your workspace, subscription id, etc.\n",
        "\n",
        "### Initialization Steps"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import azureml.core\n",
        "from azureml.core import Workspace\n",
        "from azureml.widgets import RunDetails\n",
        "\n",
        "# Check core SDK version number\n",
        "print(\"SDK version:\", azureml.core.VERSION)\n",
        "\n",
        "ws = Workspace.from_config()\n",
        "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Compute Target\n",
        "Retrieve an already attached Azure Machine Learning Compute to use in the Pipeline."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.core.compute import AmlCompute, ComputeTarget\n",
        "from azureml.core.compute_target import ComputeTargetException\n",
        "aml_compute_target = \"cpu-cluster\"\n",
        "try:\n",
        "    aml_compute = AmlCompute(ws, aml_compute_target)\n",
        "    print(\"Found existing compute target: {}\".format(aml_compute_target))\n",
        "except ComputeTargetException:\n",
        "    print(\"Creating new compute target: {}\".format(aml_compute_target))\n",
        "    \n",
        "    provisioning_config = AmlCompute.provisioning_configuration(vm_size = \"STANDARD_D2_V2\",\n",
        "                                                                min_nodes = 1, \n",
        "                                                                max_nodes = 4)    \n",
        "    aml_compute = ComputeTarget.create(ws, aml_compute_target, provisioning_config)\n",
        "    aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Build a Pipeline\n",
        "Build a simple pipeline to use to create a PipelineDraft."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.pipeline.core import Pipeline\n",
        "from azureml.pipeline.steps import PythonScriptStep\n",
        "\n",
        "source_directory = \"publish_run_train\"\n",
        "\n",
        "train_step = PythonScriptStep(\n",
        "    name=\"Training_Step\",\n",
        "    script_name=\"train.py\", \n",
        "    compute_target=aml_compute_target, \n",
        "    source_directory=source_directory)\n",
        "print(\"train step created\")\n",
        "\n",
        "pipeline = Pipeline(workspace=ws, steps=[train_step])\n",
        "print (\"Pipeline is built\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Create a Pipeline Draft\n",
        "Create a PipelineDraft by specifying a name, description, experiment_name and Pipeline. You can also specify tags, properties and pipeline_parameter values.\n",
        "\n",
        "In this example we use the previously created Pipeline object to create the Pipeline Draft. You can also create a Pipeline Draft from an existing Pipeline Run, Published Pipeline, or other Pipeline Draft."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from azureml.pipeline.core import PipelineDraft\n",
        "\n",
        "pipeline_draft = PipelineDraft.create(ws, name=\"TestPipelineDraft\",\n",
        "                                      description=\"draft description\",\n",
        "                                      experiment_name=\"helloworld\",\n",
        "                                      pipeline=pipeline,\n",
        "                                      continue_on_step_failure=True,\n",
        "                                      tags={'dev': 'true'},\n",
        "                                      properties={'train': 'value'})\n",
        "\n",
        "created_pipeline_draft_id = pipeline_draft.id"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### List Pipeline Drafts in a Workspace\n",
        "Use the PipelineDraft.list() function to list all PipelineDrafts in a Workspace. You can use the optional tags parameter to filter on specified tag values."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "pipeline_drafts = PipelineDraft.list(ws, tags={'dev': 'true'})\n",
        "\n",
        "for pipeline_draft in pipeline_drafts:\n",
        "    print(pipeline_draft)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Get a Pipeline Draft by Id"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "pipeline_draft = PipelineDraft.get(ws, id=created_pipeline_draft_id)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Update a Pipeline Draft\n",
        "The update() function of a pipeline draft can be used to update the name, description, experiment name, pipeline parameter assignments, continue on step failure setting and Pipeline associated with the PipelineDraft. "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "new_train_step = PythonScriptStep(\n",
        "    name=\"New_Training_Step\",\n",
        "    script_name=\"train.py\", \n",
        "    compute_target=aml_compute_target, \n",
        "    source_directory=source_directory)\n",
        "\n",
        "new_pipeline = Pipeline(workspace=ws, steps=[new_train_step])\n",
        "\n",
        "pipeline_draft.update(name=\"UpdatedPipelineDraft\", description=\"has updated train step\", pipeline=new_pipeline)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Submit a Pipeline Run from a Pipeline Draft\n",
        "Use the pipeline_draft.submit() function to submit a PipelineRun. After the run is submitted, the PipelineDraft can still be edited and used to submit new runs."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "pipeline_run = pipeline_draft.submit_run()\n",
        "pipeline_run"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Create a Published Pipeline from a Pipeline Draft\n",
        "Use the pipeline_draft.publish() function to create a Published Pipeline from the Pipeline Draft. After creating a Published Pipeline, the Pipeline Draft can still be edited and used to create other Published Pipelines."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "published_pipeline = pipeline_draft.publish()\n",
        "published_pipeline"
      ]
    }
  ],
  "metadata": {
    "authors": [
      {
        "name": "elihop"
      }
    ],
    "category": "tutorial",
    "compute": [
      "AML Compute"
    ],
    "datasets": [
      "Custom"
    ],
    "deployment": [
      "None"
    ],
    "exclude_from_index": false,
    "framework": [
      "Azure ML"
    ],
    "friendly_name": "How to use Pipeline Drafts to create a Published Pipeline",
    "kernelspec": {
      "display_name": "Python 3.6",
      "language": "python",
      "name": "python36"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.2"
    },
    "order_index": 14,
    "star_tag": [
      "featured"
    ],
    "tags": [
      "None"
    ],
    "task": "Demonstrates the use of Pipeline Drafts"
  },
  "nbformat": 4,
  "nbformat_minor": 2
}