mirror of
https://github.com/Azure/MachineLearningNotebooks.git
synced 2025-12-20 01:27:06 -05:00
553 lines
18 KiB
Plaintext
553 lines
18 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Copyright (c) Microsoft Corporation. All rights reserved. \n",
|
|
"Licensed under the MIT License."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
""
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"\n",
|
|
"# How to Setup a PipelineEndpoint and Submit a Pipeline Using the PipelineEndpoint.\n",
|
|
"In this notebook, we will see how to setup a PipelineEndpoint and run specific pipeline version.\n",
|
|
"\n",
|
|
"PipelineEndpoint can be used to update a published pipeline while maintaining same endpoint.\n",
|
|
"PipelineEndpoint, provides a way to keep track of [PublishedPipelines](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.publishedpipeline) using versions. PipelineEndpoint uses endpoint with version information to trigger underlying published pipeline. Pipeline endpoints are uniquely named within a workspace. \n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Prerequisites and AML Basics\n",
|
|
"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the [configuration Notebook](https://aka.ms/pl-config) first if you haven't. This sets you up with a working config file that has information on your workspace, subscription id, etc.\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core import Workspace\n",
|
|
"\n",
|
|
"ws = Workspace.from_config()\n",
|
|
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Notebook Overview\n",
|
|
"In this notebook, we provide an introduction to Azure machine learning PipelineEndpoints. It covers:\n",
|
|
"* [Create PipelineEndpoint](#Create-PipelineEndpoint), How to create PipelineEndpoint.\n",
|
|
"* [Retrieving PipelineEndpoint](#Retrieving-PipelineEndpoint), How to get specific PipelineEndpoint from worskpace by name/Id and get all [PipelineEndpoints](#Get-all-PipelineEndpoints-in-workspace) within workspace.\n",
|
|
"* [PipelineEndpoint Properties](#PipelineEndpoint-properties). How to get and set PipelineEndpoint properties, such as default version of PipelineEndpoint.\n",
|
|
"* [PipelineEndpoint Submission](#PipelineEndpoint-Submission). How to run a Pipeline using PipelineEndpoint."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Create PipelineEndpoint\n",
|
|
"Following are required input parameters to create PipelineEndpoint:\n",
|
|
"\n",
|
|
"* *workspace*: AML workspace.\n",
|
|
"* *name*: name of PipelineEndpoint, it is unique within workspace.\n",
|
|
"* *description*: description details for PipelineEndpoint.\n",
|
|
"* *pipeline*: A [Pipeline](#Steps-to-create-simple-Pipeline) or [PublishedPipeline](#Publish-Pipeline), to set default version of PipelineEndpoint. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Initialization, Steps to create a Pipeline\n",
|
|
"\n",
|
|
"The best practice is to use separate folders for scripts and its dependent files for each step and specify that folder as the `source_directory` for the step. This helps reduce the size of the snapshot created for the step (only the specific folder is snapshotted). Since changes in any files in the `source_directory` would trigger a re-upload of the snapshot, this helps keep the reuse of the step when there are no changes in the `source_directory` of the step."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core import Run, Experiment, Datastore\n",
|
|
"from azureml.core.compute import AmlCompute, ComputeTarget\n",
|
|
"from azureml.pipeline.steps import PythonScriptStep\n",
|
|
"from azureml.pipeline.core import Pipeline\n",
|
|
"\n",
|
|
"#Retrieve an already attached Azure Machine Learning Compute\n",
|
|
"aml_compute_target = \"cpu-cluster\"\n",
|
|
"try:\n",
|
|
" aml_compute = AmlCompute(ws, aml_compute_target)\n",
|
|
" print(\"Found existing compute target: {}\".format(aml_compute_target))\n",
|
|
"except:\n",
|
|
" print(\"Creating new compute target: {}\".format(aml_compute_target))\n",
|
|
" \n",
|
|
" provisioning_config = AmlCompute.provisioning_configuration(vm_size = \"STANDARD_D2_V2\",\n",
|
|
" min_nodes = 1, \n",
|
|
" max_nodes = 4) \n",
|
|
" aml_compute = ComputeTarget.create(ws, aml_compute_target, provisioning_config)\n",
|
|
" aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
|
|
"\n",
|
|
"# source_directory\n",
|
|
"source_directory = 'publish_run_train'\n",
|
|
"# define a single step pipeline for demonstration purpose.\n",
|
|
"trainStep = PythonScriptStep(\n",
|
|
" name=\"Training_Step\",\n",
|
|
" script_name=\"train.py\", \n",
|
|
" compute_target=aml_compute_target, \n",
|
|
" source_directory=source_directory\n",
|
|
")\n",
|
|
"print(\"TrainStep created\")\n",
|
|
"# build and validate Pipeline\n",
|
|
"pipeline = Pipeline(workspace=ws, steps=[trainStep])\n",
|
|
"print(\"Pipeline is built\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Publish Pipeline"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from datetime import datetime\n",
|
|
"\n",
|
|
"timenow = datetime.now().strftime('%m-%d-%Y-%H-%M')\n",
|
|
"\n",
|
|
"pipeline_name = timenow + \"-Pipeline\"\n",
|
|
"print(pipeline_name)\n",
|
|
"\n",
|
|
"published_pipeline = pipeline.publish(\n",
|
|
" name=pipeline_name, \n",
|
|
" description=pipeline_name)\n",
|
|
"print(\"Newly published pipeline id: {}\".format(published_pipeline.id))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Publishing PipelineEndpoint\n",
|
|
"Create PipelineEndpoint with required parameters: workspace, name, description and pipeline"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.pipeline.core import PipelineEndpoint\n",
|
|
"\n",
|
|
"pipeline_endpoint = PipelineEndpoint.publish(workspace=ws, name=\"PipelineEndpointTest\",\n",
|
|
" pipeline=pipeline, description=\"Test description Notebook\")\n",
|
|
"pipeline_endpoint"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Retrieving PipelineEndpoint\n",
|
|
"\n",
|
|
"PipelineEndpoint is uniquely defined by name and id within workspace. PipelineEndpoint in workspace can be retrived by Id or by name."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Get PipelineEndpoint by Name\n",
|
|
"\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"pipeline_endpoint_by_name = PipelineEndpoint.get(workspace=ws, name=\"PipelineEndpointTest\")\n",
|
|
"pipeline_endpoint_by_name"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Get PipelineEndpoint by Id\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"#get the PipelineEndpoint Id\n",
|
|
"pipeline_endpoint_by_name = PipelineEndpoint.get(workspace=ws, name=\"PipelineEndpointTest\")\n",
|
|
"endpoint_id = pipeline_endpoint_by_name.id\n",
|
|
"\n",
|
|
"pipeline_endpoint_by_id = PipelineEndpoint.get(workspace=ws, id=endpoint_id)\n",
|
|
"pipeline_endpoint_by_id"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Get all PipelineEndpoints in workspace\n",
|
|
"Returns all PipelineEndpoints within workspace"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"endpoint_list = PipelineEndpoint.get_all(workspace=ws, active_only=True)\n",
|
|
"endpoint_list"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### PipelineEndpoint properties"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Default Version of PipelineEndpoint\n",
|
|
"Default version of PipelineEndpoint starts from \"0\" and increments on addition of pipelines.\n",
|
|
"\n",
|
|
"##### Get the Default Version"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"default_version = pipeline_endpoint_by_name.get_default_version()\n",
|
|
"default_version"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"##### Set default version \n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"pipeline_endpoint_by_name.set_default_version(\"0\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Get the Published Pipeline corresponds to specific version of PipelineEndpoint"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"pipeline = pipeline_endpoint_by_name.get_pipeline(\"0\")\n",
|
|
"pipeline"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Get default version Published Pipeline"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"pipeline = pipeline_endpoint_by_name.get_pipeline()\n",
|
|
"pipeline"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Add Published Pipeline to PipelineEndpoint, \n",
|
|
"Adds a published pipeline (if its not present) using add() and if you want to add and set to default use add_default()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"pipeline_endpoint_by_name.add(published_pipeline)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Add Published pipeline to PipelineEndpoint and set it to default version\n",
|
|
"Adding published pipeline to PipelineEndpoint if not present and set it to default"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Set Published Pipeline to PipelineEndpoint, if exists\n",
|
|
"pipeline_endpoint_by_name.set_default(published_pipeline)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Get all Versions in PipelineEndpoint\n",
|
|
"Returns list of published pipelines and its versions"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"versions = pipeline_endpoint_by_name.get_all_versions()\n",
|
|
"\n",
|
|
"for ve in versions:\n",
|
|
" print(ve.version)\n",
|
|
" print(ve.pipeline.id)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Get all Published Pipelines in PipelineEndpoint\n",
|
|
"Returns all active pipelines in PipelineEnpoint, if active_only flag is set to True."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"pipelines = pipeline_endpoint_by_name.get_all_pipelines(active_only=True)\n",
|
|
"pipelines"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Name property of PipelineEndpoint\n",
|
|
"PipelineEndpoint is uniquely identified by name"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"##### Set Name PipelineEndpoint"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"pipeline_endpoint_by_name.set_name(name=\"NewName\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### PipelineEndpoint Submission\n",
|
|
"PipelineEndpoint triggers specific versioned pipeline or default pipeline by:\n",
|
|
"* Rest Endpoint \n",
|
|
"* Submit call."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Run Pipeline by endpoint property of PipelineEndpoint\n",
|
|
"Run specific pipeline using endpoint property of PipelineEndpoint and executing http post."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"pipeline_endpoint_by_name = PipelineEndpoint.get(workspace=ws, name=\"PipelineEndpointTest\")\n",
|
|
"\n",
|
|
"# endpoint with id \n",
|
|
"rest_endpoint_id = pipeline_endpoint_by_name.endpoint\n",
|
|
"\n",
|
|
"# for default version pipeline\n",
|
|
"rest_endpoint_id_without_version_with_id = rest_endpoint_id\n",
|
|
"\n",
|
|
"# for specific version pipeline just append version info\n",
|
|
"version=\"0\"\n",
|
|
"rest_endpoint_id_with_version = rest_endpoint_id_without_version_with_id+\"/\"+ version\n",
|
|
"print(rest_endpoint_id_with_version)\n",
|
|
"pipeline_endpoint_by_name"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# endpoint with name\n",
|
|
"rest_endpoint_name = rest_endpoint_id.split(\"Id\", 1)[0] + \"Name?name=\" + pipeline_endpoint_by_name.name\n",
|
|
"\n",
|
|
"# for default version pipeline\n",
|
|
"rest_endpoint_name_without_version = rest_endpoint_name\n",
|
|
"\n",
|
|
"# for specific version pipeline just append version info\n",
|
|
"version=\"0\"\n",
|
|
"rest_endpoint_name_with_version = rest_endpoint_name_without_version+\"&pipelineVersion=\"+ version\n",
|
|
"print(rest_endpoint_name_with_version)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"[This notebook](https://aka.ms/pl-restep-auth) shows how to authenticate to AML workspace."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core.authentication import InteractiveLoginAuthentication\n",
|
|
"import requests\n",
|
|
"\n",
|
|
"auth = InteractiveLoginAuthentication()\n",
|
|
"aad_token = auth.get_authentication_header()\n",
|
|
"\n",
|
|
"#endpoint = pipeline_endpoint_by_name.url\n",
|
|
"\n",
|
|
"print(\"You can perform HTTP POST on URL {} to trigger this pipeline\".format(rest_endpoint_name_with_version))\n",
|
|
"\n",
|
|
"# specify the param when running the pipeline\n",
|
|
"response = requests.post(rest_endpoint_name_with_version, \n",
|
|
" headers=aad_token, \n",
|
|
" json={\"ExperimentName\": \"default_pipeline\",\n",
|
|
" \"RunSource\": \"SDK\",\n",
|
|
" \"ParameterAssignments\": {\"1\": \"united\", \"2\":\"city\"}})\n",
|
|
"\n",
|
|
"run_id = response.json()[\"Id\"]\n",
|
|
"print(run_id)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"#### Run Pipeline by Submit call of PipelineEndpoint \n",
|
|
"Run specific pipeline using Submit api of PipelineEndpoint"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# submit pipeline with specific version\n",
|
|
"run_id = pipeline_endpoint_by_name.submit(\"TestPipelineEndpoint\", pipeline_version=\"0\")\n",
|
|
"print(run_id)\n",
|
|
"\n",
|
|
"# submit pipeline with default version\n",
|
|
"run_id = pipeline_endpoint_by_name.submit(\"TestPipelineEndpoint\")\n",
|
|
"print(run_id)"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"authors": [
|
|
{
|
|
"name": "mameghwa"
|
|
}
|
|
],
|
|
"kernelspec": {
|
|
"display_name": "Python 3.6",
|
|
"language": "python",
|
|
"name": "python36"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.6.2"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 2
|
|
} |