{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-projects-remote/train-projects-remote.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Copyright (c) Microsoft Corporation. All rights reserved.\n", "\n", "Licensed under the MIT License." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Train with MLflow Projects on AML Compute\n", "\n", "Train MLflow Projects on Azure Machine Learning Compute.\n", "\n", "Train MLflow Projects on your machine with AzureML compute and tracking. In this notebook you will:\n", "\n", "1. Set up MLflow tracking URI to track experiments and metrics in AzureML\n", "2. Create experiment\n", "3. Set up an MLflow project to run on AzureML compute\n", "4. Submit an MLflow project run and view it in an AzureML workspace \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequisites \n", "\n", "If you are using a Notebook VM, you are all set. Otherwise, go through the [Configuration](../../../../configuration.ipnyb) notebook to set up your Azure Machine Learning workspace and ensure other common prerequisites are met.\n", "\n", "Make sure you have the following before staring the notebook: \n", "- Connected to an AML Workspace \n", "- Have an existing [Azure ML Compute cluster](https://docs.microsoft.com/azure/machine-learning/how-to-create-attach-compute-sdk#amlcompute) in that Workspace \n", "- Have an MLproject file with a modified environment specification \n", "\n", "Add the azureml-mlflow package as a pip dependency to your environment configuration file (conda.yaml). The project can run without this addition, but key artifacts and metrics will not be logged to your Workspace. An example conda.yaml is included in this tutorial folder with the necessary packages. \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set-up \n", "\n", "Check the Azure ML and MLflow SDK version installed on your computer and connect to your workspace" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "import sys, os\n", "import mlflow\n", "import mlflow.azureml\n", "\n", "import azureml.core\n", "from azureml.core import Workspace\n", "\n", "ws = Workspace.from_config()\n", "\n", "print(\"SDK version:\", azureml.core.VERSION)\n", "print(\"MLflow version:\", mlflow.version.VERSION)\n", "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Initialize Tracking Store and Experiment\n", "\n", "### Set Tracking URI \n", "\n", "Set the MLflow tracking URI to point to your Azure ML Workspace. The subsequent logging calls from MLflow APIs will go to Azure ML services and will be tracked under your Workspace." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create Experiment\n", "\n", "Create an Mlflow Experiment to organize your runs. It can be set either by passing the name as a **parameter** in the mlflow.projects.run call or by the following," ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "experiment_name = \"train-project-amlcompute\"\n", "mlflow.set_experiment(experiment_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create the Backend Configuration Object\n", "\n", "The backend configuration object will store necesary information for the integration such as the compute target and whether to use your local managed environment or a system managed environment. \n", "\n", "The integration will accept \"COMPUTE\" and \"USE_CONDA\" as parameters where \"COMPUTE\" is set to the name of your remote compute cluster and \"USE_CONDA\" which creates a new environment for the project from the environment configuration file. If \"COMPUTE\" is present in the object, the project will be automatically submitted to the remote compute and ignore \"USE_CONDA\". MLflow accepts a dictionary object or a JSON file." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# dictionary\n", "backend_config = {\"COMPUTE\": \"cpu-cluster\", \"USE_CONDA\": False}\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Modify your Environment specification\n", "\n", "Add the azureml-mlflow package as a pip dependency to your environment configuration file (conda.yaml). The project can run without this addition, but key artifacts and metrics will not be logged to your Workspace. An example conda.yaml is included in the notebook folder. Adding it to to the file will look like this,\n", "\n", "```\n", "name: mlflow-example\n", "channels:\n", " - defaults\n", " - anaconda\n", " - conda-forge\n", "dependencies:\n", " - python=3.6\n", " - scikit-learn=0.19.1\n", " - pip\n", " - pip:\n", " - mlflow\n", " - azureml-mlflow\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Submit Run \n", "\n", "Submit the mlflow project run using aml compute and ensure the **backened** parameter is set to azureml.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "remote_mlflow_run = mlflow.projects.run(uri=\".\", \n", " parameters={\"alpha\":0.3},\n", " backend = \"azureml\",\n", " backend_config = backend_config,\n", " synchronous=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### View run \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "remote_mlflow_run" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Next Steps\n", "\n", "Try out these notebooks to learn more about MLflow-Azure Machine Learning integration:\n", "\n", " * [Train a model using remote compute on Azure Cloud](../train-remote/train-remote.ipynb)\n", " * [Train a model using Pytorch and MLflow](../../../ml-frameworks/using-mlflow/train-and-deploy-pytorch)\n", "\n" ] } ], "metadata": { "authors": [ { "name": "shipatel" } ], "category": "tutorial", "celltoolbar": "Edit Metadata", "compute": [ "AML Compute" ], "exclude_from_index": false, "framework": [ "Scikit" ], "friendly_name": "Use MLflow projects with Azure Machine Learning to train a model", "kernelspec": { "display_name": "Python 3.8 - AzureML", "language": "python", "name": "python38-azureml" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5-final" }, "tags": [ "mlflow", "scikit" ], "task": "Use MLflow projects with Azure Machine Learning to train a model using azureml compute" }, "nbformat": 4, "nbformat_minor": 2 }