{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/track-and-monitor-experiments/using-mlflow/train-projects-local/train-projects-local.png)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Copyright (c) Microsoft Corporation. All rights reserved.\n",
        "\n",
        "Licensed under the MIT License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Train with MLflow Projects on local compute\n",
        "\n",
        "Train MLflow Projects on your machine with local compute and AzureML tracking. In this notebook you will:\n",
        "\n",
        "1. Set up MLflow tracking URI to track experiments and metrics in AzureML\n",
        "2. Create experiment\n",
        "3. Set up an MLflow project to run on AzureML compute\n",
        "4. Submit an MLflow project run and view it in an AzureML workspace "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Prerequisites \n",
        "\n",
        "If you are using a Notebook VM, you're all set. Otherwise, go through the [Configuration](../../../../configuration.ipnyb) notebook to set up your Azure Machine Learning workspace and ensure other common prerequisites are met.\n",
        "\n",
        "Install azureml-mlflow package before running this notebook. Note that mlflow itself gets installed as dependency if you haven't installed it yet.\n",
        "\n",
        "```\n",
        "pip install azureml-mlflow\n",
        "```\n",
        "\n",
        "This example also uses scikit-learn. Install them using the following:\n",
        "```\n",
        "pip install scikit-learn matplotlib\n",
        "```\n",
        "\n",
        "Make sure you have the following before starting the notebook: \n",
        "- Connected to an AzureML Workspace \n",
        "- Your local conda environment has the necessary packages needed to run this project\n",
        "\n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Set-up\n",
        "\n",
        "Configure your workspace and check package versions"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": []
      },
      "outputs": [],
      "source": [
        "import sys, os\n",
        "import mlflow\n",
        "import mlflow.azureml\n",
        "\n",
        "import azureml.core\n",
        "from azureml.core import Workspace\n",
        "\n",
        "ws = Workspace.from_config()\n",
        "\n",
        "print(\"SDK version:\", azureml.core.VERSION)\n",
        "print(\"MLflow version:\", mlflow.version.VERSION)\n",
        "print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Initialize Tracking Store and Experiment\n",
        "\n",
        "### Set the Tracking Store\n",
        "Set the MLflow tracking URI to point to your Azure ML Workspace. The subsequent logging calls from MLflow APIs will go to Azure ML services and will be tracked under your Workspace."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": []
      },
      "outputs": [],
      "source": [
        "mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Create Experiment\n",
        "Create an Mlflow Experiment to organize your runs. It can be set either by passing the name as a parameter in the mlflow.projects.run call or by the following,"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": []
      },
      "outputs": [],
      "source": [
        "experiment_name = \"train-project-local\"\n",
        "mlflow.set_experiment(experiment_name)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Create the Backend Configuration Object\n",
        "\n",
        "The backend configuration object will store necesary information for the integration such as the compute target and whether to use your local managed environment or a system managed environment. \n",
        "\n",
        "The integration will accept \"COMPUTE\" and \"USE_CONDA\" as parameters where \"COMPUTE\" is set to the name of a remote target (not applicable for this training example) and \"USE_CONDA\" which creates a new environment for the project from the environment configuration file. You must set this to \"False\" and include it in the backend configuration object if you want to use your local environment for the project run. Mlflow accepts a dictionary object or a JSON file."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# dictionary\n",
        "backend_config = {\"USE_CONDA\": False}\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Add the Integration to your Environment Configuration\n",
        "\n",
        "Add the azureml-mlflow package as a pip dependency to your environment configuration file (conda.yaml). The project can run without this addition, but key artifacts and metrics will not be logged to your Workspace. An example conda.yaml file is included in this notebook folder. Adding it to to the file will look like this,\n",
        "\n",
        "```\n",
        "name: mlflow-example\n",
        "channels:\n",
        "  - defaults\n",
        "  - anaconda\n",
        "  - conda-forge\n",
        "dependencies:\n",
        "  - python=3.6\n",
        "  - scikit-learn=0.19.1\n",
        "  - pip\n",
        "  - pip:\n",
        "    - mlflow\n",
        "    - azureml-mlflow\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## User Managed environment\n",
        "For using your local conda environment, set `use_conda = False` in the backend_config object. Ensure your local environment has all the necessary packages for running the project and you are specifying the **backend parameter** in any run call to be **\"azureml\"**."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": []
      },
      "outputs": [],
      "source": [
        "local_env_run = mlflow.projects.run(uri=\".\", \n",
        "                                    parameters={\"alpha\":0.3},\n",
        "                                    backend = \"azureml\",\n",
        "                                    use_conda=False,\n",
        "                                    backend_config = backend_config)\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "local_env_run"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Note: All these calculations were run on your local machine, in the conda environment you defined above. You can find the results in:\n",
        "- Your AzureML Experiments (a link to the run will be provided in the console)\n",
        "- ~/.azureml/envs/azureml_xxxx for the conda environment you just created\n",
        "- ~/AppData/Local/Temp/azureml_runs/train-on-local_xxxx for the machine learning models you trained (this path may differ depending on the platform you use). This folder also contains\n",
        "    - Logs (under azureml_logs/)\n",
        "    - Output pickled files (under outputs/)\n",
        "    - The configuration files (credentials, local and docker image setups)\n",
        "    - The train.py and mylib.py scripts\n",
        "    - The current notebook"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## System Mananged Environment\n",
        "\n",
        "Now, instead of managing the setup of the environment yourself, you can ask the system to build a new conda environment for you using the environment configuration file in this project. If a backend configuration object is not provided in the call, the integration will default to creating a new conda environment. The environment is built once, and will be reused in subsequent executions as long as the conda dependencies remain unchanged.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "tags": []
      },
      "outputs": [],
      "source": [
        "backend_config = {\"USE_CONDA\": True}\n",
        "local_mlproject_run = mlflow.projects.run(uri=\".\", \n",
        "                                    parameters={\"alpha\":0.3},\n",
        "                                    backend = \"azureml\",\n",
        "                                    backend_config = backend_config)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Next Steps \n",
        "\n",
        "Try out these notebooks to learn more about MLflow-Azure Machine Learning integration:\n",
        "\n",
        " * [Train a model using remote compute on Azure Cloud](../train-on-remote/train-on-remote.ipynb)\n",
        " * [Deploy the model as a web service](../deploy-model/deploy-model.ipynb)\n",
        " * [Train a model using Pytorch and MLflow](../../ml-frameworks/using-mlflow/train-and-deploy-pytorch)\n",
        "\n"
      ]
    }
  ],
  "metadata": {
    "authors": [
      {
        "name": "shipatel"
      }
    ],
    "category": "tutorial",
    "celltoolbar": "Edit Metadata",
    "compute": [
      "Local"
    ],
    "exclude_from_index": false,
    "framework": [
      "ScikitLearn"
    ],
    "friendly_name": "Use MLflow projects with Azure Machine Learning to train a model with local compute",
    "kernelspec": {
      "display_name": "Python 3.8 - AzureML",
      "language": "python",
      "name": "python38-azureml"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.8.5-final"
    },
    "tags": [
      "mlflow",
      "scikit"
    ],
    "task": "Use MLflow projects with Azure Machine Learning to train a model using local compute"
  },
  "nbformat": 4,
  "nbformat_minor": 2
}