{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Copyright (c) Microsoft Corporation. All rights reserved.\n", "\n", "Licensed under the MIT License" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/training/using-environments/using-environments.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Using environments\n", "\n", "\n", "## Contents\n", "\n", "1. [Introduction](#Introduction)\n", "1. [Setup](#Setup)\n", "1. [Create environment](#Create-environment)\n", " 1. Add Python packages\n", " 1. Specify environment variables\n", "1. [Submit run using environment](#Submit-run-using-environment)\n", "1. [Register environment](#Register-environment)\n", "1. [List and get existing environments](#List-and-get-existing-environments)\n", "1. [Other ways to create environments](#Other-ways-to-create-environments)\n", " 1. From existing Conda environment\n", " 1. From Conda or pip files\n", "1. [Docker settings](#Docker-settings)\n", "1. [Spark and Azure Databricks settings](#Spark-and-Azure-Databricks-settings)\n", "1. [Next steps](#Next-steps)\n", "\n", "## Introduction\n", "\n", "Azure ML environments are an encapsulation of the environment where your machine learning training happens. They define Python packages, environment variables, Docker settings and other attributes in declarative fashion. Environments are versioned: you can update them and retrieve old versions to revist and review your work.\n", "\n", "Environments allow you to:\n", "* Encapsulate dependencies of your training process, such as Python packages and their versions.\n", "* Reproduce the Python environment on your local computer in a remote run on VM or ML Compute cluster\n", "* Reproduce your experimentation environment in production setting.\n", "* Revisit and audit the environment in which an existing model was trained.\n", "\n", "Environment, compute target and training script together form run configuration: the full specification of training run.\n", "\n", "## Setup\n", "\n", "If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the [configuration notebook](../../../configuration.ipynb) first if you haven't.\n", "\n", "First, let's validate Azure ML SDK version and connect to workspace." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import azureml.core\n", "print(azureml.core.VERSION)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from azureml.core.workspace import Workspace\n", "ws = Workspace.from_config()\n", "ws.get_details()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create environment\n", "\n", "You can create an environment by instantiating ```Environment``` object and then setting its attributes: set of Python packages, environment variables and others.\n", "\n", "### Add Python packages\n", "\n", "The recommended way is to specify Conda packages, as they typically come with complete set of pre-built binaries." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from azureml.core import Environment\n", "from azureml.core.environment import CondaDependencies\n", "\n", "myenv = Environment(name=\"myenv\")\n", "conda_dep = CondaDependencies()\n", "conda_dep.add_conda_package(\"scikit-learn\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also add pip packages, and specify the version of package" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "conda_dep.add_pip_package(\"pillow==5.4.1\")\n", "myenv.python.conda_dependencies=conda_dep" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Specify environment variables\n", "\n", "You can add environment variables to your environment. These then become available using ```os.environ.get``` in your training script." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "myenv.environment_variables = {\"MESSAGE\":\"Hello from Azure Machine Learning\"}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Submit run using environment\n", "\n", "When you submit a run, you can specify which environment to use. \n", "\n", "On the first run in given environment, Azure ML spends some time building the environment. On the subsequent runs, Azure ML keeps track of changes and uses the existing environment, resulting in faster run completion." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from azureml.core import ScriptRunConfig, Experiment\n", "\n", "myexp = Experiment(workspace=ws, name = \"environment-example\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To submit a run, create a run configuration that combines the script file and environment, and pass it to ```Experiment.submit```. In this example, the script is submitted to local computer, but you can specify other compute targets such as remote clusters as well." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "runconfig = ScriptRunConfig(source_directory=\".\", script=\"example.py\")\n", "runconfig.run_config.target = \"local\"\n", "runconfig.run_config.environment = myenv\n", "run = myexp.submit(config=runconfig)\n", "\n", "run.wait_for_completion(show_output=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Register environment\n", "\n", "You can manage environments by registering them. This allows you to track their versions, and reuse them in future runs. For example, once you've constructed an environment that meets your requirements, you can register it and use it in other experiments so as to standardize your workflow.\n", "\n", "If you register the environment with same name, the version number is increased by one. Note that Azure ML keeps track of differences between the version, so if you re-register an identical version, the version number is not increased." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "myenv.register(workspace=ws)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## List and get existing environments\n", "\n", "Your workspace contains a dictionary of registered environments. You can then use ```Environment.get``` to retrieve a specific environment with specific version." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for name,env in ws.environments.items():\n", " print(\"Name {} \\t version {}\".format(name,env.version))\n", "\n", "restored_environment = Environment.get(workspace=ws,name=\"myenv\",version=\"1\")\n", "\n", "print(\"Attributes of restored environment\")\n", "restored_environment" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Other ways to create environments\n", "\n", "### From existing Conda environment\n", "\n", "You can create an environment from existing conda environment. This make it easy to reuse your local interactive environment in Azure ML remote runs. For example, if you've created conda environment using\n", "```\n", "conda create -n mycondaenv\n", "```\n", "you can create Azure ML environment out of that conda environment using\n", "```\n", "myenv = Environment.from_existing_conda_environment(name=\"myenv\",conda_environment_name=\"mycondaenv\")\n", "```\n", "\n", "### From conda or pip files\n", "\n", "You can create environments from conda specification or pip requirements files using\n", "```\n", "myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"path-to-conda-specification-file\")\n", "\n", "myenv = Environment.from_pip_requirements(name=\"myenv\", file_path=\"path-to-pip-requirements-file\")\n", "```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Docker settings\n", "\n", "Docker container provides an efficient way to encapsulate the dependencies. When you enable Docker, Azure ML builds a Docker image and creates a Python environment within that container, given your specifications. The Docker images are reused: the first run in a new environment typically takes longer as the image is build.\n", "\n", "**Note:** For runs on local computer or attached virtual machine, that computer must have Docker installed and enabled. Machine Learning Compute has Docker pre-installed.\n", "\n", "Attribute ```docker.enabled``` controls whether to use Docker container or host OS for execution. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "myenv.docker.enabled = True" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can specify custom Docker base image and registry. This allows you to customize and control in detail the guest OS in which your training run executes. whether to use GPU, whether to use shared volumes, and shm size." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "myenv.docker.base_image\n", "myenv.docker.base_image_registry" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also specify whether to use GPU or shared volumes, and shm size." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "myenv.docker.gpu_support\n", "myenv.docker.shared_volumes\n", "myenv.docker.shm_size" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Spark and Azure Databricks settings\n", "\n", "In addition to Python and Docker settings, Environment also contains attributes for Spark and Azure Databricks runs. These attributes become relevant when you submit runs on those compute targets." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Next steps\n", "\n", "Learn more about remote runs on different compute targets:\n", "\n", "* [Train on ML Compute](../../train-on-amlcompute)\n", "\n", "* [Train on remote VM](../../train-on-remote-vm)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "authors": [ { "name": "roastala" } ], "kernelspec": { "display_name": "Python 3.6", "language": "python", "name": "python36" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }