mirror of
https://github.com/Azure/MachineLearningNotebooks.git
synced 2025-12-20 01:27:06 -05:00
371 lines
12 KiB
Plaintext
371 lines
12 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
""
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
|
|
"\n",
|
|
"Licensed under the MIT License."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Train a model using a custom Docker image"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"In this tutorial, learn how to use a custom Docker image when training models with Azure Machine Learning.\n",
|
|
"\n",
|
|
"The example scripts in this article are used to classify pet images by creating a convolutional neural network. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Set up the experiment\n",
|
|
"This section sets up the training experiment by initializing a workspace, creating an experiment, and uploading the training data and training scripts."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Initialize a workspace\n",
|
|
"The Azure Machine Learning workspace is the top-level resource for the service. It provides you with a centralized place to work with all the artifacts you create. In the Python SDK, you can access the workspace artifacts by creating a `workspace` object.\n",
|
|
"\n",
|
|
"Create a workspace object from the config.json file."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core import Workspace\n",
|
|
"\n",
|
|
"ws = Workspace.from_config()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Prepare scripts\n",
|
|
"Create a directory titled `fastai-example`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"import os\n",
|
|
"os.makedirs('fastai-example', exist_ok=True)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Then run the cell below to create the training script train.py in the directory."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"jupyter": {
|
|
"outputs_hidden": false,
|
|
"source_hidden": false
|
|
},
|
|
"nteract": {
|
|
"transient": {
|
|
"deleting": false
|
|
}
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"%%writefile fastai-example/train.py\n",
|
|
"\n",
|
|
"from fastai.vision.all import *\n",
|
|
"\n",
|
|
"path = untar_data(URLs.PETS)\n",
|
|
"path.ls()\n",
|
|
"\n",
|
|
"files = get_image_files(path/\"images\")\n",
|
|
"len(files)\n",
|
|
"\n",
|
|
"#(Path('/home/ashwin/.fastai/data/oxford-iiit-pet/images/yorkshire_terrier_102.jpg'),Path('/home/ashwin/.fastai/data/oxford-iiit-pet/images/great_pyrenees_102.jpg'))\n",
|
|
"\n",
|
|
"def label_func(f): return f[0].isupper()\n",
|
|
"\n",
|
|
"#To get our data ready for a model, we need to put it in a DataLoaders object. Here we have a function that labels using the file names, so we will use ImageDataLoaders.from_name_func. There are other factory methods of ImageDataLoaders that could be more suitable for your problem, so make sure to check them all in vision.data.\n",
|
|
"\n",
|
|
"dls = ImageDataLoaders.from_name_func(path, files, label_func, item_tfms=Resize(224))\n",
|
|
"\n",
|
|
"#We have passed to this function the directory we're working in, the files we grabbed, our label_func and one last piece as item_tfms: this is a Transform applied on all items of our dataset that will resize each imge to 224 by 224, by using a random crop on the largest dimension to make it a square, then resizing to 224 by 224. If we didn't pass this, we would get an error later as it would be impossible to batch the items together.\n",
|
|
"\n",
|
|
"dls.show_batch()\n",
|
|
"\n",
|
|
"learn = cnn_learner(dls, resnet34, metrics=error_rate)\n",
|
|
"learn.fine_tune(1)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Define your environment\n",
|
|
"Create an environment object and enable Docker."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core import Environment\n",
|
|
"\n",
|
|
"fastai_env = Environment(\"fastai\")\n",
|
|
"fastai_env.docker.enabled = True"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"This specified base image supports the fast.ai library which allows for distributed deep learning capabilities. For more information, see the [fast.ai DockerHub](https://hub.docker.com/u/fastdotai). \n",
|
|
"\n",
|
|
"When you are using your custom Docker image, you might already have your Python environment properly set up. In that case, set the `user_managed_dependencies` flag to True in order to leverage your custom image's built-in python environment."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"fastai_env.docker.base_image = \"fastdotai/fastai:latest\"\n",
|
|
"fastai_env.python.user_managed_dependencies = True"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"To use an image from a private container registry that is not in your workspace, you must use `docker.base_image_registry` to specify the address of the repository as well as a username and password."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"```python\n",
|
|
"fastai_env.docker.base_image_registry.address = \"myregistry.azurecr.io\"\n",
|
|
"fastai_env.docker.base_image_registry.username = \"username\"\n",
|
|
"fastai_env.docker.base_image_registry.password = \"password\"\n",
|
|
"```"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"It is also possible to use a custom Dockerfile. Use this approach if you need to install non-Python packages as dependencies and remember to set the base image to None. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"Specify docker steps as a string:\n",
|
|
"```python \n",
|
|
"dockerfile = r\"\"\" \\\n",
|
|
"FROM mcr.microsoft.com/azureml/base:intelmpi2018.3-ubuntu16.04\n",
|
|
"RUN echo \"Hello from custom container!\" \\\n",
|
|
"\"\"\"\n",
|
|
"```\n",
|
|
"Set base image to None, because the image is defined by dockerfile:\n",
|
|
"```python\n",
|
|
"fastai_env.docker.base_image = None \\\n",
|
|
"fastai_env.docker.base_dockerfile = dockerfile\n",
|
|
"```\n",
|
|
"Alternatively, load the string from a file:\n",
|
|
"```python\n",
|
|
"fastai_env.docker.base_image = None \\\n",
|
|
"fastai_env.docker.base_dockerfile = \"./Dockerfile\"\n",
|
|
"```"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Create or attach existing AmlCompute\n",
|
|
"You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for training your model. In this tutorial, you create `AmlCompute` as your training compute resource.\n",
|
|
"\n",
|
|
"**Creation of AmlCompute takes approximately 5 minutes.** If the AmlCompute with that name is already in your workspace this code will skip the creation process.\n",
|
|
"\n",
|
|
"As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
|
"from azureml.core.compute_target import ComputeTargetException\n",
|
|
"\n",
|
|
"# choose a name for your cluster\n",
|
|
"cluster_name = \"gpu-cluster\"\n",
|
|
"\n",
|
|
"try:\n",
|
|
" compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
|
|
" print('Found existing compute target.')\n",
|
|
"except ComputeTargetException:\n",
|
|
" print('Creating a new compute target...')\n",
|
|
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6',\n",
|
|
" max_nodes=4)\n",
|
|
"\n",
|
|
" # create the cluster\n",
|
|
" compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
|
|
"\n",
|
|
" compute_target.wait_for_completion(show_output=True)\n",
|
|
"\n",
|
|
"# use get_status() to get a detailed status for the current AmlCompute\n",
|
|
"print(compute_target.get_status().serialize())"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Create a ScriptRunConfig\n",
|
|
"This ScriptRunConfig will configure your job for execution on the desired compute target."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"jupyter": {
|
|
"outputs_hidden": false,
|
|
"source_hidden": false
|
|
},
|
|
"nteract": {
|
|
"transient": {
|
|
"deleting": false
|
|
}
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core import ScriptRunConfig\n",
|
|
"\n",
|
|
"fastai_config = ScriptRunConfig(source_directory='fastai-example',\n",
|
|
" script='train.py',\n",
|
|
" compute_target=compute_target,\n",
|
|
" environment=fastai_env)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Submit your run\n",
|
|
"When a training run is submitted using a ScriptRunConfig object, the submit method returns an object of type ScriptRun. The returned ScriptRun object gives you programmatic access to information about the training run. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {
|
|
"jupyter": {
|
|
"outputs_hidden": false,
|
|
"source_hidden": false
|
|
},
|
|
"nteract": {
|
|
"transient": {
|
|
"deleting": false
|
|
}
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"from azureml.core import Experiment\n",
|
|
"\n",
|
|
"run = Experiment(ws,'fastai-custom-image').submit(fastai_config)\n",
|
|
"run.wait_for_completion(show_output=True)"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"authors": [
|
|
{
|
|
"name": "sagopal"
|
|
}
|
|
],
|
|
"category": "training",
|
|
"compute": [
|
|
"AML Compute"
|
|
],
|
|
"datasets": [
|
|
"Oxford IIIT Pet"
|
|
],
|
|
"deployment": [
|
|
"None"
|
|
],
|
|
"exclude_from_index": false,
|
|
"framework": [
|
|
"Pytorch"
|
|
],
|
|
"friendly_name": "Train a model with a custom Docker image",
|
|
"index_order": 1,
|
|
"kernelspec": {
|
|
"display_name": "Python 3.6",
|
|
"language": "python",
|
|
"name": "python36"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.6.7"
|
|
},
|
|
"nteract": {
|
|
"version": "nteract-front-end@1.0.0"
|
|
},
|
|
"tags": [
|
|
"None"
|
|
],
|
|
"task": "Train with custom Docker image"
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 2
|
|
} |