update samples from Release-137 as a part of 1.0.53 SDK release

vizhur
2019-07-24 22:37:36 +00:00
parent ddfce6b24c
commit ee1da0ee19
57 changed files with 2778 additions and 511 deletions

@@ -38,6 +38,7 @@ The [How to use Azure ML](./how-to-use-azureml) folder contains specific example
- [Machine Learning Pipelines](./how-to-use-azureml/machine-learning-pipelines) - Examples showing how to create and use reusable pipelines for training and batch scoring
- [Deployment](./how-to-use-azureml/deployment) - Examples showing how to deploy and manage machine learning models and solutions
- [Azure Databricks](./how-to-use-azureml/azure-databricks) - Examples showing how to use Azure ML with Azure Databricks
- [Monitor Models](./how-to-use-azureml/monitor-models) - Examples showing how to enable model monitoring services such as DataDrift
---
## Documentation
@@ -52,6 +53,7 @@ The [How to use Azure ML](./how-to-use-azureml) folder contains specific example
Visit the following repos to see projects contributed by Azure ML users:
- [AMLSamples](https://github.com/Azure/AMLSamples) A number of end-to-end examples, including face recognition, predictive maintenance, customer churn and sentiment analysis.
- [Fine tune natural language processing models using Azure Machine Learning service](https://github.com/Microsoft/AzureML-BERT)
- [Fashion MNIST with Azure ML SDK](https://github.com/amynic/azureml-sdk-fashion)

@@ -103,7 +103,7 @@
"source": [
"import azureml.core\n",
"\n",
"print(\"This notebook was created using version 1.0.48\r\n of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.0.53 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},

@@ -2,6 +2,7 @@ name: azure_automl
dependencies:
# The python interpreter version.
# Currently Azure ML only supports 3.5.2 and later.
- pip
- nomkl
- python>=3.5.2,<3.6.8
- nb_conda
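
The `python>=3.5.2,<3.6.8` pin above can also be checked from inside a running interpreter. The following is a minimal sketch, assuming the bounds are exactly those in this environment file; the helper name `in_supported_range` is hypothetical and not part of the Azure ML SDK.

```python
import sys

# Bounds copied from the conda spec above: python>=3.5.2,<3.6.8
MIN_VERSION = (3, 5, 2)  # inclusive lower bound
MAX_VERSION = (3, 6, 8)  # exclusive upper bound


def in_supported_range(version=None):
    """Return True if `version` (major, minor, micro) satisfies the pin.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version is None:
        version = tuple(sys.version_info[:3])
    return MIN_VERSION <= tuple(version) < MAX_VERSION


if __name__ == "__main__":
    print("Interpreter", tuple(sys.version_info[:3]), "in range:", in_supported_range())
```

Tuple comparison handles the component-wise ordering, so no string parsing is needed.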

@@ -578,7 +578,7 @@
"metadata": {
"authors": [
{
"name": "xiaga@microsoft.com, tosingli@microsoft.com, erwright@microsoft.com"
"name": "erwright"
}
],
"kernelspec": {

@@ -587,7 +587,7 @@
"metadata": {
"authors": [
{
"name": "xiaga, tosingli, erwright"
"name": "erwright"
}
],
"kernelspec": {

@@ -829,7 +829,7 @@
"metadata": {
"authors": [
{
"name": "erwright, tosingli"
"name": "erwright"
}
],
"kernelspec": {

@@ -87,7 +87,7 @@ These instruction setup the integration for SQL Server 2017 on Windows.
sudo /opt/mssql/mlservices/bin/python/python -m pip install --upgrade sklearn
```
7. Start SQL Server.
8. Execute the files aml_model.sql, aml_connection.sql, AutoMLGetMetrics.sql, AutoMLPredict.sql and AutoMLTrain.sql in SQL Server Management Studio.
8. Execute the files aml_model.sql, aml_connection.sql, AutoMLGetMetrics.sql, AutoMLPredict.sql, AutoMLForecast.sql and AutoMLTrain.sql in SQL Server Management Studio.
9. Create an Azure Machine Learning Workspace. You can use the instructions at: [https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-workspace)
10. Create a config.json file using the subscription id, resource group name and workspace name that you used to create the workspace. The file is described at: [https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-environment#workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-environment#workspace)
11. Create an Azure service principal. You can do this with the commands:
@@ -109,5 +109,5 @@ First you need to load the sample data in the database.
You can then run the queries in the energy-demand folder:
* TrainEnergyDemand.sql runs AutoML, trains multiple models on data and selects the best model.
* PredictEnergyDemand.sql predicts based on the most recent training run.
* ForecastEnergyDemand.sql forecasts based on the most recent training run.
* GetMetrics.sql returns all the metrics for each model in the most recent training run.

@@ -12,7 +12,7 @@ Easily create and train a model using various deep neural networks (DNNs) as a f
To learn more about the azureml-accel-model classes, see the section [Model Classes](#model-classes) below or the [Azure ML Accel Models SDK documentation](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel?view=azure-ml-py).
### Step 1: Create an Azure ML workspace
Follow [these instructions](https://docs.microsoft.com/en-us/azure/machine-learning/service/quickstart-create-workspace-with-python) to install the Azure ML SDK on your local machine, create an Azure ML workspace, and set up your notebook environment, which is required for the next step.
Follow [these instructions](https://docs.microsoft.com/en-us/azure/machine-learning/service/setup-create-workspace) to install the Azure ML SDK on your local machine, create an Azure ML workspace, and set up your notebook environment, which is required for the next step.
### Step 2: Check your FPGA quota
Use the Azure CLI to check whether you have quota.

@@ -1,5 +1,12 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/deployment/accelerated-models/accelerated-models-object-detection.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -230,11 +237,14 @@
"\n",
"# Convert model\n",
"convert_request = AccelOnnxConverter.convert_tf_model(ws, registered_model, input_tensors, output_tensors_str)\n",
"# If it fails, you can run wait_for_completion again with show_output=True.\n",
"convert_request.wait_for_completion(show_output=False)\n",
"converted_model = convert_request.result\n",
"print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
"if convert_request.wait_for_completion(show_output = False):\n",
" # If the above call succeeded, get the converted model\n",
" converted_model = convert_request.result\n",
" print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
" converted_model.id, converted_model.created_time, '\\n')\n",
"else:\n",
" print(\"Model conversion failed. Showing output.\")\n",
" convert_request.wait_for_completion(show_output = True)\n",
"\n",
"# Package into AccelContainerImage\n",
"image_config = AccelContainerImage.image_configuration()\n",
@@ -298,6 +308,7 @@
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"aks_target.wait_for_completion(show_output = True)\n",
"print(aks_target.provisioning_state)\n",
"print(aks_target.provisioning_errors)"
@@ -316,6 +327,7 @@
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"from azureml.core.webservice import Webservice, AksWebservice\n",
"\n",
"# Set the web service configuration (for creating a test service, we don't want autoscale enabled)\n",
@@ -342,10 +354,9 @@
"## 5. Test the service\n",
"<a id=\"create-client\"></a>\n",
"### 5.a. Create Client\n",
"The image supports gRPC and the TensorFlow Serving \"predict\" API. We have a client that can call into the docker image to get predictions. \n",
"\n",
"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice.deploy_configuration(), see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).",
"The image supports gRPC and the TensorFlow Serving \"predict\" API. We will create a PredictionClient from the Webservice object that can call into the docker image to get predictions. If you do not have the Webservice object, you can also create [PredictionClient](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel.predictionclient?view=azure-ml-py) directly.\n",
"\n",
"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice.deploy_configuration(), see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).\n",
"**WARNING:** If you are running on Azure Notebooks free compute, you will not be able to make outgoing calls to your service. Try locating your client on a different machine to consume it."
]
},
@@ -356,18 +367,10 @@
"outputs": [],
"source": [
"# Using the grpc client in AzureML Accelerated Models SDK\n",
"from azureml.accel.client import PredictionClient\n",
"\n",
"address = aks_service.scoring_uri\n",
"ssl_enabled = address.startswith(\"https\")\n",
"address = address[address.find('/')+2:].strip('/')\n",
"port = 443 if ssl_enabled else 80\n",
"from azureml.accel import client_from_service\n",
"\n",
"# Initialize AzureML Accelerated Models client\n",
"client = PredictionClient(address=address,\n",
" port=port,\n",
" use_ssl=ssl_enabled,\n",
" service_name=aks_service.name)"
"client = client_from_service(aks_service)"
]
},
{
@@ -486,7 +489,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.0"
"version": "3.5.6"
}
},
"nbformat": 4,
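
The lines removed in this notebook derived a host and port from `aks_service.scoring_uri` by string slicing before constructing the client. As a rough illustration of that derivation only, here is a sketch using `urllib.parse` from the standard library. The function name `split_scoring_uri` and the sample URIs are assumptions, and nothing here describes how `client_from_service` works internally.

```python
from urllib.parse import urlparse


def split_scoring_uri(scoring_uri):
    """Split a scoring URI into (host, port, use_ssl).

    Hypothetical helper: follows the same defaults as the removed notebook
    code, which used port 443 for https and 80 otherwise.
    """
    parsed = urlparse(scoring_uri)
    use_ssl = parsed.scheme == "https"
    # Unlike the removed code, honour an explicit port if the URI has one;
    # otherwise fall back to the scheme default.
    port = parsed.port or (443 if use_ssl else 80)
    return parsed.hostname, port, use_ssl


if __name__ == "__main__":
    print(split_scoring_uri("https://example.test/score"))
```

With the change in this commit, none of this is needed when a Webservice object is at hand, since the client is built from the service directly.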

@@ -0,0 +1,8 @@
name: accelerated-models-object-detection
dependencies:
- pip:
- azureml-sdk
- azureml-accel-models
- tensorflow
- opencv-python
- matplotlib

@@ -1,5 +1,12 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/deployment/accelerated-models/accelerated-models-quickstart.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -270,12 +277,15 @@
"from azureml.accel import AccelOnnxConverter\n",
"\n",
"convert_request = AccelOnnxConverter.convert_tf_model(ws, registered_model, input_tensors, output_tensors)\n",
"# If it fails, you can run wait_for_completion again with show_output=True.\n",
"convert_request.wait_for_completion(show_output = False)\n",
"# If the above call succeeded, get the converted model\n",
"converted_model = convert_request.result\n",
"print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
" converted_model.id, converted_model.created_time, '\\n')"
"\n",
"if convert_request.wait_for_completion(show_output = False):\n",
" # If the above call succeeded, get the converted model\n",
" converted_model = convert_request.result\n",
" print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
" converted_model.id, converted_model.created_time, '\\n')\n",
"else:\n",
" print(\"Model conversion failed. Showing output.\")\n",
" convert_request.wait_for_completion(show_output = True)"
]
},
{
@@ -366,6 +376,7 @@
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"aks_target.wait_for_completion(show_output = True)\n",
"print(aks_target.provisioning_state)\n",
"print(aks_target.provisioning_errors)"
@@ -384,9 +395,10 @@
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"from azureml.core.webservice import Webservice, AksWebservice\n",
"\n",
"#Set the web service configuration (for creating a test service, we don't want autoscale enabled)\n",
"# Set the web service configuration (for creating a test service, we don't want autoscale enabled)\n",
"# Authentication is enabled by default, but for testing we specify False\n",
"aks_config = AksWebservice.deploy_configuration(autoscale_enabled=False,\n",
" num_replicas=1,\n",
@@ -415,10 +427,9 @@
"metadata": {},
"source": [
"### 7.a. Create Client\n",
"The image supports gRPC and the TensorFlow Serving \"predict\" API. We have a client that can call into the docker image to get predictions.\n",
"\n",
"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice, see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).",
"The image supports gRPC and the TensorFlow Serving \"predict\" API. We will create a PredictionClient from the Webservice object that can call into the docker image to get predictions. If you do not have the Webservice object, you can also create [PredictionClient](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel.predictionclient?view=azure-ml-py) directly.\n",
"\n",
"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice, see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).\n",
"**WARNING:** If you are running on Azure Notebooks free compute, you will not be able to make outgoing calls to your service. Try locating your client on a different machine to consume it."
]
},
@@ -429,18 +440,10 @@
"outputs": [],
"source": [
"# Using the grpc client in AzureML Accelerated Models SDK\n",
"from azureml.accel.client import PredictionClient\n",
"\n",
"address = aks_service.scoring_uri\n",
"ssl_enabled = address.startswith(\"https\")\n",
"address = address[address.find('/')+2:].strip('/')\n",
"port = 443 if ssl_enabled else 80\n",
"from azureml.accel import client_from_service\n",
"\n",
"# Initialize AzureML Accelerated Models client\n",
"client = PredictionClient(address=address,\n",
" port=port,\n",
" use_ssl=ssl_enabled,\n",
" service_name=aks_service.name)"
"client = client_from_service(aks_service)"
]
},
{
@@ -540,7 +543,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.0"
"version": "3.5.6"
}
},
"nbformat": 4,
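
The conversion cells in these notebooks now branch on the return value of `wait_for_completion` and read `.result` only on success. The control flow can be sketched without the SDK by using a stand-in object. `FakeConvertRequest` and `get_converted_model` are hypothetical names for illustration and are not azureml classes or functions.

```python
class FakeConvertRequest:
    """Stand-in for the conversion request object (hypothetical, not the SDK class)."""

    def __init__(self, succeeds, result=None):
        self._succeeds = succeeds
        self.result = result
        self.calls = []  # records each show_output value, for inspection

    def wait_for_completion(self, show_output=False):
        self.calls.append(show_output)
        return self._succeeds


def get_converted_model(convert_request):
    """Return the converted model on success; otherwise replay the output and return None."""
    if convert_request.wait_for_completion(show_output=False):
        return convert_request.result
    print("Model conversion failed. Showing output.")
    convert_request.wait_for_completion(show_output=True)
    return None


if __name__ == "__main__":
    print(get_converted_model(FakeConvertRequest(True, result="model-v1")))
```

The second call with `show_output=True` happens only on the failure path, which keeps the successful case quiet.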

@@ -0,0 +1,6 @@
name: accelerated-models-quickstart
dependencies:
- pip:
- azureml-sdk
- azureml-accel-models
- tensorflow

@@ -1,5 +1,12 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/deployment/accelerated-models/accelerated-models-training.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -410,6 +417,7 @@
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"# Launch the training\n",
"tf.reset_default_graph()\n",
"sess = tf.Session(graph=tf.get_default_graph())\n",
@@ -582,11 +590,14 @@
"\n",
"# Convert model\n",
"convert_request = AccelOnnxConverter.convert_tf_model(ws, registered_model, input_tensors, output_tensors)\n",
"# If it fails, you can run wait_for_completion again with show_output=True.\n",
"convert_request.wait_for_completion(show_output=False)\n",
"converted_model = convert_request.result\n",
"print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
"if convert_request.wait_for_completion(show_output = False):\n",
" # If the above call succeeded, get the converted model\n",
" converted_model = convert_request.result\n",
" print(\"\\nSuccessfully converted: \", converted_model.name, converted_model.url, converted_model.version, \n",
" converted_model.id, converted_model.created_time, '\\n')\n",
"else:\n",
" print(\"Model conversion failed. Showing output.\")\n",
" convert_request.wait_for_completion(show_output = True)\n",
"\n",
"# Package into AccelContainerImage\n",
"image_config = AccelContainerImage.image_configuration()\n",
@@ -655,6 +666,7 @@
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"aks_target.wait_for_completion(show_output = True)\n",
"print(aks_target.provisioning_state)\n",
"print(aks_target.provisioning_errors)"
@@ -673,6 +685,7 @@
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"from azureml.core.webservice import Webservice, AksWebservice\n",
"\n",
"# Set the web service configuration (for creating a test service, we don't want autoscale enabled)\n",
@@ -700,10 +713,9 @@
"\n",
"<a id=\"create-client\"></a>\n",
"### 9.a. Create Client\n",
"The image supports gRPC and the TensorFlow Serving \"predict\" API. We have a client that can call into the docker image to get predictions. \n",
"\n",
"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice.deploy_configuration(), see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).",
"The image supports gRPC and the TensorFlow Serving \"predict\" API. We will create a PredictionClient from the Webservice object that can call into the docker image to get predictions. If you do not have the Webservice object, you can also create [PredictionClient](https://docs.microsoft.com/en-us/python/api/azureml-accel-models/azureml.accel.predictionclient?view=azure-ml-py) directly.\n",
"\n",
"**Note:** If you chose to use auth_enabled=True when creating your AksWebservice.deploy_configuration(), see documentation [here](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.webservice(class)?view=azure-ml-py#get-keys--) on how to retrieve your keys and use either key as an argument to PredictionClient(...,access_token=key).\n",
"**WARNING:** If you are running on Azure Notebooks free compute, you will not be able to make outgoing calls to your service. Try locating your client on a different machine to consume it."
]
},
@@ -714,18 +726,10 @@
"outputs": [],
"source": [
"# Using the grpc client in AzureML Accelerated Models SDK\n",
"from azureml.accel.client import PredictionClient\n",
"\n",
"address = aks_service.scoring_uri\n",
"ssl_enabled = address.startswith(\"https\")\n",
"address = address[address.find('/')+2:].strip('/')\n",
"port = 443 if ssl_enabled else 80\n",
"from azureml.accel import client_from_service\n",
"\n",
"# Initialize AzureML Accelerated Models client\n",
"client = PredictionClient(address=address,\n",
" port=port,\n",
" use_ssl=ssl_enabled,\n",
" service_name=aks_service.name)"
"client = client_from_service(aks_service)"
]
},
{
@@ -854,7 +858,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.0"
"version": "3.5.6"
}
},
"nbformat": 4,

@@ -0,0 +1,9 @@
name: accelerated-models-training
dependencies:
- pip:
- azureml-sdk
- azureml-accel-models
- tensorflow
- keras
- tqdm
- sklearn

@@ -150,7 +150,9 @@
"> Estimator object initialization involves specifying a list of DataReference objects in its 'inputs' parameter.\n",
" In Pipelines, a step can take another step's output or DataReferences as input. So when creating an EstimatorStep,\n",
" the parameters 'inputs' and 'outputs' need to be set explicitly and that will override 'inputs' parameter\n",
" specified in the Estimator object."
" specified in the Estimator object.\n",
" \n",
"> The best practice is to use a separate folder for each step's scripts and their dependent files, and to specify that folder as the `source_directory` for the step. This reduces the size of the snapshot created for the step (only that folder is snapshotted). Because a change to any file in the `source_directory` triggers a re-upload of the snapshot, it also preserves step reuse when nothing in that step's `source_directory` has changed."
]
},
{
@@ -170,7 +172,9 @@
" data_reference_name=\"input_data\",\n",
" path_on_datastore=\"20newsgroups/20news.pkl\")\n",
"\n",
"output = PipelineData(\"output\", datastore=def_blob_store)"
"output = PipelineData(\"output\", datastore=def_blob_store)\n",
"\n",
"source_directory = 'estimator_train'"
]
},
{
@@ -181,7 +185,7 @@
"source": [
"from azureml.train.estimator import Estimator\n",
"\n",
"est = Estimator(source_directory='.', \n",
"est = Estimator(source_directory=source_directory, \n",
" compute_target=cpu_cluster, \n",
" entry_script='dummy_train.py', \n",
" conda_packages=['scikit-learn'])"

@@ -88,7 +88,11 @@
"metadata": {},
"source": [
"## Create an Azure ML experiment\n",
"Let's create an experiment named \"tf-mnist\" and a folder to hold the training scripts. The script runs will be recorded under the experiment in Azure.\n"
"Let's create an experiment named \"tf-mnist\" and a folder to hold the training scripts. \n",
"\n",
"> The best practice is to use a separate folder for each step's scripts and their dependent files. This reduces the size of the snapshot created for the step (only that folder is snapshotted). Because a change to any file in the `source_directory` triggers a re-upload of the snapshot, it also preserves step reuse when nothing in that step's `source_directory` has changed. \n",
"\n",
"> The script runs will be recorded under the experiment in Azure."
]
},
{

@@ -57,10 +57,8 @@
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')\n",
"\n",
"# Default datastore (Azure file storage)\n",
"def_file_store = ws.get_default_datastore() \n",
"print(\"Default datastore's name: {}\".format(def_file_store.name))\n",
"\n",
"# Default datastore (Azure blob storage)\n",
"# def_blob_store = ws.get_default_datastore()\n",
"def_blob_store = Datastore(ws, \"workspaceblobstore\")\n",
"print(\"Blobstore's name: {}\".format(def_blob_store.name))"
]
@@ -147,7 +145,9 @@
"#### Define a Step that consumes a datasource and produces intermediate data.\n",
"In this step, we define a step that consumes a datasource and produces intermediate data.\n",
"\n",
"**Open `train.py` in the local machine and examine the arguments, inputs, and outputs for the script. That will give you a good sense of why the script argument names used below are important.** "
"**Open `train.py` in the local machine and examine the arguments, inputs, and outputs for the script. That will give you a good sense of why the script argument names used below are important.** \n",
"\n",
"The best practice is to use a separate folder for each step's scripts and their dependent files, and to specify that folder as the `source_directory` for the step. This reduces the size of the snapshot created for the step (only that folder is snapshotted). Because a change to any file in the `source_directory` triggers a re-upload of the snapshot, it also preserves step reuse when nothing in that step's `source_directory` has changed."
]
},
{
@@ -158,13 +158,16 @@
"source": [
"# trainStep consumes the datasource (Datareference) in the previous step\n",
"# and produces processed_data1\n",
"\n",
"source_directory = \"publish_run_train\"\n",
"\n",
"trainStep = PythonScriptStep(\n",
" script_name=\"train.py\", \n",
" arguments=[\"--input_data\", blob_input_data, \"--output_train\", processed_data1],\n",
" inputs=[blob_input_data],\n",
" outputs=[processed_data1],\n",
" compute_target=aml_compute, \n",
" source_directory='.'\n",
" source_directory=source_directory\n",
")\n",
"print(\"trainStep created\")"
]
@@ -188,6 +191,7 @@
"# extractStep to use the intermediate data produced by step4\n",
"# This step also produces an output processed_data2\n",
"processed_data2 = PipelineData(\"processed_data2\", datastore=def_blob_store)\n",
"source_directory = \"publish_run_extract\"\n",
"\n",
"extractStep = PythonScriptStep(\n",
" script_name=\"extract.py\",\n",
@@ -195,7 +199,7 @@
" inputs=[processed_data1],\n",
" outputs=[processed_data2],\n",
" compute_target=aml_compute, \n",
" source_directory='.')\n",
" source_directory=source_directory)\n",
"print(\"extractStep created\")"
]
},
@@ -247,8 +251,7 @@
"source": [
"# Now define step6 that takes two inputs (both intermediate data), and produce an output\n",
"processed_data3 = PipelineData(\"processed_data3\", datastore=def_blob_store)\n",
"\n",
"\n",
"source_directory = \"publish_run_compare\"\n",
"\n",
"compareStep = PythonScriptStep(\n",
" script_name=\"compare.py\",\n",
@@ -256,7 +259,7 @@
" inputs=[processed_data1, processed_data2],\n",
" outputs=[processed_data3], \n",
" compute_target=aml_compute, \n",
" source_directory='.')\n",
" source_directory=source_directory)\n",
"print(\"compareStep created\")"
]
},

@@ -103,7 +103,7 @@
"metadata": {},
"source": [
"### Define a pipeline step\n",
"Define a single step pipeline for demonstration purpose."
"Define a single-step pipeline for demonstration purposes. The best practice is to use a separate folder for each step's scripts and their dependent files, and to specify that folder as the `source_directory` for the step. This reduces the size of the snapshot created for the step (only that folder is snapshotted). Because a change to any file in the `source_directory` triggers a re-upload of the snapshot, it also preserves step reuse when nothing in that step's `source_directory` has changed."
]
},
{
@@ -114,11 +114,13 @@
"source": [
"from azureml.pipeline.steps import PythonScriptStep\n",
"\n",
"source_directory = \"publish_run_train\"\n",
"\n",
"trainStep = PythonScriptStep(\n",
" name=\"Training_Step\",\n",
" script_name=\"train.py\", \n",
" compute_target=aml_compute_target, \n",
" source_directory='.'\n",
" source_directory=source_directory\n",
")\n",
"print(\"TrainStep created\")"
]

@@ -76,7 +76,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Initialization, Steps to create a Pipeline"
"#### Initialization, Steps to create a Pipeline\n",
"\n",
"The best practice is to use a separate folder for each step's scripts and their dependent files, and to specify that folder as the `source_directory` for the step. This reduces the size of the snapshot created for the step (only that folder is snapshotted). Because a change to any file in the `source_directory` triggers a re-upload of the snapshot, it also preserves step reuse when nothing in that step's `source_directory` has changed."
]
},
{
@@ -105,7 +107,7 @@
" aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
"\n",
"# source_directory\n",
"source_directory = '.'\n",
"source_directory = 'publish_run_train'\n",
"# define a single step pipeline for demonstration purpose.\n",
"trainStep = PythonScriptStep(\n",
" name=\"Training_Step\",\n",

@@ -290,7 +290,9 @@
"- **priority:** the priority value to use for the current job *(optional)*\n",
"- **runtime_version:** the runtime version of the Data Lake Analytics engine *(optional)*\n",
"- **source_directory:** folder that contains the script, assemblies etc. *(optional)*\n",
"- **hash_paths:** list of paths to hash to detect a change (script file is always hashed) *(optional)*"
"- **hash_paths:** list of paths to hash to detect a change (script file is always hashed) *(optional)*\n",
"\n",
"The best practice is to use a separate folder for each step's scripts and their dependent files, and to specify that folder as the `source_directory` for the step. This reduces the size of the snapshot created for the step (only that folder is snapshotted). Because a change to any file in the `source_directory` triggers a re-upload of the snapshot, it also preserves step reuse when nothing in that step's `source_directory` has changed."
]
},
{

@@ -175,7 +175,7 @@
"metadata": {},
"source": [
"## Data Connections with Inputs and Outputs\n",
"The DatabricksStep supports Azure Blob and ADLS for inputs and outputs. You also will need to define a [Secrets](https://docs.azuredatabricks.net/user-guide/secrets/index.html) scope to enable authentication to external data sources such as Blob and ADLS from Databricks.\n",
"The DatabricksStep supports DBFS, Azure Blob and ADLS for inputs and outputs. You also will need to define a [Secrets](https://docs.azuredatabricks.net/user-guide/secrets/index.html) scope to enable authentication to external data sources such as Blob and ADLS from Databricks.\n",
"\n",
"- Databricks documentation on [Azure Blob](https://docs.azuredatabricks.net/spark/latest/data-sources/azure/azure-storage.html)\n",
"- Databricks documentation on [ADLS](https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake.html)\n",

@@ -108,7 +108,9 @@
"metadata": {},
"source": [
"## Create an Azure ML experiment\n",
"Let's create an experiment named \"automl-classification\" and a folder to hold the training scripts. The script runs will be recorded under the experiment in Azure.\n"
"Let's create an experiment named \"automl-classification\" and a folder to hold the training scripts. The script runs will be recorded under the experiment in Azure.\n",
"\n",
"The best practice is to use a separate folder for each step's scripts and their dependent files, and to specify that folder as the `source_directory` for the step. This reduces the size of the snapshot created for the step (only that folder is snapshotted). Because a change to any file in the `source_directory` triggers a re-upload of the snapshot, it also preserves step reuse when nothing in that step's `source_directory` has changed."
]
},
{

@@ -76,14 +76,20 @@
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')\n",
"\n",
"# Default datastore (Azure file storage)\n",
"def_file_store = ws.get_default_datastore() \n",
"print(\"Default datastore's name: {}\".format(def_file_store.name))\n",
"\n",
"# Default datastore (Azure blob storage)\n",
"# def_blob_store = ws.get_default_datastore()\n",
"def_blob_store = Datastore(ws, \"workspaceblobstore\")\n",
"print(\"Blobstore's name: {}\".format(def_blob_store.name))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Source Directory\n",
"The best practice is to use a separate folder for each step's scripts and their dependent files, and to specify that folder as the `source_directory` for the step. This reduces the size of the snapshot created for the step (only that folder is snapshotted). Because a change to any file in the `source_directory` triggers a re-upload of the snapshot, it also preserves step reuse when nothing in that step's `source_directory` has changed."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -91,7 +97,7 @@
"outputs": [],
"source": [
"# source directory\n",
"source_directory = '.'\n",
"source_directory = 'data_dependency_run_train'\n",
" \n",
"print('Sample scripts will be created in {} directory.'.format(source_directory))"
]
@@ -340,6 +346,7 @@
"# step5 to use the intermediate data produced by step4\n",
"# This step also produces an output processed_data2\n",
"processed_data2 = PipelineData(\"processed_data2\", datastore=def_blob_store)\n",
"source_directory = \"data_dependency_run_extract\"\n",
"\n",
"extractStep = PythonScriptStep(\n",
" script_name=\"extract.py\",\n",
@@ -386,6 +393,7 @@
"source": [
"# Now define the compare step which takes two inputs and produces an output\n",
"processed_data3 = PipelineData(\"processed_data3\", datastore=def_blob_store)\n",
"source_directory = \"data_dependency_run_compare\"\n",
"\n",
"compareStep = PythonScriptStep(\n",
" script_name=\"compare.py\",\n",

@@ -0,0 +1,24 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.
import argparse
import os
print("In compare.py")
print("As a data scientist, this is where I use my compare code.")
parser = argparse.ArgumentParser("compare")
parser.add_argument("--compare_data1", type=str, help="compare_data1 data")
parser.add_argument("--compare_data2", type=str, help="compare_data2 data")
parser.add_argument("--output_compare", type=str, help="output_compare directory")
parser.add_argument("--pipeline_param", type=int, help="pipeline parameter")
args = parser.parse_args()
print("Argument 1: %s" % args.compare_data1)
print("Argument 2: %s" % args.compare_data2)
print("Argument 3: %s" % args.output_compare)
print("Argument 4: %s" % args.pipeline_param)
if args.output_compare is not None:
    os.makedirs(args.output_compare, exist_ok=True)
    print("%s created" % args.output_compare)
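
As a quick local check, the argument handling in a step script like this one can be exercised without submitting a pipeline. The sketch below rebuilds the same parser inside a hypothetical `build_parser` wrapper and passes an explicit argument list; the input names, the temporary output path and the value `10` are made up for illustration.

```python
import argparse
import os
import tempfile


def build_parser():
    """Rebuild the parser from compare.py (hypothetical wrapper used only for this check)."""
    parser = argparse.ArgumentParser("compare")
    parser.add_argument("--compare_data1", type=str, help="compare_data1 data")
    parser.add_argument("--compare_data2", type=str, help="compare_data2 data")
    parser.add_argument("--output_compare", type=str, help="output_compare directory")
    parser.add_argument("--pipeline_param", type=int, help="pipeline parameter")
    return parser


# Illustrative values only; a real pipeline run supplies these arguments.
out_dir = os.path.join(tempfile.mkdtemp(), "compare_out")
args = build_parser().parse_args([
    "--compare_data1", "data1",
    "--compare_data2", "data2",
    "--output_compare", out_dir,
    "--pipeline_param", "10",
])

# argparse converts --pipeline_param to int; an omitted argument would be None.
if args.output_compare is not None:
    os.makedirs(args.output_compare, exist_ok=True)

print(args.pipeline_param, os.path.isdir(out_dir))
```

Passing the argument list explicitly keeps the check independent of `sys.argv`, so it can also run inside a notebook cell.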

View File

@@ -0,0 +1,21 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.
import argparse
import os
print("In extract.py")
print("As a data scientist, this is where I use my extract code.")
parser = argparse.ArgumentParser("extract")
parser.add_argument("--input_extract", type=str, help="input_extract data")
parser.add_argument("--output_extract", type=str, help="output_extract directory")
args = parser.parse_args()
print("Argument 1: %s" % args.input_extract)
print("Argument 2: %s" % args.output_extract)
if args.output_extract is not None:
    os.makedirs(args.output_extract, exist_ok=True)
    print("%s created" % args.output_extract)

View File

@@ -0,0 +1,22 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.
import argparse
import os
print("In train.py")
print("As a data scientist, this is where I use my training code.")
parser = argparse.ArgumentParser("train")
parser.add_argument("--input_data", type=str, help="input data")
parser.add_argument("--output_train", type=str, help="output_train directory")
args = parser.parse_args()
print("Argument 1: %s" % args.input_data)
print("Argument 2: %s" % args.output_train)
if args.output_train is not None:
    os.makedirs(args.output_train, exist_ok=True)
    print("%s created" % args.output_train)
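
The notebook change in this commit points each step at its own `source_directory`. The effect can be illustrated without the Azure ML SDK. The following is a minimal sketch, assuming the three folder names from the notebook diff and placeholder script contents; `make_step_dirs` and `dir_fingerprint` are hypothetical helpers, and the hash is only a stand-in for the service's snapshot comparison.

```python
import hashlib
import tempfile
from pathlib import Path

# Folder -> script mapping taken from the notebook diff; file contents are placeholders.
STEP_SCRIPTS = {
    "data_dependency_run_train": "train.py",
    "data_dependency_run_extract": "extract.py",
    "data_dependency_run_compare": "compare.py",
}


def make_step_dirs(root):
    """Create one folder per step, each holding only that step's script."""
    for folder, script in STEP_SCRIPTS.items():
        step_dir = Path(root) / folder
        step_dir.mkdir(parents=True, exist_ok=True)
        (step_dir / script).write_text('print("In {}")\n'.format(script))


def dir_fingerprint(path):
    """Hash file names and contents as a stand-in for 'did this folder change?'."""
    digest = hashlib.sha256()
    for f in sorted(Path(path).rglob("*")):
        if f.is_file():
            digest.update(f.name.encode("utf-8"))
            digest.update(f.read_bytes())
    return digest.hexdigest()


root = tempfile.mkdtemp()
make_step_dirs(root)
before = {d: dir_fingerprint(Path(root) / d) for d in STEP_SCRIPTS}

# Edit only the training script; the other two folders are left untouched.
(Path(root) / "data_dependency_run_train" / "train.py").write_text('print("edited")\n')
after = {d: dir_fingerprint(Path(root) / d) for d in STEP_SCRIPTS}

changed = sorted(d for d in STEP_SCRIPTS if before[d] != after[d])
print(changed)
```

Only the training folder's fingerprint changes. With a single shared folder (`'.'`), the same edit would alter the one fingerprint that all three steps depend on.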

View File

@@ -0,0 +1,30 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
import argparse
import os
print("*********************************************************")
print("Hello Azure ML!")
parser = argparse.ArgumentParser()
parser.add_argument('--datadir', type=str, help="data directory")
parser.add_argument('--output', type=str, help="output")
args = parser.parse_args()
print("Argument 1: %s" % args.datadir)
print("Argument 2: %s" % args.output)
if args.output is not None:
    os.makedirs(args.output, exist_ok=True)
    print("%s created" % args.output)
try:
    from azureml.core import Run
    run = Run.get_context()
    print("Log Fibonacci numbers.")
    run.log_list('Fibonacci numbers', [0, 1, 1, 2, 3, 5, 8, 13, 21, 34])
    run.complete()
except:
    print("Warning: you need to install Azure ML SDK in order to log metrics.")
print("*********************************************************")

View File

@@ -0,0 +1,24 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.
import argparse
import os
print("In compare.py")
print("As a data scientist, this is where I use my compare code.")
parser = argparse.ArgumentParser("compare")
parser.add_argument("--compare_data1", type=str, help="compare_data1 data")
parser.add_argument("--compare_data2", type=str, help="compare_data2 data")
parser.add_argument("--output_compare", type=str, help="output_compare directory")
parser.add_argument("--pipeline_param", type=int, help="pipeline parameter")
args = parser.parse_args()
print("Argument 1: %s" % args.compare_data1)
print("Argument 2: %s" % args.compare_data2)
print("Argument 3: %s" % args.output_compare)
print("Argument 4: %s" % args.pipeline_param)
if args.output_compare is not None:
    os.makedirs(args.output_compare, exist_ok=True)
    print("%s created" % args.output_compare)

View File

@@ -0,0 +1,21 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.
import argparse
import os
print("In extract.py")
print("As a data scientist, this is where I use my extract code.")
parser = argparse.ArgumentParser("extract")
parser.add_argument("--input_extract", type=str, help="input_extract data")
parser.add_argument("--output_extract", type=str, help="output_extract directory")
args = parser.parse_args()
print("Argument 1: %s" % args.input_extract)
print("Argument 2: %s" % args.output_extract)
if args.output_extract is not None:
    os.makedirs(args.output_extract, exist_ok=True)
    print("%s created" % args.output_extract)

View File

@@ -0,0 +1,22 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.
import argparse
import os
print("In train.py")
print("As a data scientist, this is where I use my training code.")
parser = argparse.ArgumentParser("train")
parser.add_argument("--input_data", type=str, help="input data")
parser.add_argument("--output_train", type=str, help="output_train directory")
args = parser.parse_args()
print("Argument 1: %s" % args.input_data)
print("Argument 2: %s" % args.output_train)
if args.output_train is not None:
    os.makedirs(args.output_train, exist_ok=True)
    print("%s created" % args.output_train)

View File

@@ -0,0 +1,724 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Track Data Drift between Training and Inference Data in Production \n",
"\n",
"With this notebook, you will learn how to enable the DataDrift service to automatically track and determine whether your inference data is drifting from the data your model was initially trained on. The DataDrift service provides metrics and visualizations to help stakeholders identify which specific features cause the concept drift to occur.\n",
"\n",
"Please email driftfeedback@microsoft.com with any issues. A member from the DataDrift team will respond shortly. \n",
"\n",
"The DataDrift Public Preview API can be found [here](https://docs.microsoft.com/en-us/python/api/azureml-contrib-datadrift/?view=azure-ml-py). "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/monitor-models/data-drift/azureml-datadrift.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Prerequisites and Setup"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install the DataDrift package\n",
"\n",
"Install the azureml-contrib-datadrift, azureml-opendatasets and lightgbm packages before running this notebook.\n",
"```\n",
"pip install azureml-contrib-datadrift\n",
"pip install azureml-opendatasets\n",
"pip install lightgbm\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Import Dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import os\n",
"import time\n",
"from datetime import datetime, timedelta\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"import requests\n",
"from azureml.contrib.datadrift import DataDriftDetector, AlertConfiguration\n",
"from azureml.opendatasets import NoaaIsdWeather\n",
"from azureml.core import Dataset, Workspace, Run\n",
"from azureml.core.compute import AksCompute, ComputeTarget\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"from azureml.core.experiment import Experiment\n",
"from azureml.core.image import ContainerImage\n",
"from azureml.core.model import Model\n",
"from azureml.core.webservice import Webservice, AksWebservice\n",
"from azureml.widgets import RunDetails\n",
"from sklearn.externals import joblib\n",
"from sklearn.model_selection import train_test_split\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set up Configuration and Create Azure ML Workspace\n",
"\n",
"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, go through the [configuration notebook](../../../configuration.ipynb) first if you haven't already to establish your connection to the AzureML Workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Please type in your initials/alias. The prefix is prepended to the names of resources created by this notebook. \n",
"prefix = \"dd\"\n",
"\n",
"# NOTE: Please do not change the model_name, as it's required by the score.py file\n",
"model_name = \"driftmodel\"\n",
"image_name = \"{}driftimage\".format(prefix)\n",
"service_name = \"{}driftservice\".format(prefix)\n",
"\n",
"# optionally, set email address to receive an email alert for DataDrift\n",
"email_address = \"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate Train/Testing Data\n",
"\n",
"For this demo, we will use NOAA weather data from [Azure Open Datasets](https://azure.microsoft.com/services/open-datasets/). You may replace this step with your own dataset. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"usaf_list = ['725724', '722149', '723090', '722159', '723910', '720279',\n",
" '725513', '725254', '726430', '720381', '723074', '726682',\n",
" '725486', '727883', '723177', '722075', '723086', '724053',\n",
" '725070', '722073', '726060', '725224', '725260', '724520',\n",
" '720305', '724020', '726510', '725126', '722523', '703333',\n",
" '722249', '722728', '725483', '722972', '724975', '742079',\n",
" '727468', '722193', '725624', '722030', '726380', '720309',\n",
" '722071', '720326', '725415', '724504', '725665', '725424',\n",
" '725066']\n",
"\n",
"columns = ['usaf', 'wban', 'datetime', 'latitude', 'longitude', 'elevation', 'windAngle', 'windSpeed', 'temperature', 'stationName', 'p_k']\n",
"\n",
"\n",
"def enrich_weather_noaa_data(noaa_df):\n",
"    hours_in_day = 24\n",
" week_in_year = 52\n",
" \n",
" noaa_df[\"hour\"] = noaa_df[\"datetime\"].dt.hour\n",
" noaa_df[\"weekofyear\"] = noaa_df[\"datetime\"].dt.week\n",
" \n",
" noaa_df[\"sine_weekofyear\"] = noaa_df['datetime'].transform(lambda x: np.sin((2*np.pi*x.dt.week-1)/week_in_year))\n",
" noaa_df[\"cosine_weekofyear\"] = noaa_df['datetime'].transform(lambda x: np.cos((2*np.pi*x.dt.week-1)/week_in_year))\n",
"\n",
" noaa_df[\"sine_hourofday\"] = noaa_df['datetime'].transform(lambda x: np.sin(2*np.pi*x.dt.hour/hours_in_day))\n",
" noaa_df[\"cosine_hourofday\"] = noaa_df['datetime'].transform(lambda x: np.cos(2*np.pi*x.dt.hour/hours_in_day))\n",
" \n",
" return noaa_df\n",
"\n",
"def add_window_col(input_df):\n",
" shift_interval = pd.Timedelta('-7 days') # your X days interval\n",
" df_shifted = input_df.copy()\n",
" df_shifted['datetime'] = df_shifted['datetime'] - shift_interval\n",
" df_shifted.drop(list(input_df.columns.difference(['datetime', 'usaf', 'wban', 'sine_hourofday', 'temperature'])), axis=1, inplace=True)\n",
"\n",
"    # merge, keeping only observations where the 7-day lag is present\n",
" df2 = pd.merge(input_df,\n",
" df_shifted,\n",
" on=['datetime', 'usaf', 'wban', 'sine_hourofday'],\n",
" how='inner', # use 'left' to keep observations without lags\n",
" suffixes=['', '-7'])\n",
" return df2\n",
"\n",
"def get_noaa_data(start_time, end_time, cols, station_list):\n",
" isd = NoaaIsdWeather(start_time, end_time, cols=cols)\n",
" # Read into Pandas data frame.\n",
" noaa_df = isd.to_pandas_dataframe()\n",
" noaa_df = noaa_df.rename(columns={\"stationName\": \"station_name\"})\n",
" \n",
" df_filtered = noaa_df[noaa_df[\"usaf\"].isin(station_list)]\n",
"    df_filtered = df_filtered.reset_index(drop=True)\n",
" \n",
" # Enrich with time features\n",
" df_enriched = enrich_weather_noaa_data(df_filtered)\n",
" \n",
" return df_enriched\n",
"\n",
"def get_featurized_noaa_df(start_time, end_time, cols, station_list):\n",
" df_1 = get_noaa_data(start_time - timedelta(days=7), start_time - timedelta(seconds=1), cols, station_list)\n",
" df_2 = get_noaa_data(start_time, end_time, cols, station_list)\n",
" noaa_df = pd.concat([df_1, df_2])\n",
" \n",
" print(\"Adding window feature\")\n",
" df_window = add_window_col(noaa_df)\n",
" \n",
" cat_columns = df_window.dtypes == object\n",
" cat_columns = cat_columns[cat_columns == True]\n",
" \n",
" print(\"Encoding categorical columns\")\n",
" df_encoded = pd.get_dummies(df_window, columns=cat_columns.keys().tolist())\n",
" \n",
" print(\"Dropping unnecessary columns\")\n",
" df_featurized = df_encoded.drop(['windAngle', 'windSpeed', 'datetime', 'elevation'], axis=1).dropna().drop_duplicates()\n",
" \n",
" return df_featurized"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Train model on Jan 1 - 14, 2009 data\n",
"df = get_featurized_noaa_df(datetime(2009, 1, 1), datetime(2009, 1, 14, 23, 59, 59), columns, usaf_list)\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"label = \"temperature\"\n",
"x_df = df.drop(label, axis=1)\n",
"y_df = df[[label]]\n",
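"# Note: the split below is applied to df (not x_df), so x_train and x_test still include the label column\n",
"x_train, x_test, y_train, y_test = train_test_split(df, y_df, test_size=0.2, random_state=223)\n",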
"print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)\n",
"\n",
"training_dir = 'outputs/training'\n",
"training_file = \"training.csv\"\n",
"\n",
"# Generate training dataframe to register as Training Dataset\n",
"os.makedirs(training_dir, exist_ok=True)\n",
"training_df = pd.merge(x_train.drop(label, axis=1), y_train, left_index=True, right_index=True)\n",
"training_df.to_csv(training_dir + \"/\" + training_file)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create/Register Training Dataset"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dataset_name = \"dataset\"\n",
"name_suffix = datetime.utcnow().strftime(\"%Y-%m-%d-%H-%M-%S\")\n",
"snapshot_name = \"snapshot-{}\".format(name_suffix)\n",
"\n",
"dstore = ws.get_default_datastore()\n",
"dstore.upload(training_dir, \"data/training\", show_progress=True)\n",
"dpath = dstore.path(\"data/training/training.csv\")\n",
"trainingDataset = Dataset.auto_read_files(dpath, include_path=True)\n",
"trainingDataset = trainingDataset.register(workspace=ws, name=dataset_name, description=\"dset\", exist_ok=True)\n",
"\n",
"datasets = [(Dataset.Scenario.TRAINING, trainingDataset)]\n",
"print(\"dataset registration done.\\n\")\n",
"datasets"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Train and Save Model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import lightgbm as lgb\n",
"\n",
"train = lgb.Dataset(data=x_train, \n",
" label=y_train)\n",
"\n",
"test = lgb.Dataset(data=x_test, \n",
" label=y_test,\n",
" reference=train)\n",
"\n",
"params = {'learning_rate' : 0.1,\n",
" 'boosting' : 'gbdt',\n",
" 'metric' : 'rmse',\n",
" 'feature_fraction' : 1,\n",
" 'bagging_fraction' : 1,\n",
" 'max_depth': 6,\n",
" 'num_leaves' : 31,\n",
" 'objective' : 'regression',\n",
" 'bagging_freq' : 1,\n",
" \"verbose\": -1,\n",
" 'min_data_per_leaf': 100}\n",
"\n",
"model = lgb.train(params, \n",
" num_boost_round=500,\n",
" train_set=train,\n",
" valid_sets=[train, test],\n",
" verbose_eval=50,\n",
" early_stopping_rounds=25)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model_file = 'outputs/{}.pkl'.format(model_name)\n",
"\n",
"os.makedirs('outputs', exist_ok=True)\n",
"joblib.dump(model, model_file)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Register Model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model = Model.register(model_path=model_file,\n",
" model_name=model_name,\n",
" workspace=ws,\n",
" datasets=datasets)\n",
"\n",
"print(model_name, image_name, service_name, model)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Deploy Model To AKS"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prepare Environment"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn', 'joblib', 'lightgbm', 'pandas'],\n",
" pip_packages=['azureml-monitoring', 'azureml-sdk[automl]'])\n",
"\n",
"with open(\"myenv.yml\",\"w\") as f:\n",
" f.write(myenv.serialize_to_string())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Image"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Image creation may take up to 15 minutes.\n",
"\n",
"image_name = image_name + str(model.version)\n",
"\n",
"if image_name not in ws.images:\n",
" # Use the score.py defined in this directory as the execution script\n",
" # NOTE: The Model Data Collector must be enabled in the execution script for DataDrift to run correctly\n",
" image_config = ContainerImage.image_configuration(execution_script=\"score.py\",\n",
" runtime=\"python\",\n",
" conda_file=\"myenv.yml\",\n",
" description=\"Image with weather dataset model\")\n",
" image = ContainerImage.create(name=image_name,\n",
" models=[model],\n",
" image_config=image_config,\n",
" workspace=ws)\n",
"\n",
" image.wait_for_creation(show_output=True)\n",
"else:\n",
" image = ws.images[image_name]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Compute Target"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"aks_name = 'dd-demo-e2e'\n",
"prov_config = AksCompute.provisioning_configuration()\n",
"\n",
"if aks_name not in ws.compute_targets:\n",
" aks_target = ComputeTarget.create(workspace=ws,\n",
" name=aks_name,\n",
" provisioning_configuration=prov_config)\n",
"\n",
" aks_target.wait_for_completion(show_output=True)\n",
" print(aks_target.provisioning_state)\n",
" print(aks_target.provisioning_errors)\n",
"else:\n",
" aks_target=ws.compute_targets[aks_name]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploy Service"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"aks_service_name = service_name\n",
"\n",
"if aks_service_name not in ws.webservices:\n",
" aks_config = AksWebservice.deploy_configuration(collect_model_data=True, enable_app_insights=True)\n",
" aks_service = Webservice.deploy_from_image(workspace=ws,\n",
" name=aks_service_name,\n",
" image=image,\n",
" deployment_config=aks_config,\n",
" deployment_target=aks_target)\n",
" aks_service.wait_for_deployment(show_output=True)\n",
" print(aks_service.state)\n",
"else:\n",
" aks_service = ws.webservices[aks_service_name]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Run DataDrift Analysis"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Send Scoring Data to Service"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download Scoring Data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Score Model on March 15, 2016 data\n",
"scoring_df = get_noaa_data(datetime(2016, 3, 15) - timedelta(days=7), datetime(2016, 3, 16), columns, usaf_list)\n",
"# Add the window feature column\n",
"scoring_df = add_window_col(scoring_df)\n",
"\n",
"# Drop features not used by the model\n",
"print(\"Dropping unnecessary columns\")\n",
"scoring_df = scoring_df.drop(['windAngle', 'windSpeed', 'datetime', 'elevation'], axis=1).dropna()\n",
"scoring_df.head()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# One Hot Encode the scoring dataset to match the training dataset schema\n",
"columns_dict = model.datasets[\"training\"][0].get_profile().columns\n",
"extra_cols = ('Path', 'Column1')\n",
"for k in extra_cols:\n",
" columns_dict.pop(k, None)\n",
"training_columns = list(columns_dict.keys())\n",
"\n",
"categorical_columns = scoring_df.dtypes == object\n",
"categorical_columns = categorical_columns[categorical_columns == True]\n",
"\n",
"test_df = pd.get_dummies(scoring_df[categorical_columns.keys().tolist()])\n",
"encoded_df = scoring_df.join(test_df)\n",
"\n",
"# Populate missing OHE columns with 0 values to match training dataset schema\n",
"difference = list(set(training_columns) - set(encoded_df.columns.tolist()))\n",
"for col in difference:\n",
" encoded_df[col] = 0\n",
"encoded_df.head()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Serialize dataframe to list of row dictionaries\n",
"encoded_dict = encoded_df.to_dict('records')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Submit Scoring Data to Service"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"\n",
"# retrieve the API keys. AML generates two keys.\n",
"key1, key2 = aks_service.get_keys()\n",
"\n",
"total_count = len(scoring_df)\n",
"i = 0\n",
"load = []\n",
"for row in encoded_dict:\n",
" load.append(row)\n",
" i = i + 1\n",
" if i % 100 == 0:\n",
" payload = json.dumps({\"data\": load})\n",
" \n",
" # construct raw HTTP request and send to the service\n",
" payload_binary = bytes(payload,encoding = 'utf8')\n",
" headers = {'Content-Type':'application/json', 'Authorization': 'Bearer ' + key1}\n",
" resp = requests.post(aks_service.scoring_uri, payload_binary, headers=headers)\n",
" \n",
" print(\"prediction:\", resp.content, \"Progress: {}/{}\".format(i, total_count)) \n",
"\n",
" load = []\n",
" time.sleep(3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We need to wait up to 10 minutes for the Model Data Collector to dump the model input and inference data to storage in the Workspace, where it's used by the DataDriftDetector job."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"time.sleep(600)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configure DataDrift"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"services = [service_name]\n",
"start = datetime.now() - timedelta(days=2)\n",
"end = datetime(year=2020, month=1, day=22, hour=15, minute=16)\n",
"feature_list = ['usaf', 'wban', 'latitude', 'longitude', 'station_name', 'p_k', 'sine_hourofday', 'cosine_hourofday', 'temperature-7']\n",
"alert_config = AlertConfiguration([email_address]) if email_address else None\n",
"\n",
"# If a DataDrift object already exists for this model, create() raises an exception; fall back to get()\n",
"try:\n",
" datadrift = DataDriftDetector.create(ws, model.name, model.version, services, frequency=\"Day\", alert_config=alert_config)\n",
"except KeyError:\n",
" datadrift = DataDriftDetector.get(ws, model.name, model.version)\n",
" \n",
"print(\"Details of DataDrift Object:\\n{}\".format(datadrift))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run an Adhoc DataDriftDetector Run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"target_date = datetime.today()\n",
"run = datadrift.run(target_date, services, feature_list=feature_list, create_compute_target=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"exp = Experiment(ws, datadrift._id)\n",
"dd_run = Run(experiment=exp, run_id=run)\n",
"RunDetails(dd_run).show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get Drift Analysis Results"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"children = list(dd_run.get_children())\n",
"for child in children:\n",
" child.wait_for_completion()\n",
"\n",
"drift_metrics = datadrift.get_output(start_time=start, end_time=end)\n",
"drift_metrics"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Show all drift figures, one per service.\n",
"# If with_details is False (the default), only the drift is shown; if it is True, all details are shown.\n",
"\n",
"drift_figures = datadrift.show(with_details=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Enable DataDrift Schedule"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"datadrift.enable_schedule()"
]
}
],
"metadata": {
"authors": [
{
"name": "rafarmah"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
},
"notice": "Copyright (c) Microsoft Corporation. All rights reserved. Licensed under the MIT License."
},
"nbformat": 4,
"nbformat_minor": 2
}
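
The notebook's `enrich_weather_noaa_data` helper encodes hour of day and week of year as sine/cosine pairs. The following is a minimal standard-library sketch of that idea, assuming a 24-hour period; the function names `cyclic_encode` and `distance` are hypothetical and not part of the notebook. It shows why the pair is useful: 23:00 and 00:00 end up close together, which a raw hour column would not capture.

```python
import math


def cyclic_encode(value, period):
    """Map a periodic value (e.g. hour 0-23) to a point on the unit circle."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def distance(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


h0 = cyclic_encode(0, 24)
h12 = cyclic_encode(12, 24)
h23 = cyclic_encode(23, 24)

# 23:00 and 00:00 are one hour apart, so their encodings are near each other;
# 12:00 sits on the opposite side of the circle.
print(distance(h23, h0) < distance(h12, h0))
```

Both components are needed: the sine alone cannot distinguish, for example, 03:00 from 09:00, while the (sine, cosine) pair is unique for every hour.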

View File

@@ -0,0 +1,8 @@
name: azure-ml-datadrift
dependencies:
- pip:
- azureml-sdk
- azureml-contrib-datadrift
- azureml-opendatasets
- lightgbm
- azureml-widgets

View File

@@ -0,0 +1,58 @@
import pickle
import json
import numpy
import azureml.train.automl
from sklearn.externals import joblib
from sklearn.linear_model import Ridge
from azureml.core.model import Model
from azureml.core.run import Run
from azureml.monitoring import ModelDataCollector
import time
import pandas as pd
def init():
    global model, inputs_dc, prediction_dc, feature_names, categorical_features
    print("Model is initialized" + time.strftime("%H:%M:%S"))
    model_path = Model.get_model_path(model_name="driftmodel")
    model = joblib.load(model_path)
    feature_names = ["usaf", "wban", "latitude", "longitude", "station_name", "p_k",
                     "sine_weekofyear", "cosine_weekofyear", "sine_hourofday", "cosine_hourofday",
                     "temperature-7"]
    categorical_features = ["usaf", "wban", "p_k", "station_name"]
    inputs_dc = ModelDataCollector(model_name="driftmodel",
                                   identifier="inputs",
                                   feature_names=feature_names)
    prediction_dc = ModelDataCollector("driftmodel",
                                       identifier="predictions",
                                       feature_names=["temperature"])


def run(raw_data):
    global inputs_dc, prediction_dc
    try:
        data = json.loads(raw_data)["data"]
        data = pd.DataFrame(data)
        # Remove the categorical features as the model expects OHE values
        input_data = data.drop(categorical_features, axis=1)
        result = model.predict(input_data)
        # Collect the non-OHE dataframe
        collected_df = data[feature_names]
        inputs_dc.collect(collected_df.values)
        prediction_dc.collect(result)
        return result.tolist()
    except Exception as e:
        error = str(e)
        print(error + time.strftime("%H:%M:%S"))
        return error
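
Before building an image, the `init()`/`run()` contract of a scoring script can be smoke-tested locally. The sketch below is not the deployed script: it swaps the registered model for a hypothetical `StubModel`, works on plain dictionaries instead of a DataFrame, and omits the `ModelDataCollector` calls. It therefore only checks the JSON handling and the error path.

```python
import json


class StubModel:
    """Hypothetical stand-in for the registered model; predicts the sum of each row."""

    def predict(self, rows):
        return [sum(row.values()) for row in rows]


def init():
    global model
    model = StubModel()


def run(raw_data):
    # Same shape as score.py: parse {"data": [...]}, predict,
    # and return either a list of predictions or an error string.
    try:
        data = json.loads(raw_data)["data"]
        return list(model.predict(data))
    except Exception as e:
        return str(e)


init()
ok = run(json.dumps({"data": [{"latitude": 1.0, "longitude": 2.0}]}))
bad = run(json.dumps({"rows": []}))  # request without the "data" key
print(ok, repr(bad))
```

Because `run` returns the error text instead of raising, a client sees a normal response body even for malformed requests, so it is worth checking the response content and not only the HTTP status.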

View File

@@ -153,7 +153,11 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": [
"tensorboard-export-sample"
]
},
"outputs": [],
"source": [
"# Export Run History to Tensorboard logs\n",

View File

@@ -227,7 +227,11 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": [
"tensorboard-sample"
]
},
"outputs": [],
"source": [
"from azureml.tensorboard import Tensorboard\n",

View File

@@ -1,5 +1,6 @@
import argparse
import os
import numpy as np
@@ -131,6 +132,8 @@ def main():
    run.log("Accuracy", np.float(val_accuracy))
    serializers.save_npz(os.path.join(args.output_dir, 'model.npz'), model)
if __name__ == '__main__':
    main()

View File

@@ -0,0 +1,45 @@
import numpy as np
import os
import json
from chainer import serializers, using_config, Variable, datasets
import chainer.functions as F
import chainer.links as L
from chainer import Chain
from azureml.core.model import Model
class MyNetwork(Chain):
    def __init__(self, n_mid_units=100, n_out=10):
        super(MyNetwork, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(None, n_mid_units)
            self.l2 = L.Linear(n_mid_units, n_mid_units)
            self.l3 = L.Linear(n_mid_units, n_out)

    def forward(self, x):
        h = F.relu(self.l1(x))
        h = F.relu(self.l2(h))
        return self.l3(h)


def init():
    global model
    model_root = Model.get_model_path('chainer-dnn-mnist')
    # Load our saved artifacts
    model = MyNetwork()
    serializers.load_npz(model_root, model)


def run(input_data):
    i = np.array(json.loads(input_data)['data'])
    _, test = datasets.get_mnist()
    x = Variable(np.asarray([test[i][0]]))
    y = model(x)
    return np.ndarray.tolist(y.data.argmax(axis=1))

View File

@@ -45,6 +45,16 @@
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!jupyter nbextension install --py --user azureml.widgets\n",
"!jupyter nbextension enable --py --user azureml.widgets"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -121,6 +131,7 @@
"except ComputeTargetException:\n",
" print('Creating a new compute target...')\n",
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', \n",
" min_nodes=2,\n",
" max_nodes=4)\n",
"\n",
" # create the cluster\n",
@@ -206,7 +217,8 @@
"source": [
"import shutil\n",
"\n",
"shutil.copy('chainer_mnist.py', project_folder)"
"shutil.copy('chainer_mnist.py', project_folder)\n",
"shutil.copy('chainer_score.py', project_folder)"
]
},
{
@@ -353,6 +365,7 @@
"hyperdrive_config = HyperDriveConfig(estimator=estimator,\n",
" hyperparameter_sampling=param_sampling, \n",
" primary_metric_name='Accuracy',\n",
" policy=BanditPolicy(evaluation_interval=1, slack_factor=0.1, delay_evaluation=3),\n",
" primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,\n",
" max_total_runs=8,\n",
" max_concurrent_runs=4)\n"
@@ -398,14 +411,344 @@
"metadata": {},
"outputs": [],
"source": [
"run.wait_for_completion(show_output=True)"
"hyperdrive_run.wait_for_completion(show_output=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Find and register best model\n",
"When all runs finish, we can find the run with the highest accuracy."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"best_run = hyperdrive_run.get_best_run_by_primary_metric()\n",
"print(best_run.get_details()['runDefinition']['arguments'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, let's list the model files uploaded during the run."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(best_run.get_file_names())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can then register the saved model file `outputs/model.npz` as a model named `chainer-dnn-mnist` under the workspace for deployment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model = best_run.register_model(model_name='chainer-dnn-mnist', model_path='outputs/model.npz')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploy the model in ACI\n",
"Now, we are ready to deploy the model as a web service running in Azure Container Instance, [ACI](https://azure.microsoft.com/en-us/services/container-instances/). Azure Machine Learning accomplishes this by constructing a Docker image with the scoring logic and model baked in.\n",
"\n",
"### Create scoring script\n",
"First, we will create a scoring script that will be invoked by the web service call.\n",
"+ Note that the scoring script must have two required functions, `init()` and `run(input_data)`.\n",
"    + In `init()`, you typically load the model into a global object. This function is executed only once when the Docker container is started.\n",
"    + In `run(input_data)`, the model is used to predict a value based on the input data. In this tutorial, `run` parses a JSON request and returns a list, while the model itself is loaded from NPZ, the format the training script uses to save the Chainer model. You are not limited to these formats.\n",
" \n",
"Refer to the scoring script `chainer_score.py` for this tutorial. Our web service will use this file to predict. When writing your own scoring script, don't forget to test it locally first before you go and deploy the web service."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"shutil.copy('chainer_score.py', project_folder)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create myenv.yml\n",
"We also need to create an environment file so that Azure Machine Learning can install the necessary packages in the Docker image which are required by your scoring script. In this case, we need to specify conda packages `numpy` and `chainer`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.runconfig import CondaDependencies\n",
"\n",
"cd = CondaDependencies.create()\n",
"cd.add_conda_package('numpy')\n",
"cd.add_conda_package('chainer')\n",
"cd.save_to_file(base_directory='./', conda_file_path='myenv.yml')\n",
"\n",
"print(cd.serialize_to_string())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Deploy to ACI\n",
"We are almost ready to deploy. Create a deployment configuration and specify the number of CPUs and gigabytes of RAM needed for your ACI container."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.webservice import AciWebservice\n",
"\n",
"aciconfig = AciWebservice.deploy_configuration(cpu_cores=1,\n",
" auth_enabled=True, # this flag generates API keys to secure access\n",
" memory_gb=1,\n",
" tags={'name': 'mnist', 'framework': 'Chainer'},\n",
" description='Chainer DNN with MNIST')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Deployment Process**\n",
"\n",
"Now we can deploy. **This cell will run for about 7-8 minutes.** Behind the scenes, it will do the following:\n",
"\n",
"1. **Build Docker image**\n",
"Build a Docker image using the scoring file (chainer_score.py), the environment file (myenv.yml), and the model object.\n",
"2. **Register image**\n",
"Register that image under the workspace.\n",
"3. **Ship to ACI**\n",
"And finally ship the image to the ACI infrastructure, start up a container in ACI using that image, and expose an HTTP endpoint to accept REST client calls."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.image import ContainerImage\n",
"\n",
"imgconfig = ContainerImage.image_configuration(execution_script=\"chainer_score.py\", \n",
" runtime=\"python\", \n",
" conda_file=\"myenv.yml\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%time\n",
"from azureml.core.webservice import Webservice\n",
"\n",
"service = Webservice.deploy_from_model(workspace=ws,\n",
" name='chainer-mnist-1',\n",
" deployment_config=aciconfig,\n",
" models=[model],\n",
" image_config=imgconfig)\n",
"\n",
"service.wait_for_deployment(show_output=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(service.get_logs())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(service.scoring_uri)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Tip: If something goes wrong with the deployment, the first thing to look at is the logs from the service by running the following command:** `print(service.get_logs())`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is the scoring web service endpoint: `print(service.scoring_uri)`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Test the deployed model\n",
"Let's test the deployed model. Pick a random sample from the test set, and send it to the web service hosted in ACI for a prediction. Note, here we are using the an HTTP request to invoke the service.\n",
"\n",
"We can retrieve the API keys used for accessing the HTTP endpoint and construct a raw HTTP request to send to the service. Don't forget to add key to the HTTP header."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# retreive the API keys. two keys were generated.\n",
"key1, Key2 = service.get_keys()\n",
"print(key1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"import urllib\n",
"import gzip\n",
"import numpy as np\n",
"import struct\n",
"import requests\n",
"\n",
"\n",
"# load compressed MNIST gz files and return numpy arrays\n",
"def load_data(filename, label=False):\n",
" with gzip.open(filename) as gz:\n",
" struct.unpack('I', gz.read(4))\n",
" n_items = struct.unpack('>I', gz.read(4))\n",
" if not label:\n",
" n_rows = struct.unpack('>I', gz.read(4))[0]\n",
" n_cols = struct.unpack('>I', gz.read(4))[0]\n",
" res = np.frombuffer(gz.read(n_items[0] * n_rows * n_cols), dtype=np.uint8)\n",
" res = res.reshape(n_items[0], n_rows * n_cols)\n",
" else:\n",
" res = np.frombuffer(gz.read(n_items[0]), dtype=np.uint8)\n",
" res = res.reshape(n_items[0], 1)\n",
" return res\n",
"\n",
"os.makedirs('./data/mnist', exist_ok=True)\n",
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz', filename = './data/mnist/test-images.gz')\n",
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz', filename = './data/mnist/test-labels.gz')\n",
"\n",
"X_test = load_data('./data/mnist/test-images.gz', False)\n",
"y_test = load_data('./data/mnist/test-labels.gz', True).reshape(-1)\n",
"\n",
"\n",
"# send a random row from the test set to score\n",
"random_index = np.random.randint(0, len(X_test)-1)\n",
"input_data = \"{\\\"data\\\": [\" + str(random_index) + \"]}\"\n",
"\n",
"headers = {'Content-Type':'application/json', 'Authorization': 'Bearer ' + key1}\n",
"\n",
"# send sample to service for scoring\n",
"resp = requests.post(service.scoring_uri, input_data, headers=headers)\n",
"\n",
"print(\"label:\", y_test[random_index])\n",
"print(\"prediction:\", resp.text[1])\n",
"\n",
"plt.imshow(X_test[random_index].reshape((28,28)), cmap='gray')\n",
"plt.axis('off')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at the workspace after the web service was deployed. You should see\n",
"\n",
" + a registered model named 'chainer-dnn-mnist' and with the id 'chainer-dnn-mnist:1'\n",
" + an image called 'chainer-mnist-svc' and with a docker image location pointing to your workspace's Azure Container Registry (ACR)\n",
" + a webservice called 'chainer-mnist-svc' with some scoring URL"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"models = ws.models\n",
"for name, model in models.items():\n",
" print(\"Model: {}, ID: {}\".format(name, model.id))\n",
" \n",
"images = ws.images\n",
"for name, image in images.items():\n",
" print(\"Image: {}, location: {}\".format(name, image.image_location))\n",
" \n",
"webservices = ws.webservices\n",
"for name, webservice in webservices.items():\n",
" print(\"Webservice: {}, scoring URI: {}\".format(name, webservice.scoring_uri))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Clean up"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can delete the ACI deployment with a simple delete API call."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"service.delete()"
]
}
],
"metadata": {
"authors": [
{
"name": "ninhu"
"name": "dipeck"
}
],
"kernelspec": {
@@ -424,7 +767,8 @@
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"msauthor": "dipeck"
},
"nbformat": 4,
"nbformat_minor": 2

View File

@@ -4,4 +4,9 @@ dependencies:
- azureml-sdk
- azureml-widgets
- numpy
- pytest
- matplotlib
# json, urllib, gzip, and struct ship with the Python standard library and need no entry here
- requests

View File

@@ -11,7 +11,7 @@ from azureml.core.model import Model
def init():
global model
model_path = Model.get_model_path('pytorch-hymenoptera')
model_path = Model.get_model_path('pytorch-birds')
model = torch.load(model_path, map_location=lambda storage, loc: storage)
model.eval()
@@ -22,7 +22,7 @@ def run(input_data):
# get prediction
with torch.no_grad():
output = model(input_data)
classes = ['ants', 'bees']
classes = ['chicken', 'turkey']
softmax = nn.Softmax(dim=1)
pred_probs = softmax(output).numpy()[0]
index = torch.argmax(output, 1)

View File

@@ -165,8 +165,8 @@ def download_data():
import urllib
from zipfile import ZipFile
# download data
data_file = './hymenoptera_data.zip'
download_url = 'https://download.pytorch.org/tutorial/hymenoptera_data.zip'
data_file = './fowl_data.zip'
download_url = 'https://msdocsdatasets.blob.core.windows.net/pytorchfowl/fowl_data.zip'
urllib.request.urlretrieve(download_url, filename=data_file)
# extract files

Binary file not shown.

Size before: 123 KiB. Size after: 1.6 MiB.

View File

@@ -24,7 +24,7 @@
"\n",
"In this tutorial, you will train, hyperparameter tune, and deploy a PyTorch model using the Azure Machine Learning (Azure ML) Python SDK.\n",
"\n",
"This tutorial will train an image classification model using transfer learning, based on PyTorch's [Transfer Learning tutorial](https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html). The model is trained to classify ants and bees by first using a pretrained ResNet18 model that has been trained on the [ImageNet](http://image-net.org/index) dataset."
"This tutorial will train an image classification model using transfer learning, based on PyTorch's [Transfer Learning tutorial](https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html). The model is trained to classify chickens and turkeys by first using a pretrained ResNet18 model that has been trained on the [ImageNet](http://image-net.org/index) dataset."
]
},
{
@@ -165,7 +165,7 @@
"source": [
"import os\n",
"\n",
"project_folder = './pytorch-hymenoptera'\n",
"project_folder = './pytorch-birds'\n",
"os.makedirs(project_folder, exist_ok=True)"
]
},
@@ -174,7 +174,7 @@
"metadata": {},
"source": [
"### Download training data\n",
"The dataset we will use (located [here](https://download.pytorch.org/tutorial/hymenoptera_data.zip) as a zip file) consists of about 120 training images each for ants and bees, with 75 validation images for each class. [Hymenoptera](https://en.wikipedia.org/wiki/Hymenoptera) is the order of insects that includes ants and bees. We will download and extract the dataset as part of our training script `pytorch_train.py`"
"The dataset we will use (located on a public blob [here](https://msdocsdatasets.blob.core.windows.net/pytorchfowl/fowl_data.zip) as a zip file) consists of about 120 training images each for turkeys and chickens, with 100 validation images for each class. The images are a subset of the [Open Images v5 Dataset](https://storage.googleapis.com/openimages/web/index.html). We will download and extract the dataset as part of our training script `pytorch_train.py`"
]
},
{
@@ -235,7 +235,7 @@
"source": [
"from azureml.core import Experiment\n",
"\n",
"experiment_name = 'pytorch-hymenoptera'\n",
"experiment_name = 'pytorch-birds'\n",
"experiment = Experiment(ws, name=experiment_name)"
]
},
@@ -273,7 +273,7 @@
"metadata": {},
"source": [
"The `script_params` parameter is a dictionary containing the command-line arguments to your training script `entry_script`. Please note the following:\n",
"- We passed our training data reference `ds_data` to our script's `--data_dir` argument. This will 1) mount our datastore on the remote compute and 2) provide the path to the training data `hymenoptera_data` on our datastore.\n",
"- We passed our training data reference `ds_data` to our script's `--data_dir` argument. This will 1) mount our datastore on the remote compute and 2) provide the path to the training data `fowl_data` on our datastore.\n",
"- We specified the output directory as `./outputs`. The `outputs` directory is specially treated by Azure ML in that all the content in this directory gets uploaded to your workspace as part of your run history. The files written to this directory are therefore accessible even once your remote run is over. In this tutorial, we will save our trained model to this output directory.\n",
"\n",
"To leverage the Azure VM's GPU for training, we set `use_gpu=True`."
@@ -481,7 +481,7 @@
"metadata": {},
"outputs": [],
"source": [
"model = best_run.register_model(model_name = 'pytorch-hymenoptera', model_path = 'outputs/model.pt')\n",
"model = best_run.register_model(model_name = 'pytorch-birds', model_path = 'outputs/model.pt')\n",
"print(model.name, model.id, model.version, sep = '\\t')"
]
},
@@ -503,7 +503,7 @@
"* `init()`: In this function, you typically load the model into a `global` object. This function is executed only once when the Docker container is started. \n",
"* `run(input_data)`: In this function, the model is used to predict a value based on the input data. The input and output typically use JSON as serialization and deserialization format, but you are not limited to that.\n",
"\n",
"Refer to the scoring script `pytorch_score.py` for this tutorial. Our web service will use this file to predict whether an image is an ant or a bee. When writing your own scoring script, don't forget to test it locally first before you go and deploy the web service."
"Refer to the scoring script `pytorch_score.py` for this tutorial. Our web service will use this file to predict whether an image is a chicken or a turkey. When writing your own scoring script, don't forget to test it locally first before you go and deploy the web service."
]
},
{
@@ -549,7 +549,7 @@
"image_config = ContainerImage.image_configuration(execution_script='pytorch_score.py', \n",
" runtime='python', \n",
" conda_file='myenv.yml',\n",
" description='Image with hymenoptera model')"
" description='Image with bird model')"
]
},
{
@@ -570,8 +570,8 @@
"\n",
"aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, \n",
" memory_gb=1, \n",
" tags={'data': 'hymenoptera', 'method':'transfer learning', 'framework':'pytorch'},\n",
" description='Classify ants/bees using transfer learning with PyTorch')"
" tags={'data': 'birds', 'method':'transfer learning', 'framework':'pytorch'},\n",
" description='Classify turkey/chickens using transfer learning with PyTorch')"
]
},
{
@@ -591,7 +591,7 @@
"%%time\n",
"from azureml.core.webservice import Webservice\n",
"\n",
"service_name = 'aci-hymenoptera'\n",
"service_name = 'aci-birds'\n",
"service = Webservice.deploy_from_model(workspace=ws,\n",
" name=service_name,\n",
" models=[model],\n",
@@ -659,6 +659,7 @@
"from PIL import Image\n",
"import matplotlib.pyplot as plt\n",
"\n",
"%matplotlib inline\n",
"plt.imshow(Image.open('test_img.jpg'))"
]
},

View File

@@ -0,0 +1,123 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
import numpy as np
import argparse
import os
import re
import tensorflow as tf
from azureml.core import Run
from utils import load_data
print("TensorFlow version:", tf.VERSION)
parser = argparse.ArgumentParser()
parser.add_argument('--data-folder', type=str, dest='data_folder', help='data folder mounting point')
parser.add_argument('--resume-from', type=str, default=None,
help='location of the model or checkpoint files from where to resume the training')
args = parser.parse_args()
previous_model_location = args.resume_from
# You can also use environment variable to get the model/checkpoint files location
# previous_model_location = os.path.expandvars(os.getenv("AZUREML_DATAREFERENCE_MODEL_LOCATION", None))
data_folder = os.path.join(args.data_folder, 'mnist')
print('training dataset is stored here:', data_folder)
X_train = load_data(os.path.join(data_folder, 'train-images.gz'), False) / 255.0
X_test = load_data(os.path.join(data_folder, 'test-images.gz'), False) / 255.0
y_train = load_data(os.path.join(data_folder, 'train-labels.gz'), True).reshape(-1)
y_test = load_data(os.path.join(data_folder, 'test-labels.gz'), True).reshape(-1)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape, sep='\n')
training_set_size = X_train.shape[0]
n_inputs = 28 * 28
n_h1 = 100
n_h2 = 100
n_outputs = 10
learning_rate = 0.01
n_epochs = 20
batch_size = 50
with tf.name_scope('network'):
    # construct the DNN
    X = tf.placeholder(tf.float32, shape=(None, n_inputs), name='X')
    y = tf.placeholder(tf.int64, shape=(None), name='y')
    h1 = tf.layers.dense(X, n_h1, activation=tf.nn.relu, name='h1')
    h2 = tf.layers.dense(h1, n_h2, activation=tf.nn.relu, name='h2')
    output = tf.layers.dense(h2, n_outputs, name='output')

with tf.name_scope('train'):
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=output)
    loss = tf.reduce_mean(cross_entropy, name='loss')
    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    train_op = optimizer.minimize(loss)

with tf.name_scope('eval'):
    correct = tf.nn.in_top_k(output, y, 1)
    acc_op = tf.reduce_mean(tf.cast(correct, tf.float32))

init = tf.global_variables_initializer()
saver = tf.train.Saver()

# start an Azure ML run
run = Run.get_context()

with tf.Session() as sess:
    start_epoch = 0
    if previous_model_location:
        checkpoint_file_path = tf.train.latest_checkpoint(previous_model_location)
        saver.restore(sess, checkpoint_file_path)
        checkpoint_filename = os.path.basename(checkpoint_file_path)
        num_found = re.search(r'\d+', checkpoint_filename)
        if num_found:
            start_epoch = int(num_found.group(0))
            print("Resuming from epoch {}".format(str(start_epoch)))
    else:
        init.run()

    for epoch in range(start_epoch, n_epochs):
        # randomly shuffle training set
        indices = np.random.permutation(training_set_size)
        X_train = X_train[indices]
        y_train = y_train[indices]
        # batch index
        b_start = 0
        b_end = b_start + batch_size
        for _ in range(training_set_size // batch_size):
            # get a batch
            X_batch, y_batch = X_train[b_start: b_end], y_train[b_start: b_end]
            # update batch index for the next batch
            b_start = b_start + batch_size
            b_end = min(b_start + batch_size, training_set_size)
            # train
            sess.run(train_op, feed_dict={X: X_batch, y: y_batch})
        # evaluate training accuracy on the last batch of the epoch
        acc_train = acc_op.eval(feed_dict={X: X_batch, y: y_batch})
        # evaluate validation set
        acc_val = acc_op.eval(feed_dict={X: X_test, y: y_test})
        # log accuracies (the built-in float replaces the deprecated np.float alias)
        run.log('training_acc', float(acc_train))
        run.log('validation_acc', float(acc_val))
        print(epoch, '-- Training accuracy:', acc_train, '\b Validation accuracy:', acc_val)
        y_hat = np.argmax(output.eval(feed_dict={X: X_test}), axis=1)

        if epoch % 5 == 0:
            saver.save(sess, './outputs/', global_step=epoch)

        # saving only half of the model and resuming again from same epoch
        if not previous_model_location and epoch == 10:
            break

    run.log('final_acc', float(acc_val))
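
The resume logic above recovers the starting epoch from the checkpoint file name. A minimal, TensorFlow-free sketch of that step follows. The helper name `epoch_from_checkpoint` and the example paths are hypothetical, and the sketch assumes the checkpoint prefix ends in the global step (for example `outputs/-10`), as it would when saving with `global_step=epoch`.

```python
import os
import re


def epoch_from_checkpoint(checkpoint_path):
    """Return the first integer found in the checkpoint file name, or 0 if there is none."""
    filename = os.path.basename(checkpoint_path)
    match = re.search(r'\d+', filename)
    return int(match.group(0)) if match else 0


print(epoch_from_checkpoint('/mnt/run/outputs/-10'))         # 10
print(epoch_from_checkpoint('/mnt/run/outputs/checkpoint'))  # 0
```

Falling back to 0 mirrors the script's default of `start_epoch = 0` when no number is found in the file name.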

View File

@@ -0,0 +1,487 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/training-with-deep-learning/train-tensorflow-resume-training/tensorflow-resume-training.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Resuming Tensorflow training from previous run\n",
"In this tutorial, you will resume a mnist model in TensorFlow from a previously submitted run."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"* Understand the [architecture and terms](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture) introduced by Azure Machine Learning (AML)\n",
"* Go through the [configuration notebook](../../../configuration.ipynb) to:\n",
" * install the AML SDK\n",
" * create a workspace and its configuration file (`config.json`)\n",
"* Review the [tutorial](../train-hyperparameter-tune-deploy-with-tensorflow/train-hyperparameter-tune-deploy-with-tensorflow.ipynb) on single-node TensorFlow training using the SDK"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check core SDK version number\n",
"import azureml.core\n",
"\n",
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Diagnostics\n",
"Opt-in diagnostics for better experience, quality, and security of future releases."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"Diagnostics"
]
},
"outputs": [],
"source": [
"from azureml.telemetry import set_diagnostics_collection\n",
"\n",
"set_diagnostics_collection(send_diagnostics=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize workspace\n",
"Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. `Workspace.from_config()` creates a workspace object from the details stored in `config.json`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.workspace import Workspace\n",
"\n",
"ws = Workspace.from_config()\n",
"print('Workspace name: ' + ws.name, \n",
" 'Azure region: ' + ws.location, \n",
" 'Subscription id: ' + ws.subscription_id, \n",
" 'Resource group: ' + ws.resource_group, sep='\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create or Attach existing AmlCompute\n",
"You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for training your model. In this tutorial, you create `AmlCompute` as your training compute resource.\n",
"\n",
"**Creation of AmlCompute takes approximately 5 minutes.** If the AmlCompute with that name is already in your workspace this code will skip the creation process.\n",
"\n",
"As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
"from azureml.core.compute_target import ComputeTargetException\n",
"\n",
"# choose a name for your cluster\n",
"cluster_name = \"gpu-cluster\"\n",
"\n",
"try:\n",
" compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
" print('Found existing compute target.')\n",
"except ComputeTargetException:\n",
" print('Creating a new compute target...')\n",
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6', \n",
" max_nodes=4)\n",
"\n",
" # create the cluster\n",
" compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
"\n",
" compute_target.wait_for_completion(show_output=True)\n",
"\n",
"# use get_status() to get a detailed status for the current cluster. \n",
"print(compute_target.get_status().serialize())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The above code creates a GPU cluster. If you instead want to create a CPU cluster, provide a different VM size to the `vm_size` parameter, such as `STANDARD_D2_V2`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Upload data to datastore\n",
"To make data accessible for remote training, AML provides a convenient way to do so via a [Datastore](https://docs.microsoft.com/azure/machine-learning/service/how-to-access-data). The datastore provides a mechanism for you to upload/download data to Azure Storage, and interact with it from your remote compute targets. \n",
"\n",
"If your data is already stored in Azure, or you download the data as part of your training script, you will not need to do this step. For this tutorial, although you can download the data in your training script, we will demonstrate how to upload the training data to a datastore and access it during training to illustrate the datastore functionality."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First download the data from Yan LeCun's web site directly and save them in a data folder locally."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import urllib\n",
"\n",
"os.makedirs('./data/mnist', exist_ok=True)\n",
"\n",
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz', filename = './data/mnist/train-images.gz')\n",
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz', filename = './data/mnist/train-labels.gz')\n",
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz', filename = './data/mnist/test-images.gz')\n",
"urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz', filename = './data/mnist/test-labels.gz')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each workspace is associated with a default datastore. In this tutorial, we will upload the training data to this default datastore."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ds = ws.get_default_datastore()\n",
"print(ds.datastore_type, ds.account_name, ds.container_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Upload MNIST data to the default datastore."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ds.upload(src_dir='./data/mnist', target_path='mnist', overwrite=True, show_progress=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For convenience, let's get a reference to the datastore. In the next section, we can then pass this reference to our training script's `--data-folder` argument. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ds_data = ds.as_mount()\n",
"print(ds_data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Train model on the remote compute"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a project directory\n",
"Create a directory that will contain all the necessary code from your local machine that you will need access to on the remote resource. This includes the training script, and any additional files your training script depends on."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"script_folder = './tf-resume-training'\n",
"os.makedirs(script_folder, exist_ok=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copy the training script `tf_mnist_with_checkpoint.py` into this project directory."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import shutil\n",
"\n",
"# the training logic is in the tf_mnist_with_checkpoint.py file.\n",
"shutil.copy('./tf_mnist_with_checkpoint.py', script_folder)\n",
"\n",
"# the utils.py just helps loading data from the downloaded MNIST dataset into numpy arrays.\n",
"shutil.copy('./utils.py', script_folder)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create an experiment\n",
"Create an [Experiment](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#experiment) to track all the runs in your workspace for this distributed TensorFlow tutorial. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Experiment\n",
"\n",
"experiment_name = 'tf-resume-training'\n",
"experiment = Experiment(ws, name=experiment_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a TensorFlow estimator\n",
"The AML SDK's TensorFlow estimator enables you to easily submit TensorFlow training jobs for both single-node and distributed runs. For more information on the TensorFlow estimator, refer [here](https://docs.microsoft.com/azure/machine-learning/service/how-to-train-tensorflow).\n",
"\n",
"The TensorFlow estimator also takes a `framework_version` parameter -- if no version is provided, the estimator will default to the latest version supported by AzureML. Use `TensorFlow.get_supported_versions()` to get a list of all versions supported by your current SDK version or see the [SDK documentation](https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.dnn?view=azure-ml-py) for the versions supported in the most current release."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.train.dnn import TensorFlow\n",
"\n",
"script_params={\n",
" '--data-folder': ds_data\n",
"}\n",
"\n",
"estimator= TensorFlow(source_directory=script_folder,\n",
" compute_target=compute_target,\n",
" script_params=script_params,\n",
" entry_script='tf_mnist_with_checkpoint.py')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the above code, we passed our training data reference `ds_data` to our script's `--data-folder` argument. This will 1) mount our datastore on the remote compute and 2) provide the path to the data zip file on our datastore."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Submit job\n",
"### Run your experiment by submitting your estimator object. Note that this call is asynchronous."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run = experiment.submit(estimator)\n",
"print(run)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Monitor your run\n",
"You can monitor the progress of the run with a Jupyter widget. Like the run submission, the widget is asynchronous and provides live updates every 10-15 seconds until the job completes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.widgets import RunDetails\n",
"RunDetails(run).show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatively, you can block until the script has completed training before running more code."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run.wait_for_completion(show_output=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Now let's resume the training from the above run"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, we will get the DataPath to the outputs directory of the above run which\n",
"contains the checkpoint files and/or model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model_location = run._get_outputs_datapath()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we will create a new TensorFlow estimator and pass in the model location. On passing 'resume_from' parameter, a new entry in script_params is created with key as 'resume_from' and value as the model/checkpoint files location and the location gets automatically mounted on the compute target."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.train.dnn import TensorFlow\n",
"\n",
"script_params={\n",
" '--data-folder': ds_data\n",
"}\n",
"\n",
"estimator2 = TensorFlow(source_directory=script_folder,\n",
" compute_target=compute_target,\n",
" script_params=script_params,\n",
" entry_script='tf_mnist_with_checkpoint.py',\n",
" resume_from=model_location)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now you can submit the experiment and it should resume from previous run's checkpoint files."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run2 = experiment.submit(estimator2)\n",
"print(run2)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"run2.wait_for_completion(show_output=True)"
]
}
],
"metadata": {
"authors": [
{
"name": "hesuri"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
},
"msauthor": "hesuri"
},
"nbformat": 4,
"nbformat_minor": 2
}
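
The notebook above says that passing `resume_from` adds an entry to `script_params` that the entry script can read. The following is a minimal sketch of how a training script might pick that value up. It is not the contents of `tf_mnist_with_checkpoint.py`: the flag spelling `--resume-from` and the helper names `parse_args` and `describe_start` are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments a checkpoint-aware training script might receive."""
    parser = argparse.ArgumentParser()
    # Location of the training data (mirrors '--data-folder' in script_params).
    parser.add_argument('--data-folder', type=str, dest='data_folder', default=None)
    # ASSUMPTION: the estimator forwards the checkpoint location under this flag.
    parser.add_argument('--resume-from', type=str, dest='resume_from', default=None)
    # parse_known_args tolerates extra arguments the script does not declare.
    args, _unknown = parser.parse_known_args(argv)
    return args


def describe_start(args):
    """Return a short message saying whether training starts fresh or resumes."""
    if args.resume_from:
        return 'resuming from ' + args.resume_from
    return 'starting from scratch'
```

A script built this way runs unchanged for both the first submission (no checkpoint location, so it starts from scratch) and the resumed one.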

View File

@@ -0,0 +1,5 @@
name: train-tensorflow-resume-training
dependencies:
- pip:
- azureml-sdk
- azureml-widgets

View File

@@ -0,0 +1,27 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
import gzip
import numpy as np
import struct
# load compressed MNIST gz files and return numpy arrays
def load_data(filename, label=False):
    with gzip.open(filename) as gz:
        struct.unpack('I', gz.read(4))
        n_items = struct.unpack('>I', gz.read(4))
        if not label:
            n_rows = struct.unpack('>I', gz.read(4))[0]
            n_cols = struct.unpack('>I', gz.read(4))[0]
            res = np.frombuffer(gz.read(n_items[0] * n_rows * n_cols), dtype=np.uint8)
            res = res.reshape(n_items[0], n_rows * n_cols)
        else:
            res = np.frombuffer(gz.read(n_items[0]), dtype=np.uint8)
            res = res.reshape(n_items[0], 1)
    return res
# one-hot encode a 1-D array
def one_hot_encode(array, num_of_classes):
    return np.eye(num_of_classes)[array.reshape(-1)]
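
The helpers above depend on the header layout of the MNIST IDX files: a 4-byte magic number followed by big-endian 32-bit counts, then raw bytes. The sketch below walks through that layout for a label file using only the standard library, so it runs without NumPy or the real dataset. The names `make_idx_labels`, `read_idx_labels` and `one_hot` are hypothetical and exist only for this illustration; the synthetic three-label file is made up.

```python
import gzip
import io
import struct


def make_idx_labels(labels):
    """Build a gzip-compressed IDX label file in memory (synthetic data)."""
    # Header: 4-byte magic number (2049 for label files), then the item
    # count, both stored as big-endian unsigned 32-bit integers.
    payload = struct.pack('>II', 2049, len(labels)) + bytes(labels)
    return gzip.compress(payload)


def read_idx_labels(blob):
    """Read labels back using the same header layout as load_data(label=True)."""
    with gzip.open(io.BytesIO(blob)) as gz:
        gz.read(4)  # skip the magic number
        n_items = struct.unpack('>I', gz.read(4))[0]
        return list(gz.read(n_items))


def one_hot(labels, num_of_classes):
    """Pure-Python equivalent of np.eye(num_of_classes)[labels]."""
    return [[1.0 if c == label else 0.0 for c in range(num_of_classes)]
            for label in labels]


labels = read_idx_labels(make_idx_labels([3, 0, 2]))
print(labels)
print(one_hot(labels, 4))
```

Image files follow the same pattern with two extra big-endian integers (rows and columns) after the item count, which is why `load_data` reads them only when `label` is false.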

View File

@@ -100,7 +100,7 @@
"\n",
"# Check core SDK version number\n",
"\n",
"print(\"This notebook was created using SDK version 1.0.48\r\n, you are currently running version\", azureml.core.VERSION)"
"print(\"This notebook was created using SDK version 1.0.53, you are currently running version\", azureml.core.VERSION)"
]
},
{

View File

@@ -120,19 +120,42 @@
"As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we could not find the cluster with the given name, then we will create a new cluster here. We will create an `AmlCompute` cluster of `STANDARD_D2_V2` CPU VMs. This process is broken down into 3 steps:\n",
"1. create the configuration (this step is local and only takes a second)\n",
"2. create the cluster (this step will take about **20 seconds**)\n",
"3. provision the VMs to bring the cluster to the initial size (of 1 in this case). This step takes about **3-5 minutes** and produces only sparse output along the way. Please wait until the call returns before moving to the next cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.compute import ComputeTarget\n",
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
"from azureml.core.compute_target import ComputeTargetException\n",
"\n",
"# choose a name for your cluster\n",
"cluster_name = \"cpu-cluster\"\n",
"\n",
"compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
"print('Found existing compute target.')\n",
"try:\n",
" compute_target = ComputeTarget(workspace=ws, name=cluster_name)\n",
" print('Found existing compute target')\n",
"except ComputeTargetException:\n",
" print('Creating a new compute target...')\n",
" compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2', \n",
" max_nodes=4)\n",
"\n",
" # create the cluster\n",
" compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
"\n",
" # can poll for a minimum number of nodes and for a specific timeout. \n",
" # if no min node count is provided it uses the scale settings for the cluster\n",
" compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
"\n",
"# use get_status() to get a detailed status for the current cluster. \n",
"print(compute_target.get_status().serialize())"
@@ -142,7 +165,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The above code retrieves an existing CPU compute target. Scikit-learn does not support GPU computing."
"The above code retrieves a CPU compute target. Scikit-learn does not support GPU computing."
]
},
{
@@ -289,7 +312,7 @@
" script_params=script_params,\n",
" compute_target=compute_target,\n",
" entry_script='train_iris.py',\n",
" pip_packages=['joblib']\n",
" pip_packages=['joblib==0.13.2']\n",
" )"
]
},
@@ -507,7 +530,7 @@
"metadata": {},
"outputs": [],
"source": [
"model = best_run.register_model(model_name='sklearn-iris', model_path='model.joblib')"
"model = best_run.register_model(model_name='sklearn-iris', model_path='outputs/model.joblib')"
]
}
],

View File

@@ -1,6 +1,7 @@
# Modified from https://www.geeksforgeeks.org/multiclass-classification-using-scikit-learn/
import argparse
import os
# importing necessary libraries
import numpy as np
@@ -50,8 +51,9 @@ def main():
    cm = confusion_matrix(y_test, svm_predictions)
    print(cm)
    # save model
    joblib.dump(svm_model_linear, 'model.joblib')
    os.makedirs('outputs', exist_ok=True)
    # files saved in the "outputs" folder are automatically uploaded into run history
    joblib.dump(svm_model_linear, 'outputs/model.joblib')
if __name__ == '__main__':

View File

@@ -102,7 +102,7 @@
"source": [
"import azureml.core\n",
"\n",
"print(\"This notebook was created using version 1.0.48\r\n of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.0.53 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},

View File

@@ -0,0 +1,45 @@
-----BEGIN PRIVATE KEY-----
MIIEvgIBADANBgkqhkiG9w0BAQEFAASCBKgwggSkAgEAAoIBAQC/C0oc6vvF1UEc
y9JeGDXdtKynG11wTTIHIokFhNinHNSpJBLmNWFyFkqzvjJCPR4kWuqw4IXhCS3L
VoqRmT680SvUFFF6HnEaa75Bc1YSACn1ZsHuCRGrqO9BaTgt3mM0sRYC67+f+W0E
tA+k+EA0XnTtDdEBX3RLzvaYAR4yijEHIBQeeNemPYK4msW6Xw67ib1xn59blX4Z
a4Z85FjrekmoTl9493bFj6znDTX6wpKsPF7WLEF9S+oD/Lg4EHBi9BfefFxQpGZ9
FQHToFKyz1tA2iaY/9LjCtJcincMkuXt3KuQA4Nv2GiTzz4+FEy1pOqHnyNL2tFR
1G5n04BHAgMBAAECggEAAqcXeltQ76hMZSf3XdMcPF3b394jaAHKZgr2uBrmHzvp
QAf+MzAekET6+I/1hrHujzar95TGhx9ngWFMP0VPd7O31hQKJZXyoBlK5QHC+jEC
ZCPvIW0Cz81itRfO7eQeoIas9ZFscb4240/Uv8eqrI97NCdy9X/rz3mqNuYdEzqN
2v9XlwE/Fyx79O1PQqzPRiQt3n4ss9NO169y7X99KUZtYiZAiyBBGS8wYdaGF69G
URZ3qwoUE+nByZdeRfFLLTy+UDCOwQZV+0V4p0J++YLqQAac340A1F4D60qzMHnv
KVKnMc+RrYYVFOZU+USRlphSl3Ws5j0u94CiLitK4QKBgQDivJVHNmk1JleI/MPF
bx/YT5gzcVRFhGxkGso12JrQiFPs05JmoRFaqNBDNoZYDn2ggUrMwZVfPI5C6+7U
tCe2vrjVpvcAO9reK1u4N9ohpUpkocxWQy0nNHlrorDTZnyKreRtPC87W8xpiwl4
R/+nMgGd8vex7tGfchpThj8ZeQKBgQDXs2sgpE8vmnZBWrXAuGD8M9VnfcALEjwL
Fi3NR+XCr8jHkeIJVbSI2/asWsBGg8v6gV6Cdx9KV9r+fHDzdocS85X4P7crP83A
IX2rTT6Hsmc170SzCDa2jJJyLHQ6qtXBS9ZW8/dPFc1fiBf0NcmTLrRoNg5N8Px6
Qt0T51q3vwKBgQCYAfhOetMD2AW9iEAzwDFoUsxmSKdHx+TnI/LHMMVx4sPpNVqk
RX2d+ylMtmRQ6r4cejHMnkfnRnDVutkubu1lHe5LBpn35Sjx472k/oTWI7uBRdv5
RSYjb5GrsLG9uKrsSnKnLT85G20qoRUjN5nU3LiqzPZ0qviMXfH6ZzkseQKBgQCT
ft6MTY7QUGD4w5xxEiNPkeolgHmnmGpyclITg0x7WlSDEyBrna17wF3m8Y91KH58
56XGtMoyvezEBDgAY1ZuAR7VyEvqSRDahow2bPWLONUWrmxduAohvfIOHJPF4jeU
m9UPVHgSHih3YMpwda9G87LtZ7lUVqtutvYRvCvuZQKBgAypo514DZW7Y9lMCgkR
GpJLKCWFR0Sl9bQXI7N5nAG0YFz5ZhdA1PjS2tj+OKyWR6wekbv3g0CyVXT4XYsi
tKRu9PR2OUQLPv/h2qLAeSOYdScfWoOU5tlb4tkLoUNmj5/N9VpqbvLdDh6hPWQL
o4s+29QYKEoNmOrcZ6oRkRP8
-----END PRIVATE KEY-----
-----BEGIN CERTIFICATE-----
MIICoTCCAYkCAgPoMA0GCSqGSIb3DQEBBQUAMBQxEjAQBgNVBAMMCUNMSS1Mb2dp
bjAiGA8yMDE5MDUwMzIwMDIwOVoYDzIwMjAwNTAzMjAwMjExWjAUMRIwEAYDVQQD
DAlDTEktTG9naW4wggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQC/C0oc
6vvF1UEcy9JeGDXdtKynG11wTTIHIokFhNinHNSpJBLmNWFyFkqzvjJCPR4kWuqw
4IXhCS3LVoqRmT680SvUFFF6HnEaa75Bc1YSACn1ZsHuCRGrqO9BaTgt3mM0sRYC
67+f+W0EtA+k+EA0XnTtDdEBX3RLzvaYAR4yijEHIBQeeNemPYK4msW6Xw67ib1x
n59blX4Za4Z85FjrekmoTl9493bFj6znDTX6wpKsPF7WLEF9S+oD/Lg4EHBi9Bfe
fFxQpGZ9FQHToFKyz1tA2iaY/9LjCtJcincMkuXt3KuQA4Nv2GiTzz4+FEy1pOqH
nyNL2tFR1G5n04BHAgMBAAEwDQYJKoZIhvcNAQEFBQADggEBAGz3pOgNPESr+QoO
OVCgSS6VtWlmrAcxl5JaiNBFpBGAqfvbfRe1eZY7Rn6fuw1jc3pPBVzNTf8Plel+
DcuLzDLJAEag2GpRE+Xg57DNSwPqP6jZfHRE/ufLwIRLcNG9wRUwqlBvdAu1Kign
nlTZvTEAwxlQdvmIIT1XrTLZ+OwtVXcgrf0vInmueZKz/UDqsSDPY+d426S9eOWt
60h2WgXPU3QvBYfA6Yd2ReeP3+SHwBd4/1ByNFWBytcI9ow3pp2JznU366dfX4IQ
Q0iOTvHzXbfPmtsxqho6+hBbLvXVNWJMg8e22Pp/TyXYqeV5V09k18EgCnuA/9Gd
kKDVROA=
-----END CERTIFICATE-----

View File

@@ -222,7 +222,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
"version": "3.6.4"
},
"notice": "Copyright (c) Microsoft Corporation. All rights reserved. Licensed under the MIT License."
},

View File

@@ -47,6 +47,7 @@
"[Read PostgreSQL](#postgresql)<br>\n",
"[Read From Azure Blob](#azure-blob)<br>\n",
"[Read From ADLS](#adls)<br>\n",
"[Read From ADLSGen2](#adlsgen2)<br>\n",
"[Read Pandas DataFrame](#pandas-df)<br>"
]
},
@@ -315,6 +316,25 @@
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can see in the results that the FBI Code column now contains some NaN values where before, when calling head, it didn't. By default, `to_pandas_dataframe` attempts to coalesce columns into a single type for better performance and lower memory overhead. This specific column has a mixture of both numbers and strings, and the strings were replaced with NaN values.\n",
"\n",
"If you wish to keep the mixed-type column in the Pandas DataFrame, you can set the `extended_types` argument to True when calling `to_pandas_dataframe`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df = dflow_skipped_rows.to_pandas_dataframe(extended_types=True)\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -635,7 +655,7 @@
"metadata": {},
"outputs": [],
"source": [
"df = dflow.to_pandas_dataframe()\n",
"df = dflow.to_pandas_dataframe(extended_types=True)\n",
"df.dtypes"
]
},
@@ -751,7 +771,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"There are two ways the Data Prep API can acquire the necessary OAuth token to access Azure DataLake Storage:\n",
"Data Prep currently supports both ADLS and ADLSGen2. There are two ways the Data Prep API can acquire the necessary OAuth token to access Azure DataLake Storage:\n",
"1. Retrieve the access token from a recent login session of the user's [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest) login.\n",
"2. Use a ServicePrincipal (SP) and a certificate as a secret."
]
@@ -883,6 +903,70 @@
"dflow.to_pandas_dataframe().head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"adlsgen2\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Read from ADLSGen2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Please refer to the [Read From ADLS](#adls) section above for details on how to register a Service Principal and obtain an OAuth access token."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure ADLSGen2 Account for ServicePrincipal"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"certThumbprint = '23:66:84:6B:3A:14:9E:B1:17:CA:EE:E3:BB:2C:21:2D:20:B0:DF:F2'\n",
"certificate = ''\n",
"with open('../data/ADLSgen2-datapreptest.crt', 'rt', encoding='utf-8') as crtFile:\n",
" certificate = crtFile.read()\n",
"\n",
"servicePrincipalAppId = \"127a58c3-f307-46a1-969e-a6b63da3f411\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Acquire an OAuth Access Token for ADLSGen2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import adal\n",
"from azureml.dataprep.api.datasources import ADLSGen2\n",
"\n",
"ctx = adal.AuthenticationContext('https://login.microsoftonline.com/72f988bf-86f1-41af-91ab-2d7cd011db47')\n",
"token = ctx.acquire_token_with_client_certificate('https://storage.azure.com/', servicePrincipalAppId, certificate, certThumbprint)\n",
"dflow = dprep.read_csv(path = ADLSGen2(path='https://adlsgen2datapreptest.dfs.core.windows.net/datapreptest/people.csv', accessToken=token['accessToken']))\n",
"dflow.to_pandas_dataframe().head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -923,7 +1007,24 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"After loading in the data you can now do `read_pandas_dataframe`."
"After loading in the data you can now do `read_pandas_dataframe`. If you only need to consume the Dataflow created from the current environment, you can read the DataFrame in memory."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dflow_df = dprep.read_pandas_dataframe(df, in_memory=True)\n",
"dflow_df.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"However, if you intend to use this Dataflow past the end of your current Python session (such as by saving the Dataflow to a file), you can provide a cache directory where the contents of the DataFrame will be stored so they can be retrieved later."
]
},
{

View File

@@ -183,6 +183,37 @@
"dflow_adls = dprep.read_csv(path=DataPath(datastore, path_on_datastore='/input/crime0-10.csv'))\n",
"dflow_adls.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now you can read files from the `dataprep_adlsgen2` datastore, which references an ADLSGen2 Storage account."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# read a file from ADLSGen2\n",
"datastore = Datastore(workspace=workspace, name='adlsgen2')\n",
"dflow_adlsgen2 = dprep.read_csv(path=DataPath(datastore, path_on_datastore='/testfolder/peopletest.csv'))\n",
"dflow_adlsgen2.head(5)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# read all files from ADLSGen2 directory\n",
"datastore = Datastore(workspace=workspace, name='adlsgen2')\n",
"dflow_adlsgen2 = dprep.read_csv(path=DataPath(datastore, path_on_datastore='/testfolder/testdir'))\n",
"dflow_adlsgen2.head()"
]
}
],
"metadata": {

View File

@@ -186,7 +186,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we have successfully split the data into useful columns through examples. "
"Now we have successfully split the data into useful columns through examples."
]
}
],