Compare commits

16 Commits

Author | SHA1 | Message | Date
Sheri Gilley | 98d24243bd | add cell metadata | 2020-02-04 11:32:41 -06:00
Sheri Gilley | 3ee5a4c2b2 | Update train-within-notebook.ipynb | 2020-02-04 11:06:41 -06:00
Sheri Gilley | fd60846887 | Update train-within-notebook.ipynb | 2020-02-04 09:13:56 -06:00
Harneet Virk | e895d7c2bf | update samples - test (#758); Co-authored-by: vizhur <vizhur@live.com> | 2020-01-31 15:19:58 -05:00
Shané Winner | 3588eb9665 | Update index.md | 2020-01-23 15:46:43 -08:00
Harneet Virk | a09e726f31 | update samples - test (#748); Co-authored-by: vizhur <vizhur@live.com> | 2020-01-23 16:50:29 -05:00
Shané Winner | 4fb1d9ee5b | Update index.md | 2020-01-22 11:38:24 -08:00
Harneet Virk | b05ff80e9d | update samples from Release-169 as a part of 1.0.85 SDK release (#742); Co-authored-by: vizhur <vizhur@live.com> | 2020-01-21 18:00:15 -05:00
Shané Winner | 512630472b | Update index.md | 2020-01-08 14:52:23 -08:00
vizhur | ae1337fe70 | Merge pull request #724 from Azure/release_update/Release-167; update samples from Release-167 as a part of 1.0.83 SDK release | 2020-01-06 15:38:25 -05:00
vizhur | c95f970dc8 | update samples from Release-167 as a part of 1.0.83 SDK release | 2020-01-06 20:16:21 +00:00
Shané Winner | 9b9d112719 | Update index.md | 2019-12-24 07:40:48 -08:00
vizhur | fe8fcd4b48 | Merge pull request #712 from Azure/release_update/Release-31; update samples - test | 2019-12-23 20:28:02 -05:00
vizhur | 296ae01587 | update samples - test | 2019-12-24 00:42:48 +00:00
Shané Winner | 8f4efe15eb | Update index.md | 2019-12-10 09:05:23 -08:00
vizhur | d179080467 | Merge pull request #690 from Azure/release_update/Release-163; update samples from Release-163 as a part of 1.0.79 SDK release | 2019-12-09 15:41:03 -05:00
114 changed files with 2375 additions and 3862 deletions

View File

@@ -20,8 +20,8 @@ If you want to...
* ...try out and explore Azure ML, start with image classification tutorials: [Part 1 (Training)](./tutorials/img-classification-part1-training.ipynb) and [Part 2 (Deployment)](./tutorials/img-classification-part2-deploy.ipynb).
* ...learn about experimentation and tracking run history, first [train within Notebook](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), then try [training on remote VM](./how-to-use-azureml/training/train-on-remote-vm/train-on-remote-vm.ipynb) and [using logging APIs](./how-to-use-azureml/training/logging-api/logging-api.ipynb).
* ...train deep learning models at scale, first learn about [Machine Learning Compute](./how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb), and then try [distributed hyperparameter tuning](./how-to-use-azureml/training-with-deep-learning/train-hyperparameter-tune-deploy-with-pytorch/train-hyperparameter-tune-deploy-with-pytorch.ipynb) and [distributed training](./how-to-use-azureml/training-with-deep-learning/distributed-pytorch-with-horovod/distributed-pytorch-with-horovod.ipynb).
* ...deploy models as a realtime scoring service, first learn the basics by [training within Notebook and deploying to Azure Container Instance](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), then learn how to [register and manage models, and create Docker images](./how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb), and [production deploy models on Azure Kubernetes Cluster](./how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb).
* ...deploy models as a batch scoring service, first [train a model within Notebook](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), learn how to [register and manage models](./how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.ipynb), then [create Machine Learning Compute for scoring compute](./how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb), and [use Machine Learning Pipelines to deploy your model](https://aka.ms/pl-batch-scoring).
* ...deploy models as a realtime scoring service, first learn the basics by [training within Notebook and deploying to Azure Container Instance](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), then learn how to [production deploy models on Azure Kubernetes Cluster](./how-to-use-azureml/deployment/production-deploy-to-aks/production-deploy-to-aks.ipynb).
* ...deploy models as a batch scoring service, first [train a model within Notebook](./how-to-use-azureml/training/train-within-notebook/train-within-notebook.ipynb), then [create Machine Learning Compute for scoring compute](./how-to-use-azureml/training/train-on-amlcompute/train-on-amlcompute.ipynb), and [use Machine Learning Pipelines to deploy your model](https://aka.ms/pl-batch-scoring).
* ...monitor your deployed models, learn about using [App Insights](./how-to-use-azureml/deployment/enable-app-insights-in-production-service/enable-app-insights-in-production-service.ipynb).
## Tutorials

View File

@@ -103,7 +103,7 @@
"source": [
"import azureml.core\n",
"\n",
"print(\"This notebook was created using version 1.0.79 of the Azure ML SDK\")\n",
"print(\"This notebook was created using version 1.0.85 of the Azure ML SDK\")\n",
"print(\"You are currently using version\", azureml.core.VERSION, \"of the Azure ML SDK\")"
]
},

View File

@@ -9,7 +9,6 @@ As a pre-requisite, run the [configuration Notebook](../configuration.ipynb) not
* [train-on-amlcompute](./training/train-on-amlcompute): Use a 1-n node Azure ML managed compute cluster for remote runs on Azure CPU or GPU infrastructure.
* [train-on-remote-vm](./training/train-on-remote-vm): Use Data Science Virtual Machine as a target for remote runs.
* [logging-api](./track-and-monitor-experiments/logging-api): Learn about the details of logging metrics to run history.
* [register-model-create-image-deploy-service](./deployment/register-model-create-image-deploy-service): Learn about the details of model management.
* [production-deploy-to-aks](./deployment/production-deploy-to-aks): Deploy a model to production at scale on Azure Kubernetes Service.
* [enable-app-insights-in-production-service](./deployment/enable-app-insights-in-production-service): Learn how to use App Insights with a production web service.

View File

@@ -197,6 +197,17 @@ If automl_setup_linux.sh fails on Ubuntu Linux with the error: `unable to execut
4) Check that the region is one of the supported regions: `eastus2`, `eastus`, `westcentralus`, `southeastasia`, `westeurope`, `australiaeast`, `westus2`, `southcentralus`
5) Check that you have access to the region using the Azure Portal.
## import AutoMLConfig fails after upgrade from before 1.0.76 to 1.0.76 or later
There were package changes in automated machine learning version 1.0.76, which require the previous version to be uninstalled before upgrading to the new version.
If you have manually upgraded from a version of automated machine learning before 1.0.76 to 1.0.76 or later, you may get the error:
`ImportError: cannot import name 'AutoMLConfig'`
This can be resolved by running:
`pip uninstall azureml-train-automl` and then
`pip install azureml-train-automl`
The automl_setup.cmd script does this automatically.
## workspace.from_config fails
If the call `ws = Workspace.from_config()` fails:
1) Make sure that you have run the `configuration.ipynb` notebook successfully.
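
Review note on the `import AutoMLConfig fails` entry in this hunk: the uninstall and reinstall is only needed when an upgrade crosses the 1.0.76 boundary. A minimal sketch of that check follows. The helpers `parse_version` and `crosses_automl_package_change` are hypothetical names for illustration, not part of the Azure ML SDK, and the sketch assumes plain dotted numeric version strings.

```python
# Hypothetical helpers for illustration; not part of the Azure ML SDK.
PACKAGE_CHANGE_VERSION = (1, 0, 76)  # version named in the troubleshooting entry


def parse_version(version):
    """Turn a dotted numeric version string such as '1.0.85' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def crosses_automl_package_change(old_version, new_version):
    """Return True when an upgrade goes from before 1.0.76 to 1.0.76 or later."""
    old = parse_version(old_version)
    new = parse_version(new_version)
    return old < PACKAGE_CHANGE_VERSION <= new


if __name__ == "__main__":
    for old, new in [("1.0.72", "1.0.85"), ("1.0.79", "1.0.85")]:
        needed = crosses_automl_package_change(old, new)
        print(old, "->", new, "uninstall and reinstall needed:", needed)
```

When the check is true, the entry's `pip uninstall azureml-train-automl` followed by `pip install azureml-train-automl` applies.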

View File

@@ -2,7 +2,7 @@ name: azure_automl
dependencies:
# The python interpreter version.
# Currently Azure ML only supports 3.5.2 and later.
- pip
- pip<=19.3.1
- python>=3.5.2,<3.6.8
- nb_conda
- matplotlib==2.1.0
@@ -13,7 +13,6 @@ dependencies:
- scikit-learn>=0.19.0,<=0.20.3
- pandas>=0.22.0,<=0.23.4
- py-xgboost<=0.80
- pyarrow>=0.11.0
- fbprophet==0.5
- pytorch=1.1.0
- cudatoolkit=9.0
@@ -30,7 +29,7 @@ dependencies:
- pytorch-transformers==1.0.0
- spacy==2.1.8
- joblib
- onnxruntime==0.4.0
- onnxruntime==1.0.0
- https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
channels:

View File

@@ -2,7 +2,7 @@ name: azure_automl
dependencies:
# The python interpreter version.
# Currently Azure ML only supports 3.5.2 and later.
- pip
- pip<=19.3.1
- nomkl
- python>=3.5.2,<3.6.8
- nb_conda
@@ -14,7 +14,6 @@ dependencies:
- scikit-learn>=0.19.0,<=0.20.3
- pandas>=0.22.0,<0.23.0
- py-xgboost<=0.80
- pyarrow>=0.11.0
- fbprophet==0.5
- pytorch=1.1.0
- cudatoolkit=9.0
@@ -31,7 +30,7 @@ dependencies:
- pytorch-transformers==1.0.0
- spacy==2.1.8
- joblib
- onnxruntime==0.4.0
- onnxruntime==1.0.0
- https://aka.ms/automl-resources/packages/en_core_web_sm-2.1.0.tar.gz
channels:

View File

@@ -92,6 +92,32 @@
"from azureml.explain.model._internal.explanation_client import ExplanationClient"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Accessing the Azure ML workspace requires authentication with Azure.\n",
"\n",
"The default authentication is interactive authentication using the default tenant. Executing the `ws = Workspace.from_config()` line in the cell below will prompt for authentication the first time that it is run.\n",
"\n",
"If you have multiple Azure tenants, you can specify the tenant by replacing the `ws = Workspace.from_config()` line in the cell below with the following:\n",
"\n",
"```\n",
"from azureml.core.authentication import InteractiveLoginAuthentication\n",
"auth = InteractiveLoginAuthentication(tenant_id = 'mytenantid')\n",
"ws = Workspace.from_config(auth = auth)\n",
"```\n",
"\n",
"If you need to run in an environment where interactive login is not possible, you can use Service Principal authentication by replacing the `ws = Workspace.from_config()` line in the cell below with the following:\n",
"\n",
"```\n",
"from azureml.core.authentication import ServicePrincipalAuthentication\n",
"auth = ServicePrincipalAuthentication('mytenantid', 'myappid', 'mypassword')\n",
"ws = Workspace.from_config(auth = auth)\n",
"```\n",
"For more details, see [aka.ms/aml-notebook-auth](http://aka.ms/aml-notebook-auth)"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -288,7 +314,7 @@
"|**blacklist_models** | *List* of *strings* indicating machine learning algorithms for AutoML to avoid in this run. <br><br> Allowed values for **Classification**<br><i>LogisticRegression</i><br><i>SGD</i><br><i>MultinomialNaiveBayes</i><br><i>BernoulliNaiveBayes</i><br><i>SVM</i><br><i>LinearSVM</i><br><i>KNN</i><br><i>DecisionTree</i><br><i>RandomForest</i><br><i>ExtremeRandomTrees</i><br><i>LightGBM</i><br><i>GradientBoosting</i><br><i>TensorFlowDNN</i><br><i>TensorFlowLinearClassifier</i><br><br>Allowed values for **Regression**<br><i>ElasticNet</i><br><i>GradientBoosting</i><br><i>DecisionTree</i><br><i>KNN</i><br><i>LassoLars</i><br><i>SGD</i><br><i>RandomForest</i><br><i>ExtremeRandomTrees</i><br><i>LightGBM</i><br><i>TensorFlowLinearRegressor</i><br><i>TensorFlowDNN</i><br><br>Allowed values for **Forecasting**<br><i>ElasticNet</i><br><i>GradientBoosting</i><br><i>DecisionTree</i><br><i>KNN</i><br><i>LassoLars</i><br><i>SGD</i><br><i>RandomForest</i><br><i>ExtremeRandomTrees</i><br><i>LightGBM</i><br><i>TensorFlowLinearRegressor</i><br><i>TensorFlowDNN</i><br><i>Arima</i><br><i>Prophet</i>|\n",
"| **whitelist_models** | *List* of *strings* indicating machine learning algorithms for AutoML to use in this run. Same values listed above for **blacklist_models** allowed for **whitelist_models**.|\n",
"|**experiment_exit_score**| Value indicating the target for *primary_metric*. <br>Once the target is surpassed the run terminates.|\n",
"|**experiment_timeout_minutes**| Maximum amount of time in minutes that all iterations combined can take before the experiment terminates.|\n",
"|**experiment_timeout_hours**| Maximum amount of time in hours that all iterations combined can take before the experiment terminates.|\n",
"|**enable_early_stopping**| Flag to enable early termination if the score is not improving in the short term.|\n",
"|**featurization**| 'auto' / 'off' Indicator for whether featurization step should be done automatically or not. Note: If the input data is sparse, featurization cannot be turned on.|\n",
"|**n_cross_validations**|Number of cross validation splits.|\n",
@@ -306,7 +332,7 @@
"outputs": [],
"source": [
"automl_settings = {\n",
" \"experiment_timeout_minutes\" : 20,\n",
" \"experiment_timeout_hours\" : 0.3,\n",
" \"enable_early_stopping\" : True,\n",
" \"iteration_timeout_minutes\": 5,\n",
" \"max_concurrent_iterations\": 4,\n",
@@ -694,10 +720,10 @@
"from azureml.core.webservice import AciWebservice\n",
"from azureml.core.webservice import Webservice\n",
"from azureml.core.model import Model\n",
"from azureml.core.environment import Environment\n",
"\n",
"inference_config = InferenceConfig(runtime = \"python\", \n",
" entry_script = script_file_name,\n",
" conda_file = conda_env_file_name)\n",
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=conda_env_file_name)\n",
"inference_config = InferenceConfig(entry_script=script_file_name, environment=myenv)\n",
"\n",
"aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1, \n",
" memory_gb = 1, \n",
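
Review note on the timeout change in this file: `experiment_timeout_minutes` of 20 becomes `experiment_timeout_hours` of 0.3, which works out to 18 minutes rather than an exact conversion. The sketch below shows the arithmetic only. `minutes_to_hours` and `hours_to_minutes` are hypothetical helpers, not SDK functions.

```python
# Hypothetical helpers for illustration; not part of the Azure ML SDK.
def minutes_to_hours(minutes):
    """Convert a timeout given in minutes to fractional hours."""
    return minutes / 60.0


def hours_to_minutes(hours):
    """Convert a timeout given in fractional hours back to minutes."""
    return hours * 60.0


if __name__ == "__main__":
    # Old value from the diff: 20 minutes, expressed in hours.
    print(round(minutes_to_hours(20), 4))
    # New value from the diff: 0.3 hours, expressed in minutes.
    print(round(hours_to_minutes(0.3), 4))
```

If the original budget needs to be preserved exactly, pass the unrounded quotient instead of a rounded literal.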

View File

@@ -2,12 +2,10 @@ name: auto-ml-classification-bank-marketing-all-features
dependencies:
- pip:
- azureml-sdk
- interpret
- azureml-defaults
- azureml-train-automl
- azureml-widgets
- matplotlib
- pandas_ml
- onnxruntime==0.4.0
- interpret
- onnxruntime==1.0.0
- azureml-explain-model
- azureml-contrib-interpret

View File

@@ -210,10 +210,9 @@
"automl_settings = {\n",
" \"n_cross_validations\": 3,\n",
" \"primary_metric\": 'average_precision_score_weighted',\n",
" \"preprocess\": True,\n",
" \"enable_early_stopping\": True,\n",
" \"max_concurrent_iterations\": 2, # This is a limit for testing purposes, please increase it as per cluster size\n",
" \"experiment_timeout_minutes\": 10, # This is a time limit for testing purposes, remove it for real use cases, this will drastically limit the ability to find the best model possible\n",
" \"experiment_timeout_hours\": 0.2, # This is a time limit for testing purposes, remove it for real use cases, this will drastically limit the ability to find the best model possible\n",
" \"verbosity\": logging.INFO,\n",
"}\n",
"\n",
@@ -305,7 +304,7 @@
"source": [
"#### Explain model\n",
"\n",
"Automated ML models can be explained and visualized using the SDK Explainability library. [Learn how to use the explainer](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/model-explanation-remote-amlcompute/auto-ml-model-explanations-remote-compute.ipynb)."
"Automated ML models can be explained and visualized using the SDK Explainability library. "
]
},
{
@@ -334,17 +333,7 @@
"metadata": {},
"source": [
"#### Print the properties of the model\n",
"The fitted_model is a python object and you can read the different properties of the object.\n",
"See *Print the properties of the model* section in [this sample notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/classification/auto-ml-classification.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Deploy\n",
"\n",
"To deploy the model into a web service endpoint, see _Deploy_ section in [this sample notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/classification-with-deployment/auto-ml-classification-with-deployment.ipynb)"
"The fitted_model is a python object and you can read the different properties of the object.\n"
]
},
{

View File

@@ -2,10 +2,8 @@ name: auto-ml-classification-credit-card-fraud
dependencies:
- pip:
- azureml-sdk
- interpret
- azureml-defaults
- azureml-explain-model
- azureml-train-automl
- azureml-widgets
- matplotlib
- pandas_ml
- interpret
- azureml-explain-model

View File

@@ -275,7 +275,6 @@
"automl_settings = {\n",
" \"experiment_timeout_minutes\": 20,\n",
" \"primary_metric\": 'accuracy',\n",
" \"preprocess\": True,\n",
" \"max_concurrent_iterations\": 4, \n",
" \"max_cores_per_iteration\": -1,\n",
" \"enable_dnn\": True,\n",
@@ -519,12 +518,12 @@
"name": "anshirga"
}
],
"datasets": [
"None"
],
"compute": [
"AML Compute"
],
"datasets": [
"None"
],
"deployment": [
"None"
],

View File

@@ -3,8 +3,6 @@ dependencies:
- pip:
- azureml-sdk
- azureml-train-automl
- azureml-train
- azureml-widgets
- matplotlib
- pandas_ml
- statsmodels
- azurmel-train

View File

@@ -210,7 +210,24 @@
"metadata": {},
"source": [
"## Data Ingestion Pipeline \n",
"For this demo, we will use NOAA weather data from [Azure Open Datasets](https://azure.microsoft.com/services/open-datasets/). You can replace this with your own dataset, or you can skip this pipeline if you already have a time-series based `TabularDataset`.\n",
"For this demo, we will use NOAA weather data from [Azure Open Datasets](https://azure.microsoft.com/services/open-datasets/). You can replace this with your own dataset, or you can skip this pipeline if you already have a time-series based `TabularDataset`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# The name and target column of the Dataset to create \n",
"dataset = \"NOAA-Weather-DS4\"\n",
"target_column_name = \"temperature\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Upload Data Step\n",
"The data ingestion pipeline has a single step with a script to query the latest weather data and upload it to the blob store. During the first run, the script will create and register a time-series based `TabularDataset` with the past one week of weather data. For each subsequent run, the script will create a partition in the blob store by querying NOAA for new weather data since the last modified time of the dataset (`dataset.data_changed_time`) and creating a data.csv file."
@@ -225,8 +242,6 @@
"from azureml.pipeline.core import Pipeline, PipelineParameter\n",
"from azureml.pipeline.steps import PythonScriptStep\n",
"\n",
"# The name of the Dataset to create \n",
"dataset = \"NOAA-Weather-DS4\"\n",
"ds_name = PipelineParameter(name=\"ds_name\", default_value=dataset)\n",
"upload_data_step = PythonScriptStep(script_name=\"upload_weather_data.py\", \n",
" allow_reuse=False,\n",
@@ -272,7 +287,7 @@
"## Training Pipeline\n",
"### Prepare Training Data Step\n",
"\n",
"Script to bring data into common X,y format. We need to set allow_reuse flag to False to allow the pipeline to run even when inputs don't change. We also need the name of the model to check the time the model was last trained."
"Script to check if new data is available since the model was last trained. If no new data is available, we cancel the remaining pipeline steps. We need to set allow_reuse flag to False to allow the pipeline to run even when inputs don't change. We also need the name of the model to check the time the model was last trained."
]
},
{
@@ -283,11 +298,8 @@
"source": [
"from azureml.pipeline.core import PipelineData\n",
"\n",
"target_column = PipelineParameter(\"target_column\", default_value=\"y\")\n",
"# The model name with which to register the trained model in the workspace.\n",
"model_name = PipelineParameter(\"model_name\", default_value=\"y\")\n",
"output_x = PipelineData(\"output_x\", datastore=dstor)\n",
"output_y = PipelineData(\"output_y\", datastore=dstor)"
"model_name = PipelineParameter(\"model_name\", default_value=\"noaaweatherds\")"
]
},
{
@@ -299,16 +311,23 @@
"data_prep_step = PythonScriptStep(script_name=\"check_data.py\", \n",
" allow_reuse=False,\n",
" name=\"check_data\",\n",
" arguments=[\"--target_column\", target_column,\n",
" \"--output_x\", output_x,\n",
" \"--output_y\", output_y,\n",
" \"--ds_name\", ds_name,\n",
" \"--model_name\", model_name],\n",
" outputs=[output_x, output_y], \n",
" arguments=[\"--ds_name\", ds_name,\n",
" \"--model_name\", model_name],\n",
" compute_target=compute_target, \n",
" runconfig=conda_run_config)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Dataset\n",
"train_ds = Dataset.get_by_name(ws, dataset)\n",
"train_ds = train_ds.drop_columns([\"partition_date\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -324,14 +343,13 @@
"outputs": [],
"source": [
"from azureml.train.automl import AutoMLConfig\n",
"from azureml.train.automl.runtime import AutoMLStep\n",
"from azureml.train.automl import AutoMLStep\n",
"\n",
"automl_settings = {\n",
" \"iteration_timeout_minutes\": 20,\n",
" \"experiment_timeout_minutes\": 30,\n",
" \"iteration_timeout_minutes\": 10,\n",
" \"experiment_timeout_hours\": 0.2,\n",
" \"n_cross_validations\": 3,\n",
" \"primary_metric\": 'r2_score',\n",
" \"preprocess\": True,\n",
" \"max_concurrent_iterations\": 3,\n",
" \"max_cores_per_iteration\": -1,\n",
" \"verbosity\": logging.INFO,\n",
@@ -342,8 +360,8 @@
" debug_log = 'automl_errors.log',\n",
" path = \".\",\n",
" compute_target=compute_target,\n",
" run_configuration=conda_run_config,\n",
" data_script = \"get_data.py\",\n",
" training_data = train_ds,\n",
" label_column_name = target_column_name,\n",
" **automl_settings\n",
" )"
]
@@ -359,7 +377,7 @@
"metrics_output_name = 'metrics_output'\n",
"best_model_output_name = 'best_model_output'\n",
"\n",
"metirics_data = PipelineData(name='metrics_data',\n",
"metrics_data = PipelineData(name='metrics_data',\n",
" datastore=dstor,\n",
" pipeline_output_name=metrics_output_name,\n",
" training_output=TrainingOutput(type='Metrics'))\n",
@@ -378,8 +396,7 @@
"automl_step = AutoMLStep(\n",
" name='automl_module',\n",
" automl_config=automl_config,\n",
" inputs=[output_x, output_y],\n",
" outputs=[metirics_data, model_data],\n",
" outputs=[metrics_data, model_data],\n",
" allow_reuse=False)"
]
},
@@ -432,7 +449,7 @@
"outputs": [],
"source": [
"training_pipeline_run = experiment.submit(training_pipeline, pipeline_parameters={\n",
" \"target_column\": \"temperature\", \"ds_name\": dataset, \"model_name\": \"noaaweatherds\"})"
" \"ds_name\": dataset, \"model_name\": \"noaaweatherds\"})"
]
},
{
@@ -475,7 +492,7 @@
"source": [
"from azureml.pipeline.core import Schedule\n",
"schedule = Schedule.create(workspace=ws, name=\"RetrainingSchedule\",\n",
" pipeline_parameters={\"target_column\": \"temperature\",\"ds_name\": dataset, \"model_name\": \"noaaweatherds\"},\n",
" pipeline_parameters={\"ds_name\": dataset, \"model_name\": \"noaaweatherds\"},\n",
" pipeline_id=published_pipeline.id, \n",
" experiment_name=experiment_name, \n",
" datastore=dstor,\n",

View File

@@ -3,7 +3,6 @@ dependencies:
- pip:
- azureml-sdk
- azureml-train-automl
- azureml-pipeline
- azureml-widgets
- matplotlib
- pandas_ml
- azureml-pipeline

View File

@@ -15,32 +15,16 @@ if type(run) == _OfflineRun:
else:
ws = run.experiment.workspace
def write_output(df, path):
os.makedirs(path, exist_ok=True)
print("%s created" % path)
df.to_csv(path + "/part-00000", index=False)
print("Check for new data and prepare the data")
print("Check for new data.")
parser = argparse.ArgumentParser("split")
parser.add_argument("--target_column", type=str, help="input split features")
parser.add_argument("--ds_name", help="input dataset name")
parser.add_argument("--model_name", help="name of the deployed model")
parser.add_argument("--output_x", type=str,
help="output features")
parser.add_argument("--output_y", type=str,
help="output labels")
args = parser.parse_args()
print("Argument 1(ds_name): %s" % args.ds_name)
print("Argument 2(target_column): %s" % args.target_column)
print("Argument 3(model_name): %s" % args.model_name)
print("Argument 4(output_x): %s" % args.output_x)
print("Argument 5(output_y): %s" % args.output_y)
print("Argument 2(model_name): %s" % args.model_name)
# Get the latest registered model
try:
@@ -54,22 +38,9 @@ except Exception as e:
train_ds = Dataset.get_by_name(ws, args.ds_name)
dataset_changed_time = train_ds.data_changed_time
if dataset_changed_time > last_train_time:
# New data is available since the model was last trained
print("Dataset was last updated on {0}. Retraining...".format(dataset_changed_time))
train_ds = train_ds.drop_columns(["partition_date"])
X_train = train_ds.drop_columns(
columns=[args.target_column]).to_pandas_dataframe()
y_train = train_ds.keep_columns(
columns=[args.target_column]).to_pandas_dataframe()
non_null = y_train[args.target_column].notnull()
y = y_train[non_null]
X = X_train[non_null]
if not (args.output_x is None and args.output_y is None):
write_output(X, args.output_x)
write_output(y, args.output_y)
else:
if not dataset_changed_time > last_train_time:
print("Cancelling run since there is no new data.")
run.parent.cancel()
else:
# New data is available since the model was last trained
print("Dataset was last updated on {0}. Retraining...".format(dataset_changed_time))
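
Review note: after this change the script reduces to one comparison. It retrains only when the dataset changed after the model was last trained, and cancels otherwise. A standalone sketch of that decision follows. `should_retrain` is a hypothetical function and the timestamps are made up. The real script reads `data_changed_time` from the dataset and calls `run.parent.cancel()`.

```python
from datetime import datetime, timedelta


def should_retrain(dataset_changed_time, last_train_time):
    """Return True when new data arrived after the model was last trained."""
    return dataset_changed_time > last_train_time


if __name__ == "__main__":
    # Made-up timestamps for illustration only.
    last_train_time = datetime(2020, 1, 21, 18, 0, 15)
    fresh_data_time = last_train_time + timedelta(hours=6)
    stale_data_time = last_train_time - timedelta(days=1)

    for changed in (fresh_data_time, stale_data_time):
        if should_retrain(changed, last_train_time):
            print("Dataset was last updated on {0}. Retraining...".format(changed))
        else:
            print("Cancelling run since there is no new data.")
```

Note that equal timestamps count as no new data, matching the strict `>` in the script.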

View File

@@ -1,15 +0,0 @@
import os
import pandas as pd
def get_data():
print("In get_data")
print(os.environ['AZUREML_DATAREFERENCE_output_x'])
X_train = pd.read_csv(
os.environ['AZUREML_DATAREFERENCE_output_x'] + "/part-00000")
y_train = pd.read_csv(
os.environ['AZUREML_DATAREFERENCE_output_y'] + "/part-00000")
print(X_train.head(3))
return {"X": X_train.values, "y": y_train.values.flatten()}

View File

@@ -58,7 +58,7 @@ except Exception as e:
print(traceback.format_exc())
print("Dataset with name {0} not found, registering new dataset.".format(args.ds_name))
register_dataset = True
end_time_last_slice = datetime.today() - relativedelta(weeks=1)
end_time_last_slice = datetime.today() - relativedelta(weeks=2)
end_time = datetime.utcnow()
train_df = get_noaa_data(end_time_last_slice, end_time)
@@ -80,10 +80,10 @@ if train_df.size > 0:
target_path=folder_name,
overwrite=True,
show_progress=True)
if register_dataset:
ds = Dataset.Tabular.from_delimited_files(dstor.path("{}/**/*.csv".format(
args.ds_name)), partition_format='/{partition_date:yyyy/MM/dd/hh/mm/ss}/data.csv')
ds.register(ws, name=args.ds_name)
else:
print("No new data since {0}.".format(end_time_last_slice))
if register_dataset:
ds = Dataset.Tabular.from_delimited_files(dstor.path("{}/**/*.csv".format(
args.ds_name)), partition_format='/{partition_date:yyyy/MM/dd/HH/mm/ss}/data.csv')
ds.register(ws, name=args.ds_name)
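
Review note on the `hh` to `HH` change in `partition_format`: in common date-format conventions `HH` is the 24-hour field and `hh` the 12-hour field, which is presumably the reason for the fix. The sketch below illustrates the same distinction with Python's standard `strptime` directives (`%H` versus `%I`). It is an analogy under that assumption, not the Azure ML parser, and the sample path is made up.

```python
from datetime import datetime

# Made-up partition path with an afternoon hour (16).
SAMPLE_PARTITION = "2020/01/23/16/50/29"


def parse_24_hour(path):
    """Parse the path with a 24-hour hour field (%H)."""
    return datetime.strptime(path, "%Y/%m/%d/%H/%M/%S")


def parse_12_hour(path):
    """Parse the path with a 12-hour hour field (%I); rejects hours above 12."""
    return datetime.strptime(path, "%Y/%m/%d/%I/%M/%S")


if __name__ == "__main__":
    print(parse_24_hour(SAMPLE_PARTITION))
    try:
        parse_12_hour(SAMPLE_PARTITION)
    except ValueError as error:
        print("12-hour parse failed:", error)
```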

View File

@@ -358,7 +358,7 @@
"\n",
"automl_config = AutoMLConfig(task='forecasting', \n",
" primary_metric='normalized_root_mean_squared_error',\n",
" experiment_timeout_minutes = 60,\n",
" experiment_timeout_hours = 1,\n",
" training_data=train_dataset,\n",
" label_column_name=target_column_name,\n",
" validation_data=valid_dataset, \n",

View File

@@ -5,8 +5,6 @@ dependencies:
- pip:
- azureml-sdk
- azureml-train-automl
- azureml-train
- azureml-widgets
- matplotlib
- pandas_ml
- statsmodels
- azureml-train

View File

@@ -76,9 +76,12 @@ def get_result_df(remote_run):
def run_inference(test_experiment, compute_target, script_folder, train_run,
test_dataset, lookback_dataset, max_horizon,
target_column_name, time_column_name, freq):
train_run.download_file('outputs/model.pkl', 'inference/model.pkl')
train_run.download_file('outputs/conda_env_v_1_0_0.yml',
'inference/condafile.yml')
model_base_name = 'model.pkl'
if 'model_data_location' in train_run.properties:
model_location = train_run.properties['model_data_location']
_, model_base_name = model_location.rsplit('/', 1)
train_run.download_file('outputs/{}'.format(model_base_name), 'inference/{}'.format(model_base_name))
train_run.download_file('outputs/conda_env_v_1_0_0.yml', 'inference/condafile.yml')
inference_env = Environment("myenv")
inference_env.docker.enabled = True
@@ -91,7 +94,8 @@ def run_inference(test_experiment, compute_target, script_folder, train_run,
'--max_horizon': max_horizon,
'--target_column_name': target_column_name,
'--time_column_name': time_column_name,
'--frequency': freq
'--frequency': freq,
'--model_path': model_base_name
},
inputs=[test_dataset.as_named_input('test_data'),
lookback_dataset.as_named_input('lookback_data')],
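
Review note: the new lookup in `run_inference` takes the model file name from the `model_data_location` run property when it exists and falls back to `model.pkl` otherwise. A minimal sketch of that logic on a plain dict follows. `model_base_name_from` is a hypothetical wrapper and the property value is an invented example.

```python
def model_base_name_from(properties, default="model.pkl"):
    """Return the file name portion of 'model_data_location', or the default.

    Like the code in the diff, this assumes the location contains at least
    one '/' separator; a value without one would fail to unpack.
    """
    if "model_data_location" in properties:
        _, base_name = properties["model_data_location"].rsplit("/", 1)
        return base_name
    return default


if __name__ == "__main__":
    # Invented property value for illustration only.
    example = {"model_data_location": "some/storage/prefix/outputs/model_v2.pkl"}
    print(model_base_name_from(example))
    print(model_base_name_from({}))
```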

View File

@@ -232,6 +232,9 @@ parser.add_argument(
parser.add_argument(
'--frequency', type=str, dest='freq',
help='Frequency of prediction')
parser.add_argument(
'--model_path', type=str, dest='model_path',
default='model.pkl', help='Filename of model to be loaded')
args = parser.parse_args()
@@ -239,6 +242,7 @@ max_horizon = args.max_horizon
target_column_name = args.target_column_name
time_column_name = args.time_column_name
freq = args.freq
model_path = args.model_path
print('args passed are: ')
@@ -246,6 +250,7 @@ print(max_horizon)
print(target_column_name)
print(time_column_name)
print(freq)
print(model_path)
run = Run.get_context()
# get input dataset by name
@@ -267,7 +272,8 @@ X_lookback_df = lookback_dataset.drop_columns(columns=[target_column_name])
y_lookback_df = lookback_dataset.with_timestamp_columns(
None).keep_columns(columns=[target_column_name])
fitted_model = joblib.load('model.pkl')
fitted_model = joblib.load(model_path)
if hasattr(fitted_model, 'get_lookback'):
lookback = fitted_model.get_lookback()
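
Review note: because `--model_path` defaults to `model.pkl`, callers that do not pass the new argument keep the previous behavior. A small sketch of just that argument follows, wrapped in a hypothetical `build_parser` function. The other arguments of the real script are omitted.

```python
import argparse


def build_parser():
    """Build a parser holding only the argument added in this change."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--model_path', type=str, dest='model_path',
        default='model.pkl', help='Filename of model to be loaded')
    return parser


if __name__ == "__main__":
    # No argument given: falls back to the default file name.
    print(build_parser().parse_args([]).model_path)
    # Explicit argument: the supplied file name is used.
    print(build_parser().parse_args(['--model_path', 'model_v2.pkl']).model_path)
```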

View File

@@ -248,7 +248,7 @@
"|**task**|forecasting|\n",
"|**primary_metric**|This is the metric that you want to optimize.<br> Forecasting supports the following primary metrics <br><i>spearman_correlation</i><br><i>normalized_root_mean_squared_error</i><br><i>r2_score</i><br><i>normalized_mean_absolute_error</i>|\n",
"|**blacklist_models**|Models in blacklist won't be used by AutoML. All supported models can be found at [here](https://docs.microsoft.com/en-us/python/api/azureml-train-automl-client/azureml.train.automl.constants.supportedmodels.forecasting?view=azure-ml-py).|\n",
"|**experiment_timeout_minutes**|Experimentation timeout in minutes.|\n",
"|**experiment_timeout_hours**|Experimentation timeout in hours.|\n",
"|**training_data**|Input dataset, containing both features and label column.|\n",
"|**label_column_name**|The name of the label column.|\n",
"|**compute_target**|The remote compute for training.|\n",
@@ -260,7 +260,7 @@
"|**target_lags**|The target_lags specifies how far back we will construct the lags of the target variable.|\n",
"|**drop_column_names**|Name(s) of columns to drop prior to modeling|\n",
"\n",
"This notebook uses the blacklist_models parameter to exclude some models that take a longer time to train on this dataset. You can choose to remove models from the blacklist_models list but you may need to increase the experiment_timeout_minutes parameter value to get results."
"This notebook uses the blacklist_models parameter to exclude some models that take a longer time to train on this dataset. You can choose to remove models from the blacklist_models list but you may need to increase the experiment_timeout_hours parameter value to get results."
]
},
{
@@ -305,7 +305,7 @@
"automl_config = AutoMLConfig(task='forecasting', \n",
" primary_metric='normalized_root_mean_squared_error',\n",
" blacklist_models = ['ExtremeRandomTrees'], \n",
" experiment_timeout_minutes=20,\n",
" experiment_timeout_hours=0.3,\n",
" training_data=train,\n",
" label_column_name=target_column_name,\n",
" compute_target=compute_target,\n",

View File

@@ -7,5 +7,3 @@ dependencies:
- azureml-train-automl
- azureml-widgets
- matplotlib
- pandas_ml
- statsmodels

View File

@@ -253,7 +253,7 @@
"source": [
"# split into train based on time\n",
"train = dataset.time_before(datetime(2017, 8, 8, 5), include_boundary=True)\n",
"train.to_pandas_dataframe().sort_values(time_column_name).tail(5).reset_index(drop=True)"
"train.to_pandas_dataframe().reset_index(drop=True).sort_values(time_column_name).tail(5)"
]
},
{
@@ -264,7 +264,7 @@
"source": [
"# split into test based on time\n",
"test = dataset.time_between(datetime(2017, 8, 8, 6), datetime(2017, 8, 10, 5))\n",
"test.to_pandas_dataframe().head(5).reset_index(drop=True)"
"test.to_pandas_dataframe().reset_index(drop=True).head(5)"
]
},
{
@@ -302,7 +302,7 @@
"|**task**|forecasting|\n",
"|**primary_metric**|This is the metric that you want to optimize.<br> Forecasting supports the following primary metrics <br><i>spearman_correlation</i><br><i>normalized_root_mean_squared_error</i><br><i>r2_score</i><br><i>normalized_mean_absolute_error</i>|\n",
"|**blacklist_models**|Models in blacklist won't be used by AutoML. All supported models can be found at [here](https://docs.microsoft.com/en-us/python/api/azureml-train-automl-client/azureml.train.automl.constants.supportedmodels.forecasting?view=azure-ml-py).|\n",
"|**experiment_timeout_minutes**|Maximum amount of time in minutes that the experiment take before it terminates.|\n",
"|**experiment_timeout_hours**|Maximum amount of time in hours that the experiment take before it terminates.|\n",
"|**training_data**|The training data to be used within the experiment.|\n",
"|**label_column_name**|The name of the label column.|\n",
"|**compute_target**|The remote compute for training.|\n",
@@ -316,7 +316,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook uses the blacklist_models parameter to exclude some models that take a longer time to train on this dataset. You can choose to remove models from the blacklist_models list but you may need to increase the experiment_timeout_minutes parameter value to get results."
"This notebook uses the blacklist_models parameter to exclude some models that take a longer time to train on this dataset. You can choose to remove models from the blacklist_models list but you may need to increase the experiment_timeout_hours parameter value to get results."
]
},
{
@@ -333,7 +333,7 @@
"automl_config = AutoMLConfig(task='forecasting', \n",
" primary_metric='normalized_root_mean_squared_error',\n",
" blacklist_models = ['ExtremeRandomTrees', 'AutoArima', 'Prophet'], \n",
" experiment_timeout_minutes=20,\n",
" experiment_timeout_hours=0.3,\n",
" training_data=train,\n",
" label_column_name=target_column_name,\n",
" compute_target=compute_target,\n",
@@ -578,7 +578,7 @@
"automl_config = AutoMLConfig(task='forecasting', \n",
" primary_metric='normalized_root_mean_squared_error',\n",
" blacklist_models = ['ElasticNet','ExtremeRandomTrees','GradientBoosting','XGBoostRegressor','ExtremeRandomTrees', 'AutoArima', 'Prophet'], #These models are blacklisted for tutorial purposes, remove this for real use cases. \n",
" experiment_timeout_minutes=20,\n",
" experiment_timeout_hours=0.3,\n",
" training_data=train,\n",
" label_column_name=target_column_name,\n",
" compute_target=compute_target,\n",

View File

@@ -2,11 +2,9 @@ name: auto-ml-forecasting-energy-demand
dependencies:
- pip:
- azureml-sdk
- interpret
- azureml-train-automl
- azureml-widgets
- matplotlib
- pandas_ml
- statsmodels
- interpret
- azureml-explain-model
- azureml-contrib-interpret
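
Note on the timeout change: the notebook hunks earlier in this comparison replace `experiment_timeout_minutes=20` with `experiment_timeout_hours=0.3`, and other files in the comparison make the same kind of swap (15 to 0.25, 10 to 0.2). A minimal sketch of the unit conversion follows. The helper names `minutes_to_hours` and `hours_to_minutes` are hypothetical and not part of the Azure ML SDK. It suggests that the rounded literals 0.3 and 0.2 correspond to 18 and 12 minutes rather than the previous 20 and 10, while 0.25 is an exact match for 15.

```python
def minutes_to_hours(minutes: float) -> float:
    """Convert a timeout in minutes to hours (hypothetical helper, not an SDK call)."""
    return minutes / 60.0


def hours_to_minutes(hours: float) -> float:
    """Convert a timeout in hours back to minutes (hypothetical helper)."""
    return hours * 60.0


# Previous minute-based values and their exact hour equivalents.
for old_minutes in (10, 15, 20):
    print(old_minutes, "min =", round(minutes_to_hours(old_minutes), 4), "h")

# The hour literals used in the new code and the budgets they actually grant.
for new_hours in (0.2, 0.25, 0.3):
    print(new_hours, "h =", round(hours_to_minutes(new_hours), 2), "min")
```

If the original budget has to be preserved exactly, passing the quotient itself (for example `20 / 60`) instead of a rounded literal would be one option.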

View File

@@ -251,7 +251,7 @@
"source": [
"automl_settings = {\n",
" \"iteration_timeout_minutes\" : 5,\n",
" \"experiment_timeout_minutes\" : 15,\n",
" \"experiment_timeout_hours\" : 0.25,\n",
" \"primary_metric\" : 'normalized_mean_absolute_error',\n",
" \"time_column_name\": time_column_name,\n",
" \"grain_column_names\": grain_column_names,\n",

View File

@@ -3,8 +3,6 @@ dependencies:
- pip:
- azureml-sdk
- azureml-train-automl
- azureml-pipeline
- azureml-widgets
- pandas_ml
- statsmodels
- matplotlib
- azureml-pipeline

View File

@@ -30,11 +30,11 @@ def _get_configs(automlconfig: AutoMLConfig,
groups = _get_groups(data, group_column_names)
configs = {}
for i, group in groups.iterrows():
single = data
single = data._dataflow
group_name = "#####".join(str(x) for x in group.values)
group_name = valid_chars.sub('', group_name)
for key in group.index:
single = single._dataflow.filter(data._dataflow[key] == group[key])
single = single.filter(data._dataflow[key] == group[key])
t_dataset = TabularDataset._create(single)
group_conf = copy.deepcopy(automlconfig)
group_conf.user_settings['training_data'] = t_dataset
@@ -71,7 +71,7 @@ def build_pipeline_steps(automlconfig: AutoMLConfig,
# create each automl step end-to-end (train, register)
for group_name, conf in configs.items():
# create automl metrics output
metirics_data = PipelineData(
metrics_data = PipelineData(
name='metrics_data_{}'.format(group_name),
pipeline_output_name=metrics_output_name.format(group_name),
training_output=TrainingOutput(type='Metrics'))
@@ -84,7 +84,7 @@ def build_pipeline_steps(automlconfig: AutoMLConfig,
automl_step = AutoMLStep(
name='automl_{}'.format(group_name),
automl_config=conf,
outputs=[metirics_data, model_data],
outputs=[metrics_data, model_data],
allow_reuse=True)
steps.append(automl_step)
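
Note on the `_get_configs` hunk in this file: the change starts from `data._dataflow` and then chains `.filter(...)` directly, instead of reading `._dataflow` from `single` on every pass. The sketch below is an illustration with stand-in classes, not the real dataset or dataflow types. `FakeDataset`, `FakeDataflow`, and both `filter_group_*` functions are hypothetical, and the real code filters with column expressions rather than lambdas. It assumes, as the fix implies, that the object returned by `filter` has no `_dataflow` attribute, so the old pattern would fail as soon as a group has more than one key column.

```python
class FakeDataflow:
    """Stand-in for a dataflow: holds rows and supports chained filtering."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, predicate):
        # Return a new dataflow containing only the rows that match.
        return FakeDataflow(r for r in self.rows if predicate(r))


class FakeDataset:
    """Stand-in for a tabular dataset that wraps a dataflow."""

    def __init__(self, rows):
        self._dataflow = FakeDataflow(rows)


def filter_group_old(data, group):
    """Old pattern: re-reads `_dataflow` from `single` on every pass."""
    single = data
    for key, value in group.items():
        # On the second pass `single` is already a FakeDataflow,
        # which has no `_dataflow` attribute.
        single = single._dataflow.filter(lambda r, k=key, v=value: r[k] == v)
    return single


def filter_group_new(data, group):
    """New pattern: unwrap once, then chain filters on the dataflow."""
    single = data._dataflow
    for key, value in group.items():
        single = single.filter(lambda r, k=key, v=value: r[k] == v)
    return single


data = FakeDataset([
    {"store": 1, "brand": "a", "qty": 5},
    {"store": 1, "brand": "b", "qty": 7},
    {"store": 2, "brand": "a", "qty": 3},
])
group = {"store": 1, "brand": "a"}

print(filter_group_new(data, group).rows)
try:
    filter_group_old(data, group)
except AttributeError as err:
    print("old pattern fails with two group keys:", err)
```

With a single group column the old pattern still works in this sketch, which may explain why the problem only surfaced for multi-column groupings.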

View File

@@ -1,9 +1,9 @@
import argparse
import json
from azureml.core import Run, Model, Workspace
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core import Run, Model
from azureml.core.model import InferenceConfig
from azureml.core.environment import Environment
from azureml.core.webservice import AciWebservice
@@ -39,6 +39,8 @@ print(model_list)
run = Run.get_context()
ws = run.experiment.workspace
myenv = Environment.from_conda_specification(name="env", file_path=conda_env_file_name)
deployment_config = AciWebservice.deploy_configuration(
cpu_cores=1,
memory_gb=2,
@@ -46,11 +48,7 @@ deployment_config = AciWebservice.deploy_configuration(
description='grouping demo aci deployment'
)
inference_config = InferenceConfig(
entry_script=script_file_name,
runtime='python',
conda_file=conda_env_file_name
)
inference_config = InferenceConfig(entry_script=script_file_name, environment=myenv)
models = []
for model_name in model_list:

View File

@@ -335,7 +335,7 @@
"automl_config = AutoMLConfig(task='forecasting',\n",
" debug_log='automl_forecasting_function.log',\n",
" primary_metric='normalized_root_mean_squared_error',\n",
" experiment_timeout_minutes=15,\n",
" experiment_timeout_hours=0.25,\n",
" enable_early_stopping=True,\n",
" training_data=train_data,\n",
" compute_target=compute_target,\n",

View File

@@ -6,6 +6,4 @@ dependencies:
- azureml-sdk
- azureml-train-automl
- azureml-widgets
- pandas_ml
- statsmodels
- matplotlib

View File

@@ -335,7 +335,7 @@
"|-|-|\n",
"|**task**|forecasting|\n",
"|**primary_metric**|This is the metric that you want to optimize.<br> Forecasting supports the following primary metrics <br><i>spearman_correlation</i><br><i>normalized_root_mean_squared_error</i><br><i>r2_score</i><br><i>normalized_mean_absolute_error</i>\n",
"|**experiment_timeout_minutes**|Experimentation timeout in minutes.|\n",
"|**experiment_timeout_hours**|Experimentation timeout in hours.|\n",
"|**enable_early_stopping**|If early stopping is on, training will stop when the primary metric is no longer improving.|\n",
"|**training_data**|Input dataset, containing both features and label column.|\n",
"|**label_column_name**|The name of the label column.|\n",
@@ -366,7 +366,7 @@
"automl_config = AutoMLConfig(task='forecasting',\n",
" debug_log='automl_oj_sales_errors.log',\n",
" primary_metric='normalized_mean_absolute_error',\n",
" experiment_timeout_minutes=15,\n",
" experiment_timeout_hours=0.25,\n",
" training_data=train_dataset,\n",
" label_column_name=target_column_name,\n",
" compute_target=compute_target,\n",
@@ -631,9 +631,7 @@
"outputs": [],
"source": [
"import json\n",
"# The request data frame needs to have y_query column which corresponds to query.\n",
"X_query = X_test.copy()\n",
"X_query['y_query'] = np.NaN\n",
"# We have to convert datetime to string, because Timestamps cannot be serialized to JSON.\n",
"X_query[time_column_name] = X_query[time_column_name].astype(str)\n",
"# The Service object accept the complex dictionary, which is internally converted to JSON string.\n",

View File

@@ -7,5 +7,3 @@ dependencies:
- azureml-train-automl
- azureml-widgets
- matplotlib
- pandas_ml
- statsmodels

View File

@@ -155,8 +155,7 @@
"automl_settings = {\n",
" \"n_cross_validations\": 3,\n",
" \"primary_metric\": 'average_precision_score_weighted',\n",
" \"preprocess\": True,\n",
" \"experiment_timeout_minutes\": 10, # This is a time limit for testing purposes, remove it for real use cases, this will drastically limit ablity to find the best model possible\n",
" \"experiment_timeout_hours\": 0.2, # This is a time limit for testing purposes, remove it for real use cases, this will drastically limit ability to find the best model possible\n",
" \"verbosity\": logging.INFO,\n",
" \"enable_stack_ensemble\": False\n",
"}\n",
@@ -260,17 +259,7 @@
"metadata": {},
"source": [
"#### Print the properties of the model\n",
"The fitted_model is a python object and you can read the different properties of the object.\n",
"See *Print the properties of the model* section in [this sample notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/classification/auto-ml-classification.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Deploy\n",
"\n",
"To deploy the model into a web service endpoint, see _Deploy_ section in [this sample notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/classification-with-deployment/auto-ml-classification-with-deployment.ipynb)"
"The fitted_model is a python object and you can read the different properties of the object.\n"
]
},
{

View File

@@ -2,10 +2,8 @@ name: auto-ml-classification-credit-card-fraud-local
dependencies:
- pip:
- azureml-sdk
- interpret
- azureml-defaults
- azureml-explain-model
- azureml-train-automl
- azureml-widgets
- matplotlib
- pandas_ml
- interpret
- azureml-explain-model

View File

@@ -206,9 +206,9 @@
"|-|-|\n",
"|**task**|classification, regression or forecasting|\n",
"|**primary_metric**|This is the metric that you want to optimize. Regression supports the following primary metrics: <br><i>spearman_correlation</i><br><i>normalized_root_mean_squared_error</i><br><i>r2_score</i><br><i>normalized_mean_absolute_error</i>|\n",
"|**experiment_timeout_minutes**| Maximum amount of time in minutes that all iterations combined can take before the experiment terminates.|\n",
"|**experiment_timeout_hours**| Maximum amount of time in hours that all iterations combined can take before the experiment terminates.|\n",
"|**enable_early_stopping**| Flag to enble early termination if the score is not improving in the short term.|\n",
"|**featurization**| 'auto' / 'off' / FeaturizationConfig Indicator for whether featurization step should be done automatically or not, or whether customized featurization should be used. Note: If the input data is sparse, featurization cannot be turned on.|\n",
"|**featurization**| 'auto' / 'off' / FeaturizationConfig Indicator for whether featurization step should be done automatically or not, or whether customized featurization should be used. Setting this enables AutoML to perform featurization on the input to handle *missing data*, and to perform some common *feature extraction*. Note: If the input data is sparse, featurization cannot be turned on.|\n",
"|**n_cross_validations**|Number of cross validation splits.|\n",
"|**training_data**|(sparse) array-like, shape = [n_samples, n_features]|\n",
"|**label_column_name**|(sparse) array-like, shape = [n_samples, ], targets values.|"
@@ -244,7 +244,7 @@
"source": [
"featurization_config = FeaturizationConfig()\n",
"featurization_config.blocked_transformers = ['LabelEncoder']\n",
"#featurization_config.drop_columns = ['ERP', 'MMIN']\n",
"#featurization_config.drop_columns = ['MMIN']\n",
"featurization_config.add_column_purpose('MYCT', 'Numeric')\n",
"featurization_config.add_column_purpose('VendorName', 'CategoricalHash')\n",
"#default strategy mean, add transformer param for for 3 columns\n",
@@ -262,7 +262,7 @@
"source": [
"automl_settings = {\n",
" \"enable_early_stopping\": True, \n",
" \"experiment_timeout_minutes\" : 10,\n",
" \"experiment_timeout_hours\" : 0.2,\n",
" \"max_concurrent_iterations\": 4,\n",
" \"max_cores_per_iteration\": -1,\n",
" \"n_cross_validations\": 5,\n",
@@ -558,7 +558,6 @@
"\n",
"# specify CondaDependencies obj\n",
"conda_run_config.environment.python.conda_dependencies = CondaDependencies.create(\n",
" conda_packages=['scikit-learn', 'numpy','py-xgboost<=0.80'],\n",
" pip_packages=azureml_pip_packages)"
]
},
@@ -718,17 +717,7 @@
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.conda_dependencies import CondaDependencies \n",
"\n",
"azureml_pip_packages = [\n",
" 'azureml-explain-model', 'azureml-train-automl', 'azureml-defaults'\n",
"]\n",
" \n",
"\n",
"# specify CondaDependencies obj\n",
"myenv = CondaDependencies.create(conda_packages=['scikit-learn', 'pandas', 'numpy', 'py-xgboost<=0.80'],\n",
" pip_packages=azureml_pip_packages,\n",
" pin_sdk_version=True)\n",
"myenv = automl_run.get_environment().python.conda_dependencies\n",
"\n",
"with open(\"myenv.yml\",\"w\") as f:\n",
" f.write(myenv.serialize_to_string())\n",

View File

@@ -2,12 +2,10 @@ name: auto-ml-regression-hardware-performance-explanation-and-featurization
dependencies:
- pip:
- azureml-sdk
- interpret
- azureml-defaults
- azureml-explain-model
- azureml-train-automl
- azureml-widgets
- matplotlib
- pandas_ml
- interpret
- azureml-explain-model
- azureml-explain-model
- azureml-contrib-interpret

View File

@@ -7,7 +7,7 @@ from azureml.core.experiment import Experiment
from sklearn.externals import joblib
from azureml.core.dataset import Dataset
from azureml.train.automl.runtime.automl_explain_utilities import AutoMLExplainerSetupClass, \
automl_setup_model_explanations
automl_setup_model_explanations, automl_check_model_if_explainable
from azureml.explain.model.mimic.models.lightgbm_model import LGBMExplainableModel
from azureml.explain.model.mimic_wrapper import MimicWrapper
from automl.client.core.common.constants import MODEL_PATH
@@ -25,6 +25,11 @@ ws = run.experiment.workspace
experiment = Experiment(ws, '<<experimnet_name>>')
automl_run = Run(experiment=experiment, run_id='<<run_id>>')
# Check if this AutoML model is explainable
if not automl_check_model_if_explainable(automl_run):
    raise Exception("Model explanations is currently not supported for " + automl_run.get_properties().get(
        'run_algorithm'))
# Download the best model from the artifact store
automl_run.download_file(name=MODEL_PATH, output_file_path='model.pkl')

View File

@@ -188,15 +188,18 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": [
"automlconfig-remarks-sample"
]
},
"outputs": [],
"source": [
"automl_settings = {\n",
" \"n_cross_validations\": 3,\n",
" \"primary_metric\": 'r2_score',\n",
" \"preprocess\": True,\n",
" \"enable_early_stopping\": True, \n",
" \"experiment_timeout_minutes\": 20, #for real scenarios we reccommend a timeout of at least one hour \n",
" \"experiment_timeout_hours\": 0.3, #for real scenarios we reccommend a timeout of at least one hour \n",
" \"max_concurrent_iterations\": 4,\n",
" \"max_cores_per_iteration\": -1,\n",
" \"verbosity\": logging.INFO,\n",

View File

@@ -5,5 +5,3 @@ dependencies:
- azureml-train-automl
- azureml-widgets
- matplotlib
- pandas_ml
- paramiko<2.5.0

View File

@@ -56,7 +56,7 @@ CREATE OR ALTER PROCEDURE [dbo].[AutoMLTrain]
@task NVARCHAR(40)='classification', -- The type of task. Can be classification, regression or forecasting.
@experiment_name NVARCHAR(32)='automl-sql-test', -- This can be used to find the experiment in the Azure Portal.
@iteration_timeout_minutes INT = 15, -- The maximum time in minutes for training a single pipeline.
@experiment_timeout_minutes INT = 60, -- The maximum time in minutes for training all pipelines.
@experiment_timeout_hours FLOAT = 1, -- The maximum time in hours for training all pipelines.
@n_cross_validations INT = 3, -- The number of cross validations.
@blacklist_models NVARCHAR(MAX) = '', -- A comma separated list of algos that will not be used.
-- The list of possible models can be found at:
@@ -131,8 +131,8 @@ if __name__.startswith("sqlindb"):
X_train = data_train
if experiment_timeout_minutes == 0:
experiment_timeout_minutes = None
if experiment_timeout_hours == 0:
experiment_timeout_hours = None
if experiment_exit_score == 0:
experiment_exit_score = None
@@ -163,7 +163,7 @@ if __name__.startswith("sqlindb"):
debug_log = log_file_name,
primary_metric = primary_metric,
iteration_timeout_minutes = iteration_timeout_minutes,
experiment_timeout_minutes = experiment_timeout_minutes,
experiment_timeout_hours = experiment_timeout_hours,
iterations = iterations,
n_cross_validations = n_cross_validations,
preprocess = preprocess,
@@ -204,7 +204,7 @@ if __name__.startswith("sqlindb"):
@iterations INT, @task NVARCHAR(40),
@experiment_name NVARCHAR(32),
@iteration_timeout_minutes INT,
@experiment_timeout_minutes INT,
@experiment_timeout_hours FLOAT,
@n_cross_validations INT,
@blacklist_models NVARCHAR(MAX),
@whitelist_models NVARCHAR(MAX),
@@ -223,7 +223,7 @@ if __name__.startswith("sqlindb"):
, @task = @task
, @experiment_name = @experiment_name
, @iteration_timeout_minutes = @iteration_timeout_minutes
, @experiment_timeout_minutes = @experiment_timeout_minutes
, @experiment_timeout_hours = @experiment_timeout_hours
, @n_cross_validations = @n_cross_validations
, @blacklist_models = @blacklist_models
, @whitelist_models = @whitelist_models
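
Note on the embedded Python in this procedure: a value of 0 passed from T-SQL is treated as "not set" and replaced with `None` before the AutoML configuration is built, and the same convention now applies to the `FLOAT` parameter `experiment_timeout_hours`. A minimal sketch of that sentinel handling is below. `zero_to_none` and `build_timeout_settings` are hypothetical names used only for illustration and do not appear in the script.

```python
def zero_to_none(value):
    """Map the SQL-side placeholder 0 (or 0.0) to None; pass other values through."""
    return None if value == 0 else value


def build_timeout_settings(experiment_timeout_hours, experiment_exit_score):
    """Collect the optional limits the way the procedure normalizes them (illustrative)."""
    return {
        "experiment_timeout_hours": zero_to_none(experiment_timeout_hours),
        "experiment_exit_score": zero_to_none(experiment_exit_score),
    }


# A one-hour limit with no exit score, then no time limit with an exit score.
print(build_timeout_settings(1.0, 0))
print(build_timeout_settings(0, 0.9))
```

Because `0.0 == 0` holds in Python, the comparison behaves the same after the parameter type changes from `INT` to `FLOAT`.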

View File

@@ -235,7 +235,7 @@
" @task NVARCHAR(40)='classification', -- The type of task. Can be classification, regression or forecasting.\r\n",
" @experiment_name NVARCHAR(32)='automl-sql-test', -- This can be used to find the experiment in the Azure Portal.\r\n",
" @iteration_timeout_minutes INT = 15, -- The maximum time in minutes for training a single pipeline. \r\n",
" @experiment_timeout_minutes INT = 60, -- The maximum time in minutes for training all pipelines.\r\n",
" @experiment_timeout_hours FLOAT = 1, -- The maximum time in hours for training all pipelines.\r\n",
" @n_cross_validations INT = 3, -- The number of cross validations.\r\n",
" @blacklist_models NVARCHAR(MAX) = '', -- A comma separated list of algos that will not be used.\r\n",
" -- The list of possible models can be found at:\r\n",
@@ -307,8 +307,8 @@
"\r\n",
" X_train = data_train\r\n",
"\r\n",
" if experiment_timeout_minutes == 0:\r\n",
" experiment_timeout_minutes = None\r\n",
" if experiment_timeout_hours == 0:\r\n",
" experiment_timeout_hours = None\r\n",
"\r\n",
" if experiment_exit_score == 0:\r\n",
" experiment_exit_score = None\r\n",
@@ -337,7 +337,7 @@
" debug_log = log_file_name, \r\n",
" primary_metric = primary_metric, \r\n",
" iteration_timeout_minutes = iteration_timeout_minutes, \r\n",
" experiment_timeout_minutes = experiment_timeout_minutes,\r\n",
" experiment_timeout_hours = experiment_timeout_hours,\r\n",
" iterations = iterations, \r\n",
" n_cross_validations = n_cross_validations, \r\n",
" preprocess = preprocess,\r\n",
@@ -378,7 +378,7 @@
"\t\t\t\t @iterations INT, @task NVARCHAR(40),\r\n",
"\t\t\t\t @experiment_name NVARCHAR(32),\r\n",
"\t\t\t\t @iteration_timeout_minutes INT,\r\n",
"\t\t\t\t @experiment_timeout_minutes INT,\r\n",
"\t\t\t\t @experiment_timeout_hours FLOAT,\r\n",
"\t\t\t\t @n_cross_validations INT,\r\n",
"\t\t\t\t @blacklist_models NVARCHAR(MAX),\r\n",
"\t\t\t\t @whitelist_models NVARCHAR(MAX),\r\n",
@@ -396,7 +396,7 @@
"\t, @task = @task\r\n",
"\t, @experiment_name = @experiment_name\r\n",
"\t, @iteration_timeout_minutes = @iteration_timeout_minutes\r\n",
"\t, @experiment_timeout_minutes = @experiment_timeout_minutes\r\n",
"\t, @experiment_timeout_hours = @experiment_timeout_hours\r\n",
"\t, @n_cross_validations = @n_cross_validations\r\n",
"\t, @blacklist_models = @blacklist_models\r\n",
"\t, @whitelist_models = @whitelist_models\r\n",
@@ -560,9 +560,6 @@
"framework": [
"Azure ML AutoML"
],
"tags": [
""
],
"friendly_name": "Setup automated ML SQL integration",
"index_order": 1,
"kernelspec": {
@@ -574,6 +571,9 @@
"name": "sql",
"version": ""
},
"tags": [
""
],
"task": "None"
},
"nbformat": 4,

View File

@@ -11,6 +11,13 @@
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Register Azure Databricks trained model and deploy it to ACI\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -161,9 +168,9 @@
"source": [
"from azureml.core.conda_dependencies import CondaDependencies \n",
"\n",
"myacienv = CondaDependencies.create(conda_packages=['scikit-learn','numpy','pandas']) #showing how to add libs as an eg. - not needed for this model.\n",
"myacienv = CondaDependencies.create(conda_packages=['scikit-learn','numpy','pandas']) # showing how to add libs as an eg. - not needed for this model.\n",
"\n",
"with open(\"mydeployenv.yml\",\"w\") as f:\n",
"with open(\"myenv.yml\",\"w\") as f:\n",
" f.write(myacienv.serialize_to_string())"
]
},
@@ -177,6 +184,9 @@
"from azureml.core.webservice import AciWebservice, Webservice\n",
"from azureml.exceptions import WebserviceException\n",
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.environment import Environment\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"\n",
"\n",
"myaci_config = AciWebservice.deploy_configuration(cpu_cores = 2, \n",
" memory_gb = 2, \n",
@@ -191,9 +201,16 @@
"except WebserviceException:\n",
" pass\n",
"\n",
"inference_config = InferenceConfig(runtime= 'spark-py', \n",
" entry_script='score_sparkml.py',\n",
" conda_file='mydeployenv.yml')\n",
"myenv = Environment.get(ws, name='AzureML-PySpark-MmlSpark-0.15')\n",
"# we need to add extra packages to procured environment\n",
"# in order to deploy amended environment we need to rename it\n",
"myenv.name = 'myenv'\n",
"model_dependencies = CondaDependencies('myenv.yml')\n",
"for pip_dep in model_dependencies.pip_packages:\n",
" myenv.python.conda_dependencies.add_pip_package(pip_dep)\n",
"for conda_dep in model_dependencies.conda_packages:\n",
" myenv.python.conda_dependencies.add_conda_package(conda_dep)\n",
"inference_config = InferenceConfig(entry_script='score_sparkml.py', environment=myenv)\n",
"\n",
"myservice = Model.deploy(ws, service_name, [mymodel], inference_config, myaci_config)\n",
"myservice.wait_for_deployment(show_output=True)"
@@ -255,6 +272,15 @@
"myservice.delete()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploying to other types of computes\n",
"\n",
"In order to learn how to deploy to other types of compute targets, such as AKS, please take a look at the set of notebooks in the [deployment](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/deployment) folder."
]
},
{
"cell_type": "markdown",
"metadata": {},

View File

@@ -1,312 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Azure ML & Azure Databricks notebooks by Parashar Shah.\n",
"\n",
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook uses image from ACI notebook for deploying to AKS."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import azureml.core\n",
"\n",
"# Check core SDK version number\n",
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Set auth to be used by workspace related APIs.\n",
"# For automation or CI/CD ServicePrincipalAuthentication can be used.\n",
"# https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.authentication.serviceprincipalauthentication?view=azure-ml-py\n",
"auth = None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Workspace\n",
"\n",
"ws = Workspace.from_config(auth = auth)\n",
"print('Workspace name: ' + ws.name, \n",
" 'Azure region: ' + ws.location, \n",
" 'Subscription id: ' + ws.subscription_id, \n",
" 'Resource group: ' + ws.resource_group, sep = '\\n')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Register the model\n",
"import os\n",
"from azureml.core.model import Model\n",
"\n",
"model_name = \"AdultCensus_runHistory_aks.mml\" # \n",
"model_name_dbfs = os.path.join(\"/dbfs\", model_name)\n",
"\n",
"print(\"copy model from dbfs to local\")\n",
"model_local = \"file:\" + os.getcwd() + \"/\" + model_name\n",
"dbutils.fs.cp(model_name, model_local, True)\n",
"\n",
"mymodel = Model.register(model_path = model_name, # this points to a local file\n",
" model_name = model_name, # this is the name the model is registered as, am using same name for both path and name. \n",
" description = \"ADB trained model by Parashar\",\n",
" workspace = ws)\n",
"\n",
"print(mymodel.name, mymodel.description, mymodel.version)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#%%writefile score_sparkml.py\n",
"score_sparkml = \"\"\"\n",
" \n",
"import json\n",
" \n",
"def init():\n",
" # One-time initialization of PySpark and predictive model\n",
" import pyspark\n",
" from azureml.core.model import Model\n",
" from pyspark.ml import PipelineModel\n",
" \n",
" global trainedModel\n",
" global spark\n",
" \n",
" spark = pyspark.sql.SparkSession.builder.appName(\"ADB and AML notebook by Parashar\").getOrCreate()\n",
" model_name = \"{model_name}\" #interpolated\n",
" model_path = Model.get_model_path(model_name)\n",
" trainedModel = PipelineModel.load(model_path)\n",
" \n",
"def run(input_json):\n",
" if isinstance(trainedModel, Exception):\n",
" return json.dumps({{\"trainedModel\":str(trainedModel)}})\n",
" \n",
" try:\n",
" sc = spark.sparkContext\n",
" input_list = json.loads(input_json)\n",
" input_rdd = sc.parallelize(input_list)\n",
" input_df = spark.read.json(input_rdd)\n",
" \n",
" # Compute prediction\n",
" prediction = trainedModel.transform(input_df)\n",
" #result = prediction.first().prediction\n",
" predictions = prediction.collect()\n",
" \n",
" #Get each scored result\n",
" preds = [str(x['prediction']) for x in predictions]\n",
" result = \",\".join(preds)\n",
" # you can return any data type as long as it is JSON-serializable\n",
" return result.tolist()\n",
" except Exception as e:\n",
" result = str(e)\n",
" return result\n",
" \n",
"\"\"\".format(model_name=model_name)\n",
" \n",
"exec(score_sparkml)\n",
" \n",
"with open(\"score_sparkml.py\", \"w\") as file:\n",
" file.write(score_sparkml)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.conda_dependencies import CondaDependencies \n",
"\n",
"myacienv = CondaDependencies.create(conda_packages=['scikit-learn','numpy','pandas']) #showing how to add libs as an eg. - not needed for this model.\n",
"\n",
"with open(\"mydeployenv.yml\",\"w\") as f:\n",
" f.write(myacienv.serialize_to_string())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#create AKS compute\n",
"#it may take 20-25 minutes to create a new cluster\n",
"\n",
"from azureml.core.compute import AksCompute, ComputeTarget\n",
"from azureml.core.compute_target import ComputeTargetException\n",
"\n",
"aks_name = 'ps-aks-demo2' \n",
"\n",
"try:\n",
" aks_target = ComputeTarget(workspace=ws, name=aks_name)\n",
" print('Found existing cluster, use it.')\n",
"except ComputeTargetException:\n",
" # Use the default configuration (can also provide parameters to customize)\n",
" prov_config = AksCompute.provisioning_configuration()\n",
" \n",
" # Create the cluster\n",
" aks_target = ComputeTarget.create(workspace = ws, \n",
" name = aks_name, \n",
" provisioning_configuration = prov_config)\n",
"\n",
"aks_target.wait_for_completion(show_output = True)\n",
"\n",
"print(aks_target.provisioning_state)\n",
"print(aks_target.provisioning_errors)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#deploy to AKS\n",
"from azureml.core.webservice import AksWebservice, Webservice\n",
"from azureml.exceptions import WebserviceException\n",
"from azureml.core.model import InferenceConfig\n",
"\n",
"aks_config = AksWebservice.deploy_configuration(enable_app_insights=True)\n",
"\n",
"service_name = 'ps-aks-service'\n",
"\n",
"# Remove any existing service under the same name.\n",
"try:\n",
" Webservice(ws, service_name).delete()\n",
"except WebserviceException:\n",
" pass\n",
"\n",
"inference_config = InferenceConfig(runtime = 'spark-py', \n",
" entry_script ='score_sparkml.py',\n",
" conda_file ='mydeployenv.yml')\n",
"\n",
"aks_service = Model.deploy(ws, service_name, [mymodel], inference_config, aks_config, aks_target)\n",
"aks_service.wait_for_deployment(show_output=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"aks_service.deployment_status"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#for using the Web HTTP API \n",
"print(aks_service.scoring_uri)\n",
"print(aks_service.get_keys())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"#get the some sample data\n",
"test_data_path = \"AdultCensusIncomeTest\"\n",
"test = spark.read.parquet(test_data_path).limit(5)\n",
"\n",
"test_json = json.dumps(test.toJSON().collect())\n",
"\n",
"print(test_json)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#using data defined above predict if income is >50K (1) or <=50K (0)\n",
"aks_service.run(input_data=test_json)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#comment to not delete the web service\n",
"aks_service.delete()\n",
"#model.delete()\n",
"aks_target.delete() "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/azure-databricks/amlsdk/deploy-to-aks-existingimage-05.png)"
]
}
],
"metadata": {
"authors": [
{
"name": "pasha"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
},
"name": "deploy-to-aks-existingimage-05",
"notebookId": 1030695628045968
},
"nbformat": 4,
"nbformat_minor": 1
}

View File

@@ -640,7 +640,7 @@
"\n",
"myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn'], pip_packages=['azureml-defaults', 'azureml-sdk[automl]'])\n",
"\n",
"conda_env_file_name = 'mydeployenv.yml'\n",
"conda_env_file_name = 'myenv.yml'\n",
"myenv.save_to_file('.', conda_env_file_name)"
]
},
@@ -664,17 +664,27 @@
"from azureml.exceptions import WebserviceException\n",
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.model import Model\n",
"from azureml.core.environment import Environment\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"import uuid\n",
"\n",
"\n",
"myaci_config = AciWebservice.deploy_configuration(\n",
" cpu_cores = 2, \n",
" memory_gb = 2, \n",
" tags = {'name':'Databricks Azure ML ACI'}, \n",
" description = 'This is for ADB and AutoML example.')\n",
"\n",
"inference_config = InferenceConfig(runtime= 'spark-py', \n",
" entry_script='score.py',\n",
" conda_file='mydeployenv.yml')\n",
"myenv = Environment.get(ws, name='AzureML-PySpark-MmlSpark-0.15')\n",
"# we need to add extra packages to procured environment\n",
"# in order to deploy amended environment we need to rename it\n",
"myenv.name = 'myenv'\n",
"model_dependencies = CondaDependencies('myenv.yml')\n",
"for pip_dep in model_dependencies.pip_packages:\n",
" myenv.python.conda_dependencies.add_pip_package(pip_dep)\n",
"for conda_dep in model_dependencies.conda_packages:\n",
" myenv.python.conda_dependencies.add_conda_package(conda_dep)\n",
"inference_config = InferenceConfig(entry_script='score_sparkml.py', environment=myenv)\n",
"\n",
"guid = str(uuid.uuid4()).split(\"-\")[0]\n",
"service_name = \"myservice-{}\".format(guid)\n",

View File

@@ -195,7 +195,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"You can now create and/or use an Environment object when deploying a Webservice. The Environment can have been previously registered with your Workspace, or it will be registered with it as a part of the Webservice deployment. Only Environments that were created using azureml-defaults version 1.0.48 or later will work with this new handling however.\n",
"You can now create and/or use an Environment object when deploying a Webservice. The Environment can have been previously registered with your Workspace, or it will be registered with it as a part of the Webservice deployment. Please note that your environment must include azureml-defaults with version >= 1.0.45 as a pip dependency, because it contains the functionality needed to host the model as a web service.\n",
"\n",
"More information can be found in our [using environments notebook](../training/using-environments/using-environments.ipynb)."
]
@@ -221,23 +221,30 @@
"## Create Inference Configuration\n",
"\n",
"There is now support for a source directory, you can upload an entire folder from your local machine as dependencies for the Webservice.\n",
"Note: in that case, your entry_script, conda_file, and extra_docker_file_steps paths are relative paths to the source_directory path.\n",
"Note: in that case, entry_script and the environment's file_path are relative paths to the source_directory path; myenv.docker.base_dockerfile is a string containing extra docker steps or contents of the docker file.\n",
"\n",
"Sample code for using a source directory:\n",
"\n",
"```python\n",
"from azureml.core.environment import Environment\n",
"from azureml.core.model import InferenceConfig\n",
"\n",
"myenv = Environment.from_conda_specification(name='myenv', file_path='env/myenv.yml')\n",
"\n",
"# explicitly set base_image to None when setting base_dockerfile\n",
"myenv.docker.base_image = None\n",
"# add extra docker commands to execute\n",
"myenv.docker.base_dockerfile = \"FROM ubuntu\\n RUN echo \\\"hello\\\"\"\n",
"\n",
"inference_config = InferenceConfig(source_directory=\"C:/abc\",\n",
" runtime= \"python\", \n",
" entry_script=\"x/y/score.py\",\n",
" conda_file=\"env/myenv.yml\", \n",
" extra_docker_file_steps=\"helloworld.txt\")\n",
" environment=myenv)\n",
"```\n",
"\n",
" - source_directory = holds source path as string, this entire folder gets added in image so its really easy to access any files within this folder or subfolder\n",
" - runtime = Which runtime to use for the image. Current supported runtimes are 'spark-py' and 'python\n",
" - entry_script = contains logic specific to initializing your model and running predictions\n",
" - conda_file = manages conda and python package dependencies.\n",
" - extra_docker_file_steps = optional: any extra steps you want to inject into docker file"
 - file_path: input parameter to Environment.from_conda_specification. Manages conda and python package dependencies.\n",
 - myenv.docker.base_dockerfile: any extra steps you want to inject into docker file\n",
 - source_directory: holds source path as string, this entire folder gets added in image so it's really easy to access any files within this folder or subfolder\n",
 - entry_script: contains logic specific to initializing your model and running predictions"
]
},
{
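
The sample above sets `myenv.docker.base_dockerfile` to a string that starts with a `FROM` line and continues with extra steps. A minimal sketch of how such a string could be assembled is shown below. The helper `build_dockerfile` is hypothetical (it is not part of the Azure ML SDK) and uses only the standard library.

```python
def build_dockerfile(base_image, extra_steps):
    """Return Dockerfile text: a FROM line followed by the extra steps.

    Hypothetical helper for illustration only; not an Azure ML SDK function.
    """
    if not base_image:
        raise ValueError("a Dockerfile needs a base image for its FROM line")
    lines = ["FROM {}".format(base_image)]
    # Strip stray whitespace so every instruction starts at column 0.
    lines.extend(step.strip() for step in extra_steps)
    return "\n".join(lines)


dockerfile_text = build_dockerfile("ubuntu", ['RUN echo "hello"'])
print(dockerfile_text)
```

The resulting text could then be assigned to `myenv.docker.base_dockerfile`, with `myenv.docker.base_image` set to `None` as in the sample.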

View File

@@ -20,7 +20,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Register model and deploy as webservice\n",
"# Register model and deploy as webservice in ACI\n",
"\n",
"Following this notebook, you will:\n",
"\n",
@@ -45,6 +45,7 @@
"source": [
"import azureml.core\n",
"\n",
"\n",
"# Check core SDK version number.\n",
"print('SDK version:', azureml.core.VERSION)"
]
@@ -70,6 +71,7 @@
"source": [
"from azureml.core import Workspace\n",
"\n",
"\n",
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\\n')"
]
@@ -91,6 +93,7 @@
"source": [
"from azureml.core import Dataset\n",
"\n",
"\n",
"datastore = ws.get_default_datastore()\n",
"datastore.upload_files(files=['./features.csv', './labels.csv'],\n",
" target_path='sklearn_regression/',\n",
@@ -125,6 +128,7 @@
"from azureml.core import Model\n",
"from azureml.core.resource_configuration import ResourceConfiguration\n",
"\n",
"\n",
"model = Model.register(workspace=ws,\n",
" model_name='my-sklearn-model', # Name of the registered model in your workspace.\n",
" model_path='./sklearn_regression_model.pkl', # Local file to upload and register as a model.\n",
@@ -159,6 +163,8 @@
"\n",
"The Azure Machine Learning service provides a default environment for supported model frameworks, including scikit-learn, based on the metadata you provided when registering your model. This is the easiest way to deploy your model.\n",
"\n",
"Even when you deploy your model to ACI with a default environment you can still customize the deploy configuration (i.e. the number of cores and amount of memory made available for the deployment) using the [AciWebservice.deploy_configuration()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.webservice.aci.aciwebservice#deploy-configuration-cpu-cores-none--memory-gb-none--tags-none--properties-none--description-none--location-none--auth-enabled-none--ssl-enabled-none--enable-app-insights-none--ssl-cert-pem-file-none--ssl-key-pem-file-none--ssl-cname-none--dns-name-label-none--). Look at the \"Use a custom environment\" section of this notebook for more information on deploy configuration.\n",
"\n",
"**Note**: This step can take several minutes."
]
},
@@ -171,6 +177,7 @@
"from azureml.core import Webservice\n",
"from azureml.exceptions import WebserviceException\n",
"\n",
"\n",
"service_name = 'my-sklearn-service'\n",
"\n",
"# Remove any existing service under the same name.\n",
@@ -198,6 +205,7 @@
"source": [
"import json\n",
"\n",
"\n",
"input_payload = json.dumps({\n",
" 'data': [\n",
" [ 0.03807591, 0.05068012, 0.06169621, 0.02187235, -0.0442235,\n",
@@ -231,9 +239,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Use a custom environment (for all models)\n",
"### Use a custom environment\n",
"\n",
"If you want more control over how your model is run, if it uses another framework, or if it has special runtime requirements, you can instead specify your own environment and scoring method.\n",
"If you want more control over how your model is run, if it uses another framework, or if it has special runtime requirements, you can instead specify your own environment and scoring method. Custom environments can be used for any model you want to deploy.\n",
"\n",
"Specify the model's runtime environment by creating an [Environment](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.environment%28class%29?view=azure-ml-py) object and providing the [CondaDependencies](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.conda_dependencies.condadependencies?view=azure-ml-py) needed by your model."
]
@@ -247,6 +255,7 @@
"from azureml.core import Environment\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"\n",
"\n",
"environment = Environment('my-sklearn-environment')\n",
"environment.python.conda_dependencies = CondaDependencies.create(pip_packages=[\n",
" 'azureml-defaults',\n",
@@ -278,7 +287,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Deploy your model in the custom environment by providing an [InferenceConfig](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.inferenceconfig?view=azure-ml-py) object to [Model.deploy()](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#deploy-workspace--name--models--inference-config--deployment-config-none--deployment-target-none-).\n",
"Deploy your model in the custom environment by providing an [InferenceConfig](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.inferenceconfig?view=azure-ml-py) object to [Model.deploy()](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#deploy-workspace--name--models--inference-config--deployment-config-none--deployment-target-none-). In this case we are also using the [AciWebservice.deploy_configuration()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.webservice.aci.aciwebservice#deploy-configuration-cpu-cores-none--memory-gb-none--tags-none--properties-none--description-none--location-none--auth-enabled-none--ssl-enabled-none--enable-app-insights-none--ssl-cert-pem-file-none--ssl-key-pem-file-none--ssl-cname-none--dns-name-label-none--) method to generate a custom deploy configuration.\n",
"\n",
"**Note**: This step can take several minutes."
]
@@ -288,15 +297,18 @@
"execution_count": null,
"metadata": {
"tags": [
"azuremlexception-remarks-sample"
"azuremlexception-remarks-sample",
"sample-aciwebservice-deploy-config"
]
},
"outputs": [],
"source": [
"from azureml.core import Webservice\n",
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.webservice import AciWebservice\n",
"from azureml.exceptions import WebserviceException\n",
"\n",
"\n",
"service_name = 'my-custom-env-service'\n",
"\n",
"# Remove any existing service under the same name.\n",
@@ -305,11 +317,14 @@
"except WebserviceException:\n",
" pass\n",
"\n",
"inference_config = InferenceConfig(entry_script='score.py',\n",
" source_directory='.',\n",
" environment=environment)\n",
"inference_config = InferenceConfig(entry_script='score.py', environment=environment)\n",
"aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)\n",
"\n",
"service = Model.deploy(ws, service_name, [model], inference_config)\n",
"service = Model.deploy(workspace=ws,\n",
" name=service_name,\n",
" models=[model],\n",
" inference_config=inference_config,\n",
" deployment_config=aci_config)\n",
"service.wait_for_deployment(show_output=True)"
]
},
@@ -328,6 +343,7 @@
"source": [
"import json\n",
"\n",
"\n",
"input_payload = json.dumps({\n",
" 'data': [\n",
" [ 0.03807591, 0.05068012, 0.06169621, 0.02187235, -0.0442235,\n",

View File

@@ -189,6 +189,15 @@
" return error"
]
},
{
"cell_type": "markdown",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"Please note that you must include azureml-defaults with version >= 1.0.45 as a pip dependency for your environment. This package contains the functionality needed to host the model as a web service."
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -206,16 +215,6 @@
" - inference-schema[numpy-support]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile C:/abc/dockerstep/customDockerStep.txt\n",
"RUN echo \"this is test\""
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -240,11 +239,10 @@
"source": [
"## Create Inference Configuration\n",
"\n",
" - source_directory = holds source path as string, this entire folder gets added in image so its really easy to access any files within this folder or subfolder\n",
" - runtime = Which runtime to use for the image. Current supported runtimes are 'spark-py' and 'python\n",
" - entry_script = contains logic specific to initializing your model and running predictions\n",
" - conda_file = manages conda and python package dependencies.\n",
" - extra_docker_file_steps = optional: any extra steps you want to inject into docker file"
 - file_path: input parameter to Environment.from_conda_specification. Manages conda and python package dependencies.\n",
 - myenv.docker.base_dockerfile: any extra steps you want to inject into docker file\n",
 - source_directory: holds source path as string, this entire folder gets added in image so it's really easy to access any files within this folder or subfolder\n",
 - entry_script: contains logic specific to initializing your model and running predictions"
]
},
{
@@ -253,13 +251,19 @@
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.environment import Environment\n",
"from azureml.core.model import InferenceConfig\n",
"\n",
"\n",
"myenv = Environment.from_conda_specification(name='myenv', file_path='env/myenv.yml')\n",
"\n",
"# explicitly set base_image to None when setting base_dockerfile\n",
"myenv.docker.base_image = None\n",
"myenv.docker.base_dockerfile = \"RUN echo \\\"this is test\\\"\"\n",
"\n",
"inference_config = InferenceConfig(source_directory=\"C:/abc\",\n",
" runtime=\"python\", \n",
" entry_script=\"x/y/score.py\",\n",
" conda_file=\"env/myenv.yml\", \n",
" extra_docker_file_steps=\"dockerstep/customDockerStep.txt\")"
" environment=myenv)\n"
]
},
{

View File

@@ -158,7 +158,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. *Create myenv.yml file*"
"## 5. *Create myenv.yml file*\n",
"Please note that you must include azureml-defaults with version >= 1.0.45 as a pip dependency, because it contains the functionality needed to host the model as a web service."
]
},
{
@@ -169,7 +170,8 @@
"source": [
"from azureml.core.conda_dependencies import CondaDependencies \n",
"\n",
"myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn'])\n",
"myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn'],\n",
" pip_packages=['azureml-defaults'])\n",
"\n",
"with open(\"myenv.yml\",\"w\") as f:\n",
" f.write(myenv.serialize_to_string())"
@@ -189,10 +191,11 @@
"outputs": [],
"source": [
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.environment import Environment\n",
"\n",
"inference_config = InferenceConfig(runtime= \"python\", \n",
" entry_script=\"score.py\",\n",
" conda_file=\"myenv.yml\")"
"\n",
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)"
]
},
{
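
The note above requires `azureml-defaults` with version >= 1.0.45 in the pip section of the conda specification. A minimal sketch of such a check follows. It works on a plain dictionary shaped like a parsed `myenv.yml`; the function names are hypothetical, and it assumes only `==` pins with purely numeric version parts.

```python
MIN_VERSION = (1, 0, 45)


def pip_packages(conda_spec):
    """Return the pip requirement strings from a conda spec dictionary."""
    for dep in conda_spec.get("dependencies", []):
        if isinstance(dep, dict) and "pip" in dep:
            return list(dep["pip"])
    return []


def has_hosting_package(conda_spec):
    """Check for azureml-defaults, unpinned or pinned to at least MIN_VERSION.

    Assumption: an unpinned entry is accepted, on the expectation that pip
    installs a recent release.
    """
    for requirement in pip_packages(conda_spec):
        name, sep, version = requirement.partition("==")
        if name.strip() != "azureml-defaults":
            continue
        if not sep:
            return True
        parts = tuple(int(p) for p in version.strip().split("."))
        return parts >= MIN_VERSION
    return False


spec = {"dependencies": ["numpy", "scikit-learn", {"pip": ["azureml-defaults"]}]}
print(has_hosting_package(spec))
```

Running a check like this before deployment is one way to catch a missing dependency early; it is not something the notebooks themselves do.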

View File

@@ -244,7 +244,7 @@
"metadata": {},
"source": [
"### Setting up inference configuration\n",
"First we create a YAML file that specifies which dependencies we would like to see in our container."
"First we create a YAML file that specifies which dependencies we would like to see in our container. Please note that you must include azureml-defaults with version >= 1.0.45 as a pip dependency, because it contains the functionality needed to host the model as a web service."
]
},
{
@@ -255,7 +255,7 @@
"source": [
"from azureml.core.conda_dependencies import CondaDependencies \n",
"\n",
"myenv = CondaDependencies.create(pip_packages=[\"numpy\",\"onnxruntime==0.4.0\",\"azureml-core\"])\n",
"myenv = CondaDependencies.create(pip_packages=[\"numpy\", \"onnxruntime==0.4.0\", \"azureml-core\", \"azureml-defaults\"])\n",
"\n",
"with open(\"myenv.yml\",\"w\") as f:\n",
" f.write(myenv.serialize_to_string())"
@@ -275,11 +275,11 @@
"outputs": [],
"source": [
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.environment import Environment\n",
"\n",
"inference_config = InferenceConfig(runtime= \"python\", \n",
" entry_script=\"score.py\",\n",
" conda_file=\"myenv.yml\",\n",
" extra_docker_file_steps = \"Dockerfile\")"
"\n",
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)"
]
},
{
@@ -373,7 +373,7 @@
"metadata": {},
"outputs": [],
"source": [
"#aci_service.delete()"
"aci_service.delete()"
]
}
],

View File

@@ -319,7 +319,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Write Environment File"
"### Write Environment File\n",
"Please note that you must include azureml-defaults with version >= 1.0.45 as a pip dependency, because it contains the functionality needed to host the model as a web service."
]
},
{
@@ -330,7 +331,8 @@
"source": [
"from azureml.core.conda_dependencies import CondaDependencies \n",
"\n",
"myenv = CondaDependencies.create(pip_packages=[\"numpy\", \"onnxruntime\", \"azureml-core\"])\n",
"\n",
"myenv = CondaDependencies.create(pip_packages=[\"numpy\", \"onnxruntime\", \"azureml-core\", \"azureml-defaults\"])\n",
"\n",
"with open(\"myenv.yml\",\"w\") as f:\n",
" f.write(myenv.serialize_to_string())"
@@ -350,11 +352,11 @@
"outputs": [],
"source": [
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.environment import Environment\n",
"\n",
"inference_config = InferenceConfig(runtime= \"python\", \n",
" entry_script=\"score.py\",\n",
" conda_file=\"myenv.yml\",\n",
" extra_docker_file_steps = \"Dockerfile\")"
"\n",
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)"
]
},
{
@@ -724,7 +726,7 @@
"source": [
"# remember to delete your service after you are done using it!\n",
"\n",
"# aci_service.delete()"
"aci_service.delete()"
]
},
{

View File

@@ -306,7 +306,7 @@
"source": [
"### Write Environment File\n",
"\n",
"This step creates a YAML environment file that specifies which dependencies we would like to see in our Linux Virtual Machine."
"This step creates a YAML environment file that specifies which dependencies we would like to see in our Linux Virtual Machine. Please note that you must include azureml-defaults with version >= 1.0.45 as a pip dependency, because it contains the functionality needed to host the model as a web service."
]
},
{
@@ -317,7 +317,7 @@
"source": [
"from azureml.core.conda_dependencies import CondaDependencies \n",
"\n",
"myenv = CondaDependencies.create(pip_packages=[\"numpy\", \"onnxruntime\", \"azureml-core\"])\n",
"myenv = CondaDependencies.create(pip_packages=[\"numpy\", \"onnxruntime\", \"azureml-core\", \"azureml-defaults\"])\n",
"\n",
"with open(\"myenv.yml\",\"w\") as f:\n",
" f.write(myenv.serialize_to_string())"
@@ -337,11 +337,11 @@
"outputs": [],
"source": [
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.environment import Environment\n",
"\n",
"inference_config = InferenceConfig(runtime= \"python\", \n",
" entry_script=\"score.py\",\n",
" extra_docker_file_steps = \"Dockerfile\",\n",
" conda_file=\"myenv.yml\")"
"\n",
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)"
]
},
{
@@ -733,7 +733,7 @@
"source": [
"# remember to delete your service after you are done using it!\n",
"\n",
"# aci_service.delete()"
"aci_service.delete()"
]
},
{

View File

@@ -241,7 +241,8 @@
"source": [
"from azureml.core.conda_dependencies import CondaDependencies \n",
"\n",
"myenv = CondaDependencies.create(pip_packages=[\"numpy\",\"onnxruntime\",\"azureml-core\"])\n",
"\n",
"myenv = CondaDependencies.create(pip_packages=[\"numpy\", \"onnxruntime\", \"azureml-core\", \"azureml-defaults\"])\n",
"\n",
"with open(\"myenv.yml\",\"w\") as f:\n",
" f.write(myenv.serialize_to_string())"
@@ -251,7 +252,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Create the inference configuration object"
"Create the inference configuration object. Please note that you must include azureml-defaults with version >= 1.0.45 as a pip dependency, because it contains the functionality needed to host the model as a web service."
]
},
{
@@ -261,11 +262,11 @@
"outputs": [],
"source": [
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.environment import Environment\n",
"\n",
"inference_config = InferenceConfig(runtime= \"python\", \n",
" entry_script=\"score.py\",\n",
" conda_file=\"myenv.yml\",\n",
" extra_docker_file_steps = \"Dockerfile\")"
"\n",
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)"
]
},
{
@@ -361,7 +362,7 @@
"metadata": {},
"outputs": [],
"source": [
"#aci_service.delete()"
"aci_service.delete()"
]
}
],

View File

@@ -405,7 +405,7 @@
"metadata": {},
"source": [
"### Create inference configuration\n",
"First we create a YAML file that specifies which dependencies we would like to see in our container."
"First we create a YAML file that specifies which dependencies we would like to see in our container. Please note that you must include azureml-defaults with version >= 1.0.45 as a pip dependency, because it contains the functionality needed to host the model as a web service."
]
},
{
@@ -416,7 +416,7 @@
"source": [
"from azureml.core.conda_dependencies import CondaDependencies \n",
"\n",
"myenv = CondaDependencies.create(pip_packages=[\"numpy\",\"onnxruntime\",\"azureml-core\"])\n",
"myenv = CondaDependencies.create(pip_packages=[\"numpy\",\"onnxruntime\",\"azureml-core\", \"azureml-defaults\"])\n",
"\n",
"with open(\"myenv.yml\",\"w\") as f:\n",
" f.write(myenv.serialize_to_string())"
@@ -436,11 +436,11 @@
"outputs": [],
"source": [
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.environment import Environment\n",
"\n",
"inference_config = InferenceConfig(runtime= \"python\", \n",
" entry_script=\"score.py\",\n",
" conda_file=\"myenv.yml\",\n",
" extra_docker_file_steps = \"Dockerfile\")"
"\n",
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)"
]
},
{
@@ -537,7 +537,7 @@
"metadata": {},
"outputs": [],
"source": [
"#aci_service.delete()"
"aci_service.delete()"
]
}
],

View File

@@ -318,7 +318,11 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": [
"sample-deploy-to-aks"
]
},
"outputs": [],
"source": [
"# Set the web service configuration (using default here)\n",
@@ -331,7 +335,11 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"tags": [
"sample-deploy-to-aks"
]
},
"outputs": [],
"source": [
"%%time\n",

View File

@@ -1,457 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/deployment/register-model-create-image-deploy-service/register-model-create-image-deploy-service.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Register Model, Create Image and Deploy Service\n",
"\n",
"This example shows how to deploy a web service in step-by-step fashion:\n",
"\n",
" 1. Register model\n",
" 2. Query versions of models and select one to deploy\n",
" 3. Create Docker image\n",
" 4. Query versions of images\n",
" 5. Deploy the image as web service\n",
" \n",
"**IMPORTANT**:\n",
" * This notebook requires you to first complete [train-within-notebook](../../training/train-within-notebook/train-within-notebook.ipynb) example\n",
" \n",
"The train-within-notebook example taught you how to deploy a web service directly from model in one step. This Notebook shows a more advanced approach that gives you more control over model versions and Docker image versions. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the [configuration](../../../configuration.ipynb) Notebook first if you haven't."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check core SDK version number\n",
"import azureml.core\n",
"\n",
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize Workspace\n",
"\n",
"Initialize a workspace object from persisted configuration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"create workspace"
]
},
"outputs": [],
"source": [
"from azureml.core import Workspace\n",
"\n",
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Register Model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can add tags and descriptions to your models. Note you need to have a `sklearn_regression_model.pkl` file in the current directory. This file is generated by the train-within-notebook example. The call below registers that file as a model with the same name `sklearn_regression_model.pkl` in the workspace.\n",
"\n",
"Using tags, you can track useful information such as the name and version of the machine learning library used to train the model. Note that tags must be alphanumeric."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"register model from file"
]
},
"outputs": [],
"source": [
"from azureml.core.model import Model\n",
"import sklearn\n",
"\n",
"library_version = \"sklearn\"+sklearn.__version__.replace(\".\",\"x\")\n",
"\n",
"model = Model.register(model_path = \"sklearn_regression_model.pkl\",\n",
" model_name = \"sklearn_regression_model.pkl\",\n",
" tags = {'area': \"diabetes\", 'type': \"regression\", 'version': library_version},\n",
" description = \"Ridge regression model to predict diabetes\",\n",
" workspace = ws)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can explore the registered models within your workspace and query by tag. Models are versioned. If you call `Model.register` many times with the same model name, you will get multiple versions of the model with increasing version numbers."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"register model from file"
]
},
"outputs": [],
"source": [
"regression_models = Model.list(workspace=ws, tags=['area'])\n",
"for m in regression_models:\n",
" print(\"Name:\", m.name,\"\\tVersion:\", m.version, \"\\tDescription:\", m.description, m.tags)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can pick a specific model to deploy"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(model.name, model.description, model.version, sep = '\\t')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Docker Image"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Show `score.py`. Note that the `sklearn_regression_model.pkl` path built from `AZUREML_MODEL_DIR` refers to the model named `sklearn_regression_model.pkl` registered under the workspace. It is NOT referencing the local file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile score.py\n",
"import os\n",
"import pickle\n",
"import json\n",
"import numpy\n",
"from sklearn.externals import joblib\n",
"from sklearn.linear_model import Ridge\n",
"\n",
"def init():\n",
" global model\n",
" # AZUREML_MODEL_DIR is an environment variable created during deployment.\n",
" # It is the path to the model folder (./azureml-models/$MODEL_NAME/$VERSION)\n",
" # For multiple models, it points to the folder containing all deployed models (./azureml-models)\n",
" model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'sklearn_regression_model.pkl')\n",
" # deserialize the model file back into a sklearn model\n",
" model = joblib.load(model_path)\n",
"\n",
"# note you can pass in multiple rows for scoring\n",
"def run(raw_data):\n",
" try:\n",
" data = json.loads(raw_data)['data']\n",
" data = numpy.array(data)\n",
" result = model.predict(data)\n",
" # you can return any datatype as long as it is JSON-serializable\n",
" return result.tolist()\n",
" except Exception as e:\n",
" error = str(e)\n",
" return error"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.conda_dependencies import CondaDependencies \n",
"\n",
"myenv = CondaDependencies.create(conda_packages=['numpy','scikit-learn'])\n",
"\n",
"with open(\"myenv.yml\",\"w\") as f:\n",
" f.write(myenv.serialize_to_string())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the following command can take a few minutes.\n",
"\n",
"You can add tags and descriptions to images. Also, an image can contain multiple models."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"create image",
"sample-image-create"
]
},
"outputs": [],
"source": [
"from azureml.core.image import Image, ContainerImage\n",
"\n",
"image_config = ContainerImage.image_configuration(runtime= \"python\",\n",
" execution_script=\"score.py\",\n",
" conda_file=\"myenv.yml\",\n",
" tags = {'area': \"diabetes\", 'type': \"regression\"},\n",
" description = \"Image with ridge regression model\")\n",
"\n",
"image = Image.create(name = \"myimage1\",\n",
" # this is the model object. note you can pass in 0-n models via this list-type parameter\n",
" # in case you need to reference multiple models, or none at all, in your scoring script.\n",
" models = [model],\n",
" image_config = image_config, \n",
" workspace = ws)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"create image"
]
},
"outputs": [],
"source": [
"image.wait_for_creation(show_output = True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Use a custom Docker image\n",
"\n",
"You can also specify a custom Docker image to be used as base image if you don't want to use the default base image provided by Azure ML. Please make sure the custom Docker image has Ubuntu >= 16.04, Conda >= 4.5.\\* and Python(3.5.\\* or 3.6.\\*).\n",
"\n",
"Only Supported for `ContainerImage`(from azureml.core.image) with `python` runtime.\n",
"```python\n",
"# use an image available in public Container Registry without authentication\n",
"image_config.base_image = \"mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda\"\n",
"\n",
"# or, use an image available in a private Container Registry\n",
"image_config.base_image = \"myregistry.azurecr.io/mycustomimage:1.0\"\n",
"image_config.base_image_registry.address = \"myregistry.azurecr.io\"\n",
"image_config.base_image_registry.username = \"username\"\n",
"image_config.base_image_registry.password = \"password\"\n",
"\n",
"# or, use an image built during training.\n",
"image_config.base_image = run.properties[\"AzureML.DerivedImageName\"]\n",
"```\n",
"You can get the address of training image from the properties of a Run object. Only new runs submitted with azureml-sdk>=1.0.22 to AMLCompute targets will have the 'AzureML.DerivedImageName' property. Instructions on how to get a Run can be found in [manage-runs](../../training/manage-runs/manage-runs.ipynb). \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"List images by tag and find out the detailed build log for debugging."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"create image"
]
},
"outputs": [],
"source": [
"for i in Image.list(workspace = ws,tags = [\"area\"]):\n",
" print('{}(v.{} [{}]) stored at {} with build log {}'.format(i.name, i.version, i.creation_state, i.image_location, i.image_build_log_uri))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Deploy image as web service on Azure Container Instance\n",
"\n",
"Note that the service creation can take a few minutes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"deploy service",
"aci",
"sample-aciwebservice-deploy-config"
]
},
"outputs": [],
"source": [
"from azureml.core.webservice import AciWebservice\n",
"\n",
"aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1, \n",
" memory_gb = 1, \n",
" tags = {'area': \"diabetes\", 'type': \"regression\"}, \n",
" description = 'Predict diabetes using regression model')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"deploy service",
"aci",
"sample-aciwebservice-deploy-from-image"
]
},
"outputs": [],
"source": [
"from azureml.core.webservice import Webservice\n",
"\n",
"aci_service_name = 'my-aci-service-2'\n",
"print(aci_service_name)\n",
"aci_service = Webservice.deploy_from_image(deployment_config = aciconfig,\n",
" image = image,\n",
" name = aci_service_name,\n",
" workspace = ws)\n",
"aci_service.wait_for_deployment(True)\n",
"print(aci_service.state)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Test web service"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Call the web service with some dummy input data to get a prediction."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"deploy service",
"aci"
]
},
"outputs": [],
"source": [
"import json\n",
"\n",
"test_sample = json.dumps({'data': [\n",
" [1,2,3,4,5,6,7,8,9,10], \n",
" [10,9,8,7,6,5,4,3,2,1]\n",
"]})\n",
"test_sample = bytes(test_sample,encoding = 'utf8')\n",
"\n",
"prediction = aci_service.run(input_data=test_sample)\n",
"print(prediction)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Delete ACI to clean up"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"deploy service",
"aci"
]
},
"outputs": [],
"source": [
"aci_service.delete()"
]
}
],
"metadata": {
"authors": [
{
"name": "aashishb"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -1,8 +0,0 @@
name: register-model-create-image-deploy-service
dependencies:
- pip:
- azureml-sdk
- matplotlib
- tqdm
- scipy
- sklearn

View File

@@ -0,0 +1 @@
{"class":"org.apache.spark.ml.classification.LogisticRegressionModel","timestamp":1570147252329,"sparkVersion":"2.4.0","uid":"LogisticRegression_5df3978caaf3","paramMap":{"regParam":0.01},"defaultParamMap":{"aggregationDepth":2,"threshold":0.5,"rawPredictionCol":"rawPrediction","featuresCol":"features","labelCol":"label","predictionCol":"prediction","family":"auto","regParam":0.0,"tol":1.0E-6,"probabilityCol":"probability","standardization":true,"elasticNetParam":0.0,"maxIter":100,"fitIntercept":true}}

View File

@@ -0,0 +1,343 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/deployment/deploy-to-cloud/model-register-and-deploy.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Register Spark Model and deploy as Webservice\n",
"\n",
"This example shows how to deploy a Webservice in step-by-step fashion:\n",
"\n",
" 1. Register Spark Model\n",
" 2. Deploy Spark Model as Webservice"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the [configuration](../../../configuration.ipynb) Notebook first if you haven't."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Check core SDK version number\n",
"import azureml.core\n",
"\n",
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize Workspace\n",
"\n",
"Initialize a workspace object from persisted configuration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"create workspace"
]
},
"outputs": [],
"source": [
"from azureml.core import Workspace\n",
"\n",
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Register Model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can add tags and descriptions to your Models. Note you need to have a `iris.model` file in the current directory. This model file is generated using [train in spark](../training/train-in-spark/train-in-spark.ipynb) notebook. The below call registers that file as a Model with the same name `iris.model` in the workspace.\n",
"\n",
"Using tags, you can track useful information such as the name and version of the machine learning library used to train the model. Note that tags must be alphanumeric."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"register model from file"
]
},
"outputs": [],
"source": [
"from azureml.core.model import Model\n",
"\n",
"model = Model.register(model_path=\"iris.model\",\n",
" model_name=\"iris.model\",\n",
" tags={'type': \"regression\"},\n",
" description=\"Logistic regression model to predict iris species\",\n",
" workspace=ws)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Fetch Environment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can now create and/or use an Environment object when deploying a Webservice. The Environment can have been previously registered with your Workspace, or it will be registered with it as a part of the Webservice deployment.\n",
"\n",
"In this notebook, we will be using 'AzureML-PySpark-MmlSpark-0.15', a curated environment.\n",
"\n",
"More information can be found in our [using environments notebook](../training/using-environments/using-environments.ipynb)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Environment\n",
"\n",
"env = Environment.get(ws, name='AzureML-PySpark-MmlSpark-0.15')\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Inference Configuration\n",
"\n",
"There is now support for a source directory, you can upload an entire folder from your local machine as dependencies for the Webservice.\n",
"Note: in that case, your entry_script is relative path to the source_directory path.\n",
"\n",
"Sample code for using a source directory:\n",
"\n",
"```python\n",
"inference_config = InferenceConfig(source_directory=\"C:/abc\",\n",
" entry_script=\"x/y/score.py\",\n",
" environment=environment)\n",
"```\n",
"\n",
" - source_directory = holds source path as string, this entire folder gets added in image so its really easy to access any files within this folder or subfolder\n",
" - entry_script = contains logic specific to initializing your model and running predictions\n",
" - environment = An environment object to use for the deployment. Doesn't have to be registered"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"create image"
]
},
"outputs": [],
"source": [
"from azureml.core.model import InferenceConfig\n",
"\n",
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=env)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Deploy Model as Webservice on Azure Container Instance\n",
"\n",
"Note that the service creation can take few minutes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"azuremlexception-remarks-sample"
]
},
"outputs": [],
"source": [
"from azureml.core.webservice import AciWebservice, Webservice\n",
"from azureml.exceptions import WebserviceException\n",
"\n",
"deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)\n",
"aci_service_name = 'aciservice1'\n",
"\n",
"try:\n",
" # if you want to get existing service below is the command\n",
" # since aci name needs to be unique in subscription deleting existing aci if any\n",
" # we use aci_service_name to create azure aci\n",
" service = Webservice(ws, name=aci_service_name)\n",
" if service:\n",
" service.delete()\n",
"except WebserviceException as e:\n",
" print()\n",
"\n",
"service = Model.deploy(ws, aci_service_name, [model], inference_config, deployment_config)\n",
"\n",
"service.wait_for_deployment(True)\n",
"print(service.state)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Test web service"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"test_sample = json.dumps({'features':{'type':1,'values':[4.3,3.0,1.1,0.1]},'label':2.0})\n",
"\n",
"test_sample_encoded = bytes(test_sample, encoding='utf8')\n",
"prediction = service.run(input_data=test_sample_encoded)\n",
"print(prediction)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Delete ACI to clean up"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"deploy service",
"aci"
]
},
"outputs": [],
"source": [
"service.delete()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Model Profiling\n",
"\n",
"You can also take advantage of the profiling feature to estimate CPU and memory requirements for models.\n",
"\n",
"```python\n",
"profile = Model.profile(ws, \"profilename\", [model], inference_config, test_sample)\n",
"profile.wait_for_profiling(True)\n",
"profiling_results = profile.get_results()\n",
"print(profiling_results)\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Model Packaging\n",
"\n",
"If you want to build a Docker image that encapsulates your model and its dependencies, you can use the model packaging option. The output image will be pushed to your workspace's ACR.\n",
"\n",
"You must include an Environment object in your inference configuration to use `Model.package()`.\n",
"\n",
"```python\n",
"package = Model.package(ws, [model], inference_config)\n",
"package.wait_for_creation(show_output=True) # Or show_output=False to hide the Docker build logs.\n",
"package.pull()\n",
"```\n",
"\n",
"Instead of a fully-built image, you can also generate a Dockerfile and download all the assets needed to build an image on top of your Environment.\n",
"\n",
"```python\n",
"package = Model.package(ws, [model], inference_config, generate_dockerfile=True)\n",
"package.wait_for_creation(show_output=True)\n",
"package.save(\"./local_context_dir\")\n",
"```"
]
}
],
"metadata": {
"authors": [
{
"name": "aashishb"
}
],
"category": "deployment",
"compute": [
"None"
],
"datasets": [
"Iris"
],
"deployment": [
"Azure Container Instance"
],
"exclude_from_index": false,
"framework": [
"PySpark"
],
"friendly_name": "Register Spark model and deploy as webservice",
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,4 @@
name: model-register-and-deploy-spark
dependencies:
- pip:
- azureml-sdk

View File

@@ -0,0 +1,37 @@
import traceback

from pyspark.ml.linalg import VectorUDT
from azureml.core.model import Model
from pyspark.ml.classification import LogisticRegressionModel
from pyspark.sql.types import StructType, StructField
from pyspark.sql.types import DoubleType
from pyspark.sql import SQLContext
from pyspark import SparkContext

sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)
spark = sqlContext.sparkSession

input_schema = StructType([StructField("features", VectorUDT()), StructField("label", DoubleType())])
reader = spark.read
reader.schema(input_schema)


def init():
    global model
    # note here "iris.model" is the name of the model registered under the workspace
    # this call should return the path to the registered model on the local disk.
    model_path = Model.get_model_path('iris.model')
    # Load the saved model back into a LogisticRegressionModel
    model = LogisticRegressionModel.load(model_path)


def run(data):
    try:
        input_df = reader.json(sc.parallelize([data]))
        result = model.transform(input_df)
        # you can return any datatype as long as it is JSON-serializable
        return result.collect()[0]['prediction']
    except Exception as e:
        traceback.print_exc()
        error = str(e)
        return error
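
The `run` function above reads each request as one JSON row matching `input_schema`. As a rough illustration, the sketch below builds such a row in plain Python, mirroring the test sample used in the notebook. `make_dense_payload` is a hypothetical helper, not part of the sample, and it assumes that `"type": 1` marks a dense vector in Spark's JSON form of `VectorUDT`.

```python
import json


def make_dense_payload(values, label):
    """Build the JSON string for one row with a dense feature vector.

    Hypothetical helper for illustration only. Assumption: "type": 1
    denotes a dense vector, as in the notebook's own test sample.
    """
    row = {
        "features": {"type": 1, "values": [float(v) for v in values]},
        "label": float(label),
    }
    return json.dumps(row)


# Same numbers as the notebook's test sample.
payload = make_dense_payload([4.3, 3.0, 1.1, 0.1], label=2.0)
decoded = json.loads(payload)
print(decoded["features"]["values"])
```

The resulting string could then be encoded and passed as `input_data` to `service.run`, as the notebook's test cell does.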

View File

@@ -308,7 +308,9 @@
"source": [
"## Deploy \n",
"\n",
"Deploy Model and ScoringExplainer"
"Deploy Model and ScoringExplainer.\n",
"\n",
"Please note that you must indicate azureml-defaults with verion >= 1.0.45 as a pip dependency, because it contains the functionality needed to host the model as a web service."
]
},
{
@@ -319,7 +321,7 @@
"source": [
"from azureml.core.conda_dependencies import CondaDependencies \n",
"\n",
"# WARNING: to install this, g++ needs to be available on the Docker image and is not by default (look at the next cell)\n",
"# azureml-defaults is required to host the model as a web service.\n",
"azureml_pip_packages = [\n",
" 'azureml-defaults', 'azureml-contrib-interpret', 'azureml-core', 'azureml-telemetry',\n",
" 'azureml-interpret'\n",
@@ -338,16 +340,6 @@
" print(f.read())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%writefile dockerfile\n",
"RUN apt-get update && apt-get install -y g++ "
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -369,6 +361,8 @@
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.webservice import AciWebservice\n",
"from azureml.core.model import Model\n",
"from azureml.core.environment import Environment\n",
"\n",
"\n",
"aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, \n",
" memory_gb=1, \n",
@@ -376,10 +370,8 @@
" \"method\" : \"local_explanation\"}, \n",
" description='Get local explanations for IBM Employee Attrition data')\n",
"\n",
"inference_config = InferenceConfig(runtime= \"python\", \n",
" entry_script=\"score_local_explain.py\",\n",
" conda_file=\"myenv.yml\",\n",
" extra_docker_file_steps=\"dockerfile\")\n",
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
"inference_config = InferenceConfig(entry_script=\"score_local_explain.py\", environment=myenv)\n",
"\n",
"# Use configs and models generated above\n",
"service = Model.deploy(ws, 'model-scoring-deploy-local', [scoring_explainer_model, original_model], inference_config, aciconfig)\n",

View File

@@ -43,5 +43,7 @@ In this directory, there are two types of notebooks:
1. [pipeline-batch-scoring.ipynb](https://aka.ms/pl-batch-score): This notebook demonstrates how to run a batch scoring job using Azure Machine Learning pipelines.
2. [pipeline-style-transfer.ipynb](https://aka.ms/pl-style-trans): This notebook demonstrates a multi-step pipeline that uses GPU compute. This sample also showcases how to use conda dependencies using runconfig when using Pipelines.
3. [nyc-taxi-data-regression-model-building.ipynb](https://aka.ms/pl-nyctaxi-tutorial): This notebook is an AzureML Pipelines version of the previously published two part sample.
4. [file-dataset-image-inference-mnist.ipynb](https://aka.ms/pl-pr-filedata): This notebook demonstrates how to use ParallelRunStep to process unstructured data (file dataset).
5. [tabular-dataset-inference-iris.ipynb](https://aka.ms/pl-pr-tabulardata): This notebook demonstrates how to use ParallelRunStep to process structured data (tabular dataset).
![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/README.png)

View File

@@ -246,7 +246,7 @@
"metadata": {},
"source": [
"## Create TensorFlow estimator\n",
"Next, we construct an [TensorFlow](https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.dnn.tensorflow?view=azure-ml-py) estimator object.\n",
"Next, we construct an [TensorFlow](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.tensorflow?view=azure-ml-py) estimator object.\n",
"The TensorFlow estimator is providing a simple way of launching a TensorFlow training job on a compute target. It will automatically provide a docker image that has TensorFlow installed -- if additional pip or conda packages are required, their names can be passed in via the `pip_packages` and `conda_packages` arguments and they will be included in the resulting docker.\n",
"\n",
"The TensorFlow estimator also takes a `framework_version` parameter -- if no version is provided, the estimator will default to the latest version supported by AzureML. Use `TensorFlow.get_supported_versions()` to get a list of all versions supported by your current SDK version or see the [SDK documentation](https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.dnn?view=azure-ml-py) for the versions supported in the most current release.\n",
@@ -385,7 +385,7 @@
"outputs": [],
"source": [
"metrics_output_name = 'metrics_output'\n",
"metirics_data = PipelineData(name='metrics_data',\n",
"metrics_data = PipelineData(name='metrics_data',\n",
" datastore=ds,\n",
" pipeline_output_name=metrics_output_name)\n",
"\n",
@@ -395,7 +395,7 @@
" hyperdrive_config=hd_config,\n",
" estimator_entry_script_arguments=['--data-folder', data_folder],\n",
" inputs=[data_folder],\n",
" metrics_output=metirics_data)"
" metrics_output=metrics_data)"
]
},
{
@@ -620,14 +620,13 @@
"outputs": [],
"source": [
"%%time\n",
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.environment import Environment\n",
"from azureml.core.model import Model, InferenceConfig\n",
"from azureml.core.webservice import AciWebservice\n",
"from azureml.core.webservice import Webservice\n",
"from azureml.core.model import Model\n",
"\n",
"inference_config = InferenceConfig(runtime = \"python\", \n",
" entry_script = \"score.py\",\n",
" conda_file = \"myenv.yml\")\n",
"\n",
"myenv = Environment.from_conda_specification(name=\"env\", file_path=\"myenv.yml\")\n",
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)\n",
"\n",
"aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, \n",
" memory_gb=1, \n",

View File

@@ -180,7 +180,7 @@
"# just get the published pipeline object that you have the ID for.\n",
"\n",
"# Get all published pipeline objects in the workspace\n",
"all_pub_pipelines = PublishedPipeline.get_all(ws)\n",
"all_pub_pipelines = PublishedPipeline.list(ws)\n",
"\n",
"# We will iterate through the list of published pipelines and \n",
"# use the last ID in the list for Schelue operations: \n",
@@ -244,7 +244,7 @@
"metadata": {},
"outputs": [],
"source": [
"schedules = Schedule.get_all(ws, pipeline_id=pub_pipeline_id)\n",
"schedules = Schedule.list(ws, pipeline_id=pub_pipeline_id)\n",
"\n",
"# We will iterate through the list of schedules and \n",
"# use the last recurrence schedule in the list for further operations: \n",
@@ -272,7 +272,7 @@
"outputs": [],
"source": [
"# Use active_only=False to get all schedules including disabled schedules\n",
"schedules = Schedule.get_all(ws, active_only=True) \n",
"schedules = Schedule.list(ws, active_only=True) \n",
"print(\"Your workspace has the following schedules set up:\")\n",
"for schedule in schedules:\n",
" print(\"{} (Published pipeline: {}\".format(schedule.id, schedule.pipeline_id))"

View File

@@ -230,7 +230,7 @@
"metadata": {},
"outputs": [],
"source": [
"endpoint_list = PipelineEndpoint.get_all(workspace=ws, active_only=True)\n",
"endpoint_list = PipelineEndpoint.list(workspace=ws, active_only=True)\n",
"endpoint_list"
]
},
@@ -360,7 +360,7 @@
"metadata": {},
"outputs": [],
"source": [
"versions = pipeline_endpoint_by_name.get_all_versions()\n",
"versions = pipeline_endpoint_by_name.list_versions()\n",
"\n",
"for ve in versions:\n",
" print(ve.version)\n",
@@ -381,7 +381,7 @@
"metadata": {},
"outputs": [],
"source": [
"pipelines = pipeline_endpoint_by_name.get_all_pipelines(active_only=True)\n",
"pipelines = pipeline_endpoint_by_name.list_pipelines(active_only=True)\n",
"pipelines"
]
},

View File

@@ -285,7 +285,7 @@
"metrics_output_name = 'metrics_output'\n",
"best_model_output_name = 'best_model_output'\n",
"\n",
"metirics_data = PipelineData(name='metrics_data',\n",
"metrics_data = PipelineData(name='metrics_data',\n",
" datastore=ds,\n",
" pipeline_output_name=metrics_output_name,\n",
" training_output=TrainingOutput(type='Metrics'))\n",
@@ -311,7 +311,7 @@
"automl_step = AutoMLStep(\n",
" name='automl_module',\n",
" automl_config=automl_config,\n",
" outputs=[metirics_data, model_data],\n",
" outputs=[metrics_data, model_data],\n",
" allow_reuse=True)"
]
},

View File

@@ -0,0 +1,433 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved. \n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-getting-started.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Azure Machine Learning Pipeline with NotebookRunnerStep\n",
"This notebook demonstrates the use of `NotebookRunnerStep`. It allows you to run a local notebook as a step in Azure Machine Learning Pipeline."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction\n",
"In this example we showcase how you can run another notebook `notebook_runner/training_notebook.ipynb` as a step in Azure Machine Learning Pipeline.\n",
"\n",
"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you have executed the [configuration](https://aka.ms/pl-config) before running this notebook.\n",
"\n",
"In this notebook you will learn how to:\n",
"1. Create an `Experiment` in an existing `Workspace`.\n",
"2. Create or Attach existing AmlCompute to a workspace.\n",
"3. Configure NotebookRun using `NotebokRunConfig`.\n",
"5. Use NotebookRunnerStep.\n",
"6. Run the notebook on `AmlCompute` as a pipeline step consuming the output of a python script step.\n",
"\n",
"Advantages of running your notebook as a step in pipeline:\n",
"1. Run your notebook like a python script without converting into .py files, leveraging complete end to end experience of Azure Machine Learning Pipelines.\n",
"2. Use pipeline intermediate data to and from the notebook along with other steps in pipeline.\n",
"3. Parameterize your notebook with [Pipeline Parameters](./aml-pipelines-publish-and-run-using-rest-endpoint.ipynb).\n",
"\n",
"Try some more [quick start notebooks](https://github.com/microsoft/recommenders/tree/master/notebooks/00_quick_start) with `NotebookRunnerStep`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Azure Machine Learning and Pipeline SDK-specific imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import azureml.core\n",
"\n",
"from azureml.core.compute import AmlCompute, ComputeTarget\n",
"from azureml.core.runconfig import RunConfiguration\n",
"from azureml.data.data_reference import DataReference\n",
"from azureml.pipeline.core import PipelineData\n",
"from azureml.core.datastore import Datastore\n",
"\n",
"from azureml.widgets import RunDetails\n",
"\n",
"from azureml.core import Workspace, Experiment\n",
"from azureml.contrib.notebook import NotebookRunConfig, AzureMLNotebookHandler\n",
"\n",
"from azureml.pipeline.core import Pipeline\n",
"from azureml.pipeline.steps import PythonScriptStep\n",
"from azureml.contrib.notebook import NotebookRunnerStep\n",
"\n",
"# Check core SDK version number\n",
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize Workspace\n",
"\n",
"Initialize a [workspace](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.workspace(class%29) object from persisted configuration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ws = Workspace.from_config()\n",
"print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\\n')\n",
"ws.set_default_datastore(\"workspaceblobstore\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Upload data to datastore"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"Datastore.get(ws, \"workspaceblobstore\").upload_files([\"./20news.pkl\"], target_path=\"20newsgroups\", overwrite=True)\n",
"print(\"Upload call completed\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create an Azure ML experiment\n",
"Let's create an experiment named \"notebook-step-run-example\" and a folder to holding the notebook and other scripts. The script runs will be recorded under the experiment in Azure.\n",
"\n",
"The best practice is to use separate folders for scripts and its dependent files for each step and specify that folder as the `source_directory` for the step. This helps reduce the size of the snapshot created for the step (only the specific folder is snapshotted). Since changes in any files in the `source_directory` would trigger a re-upload of the snapshot, this helps keep the reuse of the step when there are no changes in the `source_directory` of the step."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Choose a name for the run history container in the workspace.\n",
"experiment_name = 'notebook-step-run-example'\n",
"source_directory = 'notebook_runner'\n",
"\n",
"experiment = Experiment(ws, experiment_name)\n",
"experiment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create or Attach an AmlCompute cluster\n",
"You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for your AutoML run. In this tutorial, you get the default `AmlCompute` as your training compute resource."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Choose a name for your cluster.\n",
"amlcompute_cluster_name = \"cpu-cluster\"\n",
"\n",
"found = False\n",
"# Check if this compute target already exists in the workspace.\n",
"cts = ws.compute_targets\n",
"if amlcompute_cluster_name in cts and cts[amlcompute_cluster_name].type == 'AmlCompute':\n",
" found = True\n",
" print('Found existing compute target.')\n",
" compute_target = cts[amlcompute_cluster_name]\n",
" \n",
"if not found:\n",
" print('Creating a new compute target...')\n",
" provisioning_config = AmlCompute.provisioning_configuration(vm_size = \"STANDARD_D2_V2\", # for GPU, use \"STANDARD_NC6\"\n",
" #vm_priority = 'lowpriority', # optional\n",
" max_nodes = 4)\n",
"\n",
" # Create the cluster.\n",
" compute_target = ComputeTarget.create(ws, amlcompute_cluster_name, provisioning_config)\n",
" \n",
" # Can poll for a minimum number of nodes and for a specific timeout.\n",
" # If no min_node_count is provided, it will use the scale settings for the cluster.\n",
" compute_target.wait_for_completion(show_output = True, min_node_count = 1, timeout_in_minutes = 10)\n",
" \n",
" # For a more detailed view of current AmlCompute status, use get_status()."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a new RunConfig object"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.conda_dependencies import CondaDependencies\n",
"\n",
"conda_run_config = RunConfiguration(framework=\"python\")\n",
"\n",
"conda_run_config.environment.docker.enabled = True\n",
"conda_run_config.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE\n",
"\n",
"cd = CondaDependencies.create(pip_packages=['azureml-sdk'], pin_sdk_version=False)\n",
"conda_run_config.environment.python.conda_dependencies = cd\n",
"\n",
"print('run config is ready')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define input and outputs"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"input_data = DataReference(\n",
" datastore=Datastore.get(ws, \"workspaceblobstore\"),\n",
" data_reference_name=\"blob_test_data\",\n",
" path_on_datastore=\"20newsgroups/20news.pkl\")\n",
"\n",
"output_data = PipelineData(name=\"processed_data\",\n",
" datastore=Datastore.get(ws, \"workspaceblobstore\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create notebook run configuration and set parameters values"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"handler = AzureMLNotebookHandler(timeout=600, progress_bar=False, log_output=True)\n",
"\n",
"cfg = NotebookRunConfig(source_directory=source_directory, notebook=\"training_notebook.ipynb\",\n",
" handler = handler,\n",
" parameters={\"arg1\": \"Machine Learning\"},\n",
" run_config=conda_run_config)\n",
"\n",
"print(\"Notebook Run Config is created.\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define PythonScriptStep"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print('Source directory for the step is {}.'.format(os.path.realpath('./train')))\n",
"python_script_step = PythonScriptStep(\n",
" script_name=\"train.py\",\n",
" arguments=[\"--input_data\", input_data],\n",
" inputs=[input_data],\n",
" outputs=[output_data],\n",
" compute_target=compute_target, \n",
" source_directory=\"./train\",\n",
" allow_reuse=True)\n",
"print(\"python_script_step created\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define NotebookRunnerStep\n",
"\n",
"This step will consume intermediate output produced by `python_script_step` as an input.\n",
"\n",
"Optionally, a output of type `output_notebook_pipeline_data_name` can be added to the `NotebookRunnerStep` to redirect the `output_notebook` of notebook run to `NotebookRunnerStep`'s step output produced as `PipelineData` and can be further passed along the pipeline."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.pipeline.core import PipelineParameter, TrainingOutput\n",
"\n",
"output_from_notebook = PipelineData(name=\"notebook_processed_data\",\n",
" datastore=Datastore.get(ws, \"workspaceblobstore\"))\n",
"\n",
"my_pipeline_param = PipelineParameter(name=\"pipeline_param\", default_value=\"my_param\")\n",
"\n",
"print('Source directory for the step is {}.'.format(os.path.realpath(source_directory)))\n",
"notebook_runner_step = NotebookRunnerStep(name=\"training_notebook_step\",\n",
" notebook_run_config=cfg,\n",
" params={\"my_pipeline_param\": my_pipeline_param},\n",
" inputs=[output_data],\n",
" outputs=[output_from_notebook],\n",
" allow_reuse=True,\n",
" compute_target=compute_target,\n",
" output_notebook_pipeline_data_name=\"notebook_result\")\n",
"\n",
"print(\"Notebook Runner Step is Created.\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Build Pipeline\n",
"\n",
"Once we have the steps (or steps collection), we can build the [pipeline](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.pipeline.pipeline?view=azure-ml-py). By deafult, all these steps will run in **parallel** once we submit the pipeline for run.\n",
"\n",
"A pipeline is created with a list of steps and a workspace. Submit a pipeline using [submit](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.pipeline.pipeline?view=azure-ml-py#submit-experiment-name--pipeline-parameters-none--continue-on-step-failure-false--regenerate-outputs-false--parent-run-id-none----kwargs-). When submit is called, a [PipelineRun](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.pipelinerun?view=azure-ml-py) is created which in turn creates [StepRun](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.steprun?view=azure-ml-py) objects for each step in the workflow."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pipeline1 = Pipeline(workspace=ws, steps=[notebook_runner_step])\n",
"print(\"Pipeline creation complete\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pipeline_run1 = experiment.submit(pipeline1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"RunDetails(pipeline_run1).show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download output notebook\n",
"\n",
"The output notebook can be retrieved from the pipeline step's output data if `output_notebook_pipeline_data_name` was provided to the `NotebookRunnerStep`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pipeline_run1.wait_for_completion()\n",
"train_step = pipeline_run1.find_step_run('training_notebook_step')  # retrieve the step runs named `training_notebook_step`\n",
"\n",
"if train_step:\n",
"    train_step_obj = train_step[0]  # we have only one step named `training_notebook_step`\n",
"    train_step_obj.get_output_data('notebook_result').download(source_directory)  # download the output to source_directory"
]
}
],
"metadata": {
"authors": [
{
"name": "sanpil"
}
],
"category": "tutorial",
"compute": [
"AML Compute"
],
"datasets": [
"Custom"
],
"deployment": [
"None"
],
"exclude_from_index": false,
"framework": [
"Azure ML"
],
"friendly_name": "How to run a notebook as a step in AML Pipelines",
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.7"
},
"order_index": 12,
"star_tag": [
"None"
],
"tags": [
"None"
],
"task": "Demonstrates the use of NotebookRunnerStep"
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,6 @@
name: aml-pipelines-with-notebook-runner-step
dependencies:
- pip:
- azureml-sdk
- azureml-widgets
- azureml-contrib-notebook

View File

@@ -0,0 +1,106 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved. \n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/notebook_runner/training_notebook.png)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"In training_notebook.ipynb\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"parameters"
]
},
"outputs": [],
"source": [
"# declaring parameters to override\n",
"\n",
"arg1 = \"Azure\"\n",
"processed_data = None\n",
"notebook_processed_data = None\n",
"my_pipeline_param = None"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Final parameter values\n",
"\n",
"print(\"arg1: %s\" % arg1)\n",
"print(\"input from previous step: %s\" % processed_data)\n",
"print(\"output from notebook: %s\" % notebook_processed_data)\n",
"print(\"pipeline_parameter: %s\" % my_pipeline_param)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if notebook_processed_data is not None:\n",
"    os.makedirs(notebook_processed_data, exist_ok=True)\n",
"    print(\"%s created\" % notebook_processed_data)"
]
}
],
"metadata": {
"authors": [
{
"name": "sanpil"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.7"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -11,13 +11,13 @@ Batch inference public preview offers a platform in which to do large inference
### Python package installation
Following the convention of most AzureML Public Preview features, Batch Inference SDK is currently available as a contrib package.
If you're unfamiliar with creating a new Python environment, you may follow this example for [creating a conda environment](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-environment#local). Batch Inference package can be installed through the following pip command.
If you're unfamiliar with creating a new Python environment, you may follow this example for [creating a conda environment](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-environment#local). Batch Inference package can be installed through the following pip command.
```
pip install azureml-contrib-pipeline-steps
```
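After installing, it can help to confirm that the package is visible to the interpreter your notebook kernel uses. The following is a minimal sketch that relies only on the standard library (`importlib.metadata`, which assumes Python 3.8 or later); the helper name `installed_version` is illustrative and is not part of the Azure ML SDK.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a pip distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Hypothetical check for the contrib package installed above.
    version = installed_version("azureml-contrib-pipeline-steps")
    if version is None:
        print("azureml-contrib-pipeline-steps is not installed in this environment")
    else:
        print("azureml-contrib-pipeline-steps", version)
```

If the check reports that the package is missing, the pip command was most likely run against a different environment than the one backing the notebook kernel.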
### Creation of Azure Machine Learning Workspace
If you do not already have an Azure ML Workspace, please run the [configuration Notebook](../../configuration.ipynb).
If you do not already have an Azure ML Workspace, please run the [configuration Notebook](https://aka.ms/pl-config).
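
The configuration notebook leaves behind a `config.json` file that `Workspace.from_config()` reads later. As a hedged sketch, assuming that file carries the usual `subscription_id`, `resource_group`, and `workspace_name` keys, a quick standard-library check might look like the following. The helper `missing_workspace_keys` is hypothetical and not an SDK function, and the path you pass in depends on where your config file was written.

```python
import json
from pathlib import Path
from typing import List

# Keys assumed to be present in a workspace config file.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def missing_workspace_keys(config_path: str) -> List[str]:
    """Return the required keys that are absent or empty in the given config file."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return [key for key in REQUIRED_KEYS if not config.get(key)]
```

An empty list means all three values are present; it does not verify that they point at a workspace you can access.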
## Configure a Batch Inference job
@@ -124,4 +124,4 @@ pipeline_run.wait_for_completion(show_output=True)
- [file-dataset-image-inference-mnist.ipynb](./file-dataset-image-inference-mnist.ipynb) demonstrates how to run batch inference on an MNIST dataset.
- [tabular-dataset-inference-iris.ipynb](./tabular-dataset-inference-iris.ipynb) demonstrates how to run batch inference on an IRIS dataset.
![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/contrib/batch_inferencing/README.png)
![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/parallel-run/README.png)

View File

@@ -12,7 +12,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/contrib/batch_inferencing/file-dataset-image-inference-mnist.png)"
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/parallel-run/file-dataset-image-inference-mnist.png)"
]
},
{
@@ -23,6 +23,11 @@
"\n",
"In this notebook, we will demonstrate how to make predictions on large quantities of data asynchronously using the ML pipelines with Azure Machine Learning. Batch inference (or batch scoring) provides cost-effective inference, with unparalleled throughput for asynchronous applications. Batch prediction pipelines can scale to perform inference on terabytes of production data. Batch prediction is optimized for high throughput, fire-and-forget predictions for a large collection of data.\n",
"\n",
"> **Note**\n",
"This notebook uses public preview functionality (ParallelRunStep). Please install azureml-contrib-pipeline-steps package before running this notebook.\n",
"```\n",
"pip install azureml-contrib-pipeline-steps\n",
"```\n",
"> **Tip**\n",
"If your system requires low-latency processing (to process a single document or small set of documents quickly), use [real-time scoring](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-consume-web-service) instead of batch prediction.\n",
"\n",
@@ -519,9 +524,6 @@
"name": "tracych"
}
],
"friendly_name": "MNIST data inferencing using ParallelRunStep",
"exclude_from_index": false,
"index_order": 1,
"category": "Other notebooks",
"compute": [
"AML Compute"
@@ -532,14 +534,12 @@
"deployment": [
"None"
],
"exclude_from_index": false,
"framework": [
"None"
],
"tags": [
"Batch Inferencing",
"Pipeline"
],
"task": "Digit identification",
"friendly_name": "MNIST data inferencing using ParallelRunStep",
"index_order": 1,
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
@@ -556,7 +556,12 @@
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"tags": [
"Batch Inferencing",
"Pipeline"
],
"task": "Digit identification"
},
"nbformat": 4,
"nbformat_minor": 2

View File

@@ -4,7 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved. \n",
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"Licensed under the MIT License."
]
},
@@ -12,7 +12,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/contrib/batch_inferencing/tabular-dataset-inference-iris.png)"
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/parallel-run/tabular-dataset-inference-iris.png)"
]
},
{
@@ -23,6 +23,11 @@
"\n",
"In this notebook, we will demonstrate how to make predictions on large quantities of data asynchronously using the ML pipelines with Azure Machine Learning. Batch inference (or batch scoring) provides cost-effective inference, with unparalleled throughput for asynchronous applications. Batch prediction pipelines can scale to perform inference on terabytes of production data. Batch prediction is optimized for high throughput, fire-and-forget predictions for a large collection of data.\n",
"\n",
"> **Note**\n",
"This notebook uses public preview functionality (ParallelRunStep). Please install azureml-contrib-pipeline-steps package before running this notebook.\n",
"```\n",
"pip install azureml-contrib-pipeline-steps\n",
"```\n",
"> **Tip**\n",
"If your system requires low-latency processing (to process a single document or small set of documents quickly), use [real-time scoring](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-consume-web-service) instead of batch prediction.\n",
"\n",
@@ -494,9 +499,6 @@
"name": "tracych"
}
],
"friendly_name": "IRIS data inferencing using ParallelRunStep",
"exclude_from_index": false,
"index_order": 1,
"category": "Other notebooks",
"compute": [
"AML Compute"
@@ -507,14 +509,12 @@
"deployment": [
"None"
],
"exclude_from_index": false,
"framework": [
"None"
],
"tags": [
"Batch Inferencing",
"Pipeline"
],
"task": "Recognize flower type",
"friendly_name": "IRIS data inferencing using ParallelRunStep",
"index_order": 1,
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
@@ -531,7 +531,12 @@
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.2"
}
},
"tags": [
"Batch Inferencing",
"Pipeline"
],
"task": "Recognize flower type"
},
"nbformat": 4,
"nbformat_minor": 2

View File

@@ -1,119 +0,0 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

import os
import argparse
import datetime
import time
import tensorflow as tf
from math import ceil
import numpy as np
import shutil
from tensorflow.contrib.slim.python.slim.nets import inception_v3
from azureml.core.model import Model

slim = tf.contrib.slim

parser = argparse.ArgumentParser(description="Start a tensorflow model serving")
parser.add_argument('--model_name', dest="model_name", required=True)
parser.add_argument('--label_dir', dest="label_dir", required=True)
parser.add_argument('--dataset_path', dest="dataset_path", required=True)
parser.add_argument('--output_dir', dest="output_dir", required=True)
parser.add_argument('--batch_size', dest="batch_size", type=int, required=True)

args = parser.parse_args()

image_size = 299
num_channel = 3

# create output directory if it does not exist
os.makedirs(args.output_dir, exist_ok=True)


def get_class_label_dict(label_file):
    label = []
    proto_as_ascii_lines = tf.gfile.GFile(label_file).readlines()
    for l in proto_as_ascii_lines:
        label.append(l.rstrip())
    return label


class DataIterator:
    def __init__(self, data_dir):
        self.file_paths = []
        image_list = os.listdir(data_dir)
        # total_size = len(image_list)
        self.file_paths = [data_dir + '/' + file_name.rstrip() for file_name in image_list]
        self.labels = [1 for file_name in self.file_paths]

    @property
    def size(self):
        return len(self.labels)

    def input_pipeline(self, batch_size):
        images_tensor = tf.convert_to_tensor(self.file_paths, dtype=tf.string)
        labels_tensor = tf.convert_to_tensor(self.labels, dtype=tf.int64)
        input_queue = tf.train.slice_input_producer([images_tensor, labels_tensor], shuffle=False)
        labels = input_queue[1]
        images_content = tf.read_file(input_queue[0])

        image_reader = tf.image.decode_jpeg(images_content, channels=num_channel, name="jpeg_reader")
        float_caster = tf.cast(image_reader, tf.float32)
        new_size = tf.constant([image_size, image_size], dtype=tf.int32)
        images = tf.image.resize_images(float_caster, new_size)
        images = tf.divide(tf.subtract(images, [0]), [255])

        image_batch, label_batch = tf.train.batch([images, labels], batch_size=batch_size, capacity=5 * batch_size)
        return image_batch


def main(_):
    # start_time = datetime.datetime.now()
    label_file_name = os.path.join(args.label_dir, "labels.txt")
    label_dict = get_class_label_dict(label_file_name)
    classes_num = len(label_dict)
    test_feeder = DataIterator(data_dir=args.dataset_path)
    total_size = len(test_feeder.labels)
    count = 0
    # get model from model registry
    model_path = Model.get_model_path(args.model_name)
    with tf.Session() as sess:
        test_images = test_feeder.input_pipeline(batch_size=args.batch_size)
        with slim.arg_scope(inception_v3.inception_v3_arg_scope()):
            input_images = tf.placeholder(tf.float32, [args.batch_size, image_size, image_size, num_channel])
            logits, _ = inception_v3.inception_v3(input_images,
                                                  num_classes=classes_num,
                                                  is_training=False)
            probabilities = tf.argmax(logits, 1)

        sess.run(tf.global_variables_initializer())
        sess.run(tf.local_variables_initializer())
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        saver = tf.train.Saver()
        saver.restore(sess, model_path)
        out_filename = os.path.join(args.output_dir, "result-labels.txt")
        with open(out_filename, "w") as result_file:
            i = 0
            while count < total_size and not coord.should_stop():
                test_images_batch = sess.run(test_images)
                file_names_batch = test_feeder.file_paths[i * args.batch_size:
                                                          min(test_feeder.size, (i + 1) * args.batch_size)]
                results = sess.run(probabilities, feed_dict={input_images: test_images_batch})
                new_add = min(args.batch_size, total_size - count)
                count += new_add
                i += 1
                for j in range(new_add):
                    result_file.write(os.path.basename(file_names_batch[j]) + ": " + label_dict[results[j]] + "\n")
                result_file.flush()
        coord.request_stop()
        coord.join(threads)
        # copy the file to artifacts
        shutil.copy(out_filename, "./outputs/")
        # Move the processed data out of the blob so that the next run can process the data.


if __name__ == "__main__":
    tf.app.run()

View File

@@ -1,630 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved. \n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/pipeline-batch-scoring/pipeline-batch-scoring.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note**: Azure Machine Learning recently released ParallelRunStep in public preview. It lets you parallelize your workload across many compute nodes without the difficulty of orchestrating worker pools and queues. See the [batch inference notebooks](../../../contrib/batch_inferencing/) for examples of how to get started."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Using Azure Machine Learning Pipelines for batch prediction\n",
"\n",
"In this notebook we will demonstrate how to run a batch scoring job using Azure Machine Learning pipelines. Our example job will be to take an already-trained image classification model, and run that model on some unlabeled images. The image classification model that we'll use is the __[Inception-V3 model](https://arxiv.org/abs/1512.00567)__ and we'll run this model on unlabeled images from the __[ImageNet](http://image-net.org/)__ dataset. \n",
"\n",
"The outline of this notebook is as follows:\n",
"\n",
"- Register the pretrained inception model into the model registry. \n",
"- Store the dataset images in a blob container.\n",
"- Use the registered model to do batch scoring on the images in the data blob container."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, if you haven't already, first go through the configuration notebook located at https://github.com/Azure/MachineLearningNotebooks. This sets you up with a working config file that has information on your workspace, subscription id, etc. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Experiment\n",
"from azureml.core.compute import AmlCompute, ComputeTarget\n",
"from azureml.core.datastore import Datastore\n",
"from azureml.core.runconfig import CondaDependencies, RunConfiguration\n",
"from azureml.data.data_reference import DataReference\n",
"from azureml.pipeline.core import Pipeline, PipelineData\n",
"from azureml.pipeline.steps import PythonScriptStep"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from azureml.core import Workspace\n",
"\n",
"ws = Workspace.from_config()\n",
"print('Workspace name: ' + ws.name, \n",
" 'Azure region: ' + ws.location, \n",
" 'Subscription id: ' + ws.subscription_id, \n",
" 'Resource group: ' + ws.resource_group, sep = '\\n')\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set up machine learning resources"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set up datastores\n",
"First, let's access the datastore that has the model, labels, and images. \n",
"\n",
"### Create a datastore that points to a blob container containing sample images\n",
"\n",
"We have created a public blob container `sampledata` on an account named `pipelinedata`, containing images from the ImageNet evaluation set. In the next step, we create a datastore with the name `images_datastore`, which points to this container. In the call to `register_azure_blob_container` below, setting the `overwrite` flag to `True` overwrites any datastore that was created previously with that name. \n",
"\n",
"This step can be changed to point to your blob container by providing your own `datastore_name`, `container_name`, and `account_name`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"account_name = \"pipelinedata\"\n",
"datastore_name=\"images_datastore\"\n",
"container_name=\"sampledata\"\n",
"\n",
"batchscore_blob = Datastore.register_azure_blob_container(ws, \n",
" datastore_name=datastore_name, \n",
" container_name= container_name, \n",
" account_name=account_name, \n",
" overwrite=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, let's specify the default datastore for the outputs."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def_data_store = ws.get_default_datastore()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure data references\n",
"Now you need to add references to the data, as inputs to the appropriate pipeline steps in your pipeline. A data source in a pipeline is represented by a DataReference object. The DataReference object points to data that lives in, or is accessible from, a datastore. We need DataReference objects corresponding to the following: the directory containing the input images, the directory in which the pretrained model is stored, the directory containing the labels, and the output directory."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"input_images = DataReference(datastore=batchscore_blob, \n",
" data_reference_name=\"input_images\",\n",
" path_on_datastore=\"batchscoring/images\",\n",
" mode=\"download\"\n",
" )\n",
"model_dir = DataReference(datastore=batchscore_blob, \n",
" data_reference_name=\"input_model\",\n",
" path_on_datastore=\"batchscoring/models\",\n",
" mode=\"download\" \n",
" )\n",
"label_dir = DataReference(datastore=batchscore_blob, \n",
" data_reference_name=\"input_labels\",\n",
" path_on_datastore=\"batchscoring/labels\",\n",
" mode=\"download\" \n",
" )\n",
"output_dir = PipelineData(name=\"scores\", \n",
" datastore=def_data_store, \n",
" output_path_on_compute=\"batchscoring/results\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create and attach Compute targets\n",
"Use the below code to create and attach Compute targets. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# choose a name for your cluster\n",
"aml_compute_name = os.environ.get(\"AML_COMPUTE_NAME\", \"gpu-cluster\")\n",
"# environment variables arrive as strings, so convert the node counts to int\n",
"cluster_min_nodes = int(os.environ.get(\"AML_COMPUTE_MIN_NODES\", 0))\n",
"cluster_max_nodes = int(os.environ.get(\"AML_COMPUTE_MAX_NODES\", 1))\n",
"vm_size = os.environ.get(\"AML_COMPUTE_SKU\", \"STANDARD_NC6\")\n",
"\n",
"\n",
"if aml_compute_name in ws.compute_targets:\n",
"    compute_target = ws.compute_targets[aml_compute_name]\n",
"    if compute_target and type(compute_target) is AmlCompute:\n",
"        print('found compute target. just use it. ' + aml_compute_name)\n",
"else:\n",
"    print('creating a new compute target...')\n",
"    provisioning_config = AmlCompute.provisioning_configuration(vm_size = vm_size, # NC6 is GPU-enabled\n",
"                                                                vm_priority = 'lowpriority', # optional\n",
"                                                                min_nodes = cluster_min_nodes, \n",
"                                                                max_nodes = cluster_max_nodes)\n",
"\n",
"    # create the cluster\n",
"    compute_target = ComputeTarget.create(ws, aml_compute_name, provisioning_config)\n",
"    \n",
"    # can poll for a minimum number of nodes and for a specific timeout. \n",
"    # if no min node count is provided it will use the scale settings for the cluster\n",
"    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)\n",
"    \n",
"    # For a more detailed view of current Azure Machine Learning Compute status, use get_status()\n",
"    print(compute_target.get_status().serialize())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prepare the Model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download the Model\n",
"\n",
"Download and extract the model from http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz to `\"models\"`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# create directory for model\n",
"model_dir = 'models'\n",
"if not os.path.isdir(model_dir):\n",
" os.mkdir(model_dir)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import tarfile\n",
"import urllib.request\n",
"\n",
"url=\"http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz\"\n",
"response = urllib.request.urlretrieve(url, \"model.tar.gz\")\n",
"tar = tarfile.open(\"model.tar.gz\", \"r:gz\")\n",
"tar.extractall(model_dir)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Register the model with Workspace"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import shutil\n",
"from azureml.core.model import Model\n",
"\n",
"# register downloaded model \n",
"model = Model.register(model_path = \"models/inception_v3.ckpt\",\n",
" model_name = \"inception\", # this is the name the model is registered as\n",
" tags = {'pretrained': \"inception\"},\n",
" description = \"Imagenet trained tensorflow inception\",\n",
" workspace = ws)\n",
"# remove the downloaded dir after registration if you wish\n",
"shutil.rmtree(\"models\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Write your scoring script"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To do the scoring, we use a batch scoring script `batch_scoring.py`, which is located in the same directory that this notebook is in. You can take a look at this script to see how you might modify it for your custom batch scoring task.\n",
"\n",
"The Python script `batch_scoring.py` takes input images, applies the image classification model to these images, and outputs a classification result to a results file.\n",
"\n",
"The script `batch_scoring.py` takes the following parameters:\n",
"\n",
"- `--model_name`: the name of the registered model to use; the script looks up its path in the model registry\n",
"- `--label_dir` : the directory holding the `labels.txt` file \n",
"- `--dataset_path`: the directory containing the input images\n",
"- `--output_dir` : the script will run the model on the data and output a `result-labels.txt` to this directory\n",
"- `--batch_size` : the batch size used in running the model.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build and run the batch scoring pipeline\n",
"You have everything you need to build the pipeline. Let's put all these together."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Specify the environment to run the script\n",
"Specify the conda dependencies for your script. You will need this object when you create the pipeline step later on."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.runconfig import DEFAULT_GPU_IMAGE\n",
"\n",
"cd = CondaDependencies.create(pip_packages=[\"tensorflow-gpu==1.13.1\", \"azureml-defaults\"])\n",
"\n",
"# Runconfig\n",
"amlcompute_run_config = RunConfiguration(conda_dependencies=cd)\n",
"amlcompute_run_config.environment.docker.enabled = True\n",
"amlcompute_run_config.environment.docker.base_image = DEFAULT_GPU_IMAGE\n",
"amlcompute_run_config.environment.spark.precache_packages = False"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Specify the parameters for your pipeline\n",
"A subset of the parameters to the python script can be given as input when we re-run a `PublishedPipeline`. In the current example, we define `batch_size` taken by the script as such parameter."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.pipeline.core.graph import PipelineParameter\n",
"batch_size_param = PipelineParameter(name=\"param_batch_size\", default_value=20)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create the pipeline step\n",
"Create the pipeline step using the script, environment configuration, and parameters. Specify the compute target you already attached to your workspace as the target of execution of the script. We will use PythonScriptStep to create the pipeline step."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"inception_model_name = \"inception_v3.ckpt\"\n",
"\n",
"batch_score_step = PythonScriptStep(\n",
" name=\"batch_scoring\",\n",
" script_name=\"batch_scoring.py\",\n",
" arguments=[\"--dataset_path\", input_images, \n",
" \"--model_name\", \"inception\",\n",
" \"--label_dir\", label_dir, \n",
" \"--output_dir\", output_dir, \n",
" \"--batch_size\", batch_size_param],\n",
" compute_target=compute_target,\n",
" inputs=[input_images, label_dir],\n",
" outputs=[output_dir],\n",
" runconfig=amlcompute_run_config\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Run the pipeline\n",
"At this point you can run the pipeline and examine the output it produced. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pipelineparameterssample"
]
},
"outputs": [],
"source": [
"pipeline = Pipeline(workspace=ws, steps=[batch_score_step])\n",
"pipeline_run = Experiment(ws, 'batch_scoring').submit(pipeline, pipeline_parameters={\"param_batch_size\": 20})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Monitor the run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.widgets import RunDetails\n",
"RunDetails(pipeline_run).show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pipeline_run.wait_for_completion(show_output=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download and review output"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"step_run = list(pipeline_run.get_children())[0]\n",
"step_run.download_file(\"./outputs/result-labels.txt\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"df = pd.read_csv(\"result-labels.txt\", delimiter=\":\", header=None)\n",
"df.columns = [\"Filename\", \"Prediction\"]\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Publish a pipeline and rerun using a REST call"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a published pipeline\n",
"Once you are satisfied with the outcome of the run, you can publish the pipeline to run it with different input values later. When you publish a pipeline, you will get a REST endpoint that accepts invoking of the pipeline with the set of parameters you have already incorporated above using PipelineParameter."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"published_pipeline = pipeline_run.publish_pipeline(\n",
" name=\"Inception_v3_scoring\", description=\"Batch scoring using Inception v3 model\", version=\"1.0\")\n",
"\n",
"published_pipeline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Get published pipeline\n",
"\n",
"You can get the published pipeline using **pipeline id**.\n",
"\n",
"To get all the published pipelines for a given workspace(ws): \n",
"```python\n",
"all_pub_pipelines = PublishedPipeline.get_all(ws)\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.pipeline.core import PublishedPipeline\n",
"\n",
"pipeline_id = published_pipeline.id # use your published pipeline id\n",
"published_pipeline = PublishedPipeline.get(ws, pipeline_id)\n",
"\n",
"published_pipeline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Rerun the pipeline using the REST endpoint"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Get AAD token\n",
"[This notebook](https://aka.ms/pl-restep-auth) shows how to authenticate to AML workspace."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.authentication import InteractiveLoginAuthentication\n",
"import requests\n",
"\n",
"auth = InteractiveLoginAuthentication()\n",
"aad_token = auth.get_authentication_header()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Run published pipeline"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"rest_endpoint = published_pipeline.endpoint\n",
"# specify batch size when running the pipeline\n",
"response = requests.post(rest_endpoint, \n",
" headers=aad_token, \n",
" json={\"ExperimentName\": \"batch_scoring\",\n",
" \"ParameterAssignments\": {\"param_batch_size\": 50}})"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
"    response.raise_for_status()\n",
"except Exception:\n",
"    raise Exception('Received bad response from the endpoint: {}\\n'\n",
"                    'Response Code: {}\\n'\n",
"                    'Headers: {}\\n'\n",
"                    'Content: {}'.format(rest_endpoint, response.status_code, response.headers, response.content))\n",
"\n",
"run_id = response.json().get('Id')\n",
"print('Submitted pipeline run: ', run_id)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Monitor the new run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.pipeline.core.run import PipelineRun\n",
"published_pipeline_run = PipelineRun(ws.experiments[\"batch_scoring\"], run_id)\n",
"\n",
"RunDetails(published_pipeline_run).show()"
]
}
],
"metadata": {
"authors": [
{
"name": "sanpil"
}
],
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.7"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
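
The REST rerun above posts a JSON body with two keys taken from the notebook, `ExperimentName` and `ParameterAssignments`. A minimal sketch of assembling that body follows; the helper name `build_rerun_payload` is hypothetical and not part of the Azure ML SDK.

```python
import json


def build_rerun_payload(experiment_name, **parameter_assignments):
    """Assemble the JSON body for a published-pipeline rerun.

    Hypothetical helper: only the two key names come from the notebook.
    """
    return {
        "ExperimentName": experiment_name,
        "ParameterAssignments": dict(parameter_assignments),
    }


# Same values as the notebook's request: batch size overridden to 50.
payload = build_rerun_payload("batch_scoring", param_batch_size=50)
print(json.dumps(payload, indent=2))
```

The resulting dictionary would be passed as `json=payload` to the same `requests.post(rest_endpoint, headers=aad_token, ...)` call the notebook uses.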

@@ -1,7 +0,0 @@
name: pipeline-batch-scoring
dependencies:
- pip:
- azureml-sdk
- azureml-widgets
- pandas
- requests

@@ -1,207 +0,0 @@
# Original source: https://github.com/pytorch/examples/blob/master/fast_neural_style/neural_style/neural_style.py
import argparse
import os
import sys
import re
from PIL import Image
import torch
from torchvision import transforms
from mpi4py import MPI


def load_image(filename, size=None, scale=None):
    img = Image.open(filename)
    if size is not None:
        img = img.resize((size, size), Image.ANTIALIAS)
    elif scale is not None:
        img = img.resize((int(img.size[0] / scale), int(img.size[1] / scale)), Image.ANTIALIAS)
    return img


def save_image(filename, data):
    img = data.clone().clamp(0, 255).numpy()
    img = img.transpose(1, 2, 0).astype("uint8")
    img = Image.fromarray(img)
    img.save(filename)


class TransformerNet(torch.nn.Module):
    def __init__(self):
        super(TransformerNet, self).__init__()
        # Initial convolution layers
        self.conv1 = ConvLayer(3, 32, kernel_size=9, stride=1)
        self.in1 = torch.nn.InstanceNorm2d(32, affine=True)
        self.conv2 = ConvLayer(32, 64, kernel_size=3, stride=2)
        self.in2 = torch.nn.InstanceNorm2d(64, affine=True)
        self.conv3 = ConvLayer(64, 128, kernel_size=3, stride=2)
        self.in3 = torch.nn.InstanceNorm2d(128, affine=True)
        # Residual layers
        self.res1 = ResidualBlock(128)
        self.res2 = ResidualBlock(128)
        self.res3 = ResidualBlock(128)
        self.res4 = ResidualBlock(128)
        self.res5 = ResidualBlock(128)
        # Upsampling Layers
        self.deconv1 = UpsampleConvLayer(128, 64, kernel_size=3, stride=1, upsample=2)
        self.in4 = torch.nn.InstanceNorm2d(64, affine=True)
        self.deconv2 = UpsampleConvLayer(64, 32, kernel_size=3, stride=1, upsample=2)
        self.in5 = torch.nn.InstanceNorm2d(32, affine=True)
        self.deconv3 = ConvLayer(32, 3, kernel_size=9, stride=1)
        # Non-linearities
        self.relu = torch.nn.ReLU()

    def forward(self, X):
        y = self.relu(self.in1(self.conv1(X)))
        y = self.relu(self.in2(self.conv2(y)))
        y = self.relu(self.in3(self.conv3(y)))
        y = self.res1(y)
        y = self.res2(y)
        y = self.res3(y)
        y = self.res4(y)
        y = self.res5(y)
        y = self.relu(self.in4(self.deconv1(y)))
        y = self.relu(self.in5(self.deconv2(y)))
        y = self.deconv3(y)
        return y


class ConvLayer(torch.nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride):
        super(ConvLayer, self).__init__()
        reflection_padding = kernel_size // 2
        self.reflection_pad = torch.nn.ReflectionPad2d(reflection_padding)
        self.conv2d = torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride)

    def forward(self, x):
        out = self.reflection_pad(x)
        out = self.conv2d(out)
        return out


class ResidualBlock(torch.nn.Module):
    """ResidualBlock
    introduced in: https://arxiv.org/abs/1512.03385
    recommended architecture: http://torch.ch/blog/2016/02/04/resnets.html
    """

    def __init__(self, channels):
        super(ResidualBlock, self).__init__()
        self.conv1 = ConvLayer(channels, channels, kernel_size=3, stride=1)
        self.in1 = torch.nn.InstanceNorm2d(channels, affine=True)
        self.conv2 = ConvLayer(channels, channels, kernel_size=3, stride=1)
        self.in2 = torch.nn.InstanceNorm2d(channels, affine=True)
        self.relu = torch.nn.ReLU()

    def forward(self, x):
        residual = x
        out = self.relu(self.in1(self.conv1(x)))
        out = self.in2(self.conv2(out))
        out = out + residual
        return out


class UpsampleConvLayer(torch.nn.Module):
    """UpsampleConvLayer
    Upsamples the input and then does a convolution. This method gives better results
    compared to ConvTranspose2d.
    ref: http://distill.pub/2016/deconv-checkerboard/
    """

    def __init__(self, in_channels, out_channels, kernel_size, stride, upsample=None):
        super(UpsampleConvLayer, self).__init__()
        self.upsample = upsample
        if upsample:
            self.upsample_layer = torch.nn.Upsample(mode='nearest', scale_factor=upsample)
        reflection_padding = kernel_size // 2
        self.reflection_pad = torch.nn.ReflectionPad2d(reflection_padding)
        self.conv2d = torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride)

    def forward(self, x):
        x_in = x
        if self.upsample:
            x_in = self.upsample_layer(x_in)
        out = self.reflection_pad(x_in)
        out = self.conv2d(out)
        return out


def stylize(args, comm):
    rank = comm.Get_rank()
    size = comm.Get_size()

    device = torch.device("cuda" if args.cuda else "cpu")
    with torch.no_grad():
        style_model = TransformerNet()
        state_dict = torch.load(os.path.join(args.model_dir, args.style + ".pth"))
        # remove saved deprecated running_* keys in InstanceNorm from the checkpoint
        for k in list(state_dict.keys()):
            if re.search(r'in\d+\.running_(mean|var)$', k):
                del state_dict[k]
        style_model.load_state_dict(state_dict)
        style_model.to(device)

        filenames = os.listdir(args.content_dir)
        filenames = sorted(filenames)
        partition_size = len(filenames) // size
        partitioned_filenames = filenames[rank * partition_size: (rank + 1) * partition_size]
        print("RANK {} - is processing {} images out of the total {}".format(rank, len(partitioned_filenames),
                                                                              len(filenames)))

        output_paths = []
        for filename in partitioned_filenames:
            # print("Processing {}".format(filename))
            full_path = os.path.join(args.content_dir, filename)
            content_image = load_image(full_path, scale=args.content_scale)
            content_transform = transforms.Compose([
                transforms.ToTensor(),
                transforms.Lambda(lambda x: x.mul(255))
            ])
            content_image = content_transform(content_image)
            content_image = content_image.unsqueeze(0).to(device)

            output = style_model(content_image).cpu()

            output_path = os.path.join(args.output_dir, filename)
            save_image(output_path, output[0])

            output_paths.append(output_path)

    print("RANK {} - number of pre-aggregated output files {}".format(rank, len(output_paths)))

    output_paths_list = comm.gather(output_paths, root=0)

    if rank == 0:
        print("RANK {} - number of aggregated output files {}".format(rank, len(output_paths_list)))
        print("RANK {} - end".format(rank))


def main():
    arg_parser = argparse.ArgumentParser(description="parser for fast-neural-style")
    arg_parser.add_argument("--content-scale", type=float, default=None,
                            help="factor for scaling down the content image")
    arg_parser.add_argument("--model-dir", type=str, required=True,
                            help="saved model to be used for stylizing the image.")
    arg_parser.add_argument("--cuda", type=int, required=True,
                            help="set it to 1 for running on GPU, 0 for CPU")
    arg_parser.add_argument("--style", type=str, help="style name")
    arg_parser.add_argument("--content-dir", type=str, required=True,
                            help="directory holding the images")
    arg_parser.add_argument("--output-dir", type=str, required=True,
                            help="directory holding the output images")
    args = arg_parser.parse_args()
    comm = MPI.COMM_WORLD

    if args.cuda and not torch.cuda.is_available():
        print("ERROR: cuda is not available, try running on CPU")
        sys.exit(1)
    os.makedirs(args.output_dir, exist_ok=True)
    stylize(args, comm)


if __name__ == "__main__":
    main()
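
One detail of `stylize` worth illustrating: it splits the sorted file list with `len(filenames) // size`, so when the file count is not a multiple of the number of ranks, the trailing `len(filenames) % size` files fall outside every rank's slice. The sketch below shows one possible way to hand out the remainder; the `partition` helper is hypothetical and not part of the original script.

```python
def partition(items, rank, size):
    """Return the contiguous slice of `items` owned by `rank` out of `size` workers.

    The first `len(items) % size` ranks each take one extra item, so every
    item is assigned to exactly one rank.
    """
    base, extra = divmod(len(items), size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return items[start:stop]


# Ten frames named like the ffmpeg output pattern, split across three ranks.
filenames = sorted("{:05d}_video.jpg".format(i) for i in range(1, 11))
parts = [partition(filenames, rank, 3) for rank in range(3)]
print([len(p) for p in parts])  # → [4, 3, 3]
```

With plain floor division the same input would give each rank three files and leave the tenth frame unprocessed.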

@@ -16,13 +16,6 @@
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/machine-learning-pipelines/pipeline-style-transfer/pipeline-style-transfer.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note**: Azure Machine Learning recently released ParallelRunStep for public preview. It lets you parallelize your workload across many compute nodes without the difficulty of orchestrating worker pools and queues. See the [batch inference notebooks](../../../contrib/batch_inferencing/) for examples of how to get started."
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -31,7 +24,13 @@
"Using modified code from `pytorch`'s neural style [example](https://pytorch.org/tutorials/advanced/neural_style_tutorial.html), we show how to set up a pipeline for doing style transfer on video. The pipeline has the following steps:\n",
"1. Split a video into images\n",
"2. Run neural style on each image using one of the provided models (from `pytorch` pretrained models for this example).\n",
"3. Stitch the images back into a video."
"3. Stitch the images back into a video.\n",
"\n",
"> **Note**\n",
"This notebook uses public preview functionality (ParallelRunStep). Please install the `azureml-contrib-pipeline-steps` package before running this notebook.\n",
"```\n",
"pip install azureml-contrib-pipeline-steps\n",
"```"
]
},
{
@@ -57,19 +56,25 @@
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"# Check core SDK version number\n",
"import azureml.core\n",
"\n",
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Workspace, Experiment\n",
"\n",
"ws = Workspace.from_config()\n",
"print('Workspace name: ' + ws.name, \n",
" 'Azure region: ' + ws.location, \n",
" 'Subscription id: ' + ws.subscription_id, \n",
" 'Resource group: ' + ws.resource_group, sep = '\\n')\n",
"\n",
"scripts_folder = \"scripts_folder\"\n",
"\n",
"if not os.path.isdir(scripts_folder):\n",
" os.mkdir(scripts_folder)"
" 'Resource group: ' + ws.resource_group, sep = '\\n')"
]
},
{
@@ -82,11 +87,96 @@
"from azureml.core.datastore import Datastore\n",
"from azureml.data.data_reference import DataReference\n",
"from azureml.pipeline.core import Pipeline, PipelineData\n",
"from azureml.pipeline.steps import PythonScriptStep, MpiStep\n",
"from azureml.pipeline.steps import PythonScriptStep\n",
"from azureml.core.runconfig import CondaDependencies, RunConfiguration\n",
"from azureml.core.compute_target import ComputeTargetException"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Download models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"# create directory for model\n",
"model_dir = 'models'\n",
"if not os.path.isdir(model_dir):\n",
" os.mkdir(model_dir)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import urllib.request\n",
"\n",
"def download_model(model_name):\n",
" # downloaded models from https://pytorch.org/tutorials/advanced/neural_style_tutorial.html are kept here\n",
" url=\"https://pipelinedata.blob.core.windows.net/styletransfer/saved_models/\" + model_name\n",
" local_path = os.path.join(model_dir, model_name)\n",
" urllib.request.urlretrieve(url, local_path)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Register all Models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.model import Model\n",
"mosaic_model = None\n",
"candy_model = None\n",
"\n",
"models = Model.list(workspace=ws, tags=['scenario'])\n",
"for m in models:\n",
"    print(\"Name:\", m.name,\"\\tVersion:\", m.version, \"\\tDescription:\", m.description, m.tags)\n",
"    if m.name == 'mosaic' and mosaic_model is None:\n",
"        mosaic_model = m\n",
"    elif m.name == 'candy' and candy_model is None:\n",
"        candy_model = m\n",
"\n",
"if mosaic_model is None:\n",
" print('Mosaic model does not exist, registering it')\n",
" download_model('mosaic.pth')\n",
" mosaic_model = Model.register(model_path = os.path.join(model_dir, \"mosaic.pth\"),\n",
" model_name = \"mosaic\",\n",
" tags = {'type': \"mosaic\", 'scenario': \"Style transfer using batch inference\"},\n",
" description = \"Style transfer - Mosaic\",\n",
" workspace = ws)\n",
"else:\n",
" print('Reusing existing mosaic model')\n",
" \n",
"\n",
"if candy_model is None:\n",
" print('Candy model does not exist, registering it')\n",
" download_model('candy.pth')\n",
" candy_model = Model.register(model_path = os.path.join(model_dir, \"candy.pth\"),\n",
" model_name = \"candy\",\n",
" tags = {'type': \"candy\", 'scenario': \"Style transfer using batch inference\"},\n",
" description = \"Style transfer - Candy\",\n",
" workspace = ws)\n",
"else:\n",
" print('Reusing existing candy model')"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -122,7 +212,7 @@
"except ComputeTargetException:\n",
" print(\"creating new cluster\")\n",
" provisioning_config = AmlCompute.provisioning_configuration(vm_size = \"STANDARD_NC6\",\n",
" max_nodes = 3)\n",
" max_nodes = 3)\n",
"\n",
" # create the cluster\n",
" gpu_cluster = ComputeTarget.create(ws, gpu_cluster_name, provisioning_config)\n",
@@ -145,8 +235,7 @@
"metadata": {},
"outputs": [],
"source": [
"import shutil\n",
"shutil.copy(\"neural_style_mpi.py\", scripts_folder)"
"scripts_folder = \"scripts\""
]
},
{
@@ -155,31 +244,11 @@
"metadata": {},
"outputs": [],
"source": [
"%%writefile $scripts_folder/process_video.py\n",
"import argparse\n",
"import glob\n",
"import os\n",
"import subprocess\n",
"process_video_script_file = \"process_video.py\"\n",
"\n",
"parser = argparse.ArgumentParser(description=\"Process input video\")\n",
"parser.add_argument('--input_video', required=True)\n",
"parser.add_argument('--output_audio', required=True)\n",
"parser.add_argument('--output_images', required=True)\n",
"\n",
"args = parser.parse_args()\n",
"\n",
"os.makedirs(args.output_audio, exist_ok=True)\n",
"os.makedirs(args.output_images, exist_ok=True)\n",
"\n",
"subprocess.run(\"ffmpeg -i {} {}/video.aac\"\n",
" .format(args.input_video, args.output_audio),\n",
" shell=True, check=True\n",
" )\n",
"\n",
"subprocess.run(\"ffmpeg -i {} {}/%05d_video.jpg -hide_banner\"\n",
" .format(args.input_video, args.output_images),\n",
" shell=True, check=True\n",
" )"
"# peek at contents\n",
"with open(os.path.join(scripts_folder, process_video_script_file)) as process_video_file:\n",
" print(process_video_file.read())"
]
},
{
@@ -188,31 +257,11 @@
"metadata": {},
"outputs": [],
"source": [
"%%writefile $scripts_folder/stitch_video.py\n",
"import argparse\n",
"import os\n",
"import subprocess\n",
"stitch_video_script_file = \"stitch_video.py\"\n",
"\n",
"parser = argparse.ArgumentParser(description=\"Process input video\")\n",
"parser.add_argument('--images_dir', required=True)\n",
"parser.add_argument('--input_audio', required=True)\n",
"parser.add_argument('--output_dir', required=True)\n",
"\n",
"args = parser.parse_args()\n",
"\n",
"os.makedirs(args.output_dir, exist_ok=True)\n",
"\n",
"subprocess.run(\"ffmpeg -framerate 30 -i {}/%05d_video.jpg -c:v libx264 -profile:v high -crf 20 -pix_fmt yuv420p \"\n",
" \"-y {}/video_without_audio.mp4\"\n",
" .format(args.images_dir, args.output_dir),\n",
" shell=True, check=True\n",
" )\n",
"\n",
"subprocess.run(\"ffmpeg -i {}/video_without_audio.mp4 -i {}/video.aac -map 0:0 -map 1:0 -vcodec \"\n",
" \"copy -acodec copy -y {}/video_with_audio.mp4\"\n",
" .format(args.output_dir, args.input_audio, args.output_dir),\n",
" shell=True, check=True\n",
" )"
"# peek at contents\n",
"with open(os.path.join(scripts_folder, stitch_video_script_file)) as stitch_video_file:\n",
" print(stitch_video_file.read())"
]
},
{
@@ -233,15 +282,6 @@
"video_ds = Datastore.register_azure_blob_container(ws, \"videos\", \"sample-videos\",\n",
" account_name=account_name, overwrite=True)\n",
"\n",
"# datastore for models\n",
"models_ds = Datastore.register_azure_blob_container(ws, \"models\", \"styletransfer\", \n",
" account_name=\"pipelinedata\", \n",
" overwrite=True)\n",
" \n",
"# downloaded models from https://pytorch.org/tutorials/advanced/neural_style_tutorial.html are kept here\n",
"models_dir = DataReference(data_reference_name=\"models\", datastore=models_ds, \n",
" path_on_datastore=\"saved_models\", mode=\"download\")\n",
"\n",
"# the default blob store attached to a workspace\n",
"default_datastore = ws.get_default_datastore()"
]
@@ -276,13 +316,8 @@
"cd.add_channel(\"conda-forge\")\n",
"cd.add_conda_package(\"ffmpeg\")\n",
"\n",
"cd.add_channel(\"pytorch\")\n",
"cd.add_conda_package(\"pytorch\")\n",
"cd.add_conda_package(\"torchvision\")\n",
"\n",
"# Runconfig\n",
"amlcompute_run_config = RunConfiguration(conda_dependencies=cd)\n",
"amlcompute_run_config.environment.docker.enabled = True\n",
"amlcompute_run_config.environment.docker.base_image = \"pytorch/pytorch\"\n",
"amlcompute_run_config.environment.spark.precache_packages = False"
]
@@ -294,9 +329,13 @@
"outputs": [],
"source": [
"ffmpeg_audio = PipelineData(name=\"ffmpeg_audio\", datastore=default_datastore)\n",
"ffmpeg_images = PipelineData(name=\"ffmpeg_images\", datastore=default_datastore)\n",
"processed_images = PipelineData(name=\"processed_images\", datastore=default_datastore)\n",
"output_video = PipelineData(name=\"output_video\", datastore=default_datastore)"
"output_video = PipelineData(name=\"output_video\", datastore=default_datastore)\n",
"\n",
"ffmpeg_images_ds_name = \"ffmpeg_images_data\"\n",
"ffmpeg_images = PipelineData(name=\"ffmpeg_images\", datastore=default_datastore)\n",
"ffmpeg_images_file_dataset = ffmpeg_images.as_dataset()\n",
"ffmpeg_images_named_file_dataset = ffmpeg_images_file_dataset.as_named_input(ffmpeg_images_ds_name)"
]
},
{
@@ -304,7 +343,10 @@
"metadata": {},
"source": [
"# Define tweakable parameters to pipeline\n",
"These parameters can be changed when the pipeline is published and rerun from a REST call"
"These parameters can be changed when the pipeline is published and rerun from a REST call.\n",
"As part of ParallelRunStep, the following two pipeline parameters will be created, which can be used to override values:\n",
" node_count\n",
" process_count_per_node"
]
},
{
@@ -314,10 +356,8 @@
"outputs": [],
"source": [
"from azureml.pipeline.core.graph import PipelineParameter\n",
"# create a parameter for style (one of \"candy\", \"mosaic\", \"rain_princess\", \"udnie\") to transfer the images to\n",
"style_param = PipelineParameter(name=\"style\", default_value=\"mosaic\")\n",
"# create a parameter for the number of nodes to use in step no. 2 (style transfer)\n",
"nodecount_param = PipelineParameter(name=\"nodecount\", default_value=1)"
"# create a parameter for style (one of \"candy\", \"mosaic\") to transfer the images to\n",
"style_param = PipelineParameter(name=\"style\", default_value=\"mosaic\")"
]
},
{
@@ -340,27 +380,6 @@
" source_directory=scripts_folder\n",
")\n",
"\n",
"# create a MPI step for distributing style transfer step across multiple nodes in AmlCompute \n",
"# using 'nodecount_param' PipelineParameter\n",
"distributed_style_transfer_step = MpiStep(\n",
" name=\"mpi style transfer\",\n",
" script_name=\"neural_style_mpi.py\",\n",
" arguments=[\"--content-dir\", ffmpeg_images,\n",
" \"--output-dir\", processed_images,\n",
" \"--model-dir\", models_dir,\n",
" \"--style\", style_param,\n",
" \"--cuda\", 1\n",
" ],\n",
" compute_target=gpu_cluster,\n",
" node_count=nodecount_param, \n",
" process_count_per_node=1,\n",
" inputs=[models_dir, ffmpeg_images],\n",
" outputs=[processed_images],\n",
" pip_packages=[\"mpi4py\", \"torch\", \"torchvision\"],\n",
" use_gpu=True,\n",
" source_directory=scripts_folder\n",
")\n",
"\n",
"stitch_video_step = PythonScriptStep(\n",
" name=\"stitch\",\n",
" script_name=\"stitch_video.py\",\n",
@@ -375,6 +394,76 @@
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Create environment, parallel step run config and parallel run step"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core import Environment\n",
"from azureml.core.runconfig import DEFAULT_GPU_IMAGE\n",
"\n",
"parallel_cd = CondaDependencies()\n",
"\n",
"parallel_cd.add_channel(\"pytorch\")\n",
"parallel_cd.add_conda_package(\"pytorch\")\n",
"parallel_cd.add_conda_package(\"torchvision\")\n",
"\n",
"styleenvironment = Environment(name=\"styleenvironment\")\n",
"styleenvironment.python.conda_dependencies=parallel_cd\n",
"styleenvironment.docker.base_image = DEFAULT_GPU_IMAGE"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.contrib.pipeline.steps import ParallelRunConfig\n",
"\n",
"parallel_run_config = ParallelRunConfig(\n",
" environment=styleenvironment,\n",
" entry_script='transform.py',\n",
" output_action='summary_only',\n",
" mini_batch_size=\"1\",\n",
" error_threshold=1,\n",
" source_directory=scripts_folder,\n",
" compute_target=gpu_cluster, \n",
" node_count=3)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.contrib.pipeline.steps import ParallelRunStep\n",
"from datetime import datetime\n",
"\n",
"parallel_step_name = 'styletransfer-' + datetime.now().strftime('%Y%m%d%H%M')\n",
"\n",
"distributed_style_transfer_step = ParallelRunStep(\n",
" name=parallel_step_name,\n",
" inputs=[ffmpeg_images_named_file_dataset], # Input file share/blob container/file dataset\n",
" output=processed_images, # Output file share/blob container\n",
" models=[mosaic_model, candy_model],\n",
" tags = {'scenario': \"batch inference\", 'type': \"demo\"},\n",
" properties = {'area': \"style transfer\"},\n",
" arguments=[\"--style\", style_param],\n",
" parallel_run_config=parallel_run_config,\n",
" allow_reuse=True #[optional - default value True]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -389,8 +478,18 @@
"outputs": [],
"source": [
"pipeline = Pipeline(workspace=ws, steps=[stitch_video_step])\n",
"\n",
"pipeline.validate()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# submit the pipeline and provide values for the PipelineParameters used in the pipeline\n",
"pipeline_run = Experiment(ws, 'style_transfer').submit(pipeline, pipeline_parameters={\"style\": \"mosaic\", \"nodecount\": 3})"
"pipeline_run = Experiment(ws, 'styletransfer_parallel_mosaic').submit(pipeline)"
]
},
{
@@ -406,10 +505,20 @@
"metadata": {},
"outputs": [],
"source": [
"# Track pipeline run progress\n",
"from azureml.widgets import RunDetails\n",
"RunDetails(pipeline_run).show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pipeline_run.wait_for_completion()"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -459,24 +568,21 @@
"metadata": {},
"outputs": [],
"source": [
"published_pipeline = pipeline_run.publish_pipeline(\n",
" name=\"batch score style transfer\", description=\"style transfer\", version=\"1.0\")\n",
"pipeline_name = \"style-transfer-batch-inference\"\n",
"print(pipeline_name)\n",
"\n",
"published_pipeline"
"published_pipeline = pipeline.publish(\n",
" name=pipeline_name, \n",
" description=pipeline_name)\n",
"print(\"Newly published pipeline id: {}\".format(published_pipeline.id))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get published pipeline\n",
"\n",
"You can get the published pipeline using **pipeline id**.\n",
"\n",
"To get all the published pipelines for a given workspace (`ws`):\n",
"```python\n",
"all_pub_pipelines = PublishedPipeline.get_all(ws)\n",
"```"
"# Get published pipeline\n",
"This is another way to get the published pipeline."
]
},
{
@@ -487,25 +593,30 @@
"source": [
"from azureml.pipeline.core import PublishedPipeline\n",
"\n",
"pipeline_id = published_pipeline.id # use your published pipeline id\n",
"published_pipeline = PublishedPipeline.get(ws, pipeline_id)\n",
"# You could retrieve all pipelines that are published, or \n",
"# just get the published pipeline object that you have the ID for.\n",
"\n",
"published_pipeline"
"# Get all published pipeline objects in the workspace\n",
"all_pub_pipelines = PublishedPipeline.list(ws)\n",
"\n",
"# Iterate through the list of published pipelines and \n",
"# keep the one whose name matches pipeline_name: \n",
"print(\"Published pipelines found in the workspace:\")\n",
"for pub_pipeline in all_pub_pipelines:\n",
"    print(\"Name:\", pub_pipeline.name,\"\\tDescription:\", pub_pipeline.description, \"\\tId:\", pub_pipeline.id, \"\\tStatus:\", pub_pipeline.status)\n",
"    if(pub_pipeline.name == pipeline_name):\n",
"        published_pipeline = pub_pipeline\n",
"\n",
"print(\"Published pipeline id: {}\".format(published_pipeline.id))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Re-run pipeline through REST calls for other styles"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get AAD token\n",
"[This notebook](https://aka.ms/pl-restep-auth) shows how to authenticate to an AML workspace."
"# Run pipeline through REST calls for other styles\n",
"\n",
"# Get AAD token"
]
},
{
@@ -518,14 +629,14 @@
"import requests\n",
"\n",
"auth = InteractiveLoginAuthentication()\n",
"aad_token = auth.get_authentication_header()\n"
"aad_token = auth.get_authentication_header()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get endpoint URL"
"# Get endpoint URL"
]
},
{
@@ -534,21 +645,15 @@
"metadata": {},
"outputs": [],
"source": [
"rest_endpoint = published_pipeline.endpoint"
"rest_endpoint = published_pipeline.endpoint\n",
"print(\"Pipeline REST endpoint: {}\".format(rest_endpoint))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Send request and monitor"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run the pipeline using PipelineParameter values style='candy' and nodecount=2"
"# Send request and monitor"
]
},
{
@@ -557,38 +662,16 @@
"metadata": {},
"outputs": [],
"source": [
"experiment_name = 'styletransfer_parallel_candy'\n",
"response = requests.post(rest_endpoint, \n",
" headers=aad_token,\n",
" json={\"ExperimentName\": \"style_transfer\",\n",
" \"ParameterAssignments\": {\"style\": \"candy\", \"nodecount\": 2}})"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" response.raise_for_status()\n",
"except Exception: \n",
" raise Exception('Received bad response from the endpoint: {}\\n'\n",
" 'Response Code: {}\\n'\n",
" 'Headers: {}\\n'\n",
" 'Content: {}'.format(rest_endpoint, response.status_code, response.headers, response.content))\n",
" json={\"ExperimentName\": experiment_name,\n",
" \"ParameterAssignments\": {\"style\": \"candy\", \"aml_node_count\": 2}})\n",
"run_id = response.json()[\"Id\"]\n",
"\n",
"run_id = response.json().get('Id')\n",
"print('Submitted pipeline run: ', run_id)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.pipeline.core.run import PipelineRun\n",
"published_pipeline_run_candy = PipelineRun(ws.experiments[\"style_transfer\"], run_id)\n",
"published_pipeline_run_candy = PipelineRun(ws.experiments[experiment_name], run_id)\n",
"\n",
"RunDetails(published_pipeline_run_candy).show()"
]
},
@@ -596,7 +679,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Run the pipeline using PipelineParameter values style='rain_princess' and nodecount=3"
"# Download output from re-run"
]
},
{
@@ -605,10 +688,7 @@
"metadata": {},
"outputs": [],
"source": [
"response = requests.post(rest_endpoint, \n",
" headers=aad_token,\n",
" json={\"ExperimentName\": \"style_transfer\",\n",
" \"ParameterAssignments\": {\"style\": \"rain_princess\", \"nodecount\": 3}})"
"published_pipeline_run_candy.wait_for_completion()"
]
},
{
@@ -617,111 +697,30 @@
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" response.raise_for_status()\n",
"except Exception: \n",
" raise Exception('Received bad response from the endpoint: {}\\n'\n",
" 'Response Code: {}\\n'\n",
" 'Headers: {}\\n'\n",
" 'Content: {}'.format(rest_endpoint, response.status_code, response.headers, response.content))\n",
"\n",
"run_id = response.json().get('Id')\n",
"print('Submitted pipeline run: ', run_id)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"published_pipeline_run_rain = PipelineRun(ws.experiments[\"style_transfer\"], run_id)\n",
"RunDetails(published_pipeline_run_rain).show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run the pipeline using PipelineParameter values style='udnie' and nodecount=4"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"response = requests.post(rest_endpoint, \n",
" headers=aad_token,\n",
" json={\"ExperimentName\": \"style_transfer\",\n",
" \"ParameterAssignments\": {\"style\": \"udnie\", \"nodecount\": 3}})\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" response.raise_for_status()\n",
"except Exception: \n",
" raise Exception('Received bad response from the endpoint: {}\\n'\n",
" 'Response Code: {}\\n'\n",
" 'Headers: {}\\n'\n",
" 'Content: {}'.format(rest_endpoint, response.status_code, response.headers, response.content))\n",
"\n",
"run_id = response.json().get('Id')\n",
"print('Submitted pipeline run: ', run_id)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"published_pipeline_run_udnie = PipelineRun(ws.experiments[\"style_transfer\"], run_id)\n",
"RunDetails(published_pipeline_run_udnie).show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Download output from re-run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"published_pipeline_run_candy.wait_for_completion()\n",
"published_pipeline_run_rain.wait_for_completion()\n",
"published_pipeline_run_udnie.wait_for_completion()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"download_video(published_pipeline_run_candy, target_dir=\"output_video_candy\")\n",
"download_video(published_pipeline_run_rain, target_dir=\"output_video_rain_princess\")\n",
"download_video(published_pipeline_run_udnie, target_dir=\"output_video_udnie\")"
"download_video(published_pipeline_run_candy, target_dir=\"output_video_candy\")"
]
}
],
"metadata": {
"authors": [
{
"name": "sanpil"
"name": "sanpil joringer asraniwa pansav tracych"
}
],
"category": "Other notebooks",
"compute": [
"AML Compute"
],
"datasets": [],
"deployment": [
"None"
],
"exclude_from_index": true,
"framework": [
"None"
],
"friendly_name": "Style transfer using ParallelRunStep",
"index_order": 1,
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
@@ -737,8 +736,13 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.7"
}
"version": "3.6.9"
},
"tags": [
"Batch Inferencing",
"Pipeline"
],
"task": "Style transfer"
},
"nbformat": 4,
"nbformat_minor": 2

@@ -2,5 +2,6 @@ name: pipeline-style-transfer
dependencies:
- pip:
- azureml-sdk
- azureml-contrib-pipeline-steps
- azureml-widgets
- requests

@@ -0,0 +1,22 @@
import argparse
import glob
import os
import subprocess
parser = argparse.ArgumentParser(description="Process input video")
parser.add_argument('--input_video', required=True)
parser.add_argument('--output_audio', required=True)
parser.add_argument('--output_images', required=True)
args = parser.parse_args()
os.makedirs(args.output_audio, exist_ok=True)
os.makedirs(args.output_images, exist_ok=True)
subprocess.run("ffmpeg -i {} {}/video.aac".format(args.input_video, args.output_audio),
               shell=True,
               check=True)
subprocess.run("ffmpeg -i {} {}/%05d_video.jpg -hide_banner".format(args.input_video, args.output_images),
               shell=True,
               check=True)
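
The two `ffmpeg` calls above interpolate paths into a shell string, so a path containing spaces or shell metacharacters would be split or interpreted by the shell. As a hedged sketch of an alternative (not what the script does), the same commands can be built as argument lists and passed to `subprocess.run(cmd, check=True)` without `shell=True`. The helper names `audio_cmd` and `frames_cmd` are hypothetical.

```python
import shlex


def audio_cmd(input_video, output_audio):
    """Argument list equivalent to the audio-extraction command string."""
    return ["ffmpeg", "-i", input_video, "{}/video.aac".format(output_audio)]


def frames_cmd(input_video, output_images):
    """Argument list equivalent to the frame-extraction command string."""
    return ["ffmpeg", "-i", input_video,
            "{}/%05d_video.jpg".format(output_images), "-hide_banner"]


# Show the command as a copy-pastable shell line; nothing is executed here.
cmd = frames_cmd("my video.mp4", "out/images")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each list element reaches `ffmpeg` as a single argument, so the input path stays intact even when it contains spaces.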

@@ -0,0 +1,22 @@
import argparse
import os
import subprocess
parser = argparse.ArgumentParser(description="Process input video")
parser.add_argument('--images_dir', required=True)
parser.add_argument('--input_audio', required=True)
parser.add_argument('--output_dir', required=True)
args = parser.parse_args()
os.makedirs(args.output_dir, exist_ok=True)
subprocess.run("ffmpeg -framerate 30 -i {}/%05d_video.jpg -c:v libx264 -profile:v high -crf 20 -pix_fmt yuv420p "
               "-y {}/video_without_audio.mp4"
               .format(args.images_dir, args.output_dir),
               shell=True, check=True)
subprocess.run("ffmpeg -i {}/video_without_audio.mp4 -i {}/video.aac -map 0:0 -map 1:0 -vcodec "
               "copy -acodec copy -y {}/video_with_audio.mp4"
               .format(args.output_dir, args.input_audio, args.output_dir),
               shell=True, check=True)
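
Both scripts rely on the same frame naming pattern, `%05d_video.jpg`: one writes the frames and the other reads them back in order. The sketch below only illustrates how that printf-style pattern expands, using Python's `%` operator as a stand-in for ffmpeg's own formatting. The helper `frame_name` is hypothetical.

```python
# Pattern shared by the frame-extraction and video-stitching scripts.
FRAME_PATTERN = "%05d_video.jpg"


def frame_name(index):
    """Expand the pattern for one frame index (zero-padded to five digits)."""
    return FRAME_PATTERN % index


if __name__ == "__main__":
    print(frame_name(1))
    print(frame_name(250))
```

Zero padding keeps the file names in frame order when sorted lexically, which matters if any later step lists the directory instead of using the pattern.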

View File

@@ -1,28 +1,17 @@
# Original source: https://github.com/pytorch/examples/blob/master/fast_neural_style/neural_style/neural_style.py
import argparse
import os
import sys
import re
import json
import traceback
from PIL import Image
import torch
from torchvision import transforms
from azureml.core.model import Model
def load_image(filename, size=None, scale=None):
img = Image.open(filename)
if size is not None:
img = img.resize((size, size), Image.ANTIALIAS)
elif scale is not None:
img = img.resize((int(img.size[0] / scale), int(img.size[1] / scale)), Image.ANTIALIAS)
return img
def save_image(filename, data):
img = data.clone().clamp(0, 255).numpy()
img = img.transpose(1, 2, 0).astype("uint8")
img = Image.fromarray(img)
img.save(filename)
style_model = None
class TransformerNet(torch.nn.Module):
@@ -125,60 +114,59 @@ class UpsampleConvLayer(torch.nn.Module):
return out
def stylize(args):
device = torch.device("cuda" if args.cuda else "cpu")
def load_image(filename):
img = Image.open(filename)
return img
def save_image(filename, data):
img = data.clone().clamp(0, 255).numpy()
img = img.transpose(1, 2, 0).astype("uint8")
img = Image.fromarray(img)
img.save(filename)
def init():
global output_path, args
global style_model, device
output_path = os.environ['AZUREML_BI_OUTPUT_PATH']
print(f'output path: {output_path}')
print(f'Cuda available? {torch.cuda.is_available()}')
arg_parser = argparse.ArgumentParser(description="parser for fast-neural-style")
arg_parser.add_argument("--style", type=str, help="style name")
args, unknown_args = arg_parser.parse_known_args()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
with torch.no_grad():
style_model = TransformerNet()
state_dict = torch.load(os.path.join(args.model_dir, args.style+".pth"))
model_path = Model.get_model_path(args.style)
state_dict = torch.load(os.path.join(model_path))
# remove saved deprecated running_* keys in InstanceNorm from the checkpoint
for k in list(state_dict.keys()):
if re.search(r'in\d+\.running_(mean|var)$', k):
del state_dict[k]
style_model.load_state_dict(state_dict)
style_model.to(device)
print(f'Model loaded successfully. Path: {model_path}')
filenames = os.listdir(args.content_dir)
for filename in filenames:
print("Processing {}".format(filename))
full_path = os.path.join(args.content_dir, filename)
content_image = load_image(full_path, scale=args.content_scale)
def run(mini_batch):
result = []
for image_file_path in mini_batch:
img = load_image(image_file_path)
with torch.no_grad():
content_transform = transforms.Compose([
transforms.ToTensor(),
transforms.Lambda(lambda x: x.mul(255))
])
content_image = content_transform(content_image)
content_image = content_transform(img)
content_image = content_image.unsqueeze(0).to(device)
output = style_model(content_image).cpu()
output_file_path = os.path.join(output_path, os.path.basename(image_file_path))
save_image(output_file_path, output[0])
result.append(output_file_path)
output_path = os.path.join(args.output_dir, filename)
save_image(output_path, output[0])
def main():
arg_parser = argparse.ArgumentParser(description="parser for fast-neural-style")
arg_parser.add_argument("--content-scale", type=float, default=None,
help="factor for scaling down the content image")
arg_parser.add_argument("--model-dir", type=str, required=True,
help="saved model to be used for stylizing the image.")
arg_parser.add_argument("--cuda", type=int, required=True,
help="set it to 1 for running on GPU, 0 for CPU")
arg_parser.add_argument("--style", type=str,
help="style name")
arg_parser.add_argument("--content-dir", type=str, required=True,
help="directory holding the images")
arg_parser.add_argument("--output-dir", type=str, required=True,
help="directory holding the output images")
args = arg_parser.parse_args()
if args.cuda and not torch.cuda.is_available():
print("ERROR: cuda is not available, try running on CPU")
sys.exit(1)
os.makedirs(args.output_dir, exist_ok=True)
stylize(args)
if __name__ == "__main__":
main()
return result
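
The checkpoint cleanup inside `init()` can be tried without PyTorch. This is a minimal sketch, assuming the state dict behaves like an ordinary mapping from key names to values. The helper name `drop_deprecated_keys` and the sample key names are hypothetical; the regular expression is copied from the script.

```python
import re

# Pattern copied from the script: matches InstanceNorm running statistics keys.
_DEPRECATED_KEY = re.compile(r'in\d+\.running_(mean|var)$')


def drop_deprecated_keys(state_dict):
    """Remove deprecated InstanceNorm running_* entries in place and return the dict."""
    # Iterate over a copy of the keys because the dict is modified in the loop.
    for key in list(state_dict.keys()):
        if _DEPRECATED_KEY.search(key):
            del state_dict[key]
    return state_dict


if __name__ == "__main__":
    # Hypothetical key names, for illustration only.
    sample = {
        "in1.weight": 1,
        "in1.running_mean": 2,
        "res1.in2.running_var": 3,
        "conv1.bias": 4,
    }
    print(sorted(drop_deprecated_keys(sample)))
```

Only keys ending in `running_mean` or `running_var` under an `in<N>` module are dropped, so the weights and biases still reach `load_state_dict`.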

View File

@@ -507,7 +507,7 @@
"metadata": {},
"source": [
"### Create myenv.yml\n",
"We also need to create an environment file so that Azure Machine Learning can install the necessary packages in the Docker image which are required by your scoring script. In this case, we need to specify conda packages `numpy` and `chainer`."
"We also need to create an environment file so that Azure Machine Learning can install the necessary packages in the Docker image which are required by your scoring script. In this case, we need to specify conda packages `numpy` and `chainer`. Please note that you must indicate azureml-defaults with verion >= 1.0.45 as a pip dependency, because it contains the functionality needed to host the model as a web service."
]
},
{
@@ -521,6 +521,7 @@
"cd = CondaDependencies.create()\n",
"cd.add_conda_package('numpy')\n",
"cd.add_conda_package('chainer')\n",
"cd.add_pip_package(\"azureml-defaults\")\n",
"cd.save_to_file(base_directory='./', conda_file_path='myenv.yml')\n",
"\n",
"print(cd.serialize_to_string())"
@@ -544,10 +545,11 @@
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.webservice import Webservice\n",
"from azureml.core.model import Model\n",
"from azureml.core.environment import Environment\n",
"\n",
"inference_config = InferenceConfig(runtime= \"python\", \n",
" entry_script=\"chainer_score.py\",\n",
" conda_file=\"myenv.yml\")\n",
"\n",
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
"inference_config = InferenceConfig(entry_script=\"chainer_score.py\", environment=myenv)\n",
"\n",
"aciconfig = AciWebservice.deploy_configuration(cpu_cores=1,\n",
" auth_enabled=True, # this flag generates API keys to secure access\n",

View File

@@ -561,10 +561,11 @@
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.webservice import Webservice\n",
"from azureml.core.model import Model\n",
"from azureml.core.environment import Environment\n",
"\n",
"inference_config = InferenceConfig(runtime= \"python\", \n",
" entry_script=\"pytorch_score.py\",\n",
" conda_file=\"myenv.yml\")\n",
"\n",
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
"inference_config = InferenceConfig(entry_script=\"pytorch_score.py\", environment=myenv)\n",
"\n",
"aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, \n",
" memory_gb=1, \n",

View File

@@ -908,13 +908,16 @@
"def init():\n",
" global X, output, sess\n",
" tf.reset_default_graph()\n",
" model_root = Model.get_model_path('tf-dnn-mnist')\n",
" saver = tf.train.import_meta_graph(os.path.join(model_root, 'mnist-tf.model.meta'))\n",
" model_root = os.getenv('AZUREML_MODEL_DIR')\n",
" # the name of the folder in which to look for tensorflow model files\n",
" tf_model_folder = 'model'\n",
" saver = tf.train.import_meta_graph(\n",
" os.path.join(model_root, tf_model_folder, 'mnist-tf.model.meta'))\n",
" X = tf.get_default_graph().get_tensor_by_name(\"network/X:0\")\n",
" output = tf.get_default_graph().get_tensor_by_name(\"network/output/MatMul:0\")\n",
" \n",
"\n",
" sess = tf.Session()\n",
" saver.restore(sess, os.path.join(model_root, 'mnist-tf.model'))\n",
" saver.restore(sess, os.path.join(model_root, tf_model_folder, 'mnist-tf.model'))\n",
"\n",
"def run(raw_data):\n",
" data = np.array(json.loads(raw_data)['data'])\n",
@@ -943,6 +946,7 @@
"cd = CondaDependencies.create()\n",
"cd.add_conda_package('numpy')\n",
"cd.add_tensorflow_conda_package()\n",
"cd.add_pip_package(\"azureml-defaults\")\n",
"cd.save_to_file(base_directory='./', conda_file_path='myenv.yml')\n",
"\n",
"print(cd.serialize_to_string())"
@@ -966,10 +970,11 @@
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.webservice import Webservice\n",
"from azureml.core.model import Model\n",
"from azureml.core.environment import Environment\n",
"\n",
"inference_config = InferenceConfig(runtime= \"python\", \n",
" entry_script=\"score.py\",\n",
" conda_file=\"myenv.yml\")\n",
"\n",
"myenv = Environment.from_conda_specification(name=\"myenv\", file_path=\"myenv.yml\")\n",
"inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)\n",
"\n",
"aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, \n",
" memory_gb=1, \n",
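
The `score.py` change earlier in this file swaps `Model.get_model_path` for the `AZUREML_MODEL_DIR` environment variable plus a subfolder name. A minimal sketch of that path resolution, assuming the variable is set by the hosting service at runtime, is shown here. The helper `resolve_model_file` and its error message are hypothetical and do not appear in the notebook.

```python
import os


def resolve_model_file(filename, model_folder="model", env_var="AZUREML_MODEL_DIR"):
    """Join the model root from the environment with the folder and file name."""
    model_root = os.getenv(env_var)
    if model_root is None:
        # Assumption: outside the deployed service the variable is not defined.
        raise RuntimeError("{} is not set".format(env_var))
    return os.path.join(model_root, model_folder, filename)


if __name__ == "__main__":
    # Simulate the variable locally; the path used here is only an example.
    os.environ["AZUREML_MODEL_DIR"] = os.path.join("tmp", "azureml-models")
    print(resolve_model_file("mnist-tf.model.meta"))
```

Failing early with a clear message is one possible design choice; the notebook itself passes the result of `os.getenv` straight to `os.path.join`.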

View File

@@ -0,0 +1,346 @@
latitude,longitude,temperature,windAngle,windSpeed,elevation
26.536,-81.755,17.8,10.0,2.1,9.0
26.536,-81.755,16.7,360.0,1.5,9.0
26.536,-81.755,16.1,350.0,1.5,9.0
26.536,-81.755,15.0,0.0,0.0,9.0
26.536,-81.755,14.4,350.0,1.5,9.0
26.536,-81.755,0.0,0.0,0.0,9.0
26.536,-81.755,13.9,360.0,2.1,9.0
26.536,-81.755,13.3,350.0,1.5,9.0
26.536,-81.755,13.3,10.0,2.1,9.0
26.536,-81.755,13.3,360.0,1.5,9.0
26.536,-81.755,13.3,0.0,0.0,9.0
26.536,-81.755,12.2,0.0,0.0,9.0
26.536,-81.755,11.7,0.0,0.0,9.0
26.536,-81.755,14.4,0.0,0.0,9.0
26.536,-81.755,17.2,10.0,2.6,9.0
26.536,-81.755,20.0,20.0,2.6,9.0
26.536,-81.755,22.2,10.0,3.6,9.0
26.536,-81.755,23.3,30.0,4.6,9.0
26.536,-81.755,23.3,330.0,2.6,9.0
26.536,-81.755,24.4,0.0,0.0,9.0
26.536,-81.755,25.0,360.0,3.1,9.0
26.536,-81.755,24.4,20.0,4.1,9.0
26.536,-81.755,23.3,10.0,2.6,9.0
26.536,-81.755,21.1,30.0,2.1,9.0
26.536,-81.755,18.3,0.0,0.0,9.0
26.536,-81.755,17.2,30.0,2.1,9.0
26.536,-81.755,15.6,60.0,2.6,9.0
26.536,-81.755,15.6,0.0,0.0,9.0
26.536,-81.755,13.9,60.0,2.6,9.0
26.536,-81.755,12.8,70.0,2.6,9.0
26.536,-81.755,0.0,0.0,0.0,9.0
26.536,-81.755,11.7,70.0,2.1,9.0
26.536,-81.755,12.2,20.0,2.1,9.0
26.536,-81.755,11.7,30.0,1.5,9.0
26.536,-81.755,11.1,40.0,2.1,9.0
26.536,-81.755,12.2,40.0,2.6,9.0
26.536,-81.755,12.2,30.0,2.6,9.0
26.536,-81.755,12.2,0.0,0.0,9.0
26.536,-81.755,15.0,30.0,6.2,9.0
26.536,-81.755,17.2,50.0,3.6,9.0
26.536,-81.755,20.6,60.0,5.1,9.0
26.536,-81.755,22.8,50.0,4.6,9.0
26.536,-81.755,24.4,80.0,6.2,9.0
26.536,-81.755,25.0,100.0,5.7,9.0
26.536,-81.755,25.6,60.0,3.1,9.0
26.536,-81.755,25.6,80.0,4.6,9.0
26.536,-81.755,25.0,90.0,5.1,9.0
26.536,-81.755,24.4,80.0,5.1,9.0
26.536,-81.755,21.1,60.0,2.6,9.0
26.536,-81.755,19.4,70.0,3.6,9.0
26.536,-81.755,18.3,70.0,2.6,9.0
26.536,-81.755,18.3,80.0,2.6,9.0
26.536,-81.755,17.2,60.0,1.5,9.0
26.536,-81.755,16.1,70.0,2.6,9.0
26.536,-81.755,15.6,70.0,2.6,9.0
26.536,-81.755,0.0,0.0,0.0,9.0
26.536,-81.755,16.1,50.0,2.6,9.0
26.536,-81.755,15.6,50.0,2.1,9.0
26.536,-81.755,15.0,50.0,1.5,9.0
26.536,-81.755,15.0,0.0,0.0,9.0
26.536,-81.755,15.0,0.0,0.0,9.0
26.536,-81.755,14.4,0.0,0.0,9.0
26.536,-81.755,14.4,30.0,4.1,9.0
26.536,-81.755,16.1,40.0,1.5,9.0
26.536,-81.755,19.4,0.0,1.5,9.0
26.536,-81.755,22.8,90.0,2.6,9.0
26.536,-81.755,24.4,130.0,3.6,9.0
26.536,-81.755,25.6,100.0,4.6,9.0
26.536,-81.755,26.1,120.0,3.1,9.0
26.536,-81.755,26.7,0.0,2.6,9.0
26.536,-81.755,27.2,0.0,0.0,9.0
26.536,-81.755,27.2,40.0,3.1,9.0
26.536,-81.755,26.1,30.0,1.5,9.0
26.536,-81.755,22.8,310.0,2.1,9.0
26.536,-81.755,23.3,330.0,2.1,9.0
-34.067,-56.238,17.5,30.0,3.1,68.0
-34.067,-56.238,21.2,30.0,5.7,68.0
-34.067,-56.238,24.5,30.0,3.1,68.0
-34.067,-56.238,27.5,330.0,3.6,68.0
-34.067,-56.238,29.2,30.0,4.1,68.0
-34.067,-56.238,31.0,20.0,4.6,68.0
-34.067,-56.238,33.0,360.0,2.6,68.0
-34.067,-56.238,33.6,60.0,3.1,68.0
-34.067,-56.238,33.6,30.0,3.6,68.0
-34.067,-56.238,18.6,40.0,3.1,68.0
-34.067,-56.238,22.0,120.0,1.5,68.0
-34.067,-56.238,25.0,120.0,2.6,68.0
-34.067,-56.238,28.6,50.0,3.1,68.0
-34.067,-56.238,30.6,50.0,4.1,68.0
-34.067,-56.238,31.5,30.0,6.7,68.0
-34.067,-56.238,32.0,40.0,7.2,68.0
-34.067,-56.238,33.0,30.0,5.7,68.0
-34.067,-56.238,33.2,360.0,3.6,68.0
-34.067,-56.238,20.6,30.0,3.1,68.0
-34.067,-56.238,21.2,0.0,0.0,68.0
-34.067,-56.238,22.0,210.0,3.1,68.0
-34.067,-56.238,23.0,210.0,3.6,68.0
-34.067,-56.238,24.0,180.0,6.7,68.0
-34.067,-56.238,24.5,210.0,7.2,68.0
-34.067,-56.238,21.0,180.0,8.2,68.0
-34.067,-56.238,20.0,180.0,6.7,68.0
-34.083,-56.233,20.2,180.0,7.2,68.0
-29.917,-71.2,16.6,290.0,4.1,146.0
-29.916,-71.2,17.0,290.0,4.1,147.0
-29.916,-71.2,16.0,310.0,3.1,147.0
-29.916,-71.2,16.0,300.0,2.1,147.0
-29.917,-71.2,15.1,0.0,0.0,146.0
-29.916,-71.2,15.0,0.0,1.0,147.0
-29.916,-71.2,15.0,160.0,1.0,147.0
-29.916,-71.2,15.0,120.0,1.0,147.0
-29.917,-71.2,14.3,190.0,1.0,146.0
-29.916,-71.2,14.0,190.0,1.0,147.0
-29.916,-71.2,14.0,0.0,0.0,147.0
-29.916,-71.2,14.0,100.0,3.1,147.0
-29.917,-71.2,12.9,0.0,0.0,146.0
-29.916,-71.2,13.0,0.0,1.0,147.0
-29.916,-71.2,14.0,0.0,0.5,147.0
-29.916,-71.2,15.0,0.0,0.5,147.0
-29.917,-71.2,15.9,0.0,0.0,146.0
-29.916,-71.2,16.0,0.0,0.0,147.0
-29.916,-71.2,17.0,270.0,4.6,147.0
-29.916,-71.2,19.0,260.0,4.1,147.0
-29.917,-71.2,18.1,270.0,6.2,146.0
-29.916,-71.2,18.0,270.0,6.2,147.0
-29.916,-71.2,19.0,270.0,6.2,147.0
-29.916,-71.2,20.0,260.0,5.1,147.0
-29.917,-71.2,19.6,280.0,6.2,146.0
-29.916,-71.2,20.0,280.0,6.2,147.0
-29.916,-71.2,20.0,270.0,6.2,147.0
-29.916,-71.2,19.0,280.0,6.7,147.0
-29.917,-71.2,18.3,270.0,5.7,146.0
-29.916,-71.2,18.0,270.0,5.7,147.0
-29.916,-71.2,18.0,0.0,0.0,147.0
-29.916,-71.2,17.0,280.0,4.6,147.0
-29.917,-71.2,15.9,280.0,4.1,146.0
-29.916,-71.2,16.0,280.0,4.1,147.0
-29.916,-71.2,15.0,280.0,3.6,147.0
-29.916,-71.2,15.0,280.0,3.6,147.0
-29.917,-71.2,15.4,280.0,4.1,146.0
-29.916,-71.2,15.0,280.0,4.1,147.0
-29.916,-71.2,16.0,240.0,2.1,147.0
-29.916,-71.2,15.0,0.0,0.5,147.0
-29.917,-71.2,15.8,80.0,3.6,146.0
-29.916,-71.2,16.0,80.0,3.6,147.0
-29.916,-71.2,16.0,10.0,1.5,147.0
-29.916,-71.2,16.0,100.0,1.5,147.0
-29.917,-71.2,15.3,130.0,1.5,146.0
-29.916,-71.2,15.0,130.0,1.5,147.0
-29.916,-71.2,15.0,110.0,1.0,147.0
-29.916,-71.2,16.0,280.0,6.2,147.0
-29.917,-71.2,15.9,240.0,3.6,146.0
-29.916,-71.2,16.0,240.0,3.6,147.0
-29.916,-71.2,16.0,240.0,3.1,147.0
-29.916,-71.2,16.0,220.0,3.1,147.0
-29.917,-71.2,16.4,260.0,3.1,146.0
-29.916,-71.2,16.0,260.0,3.1,147.0
-29.916,-71.2,17.0,230.0,2.6,147.0
-29.916,-71.2,18.0,0.0,1.5,147.0
-29.917,-71.2,20.3,340.0,2.6,146.0
-29.916,-71.2,20.0,340.0,2.6,147.0
-29.916,-71.2,21.0,270.0,5.1,147.0
-29.916,-71.2,20.0,270.0,6.7,147.0
-29.917,-71.2,19.2,280.0,6.7,146.0
-29.916,-71.2,19.0,280.0,6.7,147.0
-29.916,-71.2,19.0,310.0,2.6,147.0
-29.916,-71.2,18.0,270.0,5.1,147.0
-29.917,-71.2,17.0,300.0,4.6,146.0
-29.916,-71.2,17.0,300.0,4.6,147.0
-29.916,-71.2,17.0,300.0,3.6,147.0
-29.916,-71.2,17.0,290.0,3.1,147.0
-29.917,-71.2,16.3,290.0,2.1,146.0
-29.916,-71.2,16.0,290.0,2.1,147.0
-29.916,-71.2,17.0,270.0,1.0,147.0
-29.916,-71.2,17.0,0.0,0.5,147.0
-29.917,-71.2,16.5,160.0,2.1,146.0
-29.916,-71.2,17.0,160.0,2.1,147.0
-29.916,-71.2,15.0,120.0,3.1,147.0
-29.916,-71.2,16.0,180.0,1.5,147.0
-29.917,-71.2,14.7,0.0,0.0,146.0
-29.916,-71.2,15.0,0.0,1.0,147.0
-29.916,-71.2,15.0,300.0,1.0,147.0
-29.916,-71.2,16.0,0.0,0.0,147.0
-29.917,-71.2,18.5,110.0,1.0,146.0
-29.916,-71.2,19.0,110.0,1.0,147.0
-29.916,-71.2,20.0,270.0,3.6,147.0
-29.916,-71.2,20.0,270.0,5.7,147.0
-29.917,-71.2,20.0,280.0,6.2,146.0
-29.916,-71.2,20.0,280.0,6.2,147.0
-29.916,-71.2,21.0,290.0,6.7,147.0
-29.916,-71.2,20.0,270.0,6.2,147.0
-29.917,-71.2,21.0,260.0,6.7,146.0
-29.916,-71.2,21.0,260.0,6.7,147.0
-29.916,-71.2,20.0,270.0,6.2,147.0
-29.916,-71.2,19.0,260.0,5.1,147.0
-29.916,-71.2,18.0,280.0,4.6,147.0
-29.917,-71.2,17.5,280.0,3.1,146.0
-29.916,-71.2,18.0,280.0,3.1,147.0
30.349,-85.788,11.1,0.0,0.0,21.0
30.349,-85.788,11.1,0.0,0.0,21.0
30.349,-85.788,9.4,0.0,0.0,21.0
30.349,-85.788,9.4,0.0,0.0,21.0
30.349,-85.788,8.3,300.0,2.1,21.0
30.349,-85.788,11.1,280.0,1.5,21.0
30.349,-85.788,0.0,0.0,0.0,21.0
30.349,-85.788,10.6,320.0,3.1,21.0
30.349,-85.788,9.4,310.0,3.1,21.0
30.349,-85.788,7.8,320.0,2.6,21.0
30.349,-85.788,6.1,340.0,2.1,21.0
30.349,-85.788,6.7,330.0,2.6,21.0
30.349,-85.788,6.1,310.0,1.5,21.0
30.349,-85.788,7.2,310.0,2.1,21.0
30.349,-85.788,12.8,360.0,3.1,21.0
30.349,-85.788,15.0,0.0,3.1,21.0
30.349,-85.788,16.7,20.0,4.6,21.0
30.349,-85.788,18.9,30.0,5.1,21.0
30.349,-85.788,19.4,10.0,4.1,21.0
30.349,-85.788,21.1,330.0,2.6,21.0
30.349,-85.788,21.1,10.0,4.6,21.0
30.349,-85.788,21.7,360.0,4.1,21.0
30.349,-85.788,21.7,30.0,2.1,21.0
30.349,-85.788,21.7,330.0,2.6,21.0
30.349,-85.788,16.1,350.0,2.1,21.0
30.349,-85.788,11.7,0.0,0.0,21.0
30.349,-85.788,8.9,0.0,0.0,21.0
30.349,-85.788,9.4,0.0,0.0,21.0
30.349,-85.788,7.8,0.0,0.0,21.0
30.349,-85.788,11.1,30.0,3.1,21.0
30.349,-85.788,7.2,0.0,0.0,21.0
30.349,-85.788,7.2,0.0,0.0,21.0
30.349,-85.788,0.0,0.0,0.0,21.0
30.349,-85.788,7.8,30.0,2.1,21.0
30.349,-85.788,8.3,40.0,2.6,21.0
30.349,-85.788,7.2,50.0,1.5,21.0
30.349,-85.788,8.3,60.0,1.5,21.0
30.349,-85.788,5.6,40.0,2.1,21.0
30.349,-85.788,6.7,40.0,2.1,21.0
30.349,-85.788,7.8,50.0,3.1,21.0
30.349,-85.788,11.7,70.0,2.6,21.0
30.349,-85.788,15.6,70.0,3.1,21.0
30.349,-85.788,18.9,100.0,3.6,21.0
30.349,-85.788,20.0,130.0,3.6,21.0
30.349,-85.788,21.1,140.0,4.1,21.0
30.349,-85.788,21.7,150.0,4.1,21.0
30.349,-85.788,21.7,170.0,3.1,21.0
30.349,-85.788,22.2,170.0,3.1,21.0
30.349,-85.788,20.6,0.0,0.0,21.0
30.349,-85.788,17.2,0.0,0.0,21.0
30.349,-85.788,14.4,0.0,0.0,21.0
30.349,-85.788,12.8,100.0,1.5,21.0
30.349,-85.788,13.3,100.0,1.5,21.0
30.349,-85.788,10.6,0.0,0.0,21.0
30.349,-85.788,9.4,0.0,0.0,21.0
30.349,-85.788,7.8,0.0,0.0,21.0
30.358,-85.799,8.3,0.0,0.0,21.0
30.349,-85.788,0.0,0.0,0.0,21.0
30.358,-85.799,6.7,0.0,0.0,21.0
30.358,-85.799,7.2,0.0,0.0,21.0
30.358,-85.799,7.2,0.0,0.0,21.0
30.358,-85.799,8.3,50.0,1.5,21.0
30.358,-85.799,9.4,0.0,0.0,21.0
30.358,-85.799,8.9,0.0,0.0,21.0
30.358,-85.799,10.0,340.0,1.5,21.0
30.358,-85.799,12.8,40.0,1.5,21.0
30.358,-85.799,16.7,100.0,2.1,21.0
30.358,-85.799,21.1,100.0,1.5,21.0
30.358,-85.799,23.3,0.0,0.0,21.0
30.358,-85.799,25.0,180.0,4.6,21.0
30.358,-85.799,24.4,230.0,3.6,21.0
30.358,-85.799,25.0,210.0,4.1,21.0
30.358,-85.799,23.9,170.0,4.1,21.0
30.358,-85.799,22.8,0.0,0.0,21.0
30.358,-85.799,19.4,0.0,0.0,21.0
30.358,-85.799,17.8,140.0,2.1,21.0
60.383,5.333,-0.7,0.0,0.0,36.0
60.383,5.333,0.6,270.0,2.0,36.0
60.383,5.333,-0.9,120.0,1.0,36.0
60.383,5.333,-1.6,130.0,2.0,36.0
60.383,5.333,-1.4,150.0,1.0,36.0
60.383,5.333,-1.7,0.0,0.0,36.0
60.383,5.333,-1.7,140.0,1.0,36.0
60.383,5.333,-1.4,0.0,0.0,36.0
60.383,5.333,-1.0,0.0,0.0,36.0
60.383,5.333,-1.0,150.0,1.0,36.0
60.383,5.333,-0.7,140.0,1.0,36.0
60.383,5.333,0.5,150.0,1.0,36.0
60.383,5.333,1.9,0.0,0.0,36.0
60.383,5.333,1.7,0.0,0.0,36.0
60.383,5.333,2.1,310.0,2.0,36.0
60.383,5.333,1.5,90.0,1.0,36.0
60.383,5.333,1.9,290.0,1.0,36.0
60.383,5.333,2.0,320.0,1.0,36.0
60.383,5.333,1.9,330.0,1.0,36.0
60.383,5.333,1.3,350.0,1.0,36.0
60.383,5.333,1.5,120.0,1.0,36.0
60.383,5.333,1.3,150.0,2.0,36.0
60.383,5.333,0.8,140.0,1.0,36.0
60.383,5.333,0.3,300.0,1.0,36.0
60.383,5.333,0.2,140.0,1.0,36.0
60.383,5.333,0.4,140.0,1.0,36.0
60.383,5.333,0.5,320.0,1.0,36.0
60.383,5.333,1.5,330.0,1.0,36.0
60.383,5.333,1.8,40.0,1.0,36.0
60.383,5.333,2.3,170.0,1.0,36.0
60.383,5.333,2.7,140.0,1.0,36.0
60.383,5.333,3.1,330.0,1.0,36.0
60.383,5.333,3.8,350.0,1.0,36.0
60.383,5.333,3.8,140.0,1.0,36.0
60.383,5.333,4.1,150.0,1.0,36.0
60.383,5.333,4.4,180.0,1.0,36.0
60.383,5.333,4.9,300.0,1.0,36.0
60.383,5.333,5.2,320.0,1.0,36.0
60.383,5.333,6.7,340.0,1.0,36.0
60.383,5.333,6.9,250.0,1.0,36.0
60.383,5.333,7.9,300.0,2.0,36.0
60.383,5.333,5.5,140.0,1.0,36.0
60.383,5.333,7.1,140.0,2.0,36.0
60.383,5.333,7.0,280.0,2.0,36.0
60.383,5.333,4.6,170.0,1.0,36.0
60.383,5.333,4.8,330.0,1.0,36.0
60.383,5.333,6.4,260.0,2.0,36.0
60.383,5.333,6.2,340.0,1.0,36.0
60.383,5.333,5.7,320.0,2.0,36.0
60.383,5.333,5.2,100.0,1.0,36.0
60.383,5.333,5.1,310.0,1.0,36.0
60.383,5.333,4.9,290.0,2.0,36.0
60.383,5.333,4.9,310.0,2.0,36.0
60.383,5.333,6.1,320.0,2.0,36.0
60.383,5.333,7.0,250.0,1.0,36.0
60.383,5.333,5.3,140.0,1.0,36.0
60.383,5.333,6.9,350.0,1.0,36.0
60.383,5.333,9.7,110.0,3.0,36.0
60.383,5.333,10.3,300.0,3.0,36.0
60.383,5.333,8.7,310.0,1.0,36.0
60.383,5.333,9.0,270.0,3.0,36.0
60.383,5.333,11.6,80.0,3.0,36.0
60.383,5.333,11.4,80.0,4.0,36.0
60.383,5.333,9.7,70.0,5.0,36.0
60.383,5.333,9.5,80.0,6.0,36.0
60.383,5.333,8.7,80.0,5.0,36.0
60.383,5.333,7.7,80.0,5.0,36.0
60.383,5.333,8.2,80.0,4.0,36.0
60.383,5.333,7.7,30.0,1.0,36.0
60.383,5.333,7.2,310.0,1.0,36.0
60.383,5.333,6.8,300.0,2.0,36.0
60.383,5.333,6.7,140.0,1.0,36.0
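
The CSV above repeats a small number of (latitude, longitude) pairs across many rows. A minimal sketch for counting rows per coordinate pair with the standard library follows. The `SAMPLE` text holds four rows copied from the file, and the helper `rows_per_station` is hypothetical; treating each coordinate pair as a station is an assumption, not something the file states.

```python
import csv
import io
from collections import Counter

# Four rows copied from the file above, preceded by its header line.
SAMPLE = """latitude,longitude,temperature,windAngle,windSpeed,elevation
26.536,-81.755,17.8,10.0,2.1,9.0
26.536,-81.755,0.0,0.0,0.0,9.0
-34.067,-56.238,17.5,30.0,3.1,68.0
60.383,5.333,-0.7,0.0,0.0,36.0
"""


def rows_per_station(text):
    """Count rows for each (latitude, longitude) pair in the CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return Counter((float(r["latitude"]), float(r["longitude"])) for r in reader)


if __name__ == "__main__":
    for station, count in rows_per_station(SAMPLE).items():
        print(station, count)
```

To run it on the real file, pass the file contents read with `open(path).read()` in place of `SAMPLE`.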

View File

@@ -92,7 +92,7 @@
"dstore = ws.get_default_datastore()\n",
"\n",
"# upload weather data\n",
"dstore.upload('training-dataset', 'drift-on-aks-data', overwrite=True, show_progress=False)"
"dstore.upload('dataset', 'drift-on-aks-data', overwrite=True, show_progress=False)"
]
},
{
@@ -229,7 +229,7 @@
"source": [
"## Run recent weather data through the webservice \n",
"\n",
"The below cells take the past 2 days of weather data, filter and transform using the same processes as the training dataset, and runs the data through the service."
"The cells below take Florida weather data from 2019-11-20 to 2019-11-12, filter and transform it using the same processes as the training dataset, and run the data through the service."
]
},
{
@@ -238,16 +238,10 @@
"metadata": {},
"outputs": [],
"source": [
"from datetime import datetime, timedelta\n",
"from azureml.opendatasets import NoaaIsdWeather\n",
"# create dataset \n",
"tset = Dataset.Tabular.from_delimited_files(dstore.path('drift-on-aks-data/testing.csv'))\n",
"\n",
"start = datetime.today() - timedelta(days=2)\n",
"end = datetime.today()\n",
"\n",
"isd = NoaaIsdWeather(start, end)\n",
"\n",
"df = isd.to_pandas_dataframe().fillna(0)\n",
"df = df[df['stationName'].str.contains('FLORIDA', regex=True, na=False)]\n",
"df = tset.to_pandas_dataframe().fillna(0)\n",
"\n",
"X_features = ['latitude', 'longitude', 'temperature', 'windAngle', 'windSpeed']\n",
"y_features = ['elevation']\n",
@@ -264,9 +258,9 @@
"source": [
"import json\n",
"\n",
"today_data = json.dumps({'data': X.values.tolist()})\n",
"data = json.dumps({'data': X.values.tolist()})\n",
"\n",
"data_encoded = bytes(today_data, encoding='utf8')\n",
"data_encoded = bytes(data, encoding='utf8')\n",
"prediction = service.run(input_data=data_encoded)\n",
"print(prediction)"
]
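The hunk above wraps the feature rows in a `{'data': [...]}` object and encodes the JSON string as UTF-8 bytes before calling `service.run`. A minimal, stdlib-only sketch of that encoding step follows. The two rows are copied from the CSV rows shown earlier in this diff, on the assumption that their columns follow the notebook's `X_features` order with elevation dropped; no web service is called here.

```python
import json

# Feature rows in the assumed order:
# latitude, longitude, temperature, windAngle, windSpeed
rows = [
    [30.358, -85.799, 8.3, 50.0, 1.5],
    [60.383, 5.333, 0.6, 270.0, 2.0],
]

# Same shape as json.dumps({'data': X.values.tolist()}) in the notebook
payload = json.dumps({"data": rows})

# The notebook passes UTF-8 bytes as input_data
data_encoded = bytes(payload, encoding="utf8")

# Round-trip check: decoding gives back the original rows
decoded = json.loads(data_encoded.decode("utf8"))
print(type(data_encoded).__name__, len(decoded["data"]))
```

In the notebook itself, `data_encoded` is what gets passed to `service.run(input_data=...)`.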
@@ -342,6 +336,7 @@
"metadata": {},
"outputs": [],
"source": [
"from datetime import datetime, timedelta\n",
"from azureml.datadrift import DataDriftDetector, AlertConfiguration\n",
"\n",
"services = [service_name]\n",

View File

@@ -100,7 +100,7 @@
"\n",
"# Check core SDK version number\n",
"\n",
"print(\"This notebook was created using SDK version 1.0.79, you are currently running version\", azureml.core.VERSION)"
"print(\"This notebook was created using SDK version 1.0.85, you are currently running version\", azureml.core.VERSION)"
]
},
{

View File

@@ -1,342 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![Impressions](https://PixelServer20190423114238.azurewebsites.net/api/impressions/MachineLearningNotebooks/how-to-use-azureml/using-mlflow/deploy-model/deploy-model.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploy Model as Azure Machine Learning Web Service using MLflow\n",
"\n",
"This example shows you how to use mlflow together with Azure Machine Learning services for deploying a model as a web service. You'll learn how to:\n",
"\n",
" 1. Retrieve a previously trained scikit-learn model\n",
" 2. Create a Docker image from the model\n",
" 3. Deploy the model as a web service on Azure Container Instance\n",
" 4. Make a scoring request against the web service.\n",
"\n",
"## Prerequisites and Set-up\n",
"\n",
"This notebook requires you to first complete the [Use MLflow with Azure Machine Learning for Local Training Run](../train-local/train-local.ipynb) or [Use MLflow with Azure Machine Learning for Remote Training Run](../train-remote/train-remote.ipynb) notebook, so that you have an experiment run with an uploaded model in your Azure Machine Learning Workspace.\n",
"\n",
"Also install the following packages if you haven't already:\n",
"\n",
"```\n",
"pip install azureml-mlflow pandas\n",
"```\n",
"\n",
"Then, import necessary packages:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import mlflow\n",
"import azureml.mlflow\n",
"import azureml.core\n",
"from azureml.core import Workspace\n",
"\n",
"# Check core SDK version number\n",
"print(\"SDK version:\", azureml.core.VERSION)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Connect to workspace and set MLflow tracking URI\n",
"\n",
"Setting the tracking URI is required for retrieving the model and creating an image using the MLflow APIs."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ws = Workspace.from_config()\n",
"\n",
"mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Retrieve model from previous run\n",
"\n",
"Let's retrieve the experiment from the training notebook, and list the runs within that experiment."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"experiment_name = \"experiment-with-mlflow\"\n",
"exp = ws.experiments[experiment_name]\n",
"\n",
"runs = list(exp.get_runs())\n",
"runs"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then, let's select the most recent training run and find its ID. You also need to specify the path in run history where the model was saved. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"runid = runs[0].id\n",
"model_save_path = \"model\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Docker image\n",
"\n",
"To create a Docker image with Azure Machine Learning for Model Management, use the ```mlflow.azureml.build_image``` method. Specify the model path, your workspace, run ID and other parameters.\n",
"\n",
"MLflow automatically recognizes the model framework as scikit-learn, and creates the scoring logic and includes library dependencies for you.\n",
"\n",
"Note that the image creation can take several minutes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import mlflow.azureml\n",
"\n",
"azure_image, azure_model = mlflow.azureml.build_image(model_uri=\"runs:/{}/{}\".format(runid, model_save_path),\n",
" workspace=ws,\n",
" model_name='diabetes-sklearn-model',\n",
" image_name='diabetes-sklearn-image',\n",
" synchronous=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Deploy web service\n",
"\n",
"Let's use the Azure Machine Learning SDK to deploy the image as a web service.\n",
"\n",
"First, specify the deployment configuration. Azure Container Instance is a suitable choice for a quick dev-test deployment, while Azure Kubernetes Service is suitable for scalable production deployments.\n",
"\n",
"Then, deploy the image using the Azure Machine Learning SDK's ```deploy_from_image``` method.\n",
"\n",
"Note that the deployment can take several minutes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azureml.core.webservice import AciWebservice, Webservice\n",
"\n",
"\n",
"aci_config = AciWebservice.deploy_configuration(cpu_cores=1, \n",
" memory_gb=1, \n",
" tags={\"method\" : \"sklearn\"}, \n",
" description='Diabetes model',\n",
" location='eastus2')\n",
"\n",
"\n",
"# Deploy the image to Azure Container Instances (ACI) for real-time serving\n",
"webservice = Webservice.deploy_from_image(\n",
" image=azure_image, workspace=ws, name=\"diabetes-model-1\", deployment_config=aci_config)\n",
"\n",
"\n",
"webservice.wait_for_deployment(show_output=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Make a scoring request\n",
"\n",
"Let's take the first few rows of test data and score them using the web service"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"test_rows = [\n",
" [0.01991321, 0.05068012, 0.10480869, 0.07007254, -0.03596778,\n",
" -0.0266789 , -0.02499266, -0.00259226, 0.00371174, 0.04034337],\n",
" [-0.01277963, -0.04464164, 0.06061839, 0.05285819, 0.04796534,\n",
" 0.02937467, -0.01762938, 0.03430886, 0.0702113 , 0.00720652],\n",
" [ 0.03807591, 0.05068012, 0.00888341, 0.04252958, -0.04284755,\n",
" -0.02104223, -0.03971921, -0.00259226, -0.01811827, 0.00720652]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An MLflow-based web service for a scikit-learn model requires the data to be converted to a pandas DataFrame and then serialized as JSON."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import pandas as pd\n",
"\n",
"test_rows_as_json = pd.DataFrame(test_rows).to_json(orient=\"split\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's pass the converted and serialized data to the web service to get the predictions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"predictions = webservice.run(test_rows_as_json)\n",
"\n",
"print(predictions)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can use the web service's scoring URI to make a raw HTTP request"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"webservice.scoring_uri"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can diagnose the web service using the ```get_logs``` method."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"webservice.get_logs()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Next Steps\n",
"\n",
"Learn about [model management and inference in Azure Machine Learning service](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-model-management-and-deployment)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"authors": [
{
"name": "shipatel"
}
],
"category": "deployment",
"compute": [
"None"
],
"datasets": [
"Diabetes"
],
"deployment": [
"Azure Container Instance"
],
"exclude_from_index": false,
"framework": [
"Scikit-learn"
],
"friendly_name": "Deploy a model as a web service using MLflow",
"index_order": 4,
"kernelspec": {
"display_name": "Python 3.6",
"language": "python",
"name": "python36"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
},
"tags": [
"None"
],
"task": "Use MLflow with AML"
},
"nbformat": 4,
"nbformat_minor": 2
}
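The deleted notebook above serializes the test rows with `to_json(orient="split")` and mentions that the scoring URI accepts raw HTTP requests. A minimal sketch of what such a request could look like follows, using only the standard library. The helper names, the placeholder URI, and the `Content-Type` header are assumptions for illustration, not taken from the notebook. The payload carries the same `columns`, `index`, and `data` keys that pandas emits for the split orientation. The request is only constructed, not sent.

```python
import json
import urllib.request


def to_split_json(rows):
    """Build a split-oriented payload with the keys pandas uses for orient="split"."""
    n_cols = len(rows[0])
    return json.dumps({
        "columns": list(range(n_cols)),   # default integer column labels
        "index": list(range(len(rows))),  # default integer row labels
        "data": rows,
    })


def build_scoring_request(scoring_uri, rows):
    """Prepare (but do not send) a POST request for a scoring endpoint."""
    body = to_split_json(rows).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


# Hypothetical placeholder; in the notebook this would be webservice.scoring_uri
request = build_scoring_request("http://example.invalid/score",
                                [[0.1, 0.2], [0.3, 0.4]])
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request).read()
```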

View File

@@ -1,8 +0,0 @@
name: deploy-model
dependencies:
- scikit-learn
- matplotlib
- pip:
  - azureml-sdk
  - azureml-mlflow
  - pandas

View File

@@ -1,150 +0,0 @@
# Copyright (c) 2017, PyTorch Team
# All rights reserved
# Licensed under BSD 3-Clause License.

# This example is based on PyTorch MNIST example:
# https://github.com/pytorch/examples/blob/master/mnist/main.py

import mlflow
import mlflow.pytorch
from mlflow.utils.environment import _mlflow_conda_env
import warnings
import cloudpickle
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
from torchvision import datasets, transforms


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4 * 4 * 50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        # Added the view for reshaping score requests
        x = x.view(-1, 1, 28, 28)
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4 * 4 * 50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)


def train(args, model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % args.log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
            # Use MLflow logging
            mlflow.log_metric("epoch_loss", loss.item())


def test(args, model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            # sum up batch loss
            test_loss += F.nll_loss(output, target, reduction="sum").item()
            # get the index of the max log-probability
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)
    print("\n")
    print("Test set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n".format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
    # Use MLflow logging
    mlflow.log_metric("average_loss", test_loss)


class Args(object):
    pass


# Training settings
args = Args()
setattr(args, 'batch_size', 64)
setattr(args, 'test_batch_size', 1000)
setattr(args, 'epochs', 3)  # Higher number for better convergence
setattr(args, 'lr', 0.01)
setattr(args, 'momentum', 0.5)
setattr(args, 'no_cuda', True)
setattr(args, 'seed', 1)
setattr(args, 'log_interval', 10)
setattr(args, 'save_model', True)

use_cuda = not args.no_cuda and torch.cuda.is_available()

torch.manual_seed(args.seed)

device = torch.device("cuda" if use_cuda else "cpu")

kwargs = {'num_workers': 1, 'pin_memory': True} if use_cuda else {}
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=args.batch_size, shuffle=True, **kwargs)
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST(
        '../data',
        train=False,
        transform=transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.1307,), (0.3081,))])),
    batch_size=args.test_batch_size, shuffle=True, **kwargs)


def driver():
    warnings.filterwarnings("ignore")
    # Dependencies for deploying the model
    pytorch_index = "https://download.pytorch.org/whl/"
    pytorch_version = "cpu/torch-1.1.0-cp36-cp36m-linux_x86_64.whl"
    deps = [
        "cloudpickle=={}".format(cloudpickle.__version__),
        pytorch_index + pytorch_version,
        "torchvision=={}".format(torchvision.__version__),
        "Pillow=={}".format("6.0.0")
    ]
    with mlflow.start_run() as run:
        model = Net().to(device)
        optimizer = optim.SGD(
            model.parameters(),
            lr=args.lr,
            momentum=args.momentum)
        for epoch in range(1, args.epochs + 1):
            train(args, model, device, train_loader, optimizer, epoch)
            test(args, model, device, test_loader)
        # Log model to run history using MLflow
        if args.save_model:
            model_env = _mlflow_conda_env(additional_pip_deps=deps)
            mlflow.pytorch.log_model(model, "model", conda_env=model_env)
    return run


if __name__ == "__main__":
    driver()
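The `4 * 4 * 50` input size of `fc1` in the script above follows from the layer arithmetic: each unpadded 5x5 convolution shrinks the image by 4 pixels per side, and each 2x2 max-pool halves it. A small sketch of that calculation follows. The helper `conv_out` is a hypothetical name introduced here, not part of the script, and the sketch needs no PyTorch.

```python
def conv_out(size, kernel, stride=1):
    """Output width of an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1


size = 28                     # MNIST images are 28x28
size = conv_out(size, 5, 1)   # conv1: 5x5 kernel, stride 1 -> 24
size = conv_out(size, 2, 2)   # max_pool2d(x, 2, 2) -> 12
size = conv_out(size, 5, 1)   # conv2: 5x5 kernel, stride 1 -> 8
size = conv_out(size, 2, 2)   # max_pool2d(x, 2, 2) -> 4

flattened = size * size * 50  # conv2 has 50 output channels
print(size, flattened)        # prints "4 800"
```

This matches the `x.view(-1, 4 * 4 * 50)` reshape in `forward` and the 800 input features of `fc1`.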

Some files were not shown because too many files have changed in this diff