MachineLearningNotebooks/training/readme.md
2018-10-09 12:22:11 -07:00

# Training ML models with Azure ML SDK
These notebook tutorials cover the various scenarios for training machine learning and deep learning models with Azure Machine Learning.
## Sample notebooks
- [01.train-hyperparameter-tune-deploy-with-pytorch](./01.train-hyperparameter-tune-deploy-with-pytorch/01.train-hyperparameter-tune-deploy-with-pytorch.ipynb)
Train, hyperparameter tune, and deploy a PyTorch image classification model that distinguishes bees vs. ants using transfer learning. Azure ML concepts covered:
- Create a remote compute target (Batch AI cluster)
- Upload training data using `Datastore`
- Run a single-node `PyTorch` training job
- Hyperparameter tune model with HyperDrive
- Find and register the best model
- Deploy model to ACI
- [02.distributed-pytorch-with-horovod](./02.distributed-pytorch-with-horovod/02.distributed-pytorch-with-horovod.ipynb)
Train a PyTorch model on the MNIST dataset using distributed training with Horovod. Azure ML concepts covered:
- Create a remote compute target (Batch AI cluster)
- Run a two-node distributed `PyTorch` training job using Horovod
- [03.train-hyperparameter-tune-deploy-with-tensorflow](./03.train-hyperparameter-tune-deploy-with-tensorflow/03.train-hyperparameter-tune-deploy-with-tensorflow.ipynb)
Train, hyperparameter tune, and deploy a TensorFlow model on the MNIST dataset. Azure ML concepts covered:
- Create a remote compute target (Batch AI cluster)
- Upload training data using `Datastore`
- Run a single-node `TensorFlow` training job
- Leverage features of the `Run` object
- Download the trained model
- Hyperparameter tune model with HyperDrive
- Find and register the best model
- Deploy model to ACI
- [04.distributed-tensorflow-with-horovod](./04.distributed-tensorflow-with-horovod/04.distributed-tensorflow-with-horovod.ipynb)
Train a TensorFlow word2vec model using distributed training with Horovod. Azure ML concepts covered:
- Create a remote compute target (Batch AI cluster)
- Upload training data using `Datastore`
- Run a two-node distributed `TensorFlow` training job using Horovod
- [05.distributed-tensorflow-with-parameter-server](./05.distributed-tensorflow-with-parameter-server/05.distributed-tensorflow-with-parameter-server.ipynb)
Train a TensorFlow model on the MNIST dataset using native distributed TensorFlow (parameter server). Azure ML concepts covered:
- Create a remote compute target (Batch AI cluster)
- Run a distributed `TensorFlow` training job with two workers and one parameter server
- [06.distributed-cntk-with-custom-docker](./06.distributed-cntk-with-custom-docker/06.distributed-cntk-with-custom-docker.ipynb)
Train a CNTK model on the MNIST dataset using the Azure ML base `Estimator` with custom Docker image and distributed training. Azure ML concepts covered:
- Create a remote compute target (Batch AI cluster)
- Upload training data using `Datastore`
- Run a base `Estimator` training job using a custom Docker image from Docker Hub
- Distributed CNTK two-node training job via MPI using base `Estimator`
- [07.tensorboard](./07.tensorboard/07.tensorboard.ipynb)
Train a TensorFlow MNIST model locally, on a DSVM, and on Batch AI, and view the logs live in TensorBoard. Azure ML concepts covered:
- Run the training job locally with Azure ML and run TensorBoard locally. Start (and stop) an Azure ML `TensorBoard` object to stream and view the logs
- Run the training job on a remote DSVM and stream the logs to TensorBoard
- Run the training job on a remote Batch AI cluster and stream the logs to TensorBoard
- Start a `TensorBoard` instance that displays the logs from all three runs above in a single view
- [08.export-run-history-to-tensorboard](./08.export-run-history-to-tensorboard/08.export-run-history-to-tensorboard.ipynb)
Log metrics to `Run` history and export them to TensorBoard. Azure ML concepts covered:
- Start an Azure ML `Experiment` and log metrics to `Run` history
- Export the `Run` history logs to TensorBoard logs
- View the logs in TensorBoard
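
The log-then-export flow in the last notebook can be pictured with a small plain-Python sketch. This is only a conceptual illustration and does not use the Azure ML SDK or write real TensorBoard event files: `RunHistory` and `to_scalar_events` are hypothetical names, and the metric values are made up. The one fact it leans on is that a TensorBoard scalar series is a sequence of points, each carrying a tag, a step, and a value.

```python
from collections import defaultdict


class RunHistory:
    """Hypothetical in-memory stand-in for the metrics logged by one run."""

    def __init__(self):
        # Metric name -> list of values, in the order they were logged.
        self.metrics = defaultdict(list)

    def log(self, name, value):
        """Append one value to the named metric."""
        self.metrics[name].append(value)


def to_scalar_events(history):
    """Flatten logged metrics into (tag, step, value) records.

    The step is the position of the value within its metric, which is an
    assumption of this sketch rather than anything the SDK guarantees.
    """
    return [
        (name, step, value)
        for name, values in sorted(history.metrics.items())
        for step, value in enumerate(values)
    ]


# Made-up metrics for illustration only.
history = RunHistory()
for loss in [0.9, 0.5, 0.3]:
    history.log("loss", loss)
history.log("accuracy", 0.8)

events = to_scalar_events(history)
for event in events:
    print(event)
```

Keeping the export step as a pure function of the logged metrics means it can run after the experiment has finished, which matches the order of the bullets above: log first, export second, view last.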