Create readme.md
training/readme.md
Normal file
@@ -0,0 +1,52 @@
# Training ML models with Azure ML SDK

These notebook tutorials cover the various scenarios for training machine learning and deep learning models with Azure Machine Learning.

## Sample notebooks

- [01.train-hyperparameter-tune-deploy-with-pytorch](https://github.com/Azure/MachineLearningNotebooks/tree/master/training/01.train-hyperparameter-tune-deploy-with-pytorch)
  Train, hyperparameter tune, and deploy a PyTorch image classification model that distinguishes bees from ants using transfer learning. Azure ML concepts covered:
    - Create a remote compute target (Batch AI cluster)
    - Upload training data using `Datastore`
    - Run a single-node `PyTorch` training job
    - Tune the model's hyperparameters with HyperDrive
    - Find and register the best model
    - Deploy the model to Azure Container Instances (ACI)
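
The hyperparameter tuning step above amounts to sampling configurations from a search space, scoring each run on a primary metric, and keeping the best one. The following is a minimal plain-Python sketch of that idea only. It does not use HyperDrive or the Azure ML SDK, and the search space, the `train_and_score` stand-in, and all other names are hypothetical.

```python
import math
import random

# Hypothetical search space: each entry draws one value from a random generator.
SEARCH_SPACE = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-4, -1),  # log-uniform draw
    "momentum": lambda rng: rng.uniform(0.8, 0.99),
}


def sample_config(rng):
    """Draw one hyperparameter configuration from the search space."""
    return {name: draw(rng) for name, draw in SEARCH_SPACE.items()}


def train_and_score(config):
    """Stand-in for a real training run: returns a made-up validation accuracy."""
    lr_penalty = abs(math.log10(config["learning_rate"]) + 2.5)
    momentum_penalty = abs(config["momentum"] - 0.9)
    return 0.95 - 0.1 * lr_penalty - 0.5 * momentum_penalty


def random_search(num_runs, seed=0):
    """Score num_runs sampled configurations and return (best_run, all_runs)."""
    rng = random.Random(seed)
    runs = []
    for run_id in range(num_runs):
        config = sample_config(rng)
        runs.append({"run_id": run_id, "config": config,
                     "accuracy": train_and_score(config)})
    best_run = max(runs, key=lambda run: run["accuracy"])
    return best_run, runs


if __name__ == "__main__":
    best, _ = random_search(num_runs=20)
    print(best["run_id"], best["config"], round(best["accuracy"], 4))
```

In the notebook, each sampled configuration would correspond to a real training run on the compute target, and the best run's model is the one that gets registered.
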
- [02.distributed-pytorch-with-horovod](https://github.com/Azure/MachineLearningNotebooks/tree/master/training/02.distributed-pytorch-with-horovod)
  Train a PyTorch model on the MNIST dataset using distributed training with Horovod. Azure ML concepts covered:
    - Create a remote compute target (Batch AI cluster)
    - Run a two-node distributed `PyTorch` training job using Horovod
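
Horovod-style data-parallel training has each worker compute gradients on its own data shard and then average them across workers (an allreduce) before every update. The sketch below simulates that averaging step in plain Python for a one-parameter linear model. It does not call Horovod, and the function names and toy data are assumptions made for illustration.

```python
def mse_gradient(weight, shard):
    """Gradient of mean squared error for y = weight * x over one data shard."""
    return sum(2 * (weight * x - y) * x for x, y in shard) / len(shard)


def allreduce_average(values):
    """Simulate an averaging allreduce: every worker ends up with the mean."""
    return sum(values) / len(values)


def distributed_step(weight, shards, learning_rate):
    """One synchronous SGD step across simulated workers (one shard each)."""
    local_grads = [mse_gradient(weight, shard) for shard in shards]
    return weight - learning_rate * allreduce_average(local_grads)


if __name__ == "__main__":
    # Toy data for y = 3x, split evenly across two simulated workers.
    data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]
    shards = [data[:2], data[2:]]
    weight = 0.0
    for _ in range(50):
        weight = distributed_step(weight, shards, learning_rate=0.05)
    print(round(weight, 4))
```

With equal-sized shards, the averaged gradient equals the full-batch gradient, so the two simulated workers follow the same trajectory as single-node training on the combined data.
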
- [03.train-hyperparameter-tune-deploy-with-tensorflow](https://github.com/Azure/MachineLearningNotebooks/tree/master/training/03.train-hyperparameter-tune-deploy-with-tensorflow)
  Train, hyperparameter tune, and deploy a TensorFlow model on the MNIST dataset. Azure ML concepts covered:
    - Create a remote compute target (Batch AI cluster)
    - Upload training data using `Datastore`
    - Run a single-node `TensorFlow` training job
    - Leverage features of the `Run` object
    - Download the trained model
    - Tune the model's hyperparameters with HyperDrive
    - Find and register the best model
    - Deploy the model to Azure Container Instances (ACI)
- [04.distributed-tensorflow-with-horovod](https://github.com/Azure/MachineLearningNotebooks/tree/master/training/04.distributed-tensorflow-with-horovod)
  Train a TensorFlow word2vec model using distributed training with Horovod. Azure ML concepts covered:
    - Create a remote compute target (Batch AI cluster)
    - Upload training data using `Datastore`
    - Run a two-node distributed `TensorFlow` training job using Horovod
- [05.distributed-tensorflow-with-parameter-server](https://github.com/Azure/MachineLearningNotebooks/tree/master/training/05.distributed-tensorflow-with-parameter-server)
  Train a TensorFlow model on the MNIST dataset using native distributed TensorFlow (parameter server). Azure ML concepts covered:
    - Create a remote compute target (Batch AI cluster)
    - Run a distributed `TensorFlow` training job with two workers and one parameter server
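
Native distributed TensorFlow conventionally describes its cluster through a `TF_CONFIG` JSON value, with a `cluster` map from job names to host lists and a `task` entry naming the current process. The following sketch builds those values for two workers and one parameter server. The host names and the `build_tf_configs` helper are hypothetical, and how the notebook's training job actually receives its cluster description is not shown here.

```python
import json


def build_tf_configs(worker_hosts, ps_hosts):
    """Return one TF_CONFIG-style JSON string per task in the cluster."""
    cluster = {"ps": list(ps_hosts), "worker": list(worker_hosts)}
    configs = []
    for task_type in ("ps", "worker"):
        for index in range(len(cluster[task_type])):
            configs.append(json.dumps({
                "cluster": cluster,
                "task": {"type": task_type, "index": index},
            }))
    return configs


if __name__ == "__main__":
    # Hypothetical host:port addresses for two workers and one parameter server.
    for config in build_tf_configs(
            worker_hosts=["worker0:2222", "worker1:2222"],
            ps_hosts=["ps0:2222"]):
        print(config)
```

Every process sees the same `cluster` map and differs only in its `task` entry, which is how each one knows whether to act as a worker or as the parameter server.
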
- [06.distributed-cntk-with-custom-docker](https://github.com/Azure/MachineLearningNotebooks/tree/master/training/06.distributed-cntk-with-custom-docker)
  Train a CNTK model on the MNIST dataset using the Azure ML base `Estimator` with a custom Docker image and distributed training. Azure ML concepts covered:
    - Create a remote compute target (Batch AI cluster)
    - Upload training data using `Datastore`
    - Run a base `Estimator` training job using a custom Docker image from Docker Hub
    - Run a two-node distributed CNTK training job via MPI using the base `Estimator`
- [07.tensorboard](https://github.com/Azure/MachineLearningNotebooks/tree/master/training/07.tensorboard)
  Train a TensorFlow MNIST model locally, on a Data Science Virtual Machine (DSVM), and on Batch AI, and view the logs live in TensorBoard. Azure ML concepts covered:
    - Run the training job locally with Azure ML and run TensorBoard locally. Start (and stop) an Azure ML `TensorBoard` object to stream and view the logs
    - Run the training job on a remote DSVM and stream the logs to TensorBoard
    - Run the training job on a remote Batch AI cluster and stream the logs to TensorBoard
    - Start a `Tensorboard` instance that displays the logs from all three runs above in a single view
- [08.export-run-history-to-tensorboard](https://github.com/Azure/MachineLearningNotebooks/tree/master/training/08.export-run-history-to-tensorboard)
    - Start an Azure ML `Experiment` and log metrics to `Run` history
    - Export the `Run` history logs to TensorBoard logs
    - View the logs in TensorBoard