From 6304fb1eb19d38c140b7cc925ac4c24a4718ee55 Mon Sep 17 00:00:00 2001
From: rastala
Date: Tue, 4 Dec 2018 18:34:10 -0500
Subject: [PATCH] add adb readme

---
 .../azure-databricks/automl_adb_readme.md | 47 +++++++++++++++++++
 1 file changed, 47 insertions(+)
 create mode 100644 how-to-use-azureml/azure-databricks/automl_adb_readme.md

diff --git a/how-to-use-azureml/azure-databricks/automl_adb_readme.md b/how-to-use-azureml/azure-databricks/automl_adb_readme.md
new file mode 100644
index 00000000..9debec81
--- /dev/null
+++ b/how-to-use-azureml/azure-databricks/automl_adb_readme.md
@@ -0,0 +1,47 @@
+**PREVIEW capability**
+
+Automated ML now supports Azure Databricks as a local compute to perform training (**public preview**). Azure Databricks is a managed Spark offering on Azure, and customers already use it for advanced analytics. It provides a collaborative notebook-based environment with CPU- or GPU-based compute clusters.
+- Customers who use Azure Databricks for advanced analytics can now use the same cluster to run automated machine learning experiments.
+- You can keep the data within the same cluster.
+- You can leverage the local worker nodes with autoscale and auto termination capabilities.
+- You can use multiple cores of your Azure Databricks cluster to perform simultaneous training.
+- You can further tune the model generated by automated machine learning if you choose to.
+- Every run (including the best run) is available as a pipeline.
+- The model from the pipeline can be registered in an Azure ML workspace and then deployed to Azure managed compute (ACI or AKS) using the Azure Machine Learning SDK.
+
+**Create Azure Databricks Cluster:**
+
+Select New Cluster and fill in the following details:
+ - Cluster name: _yourclustername_
+ - Cluster Mode: Any. **High Concurrency** preferred
+ - Databricks Runtime: Any 4.x runtime.
+ - Python version: **3**
+ - Workers: 2 or higher.
+ - The max. number of **concurrent iterations** in Automated ML settings must be **<=** the number of **worker nodes** in your Databricks cluster.
+ - Worker node VM types: **Memory optimized VM** preferred.
+ - Uncheck _Enable Autoscaling_
+
+
+It will take a few minutes to create the cluster. Please ensure that the cluster state is running before proceeding further.
+
+**Install Azure ML with Automated ML SDK on your Azure Databricks cluster**
+
+- Select Import library
+
+- Source: Upload Python Egg or PyPI
+
+- PyPi Name: **azureml-sdk[automl_databricks]**
+
+- Click Install Library
+
+- Do not select _Attach automatically to all clusters_. If you selected it earlier, go to your Home folder and deselect it.
+
+- Select the check box _Attach_ next to your cluster name
+
+(More details on how to attach and detach libs are here - [https://docs.databricks.com/user-guide/libraries.html#attach-a-library-to-a-cluster](https://docs.databricks.com/user-guide/libraries.html#attach-a-library-to-a-cluster) )
+
+- Ensure that there are no errors until Status changes to _Attached_. It may take a couple of minutes.
+
+**Note** - If you have an old build installed, please deselect it from the cluster’s installed libs > move to trash. Install the new build and restart the cluster. If there is still an issue, detach and reattach your cluster.
+
+**Now you can run the Automated ML sample notebook on your Azure Databricks cluster. Please let us know your feedback.**
\ No newline at end of file
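
Note on the concurrency rule in the README above: the number of concurrent iterations should not exceed the number of worker nodes. A minimal sketch of enforcing that rule before building the Automated ML settings could look like the following. The helper `clamp_concurrent_iterations` is hypothetical, and the settings key `max_concurrent_iterations` is an assumption about how the value is passed to Automated ML; neither appears in the README itself.

```python
def clamp_concurrent_iterations(requested: int, worker_nodes: int) -> int:
    """Return a concurrency value that never exceeds the worker node count."""
    if worker_nodes < 1:
        raise ValueError("worker_nodes must be at least 1")
    if requested < 1:
        raise ValueError("requested must be at least 1")
    return min(requested, worker_nodes)


# Hypothetical settings dict. The key name is an assumption and should be
# checked against the SDK version installed on the cluster.
automl_settings = {
    "max_concurrent_iterations": clamp_concurrent_iterations(
        requested=8, worker_nodes=2
    ),
}
print(automl_settings)
```

With the README's minimum of 2 workers, a request for 8 concurrent iterations is reduced to 2, which matches the stated constraint.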