Azure HDInsight
Azure HDInsight is a fully managed cloud Hadoop and Spark offering that provides optimized open-source analytics clusters for Spark, Hive, MapReduce, HBase, Storm, and Kafka. HDInsight Spark clusters provide kernels that you can use with Jupyter notebooks on Apache Spark to test your applications.
How Azure HDInsight works with Azure Machine Learning service
- You can train a model using Spark clusters and deploy the model to Azure Container Instances (ACI) or Azure Kubernetes Service (AKS) from within Azure HDInsight.
- You can also use automated machine learning capabilities integrated within Azure HDInsight.
You can use Azure HDInsight as a compute target from an Azure Machine Learning pipeline.
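Attaching an existing cluster as a compute target might look like the sketch below. It assumes the `azureml-core` Python SDK is installed and that the cluster is reachable over SSH at `<cluster-name>-ssh.azurehdinsight.net`; the helper names `ssh_address` and `attach_hdinsight`, and all argument values, are hypothetical and not taken from this article.

```python
def ssh_address(cluster_name):
    """Build the SSH endpoint host name for a cluster (assumed naming pattern)."""
    return f"{cluster_name}-ssh.azurehdinsight.net"


def attach_hdinsight(workspace, compute_name, cluster_name, username, password):
    """Attach an existing HDInsight cluster to a workspace as a compute target."""
    # Imported lazily so ssh_address() can be used without the SDK installed.
    from azureml.core.compute import ComputeTarget, HDInsightCompute

    # Describe how Azure Machine Learning should reach the cluster over SSH.
    config = HDInsightCompute.attach_configuration(
        address=ssh_address(cluster_name),
        ssh_port=22,
        username=username,
        password=password,
    )
    # Register the cluster under compute_name and wait for the attach to finish.
    target = ComputeTarget.attach(workspace, compute_name, config)
    target.wait_for_completion(show_output=True)
    return target
```

A call would pass a `Workspace` object (for example from `Workspace.from_config()`), a name for the compute target, and the cluster login and SSH credentials chosen during cluster setup. The returned target can then be referenced from pipeline steps.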
Set up your HDInsight cluster
Create HDInsight cluster
Quick create: Basic cluster setup
This article walks you through setup in the Azure portal, where you can create an HDInsight cluster using Quick create or Custom.
Follow the on-screen instructions to complete a basic cluster setup. Details are provided below for:
- Cluster types and configuration (the cluster must run Spark 2.3 (HDI 3.6) or later)
- Cluster login and SSH username
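The version requirement above can be checked before you continue. The following is a minimal sketch that assumes both minimums apply together (Spark 2.3 or later and HDI 3.6 or later) and that versions are plain dotted numbers. The function names are illustrative only.

```python
MIN_SPARK = (2, 3)  # minimum Spark version stated in the requirement
MIN_HDI = (3, 6)    # minimum HDInsight version stated in the requirement


def parse_version(text):
    """Turn a dotted version string such as '3.6' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(spark_version, hdi_version):
    """Return True when both versions are at or above the stated minimums."""
    return (
        parse_version(spark_version) >= MIN_SPARK
        and parse_version(hdi_version) >= MIN_HDI
    )
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating "2.10" as older than "2.3".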
Import the sample HDI notebook in Jupyter
Important links:
Create HDI cluster: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-provision-linux-clusters
