Copyright (c) Microsoft Corporation. All rights reserved.  
Licensed under the MIT License.


# Using Databricks as a Compute Target from Azure Machine Learning Pipeline
To use Databricks as a compute target from [Azure Machine Learning Pipeline](https://aka.ms/pl-concept), a [DatabricksStep](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-steps/azureml.pipeline.steps.databricks_step.databricksstep?view=azure-ml-py) is used. This notebook demonstrates the use of DatabricksStep in Azure Machine Learning Pipeline.

The notebook will show:
1. Running an arbitrary Databricks notebook that the customer has in Databricks workspace
2. Running an arbitrary Python script that the customer has in DBFS
3. Running an arbitrary Python script that is available on the local computer (it is uploaded to DBFS and then run in Databricks)
4. Running a JAR job that the customer has in DBFS.
5. How to get run context in a Databricks interactive cluster

## Before you begin:

1. **Create an Azure Databricks workspace** in the same subscription where you have your Azure Machine Learning workspace. You will need details of this workspace later on to define DatabricksStep. [Click here](https://ms.portal.azure.com/#blade/HubsExtension/Resources/resourceType/Microsoft.Databricks%2Fworkspaces) for more information.
2. **Create PAT (access token)**: Manually create a Databricks access token at the Azure Databricks portal. See [this](https://docs.databricks.com/api/latest/authentication.html#generate-a-token) for more information.
3. **Add demo notebook to ADB**: This notebook has a sample you can use as is. Launch Azure Databricks attached to your Azure Machine Learning workspace and add a new notebook. 
4. **Create/attach a Blob storage** for use from ADB

## Add demo notebook to ADB Workspace
Copy and paste the below code to create a new notebook in your ADB workspace.

```python
# direct access
dbutils.widgets.get("myparam")
p = getArgument("myparam")
print ("Param -\'myparam':")
print (p)

dbutils.widgets.get("input")
i = getArgument("input")
print ("Param -\'input':")
print (i)

dbutils.widgets.get("output")
o = getArgument("output")
print ("Param -\'output':")
print (o)

n = i + "/testdata.txt"
df = spark.read.csv(n)

display (df)

data = [('value1', 'value2')]
df2 = spark.createDataFrame(data)

z = o + "/output.txt"
df2.write.csv(z)
```

## Azure Machine Learning and Pipeline SDK-specific imports

In [None]:
import os
import azureml.core
from azureml.core.runconfig import JarLibrary
from azureml.core.compute import ComputeTarget, DatabricksCompute
from azureml.exceptions import ComputeTargetException
from azureml.core import Workspace, Experiment
from azureml.pipeline.core import Pipeline, PipelineData
from azureml.pipeline.steps import DatabricksStep
from azureml.core.datastore import Datastore
from azureml.data.data_reference import DataReference

# Check core SDK version number
print("SDK version:", azureml.core.VERSION)

## Initialize Workspace

Initialize a workspace object from persisted configuration. If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the [configuration Notebook](https://aka.ms/pl-config) first if you haven't.

In [None]:
ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\n')

## Attach Databricks compute target
Next, you need to add your Databricks workspace to Azure Machine Learning as a compute target and give it a name. You will use this name to refer to your Databricks workspace compute target inside Azure Machine Learning.

- **Resource Group** - The resource group name of your Azure Machine Learning workspace
- **Databricks Workspace Name** - The workspace name of your Azure Databricks workspace
- **Databricks Access Token** - The access token you created in ADB

**The Databricks workspace needs to be present in the same subscription as your AML workspace**

In [None]:
# Replace with your account info before running.
 
db_compute_name=os.getenv("DATABRICKS_COMPUTE_NAME", "<my-databricks-compute-name>") # Databricks compute name
db_resource_group=os.getenv("DATABRICKS_RESOURCE_GROUP", "<my-db-resource-group>") # Databricks resource group
db_workspace_name=os.getenv("DATABRICKS_WORKSPACE_NAME", "<my-db-workspace-name>") # Databricks workspace name
db_access_token=os.getenv("DATABRICKS_ACCESS_TOKEN", "<my-access-token>") # Databricks access token
 
try:
    databricks_compute = DatabricksCompute(workspace=ws, name=db_compute_name)
    print('Compute target {} already exists'.format(db_compute_name))
except ComputeTargetException:
    print('Compute not found, will use below parameters to attach new one')
    print('db_compute_name {}'.format(db_compute_name))
    print('db_resource_group {}'.format(db_resource_group))
    print('db_workspace_name {}'.format(db_workspace_name))
    print('db_access_token <hidden>')  # do not print the secret token itself
 
    config = DatabricksCompute.attach_configuration(
        resource_group = db_resource_group,
        workspace_name = db_workspace_name,
        access_token= db_access_token)
    databricks_compute=ComputeTarget.attach(ws, db_compute_name, config)
    databricks_compute.wait_for_completion(True)
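
Before attaching, it can help to confirm that none of the settings still hold their `<...>` placeholder defaults. The following is a minimal sketch under that assumption; `find_unset_settings` is a hypothetical helper and not part of the Azure Machine Learning SDK, and the example values are made up.

```python
import re

# Matches the "<my-...>" style placeholders used as defaults in the cell above.
_PLACEHOLDER = re.compile(r"^<.*>$")


def find_unset_settings(settings):
    """Return the names of settings that are empty or still placeholders."""
    return sorted(
        name for name, value in settings.items()
        if not value or _PLACEHOLDER.match(value)
    )


# Example values only; in the notebook these would come from os.getenv(...).
example_settings = {
    "db_compute_name": "my-databricks",
    "db_resource_group": "<my-db-resource-group>",
    "db_workspace_name": "my-adb-workspace",
    "db_access_token": "",
}

missing = find_unset_settings(example_settings)
print("Settings still to fill in:", missing)
```

Running a check like this before `ComputeTarget.attach` surfaces a forgotten value early, rather than as a failed attach operation.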


## Data Connections with Inputs and Outputs
The DatabricksStep supports DBFS, Azure Blob and ADLS for inputs and outputs. You also will need to define a [Secrets](https://docs.azuredatabricks.net/user-guide/secrets/index.html) scope to enable authentication to external data sources such as Blob and ADLS from Databricks.

- Databricks documentation on [Azure Blob](https://docs.azuredatabricks.net/spark/latest/data-sources/azure/azure-storage.html)
- Databricks documentation on [ADLS](https://docs.databricks.com/spark/latest/data-sources/azure/azure-datalake.html)

### Type of Data Access
Databricks lets you interact with Azure Blob and ADLS in two ways.
- **Direct Access**: Databricks allows you to interact with Azure Blob or ADLS URIs directly. The input or output URIs will be mapped to a Databricks widget param in the Databricks notebook.
- **Mounting**: You will be supplied with additional parameters and secrets that will enable you to mount your ADLS or Azure Blob input or output location in your Databricks notebook.

#### Direct Access: Python sample code
If you have a data reference named "input" it will represent the URI of the input and you can access it directly in the Databricks python notebook like so:

```python
dbutils.widgets.get("input")
y = getArgument("input")
df = spark.read.csv(y)
```

#### Mounting: Python sample code for Azure Blob
Given an Azure Blob data reference named "input" the following widget params will be made available in the Databricks notebook:

```python
# This contains the input URI
dbutils.widgets.get("input")
myinput_uri = getArgument("input")

# How to get the input datastore name inside ADB notebook
# This contains the name of a Databricks secret (in the predefined "amlscope" secret scope) 
# that contains an access key or SAS for the Azure Blob input (this name is obtained by appending
# "_blob_secretname" to the name of the input).
dbutils.widgets.get("input_blob_secretname") 
myinput_blob_secretname = getArgument("input_blob_secretname")

# This contains the required configuration for mounting
dbutils.widgets.get("input_blob_config")
myinput_blob_config = getArgument("input_blob_config")

# Usage
dbutils.fs.mount(
  source = myinput_uri,
  mount_point = "/mnt/input",
  extra_configs = {myinput_blob_config:dbutils.secrets.get(scope = "amlscope", key = myinput_blob_secretname)})
```

#### Mounting: Python sample code for ADLS
Given an ADLS data reference named "input" the following widget params will be made available in the Databricks notebook:

```python
# This contains the input URI
dbutils.widgets.get("input") 
myinput_uri = getArgument("input")

# This contains the client id for the service principal 
# that has access to the adls input
dbutils.widgets.get("input_adls_clientid") 
myinput_adls_clientid = getArgument("input_adls_clientid")

# This contains the name of a Databricks secret (in the predefined "amlscope" secret scope) 
# that contains the secret for the above mentioned service principal
dbutils.widgets.get("input_adls_secretname") 
myinput_adls_secretname = getArgument("input_adls_secretname")

# This contains the refresh url for the mounting configs
dbutils.widgets.get("input_adls_refresh_url") 
myinput_adls_refresh_url = getArgument("input_adls_refresh_url")

# Usage 
configs = {"dfs.adls.oauth2.access.token.provider.type": "ClientCredential",
           "dfs.adls.oauth2.client.id": myinput_adls_clientid,
           "dfs.adls.oauth2.credential": dbutils.secrets.get(scope = "amlscope", key =myinput_adls_secretname),
           "dfs.adls.oauth2.refresh.url": myinput_adls_refresh_url}

dbutils.fs.mount(
  source = myinput_uri,
  mount_point = "/mnt/input",
  extra_configs = configs)
```
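
The widget names in the two mounting samples follow a pattern: the data reference name, plus a fixed suffix per storage type. The sketch below collects those names in one place. `mount_widget_names` is a hypothetical helper, and it assumes that the suffixes shown above for a reference named "input" apply unchanged to other reference names.

```python
def mount_widget_names(reference_name, storage_kind):
    """Return the widget names for a data reference used in mount mode.

    storage_kind is "blob" or "adls". The suffixes are taken from the
    samples above; their use for other reference names is an assumption.
    """
    suffixes = {
        "blob": ["_blob_secretname", "_blob_config"],
        "adls": ["_adls_clientid", "_adls_secretname", "_adls_refresh_url"],
    }
    if storage_kind not in suffixes:
        raise ValueError("storage_kind must be 'blob' or 'adls'")
    # The URI itself is exposed under the plain reference name.
    return [reference_name] + [reference_name + s for s in suffixes[storage_kind]]


blob_names = mount_widget_names("input", "blob")
adls_names = mount_widget_names("input", "adls")
print(blob_names)
print(adls_names)
```

Inside a Databricks notebook, each returned name could then be passed to `dbutils.widgets.get(...)` as in the samples above.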

## Use Databricks from Azure Machine Learning Pipeline
To use Databricks as a compute target from Azure Machine Learning Pipeline, a DatabricksStep is used. Let's define a datasource (via DataReference), intermediate data (via PipelineData) and a pipeline parameter (via PipelineParameter) to be used in DatabricksStep.

In [None]:
from azureml.pipeline.core import PipelineParameter

# Use the default blob storage
def_blob_store = Datastore(ws, "workspaceblobstore")
print('Datastore {} will be used'.format(def_blob_store.name))

pipeline_param = PipelineParameter(name="my_pipeline_param", default_value="pipeline_param1")

# We are uploading a sample file in the local directory to be used as a datasource
def_blob_store.upload_files(files=["./testdata.txt"], target_path="dbtest", overwrite=False)

step_1_input = DataReference(datastore=def_blob_store, path_on_datastore="dbtest",
                                     data_reference_name="input")

step_1_output = PipelineData("output", datastore=def_blob_store)

### Add a DatabricksStep
Adds a Databricks notebook as a step in a Pipeline.
- **\*name:** Name of the Module
- **inputs:** List of input connections for data consumed by this step. Fetch this inside the notebook using dbutils.widgets.get("input")
- **outputs:** List of output port definitions for outputs produced by this step. Fetch this inside the notebook using dbutils.widgets.get("output")
- **existing_cluster_id:** Cluster ID of an existing Interactive cluster on the Databricks workspace. If you are providing this, do not provide any of the parameters below that are used to create a new cluster such as spark_version, node_type, etc.
- **spark_version:** Version of spark for the databricks run cluster. default value: 4.0.x-scala2.11
- **node_type:** Azure vm node types for the databricks run cluster. default value: Standard_D3_v2
- **num_workers:** Specifies a static number of workers for the databricks run cluster
- **min_workers:** Specifies a min number of workers to use for auto-scaling the databricks run cluster
- **max_workers:** Specifies a max number of workers to use for auto-scaling the databricks run cluster
- **spark_env_variables:** Spark environment variables for the databricks run cluster (dictionary of {str:str}). default value: {'PYSPARK_PYTHON': '/databricks/python3/bin/python3'}
- **notebook_path:** Path to the notebook in the databricks instance. If you are providing this, do not provide python script related parameters or JAR related parameters.
- **notebook_params:** Parameters  for the databricks notebook (dictionary of {str:str}). Fetch this inside the notebook using dbutils.widgets.get("myparam")
- **python_script_path:** The path to the python script in the DBFS or S3. If you are providing this, do not provide python_script_name which is used for uploading script from local machine.
- **python_script_params:** Parameters for the python script (list of str)
- **main_class_name:** The name of the entry point in a JAR module. If you are providing this, do not provide any python script or notebook related parameters.
- **jar_params:** Parameters for the JAR module (list of str)
- **python_script_name:** name of a python script on your local machine (relative to source_directory). If you are providing this do not provide python_script_path which is used to execute a remote python script; or any of the JAR or notebook related parameters.
- **source_directory:** folder that contains the script and other files
- **hash_paths:** list of paths to hash to detect a change in source_directory (script file is always hashed)
- **run_name:** Name in databricks for this run
- **timeout_seconds:** Timeout for the databricks run
- **runconfig:** Runconfig to use. Either pass runconfig or each library type as a separate parameter but do not mix the two
- **maven_libraries:** maven libraries for the databricks run
- **pypi_libraries:** pypi libraries for the databricks run
- **egg_libraries:** egg libraries for the databricks run
- **jar_libraries:** jar libraries for the databricks run
- **rcran_libraries:** rcran libraries for the databricks run
- **compute_target:** Azure Databricks compute
- **allow_reuse:** Whether the step should reuse previous results when run with the same settings/inputs
- **version:** Optional version tag to denote a change in functionality for the step

\* *denotes required fields*  
*You must provide exactly one of num_workers or min_workers and max_workers parameters*  
*You must provide exactly one of databricks_compute or databricks_compute_name parameters*
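
The "do not provide" rules in the list above can be checked before constructing a step. The following is a rough sketch of such a check, not the SDK's own validation: `check_step_arguments` is a hypothetical helper, the set of new-cluster parameters is limited to the ones named above, and requiring that one kind of work be present is an assumption.

```python
def check_step_arguments(**kwargs):
    """Return a list of problems found in proposed DatabricksStep keyword arguments."""
    given = {name for name, value in kwargs.items() if value is not None}
    problems = []

    # The step runs one kind of work: a notebook, a script already in DBFS,
    # a local script, or a JAR.
    work_kinds = {"notebook_path", "python_script_path",
                  "python_script_name", "main_class_name"}
    chosen = sorted(given & work_kinds)
    if len(chosen) != 1:
        problems.append("expected exactly one of {}, got {}".format(
            sorted(work_kinds), chosen))

    new_cluster = {"spark_version", "node_type", "num_workers",
                   "min_workers", "max_workers"}
    autoscale = {"min_workers", "max_workers"}
    if "existing_cluster_id" in given:
        # An existing cluster must not be combined with new-cluster settings.
        clash = sorted(given & new_cluster)
        if clash:
            problems.append(
                "existing_cluster_id cannot be combined with {}".format(clash))
    else:
        has_fixed = "num_workers" in given
        has_autoscale = autoscale <= given
        has_partial = bool(given & autoscale) and not has_autoscale
        if has_fixed == has_autoscale or has_partial:
            problems.append(
                "provide either num_workers, or both min_workers and max_workers")
    return problems


print(check_step_arguments(notebook_path="/Users/someone/demo", num_workers=1))
print(check_step_arguments(notebook_path="/Users/someone/demo",
                           existing_cluster_id="example-id", num_workers=1))
```

An empty list means the combination is consistent with the rules as written here; the SDK may enforce additional constraints.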

## Use runconfig to specify library dependencies
You can use a runconfig to specify the library dependencies for your cluster in Databricks. The runconfig will contain a databricks section as follows:

```yaml
environment:
# Databricks details
  databricks:
# List of maven libraries.
    mavenLibraries:
    - coordinates: org.jsoup:jsoup:1.7.1
      repo: ''
      exclusions:
      - slf4j:slf4j
      - '*:hadoop-client'
# List of PyPi libraries
    pypiLibraries:
    - package: beautifulsoup4
      repo: ''
# List of RCran libraries
    rcranLibraries:
    -
# Coordinates.
      package: ada
# Repo
      repo: http://cran.us.r-project.org
# List of JAR libraries
    jarLibraries:
    -
# Coordinates.
      library: dbfs:/mnt/libraries/library.jar
# List of Egg libraries
    eggLibraries:
    -
# Coordinates.
      library: dbfs:/mnt/libraries/library.egg
```

You can then create a RunConfiguration object using this file and pass it as the runconfig parameter to DatabricksStep.
```python
from azureml.core.runconfig import RunConfiguration

runconfig = RunConfiguration()
runconfig.load(path='<directory_where_runconfig_is_stored>', name='<runconfig_file_name>')
```

### 1. Running the demo notebook already added to the Databricks workspace
Create a notebook in the Azure Databricks workspace, and provide the path to that notebook as the value of the environment variable "DATABRICKS_NOTEBOOK_PATH". This sets the variable `notebook_path` when you run the code cell below.

You can find your notebook's path in the Azure Databricks UI by hovering over the notebook's title. A typical notebook path looks like this: `/Users/example@databricks.com/example`. See [Databricks Workspace](https://docs.azuredatabricks.net/user-guide/workspace.html) to learn about the folder structure.

Note: DataPath `PipelineParameter` should be provided in list of inputs. Such parameters can be accessed by the datapath `name`.

In [None]:
notebook_path=os.getenv("DATABRICKS_NOTEBOOK_PATH", "<my-databricks-notebook-path>") # Databricks notebook path

dbNbStep = DatabricksStep(
    name="DBNotebookInWS",
    inputs=[step_1_input],
    outputs=[step_1_output],
    num_workers=1,
    notebook_path=notebook_path,
    notebook_params={'myparam': 'testparam', 
                     'myparam2': pipeline_param},
    run_name='DB_Notebook_demo',
    compute_target=databricks_compute,
    allow_reuse=True
)

#### Build and submit the Experiment

Note: The default value of `pipeline_param` is used unless a different value is specified in the pipeline parameters during submission.

In [None]:
steps = [dbNbStep]
pipeline = Pipeline(workspace=ws, steps=steps)
pipeline_run = Experiment(ws, 'DB_Notebook_demo').submit(pipeline)
pipeline_run.wait_for_completion()

#### View Run Details

In [None]:
from azureml.widgets import RunDetails
RunDetails(pipeline_run).show()

### 2. Running a Python script from DBFS
This shows how to run a Python script in DBFS. 

To complete this, you first need to upload the Python script from your local machine to DBFS using the [CLI](https://docs.azuredatabricks.net/user-guide/dbfs-databricks-file-system.html). The CLI command is given below:

```
dbfs cp ./train-db-dbfs.py dbfs:/train-db-dbfs.py
```

The code in the below cell assumes that you have completed the previous step of uploading the script `train-db-dbfs.py` to the root folder in DBFS.

Note: `pipeline_param` will add two values to python_script_params, a name followed by a value. The name will be in this format: `--MY_PIPELINE_PARAM`. For example, in the given case, python_script_params will be `["arg1", "--MY_PIPELINE_PARAM", "pipeline_param1", "arg2"]`
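
On the receiving side, the script can pick the pipeline parameter out of that argument list with `argparse`. This is a minimal sketch of what a script such as `train-db-dbfs.py` might do; the actual script is not shown in this notebook, so the function name and structure here are assumptions. `parse_known_args` is used so that the plain positional values are returned separately instead of causing an error.

```python
import argparse


def parse_script_args(argv):
    """Split the argument list into the pipeline parameter and everything else."""
    parser = argparse.ArgumentParser(
        description="Example argument parsing for a Databricks Python script")
    parser.add_argument("--MY_PIPELINE_PARAM", default=None)
    # Unrecognized arguments (here "arg1" and "arg2") are returned as a list.
    return parser.parse_known_args(argv)


# In a real script this list would be sys.argv[1:].
example_argv = ["arg1", "--MY_PIPELINE_PARAM", "pipeline_param1", "arg2"]
known, extras = parse_script_args(example_argv)
print("pipeline parameter:", known.MY_PIPELINE_PARAM)
print("other arguments:", extras)
```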

In [None]:
python_script_path = os.getenv("DATABRICKS_PYTHON_SCRIPT_PATH", "<my-databricks-python-script-path>") # Databricks python script path

dbPythonInDbfsStep = DatabricksStep(
    name="DBPythonInDBFS",
    inputs=[step_1_input],
    num_workers=1,
    python_script_path=python_script_path,
    python_script_params=['arg1', pipeline_param, 'arg2'],
    run_name='DB_Python_demo',
    compute_target=databricks_compute,
    allow_reuse=True
)

#### Build and submit the Experiment

In [None]:
steps = [dbPythonInDbfsStep]
pipeline = Pipeline(workspace=ws, steps=steps)
pipeline_run = Experiment(ws, 'DB_Python_demo').submit(pipeline)
pipeline_run.wait_for_completion()

#### View Run Details

In [None]:
from azureml.widgets import RunDetails
RunDetails(pipeline_run).show()

### 3. Running a Python script in Databricks that is currently on the local computer
To run a Python script that is currently in your local computer, follow the instructions below. 

The code below assumes that `train-db-local.py` is in the `source_directory` folder (`./databricks_train` here), relative to the current working directory.

The best practice is to use a separate folder for each step's script and its dependent files, and to specify that folder as the `source_directory` for the step. This helps reduce the size of the snapshot created for the step (only the specific folder is snapshotted). Since a change to any file in the `source_directory` triggers a re-upload of the snapshot, this also helps preserve reuse of the step when nothing in its `source_directory` has changed.
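
To see why unrelated files in a step's folder defeat reuse, consider a content fingerprint over the whole directory. The sketch below is only an illustration of the idea; it is not how the SDK actually builds or compares snapshots, and `directory_fingerprint` is a hypothetical helper.

```python
import hashlib
import os
import tempfile


def directory_fingerprint(directory):
    """Hash the relative paths and contents of all files under a directory."""
    digest = hashlib.sha256()
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # make the traversal order deterministic
        for name in sorted(files):
            path = os.path.join(root, name)
            digest.update(os.path.relpath(path, directory).encode("utf-8"))
            with open(path, "rb") as handle:
                digest.update(handle.read())
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as step_dir:
    with open(os.path.join(step_dir, "train-db-local.py"), "w") as handle:
        handle.write("print('v1')\n")
    before = directory_fingerprint(step_dir)
    unchanged = directory_fingerprint(step_dir)

    # Adding any file to the folder changes the fingerprint,
    # even though the script itself is untouched.
    with open(os.path.join(step_dir, "notes.txt"), "w") as handle:
        handle.write("unrelated file\n")
    after = directory_fingerprint(step_dir)

print("same without changes:", before == unchanged)
print("same after adding a file:", before == after)
```

Keeping each step's folder limited to what that step needs avoids this kind of accidental change.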

In this case, the Python script will be uploaded first to DBFS, and then the script will be run in Databricks.

In [None]:
python_script_name = "train-db-local.py"
source_directory = "./databricks_train"

dbPythonInLocalMachineStep = DatabricksStep(
    name="DBPythonInLocalMachine",
    inputs=[step_1_input],
    num_workers=1,
    python_script_name=python_script_name,
    source_directory=source_directory,
    run_name='DB_Python_Local_demo',
    compute_target=databricks_compute,
    allow_reuse=True
)

#### Build and submit the Experiment

In [None]:
steps = [dbPythonInLocalMachineStep]
pipeline = Pipeline(workspace=ws, steps=steps)
pipeline_run = Experiment(ws, 'DB_Python_Local_demo').submit(pipeline)
pipeline_run.wait_for_completion()

#### View Run Details

In [None]:
from azureml.widgets import RunDetails
RunDetails(pipeline_run).show()

### 4. Running a JAR job that is already added in DBFS
To run a JAR job that is already uploaded to DBFS, follow the instructions below. You will first upload the JAR file to DBFS using the [CLI](https://docs.azuredatabricks.net/user-guide/dbfs-databricks-file-system.html).

The code in the cell below assumes that you have uploaded `train-db-dbfs.jar` to the root folder in DBFS. You can upload `train-db-dbfs.jar` to the root folder in DBFS using this command line, so that you can use `jar_library_dbfs_path = "dbfs:/train-db-dbfs.jar"`:

```
dbfs cp ./train-db-dbfs.jar dbfs:/train-db-dbfs.jar
```

Note: `pipeline_param` will add two values to jar_params, a name followed by a value. The name will be in this format: `--MY_PIPELINE_PARAM`. For example, in the given case, jar_params will be `["arg1", "--MY_PIPELINE_PARAM", "pipeline_param1", "arg2"]`

In [None]:
main_jar_class_name = "com.microsoft.aeva.Main"
jar_library_dbfs_path = os.getenv("DATABRICKS_JAR_LIB_PATH", "<my-databricks-jar-lib-path>") # Databricks jar library path

dbJarInDbfsStep = DatabricksStep(
    name="DBJarInDBFS",
    inputs=[step_1_input],
    num_workers=1,
    main_class_name=main_jar_class_name,
    jar_params=['arg1', pipeline_param, 'arg2'],
    run_name='DB_JAR_demo',
    jar_libraries=[JarLibrary(jar_library_dbfs_path)],
    compute_target=databricks_compute,
    allow_reuse=True
)

#### Build and submit the Experiment

In [None]:
steps = [dbJarInDbfsStep]
pipeline = Pipeline(workspace=ws, steps=steps)
pipeline_run = Experiment(ws, 'DB_JAR_demo').submit(pipeline)
pipeline_run.wait_for_completion()

#### View Run Details

In [None]:
from azureml.widgets import RunDetails
RunDetails(pipeline_run).show()

### 5. Running demo notebook already added to the Databricks workspace using existing cluster
First, register a DBFS datastore and make sure that `path_on_datastore` exists in the Databricks file system. You can browse the files by referring to [this](https://docs.azuredatabricks.net/user-guide/dbfs-databricks-file-system.html).

Find `existing_cluster_id` by opening the Clusters page in the Azure Databricks UI. In the URL, it is the hyphen-separated string right after "clusters/".
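
If you prefer to copy the whole URL, the ID can be pulled out with a regular expression. This is a small sketch: `extract_cluster_id` is a hypothetical helper, and the example URL and cluster ID are made up to match the description above, so the exact URL layout in your workspace may differ.

```python
import re


def extract_cluster_id(cluster_page_url):
    """Return the path segment that follows "clusters/", or None if absent."""
    match = re.search(r"clusters/([^/?#]+)", cluster_page_url)
    return match.group(1) if match else None


# Made-up URL for illustration only.
example_url = ("https://example.azuredatabricks.net/?o=1234567890"
               "#/setting/clusters/0123-456789-abcde123/configuration")
cluster_id = extract_cluster_id(example_url)
print(cluster_id)
```

The resulting string is what you would pass as `existing_cluster_id` in the step below.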

In [None]:
try:
    dbfs_ds = Datastore.get(workspace=ws, datastore_name='dbfs_datastore')
    print('DBFS Datastore already exists')
except Exception as ex:
    dbfs_ds = Datastore.register_dbfs(ws, datastore_name='dbfs_datastore')

step_1_input = DataReference(datastore=dbfs_ds, path_on_datastore="FileStore", data_reference_name="input")
step_1_output = PipelineData("output", datastore=dbfs_ds)

In [None]:
dbNbWithExistingClusterStep = DatabricksStep(
    name="DBFSReferenceWithExisting",
    inputs=[step_1_input],
    outputs=[step_1_output],
    notebook_path=notebook_path,
    notebook_params={'myparam': 'testparam', 
        'myparam2': pipeline_param},
    run_name='DBFS_Reference_With_Existing',
    compute_target=databricks_compute,
    existing_cluster_id="your existing cluster id",
    allow_reuse=True
)

#### Build and submit the Experiment

In [None]:
steps = [dbNbWithExistingClusterStep]
pipeline = Pipeline(workspace=ws, steps=steps)
pipeline_run = Experiment(ws, 'DBFS_Reference_With_Existing').submit(pipeline)
pipeline_run.wait_for_completion()

#### View Run Details

In [None]:
from azureml.widgets import RunDetails
RunDetails(pipeline_run).show()

### 6. Running a Python script in Databricks that is currently in local computer with existing cluster
When you access Azure Blob or Data Lake storage from an existing (interactive) cluster, you need to ensure the Spark configuration is set up correctly to access this storage. This setup may require the cluster to be restarted.

If you set permit_cluster_restart to True, AML will check if the spark configuration needs to be updated and restart the cluster for you if required. This will ensure that the storage can be correctly accessed from the Databricks cluster.

In [None]:
step_1_input = DataReference(datastore=def_blob_store, path_on_datastore="dbtest",
                                     data_reference_name="input")

dbPythonInLocalWithExistingStep = DatabricksStep(
    name="DBPythonInLocalMachineWithExisting",
    inputs=[step_1_input],
    python_script_name=python_script_name,
    source_directory=source_directory,
    run_name='DB_Python_Local_existing_demo',
    compute_target=databricks_compute,
    existing_cluster_id="your existing cluster id",
    allow_reuse=False,
    permit_cluster_restart=True
)

#### Build and submit the Experiment

In [None]:
steps = [dbPythonInLocalWithExistingStep]
pipeline = Pipeline(workspace=ws, steps=steps)
pipeline_run = Experiment(ws, 'DB_Python_Local_existing_demo').submit(pipeline)
pipeline_run.wait_for_completion()

#### View Run Details

In [None]:
from azureml.widgets import RunDetails
RunDetails(pipeline_run).show()

### How to get run context in a Databricks interactive cluster

Users are used to being able to use Run.get_context() to retrieve the parent_run_id for a given run_id. In DatabricksStep, however, a little more work is required to achieve this.

The solution is to parse the script arguments and set corresponding environment variables to access the run context from within Databricks.
Note that this workaround is not required for job clusters. 

Here is a code sample:

```python
from azureml.core import Run
import argparse
import os


def populate_environ():
    parser = argparse.ArgumentParser(description='Process arguments passed to script')
    parser.add_argument('--AZUREML_SCRIPT_DIRECTORY_NAME')
    parser.add_argument('--AZUREML_RUN_TOKEN')
    parser.add_argument('--AZUREML_RUN_TOKEN_EXPIRY')
    parser.add_argument('--AZUREML_RUN_ID')
    parser.add_argument('--AZUREML_ARM_SUBSCRIPTION')
    parser.add_argument('--AZUREML_ARM_RESOURCEGROUP')
    parser.add_argument('--AZUREML_ARM_WORKSPACE_NAME')
    parser.add_argument('--AZUREML_ARM_PROJECT_NAME')
    parser.add_argument('--AZUREML_SERVICE_ENDPOINT')

    args = parser.parse_args()
    os.environ['AZUREML_SCRIPT_DIRECTORY_NAME'] = args.AZUREML_SCRIPT_DIRECTORY_NAME
    os.environ['AZUREML_RUN_TOKEN'] = args.AZUREML_RUN_TOKEN
    os.environ['AZUREML_RUN_TOKEN_EXPIRY'] = args.AZUREML_RUN_TOKEN_EXPIRY
    os.environ['AZUREML_RUN_ID'] = args.AZUREML_RUN_ID
    os.environ['AZUREML_ARM_SUBSCRIPTION'] = args.AZUREML_ARM_SUBSCRIPTION
    os.environ['AZUREML_ARM_RESOURCEGROUP'] = args.AZUREML_ARM_RESOURCEGROUP
    os.environ['AZUREML_ARM_WORKSPACE_NAME'] = args.AZUREML_ARM_WORKSPACE_NAME
    os.environ['AZUREML_ARM_PROJECT_NAME'] = args.AZUREML_ARM_PROJECT_NAME
    os.environ['AZUREML_SERVICE_ENDPOINT'] = args.AZUREML_SERVICE_ENDPOINT

populate_environ()
run = Run.get_context(allow_offline=False)
print(run._run_dto["parent_run_id"])
```
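
The sample above assumes that every listed argument is present and that the script receives no other arguments. A more tolerant variant is sketched below, under the assumption that the same argument names are passed. It uses `parse_known_args` so the script's own arguments do not cause an error, and it skips any value that is missing, since `os.environ` only accepts strings. The run ID and the extra flag in the example call are made up.

```python
import argparse
import os

AZUREML_ARGUMENT_NAMES = [
    "AZUREML_SCRIPT_DIRECTORY_NAME",
    "AZUREML_RUN_TOKEN",
    "AZUREML_RUN_TOKEN_EXPIRY",
    "AZUREML_RUN_ID",
    "AZUREML_ARM_SUBSCRIPTION",
    "AZUREML_ARM_RESOURCEGROUP",
    "AZUREML_ARM_WORKSPACE_NAME",
    "AZUREML_ARM_PROJECT_NAME",
    "AZUREML_SERVICE_ENDPOINT",
]


def populate_environ(argv=None):
    """Copy AZUREML_* script arguments into os.environ; return the names set."""
    parser = argparse.ArgumentParser(description="Process arguments passed to script")
    for name in AZUREML_ARGUMENT_NAMES:
        parser.add_argument("--" + name)
    # Arguments that belong to the script itself are left in the second value.
    args, _unknown = parser.parse_known_args(argv)

    populated = []
    for name in AZUREML_ARGUMENT_NAMES:
        value = getattr(args, name)
        if value is not None:
            os.environ[name] = value
            populated.append(name)
    return populated


# Example call with made-up values; in a script, call populate_environ() with
# no arguments so that sys.argv is used.
populated = populate_environ(["--AZUREML_RUN_ID", "example-run-id", "--my_own_flag", "1"])
print(populated)
```

After this, `Run.get_context(allow_offline=False)` can be called as in the sample above, provided all values it needs were actually passed.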

# Next: ADLA as a Compute Target
To use ADLA as a compute target from Azure Machine Learning Pipeline, an AdlaStep is used. This [notebook](https://aka.ms/pl-adla) demonstrates the use of AdlaStep in Azure Machine Learning Pipeline.