# Reading from and Writing to Datastores

A datastore is a reference to an Azure storage service, such as a blob container. Each datastore belongs to a workspace, and a workspace can have many datastores.

A data path points to a path on the underlying Azure storage service the datastore references. For example, given a datastore named `blob` that points to an Azure blob container, a data path can point to `/test/data/titanic.csv` in the blob container.
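
To make the relationship concrete, here is a minimal, purely illustrative sketch in plain Python. `DataPathSketch` and its fields are hypothetical names invented for this illustration. They are not the `azureml` classes, which are used later in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPathSketch:
    """Hypothetical stand-in: a datastore name plus a path inside that store."""

    datastore_name: str
    path_on_datastore: str

    def describe(self) -> str:
        # Strip leading slashes so the result is the same with or without them.
        relative = self.path_on_datastore.lstrip("/")
        return f"{self.datastore_name}:/{relative}"


# The example from the paragraph above: a file inside the `blob` datastore.
titanic = DataPathSketch(datastore_name="blob", path_on_datastore="/test/data/titanic.csv")
print(titanic.describe())
```

The point of the sketch is only that a data path is always a pair: which datastore, and where inside it.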

## Read data from Datastore

Data Prep supports reading data from a `Datastore`, a `DataPath`, or a `DataReference`.

Passing a datastore directly to any of the `read_*` methods of Data Prep reads everything in the underlying Azure storage service. To read a specific folder or file in the underlying storage, pass in a data reference (for example, the result of `datastore.path(...)`) or a `DataPath` instead.

In [None]:
from azureml.core import Workspace, Datastore
from azureml.data.datapath import DataPath

import azureml.dataprep as dprep

First, load an existing workspace. Replace `subscription_id`, `resource_group`, and `workspace_name` with the values for your own workspace.

In [None]:
subscription_id = '35f16a99-532a-4a47-9e93-00305f6c40f2'
resource_group = 'DataStoreTest'
workspace_name = 'dataprep-centraleuap'

workspace = Workspace(subscription_id=subscription_id, resource_group=resource_group, workspace_name=workspace_name)
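
If you would rather not hard-code these identifiers in the notebook, one option is to read them from environment variables. The following is a minimal sketch under that assumption. The helper `workspace_settings_from_env` and the variable names `AML_SUBSCRIPTION_ID`, `AML_RESOURCE_GROUP`, and `AML_WORKSPACE_NAME` are hypothetical and not part of any Azure library.

```python
import os


def workspace_settings_from_env(env=None):
    """Collect the three Workspace arguments from environment variables.

    The variable names are hypothetical; choose whatever fits your setup.
    """
    env = os.environ if env is None else env
    names = {
        "subscription_id": "AML_SUBSCRIPTION_ID",
        "resource_group": "AML_RESOURCE_GROUP",
        "workspace_name": "AML_WORKSPACE_NAME",
    }
    # Fail early with a readable message if anything is unset or empty.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {arg: env[var] for arg, var in names.items()}


# Usage, assuming the variables are set and azureml-core is installed:
# workspace = Workspace(**workspace_settings_from_env())
```

The returned dictionary uses the same keyword names as the `Workspace(...)` call above, so it can be unpacked directly into it.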

In [None]:
workspace.datastores

You can now read a crime data set from the datastore. If you are using your own workspace, the `crime0-10.csv` file will not be there by default, so you will have to upload it to the datastore yourself.

In [None]:
datastore = Datastore(workspace=workspace, name='dataprep_blob')
dflow = dprep.read_csv(path=datastore.path('crime0-10.csv'))
dflow.head(5)
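
If `crime0-10.csv` is not in your datastore yet, the upload step could be sketched as follows. This assumes a blob datastore object that exposes an `upload_files` method (as `AzureBlobDatastore` in `azureml-core` does) and assumes that omitting a target path places the file at the root of the container. The helper name and the local path in the usage comment are hypothetical.

```python
import os


def upload_to_datastore_root(datastore, local_file):
    """Upload one local file to a datastore and return its file name."""
    # Assumption: `datastore` provides upload_files(files=..., overwrite=...,
    # show_progress=...) and uploads to the container root by default.
    datastore.upload_files(files=[local_file], overwrite=True, show_progress=False)
    # The returned name is what you would then pass to datastore.path(...).
    return os.path.basename(local_file)


# Usage, assuming `datastore` was obtained as in the cell above and the
# file exists locally at this hypothetical path:
# name = upload_to_datastore_root(datastore, './data/crime0-10.csv')
# dflow = dprep.read_csv(path=datastore.path(name))
```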

You can also read from an Azure SQL database. To do that, you will first get an Azure SQL database datastore instance and pass it to Data Prep for reading.

In [None]:
datastore = Datastore(workspace=workspace, name='test_sql')
dflow_sql = dprep.read_sql(data_source=datastore, query='SELECT * FROM team')
dflow_sql.head(5)

You can also read from a PostgreSQL database. To do that, you will first get a PostgreSQL database datastore instance and pass it to Data Prep for reading.

In [None]:
datastore = Datastore(workspace=workspace, name='postgre_test')
dflow_sql = dprep.read_postgresql(data_source=datastore, query='SELECT * FROM public.people')
dflow_sql.head(5)

## Write data to Datastore

You can also write a dataflow to a datastore. The code below writes the crime data you read earlier to the `output/crime0-10` folder in the `dataprep_blob_key` datastore.

In [None]:
dest_datastore = Datastore(workspace, 'dataprep_blob_key')

In [None]:
dflow.write_to_csv(directory_path=dest_datastore.path('output/crime0-10')).run_local()

Now you can read a file from the `dataprep_adls` datastore, which references an Azure Data Lake Store.

In [None]:
datastore = Datastore(workspace=workspace, name='dataprep_adls')
dflow_adls = dprep.read_csv(path=DataPath(datastore, path_on_datastore='/input/crime0-10.csv'))
dflow_adls.head(5)

You can also read from the `adlsgen2` datastore, which references an Azure Data Lake Storage Gen2 account. The first cell below reads a single file, and the second reads all the files in a directory.

In [None]:
# read a file from ADLSGen2
datastore = Datastore(workspace=workspace, name='adlsgen2')
dflow_adlsgen2 = dprep.read_csv(path=DataPath(datastore, path_on_datastore='/testfolder/peopletest.csv'))
dflow_adlsgen2.head(5)

In [None]:
# read all files from ADLSGen2 directory
datastore = Datastore(workspace=workspace, name='adlsgen2')
dflow_adlsgen2 = dprep.read_csv(path=DataPath(datastore, path_on_datastore='/testfolder/testdir'))
dflow_adlsgen2.head()