* add info blurb to Cloud Bucket Storage sources and destinations * Apply suggestions from code review Remove extra colon Co-authored-by: Ben Church <ben@airbyte.io> --------- Co-authored-by: Ben Church <ben@airbyte.io>
3.1 KiB
3.1 KiB
Azure Blob Storage
This page contains the setup guide and reference information for the Azure Blob Storage source connector.
:::info Cloud storage may incur egress costs. Egress refers to data that is transferred out of the cloud storage system, such as when you download files or access them from a different location. For more information, see the Azure Blob Storage pricing guide. :::
Setup guide
Step 1: Set up Azure Blob Storage
- Create a storage account with the permissions details
Step 2: Set up the Azure Blob Storage connector in Airbyte
- Create a new Azure Blob Storage source with a suitable name.
- Set
containerappropriately. This will be the name of the container where the blobs are located. - If you are only interested in blobs containing some prefix in the container set the blobs prefix property
- Set schema inference limit if you want to limit the number of blobs being considered for constructing the schema
- Choose the format corresponding to the format of your files.
Supported sync modes
The Azure Blob Storage source connector supports the following sync modes:
| Feature | Supported? |
|---|---|
| Full Refresh Sync | Yes |
| Incremental Sync | Yes |
| Replicate Incremental Deletes | No |
| Replicate Multiple Files (blob prefix) | Yes |
| Replicate Multiple Streams (distinct tables) | No |
| Namespaces | No |
Azure Blob Storage Settings
azure_blob_storage_endpoint: azure blob storage endpoint to connect toazure_blob_storage_container_name: name of the container where your blobs are locatedazure_blob_storage_account_name: name of your accountazure_blob_storage_account_key: key of your accountazure_blob_storage_blobs_prefix: prefix for getting files which contain that prefix i.e. FolderA/FolderB/ will get files named FolderA/FolderB/blob.json but not FolderA/blob.jsonazure_blob_storage_schema_inference_limit: Limits the number of files being scanned for schema inference and can increase speed and efficiencyformat: File format of the blobs in the container
File Format Settings
Jsonl
Only the line-delimited JSON format is supported for now
Changelog 21210
| Version | Date | Pull Request | Subject |
|---|---|---|---|
| 0.1.0 | 2023-02-17 | https://github.com/airbytehq/airbyte/pull/23222 | Initial release with full-refresh and incremental sync with JSONL files |