Updates to low-code documentation (#17121)
* Update docs
* move substreams to another file
* Update link
* rename
* link to alpha definition
* Update docs/connector-development/config-based/index.md

Co-authored-by: Andy <andy@airbyte.io>
@@ -1,22 +1,31 @@
:warning: This framework is in [alpha](https://docs.airbyte.com/project-overview/product-release-stages/#alpha). It is still in active development and may include backward-incompatible changes. Please share feedback and requests directly with us at feedback@airbyte.io :warning:

# Index

## From scratch

This section gives an overview of the low-code framework.

- [Overview](overview.md)
- [Yaml structure](yaml-structure.md)
- [YAML structure](yaml-structure.md)
- [Reference docs](https://airbyte-cdk.readthedocs.io/en/latest/api/airbyte_cdk.sources.declarative.html)

## Concepts

This section contains additional information on the different components that can be used to define a low-code connector.

- [Authentication](authentication.md)
- [Error handling](error-handling.md)
- [Pagination](pagination.md)
- [Record selection](record-selector.md)
- [Request options](request-options.md)
- [Stream slicers](stream-slicers.md)
- [Substreams](substreams.md)

## Tutorial

This section is a tutorial that will guide you through the end-to-end process of implementing a low-code connector.

0. [Getting started](tutorial/0-getting-started.md)
1. [Creating a source](tutorial/1-create-source.md)
2. [Installing dependencies](tutorial/2-install-dependencies.md)

@@ -1,6 +1,6 @@
# Config-based connectors overview

:warning: This framework is in alpha stage. Support is not in production and is available only to select users. :warning:
:warning: This framework is in [alpha](https://docs.airbyte.com/project-overview/product-release-stages/#alpha). It is still in active development and may include backward-incompatible changes. Please share feedback and requests directly with us at feedback@airbyte.io :warning:

The goal of this document is to give enough technical specifics to understand how config-based connectors work.
When you're ready to start building a connector, you can start with [the tutorial](./tutorial/0-getting-started.md), or dive into [more detailed documentation](./index.md).
@@ -53,11 +53,11 @@ The only pagination mechanisms supported are

### What is the authorization mechanism?

Endpoints that require authenticating using a query param or an HTTP header, as is the case for the [Exchange Rates Data API](https://apilayer.com/marketplace/exchangerates_data-api#authentication), are supported.
Endpoints that require [authenticating using a query param or an HTTP header](./authentication.md#apikeyauthenticator), as is the case for the [Exchange Rates Data API](https://apilayer.com/marketplace/exchangerates_data-api#authentication), are supported.

Endpoints that require authenticating using Basic Auth over HTTPS, as is the case for [Greenhouse](https://developers.greenhouse.io/harvest.html#authentication), are supported.
Endpoints that require [authenticating using Basic Auth over HTTPS](./authentication.md#basichttpauthenticator), as is the case for [Greenhouse](https://developers.greenhouse.io/harvest.html#authentication), are supported.

Endpoints that require authenticating using OAuth 2.0, as is the case for [Strava](https://developers.strava.com/docs/authentication/#introduction), are supported.
Endpoints that require [authenticating using OAuth 2.0](./authentication.md#oauth), as is the case for [Strava](https://developers.strava.com/docs/authentication/#introduction), are supported.

Other authentication schemes such as JWT are not supported.

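
To make the supported schemes above concrete, here is a minimal Python sketch of what each one typically adds to an outgoing request. It is not the framework's implementation: the function names, the header name, and the query parameter name are hypothetical and chosen for illustration only.

```python
import base64
from urllib.parse import urlencode


def api_key_header(header_name: str, api_key: str) -> dict:
    """API key sent in an HTTP header (header name is API-specific)."""
    return {header_name: api_key}


def api_key_query(url: str, param_name: str, api_key: str) -> str:
    """API key sent as a query parameter appended to the URL."""
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{urlencode({param_name: api_key})}"


def basic_auth_header(username: str, password: str = "") -> dict:
    """HTTP Basic auth: base64("username:password") in the Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def bearer_header(access_token: str) -> dict:
    """OAuth 2.0 access tokens are commonly sent as a Bearer token."""
    return {"Authorization": f"Bearer {access_token}"}


# Hypothetical values, for illustration only.
key_headers = api_key_header("apikey", "secret")
key_url = api_key_query("https://api.example.com/latest", "access_key", "secret")
basic_headers = basic_auth_header("user", "pass")
oauth_headers = bearer_header("tok")
```

In the low-code framework these choices are expressed declaratively in the connector's YAML file rather than in Python; the sketch only shows the resulting request shape.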
@@ -78,11 +78,11 @@ Throttling is not supported, but the connector can use exponential backoff to av
| Transport protocol | HTTP |
| HTTP methods | GET, POST |
| Data format | JSON |
| Resource type | Collections<br/>Sub-collection |
| Resource type | Collections<br/>[Sub-collection](./substreams.md) |
| [Pagination](./pagination.md) | [Page limit](./pagination.md#page-increment)<br/>[Offset](./pagination.md#offset-increment)<br/>[Cursor](./pagination.md#cursor) |
| [Authentication](./authentication.md) | [Header based](./authentication.md#ApiKeyAuthenticator)<br/>[Bearer](./authentication.md#BearerAuthenticator)<br/>[Basic](./authentication.md#BasicHttpAuthenticator)<br/>[OAuth](./authentication.md#OAuth) |
| Sync mode | Full refresh<br/>Incremental |
| Schema discovery | Only static schemas |
| Schema discovery | Static schemas |
| [Stream slicing](./stream-slicers.md) | [Datetime](./stream-slicers.md#Datetime), [lists](./stream-slicers.md#list-stream-slicer), [parent-resource id](./stream-slicers.md#Substream-slicer) |
| [Record transformation](./record-selector.md) | [Field selection](./record-selector.md#selecting-a-field)<br/>[Adding fields](./record-selector.md#adding-fields)<br/>[Removing fields](./record-selector.md#removing-fields)<br/>[Filtering records](./record-selector.md#filtering-records) |
| [Error detection](./error-handling.md) | [From HTTP status code](./error-handling.md#from-status-code)<br/>[From error message](./error-handling.md#from-error-message) |
@@ -122,9 +122,9 @@ The data retriever defines how to read the data for a Stream, and acts as an orc
There is currently only one implementation, the `SimpleRetriever`, which is defined by

1. Requester: Describes how to submit requests to the API source
2. Paginator: Describes how to navigate through the API's pages
3. Record selector: Describes how to extract records from an HTTP response
4. Stream Slicer: Describes how to partition the stream, enabling incremental syncs and checkpointing
2. [Paginator](./pagination.md): Describes how to navigate through the API's pages
3. [Record selector](./record-selector.md): Describes how to extract records from an HTTP response
4. [Stream Slicer](./stream-slicers.md): Describes how to partition the stream, enabling incremental syncs and checkpointing

Each of those components (and their subcomponents) is defined by an explicit interface and one or many implementations.
The developer can choose and configure the implementation they need depending on the specifications of the integration they are building against.
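
The way these four components cooperate can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the `SimpleRetriever` source: `read_records`, the callables it takes, and the fake API pages are all hypothetical names introduced here.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional

Record = Dict[str, Any]
Response = Dict[str, Any]
Slice = Dict[str, Any]


def read_records(
    stream_slices: Iterable[Slice],                                # stream slicer output
    send_request: Callable[[Slice, Optional[Any]], Response],      # requester
    next_page_token: Callable[[Response], Optional[Any]],          # paginator
    select_records: Callable[[Response], List[Record]],            # record selector
) -> Iterator[Record]:
    """For each slice, request pages until the paginator returns no token."""
    for stream_slice in stream_slices:
        page_token = None
        while True:
            response = send_request(stream_slice, page_token)
            yield from select_records(response)
            page_token = next_page_token(response)
            if page_token is None:
                break


# A fake, in-memory API keyed by (slice value, page token).
PAGES = {
    ("2022-01", None): {"data": [{"id": 1}, {"id": 2}], "next": "p2"},
    ("2022-01", "p2"): {"data": [{"id": 3}], "next": None},
    ("2022-02", None): {"data": [{"id": 4}], "next": None},
}

records = list(
    read_records(
        stream_slices=[{"month": "2022-01"}, {"month": "2022-02"}],
        send_request=lambda s, token: PAGES[(s["month"], token)],
        next_page_token=lambda response: response["next"],
        select_records=lambda response: response["data"],
    )
)
```

Because each component is passed in separately, any one of them can be swapped (for example, a different pagination strategy) without touching the others, which is the point of defining them behind explicit interfaces.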
@@ -157,13 +157,9 @@ There is currently only one implementation, the `HttpRequester`, which is define
1. A base url: The root of the API source
2. A path: The specific endpoint to fetch data from for a resource
3. The HTTP method: the HTTP method to use (GET or POST)
4. A request options provider: Defines the request parameters (query parameters), headers, and request body to set on outgoing HTTP requests
5. An authenticator: Defines how to authenticate to the source
6. An error handler: Defines how to handle errors

More details on authentication can be found in the [authentication section](authentication.md).

More details on error handling can be found in the [error handling section](error-handling.md).
4. [A request options provider](./request-options.md): Defines the request parameters (query parameters), headers, and request body to set on outgoing HTTP requests
5. [An authenticator](./authentication.md): Defines how to authenticate to the source
6. [An error handler](./error-handling.md): Defines how to handle errors
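
As a rough illustration of how the pieces listed above combine into one outgoing request, consider the following Python sketch. `RequestSpec` and its field names are hypothetical and are not the `HttpRequester` API; error handling is left out to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict
from urllib.parse import urlencode


@dataclass
class RequestSpec:
    """Hypothetical container mirroring the components listed above."""

    url_base: str                                   # 1. base url
    path: str                                       # 2. path
    http_method: str = "GET"                        # 3. HTTP method
    request_parameters: Dict[str, str] = field(default_factory=dict)  # 4. request options
    request_headers: Dict[str, str] = field(default_factory=dict)     # 4. request options
    auth_headers: Dict[str, str] = field(default_factory=dict)        # 5. authenticator output

    def build(self) -> dict:
        """Assemble method, full URL, and merged headers."""
        if self.http_method not in ("GET", "POST"):
            raise ValueError(f"Unsupported HTTP method: {self.http_method}")
        url = self.url_base.rstrip("/") + "/" + self.path.lstrip("/")
        if self.request_parameters:
            url += "?" + urlencode(self.request_parameters)
        # Authentication headers are applied last so they are not overridden.
        headers = {**self.request_headers, **self.auth_headers}
        return {"method": self.http_method, "url": url, "headers": headers}


# Hypothetical values, for illustration only.
request = RequestSpec(
    url_base="https://api.example.com/v1/",
    path="/latest",
    request_parameters={"base": "USD"},
    auth_headers={"apikey": "secret"},
).build()
```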

## Connection Checker

@@ -209,5 +205,5 @@ pagination_strategy:
The following connectors can serve as examples of what production-ready config-based connectors look like

- [Greenhouse](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-greenhouse)
- [Sendgrid](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-sendgrid)
- [Sentry](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-sentry)
- [Sendgrid](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-sendgrid/source_sendgrid/sendgrid.yaml)
- [Sentry](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-sentry/source_sentry/sentry.yaml)

@@ -118,54 +118,10 @@ the resulting stream slices are
]
```

### Substream slicer

`SubstreamSlicer` iterates over the parent's stream slices.
This is useful for defining sub-resources.

We might for instance want to read all the commits for a given repository (parent resource).

For each stream, the slicer needs to know

- what the parent stream is
- what the key of the records in the parent stream is
- which field defines the stream slice representing the parent record
- how to specify that information on an outgoing HTTP request

Assuming the commits for a given repository can be read by specifying the repository as a request_parameter, this could be defined as

```yaml
stream_slicer:
  type: "SubstreamSlicer"
  parent_streams_configs:
    - stream: "*ref(repositories_stream)"
      parent_key: "id"
      stream_slice_field: "repository"
      request_option:
        field_name: "repository"
        inject_into: "request_parameter"
```

REST APIs often nest sub-resources in the URL path.
If the URL to fetch commits was "/repositories/:id/commits", then the `Requester`'s path would need to refer to the stream slice's value and no `request_option` would be set:

```yaml
retriever:
  <...>
  requester:
    <...>
    path: "/repositories/{{ stream_slice.repository }}/commits"
  stream_slicer:
    type: "SubstreamSlicer"
    parent_streams_configs:
      - stream: "*ref(repositories_stream)"
        parent_key: "id"
        stream_slice_field: "repository"
```

[^1] This is a slight oversimplification. See [update cursor section](#cursor-update) for more details on how the cursor is updated.

## More readings

- [Incremental streams](../cdk-python/incremental-stream.md)
- [Stream slices](../cdk-python/stream-slices.md)
- [Substreams](./substreams.md)
47
docs/connector-development/config-based/substreams.md
Normal file
@@ -0,0 +1,47 @@
# Substreams

Substreams are streams that depend on the records of another stream.

We might for instance want to read all the commits for a given repository (parent stream).

## Substream slicer

Substreams are implemented by defining their stream slicer as a `SubstreamSlicer`.

For each stream, the slicer needs to know

- what the parent stream is
- what the key of the records in the parent stream is
- which field defines the stream slice representing the parent record
- how to specify that information on an outgoing HTTP request

Assuming the commits for a given repository can be read by specifying the repository as a request_parameter, this could be defined as

```yaml
stream_slicer:
  type: "SubstreamSlicer"
  parent_streams_configs:
    - stream: "*ref(repositories_stream)"
      parent_key: "id"
      stream_slice_field: "repository"
      request_option:
        field_name: "repository"
        inject_into: "request_parameter"
```

REST APIs often nest sub-resources in the URL path.
If the URL to fetch commits was "/repositories/:id/commits", then the `Requester`'s path would need to refer to the stream slice's value and no `request_option` would be set:

```yaml
retriever:
  <...>
  requester:
    <...>
    path: "/repositories/{{ stream_slice.repository }}/commits"
  stream_slicer:
    type: "SubstreamSlicer"
    parent_streams_configs:
      - stream: "*ref(repositories_stream)"
        parent_key: "id"
        stream_slice_field: "repository"
```
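
To show what the configuration above amounts to, here is a simplified Python sketch of turning parent records into stream slices and then into request paths. It is an illustration only: `substream_slices` and `render_path` are hypothetical helpers, the string replacement is a stand-in for the framework's templating, and the real slicer may carry more information in each slice.

```python
from typing import Any, Dict, Iterable, Iterator


def substream_slices(
    parent_records: Iterable[Dict[str, Any]],
    parent_key: str,
    stream_slice_field: str,
) -> Iterator[Dict[str, Any]]:
    """Emit one slice per parent record, keyed by stream_slice_field."""
    for record in parent_records:
        yield {stream_slice_field: record[parent_key]}


def render_path(template: str, stream_slice: Dict[str, Any]) -> str:
    """Naive stand-in for interpolating "{{ stream_slice.<field> }}" placeholders."""
    path = template
    for name, value in stream_slice.items():
        path = path.replace("{{ stream_slice." + name + " }}", str(value))
    return path


# Hypothetical parent records from a "repositories" stream.
repositories = [{"id": "airbyte", "stars": 1}, {"id": "cdk", "stars": 2}]

slices = list(
    substream_slices(repositories, parent_key="id", stream_slice_field="repository")
)
paths = [
    render_path("/repositories/{{ stream_slice.repository }}/commits", s)
    for s in slices
]
```

Each parent record yields exactly one slice, so the substream issues one set of requests per repository; whether the value ends up in the path or in a request parameter is only a matter of where it is injected.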
@@ -1,12 +1,12 @@
# Getting Started

:warning: This framework is in alpha stage. Support is not in production and is available only to select users. :warning:
:warning: This framework is in [alpha](https://docs.airbyte.com/project-overview/product-release-stages/#alpha). It is still in active development and may include backward-incompatible changes. Please share feedback and requests directly with us at feedback@airbyte.io :warning:

## Summary

Throughout this tutorial, we'll walk you through the creation of an Airbyte source to read and extract data from an HTTP API.

We'll build a connector reading data from the Exchange Rates API, but the steps will apply to other HTTP APIs you might be interested in integrating with.
We'll build a connector reading data from the Exchange Rates API, but the steps apply to other HTTP APIs you might be interested in integrating with.

The API documentation can be found [here](https://apilayer.com/marketplace/exchangerates_data-api).
In this tutorial, we will read data from the following endpoints:
@@ -30,7 +30,7 @@ The output schema of our stream will look like the following:

## Exchange Rates API Setup

Before we can get started, you'll need to generate an API access key for the Exchange Rates API.
Before we get started, you'll need to generate an API access key for the Exchange Rates API.
This can be done by signing up for the Free tier plan on [Exchange Rates API](https://exchangeratesapi.io/):

1. Visit https://exchangeratesapi.io and click "Get free API key" on the top right
