📝 New documentation page: orchestrating Airbyte and Airbyte Cloud syncs with Kestra (#27695)
* Add documentation about Kestra Airbyte integration
* Update using-kestra-plugin.md
* Update using-kestra-plugin.md
* Update using-kestra-plugin.md
@@ -37,7 +37,7 @@ _Screenshot taken from [Airbyte Cloud](https://cloud.airbyte.com/signup)_.
* [Deploy Airbyte Open Source](https://docs.airbyte.com/quickstart/deploy-airbyte) or set up [Airbyte Cloud](https://docs.airbyte.com/cloud/getting-started-with-airbyte-cloud) to start centralizing your data.
* Create connectors in minutes with our [no-code Connector Builder](https://docs.airbyte.com/connector-development/connector-builder-ui/overview) or [low-code CDK](https://docs.airbyte.com/connector-development/config-based/low-code-cdk-overview).
* Explore popular use cases in our [tutorials](https://airbyte.com/tutorials).
-* Orchestrate Airbyte syncs with [Airflow](https://docs.airbyte.com/operator-guides/using-the-airflow-airbyte-operator), [Prefect](https://docs.airbyte.com/operator-guides/using-prefect-task), [Dagster](https://docs.airbyte.com/operator-guides/using-dagster-integration) or the [Airbyte API](https://reference.airbyte.com/reference/start).
+* Orchestrate Airbyte syncs with [Airflow](https://docs.airbyte.com/operator-guides/using-the-airflow-airbyte-operator), [Prefect](https://docs.airbyte.com/operator-guides/using-prefect-task), [Dagster](https://docs.airbyte.com/operator-guides/using-dagster-integration), [Kestra](https://docs.airbyte.com/operator-guides/using-kestra-plugin) or the [Airbyte API](https://reference.airbyte.com/reference/start).
* Easily transform loaded data with [SQL](https://docs.airbyte.com/operator-guides/transformation-and-normalization/transformations-with-sql) or [dbt](https://docs.airbyte.com/operator-guides/transformation-and-normalization/transformations-with-dbt).
Try it out yourself with our [demo app](https://demo.airbyte.io/), visit our [full documentation](https://docs.airbyte.com/) and learn more about [recent announcements](https://airbyte.com/blog-categories/company-updates). See our [registry](https://connectors.airbyte.com/files/generated_reports/connector_registry_report.html) for a full list of connectors already available in Airbyte or Airbyte Cloud.
BIN docs/.gitbook/assets/airbyte_kestra_1.gif (new file, 4.4 MiB; binary file not shown)
BIN docs/.gitbook/assets/airbyte_kestra_2.gif (new file, 42 MiB; binary file not shown)
docs/operator-guides/using-kestra-plugin.md (new file, 101 lines)
@@ -0,0 +1,101 @@
---
description: Using the Kestra Plugin to Orchestrate Airbyte
---
# Using the Kestra Plugin
Kestra has an official plugin for Airbyte, including support for self-hosted Airbyte and Airbyte Cloud. This plugin allows you to trigger data replication jobs (`Syncs`) and wait for their completion before proceeding with any downstream tasks. Alternatively, you may also run those syncs in a fire-and-forget way by setting the `wait` argument to `false`.
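As a sketch, a fire-and-forget sync against a self-hosted instance could look like the flow below; the `url` and `connectionId` values are placeholders you would replace with your own:

```yaml
id: airbyteFireAndForget
namespace: dev

tasks:
  - id: trigger-sync
    type: io.kestra.plugin.airbyte.connections.Sync
    url: http://localhost:8000/          # placeholder: your Airbyte instance URL
    connectionId: <your-connection-id>   # placeholder: copy from the Airbyte UI
    wait: false                          # trigger the job and continue without waiting
```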
After Airbyte tasks successfully ingest raw data, you can easily start running downstream data transformations with dbt, Python, SQL, Spark, and many more, using a variety of available plugins. Check the [plugin documentation](https://kestra.io/plugins/) for a list of all supported integrations.
## Available tasks
These are the two main tasks to orchestrate Airbyte syncs:
1) The `io.kestra.plugin.airbyte.connections.Sync` task will sync connections for a self-hosted Airbyte instance
2) The `io.kestra.plugin.airbyte.cloud.jobs.Sync` task will sync connections for Airbyte Cloud
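The Cloud task is configured similarly. The minimal sketch below assumes the task authenticates with an Airbyte Cloud API token; the `token` property name and the `secret()` lookup are assumptions to verify against the plugin documentation, and the `connectionId` is a placeholder:

```yaml
id: airbyteCloudSync
namespace: dev

tasks:
  - id: cloud-sync
    type: io.kestra.plugin.airbyte.cloud.jobs.Sync
    token: "{{ secret('AIRBYTE_API_TOKEN') }}"   # assumed property; store your API token as a secret
    connectionId: <your-cloud-connection-id>     # placeholder: copy from Airbyte Cloud
```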
## **1. Set up the tools**
First, make sure you have Docker installed. We'll be using Docker Compose, so your installation should include the `docker compose` command. If you use [Docker Desktop](https://docs.docker.com/compose/install/#scenario-one-install-docker-desktop), Docker Compose is already included.
### Start Airbyte
If this is your first time using Airbyte, we suggest following the [Quickstart Guide](https://github.com/airbytehq/airbyte/tree/e378d40236b6a34e1c1cb481c8952735ec687d88/docs/quickstart/getting-started.md). When creating Airbyte connections intended to be orchestrated with Kestra, set your Connection's **sync frequency** to **manual**. Kestra will automate triggering Airbyte jobs in response to external events or based on a schedule you’ll provide.
### Install Kestra
If you haven’t started Kestra yet, download [the Docker Compose file](https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml):
```bash
curl -o docker-compose.yml https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml
```
Then, run `docker compose up -d` and [navigate to the UI](http://localhost:8080/). You can start [building your first flows](https://kestra.io/docs/getting-started) using the integrated code editor in the UI.

## 2. Create a flow from the UI
The Kestra UI provides a wide range of Blueprints to help you get started.
Navigate to Blueprints. Then type "Airbyte" in the search bar to find the desired integration. This way, you can easily accomplish fairly standardized data orchestration tasks, such as the following:
1. [Run a single Airbyte sync](https://demo.kestra.io/ui/blueprints/community/61) on a schedule
2. [Run multiple Airbyte syncs in parallel](https://demo.kestra.io/ui/blueprints/community/18)
3. [Run multiple Airbyte syncs in parallel, then clone a Git repository with dbt code and trigger dbt CLI commands](https://demo.kestra.io/ui/blueprints/community/30)
4. [Run a single Airbyte Cloud sync](https://demo.kestra.io/ui/blueprints/community/62) on a schedule
5. [Run multiple Airbyte Cloud syncs in parallel](https://demo.kestra.io/ui/blueprints/community/63)
6. [Run multiple Airbyte Cloud syncs in parallel, then clone a Git repository with dbt code and trigger dbt CLI commands](https://demo.kestra.io/ui/blueprints/community/64)
7. [Run multiple Airbyte Cloud syncs in parallel, then run a dbt Cloud job](https://demo.kestra.io/ui/blueprints/community/31)
Select a blueprint matching your use case and click "Use".

Then, within the editor, adjust the connection ID and task names and click "Save". Finally, trigger your flow.
## 3. Simple demo
Here is an example flow that triggers multiple Airbyte connections in parallel to sync data for multiple **Pokémon**.
```yaml
id: airbyteSyncs
namespace: dev
description: Gotta catch ’em all!

tasks:
  - id: data-ingestion
    type: io.kestra.core.tasks.flows.Parallel
    tasks:
      - id: charizard
        type: io.kestra.plugin.airbyte.connections.Sync
        connectionId: 9bb96539-73e7-4b9a-9937-6ce861b49cb9
      - id: pikachu
        type: io.kestra.plugin.airbyte.connections.Sync
        connectionId: 39c38950-b0b9-4fce-a303-06ced3dbfa75
      - id: psyduck
        type: io.kestra.plugin.airbyte.connections.Sync
        connectionId: 4de8ab1e-50ef-4df0-aa01-7f21491081f1

taskDefaults:
  - type: io.kestra.plugin.airbyte.connections.Sync
    values:
      url: http://host.docker.internal:8000/
      username: "{{envs.airbyte_username}}"
      password: "{{envs.airbyte_password}}"

triggers:
  - id: everyMinute
    type: io.kestra.core.models.triggers.types.Schedule
    cron: "*/1 * * * *"
```
## Next steps
If you liked that demo, check out [the blog post](https://airbyte.com/blog/everything-as-code-for-data-infrastructure-with-airbyte-and-kestra-terraform-providers) about using Airbyte and Kestra Terraform providers together to manage Everything as Code.
If you encounter anything unexpected while reproducing this tutorial, you can open [a GitHub issue](https://github.com/kestra-io/kestra) or [ask via Kestra Community Slack](https://kestra.io/slack). Lastly, give Kestra [a GitHub star](https://github.com/kestra-io/kestra) if you like the project.
@@ -358,6 +358,7 @@ const operatorGuide = {
       'operator-guides/using-the-airflow-airbyte-operator',
       'operator-guides/using-prefect-task',
       'operator-guides/using-dagster-integration',
+      'operator-guides/using-kestra-plugin',
       'operator-guides/locating-files-local-destination',
       'operator-guides/collecting-metrics',
       {