
📝 New documentation page: orchestrating Airbyte and Airbyte Cloud syncs with Kestra (#27695)

* Add documentation about Kestra Airbyte integration

* Update using-kestra-plugin.md

* Update using-kestra-plugin.md

* Update using-kestra-plugin.md
Anna Geller
2023-07-07 15:52:54 +02:00
committed by GitHub
parent af9b332589
commit d7dc6b52d2
5 changed files with 103 additions and 1 deletions


@@ -37,7 +37,7 @@ _Screenshot taken from [Airbyte Cloud](https://cloud.airbyte.com/signup)_.
* [Deploy Airbyte Open Source](https://docs.airbyte.com/quickstart/deploy-airbyte) or set up [Airbyte Cloud](https://docs.airbyte.com/cloud/getting-started-with-airbyte-cloud) to start centralizing your data.
* Create connectors in minutes with our [no-code Connector Builder](https://docs.airbyte.com/connector-development/connector-builder-ui/overview) or [low-code CDK](https://docs.airbyte.com/connector-development/config-based/low-code-cdk-overview).
* Explore popular use cases in our [tutorials](https://airbyte.com/tutorials).
-* Orchestrate Airbyte syncs with [Airflow](https://docs.airbyte.com/operator-guides/using-the-airflow-airbyte-operator), [Prefect](https://docs.airbyte.com/operator-guides/using-prefect-task), [Dagster](https://docs.airbyte.com/operator-guides/using-dagster-integration) or the [Airbyte API](https://reference.airbyte.com/reference/start).
+* Orchestrate Airbyte syncs with [Airflow](https://docs.airbyte.com/operator-guides/using-the-airflow-airbyte-operator), [Prefect](https://docs.airbyte.com/operator-guides/using-prefect-task), [Dagster](https://docs.airbyte.com/operator-guides/using-dagster-integration), [Kestra](https://docs.airbyte.com/operator-guides/using-kestra-plugin) or the [Airbyte API](https://reference.airbyte.com/reference/start).
* Easily transform loaded data with [SQL](https://docs.airbyte.com/operator-guides/transformation-and-normalization/transformations-with-sql) or [dbt](https://docs.airbyte.com/operator-guides/transformation-and-normalization/transformations-with-dbt).
Try it out yourself with our [demo app](https://demo.airbyte.io/), visit our [full documentation](https://docs.airbyte.com/) and learn more about [recent announcements](https://airbyte.com/blog-categories/company-updates). See our [registry](https://connectors.airbyte.com/files/generated_reports/connector_registry_report.html) for a full list of connectors already available in Airbyte or Airbyte Cloud.

Binary file not shown (size: 4.4 MiB).

Binary file not shown (size: 42 MiB).


@@ -0,0 +1,101 @@
---
description: Using the Kestra Plugin to Orchestrate Airbyte
---
# Using the Kestra Plugin
Kestra has an official plugin for Airbyte, including support for self-hosted Airbyte and Airbyte Cloud. This plugin allows you to trigger data replication jobs (`Syncs`) and wait for their completion before proceeding with any downstream tasks. Alternatively, you may also run those syncs in a fire-and-forget way by setting the `wait` argument to `false`.
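As a minimal sketch of the fire-and-forget mode described above, the flow below triggers one sync and moves on without waiting for it to finish. The flow `id`, task `id`, and the all-zero `connectionId` are placeholders, and the `url`, `username`, and `password` values simply mirror the demo flow later on this page; adjust them to your own setup.

```yaml
id: airbyteFireAndForget   # hypothetical flow id
namespace: dev

tasks:
  - id: trigger-sync       # hypothetical task id
    type: io.kestra.plugin.airbyte.connections.Sync
    url: http://host.docker.internal:8000/
    username: "{{envs.airbyte_username}}"
    password: "{{envs.airbyte_password}}"
    connectionId: 00000000-0000-0000-0000-000000000000  # placeholder, use your own connection ID
    wait: false            # do not block until the sync completes
```

With `wait: false`, the task only confirms that the sync was triggered, so any downstream task should not assume the data has already landed.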
After Airbyte tasks successfully ingest raw data, you can easily start running downstream data transformations with dbt, Python, SQL, Spark, and many more, using a variety of available plugins. Check the [plugin documentation](https://kestra.io/plugins/) for a list of all supported integrations.
## Available tasks
These are the two main tasks to orchestrate Airbyte syncs:
1) The `io.kestra.plugin.airbyte.connections.Sync` task will sync connections for a self-hosted Airbyte instance
2) The `io.kestra.plugin.airbyte.cloud.jobs.Sync` task will sync connections for Airbyte Cloud
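For comparison with the self-hosted task, an Airbyte Cloud sync might look roughly like the sketch below. Only the task type comes from this page. The `token` property, the `airbyte_cloud_token` environment variable name, and the placeholder `connectionId` are assumptions for illustration, so check the plugin documentation for the exact properties your plugin version supports.

```yaml
id: airbyteCloudSync       # hypothetical flow id
namespace: dev

tasks:
  - id: cloud-sync         # hypothetical task id
    type: io.kestra.plugin.airbyte.cloud.jobs.Sync
    # Assumption: the Cloud task authenticates with an API token
    # instead of the url/username/password used for self-hosted Airbyte.
    token: "{{envs.airbyte_cloud_token}}"
    connectionId: 00000000-0000-0000-0000-000000000000  # placeholder, use your own connection ID
```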
## 1. Set up the tools
First, make sure you have Docker installed. We'll be using the `docker compose` command, so your installation should include Docker Compose. When you use [Docker Desktop](https://docs.docker.com/compose/install/#scenario-one-install-docker-desktop), Docker Compose is already included.
### Start Airbyte
If this is your first time using Airbyte, we suggest following the [Quickstart Guide](https://github.com/airbytehq/airbyte/tree/e378d40236b6a34e1c1cb481c8952735ec687d88/docs/quickstart/getting-started.md). When creating Airbyte connections intended to be orchestrated with Kestra, set your Connection's **sync frequency** to **manual**. Kestra will then trigger Airbyte jobs in response to external events or on a schedule you provide.
### Install Kestra
If you haven't started Kestra yet, download [the Docker Compose file](https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml):
```bash
curl -o docker-compose.yml https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml
```
Then, run `docker compose up -d` and [navigate to the UI](http://localhost:8080/). You can start [building your first flows](https://kestra.io/docs/getting-started) using the integrated code editor in the UI.
![airbyte_kestra_CLI](../.gitbook/assets/airbyte_kestra_1.gif)
## 2. Create a flow from the UI
The Kestra UI provides a wide range of Blueprints to help you get started.
Navigate to Blueprints, then type "Airbyte" in the search bar to find the desired integration. Blueprints cover common data orchestration tasks, such as the following:
1. [Run a single Airbyte sync](https://demo.kestra.io/ui/blueprints/community/61) on a schedule
2. [Run multiple Airbyte syncs in parallel](https://demo.kestra.io/ui/blueprints/community/18)
3. [Run multiple Airbyte syncs in parallel, then clone a Git repository with dbt code and trigger dbt CLI commands](https://demo.kestra.io/ui/blueprints/community/30)
4. [Run a single Airbyte Cloud sync](https://demo.kestra.io/ui/blueprints/community/62) on a schedule
5. [Run multiple Airbyte Cloud syncs in parallel](https://demo.kestra.io/ui/blueprints/community/63)
6. [Run multiple Airbyte Cloud syncs in parallel, then clone a Git repository with dbt code and trigger dbt CLI commands](https://demo.kestra.io/ui/blueprints/community/64)
7. [Run multiple Airbyte Cloud syncs in parallel, then run a dbt Cloud job](https://demo.kestra.io/ui/blueprints/community/31)
Select a blueprint matching your use case and click "Use".
![airbyte_kestra_UI](../.gitbook/assets/airbyte_kestra_2.gif)
Then, within the editor, adjust the connection ID and task names and click "Save". Finally, trigger your flow.
## 3. Simple demo
Here is an example flow that triggers multiple Airbyte connections in parallel to sync data for multiple **Pokémon**.
```yaml
id: airbyteSyncs
namespace: dev
description: Gotta catch em all!

tasks:
  - id: data-ingestion
    type: io.kestra.core.tasks.flows.Parallel
    tasks:
      - id: charizard
        type: io.kestra.plugin.airbyte.connections.Sync
        connectionId: 9bb96539-73e7-4b9a-9937-6ce861b49cb9
      - id: pikachu
        type: io.kestra.plugin.airbyte.connections.Sync
        connectionId: 39c38950-b0b9-4fce-a303-06ced3dbfa75
      - id: psyduck
        type: io.kestra.plugin.airbyte.connections.Sync
        connectionId: 4de8ab1e-50ef-4df0-aa01-7f21491081f1

taskDefaults:
  - type: io.kestra.plugin.airbyte.connections.Sync
    values:
      url: http://host.docker.internal:8000/
      username: "{{envs.airbyte_username}}"
      password: "{{envs.airbyte_password}}"

triggers:
  - id: everyMinute
    type: io.kestra.core.models.triggers.types.Schedule
    cron: "*/1 * * * *"
```
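To illustrate how downstream work can follow the syncs, here is a hedged variation of the flow above. It assumes that tasks listed at the top level of a flow run one after another, so the second task starts only once the parallel group has finished. The `io.kestra.core.tasks.log.Log` task is used purely as a stand-in for a real transformation step (dbt, Python, SQL, and so on); the flow `id` and the `downstream` task are hypothetical.

```yaml
id: airbyteSyncsThenTransform   # hypothetical flow id
namespace: dev

tasks:
  # Step 1: run the Airbyte syncs in parallel and wait for all of them
  - id: data-ingestion
    type: io.kestra.core.tasks.flows.Parallel
    tasks:
      - id: charizard
        type: io.kestra.plugin.airbyte.connections.Sync
        connectionId: 9bb96539-73e7-4b9a-9937-6ce861b49cb9
      - id: pikachu
        type: io.kestra.plugin.airbyte.connections.Sync
        connectionId: 39c38950-b0b9-4fce-a303-06ced3dbfa75

  # Step 2: placeholder for a downstream transformation task
  - id: downstream
    type: io.kestra.core.tasks.log.Log
    message: Airbyte syncs finished, downstream transformations can start here

taskDefaults:
  - type: io.kestra.plugin.airbyte.connections.Sync
    values:
      url: http://host.docker.internal:8000/
      username: "{{envs.airbyte_username}}"
      password: "{{envs.airbyte_password}}"
```

This ordering relies on the syncs waiting for completion; with `wait` set to `false`, the downstream task could start before the data has been replicated.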
## Next steps
If you liked that demo, check out [the blog post](https://airbyte.com/blog/everything-as-code-for-data-infrastructure-with-airbyte-and-kestra-terraform-providers) about using Airbyte and Kestra Terraform providers together to manage Everything as Code.
If you encounter anything unexpected while reproducing this tutorial, you can open [a GitHub issue](https://github.com/kestra-io/kestra) or [ask in the Kestra Community Slack](https://kestra.io/slack). Lastly, give Kestra [a GitHub star](https://github.com/kestra-io/kestra) if you like the project.


@@ -358,6 +358,7 @@ const operatorGuide = {
'operator-guides/using-the-airflow-airbyte-operator',
'operator-guides/using-prefect-task',
'operator-guides/using-dagster-integration',
'operator-guides/using-kestra-plugin',
'operator-guides/locating-files-local-destination',
'operator-guides/collecting-metrics',
{