mirror of synced 2025-12-25 02:09:19 -05:00

docs: Generate version 2.0 of platform documentation (#68102)

## What

Generates version 2.0 of the Airbyte platform documentation using
Docusaurus's built-in versioning system. This creates a frozen snapshot
of the current documentation that users can reference.

Requested by ian.alton@airbyte.io via [Slack
thread](https://airbytehq-team.slack.com/archives/D08FX8EC9L0/p1760490197805979?thread_ts=1760490197.805979).

Link to Devin run:
https://app.devin.ai/sessions/689693593bac44f4903f476aa17b872e

## How

- Ran `pnpm run docusaurus docs:version:platform 2.0` in the docusaurus
directory
- This automatically:
  - Created `platform_versioned_docs/version-2.0/` containing a snapshot
    of all current platform docs
  - Created `platform_versioned_sidebars/version-2.0-sidebars.json` with
    the sidebar navigation structure
  - Updated `platform_versions.json` to add "2.0" to the version list
- Ran prettier to format the JSON files
- Verified the documentation builds successfully locally (build
completed in ~3 minutes with only pre-existing broken anchor warnings)

## Review guide

1. **Verify timing**: Confirm this is the correct time to release
version 2.0 of the documentation
2. **Version order**: Check `docusaurus/platform_versions.json` - verify
"2.0" is first in the array (newest version first)
3. **Build verification**: Ensure CI/Vercel builds pass without errors
4. **Spot check**: Optionally review 2-3 files in
`docusaurus/platform_versioned_docs/version-2.0/` to ensure content
looks reasonable
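
The version-order check in step 2 can be scripted. The following is a minimal sketch, assuming `platform_versions.json` is a flat JSON array of version strings, which is how Docusaurus writes its versions files. The helper name `is_newest_first` and the sample array are illustrative and not taken from the repository.

```python
import json


def is_newest_first(versions_json: str, expected: str) -> bool:
    """Return True if the first entry of the versions array equals `expected`."""
    versions = json.loads(versions_json)
    return isinstance(versions, list) and len(versions) > 0 and versions[0] == expected


# Hypothetical file content; the real file is docusaurus/platform_versions.json.
sample = '["2.0", "1.8", "1.7", "1.6"]'
print(is_newest_first(sample, "2.0"))
```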

Note: This is a standard Docusaurus versioning operation that creates a
frozen snapshot of the current "next" documentation. The generated files
are extensive (500+ files) but follow Docusaurus conventions.

## User Impact

Users will see version 2.0 available in the version dropdown on
docs.airbyte.com. This provides a stable reference point for platform
documentation at this point in time. Existing versions (1.6, 1.7, 1.8)
remain unchanged.

## Can this PR be safely reverted and rolled back?

- [x] YES 💚

This is an additive change that doesn't modify existing versioned docs.
Reverting would simply remove version 2.0 from the version list and
delete the associated documentation files.

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: ian.alton@airbyte.io <ian.alton@airbyte.io>
devin-ai-integration[bot]
2025-10-14 18:27:24 -07:00
committed by GitHub
parent 0aa1802cae
commit 34282926a1
472 changed files with 32707 additions and 1 deletions

@@ -0,0 +1,105 @@
---
products: all
---
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";
import DocCardList from '@theme/DocCardList';
# Add a connection
After you add a [source](../using-airbyte/getting-started/add-a-source) and a [destination](../using-airbyte/getting-started/add-a-destination), add a new connection to start syncing data from your source to your destination.
## ELT and data activation destinations are different
You configure all connections in a similar way. However, the exact process is different for [data activation](elt-data-activation) destinations. This page explains how to set up both.
- **ELT databases, warehouses, and lakes**: These destinations are more agnostic about schema. You can, for example, create tables with any number of columns and change those column types if you need to. This is typical of data warehouses and data lakes, and allows you more freedom to determine the structure of your data.
- **Data activation**: These destinations have stricter schemas that your connection must adhere to. For example, if you're syncing data to Salesforce, your Salesforce records have pre-existing fields for name, email, company, phone number, revenue, etc. Some of those fields may be optional, some may be required, and all expect data in a certain format. This means you need to map data from your source to your destination to ensure it arrives in the necessary format and structure. Typically, you have less freedom over the structure of data in these destinations.
Each destination's requirements explain why setting up some connections is different from setting up others.
## Create an ELT connection
Follow these steps to create a connection to a database, warehouse, lake, or similar type of destination.
1. Click **Connections** in the navigation.
2. Click **New connection**.
3. Click the source you want to use. If you don't have one yet, you can [add one](../using-airbyte/getting-started/add-a-source).
4. Click the destination you want to use. If you don't have one yet, you can [add one](../using-airbyte/getting-started/add-a-destination). Wait a moment while Airbyte fetches the schema of your data.
5. Under **Select sync mode**, select your sync mode. You can either replicate sources, which maintains an up-to-date copy of your source data in the destination, or you can append historical changes, allowing you to track changes to your data over time in your destination. Airbyte automatically selects the most appropriate sync mode for each stream based on this selection. However, you can update specific streams later.
6. In the **Schema** table, configure your schema.
    1. Choose the specific streams and fields you want to sync. You may have data you don't want to sync. For example, you might consider that data irrelevant or it might be subject to security or compliance rules. If you're unsure, you can always add it later. For help, see [Configuring schemas](../using-airbyte/configuring-schema).
    2. Choose the sync mode for each stream. For help, see [sync modes](/platform/using-airbyte/core-concepts/sync-modes/).
    3. Set primary keys for each stream, if applicable. If possible, Airbyte sets this for you automatically.
7. Click **Next**.
8. Under **Configure connection**, finalize the settings for this connection.
| Option | Description |
| ------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------- |
| Connection name | Give your connection a more meaningful name if the default `Source -> Destination` isn't sufficient. |
| [Tags](../using-airbyte/tagging) | Create and apply tags to organize your connections. |
| [Schedule type](../using-airbyte/core-concepts/sync-schedules) | Choose if and how you want to schedule this connection. |
| [Destination namespace](../using-airbyte/core-concepts/namespaces) | The location in the destination where you want Airbyte to write this data. |
| [Stream prefix](../using-airbyte/configuring-schema) | Optionally, prefix each stream name with a unique value. |
| [Advanced settings](../using-airbyte/schema-change-management) | In most cases, you don't need to change these. |
9. Click **Set up connection**. Airbyte takes you to the page for that connection, where you can manage it and initiate syncs.
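
The difference between the two choices in step 5 can be pictured with a small sketch. This isn't Airbyte's implementation. It only illustrates the idea, assuming each record has an `id` primary key; the function names `replicate` and `append_history` are hypothetical.

```python
def replicate(destination: dict, changes: list, key: str = "id") -> dict:
    """Keep one current row per primary key: an up-to-date copy of the source."""
    current = dict(destination)
    for row in changes:
        current[row[key]] = row  # a newer version replaces the older one
    return current


def append_history(destination: list, changes: list) -> list:
    """Keep every version of every row, so changes stay visible over time."""
    return destination + changes


# Two versions of the same source record, in the order they were read.
changes = [
    {"id": 1, "plan": "free"},
    {"id": 1, "plan": "pro"},
]

print(replicate({}, changes))       # one row for id 1, holding the latest values
print(append_history([], changes))  # two rows for id 1, one per change
```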
## Create a data activation connection
Follow these steps to create a connection to a data activation destination.
### Step 1: Choose your connectors
1. Click **Connections** in the navigation.
2. Click **New connection**.
3. Click the source you want to use. If you don't have one yet, you can [add one](../using-airbyte/getting-started/add-a-source).
4. Click the destination you want to use. If you don't have one yet, you can [add one](../using-airbyte/getting-started/add-a-destination). Wait a moment while Airbyte fetches the schema of your data. When it's done, Airbyte asks you to set up mappings.
### Step 2: Set up mappings, sync modes, and insertion methods
Data activation destinations have stricter schemas that your connection must adhere to. Some destination fields may be optional, some may be required, and all expect data in a certain format. This means you need to map data from your source to your destination to ensure it arrives in the necessary format and structure.
1. Select the first field you want to map from your source.
2. Select the sync mode and, if necessary, the cursor. For help, see [sync modes](/platform/using-airbyte/core-concepts/sync-modes/).
3. Select the destination field to which you want to map that field.
4. Select the insertion type. The insertion options available depend on the capabilities of the destination and its connector. Depending on your selection, Airbyte may require you to map other fields as well. In this case, it auto-populates those fields in the destination column.
5. Continue mapping source fields to destination fields until you've mapped all required fields, plus any optional fields you want to include.
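
Conceptually, the mappings you build in these steps pair each source field with a destination field. The sketch below is illustrative only, not how Airbyte stores or applies mappings. The field names and the helpers `unmapped_required` and `apply_mappings` are hypothetical.

```python
def unmapped_required(mappings: dict, required: set) -> list:
    """Return required destination fields that no source field maps to yet."""
    return sorted(required - set(mappings.values()))


def apply_mappings(record: dict, mappings: dict) -> dict:
    """Rename mapped source fields to their destination names and drop the rest."""
    return {dest: record[src] for src, dest in mappings.items() if src in record}


# Hypothetical mapping of source column -> destination field.
mappings = {"email_address": "Email", "family_name": "LastName"}
required = {"Email", "LastName", "Company"}

print(unmapped_required(mappings, required))  # ['Company']

row = {"email_address": "a@example.com", "family_name": "Mariyam", "internal_note": "x"}
print(apply_mappings(row, mappings))
```

In this example, `Company` is still unmapped, which mirrors step 5: keep mapping until every required destination field has a source field.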
### Step 3: Configure your connection
The final step is to configure details of your connection.
| Option | Description |
| ------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------- |
| Connection name | Give your connection a more meaningful name if the default `Source -> Destination` isn't sufficient. |
| [Tags](../using-airbyte/tagging) | Create and apply tags to organize your connections. |
| [Schedule type](../using-airbyte/core-concepts/sync-schedules) | Choose if and how you want to schedule this connection. |
| [Destination namespace](../using-airbyte/core-concepts/namespaces) | The location in the destination where you want Airbyte to write this data. |
| [Stream prefix](../using-airbyte/configuring-schema) | Optionally, prefix each stream name with a unique value. |
| [Advanced settings](../using-airbyte/schema-change-management) | In most cases, you don't need to change these. |
When you're ready to create the connection, click **Set up connection**. Airbyte takes you to the page for that connection, where you can manage it and initiate syncs.
### Step 4: Review and manage rejected records {#da-rejected-records}
Destinations may reject records. See [Rejected records](rejected-records) to learn more.

@@ -0,0 +1,93 @@
---
products: all
---
# Data activation (reverse ETL)
Data activation enables you to move data out of your data warehouse and into the operational tools where work happens, like CRMs, marketing platforms, and support systems. With this capability, you can deliver modeled data directly to points of action and systems people already use, helping your organization respond faster and more effectively.
This page introduces the concept of data activation, outlines how it works within the Airbyte platform, and describes common use cases.
![Conceptual diagram showing data moving from a source, fields being mapped, and then moving to a destination](assets/data-activation-concept.png)
## What's data activation?
Data activation operationalizes data by syncing it from storage systems, typically data warehouses, into the tools that business teams use daily. These tools include platforms like Salesforce, HubSpot, Marketo, Zendesk, and others.
Instead of limiting insights to dashboards and reports, data activation enables data to directly power workflows and decisions in real time in the places people need it.
The terms "data activation" and "reverse ETL" are sometimes used interchangeably, even if there is nuance in their meaning. Airbyte prefers the term data activation as a blanket term. It reflects the goal of any reverse ETL pipeline: to _activate your data_ by giving it to the people who need it in the places where it has the greatest impact on their work.
### Key characteristics
- **Warehouse-to-app sync**: Transfer data from warehouses (for example, Snowflake, BigQuery, or Redshift) to operational destinations like Salesforce or Customer.io.
- **Reverse ETL**: A method used in data activation to extract, transform, and load data from warehouses into SaaS tools.
- **Declarative mapping**: You define how data fields map and transform between the warehouse and the destination.
- **Broad application**: Data activation supports a range of business functions, including go-to-market operations, customer success, finance, and support.
## Why data activation is useful in Airbyte
Data activation complements Airbyte's existing data movement capabilities by enabling outbound syncs from your warehouse into operational tools. It expands the value of centralized data by delivering insights to where the action is.
### Benefits
- **Improved decision-making**: Sales teams can access lead scores directly within CRM platforms.
- **Personalized marketing**: Marketing teams can target users based on product usage and engagement.
- **Context-aware support**: Support teams can prioritize tickets using customer health metrics synced into their service tools.
This process turns your data warehouse into a central intelligence hub and ensures insights reach the systems—and people—who need them.
## How data activation works in Airbyte
Data activation works like any other sync, by moving data from a source to a destination. The process typically involves three stages:
1. **Ingestion**: Sync data from your sources to your data warehouse destination using Airbyte's connectors.
2. **Transformation**: Model and prepare your data using tools like dbt or SQL.
3. **Activation**: Sync that modeled data to operational tools using Airbyte's connectors and declarative mappings.
## Use cases
Data activation aligns with the shift toward operational analytics in modern data architectures. As organizations consolidate their data into warehouses, there is increasing demand for that data to inform business decisions beyond dashboards and reports.
Teams in sales, marketing, support, and finance often rely on operational systems that are disconnected from your data warehouse. Data activation bridges this gap, replacing manual exports, ad hoc pipelines, or no data at all with automated, governed workflows.
### Example: Revenue operations
- **User**: Revenue Operations Manager.
- **Objective**: Help sales reps prioritize high-intent accounts.
- **Challenge**: Usage metrics exist in Snowflake, but sales reps work in Salesforce.
- **Solution**: Use Airbyte to sync product usage scores from your data warehouse to custom fields in Salesforce.
- **Result**: Reps can view up-to-date engagement scores directly in their CRM and prioritize outreach accordingly.
### Additional use cases
| Use Case | Description |
| ---------------------- | ----------------------------------------------------------------- |
| Marketing Automation | Sync audience segments to HubSpot or Braze for targeted campaigns |
| Customer 360 | Push enriched customer profiles into CRMs for better visibility |
| Support Triage | Deliver customer health scores to Zendesk for prioritization |
| Finance Reconciliation | Notify finance teams via Slack when you detect billing anomalies |
## Get started
To start activating your data with Airbyte, see the following topics.
- [Set up a source](../using-airbyte/getting-started/add-a-source): The data warehouse or other source you're syncing data from.
- [Set up a destination](../using-airbyte/getting-started/add-a-destination): The CRM, marketing platform, or support system you're syncing data to.
- [Set up a connection](add-connection): Learn how to create a connection to a data activation destination and map fields from your source to your destination.
More resources:
- [All Airbyte connectors](/integrations)
- [dbt Core](https://www.getdbt.com/)

@@ -0,0 +1,13 @@
---
products: all
---
import DocCardList from '@theme/DocCardList';
# Moving data
In Airbyte, you move data from **sources** to **destinations** using **connectors**. These connectors form **connections**. Connections contain **streams** of data and they're responsible for **syncing** your data.
This section explains these critical concepts and shows you how to move data in Airbyte.
<DocCardList />

@@ -0,0 +1,74 @@
---
products: oss-enterprise, cloud-teams
---
# Rejected records
When syncing data to a [data activation destination](elt-data-activation), you may encounter rejected records. Rejected records are records Airbyte was unable to sync to your destination, even though the sync itself was otherwise successful.
## Why records get rejected
Records become rejected because they don't conform to the schema of the destination. The underlying reasons for this can be complex.
- The destination requires a field, but that field is empty in the source.
- The destination requires a field to be in a certain format, but that field is in an incompatible format in the source.
- The destination requires a field to be unique, but that field isn't unique in the source.
- A transformation error has corrupted a record at an earlier stage of your data pipeline.
- Many other issues.
Look at the following example.
| ID | First Name | Last Name | Phone Number | Address |
| --- | ---------------- | --------- | ------------ | --------------- |
| 123 | Alphonso | Mariyam | 123-456-7890 | 123 Fake Street |
| 456 | Emerald | Sanja | 234-567-8901 | 456 Fake Street |
| 789 | Sebastian Argyos | | 345-678-9012 | 789 Fake Street |
Imagine you want to move this data into your CRM, Salesforce. However, your Salesforce object requires that everyone has a first and last name. In this case, Sebastian Argyos' last name has been combined with his first name. From Salesforce's perspective, he doesn't have a last name. As a result, it rejects this record.
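
To catch problems like this before a sync, you could check your source data for required fields yourself. The following is a minimal sketch of that idea using the rows from the table above. The field names, the `REQUIRED_FIELDS` tuple, and the `split_rejected` helper are assumptions for illustration; a real destination applies its own rules.

```python
# Destination fields assumed to be required, as in the Salesforce example above.
REQUIRED_FIELDS = ("first_name", "last_name")

records = [
    {"id": 123, "first_name": "Alphonso", "last_name": "Mariyam"},
    {"id": 456, "first_name": "Emerald", "last_name": "Sanja"},
    {"id": 789, "first_name": "Sebastian Argyos", "last_name": ""},
]


def split_rejected(rows, required=REQUIRED_FIELDS):
    """Separate rows that have every required field from rows missing one."""
    accepted, rejected = [], []
    for row in rows:
        missing = [field for field in required if not row.get(field)]
        (rejected if missing else accepted).append(row)
    return accepted, rejected


accepted, rejected = split_rejected(records)
print([row["id"] for row in rejected])  # [789]
```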
## Where rejected records go
Rejected records go into an S3 bucket, if you've configured one. You configure this bucket when you set up your destination. If you haven't configured one yet, you can do this later on, and rejected records begin to populate with subsequent syncs.
You should decide on a strategy for managing these records at scale. You might want to populate all of them to a single bucket for ease of observability, or you may want different destinations to use different buckets.
## Find out if the destination rejected records after a sync
Airbyte shows you rejected records on the connection's Timeline page and in the sync summary in the log for each sync.
If you've configured a storage bucket for rejected records, Airbyte links to it on the Timeline.
![Screenshot of rejected records in the connection Timeline](assets/rejected-records.png)
You can also monitor logs for them.
```json title="snowflake_salesforce_logs_12345_txt.txt"
Sync summary: {
  // ...
  "totalStats" : {
    // ...
    // highlight-next-line
    "recordsRejected" : 1000
  },
  "streamStats" : [ {
    "streamName" : "USERS",
    "streamNamespace" : "DATA_PRODUCT",
    "stats" : {
      // ...
      // highlight-next-line
      "recordsRejected" : 1000
    }
  } ],
  "performanceMetrics" : {
    "mappers" : {
      "field-renaming" : 0
    }
  }
}
```
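
If you collect these logs, you could summarize rejections per stream with a short script. This is a sketch only. It assumes you have already extracted the JSON object that follows `Sync summary:` and that it parses as plain JSON. The field names come from the log excerpt above, and the helper `rejected_by_stream` is hypothetical.

```python
import json

# A trimmed, valid-JSON version of the sync summary shown above.
summary_text = """
{
  "totalStats": {"recordsRejected": 1000},
  "streamStats": [
    {
      "streamName": "USERS",
      "streamNamespace": "DATA_PRODUCT",
      "stats": {"recordsRejected": 1000}
    }
  ]
}
"""


def rejected_by_stream(summary: dict) -> dict:
    """Map 'namespace.stream' to its rejected record count (0 if absent)."""
    result = {}
    for stream in summary.get("streamStats", []):
        name = f'{stream.get("streamNamespace")}.{stream.get("streamName")}'
        result[name] = stream.get("stats", {}).get("recordsRejected", 0)
    return result


summary = json.loads(summary_text)
print(rejected_by_stream(summary))  # {'DATA_PRODUCT.USERS': 1000}
```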
## Fixing rejected records so Airbyte can sync them
In most cases, it's important to repair rejected records if you can. They may contain valuable data that you want to sync and, in large numbers, they can erode the effectiveness of your data activation initiative.
You can repair rejected records in your source data warehouse or the upstream source that syncs to your data warehouse. Once you repair them, Airbyte can process them again during your next sync.

@@ -0,0 +1,31 @@
---
products: all
---
import DocCardList from '@theme/DocCardList';
# Sources, destinations, and connectors
In Airbyte, you move data from **sources** to **destinations** using **connectors**.
## Sources and destinations
A **source** is the database, API, or other system **from** which you sync data. A **destination** is the data warehouse, data lake, database, or other system **to** which you sync data.
## Connectors
A **connector** is the component Airbyte uses to connect to and interact with your source or destination. Connectors are a key difference between a resilient Airbyte deployment and a more brittle in-house data pipeline.
Airbyte has two types of connectors: source connectors and destination connectors. Sometimes, people abbreviate these to "sources" and "destinations." That can be a little confusing, so to be clear, a connector isn't the same thing as the source or destination it connects to, but it's closely related to them.
Connectors have [different support levels](/integrations/connector-support-levels). Some are built and maintained by Airbyte and some are contributed by members of Airbyte's community.
### Connectors are open source
Airbyte provides over 600 connectors, almost all of which are open source. You can contribute to connectors to make them better or keep them up to date as third parties make changes, or fork them to make them more suitable to your particular needs.
If you don't see the connector you need, you can build one from scratch. Airbyte provides a no-code and low-code [Connector Builder](../connector-development/connector-builder-ui/overview). For advanced use cases, you can use Connector Development Kits (CDKs), which are more traditional software development tools.
## Add and manage sources and destinations
<DocCardList />