jprdonnelly/airbyte

Fork 0

mirror of synced 2025-12-19 18:14:56 -05:00

Files

Benoit Moriceau 894fd3a136 Bump version (#70895 )

2025-12-12 10:25:44 -08:00

31 KiB

Raw Permalink Blame History

Postgres

This page guides you through the process of setting up the Postgres destination connector.

:::info Direct Load

Starting with version 3.0.0, the Postgres destination uses Direct Load architecture. This means data is written directly to final tables without using intermediate raw tables, providing improved performance and reduced storage costs.

For migration details and backward compatibility options, see the Postgres Migration Guide.

:::

Warning

:::warning

Postgres, while an excellent relational database, is not a data warehouse. Please only consider using postgres as a destination for small data volumes (e.g. less than 10GB) or for testing purposes. For larger data volumes, we recommend using a data warehouse like BigQuery, Snowflake, or Redshift. Learn more here.

:::

Prerequisites

To use the Postgres destination, you'll need:

A Postgres server version 9.5 or above

Airbyte Cloud only supports connecting to your Postgres instances with SSL or TLS encryption. TLS is used by default. Other than that, you can proceed with the open-source instructions below.

You'll need the following information to configure the Postgres destination:

Host - The host name of the server.
Port - The port number the server is listening on. Defaults to the PostgreSQL™ standard port number (5432).
Username
Password
Default Schema Name - Specify the schema (or several schemas separated by commas) to be set in the search-path. These schemas will be used to resolve unqualified object names used in statements executed over this connection.
Database - The database name. The default is to connect to a database with the same name as the user name.
JDBC URL Params (optional)

Refer to this guide for more details

Configure Network Access

Make sure your Postgres database can be accessed by Airbyte. If your database is within a VPC, you may need to allow access from the IP you're using to expose Airbyte.

Step 1: Set up Postgres

Permissions

You need a Postgres user with the following permissions:

can create tables and write rows.
can create schemas e.g:

You can create such a user by running:

CREATE USER airbyte_user WITH PASSWORD '<password>';
GRANT CREATE, TEMPORARY ON DATABASE <database> TO airbyte_user;

You can also use a pre-existing user but we highly recommend creating a dedicated user for Airbyte.

Step 2: Set up the Postgres connector in Airbyte

Target Database

You will need to choose an existing database or create a new database that will be used to store synced data from Airbyte.

Naming Conventions

From Postgres SQL Identifiers syntax:

SQL identifiers and key words must begin with a letter (a-z, but also letters with diacritical marks and non-Latin letters) or an underscore (_).
Subsequent characters in an identifier or key word can be letters, underscores, digits (0-9), or dollar signs ($).

Note that dollar signs are not allowed in identifiers according to the SQL standard, so their use might render applications less portable. The SQL standard will not define a key word that contains digits or starts or ends with an underscore, so identifiers of this form are safe against possible conflict with future extensions of the standard.
The system uses no more than NAMEDATALEN-1 bytes of an identifier; longer names can be written in commands, but they will be truncated. By default, NAMEDATALEN is 64 so the maximum identifier length is 63 bytes
Quoted identifiers can contain any character, except the character with code zero. (To include a double quote, write two double quotes.) This allows constructing table or column names that would otherwise not be possible, such as ones containing spaces or ampersands. The length limitation still applies.
Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case.
In order to make your applications portable and less error-prone, use consistent quoting with each name (either always quote it or never quote it).

:::info

Airbyte Postgres destination creates final tables and their corresponding columns using Quoted identifiers, preserving the case sensitivity. Special characters in table and column names are replaced with underscores.

When using the legacy "Raw tables only" mode, raw tables and schemas are created using Unquoted identifiers by replacing any special characters with an underscore.

:::

For Airbyte Cloud:

Log into your Airbyte Cloud account.
In the left navigation bar, click Destinations. In the top-right corner, click new destination.
On the Set up the destination page, enter the name for the Postgres connector and select Postgres from the Destination type dropdown.
Enter a name for your source.
For the Host, Port, and DB Name, enter the hostname, port number, and name for your Postgres database.
List the Default Schemas.

:::note

The schema names are case sensitive. The 'public' schema is set by default. Multiple schemas may be used at one time. No schemas set explicitly - will sync all of existing.

:::

For User and Password, enter the username and password you created in Step 1.
For Airbyte Open Source, toggle the switch to connect using SSL. For Airbyte Cloud uses SSL by default.
For SSL Modes, select:
- disable to disable encrypted communication between Airbyte and the source
- allow to enable encrypted communication only when required by the source
- prefer to allow unencrypted communication only when the source doesn't support encryption
- require to always require encryption. Note: The connection will fail if the source doesn't support encryption.
- verify-ca to always require encryption and verify that the source has a valid SSL certificate
- verify-full to always require encryption and verify the identity of the source
To customize the JDBC connection beyond common options, specify additional supported JDBC URL parameters as key-value pairs separated by the symbol & in the JDBC URL Parameters (Advanced) field.

Example: key1=value1&key2=value2&key3=value3

These parameters will be added at the end of the JDBC URL that the AirByte will use to connect to your Postgres database.

The connector now supports connectTimeout and defaults to 60 seconds. Setting connectTimeout to 0 seconds will set the timeout to the longest time available.

Note: Do not use the following keys in JDBC URL Params field as they will be overwritten by Airbyte: currentSchema, user, password, ssl, and sslmode.

:::warning

This is an advanced configuration option. Users are advised to use it with caution.

:::

For SSH Tunnel Method, select:
- No Tunnel for a direct connection to the database
- SSH Key Authentication to use an RSA Private as your secret for establishing the SSH tunnel
- Password Authentication to use a password as your secret for establishing the SSH tunnel

:::warning

Since Airbyte Cloud requires encrypted communication, select SSH Key Authentication or Password Authentication if you selected disable, allow, or prefer as the SSL Mode; otherwise, the connection will fail.

:::

Click Set up destination.

Supported sync modes

The Postgres destination connector supports the following sync modes:

Feature	Supported?(Yes/No)	Notes
Full Refresh Sync	Yes
Incremental - Append Sync	Yes
Incremental - Append + Deduped	Yes
Namespaces	Yes

Schema map

Output Schema (Direct Load)

The Postgres destination uses Direct Load architecture. Each stream is written directly to a final table in your configured schema. The table includes your data columns plus the following Airbyte metadata columns:

_airbyte_raw_id: a uuid assigned by Airbyte to each event that is processed. The column type in Postgres is VARCHAR.
_airbyte_extracted_at: a timestamp representing when the event was pulled from the data source. The column type in Postgres is TIMESTAMP WITH TIME ZONE.
_airbyte_meta: a jsonb column containing metadata about the record, including sync information and any schema changes. The column type in Postgres is JSONB.
_airbyte_generation_id: an identifier for the generation of the sync. The column type in Postgres is BIGINT.

Output Schema (Raw Tables) - Deprecated

:::warning Deprecated

Raw tables are deprecated starting with version 3.0.0. The connector now uses Direct Load to write directly to final tables. For backward compatibility options and migration guidance, see the Postgres Migration Guide.

:::

Legacy Raw Tables Documentation

Each stream will be mapped to a separate raw table in Postgres. The default schema in which the raw tables are created is airbyte_internal. This can be overridden in the configuration. Each table will contain 3 columns:

_airbyte_raw_id: a uuid assigned by Airbyte to each event that is processed. The column type in Postgres is VARCHAR.
_airbyte_extracted_at: a timestamp representing when the event was pulled from the data source. The column type in Postgres is TIMESTAMP WITH TIME ZONE.
_airbyte_loaded_at: a timestamp representing when the row was processed into final table. The column type in Postgres is TIMESTAMP WITH TIME ZONE.
_airbyte_data: a json blob representing with the event data. The column type in Postgres is JSONB.

Final Tables Data type mapping

Airbyte Type	Postgres Type
string	VARCHAR
number	DECIMAL
integer	BIGINT
boolean	BOOLEAN
object	JSONB
array	JSONB
timestamp_with_timezone	TIMESTAMP WITH TIME ZONE
timestamp_without_timezone	TIMESTAMP
time_with_timezone	TIME WITH TIME ZONE
time_without_timezone	TIME
date	DATE

Naming limitations

Postgres restricts all identifiers to 63 characters or less. If your stream includes column names longer than 63 characters, they will be truncated to this length. If this results in two columns having the same name, Airbyte may modify these column names to avoid the collision.

Creating dependent objects

:::caution

This section involves running DROP ... CASCADE on the tables that Airbyte produces. Make sure you fully understand the consequences before enabling this option. Permanent data loss is possible with this option!

:::

You may want to create objects that depend on the tables generated by Airbyte, such as views. If you do so, we strongly recommend:

Using a tool like dbt to automate the creation
And using an orchestrator to trigger dbt.

This is because you will need to enable the "Drop tables with CASCADE" option. The connector sometimes needs to recreate the tables; if you have created dependent objects, Postgres will require the connector to run drop statements with CASCADE enabled. However, this will cause the connector to also drop the dependent objects. Therefore, you MUST have a way to recreate those dependent objects from scratch.

Tutorials

Now that you have set up the Postgres destination connector, check out the following tutorials:

Vendor-Specific Connector Limitations

:::warning

Not all implementations or deployments of a database will be the same. This section lists specific limitations and known issues with the connector based on how or where it is deployed.

:::

Changelog

Expand to review

Version	Date	Pull Request	Subject
3.0.5	2025-12-12	70895	Update CDK to 0.1.86.
3.0.5-rc.1	2025-12-09	70338	Update CDK to 0.1.86.
3.0.4	2025-12-05	70355	Fix: Force Append mode when "Raw tables only" mode is enabled, bypassing Dedupe mode to avoid errors.
3.0.3	2025-12-04	70347	Fix index recreation on non-existent columns in raw tables mode.
3.0.2	2025-12-04	70337	Refactor: Move raw tables mode check to index creation for better code clarity.
3.0.1	2025-12-03	70326	Fix `PSQLException` when running in legacy "Raw tables only" mode.
3.0.0	2025-12-02	69846	Breaking Change: Introduces Direct Load architecture. Data is now written directly to final tables without using intermediate raw tables, providing improved performance and reduced storage costs. Raw tables are deprecated; use "Raw tables only" option if needed for backward compatibility.
2.4.7	2025-08-29	65620	Noop release.
2.4.7-rc.1	2025-08-29	65617	Testing RC publishing.
2.4.6	2025-08-21	63769	Fix numeric columns synced as NULL value in incremential sync.
2.4.5	2025-04-24	58652	Upgrade libraries for security patches
2.4.4	2025-03-24	56355	Upgrade to airbyte/java-connector-base:2.0.1 to be M4 compatible.
2.4.3	2025-02-14	53686	Add option to use unconstrained `DECIMAL` type for numeric columns.
2.4.2	2025-01-10	51483	Use a non root base image
2.4.1	2024-12-18	49885	Use a base image: airbyte/java-connector-base:1.0.0
2.4.0	2024-08-18	#45434	upgrade all dependencies.
2.3.2	2024-08-07	#43331	bump java CDK.
2.3.1	2024-08-07	#43363	Adopt latest CDK.
2.3.0	2024-07-22	#41954	Support for refreshes and resumable full refresh. WARNING: You must upgrade to platform 0.63.7 before upgrading to this connector version.
2.2.1	2024-07-22	#42423	no-op. Bumping to a clean image
2.2.0	2024-07-22	#42423	Revert refreshes support
2.1.1	2024-07-22	#42415	fixing PostgresSqlOperations.isOtherGenerationIdInTable to close the streams coming from JdbcDatabase.unsafeQuery
2.1.0	2024-07-22	#41954	Support for refreshes and resumable full refresh. WARNING: You must upgrade to platform 0.63.7 before upgrading to this connector version.
2.0.15	2024-06-26	#40554	Convert all strict-encrypt prod code to kotlin.
2.0.14	2024-06-26	#40563	Convert all test code to kotlin.
2.0.13	2024-06-13	#40159	Config error on drop failure when cascade is disabled
2.0.12	2024-06-12	#39388	Sources auto-conversion to Kotlin
2.0.11	2024-06-10	#39372	Fixed function already exists error
2.0.10	2024-05-07	#37660	Adopt CDK 0.33.2
2.0.9	2024-04-11	#36974	Add option to drop with `CASCADE`
2.0.8	2024-04-10	#36805	Adopt CDK 0.29.10 to improve long column name handling
2.0.7	2024-04-08	#36768	Adopt CDK 0.29.7 to improve destination state handling
2.0.6	2024-04-05	#36620	Adopt CDK 0.29.3 to use Kotlin CDK
2.0.5	2024-03-07	#35899	Adopt CDK 0.27.3; Bugfix for case-senstive table names in v1-v2 migration, `_airbyte_meta` in raw tables
2.0.4	2024-03-07	#35899	Adopt CDK 0.23.18; Null safety check in state parsing
2.0.3	2024-03-01	#35528	Adopt CDK 0.23.11; Use Migration framework
2.0.2	2024-03-01	#35760	Mark as certified, add PSQL exception to deinterpolator
2.0.1	2024-02-22	#35385	Upgrade CDK to 0.23.0; Gathering required initial state upfront
2.0.0	2024-02-09	#35042	GA release V2 destinations format.
0.6.3	2024-02-06	#34891	Remove varchar limit, use system defaults
0.6.2	2024-01-30	#34683	CDK Upgrade 0.16.3; Fix dependency mismatches in slf4j lib
0.6.1	2024-01-29	#34630	CDK Upgrade; Use lowercase raw table in T+D queries.
0.6.0	2024-01-19	#34372	Add dv2 flag in spec
0.5.5	2024-01-18	#34236	Upgrade CDK to 0.13.1; Add indexes in raw table for query optimization
0.5.4	2024-01-11	#34177	Add code for DV2 beta (no user-visible changes)
0.5.3	2024-01-10	#34135	Use published CDK missed in previous release
0.5.2	2024-01-08	#33875	Update CDK to get Tunnel heartbeats feature
0.5.1	2024-01-04	#33873	Install normalization to enable DV2 beta
0.5.0	2023-12-18	#33507	Upgrade to latest CDK; Fix DATs and tests
0.4.0	2023-06-27	#27781	License Update: Elv2
0.3.27	2023-04-04	#24604	Support for destination checkpointing
0.3.26	2022-09-27	#17299	Improve error handling for strict-encrypt postgres destination
0.3.24	2022-09-08	#16046	Fix missing database name URL Encoding
0.3.23	2022-07-18	#16260	Prevent traffic going on an unsecured channel in strict-encryption version of destination postgres
0.3.22	2022-07-18	#13840	Added the ability to connect using different SSL modes and SSL certificates
0.3.21	2022-07-06	#14479	Publish amd64 and arm64 versions of the connector
0.3.20	2022-05-17	#12820	Improved 'check' operation performance
0.3.19	2022-04-25	#12195	Add support for additional JDBC URL Params input
0.3.18	2022-04-12	#11729	Bump mina-sshd from 2.7.0 to 2.8.0
0.3.17	2022-04-05	#11729	Fixed bug with dashes in schema name
0.3.15	2022-02-25	#10421	Refactor JDBC parameters handling
0.3.14	2022-02-14	#10256	(unpublished) Add `-XX:+ExitOnOutOfMemoryError` JVM option
0.3.13	2021-12-01	#8371	Fixed incorrect handling "\n" in ssh key
0.3.12	2021-11-08	#7719	Improve handling of wide rows by buffering records based on their byte size rather than their count
0.3.11	2021-09-07	#5743	Add SSH Tunnel support
0.3.10	2021-08-11	#5336	Destination Postgres: fix \u0000(NULL) value processing

31 KiB Raw Permalink Blame History

Postgres

Warning

Prerequisites

Configure Network Access

Step 1: Set up Postgres

Permissions

Step 2: Set up the Postgres connector in Airbyte

Target Database

Naming Conventions

Supported sync modes

Schema map

Output Schema (Direct Load)

Output Schema (Raw Tables) - Deprecated

Final Tables Data type mapping

Naming limitations

Creating dependent objects

Tutorials

Vendor-Specific Connector Limitations

Changelog

31 KiB

Raw Permalink Blame History