mirror of synced 2025-12-20 18:39:31 -05:00
Commit Graph

233 Commits

Author SHA1 Message Date
Edward Gao
c8e3ec0210 Fix build: Revert "chore: clean out unused "bases" and utils (#53234)" (#53621) 2025-02-10 21:36:30 +00:00
Natik Gadzhi
4dec57a29f chore: clean out unused "bases" and utils (#53234) 2025-02-07 15:19:32 -08:00
Natik Gadzhi
cb80e6922a [tools] prettier rules for .md + formatting cleanup 2024-05-07 08:19:33 -07:00
Ella Rohm-Ensing
b7819d9f6c python: assert actual == expected ordering (#36980) 2024-04-11 15:16:33 +00:00
Marius Posta
2495575795 java-cdk: re-export airbyte-api dependency (#36759) 2024-04-03 10:43:05 -07:00
Marius Posta
f47db9051b delete bad or useless README files (#36196) 2024-03-15 12:02:23 -07:00
Joe Bell
2d80b5676d Destination Clickhouse - 1.0, remove normalization (#34637)
Co-authored-by: Aaron ("AJ") Steers <aj@airbyte.io>
Co-authored-by: Joe Reuter <joe@airbyte.io>
Co-authored-by: Obioma Anomnachi <onanomnachi@gmail.com>
Co-authored-by: Anatolii Yatsuk <35109939+tolik0@users.noreply.github.com>
Co-authored-by: Maxime Carbonneau-Leclerc <3360483+maxi297@users.noreply.github.com>
Co-authored-by: maxi297 <maxi297@users.noreply.github.com>
Co-authored-by: Ryan Waskewich <156025126+rwask@users.noreply.github.com>
Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
Co-authored-by: Marius Posta <marius@airbyte.io>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: SatishChGit <satishchinthanippu@gmail.com>
Co-authored-by: evantahler <evan@airbyte.io>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: Anton Karpets <anton.karpets@globallogic.com>
Co-authored-by: Christo Grabowski <108154848+ChristoGrab@users.noreply.github.com>
Co-authored-by: Akash Kulkarni <akash@airbyte.io>
Co-authored-by: Akash Kulkarni <113392464+akashkulk@users.noreply.github.com>
Co-authored-by: Gireesh Sreepathi <gisripa@gmail.com>
Co-authored-by: Artem Inzhyyants <36314070+artem1205@users.noreply.github.com>
2024-02-22 11:17:25 -06:00
Augustin
cb3578c9e8 fix :airbyte-integrations:connectors:destination-duckdb' could not be found in project (#35279) 2024-02-14 17:50:14 +02:00
Marius Posta
796b2e8dad java CDK: clean up dependencies, refactor modules (#34745) 2024-02-08 19:46:51 -06:00
Augustin
0b33caecda Revert "[skip ci] formatting: add missing license headers (#33250)" (#33289) 2023-12-11 11:38:37 +01:00
Augustin
60c1cc01ad [skip ci] formatting: add missing license headers (#33250) 2023-12-11 10:15:18 +01:00
Ella Rohm-Ensing
ac3eb28de2 airbyte-ci: add format commands (#31831)
Co-authored-by: Ben Church <ben@airbyte.io>
Co-authored-by: bnchrch <bnchrch@users.noreply.github.com>
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: Marius Posta <marius@airbyte.io>
Co-authored-by: alafanechere <alafanechere@users.noreply.github.com>
2023-11-14 02:17:48 -06:00
Marius Posta
7cd8020ac8 java CDK: hoist top-level gradle projects into CDK (#31960)
Co-authored-by: postamar <postamar@users.noreply.github.com>
2023-10-30 12:03:06 -07:00
Marius Posta
696118de6a gradle: remove broken mypy task (#31468) 2023-10-16 12:05:56 -07:00
Marius Posta
f8edc18039 airbyte-ci,gradle: replace airbyte-docker with airbyte-ci (#30743) 2023-10-04 08:38:17 -07:00
Marius Posta
7ae97175a6 gradle: fix repo wide behaviour (#30607) 2023-09-28 05:01:13 -07:00
Marius Posta
51c67d7eaa gradle: remove airbyteDocker.outputs dependencies (#30314) 2023-09-11 17:16:27 -07:00
Marius Posta
ef2849e35e gradle: fix airbyteDocker task inputs (#30187) 2023-09-07 03:46:31 -07:00
Marius Posta
be1e1adabd gradle: cleanup (#30060) 2023-09-05 14:05:40 -05:00
Pedro S. Lopez
e10f768b50 workaround for normalizations (#28451) 2023-07-18 23:37:34 -05:00
Edward Gao
934acaa137 Destination bigquery: rerelease 1s1t behind gate (#27936)
* Revert "Revert "Destination Bigquery: Scaffolding for destinations v2 (#27268)""

This reverts commit 348c577dbb.

* version bumps+changelog

* Speed up BQ by having 2 queries, and not an OR (#27981)

* 🐛 Destination Bigquery: fix bug in standard inserts for syncs >10K records (#27856)

* only run t+d code if it's enabled

* dockerfile+changelog

* remove changelog entry

* Destinations V2: handle optional fields for `object` and `array` types (#27898)

* catch null schema

* fix null properties

* clean up

* consolidate + add more tests

* try catch

* empty json test

* Automated Commit - Formatting Changes

* remove todo

* destination bigquery: misc updates to 1s1t code (#28057)

* switch to checkedconsumer

* add unit test for buildColumnId

* use flag

* restructure prefix check

* fix build

* more type-parsing fixes (#28100)

* more type-parsing fixes

* handle duplicates

* Automated Commit - Format and Process Resources Changes

* add tests for asColumns

* Automated Commit - Format and Process Resources Changes

* log warnings instead of throwing exception

* better log message

* error level

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>

* Automated Commit - Formatting Changes

* Improve protocol type parsing (#28126)

* Automated Commit - Formatting Changes

* Change from T&D every 10k records to an increasing time based interval (#28130)

* fifteen minute t&d

* add typing and deduping operation valve for increased intervals of typing and deduping

* Automated Commit - Format and Process Resources Changes

* resolve bizarre merge conflict

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>

* Simplify and speed up CDC delete support [DestinationsV2] (#28029)

* Simplify and speed up CDC delete support [DestinationsV2]

* better QUOTE

* spotbugs?

* recompile dbt image for local arch and use that when building images

* things compile, but tests fail

* tests working-ish

* comment

* fix logic to re-insert deleted records for cursor comparison.

tests pass!

* remove comment

* Skip CDC re-include logic if there are no CDC columns

* stop hardcoding pk (#28092)

* wip

* remove TODOs

---------

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* update method name

* Automated Commit - Formatting Changes

* depend on pinned normalization version

* implement 1s1t DATs for destination-bigquery (#27852)

* initial implementation

* Automated Commit - Formatting Changes

* add second sync to test

* do concurrent things

* Automated Commit - Formatting Changes

* clarify comment

* minor tweaks

* more stuff

* Automated Commit - Formatting Changes

* minor cleanup

* lots of fixes

* handle sql vs json null better
* verify extra columns
* only check deleted_at if in DEDUP mode and the column exists
* add full refresh append test case

* Automated Commit - Formatting Changes

* add tests for the remaining sync modes

* Automated Commit - Formatting Changes

* readability stuff

* Automated Commit - Formatting Changes

* add test for gcs mode

* remove static fields

* Automated Commit - Formatting Changes

* add more test cases, tweak test scaffold

* cleanup

* Automated Commit - Formatting Changes

* extract recorddiffer

* and use it in the sql generator test

* fix

* comment

* naming+comment

* one more comment

* better assert

* remove unnecessary thing

* one last thing

* Automated Commit - Formatting Changes

* enable concurrent execution on all java integration tests

* add test for default namespace

* Automated Commit - Formatting Changes

* implement a 2-stream test

* Automated Commit - Formatting Changes

* extract methods

* invert jsonNodesNotEquivalent

* Automated Commit - Formatting Changes

* fix conditional

* pull out diffSingleRecord

* Automated Commit - Formatting Changes

* handle nulls correctly

* remove raw-specific handling; break up methods

* Automated Commit - Formatting Changes

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: octavia-approvington <octavia-approvington@users.noreply.github.com>

* Destinations V2: move create raw tables earlier (#28255)

* move create raw tables

* better log message

* stop building normalization (#28256)

* fix ability to run tests

* disable incremental t+d for now

* Automated Commit - Formatting Changes

---------

Co-authored-by: Evan Tahler <evan@airbyte.io>
Co-authored-by: Cynthia Yin <cynthia@airbyte.io>
Co-authored-by: cynthiaxyin <cynthiaxyin@users.noreply.github.com>
Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: Joe Bell <joseph.bell@airbyte.io>
Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>
Co-authored-by: octavia-approvington <octavia-approvington@users.noreply.github.com>
2023-07-14 09:34:56 -05:00
Evan Tahler
4fb1f98221 Fix destination-s3 build (#27786)
* bump version

* PR id

* shh normalization, shh

* remove a bunch of arm64 deps?

* might as well match the dockerfile
2023-06-27 17:15:44 -07:00
Edward Gao
9c56062a7d Normalization integration tests: set explicit cursor on cdc streams (#27670) 2023-06-23 14:33:49 -07:00
Evan Tahler
4dd9fe0c1c Fix normalization builds (#26930) 2023-06-02 07:41:40 -07:00
Evan Tahler
75240b0bbf Multi-architecture normalization build (local) (#26677)
* Multi-architecture normalization build (local)

When building and testing normalization locally, we need to force the base images to match the local host OS.

This is not a problem when publishing the connectors as `airbyte-ci`/dagger handles this for us

* Update build.gradle
2023-05-26 10:57:15 -07:00
Augustin
80032f73f9 connectors-ci: deprecate slash publish (#25865) 2023-05-22 10:10:56 +02:00
Joe Bell
9a4be977c1 Destination Bigquery: stop running normalization container for DAT (#25925)
* readme update

* allow passing additional flags to test containers

* remove build dependency

* Automated Change

* versioning updates

* restore denormalized change from master

* formatting changes

* formatting

* Automated Change

* update metadata file

---------

Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>
2023-05-18 00:46:32 +00:00
Edward Gao
fb152a9a0a Normalization: Better handling for CDC transactional updates (#25993)
* try this?

* fix tests

* assert cdc values

* handle case where we have lsn but no updated_at

* readability improvements

* tweaks to test

* version bumps + changelogs

* Automated Change

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-05-12 12:53:23 +00:00
Cynthia Yin
8400d20352 Destination Redshift: deprecate old migration normalization code (#25771)
* first pass normalization

* add pr link

* remove python test & resources

* linting
2023-05-05 14:18:27 -07:00
Jeff Cowan (Airbyte)
3a308ba48b Pin MarkupSafe for normalization (#25577)
We were running into a CI/CD system-only bug with dbt that requires this workaround to get it working
---------

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2023-04-26 23:16:50 +00:00
Edward Gao
9b7b30f92b Normalization: Use strict > comparison in incremental mode (#22381)
* copy tests from other branch

* switch to >

* [wip] wire up tests

* make tests work

* fixes

* nicer test structure

* maybe add feature flag?

* pattern matching

* also add version check

* formatting

* refactor test also

* extract test + fix method call

* minor tweaks

* add context to log message

* put workspace id in normalization input

* use non-semver tag

* add flag for version of normalization

* also flag old version

* add test

* missed part of the commit

* format

* add test for null workspace ID

* Revert "also flag old version"

This reverts commit 3be601d16c.

* Revert "missed part of the commit"

This reverts commit 47a67b4631.

* always apply flag, even if we're behind a version

* derp

* Add more logging to the normalization activity

* Update charts and kustomize for the feature flag

* fix clickhouse integration test

* remove replace_identifiers

* Revert "remove replace_identifiers"

This reverts commit 0e7ded5a7b.

* fix replace_identifiers

* garbage debug logs

* stop trying to setup duckdb test

* wake up and choose violence

* fix mssql

* exclude duckdb from tests

* make snowflake happy

* uncomment tests

* derp

* derpderp

* format

* format

* also fix redshift???

* maybe now everything works???

* remove debug logs

* use special docker tag

* bump to new tag

* use random test schema in publish also

* properly cleanup

* remove feature flag stuff

* version bump + changelog

* Automated Commit - Formatting Changes

* bump definitions

---------

Co-authored-by: Jimmy Ma <gosusnp@users.noreply.github.com>
Co-authored-by: Jimmy Ma <jimmy@airbyte.io>
Co-authored-by: octavia-squidington-iii <octavia-bot@airbyte.io>
Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-03-23 09:37:15 -07:00
Edward Gao
294cdbcf4a Normalization Bigquery: Add more reserved words (#24077)
* add current times for bigquery

* bump version + changelog
2023-03-15 18:33:59 +00:00
Charles
f83ef9eea7 Remove workers (#23422) 2023-02-24 17:45:44 -08:00
Mikhail Shustov
2ce3c17048 🎉 Destination ClickHouse: bump dbt-clickhouse to v1.4.0 (#23023)
* bump dbt-clickhouse to 1.4.0

* fix clickhouse integration test

* exclude duckdb from tests

* add to changelog

* bump normalization version in definitions

---------

Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2023-02-16 20:15:09 -08:00
Sherif A. Nada
37501884eb unpin cdk version in normalization build(#22973) 2023-02-13 20:59:20 -08:00
Charles
c4bf76655e format (#22970) 2023-02-13 19:21:54 -08:00
Sherif A. Nada
bec7a26b27 ignore normalization CDK model related linting errors (#22963) 2023-02-13 17:48:23 -08:00
Greg Solovyev
6c8d3f655d Default CH ssl to true and fix the failure if ssl property is missing (#22846) 2023-02-13 17:40:08 -08:00
Cole Snodgrass
2e099acc52 update headers from 2022 -> 2023 (#22594)
* It's 2023!

* 2022 -> 2023

---------

Co-authored-by: evantahler <evan@airbyte.io>
2023-02-08 13:01:16 -08:00
Ryan Fu
d21068c989 Tentatively disallowed normalization for DuckDB (#22528) 2023-02-07 20:22:49 -08:00
Simon Späti
2bbc4f6f83 🎉 New Destination: DuckDB (#17494)
This is the first version of the DuckDB destination. There are potential edge cases that still need to be taken care of. But looking forward to your feedback.
2023-02-07 11:33:10 +01:00
Edward Gao
517fc6ac10 Normalization: Revert to protocol v0 (#22283)
* Revert "Normalization: handle non-object top-level schemas; treat binary data as string (#22165)"

This reverts commit 8276d03359.

* Revert "Normalization: check for ref type existence (#22161)"

This reverts commit dbe56d6fc2.

* Revert "🎉Updated normalization to handle new datatypes (#19721)"

This reverts commit c1d7736639.

* revert dest definitions

* also dockerfile

* re-add to changelog

* add comment in dockerfile
2023-02-06 10:14:36 -08:00
Edward Gao
8276d03359 Normalization: handle non-object top-level schemas; treat binary data as string (#22165)
* handle dumb top-level schemas

* version bump

* also definitions

* treat binary as string

* fallback case

* format

* new variable
2023-01-31 15:59:04 -06:00
Edward Gao
dbe56d6fc2 Normalization: check for ref type existence (#22161)
* check for ref type existence

* version bump

* bump normalization version

* format
2023-01-31 11:33:34 -08:00
Jimmy Ma
6660b13ad2 Add Airbyte Protocol V1 support. (#20036)
* Add Airbyte Protocol V1 support.

* Fix VersionedAirbyteStreamFactoryTest

* Remove AirbyteMessageMigrationV0 example

* Add Protocol Version constants

* 🎉Updated normalization to handle new datatypes (#19721)

* Updated normalization simple stream processing to handle new datatypes

* Updated normalization nested stream processing to handle new datatypes

* Updated normalization nested stream processing to handle new datatypes

* Updated normalization drop_scd_catalog processing to handle new datatypes

* Updated normalization ephemeral test processing to handle new datatypes

* fixed more tests for normalization

* fixed more tests for normalization

* fixed more tests for normalization

* fixed more tests for normalization

* fixed more issues

* fixed more issues (clickhouse)

* fixed more issues

* fixed more issues

* fixed more issues

* added binary type processing for some DBs

* cleared commented code and moved some hardcodes to processing as macro

* fixed codestyle and cleared commented code

* minor refactor

* minor refactor

* minor refactor

* fixed bool cast error

* fixed dict->str cast error

* fixed is_combining_node cast py check

* removed commented code

* removed commented code

* committed autogenerated normalization_test_output files

* committed autogenerated normalization_test_output files (new files)

* refactored utils.py

* Updated utils.py to use Callable functions and get rid of property_type in is_number and is_bool functions

* committed autogenerated normalization_test_output files (new files)

* fixed typo in TIMESTAMP_WITH_TIMEZONE_TYPE

* updated stream_processor to handle string type first as a wider type

* fixed arrays normalization by updating is_simple_property method as per new approaches

* format

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* Update airbyte protocol migration (#20745)

* Extract MigrationContainer from AirbyteMessageMigrator

* Add ConfiguredAirbyteCatalogMigrations

* Add ConfiguredAirbyteCatalog to AirbyteMessageMigrations

* Enable ConfiguredAirbyteCatalog migration

* Fix tests

* Remove extra this.

* Add missing docs

* Typo

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* Data types update: Implement protocol message migrations (#19240)

* Extract MigrationContainer from AirbyteMessageMigrator

* Add ConfiguredAirbyteCatalogMigrations

* Add ConfiguredAirbyteCatalog to AirbyteMessageMigrations

* Enable ConfiguredAirbyteCatalog migration

* set up scaffolding

* [wip] more scaffolding, basic unit test

* minimal green code

* [wip] add failing test for other primitive types

* correct version number

* handle basic primitive type decls

* add implicit cases

* add recursive schema

* formatting

* comment

* support not

* fix indentation

* handle all nested schema cases

* handle boolean schemas

* verify empty schema handling

* cleanup

* extract map

* code organization

* extract method

* reformat

* [wip] more tests, minor fix type array handling

* corrected test

* cleanup

* reformat

* switch to v1

* add support for multityped fields

* missed test case

* nested test class

* basic record upgrade

* implement record upgrades

* slight refactor

* comments+clarifications

* extract constants

* (partly) correct model classes

* add de/ser

* formatting

* extract constants

* fix json reference

* update docs

* switch to v1 models

* fix compile+test

* add base64 handling

* use vnull

* Data types update: Implement protocol message downgrade path (#19909)

* rough skeleton for passing catalog into migration

* basic test

* more scaffolding

* basic implementation

* add primitives test

* add in other tests (nested fields currently failing)

* add formats

* implement oneOf handling

* formatting

* oneOf handling

* better tests

* comments + organization

* progress

* basic test case

* downgrade objects, ish

* basic array implementation

* handle numeric failure

* test for new type

* handle array items

* empty schema handling

* first pass at oneof handling

* add more tests+handling

* more tests

* comments

* add empty oneof test case

* format + reorganize

* more reorganize

* fix name

* also downgrade binary data

* only import vnull

* move migrations into v1 package

* extract schema mutation code

* comment

* extract schema migration to new class

* extract record downgrade logic for future use

* format

* fix build after rebase

* rename private method for consistency

* also implement configuredcatalog migrations >.>

* quick and dirty tests

* slight cleanup

* fix tests

* pmd

* pmd test

* null check on message objects

* maybe fix acceptance tests?

* fix name

* extract constants

* more fixes

* tmp

* meh

* fix cdc acc tests

* revert to master source-postgres

* remove log messages

* revert other misc hacks

* integers are valid cursors

* remove unrelated change

* fix build

* fix build more?

* [MUST REVERT] use dev normalization

* capture kube logs

* also here?

* no debug logs?

* delete dup from merging

* add final everywhere

* revert test changes

Co-authored-by: Jimmy Ma <jimmy@airbyte.io>

* On-the-fly migrations of persisted catalogs (#21757)

* On the fly catalog migration for normalization activity

* On the fly catalog migration for job persistence

* On the fly migration for standard sync persistence

* On the fly migration for airbyte catalogs

* Refactor code to share JsonSchema traversal

* Add V0 Data type search function

* PMD and Format

* Fix getOrInsertActorCatalog and ConfigRepositoryE2E tests

* Null-proofing CatalogMigrationV1Helper

* More null checks

* Fix test

* Format

* Add data type v1 support to the FE

* Changes AC test check to check exited ps (#21672)

some docker compose changes no longer show exited
processes. this broke our test

this change should fix master

tested in a runner that failed

* Move wellknown types mapping to the utility function

* use protocolv1 normalization

---------

Co-authored-by: Topher Lubaway <asimplechris@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* Update protocol support range (#21996)

* bump normalization version to 0.3.0

* Add version check on normalization (#22048)

* Add normalization min version check

* Add visible for testing

---------

Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: Eugene <etsybaev@gmail.com>
Co-authored-by: Topher Lubaway <asimplechris@gmail.com>
2023-01-30 10:17:49 -08:00
Greg Solovyev
56c686440e New destination: databend (community PR #19815) (#20909)
* feat: Add databend destination

Co-authored-by: hantmac <hantmac@outlook.com>
Co-authored-by: josephkmh <joseph@airbyte.io>
Co-authored-by: Sajarin <sajarindider@gmail.com>
2023-01-09 10:19:07 -08:00
Jaakko Kangasharju
3035dc002a Add BOOLEAN to Redshift keywords (#20421) 2023-01-03 14:53:42 -08:00
Geoff Genz
b7816f4f58 🐛 Destination ClickHouse: Update Normalization Docker File (#19573)
* Update ClickHouse normalization docker file

* bump destination and norm version

* auto-bump connector version

* update doc

Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-12-06 14:49:18 -03:00
Greg Solovyev
6af98045f8 Parameterize test_empty_streams and test_stream_with_1_airbyte_column by destination (#18197)
* Remove lines that always add Postgres to list of destinations
* Parameterize all tests in test_ephemeral by destination
2022-11-01 12:55:32 -07:00
Greg Solovyev
8cf546483d 🐛 Add a drop table hook to drop scd tables in case of overwrite sync (#18015)
* Add a drop table hook to drop scd tables in case of overwrite sync

* Add an integration test for dropping SCD table on overwrite

* skip new test for Oracle and TiDB

* Add normalization run after initial reset

* Bump normalization version
2022-11-01 08:52:02 -07:00