mirror of synced 2025-12-21 11:01:41 -05:00

87 Commits

Author SHA1 Message Date
Edward Gao
c8e3ec0210 Fix build: Revert "chore: clean out unused "bases" and utils (#53234)" (#53621) 2025-02-10 21:36:30 +00:00
Natik Gadzhi
4dec57a29f chore: clean out unused "bases" and utils (#53234) 2025-02-07 15:19:32 -08:00
Natik Gadzhi
cb80e6922a [tools] prettier rules for .md + formatting cleanup 2024-05-07 08:19:33 -07:00
Augustin
0b33caecda Revert "[skip ci] formatting: add missing license headers (#33250)" (#33289) 2023-12-11 11:38:37 +01:00
Augustin
60c1cc01ad [skip ci] formatting: add missing license headers (#33250) 2023-12-11 10:15:18 +01:00
Marius Posta
7ae97175a6 gradle: fix repo wide behaviour (#30607) 2023-09-28 05:01:13 -07:00
Edward Gao
9c56062a7d Normalization integration tests: set explicit cursor on cdc streams (#27670) 2023-06-23 14:33:49 -07:00
Edward Gao
fb152a9a0a Normalization: Better handling for CDC transactional updates (#25993)
* try this?

* fix tests

* assert cdc values

* handle case where we have lsn but no updated_at

* readability improvements

* tweaks to test

* version bumps + changelogs

* Automated Change

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-05-12 12:53:23 +00:00
Cynthia Yin
8400d20352 Destination Redshift: deprecate old migration normalization code (#25771)
* first pass normalization

* add pr link

* remove python test & resources

* linting
2023-05-05 14:18:27 -07:00
Edward Gao
9b7b30f92b Normalization: Use strict > comparison in incremental mode (#22381)
* copy tests from other branch

* switch to >

* [wip] wire up tests

* make tests work

* fixes

* nicer test structure

* maybe add feature flag?

* pattern matching

* also add version check

* formatting

* refactor test also

* extract test + fix method call

* minor tweaks

* add context to log message

* put workspace id in normalization input

* use non-semver tag

* add flag for version of normalization

* also flag old version

* add test

* missed part of the commit

* format

* add test for null workspace ID

* Revert "also flag old version"

This reverts commit 3be601d16c.

* Revert "missed part of the commit"

This reverts commit 47a67b4631.

* always apply flag, even if we're behind a version

* derp

* Add more logging to the normalization activity

* Update charts and kustomize for the feature flag

* fix clickhouse integration test

* remove replace_identifiers

* Revert "remove replace_identifiers"

This reverts commit 0e7ded5a7b.

* fix replace_identifiers

* garbage debug logs

* stop trying to setup duckdb test

* wake up and choose violence

* fix mssql

* exclude duckdb from tests

* make snowflake happy

* uncomment tests

* derp

* derpderp

* format

* format

* also fix redshift???

* maybe now everything works???

* remove debug logs

* use special docker tag

* bump to new tag

* use random test schema in publish also

* properly cleanup

* remove feature flag stuff

* version bump + changelog

* Automated Commit - Formatting Changes

* bump definitions

---------

Co-authored-by: Jimmy Ma <gosusnp@users.noreply.github.com>
Co-authored-by: Jimmy Ma <jimmy@airbyte.io>
Co-authored-by: octavia-squidington-iii <octavia-bot@airbyte.io>
Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-03-23 09:37:15 -07:00
Mikhail Shustov
2ce3c17048 🎉 Destination ClickHouse: bump dbt-clickhouse to v1.4.0 (#23023)
* bump dbt-clickhouse to 1.4.0

* fix clickhouse integration test

* exclude duckdb from tests

* add to changelog

* bump normalization version in definitions

---------

Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2023-02-16 20:15:09 -08:00
Cole Snodgrass
2e099acc52 update headers from 2022 -> 2023 (#22594)
* It's 2023!

* 2022 -> 2023

---------

Co-authored-by: evantahler <evan@airbyte.io>
2023-02-08 13:01:16 -08:00
Ryan Fu
d21068c989 Tentatively disallowed normalization for DuckDB (#22528) 2023-02-07 20:22:49 -08:00
Simon Späti
2bbc4f6f83 🎉 New Destination: DuckDB (#17494)
This is the first version of the DuckDB destination. There are potential edge cases that still need to be taken care of. But looking forward to your feedback.
2023-02-07 11:33:10 +01:00
Edward Gao
517fc6ac10 Normalization: Revert to protocol v0 (#22283)
* Revert "Normalization: handle non-object top-level schemas; treat binary data as string (#22165)"

This reverts commit 8276d03359.

* Revert "Normalization: check for ref type existence (#22161)"

This reverts commit dbe56d6fc2.

* Revert "🎉Updated normalization to handle new datatypes (#19721)"

This reverts commit c1d7736639.

* revert dest definitions

* also dockerfile

* re-add to changelog

* add comment in dockerfile
2023-02-06 10:14:36 -08:00
Jimmy Ma
6660b13ad2 Add Airbyte Protocol V1 support. (#20036)
* Add Airbyte Protocol V1 support.

* Fix VersionedAirbyteStreamFactoryTest

* Remove AirbyteMessageMigrationV0 example

* Add Protocol Version constants

* 🎉Updated normalization to handle new datatypes (#19721)

* Updated normalization simple stream processing to handle new datatypes

* Updated normalization nested stream processing to handle new datatypes

* Updated normalization nested stream processing to handle new datatypes

* Updated normalization drop_scd_catalog processing to handle new datatypes

* Updated normalization ephemeral test processing to handle new datatypes

* fixed more tests for normalization

* fixed more tests for normalization

* fixed more tests for normalization

* fixed more tests for normalization

* fixed more issues

* fixed more issues (clickhouse)

* fixed more issues

* fixed more issues

* fixed more issues

* added binary type processing for some DBs

* cleared commented code and moved some hardcodes to processing as macro

* fixed codestyle and cleared commented code

* minor refactor

* minor refactor

* minor refactor

* fixed bool cast error

* fixed dict->str cast error

* fixed is_combining_node cast py check

* removed commented code

* removed commented code

* committed autogenerated normalization_test_output files

* committed autogenerated normalization_test_output files (new files)

* refactored utils.py

* Updated utils.py to use Callable functions and get rid of property_type in is_number and is_bool functions

* committed autogenerated normalization_test_output files (new files)

* fixed typo in TIMESTAMP_WITH_TIMEZONE_TYPE

* updated stream_processor to handle string type first as a wider type

* fixed arrays normalization by updating is_simple_property method as per new approaches

* format

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* Update airbyte protocol migration (#20745)

* Extract MigrationContainer from AirbyteMessageMigrator

* Add ConfiguredAirbyteCatalogMigrations

* Add ConfiguredAirbyteCatalog to AirbyteMessageMigrations

* Enable ConfiguredAirbyteCatalog migration

* Fix tests

* Remove extra this.

* Add missing docs

* Typo

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* Data types update: Implement protocol message migrations (#19240)

* Extract MigrationContainer from AirbyteMessageMigrator

* Add ConfiguredAirbyteCatalogMigrations

* Add ConfiguredAirbyteCatalog to AirbyteMessageMigrations

* Enable ConfiguredAirbyteCatalog migration

* set up scaffolding

* [wip] more scaffolding, basic unit test

* minimal green code

* [wip] add failing test for other primitive types

* correct version number

* handle basic primitive type decls

* add implicit cases

* add recursive schema

* formatting

* comment

* support not

* fix indentation

* handle all nested schema cases

* handle boolean schemas

* verify empty schema handling

* cleanup

* extract map

* code organization

* extract method

* reformat

* [wip] more tests, minor fix type array handling

* corrected test

* cleanup

* reformat

* switch to v1

* add support for multityped fields

* missed test case

* nested test class

* basic record upgrade

* implement record upgrades

* slight refactor

* comments+clarificationso

* extract constants

* (partly) correct model classes

* add de/ser

* formatting

* extract constants

* fix json reference

* update docs

* switch to v1 models

* fix compile+test

* add base64 handling

* use vnull

* Data types update: Implement protocol message downgrade path (#19909)

* rough skeleton for passing catalog into migration

* basic test

* more scaffolding

* basic implementation

* add primitives test

* add in other tests (nested fields currently failing)

* add formats

* impleent oneOf handling

* formatting

* oneOf handling

* better tests

* comments + organization

* progress

* basic test case

* downgrade objects, ish

* basic array implementation

* handle numeric failure

* test for new type

* handle array items

* empty schema handling

* first pass at oneof handling

* add more tests+handling

* more tests

* comments

* add empty oneof test case

* format + reorganize

* more reorganize

* fix name

* also downgrade binary data

* only import vnull

* move migrations into v1 package

* extract schema mutation code

* comment

* extract schema migration to new class

* extract record downgrade logic for future use

* format

* fix build after rebase

* rename private method for consistency

* also implement configuredcatalog migrations >.>

* quick and dirty tests

* slight cleanup

* fix tests

* pmd

* pmd test

* null check on message objects

* maybe fix acceptance tests?

* fix name

* extract constants

* more fixes

* tmp

* meh

* fix cdc acc tests

* revert to master source-postgres

* remove log messages

* revert other misc hacks

* integers are valid cursors

* remove unrelated change

* fix build

* fix build more?

* [MUST REVERT] use dev normalization

* capture kube logs

* also here?

* no debug logs?

* delete dup from merging

* add final everywhere

* revert test changes

Co-authored-by: Jimmy Ma <jimmy@airbyte.io>

* On-the-fly migrations of persisted catalogs (#21757)

* On the fly catalog migration for normalization activity

* On the fly catalog migration for job persistence

* On the fly migration for standard sync persistence

* On the fly migration for airbyte catalogs

* Refactor code to share JsonSchema traversal

* Add V0 Data type search function

* PMD and Format

* Fix getOrInsertActorCatalog and ConfigRepositoryE2E tests

* Null-proofing CatalogMigrationV1Helper

* More null checks

* Fix test

* Format

* Add data type v1 support to the FE

* Changes AC test check to check exited ps (#21672)

some docker compose changes no longer show exited
processes.  this broke out test

this change should fix master

tested in a runner that failed

* Move wellknown types mapping to the utility function

* use protocolv1 normalization

---------

Co-authored-by: Topher Lubaway <asimplechris@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* Update protocol support range (#21996)

* bump normalization version to 0.3.0

* Add version check on normalization (#22048)

* Add normalization min version check

* Add visible for testing

---------

Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: Eugene <etsybaev@gmail.com>
Co-authored-by: Topher Lubaway <asimplechris@gmail.com>
2023-01-30 10:17:49 -08:00
Greg Solovyev
6af98045f8 Parameterize test_empty_streams and test_stream_with_1_airbyte_column by destination (#18197)
* Remove lines that always add Postgres to list of destinations
* Parameterize all tests in test_ephemeral by destination
2022-11-01 12:55:32 -07:00
Greg Solovyev
8cf546483d 🐛 Add a drop table hook to drop scd tables in case of overwrite sync (#18015)
* Add a drop table hook to drop scd tables in case of overwrite sync

* Add an integration test for dropping SCD table on overwrite

* skip new test for Oracle and TiDB

* Add normalization run after initial reset

* Bump normalization version
2022-11-01 08:52:02 -07:00
Greg Solovyev
5f25d2d069 Greg/clickhouse polishing (#17483)
* add icon for clickhouse in destination folder

* use http port only in clickhouse

* declare driver: http for dbt explicitly

* bump destination clickhouse version

Co-authored-by: restrry <restrry@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-10-01 12:40:19 -07:00
Daemonxiao
d4524032ae 🎉 New Destination: TiDB (#15592)
* Add new destination-tidb

* support sync

* Add normalization-tidb

* fix failed tests

* Add unnest marco

* fmt

* Add new destination-tidb

* support sync

* Add normalization-tidb

* fix failed tests

* Add unnest marco

* fmt

* fmt

* fix integration test

* Update docs/integrations/destinations/tidb.md

Co-authored-by: Xiang Zhang <angwerzx@126.com>

* Update doc

* Update doc

* Update doc

* bump normalization version

* update normalization changelog

* run format

* add dest def

* generat spec

Co-authored-by: Xiang Zhang <angwerzx@126.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
2022-08-31 16:50:27 -03:00
Greg Solovyev
5819733ab1 Greg/guykoh update dbt clickhouse (#14897)
* Update dbt-clickhouse version to 1.1.7 to support AirByte on ClickHouse cloud

* Fix quote handling in Clickhouse normalization tests

* Update test output for Clickhouse

* Bump version and update changelog

Co-authored-by: guykohen <guy@clickhouse.com>
2022-08-22 21:53:11 -07:00
Edward Gao
b2dd470d3d Handle ints and longs in normalization (#14362)
* generate airbyte_type:integer

* normalization accepts `airbyte_type: integer`

* handles ints+longs

* update avro for consistency

* delete long type for now, treat all ints as longs

* update avro type mappings

{type:number, airbyte_type:integer} -> long
{type:number, airbyte_type:big_integer} -> string (i.e. "unbounded integer")

* fix test

* remove long handling

* Revert "remove long handling"

This reverts commit 33ade8d2831e675c3545ac6019d200ec312e54d9.

* Revert "update avro type mappings"

This reverts commit 5b0349badad7545efe8e1191291a628445fe1c84.

* Revert "delete long type for now, treat all ints as longs"

This reverts commit 018efd4a5d0c59f392fd8e3b0d0967c666b72947.

* Revert "update avro for consistency"

This reverts commit bcf47c6799b5906deb4f219d7f6e64ea73b41b74.

* newline@eof

* update test

* slightly better local tests

* fix test

* missed a few cases

* postgres tests use correct hostnames

* fix normalization

* fix int macro

* add test case

* normalization test output

* handle int/long correctly

* fix types for other DBs

* uint32 -> bigint; tests

* add type value assertions

* more test updates

* regenerate output

* reconcile big_integer to match docs

* update comment

* fix type

* fix mysql constructor call

* bigint only has 38 digits

* fix s3 ints, fix DAT test case

* big_integer should be string

* reduce to 28 digit big_ints

* fix test setup, mysql

* kill big_integer tests

* regenerate output

* version bumps

* auto-bump connector version [ci skip]

* auto-bump connector version [ci skip]

* auto-bump connector version [ci skip]

* auto-bump connector version [ci skip]

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-07-26 16:40:14 -07:00
Anna Lvova
49636982c1 🎉 Base Normalization: handle airbyte_type from stream schema in normalization (#13591)
* add datatypes

* up

* up

* add MySQL

* add MSSQL

* fix

* add macros

* add macros

* upd

* upd

* upd for clickhouse

* Return datetime2 for MS SQL

* Upd time type for mysql

* Upd datetime for MySQL

* update

* upd date type for clickhouse

* up

* auto-generate

* bump version

* bump version
2022-07-26 19:49:05 +03:00
Edward Gao
89e78a6be5 🐛 Destination BigQuery can handle nulls inside arrays (#14522) 2022-07-13 18:21:17 -07:00
Baz
062b12f1ba 🎉 Base Normalization: clean-up Redshift tmp_schemas after SAT (#14015)
Now after `base-normalization` SAT the Destination Redshift will be automatically cleaned up from test leftovers. Other destinations are not covered yet.
2022-06-27 20:44:04 +03:00
Serhii Chvaliuk
49d181a198 Normalization: Fix incorrect jinja2 macro json_extract_array call (#13894)
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
2022-06-19 13:13:49 +03:00
Augustin
e8146e5ec2 Normalization: Upgrade MySQL to dbt 1.0.0 (#11470) 2022-06-15 15:05:49 -07:00
Edward Gao
897522cf51 Add some dev-facing normalization docs (#13780) 2022-06-15 08:21:14 -07:00
Edward Gao
61ce03a436 🐛 Normalization correctly propagates deletions to the final tables (#12846) 2022-06-14 14:56:18 -07:00
Serhii Chvaliuk
0342699daf Normalization: rename *.sql -> *.sql.j2 (#13474)
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
2022-06-06 18:58:34 +03:00
Brian Leonard
b882538147 Snowflake integration test steps (#13205)
* Destination-snowflake test config update

* Tests assume this ones doesn’t work!

* Make GCS integration

* Use existing GCS integration because tests use it

* Add comments

* Snowflake setup in base-normalization

* Markdown!

* Respect the env variable

* readme update

* Updated snapshot
2022-05-26 13:24:29 -07:00
Alexandre Girard
3894134d11 Bump year in license short to 2022 (#13191)
* Bump to 2022

* format
2022-05-25 17:56:49 -07:00
Serhii Chvaliuk
1ea62f33f8 Normalization snowflake: add datetime without timezone (#12745)
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
2022-05-16 13:56:07 +03:00
oneshcheret
d35d7b07b8 Mssql destination: enable DAT tests, use nvarchar and datetime2 by default (#12305)
* Mssql destination: enable DAT tests for mssql destination, use nvarchar and datetime2 by default

* Mssql destination: update array handling in test

* Mssql destination: update array and JSON handling in test

* Mssql destination: remove unused method

* bugfix bigquery tests, dataset_location added

* basic-normalization.md updated

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* Mssql destination: change parent class for mssql test

Co-authored-by: Sergey Chvalyuk <grubberr@gmail.com>
2022-05-07 11:22:22 +03:00
Davin Chia
e93bb85dc7 Fix build. (#12242) 2022-04-21 20:46:31 +08:00
Yurii Bidiuk
9d9507b227 revert formatting for test_pokemon_super.sql (#12234) 2022-04-21 11:28:23 +03:00
Yurii Bidiuk
785bcc4a9a 🐛 Destination Redshift: fix switching mode (#12085)
* fix switching mode for redshift

* bump version

* format code

* update spec
2022-04-20 16:57:15 +03:00
Serhii Chvaliuk
7023fbd48e Redshift SUPER type (#12064)
* 🎉 Destination Redshift: Use SUPER data type on Redshift destination for raw JSON data (#9407)

Co-authored-by: Oleksandr Tsukanov <alexander.tsukanovvv@gmail.com>
Co-authored-by: Sergey Chvalyuk <grubberr@gmail.com>
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
2022-04-20 15:11:22 +03:00
Edward Gao
c1381cde2c Revert Redshift SUPER PRs (#12041) 2022-04-14 12:36:26 -07:00
Serhii Chvaliuk
9b05bc1f34 Normalization redshift - add support SUPER type (#9610)
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
Co-authored-by: Oleksandr Tsukanov <alexander.tsukanovvv@gmail.com>
2022-04-12 21:42:43 +03:00
Parker Mossman
dfd25f0e85 Un-revert add /tmp emptyDir volume to connector pods (#11511)
* Revert "Revert "add /tmp emptyDir volume to connector pods (#10761)" (#11053)"

This reverts commit eea515614c.

* prettier

* bump version of base-normalization to pick up /tmp -> /dbt-tmp change

* change /dbt-tmp/dbt_modules to /dbt

* Regenerate test output files

* add to changelog

Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2022-04-11 13:12:51 -07:00
Edward Gao
0464a1074b 🐛 Normalization: Decrease event buffer size (#11267) 2022-03-25 16:03:21 -07:00
Edward Gao
046fc5e1cc 🎉 upgrade dbt to 1.0.0 (except for oracle and mysql) (#11051) 2022-03-11 16:38:37 -08:00
Christophe Duong
f0e8e48d82 Format code (#10837)
* Regenerate MySQL outputs from normalization tests

* format
2022-03-03 17:28:22 +01:00
Christophe Duong
04a113ea8c Clean up normalization (#9355) 2022-01-07 18:03:53 +01:00
Marcos Marx
511819b5ae Normalization fix Prefix Tables starting with number (#9301)
* add normalization-clickhouse docker build step

* bump normalization version

* small changes gradle

* fix settings gradle

* fix eof file

* correct clickhouse normalization

* Refactor jinja template for scd (#9278)

* merge chris code and regenerate sql files

* correct scd post-hook generation for snowflake

* fix scd table for snowflake prefix table with number

* scd fix for all destinations

* use quote

* use normalize column for post-hook

* change logic to apply quote

* add logic to handle prefix for mssql and oracle

* run tests

* correct unit test

* bump normalization version

Co-authored-by: James Zhao <james.zhao@sinoreps.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
2022-01-06 23:39:41 -03:00
Edward Gao
b6926d44d4 🚨 Snowflake produces permanent tables 🚨 (#9063) 2022-01-06 10:10:25 -08:00
Christophe Duong
c5d4a97363 🐛 Fix normalization issue with quoted & case sensitive columns (#9317) 2022-01-06 18:59:09 +01:00
Christophe Duong
e0bac4aaeb 🐛 Fix normalization SCD partition by float columns errors with BigQuery (#9281) 2022-01-06 18:49:31 +01:00
Marcos Marx
de56d4713c Publish PR 9029: clickhouse normalization (#9072)
* add normalization-clickhouse docker build step

* bump normalization version

* small changes gradle

* fix settings gradle

* fix eof file

* correct clickhouse normalization

* Refactor jinja template for scd (#9278)

* merge chris code and regenerate sql files

Co-authored-by: James Zhao <james.zhao@sinoreps.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
2022-01-04 23:28:14 -03:00