airbyte

mirror, synced 2026-01-05 21:02:13 -05:00

Author	SHA1	Message	Date
Christophe Duong	b424c1a0e7	🐛 Fix incremental normalization with empty tables (#8394 ) * Fix incremental with empty final tables * upgrade docker images * Regen SQL * Bumpversion & format	2021-12-01 23:40:14 +01:00
Christophe Duong	c4c92bd689	Fix normalization when un-nesting (#8378 ) * Remove unique key on exploded nested tables * un-nest hint * Regen SQL	2021-12-01 17:13:04 +01:00
Christophe Duong	c5a7267378	🐛🐌 Optimize incremental normalization runtime with snowflake (#8088 )	2021-11-19 15:03:52 +01:00
Christophe Duong	affea7f60b	🐛 Minor fixes to incremental normalization and nesting (#7669 )	2021-11-08 17:42:57 +01:00
Christophe Duong	5fc50df39d	🎉 Incremental Normalization (#7162 )	2021-10-29 13:53:02 +02:00
Andrés Bravo	abf0159778	🎉 Add configurable dbt parameter to destination-bigquery (#7118 ) * Add configurable dbt parameter to destination-bigquery * Update airbyte-integrations/connectors/destination-bigquery/Dockerfile	2021-10-19 14:51:37 +02:00
Christophe Duong	c4620559d7	🎉 Refactor Normalization docker images and upgrade to use dbt 0.21.0 (#6959 ) * Split normalization docker images for some connectors with specifics dependencies * Regenerate (#7003)	2021-10-14 20:29:16 +02:00
Anna Lvova	ec68f478ff	🐛 fix: Normalization date-time should handle empty strings "" (#6379 ) * add empty string normalization for postgres * add empty string normalization for destinations * fix * fix * fix * fix for snowflake * fix for mysql * fix normalization for mysql * upd doc * upd doc * Update airbyte-integrations/bases/base-normalization/integration_tests/dbt_integration_test.py Co-authored-by: Christophe Duong <christophe.duong@gmail.com> * Update airbyte-integrations/bases/base-normalization/integration_tests/dbt_integration_test.py Co-authored-by: Christophe Duong <christophe.duong@gmail.com> * bump version * bump version * add datetime normalization for mssql * upd row count for mssql * upd * bump version * upd docs for 0.1.50 normalization version Co-authored-by: Christophe Duong <christophe.duong@gmail.com>	2021-10-08 13:57:37 +03:00
Daniel Diamond	f13313e37e	Order CDC records by cdc log position if present (#6688 )	2021-10-08 14:00:44 +05:30
Baz	e5abaeccef	🎉 Base-normalization: Implement `normalization` for `MSSQL-destination` (#6079 ) See the attached PR (https://github.com/airbytehq/airbyte/pull/6079)	2021-10-07 18:46:27 +03:00
Christophe Duong	a3196428a7	Forward destination location to dbt profiles (#6709 ) * Forward destination location to dbt profiles * Format code * Update version	2021-10-06 19:20:15 +02:00
Charles	5e750164ac	Publish SSL-only version of Postgres Destination (#6496 ) * try to publish new normalization version * default to using ssl in postgres destination * tidy up * Run normalization tests using postgres DB with SSL support * bump version Co-authored-by: Christophe Duong <christophe.duong@gmail.com>	2021-09-30 12:55:26 +02:00
andriikorotkov	8fa15713c3	🎉 Destination MySQL - Added support for connection via ssh (aka bastion server) (#6317 ) * updated mysql tests * updated mysql tests * added mysql ssh tunnel tests by key * fixed remarks * fixed remarks * updated DatabricksStreamCopier * switch to custom file for ssh config in normalization * updated MySQL SSH tests * bump version * get local port properly * updated assertSameValue for MySQL ssh tunnel * updated image version and documentation * updated code style * updated CI credentials * updated normalization documentation Co-authored-by: George Claireaux <george@claireaux.co.uk>	2021-09-28 13:11:32 +03:00
Michel Tricot	1773e41e47	Shorten our headers + adds contributors file (#6478 )	2021-09-27 10:45:50 -07:00
George Claireaux	3d8625e03d	Fix ssh tunneling for normalization (#6396 ) * switch to custom file for ssh config in normalization * bump version * get local port properly * added unit test for write_ssh_config * format	2021-09-23 14:08:45 +01:00
Yaroslav Dudar	a6ecfda2ca	🐛 Fix Snowflake destination normalization to accept any date-time format. (#6052 ) snowflake date-time format parser	2021-09-23 11:10:12 +03:00
Charles	8ad43afb07	SSH for Postgres Destination (#5743 ) Co-authored-by: George Claireaux <phlair@users.noreply.github.com>	2021-09-07 17:06:25 -07:00
Marcos Marx	589d535a61	🎉 Oracle normalization (#5562 ) * oracle normalization * correct dbt_project function for oracle * unit tests * run format * correct ephemeral tests * add gradle dependency for oracle destination * run int tests * add oracle in settings.gradle for normalization run * use default airbyte columns * format * test all destination ephemeral * correct unit test * correct unit test * destination docs update * correct mypy * integration test all dest * refactor oracle function * merge master * run all destinations * flake8 escape regex * surrogate key function * correct few minor comments * refactor scd sql function * refactor scd function * revert test * refactor minor details * revert tests * revert ephemeral test * revert unit test table_registry * revert airbyte_protocol format * format * bump normalization version in worker * minor changes * minor changes * correct json_column for other destinations * gradlew format * revert tests * remove comments * add Oracle destination explicit in safe_cast_str * add quote_in_parenthesis inside if clause * gradlew format	2021-09-07 16:39:17 -03:00
Marcos Marx	7225187fa1	run gradlew format (#5552 )	2021-08-20 15:38:28 -03:00
Marcos Marx	a9b2c08934	Add condition for unnest_column_name for pg/redshift/mysql (#5467 ) * add unnest_column case conflict * add redshift files * format * change logic * change logic for unnest * bump normalization version * add files * add stream test unnest_alias	2021-08-20 11:09:15 -03:00
Christophe Duong	158594fccc	Remove BQ_keyfile.json mentions in normalization (#5528 ) * Remove BQ_keyfile.json mentions in normalization * Bumpversion normalization	2021-08-19 15:36:44 +02:00
Christophe Duong	f9705bf731	BigQuery normalization: make credentials json optional (#5433 ) * Allow service-account-json or oauth methods for bigquery destinations	2021-08-17 11:50:17 +02:00
Marcos Marx	e4fe62f739	Normalization: solve conflict when stream and field have same name (#4557 ) * solve conflict when stream and field have same name * add logic to handle conflict * change files * change json_extract functions * json_operations * add normalization files * test integration mysql * remove table_alias * mysql run * json ops * solve conflict with master * solve mysql circle dependency dbt * add tests for scalar and arrays * add sql files * bump normalization version * format	2021-08-11 20:18:45 -03:00
Subodh Kant Chaturvedi	923884b897	introduce implementation for date-time support in normalization (#5180 ) * introduce implementation for date-time support in normalization * update test output for all destinations * add comment	2021-08-11 02:28:03 +05:30
Christophe Duong	d6429a410a	Normalization handles quote in column names (#5027 ) * Handle quotes in columns names	2021-07-28 16:00:13 +02:00
Christophe Duong	5cdc7f8517	🐛 (contribution) Fix SQL model to build a Type 2 SCD to handle NULL cursor_field values correctly (#4881 ) * Update SQL model to build a Type 2 Slowly Changing Dimension (#4802) * Make SQL more portable * Bumpversion of normalization Co-authored-by: Daniel Diamond <33811744+danieldiamond@users.noreply.github.com>	2021-07-22 16:27:54 +02:00
LiRen Tu	2caf3904f0	🎉 MySQL destination: normalization (#4163 ) * Add mysql dbt package * Add mysql normalization support in java * Add mysql normalization support in python * Fix unit tests * Update readme * Setup mysql container in integration test * Add macros * Depend on dbt-mysql from git repo * Remove mysql limitation test * Test normalization * Revert protocol format change * Fix mysql json macros * Fix two more macros * Fix table name length * Fix array macro * Fix equality test macro * Update replace-identifiers * Add more identifiers to replace * Fix unnest macro * Fix equality macro * Check in mysql test output * Update column limit test for mysql * Escape parentheses * Remove unnecessary mysql test * Remove mysql output for easier code review * Remove unnecessary mysql test * Remove parentheses * Update dependencies * Skip mysql instead of manually write out types * Bump version * Check in unit test for mysql name transformer * Fix type conversion * Use json_value to extract scalar json fields * Move dbt-mysql to Dockerfile (#4459) * Format code * Check in mysql dbt output * Remove unnecessary quote * Update mysql equality test to match 0.19.0 * Check in schema_test update * Update readme * Bump base normalization version * Update document Co-authored-by: Christophe Duong <christophe.duong@gmail.com>	2021-07-03 20:30:59 -07:00
Marcos Marx	265e7f79d8	Normalization: remove dedup cdc excluded (#4297 ) * change stream processor * integration tests * add integration tests * format gradle file * add excluded files * change catalog and msgs * add cdc messages * solve cdc excluded problem with tests * remove .egg files * remove time import * tab stream_processor * uncommented local test * add tests for dbt! * add excluded files * add missing snowflake file * add pg, bq and snowflake * chris comments * test comment * pytest parametrize tests * bump normalization version * formatting * run test for all destinations	2021-06-30 14:59:13 -03:00
Christophe Duong	75a1dda07e	🎉 New BigQuery destination with Structured/Repeated Records (#4176 )	2021-06-23 16:19:36 +02:00
Marcos Marx	810fde9e21	Documentation correct summary normalization docs (#4158 ) * correct summary * run format master failing	2021-06-16 12:36:41 -03:00
Christophe Duong	144bc7814e	Normalization with empty catalog (#4020 ) * Normalization with empty catalog	2021-06-10 14:22:55 +02:00
Christophe Duong	bb4dcb1987	🎉 Remove hash when it is not necessary from normalization outputs (#3704 ) * Refactor `generate_new_table_name` using a table name registry class instead * update normalization docs * Enable MyPy * Regenerate output files * Closes https://github.com/airbytehq/airbyte/issues/2389 * Bumpversion normalization	2021-06-01 17:07:22 +02:00
Christophe Duong	8862fba1bb	🎉 Avoid dbt runtime exception "maximum recursion depth exceeded" in ephemeral materialization * Create new test_ephemeral and refactor with test_normalization * Add notes in docs * Refactor common normalization tests into DbtIntegrationTest * Bumpversion of normalization image	2021-05-21 18:07:20 +02:00
Christophe Duong	8790fc10ab	Simple rename of dbt generated models folder (#3469 ) * Rename folder where models are generated from airbyte_views to airbyte_ctes * Integration test outputs are moved around as a result	2021-05-18 20:01:09 +02:00
Christophe Duong	083aebcbcb	Workflow to handle operations (custom transformation) (#3379 ) * Keep normalization backward compatible with old settings from destination * Bumpversion normalization image	2021-05-17 18:08:27 +02:00
Charles	0df53170c9	Stop formatting python with spotless (#3388 )	2021-05-13 17:46:34 -07:00
Christophe Duong	86513d6c54	Fix normalization Nesting bug (#3110 ) * New test case for nested streams * Fix filename naming (collisions and nesting) * Update generated files from tests with new file naming * Allow invalid json data in raw tables when normalizing on redshift * Regenerate final sql files * Disable unit tests on stream naming (temporarily) * Fix unnesting bug in postgres * Reactivate unit tests and change table registry * Move normalization unit tests to integration tests (too slow) * Remove heavy catalog.json used in unit_tests (actual catalog from facebook/stripe with thousands of lines) * Bumpversion of normalization image	2021-04-29 14:32:59 +02:00
Christophe Duong	c2fa3e4c9c	Introduce normalization integration tests (#3025 ) * Speed normalization unit tests by dropping hubspot catalog (too heavy, will be covering it in integration tests instead) * Add integration tests for normalization * Add dedup test case * adjust build.gradle * add readme for normalization * Share PATH env variable with subprocess calls * Handle git non-versioned tests vs versioned ones * Format code * Add tests check to normalization integration tests * Add docs * complete docs on normalization integration tests * format code * Normalization integration tests output (#3026) * Version generated/output files from normalization integration tests * simplify cast of float columns to string when used as partition key (#3027) * bump version of normalization image * Apply suggestions from code review Co-authored-by: Jared Rhizor <jared@dataline.io> * Apply suggestions from code review Co-authored-by: Jared Rhizor <jared@dataline.io>	2021-04-27 12:01:04 +02:00
Davin Chia	f660b0a946	Add template generation for Santa aka CDK. (#3034 ) Template generation for new Source using the Santa CDK - provide basic scaffolding for someone implementing a new source. General approach is to buff up comments in the original SDK, and add TODOs with secondary comments in the generated stub methods, as well as links to existing examples (e.g. Stripe or ExchangeRate api) users can look at. Checked in and added tests for the generated modules.	2021-04-25 18:02:33 +08:00
Charles	f445fdb5b2	match styling for spotlessApply and format (#3017 ) * as a java developer I want to be able to run spotlessApply without changing styles in python code	2021-04-23 09:21:41 -07:00
Christophe Duong	07a45df454	Add normalization test cases (#2992 ) * Add normalization test cases * Fix new normalization test on name collisions	2021-04-22 19:39:39 +02:00
Christophe Duong	5859e0cef1	Fix Normalization failing with "adapter" does not exist (#2941 ) * Fix normalization dedup on non-string primary key columns * Bumpversion of normalization image * Add test cases to standard test	2021-04-19 18:32:35 +02:00
Davin Chia	b9014acfca	🎉 Namespace support. Supported source-destination pairs will now sync data into the same namespace as the source. (#2862 ) This PR introduces the following behavior for JDBC sources: Instead of streamName = schema.tableName, this is now streamName = tableName and namespace = schema. This means that, when replicating from these sources, data will be replicated into a form matching the source. e.g. public.users (postgres source) -> public.users (postgres destination) instead of current behaviour of public.public_users. Since MySQL does not have schemas, the MySQL source uses the database as its namespace. To do so: - Make namespace a field class concept in Airbyte Protocol. This allows the source to propagate namespace and destinations to write to a source-defined namespace. Also sets us up for future namespace related configurability. - Add an optional namespace field to the AirbyteRecordMessage. This field will be set by sources that support namespace. - Introduce AirbyteStreamNameNamespacePair as a type-safe manner of identifying streams throughout our code base. - Modify base_normalisation to better support source defined namespace, specifically allowing normalisation of tables with the same name to different schemas.	2021-04-17 15:33:22 +08:00
Davin Chia	e11ccfd0a1	Revert "Remove schema from stream name. (#2807 )" (#2857 ) This reverts commit `6e9d6fce59`.	2021-04-12 14:56:11 -07:00
Davin Chia	6e9d6fce59	Remove schema from stream name. (#2807 ) Last step (besides documentation) of namespace changes. This is a follow up to #2767 . After this change, the following JDBC sources will change their behaviour to the behaviour described in the above document. Namely, instead of streamName = schema.tableName, this will become streamName = tableName and namespace = schema. This means that, when replicating from these sources, data will be replicated into a form matching the source. e.g. public.users (postgres source) -> public.users (postgres destination) instead of current behaviour of public.public_users. Since MySQL does not have schemas, the MySQL source uses the database as its namespace. I cleaned up some bits of the CatalogHelpers. This affected the destinations, so I'm also running the destination tests.	2021-04-12 21:02:29 +08:00
Christophe Duong	fafc25d86a	Add primary key tests to TestDestination (#2776 ) * Add primary tests to TestDestination * Test with composite primary keys	2021-04-08 11:01:02 +02:00
Christophe Duong	0b6a9830da	Missing keywords in redshift (#2700 ) * Missing keywords in redshift	2021-04-01 17:33:11 +02:00
Christophe Duong	dbbb58d0a8	🎉 normalization bugfix: support integers with precision > 32 bits & support union types (#2410 )	2021-03-12 12:19:18 +01:00
Christophe Duong	28b5134d0e	Normalization support destination sync modes append_dedup #2372 (#2394 ) (This is not enabled for usage until front-end work is ready)	2021-03-12 12:18:24 +01:00
Jared Rhizor	fa505c7800	deterministic table name collision resolution for normalization (#2206 ) * deterministic collision handling for table names * remove debugging print statement * fmt * fix flake check * fix * fix * fix usage * respond to more feedback * fix everything except truncation * fix everything but expected values * add test for just table name middle truncation * handle inconsistent suffixes * update tests * fmt * refactor (again) * fix * update comments * remove formatting * use full path * remove logging * remove print statements	2021-03-01 11:25:51 -08:00

76 Commits