airbyte

mirror of synced 2026-01-22 19:01:47 -05:00

Author	SHA1	Message	Date
etsybaev	3bbbfcc8cf	🎉 Source Postgres: Added new comprehensive data type tests (#4074) * [3795] Added Postgres Source Comprehensive Tests	2021-06-16 23:10:09 +03:00
Charles	66e8eb9bbe	Remove Tap / Target Nomenclature from Workers (#4149) * StandardTapConfig => WorkerSourceConfig; StandardTargetConfig => WorkerDestinationConfig * variable renames	2021-06-16 11:11:13 -07:00
Davin Chia	b04c080c95	Kube Queueing POC (#3464) * Use CDK to generate source that can be configured to emit a certain number of records and always works. * Checkpoint: socat works from inside the docker container. * Override the entry point. * Clean up and add ReadMe. * Clean up socat. * Checkpoint: connect to Kube cluster and list all the pods. * Checkpoint: Sync worker pod is able to send output to the destination pod. * Checkpoint: Sync worker creates Dest pod if none existed previously. It also waits for the pod to be ready before doing anything else. Sync worker will also remove the pod on termination. * update readme * Checkpoint: Dest pod does not restart after finishing. Comment out delete command in Sync worker. * working towards named pipes * named pipes working * update readme * WIP named pipe / socat sidecar kube port forwarding (#3518) * nearly working sources * update * stdin example * move all kube testing yamls into the airbyte-workers directories. sort the airbyte-workers resource folder; place all the poc yamls together. * Format. * Put back the original KubeProcessBuilderFactory. * Fix slight errors. * Checkpoint: Worker pod knows its own IP. Successfully starts and writes to Dest pod after refactor. * remove unused file and update readme * Dest pod loops back into worker pod. However, the right messages do not seem to be passing in. * Switch back to worker ip. * SWEET VICTORY!. * wrap kube pod in process (#3540) also clean up kubernetes deploys. * More clean up. (#3586) The first 6 points of #3464. The only interesting thing about this PR is the kube pod shutdown. For whatever reason, the OkHttpPool isn't respecting the evictAll call and 1 idle thread remains. So instead of shutting down immediately, the worker pod shuts down after 5 mins when the idle thread is reaped. There isn't an easy way to modify the pool's idle reap configuration now.
I do not think this issue is blocking since it's relatively benign, so I vote we create a ticket and come back to this once we do an e2e test. * Implements redirecting standard error as well. (#3623) * Clean up before next implementation. * kube process launching (#3790) * processes must handle file mounting * remove comment * default to base entrypoint * use process builder factory / select stdin / use a pool of ports * fix up * add super hacky copying example * Checkpoint: Works end to end! * Checkpoint: Use API to make sure init container is ready instead of blind sleep. Propagate exception in DefaultCheckConnectionWorker. * Refactor KubePodProcess. Checked to make sure everything still works. * Format. * Clean up code. Begin putting this into variables and breaking up long constructor function. * Add comments to explain what is happening. * fix normalization test * increase timeout for initcontainer Co-authored-by: Davin Chia <davinchia@gmail.com> * facepalm moment * clean up kube poc pr (#3834) * clean up * remove source-always-works * create separate commons-docker * fix test * enable kube e2e tests (#3866) * enable kube e2e tests * use more generally accepted env definition * use new runners * use its own runner and install minikube differently * update name * use kubectl alias * use link instead of alias that doesn't propagate * start minikube * use driver=none * go back to using action * mess with versions * revert runner * install socat * print logs after run * also try re-running tasks * always wait for file transfer * use ports * increase wait timeout for kube * use different localhost ips and bump normalization to include an entrypoint * proposed fix * all working locally * revert temporary changes * revert normalization image change that's happening in a separate pr * readability * final comment * Working Kube Cancel. (#3983) * Port over the basic changes. * Add logic to return proper exit code in the event of termination.
Add comments to explain why. * revert envs change and merge master to fix kube acceptance tests (#4012) * use older env format * fix build Co-authored-by: jrhizor <me@jaredrhizor.com> Co-authored-by: Jared Rhizor <jared@dataline.io>	2021-06-09 18:12:39 -07:00
Marcos Marx	05b8a2c532	Add method field on spec.json connectors (snowflake and postgres) (#3960) * add prop for oneOf snowflake * add method field to pg and update config * correct pg tests * change version in docker file	2021-06-08 20:11:45 -03:00
Andrii Leonets	213fae17a1	MySQL source: Add comprehensive data type test (#3810)	2021-06-07 14:01:02 +03:00
Subodh Kant Chaturvedi	7ccc4fafe8	🎉 source: implementation for mysql cdc (#3505) * source: implementation for mysql cdc * add target file and position * dont want to add file in this PR * refine tests + add comments * fix typo * address review comments * fix formatting error * resolve conflicts * update docs + bump docker minor version * remove un-necessary new lines + add multiple checks for cdc * address review comments from Davin * increase the version in source_definitions.yaml * rebuild seed	2021-05-25 00:31:57 +05:30
Charles	8983f09aea	normalize connector acceptance test names (#3539) * Rename standard tests to acceptance tests * Normalize the names so that the nouns are always in the same order so it is easier to find tests	2021-05-22 13:40:40 -07:00
Charles	a7a398ba59	add explanatory comments for cdc (#3496)	2021-05-20 14:28:32 -07:00
Charles	e4d0707781	Destination Checkpointing: Add StateMessage handling to BufferedStreamConsumer (#3230)	2021-05-07 13:05:52 -07:00
Jared Rhizor	99f1448d30	remove isCdc config logging (#3179)	2021-05-03 07:42:47 -07:00
Charles	e4d227f5b4	set source defined cursor field for cdc (#2878)	2021-04-21 15:07:06 -07:00
Davin Chia	7e55f3a156	Modify CDC to be in line with namespace changes. (#2986) Modify CDC to be in line with namespace changes. Add a test case to double check this works.	2021-04-21 07:39:28 +08:00
Davin Chia	989ebee583	Hotfix: Postgres SSL tests. (#2926)	2021-04-17 16:38:10 +08:00
Davin Chia	b9014acfca	🎉 Namespace support. Supported source-destination pairs will now sync data into the same namespace as the source. (#2862) This PR introduces the following behavior for JDBC sources: Instead of streamName = schema.tableName, this is now streamName = tableName and namespace = schema. This means that, when replicating from these sources, data will be replicated into a form matching the source. e.g. public.users (postgres source) -> public.users (postgres destination) instead of current behaviour of public.public_users. Since MySQL does not have schemas, the MySQL source uses the database as its namespace. To do so: - Make namespace a first-class concept in Airbyte Protocol. This allows the source to propagate namespace and destinations to write to a source-defined namespace. Also sets us up for future namespace related configurability. - Add an optional namespace field to the AirbyteRecordMessage. This field will be set by sources that support namespace. - Introduce AirbyteStreamNameNamespacePair as a type-safe manner of identifying streams throughout our code base. - Modify base_normalisation to better support source defined namespace, specifically allowing normalisation of tables with the same name to different schemas.	2021-04-17 15:33:22 +08:00
Jared Rhizor	1ae2cc2051	make postgres ssl optional in spec (#2923)	2021-04-16 16:22:24 -07:00
Marcos Marx	ca8f304f90	Add SSL option for Postgres source/destination (#2757) * add ssl for source-postgres * add config in utf8 test * correct comments from @jrhizor and @sherifnada * correct config get * add ssl test postgres * add sh generate ssl files * change pg ssl test * use custom image * correct spec.json * correct tests * remove unnecessary config * add config and correct spec.json * add ssl to postgres destination * add tools to generate custom dockers images and correct spec.json * change how additional parameter is appended * add logic ssl for postgres destination * remove if for append add params * gradlew format	2021-04-16 15:37:55 -07:00
Davin Chia	e11ccfd0a1	Revert "Remove schema from stream name. (#2807)" (#2857) This reverts commit `6e9d6fce59`.	2021-04-12 14:56:11 -07:00
Davin Chia	6e9d6fce59	Remove schema from stream name. (#2807) Last step (besides documentation) of namespace changes. This is a follow up to #2767. After this change, the following JDBC sources will change their behaviour to the behaviour described in the above document. Namely, instead of streamName = schema.tableName, this will become streamName = tableName and namespace = schema. This means that, when replicating from these sources, data will be replicated into a form matching the source. e.g. public.users (postgres source) -> public.users (postgres destination) instead of current behaviour of public.public_users. Since MySQL does not have schemas, the MySQL source uses the database as its namespace. I cleaned up some bits of the CatalogHelpers. This affected the destinations, so I'm also running the destination tests.	2021-04-12 21:02:29 +08:00
Jared Rhizor	2b19da82d8	postgres cdc (#2548) * spike * more * debezium wip * use oneof for configuration * iterator wrapping structure * push current * working loop * move capability into source * hack it into a sharable state * debezium test runner (#2617) * CDC Wait for Values (#2618) * output actual AirbyteMessages for cdc (#2631) * message conversion * fmt * add lsn extraction and comparison (#2613) * postgres cdc catalog (#2673) * update cdc catalog * A * table selection for cdc (#2690) * table selection for cdc * fix broken merge * also test double quote in name * Add state management to CDC (#2718) * CDC: Fix Producer/Consumer State Machine (#2721) * CDC Postgres Tests (#2777) * fix postgres cdc image name and run check before reading data (#2785) * minor postgres cdc fixes * add test and fix check behavior * fix * improve comment * remove unused props, remove todos, add some more sanity tests (#2791) * cdc: add offset store tests (#2793) * clean (#2798) * postgres cdc docs (#2784) * cdc docs * Update docs/integrations/sources/postgres.md Co-authored-by: Charles <giardina.charles@gmail.com> * address gcp * learn too english * add link * add more disk space warnings * add additional cdc use case * add information on how to find postgresql.conf * add how to find the file Co-authored-by: Charles <giardina.charles@gmail.com> * various merge conflict fixes (#2799) * cdc standard tests (#2813) * require cdc users to create publications & update docs (#2818) * postgres cdc race condition * working?
but different process * add additional logging to help debug in the future * everything done except working config * remove unintended change * Use oneOf in PG CDC spec (#2827) * add oneOf configuration for postgres cdc (#2831) * add oneof configuration for cdc postgres * fmt Co-authored-by: Charles <giardina.charles@gmail.com> * fix test (#2834) * fix test * bump version * add docs on creating replica identities (#2838) * add docs on creating replica identities * emphasize danger * grammar * bump pg version in source catalog * generate seed files Co-authored-by: cgardens <giardina.charles@gmail.com>	2021-04-09 16:36:58 -07:00
Davin Chia	58062faccb	Discover Schema sets Namespace field. (#2767) This PR is step 5 of this tech spec - https://docs.google.com/document/d/1qFk4YqnwxE4MCGeJ9M2scGOYej6JnDy9A0zbICP_zjI/edit. The first of (at least) 2 PRs to implement this on the source side. I made some headway before deciding to break the changes into one PR implementing this for discover schema job, and another PR implementing this for read. The combined PR would have been too big otherwise. Also refactor MoreResources, as the test method was attempting to write to the location the classes were loaded from; the issue is that we cannot guarantee that location is writable. Changed this to write to a random folder in the temp directory.	2021-04-07 11:53:03 +08:00
Christophe Duong	6c6ea54bb8	Add SupportedDestinationSyncModes to destination specs objects (#2668) * Add SupportedDestinationSyncModes to destination specs objects * Bumpversions of destination connectors	2021-03-31 15:20:01 +02:00
Christophe Duong	8a29584125	☝🏼Destinations supports destination sync mode (#2460) * Handle destination sync mode in destinations * Source & Destination sync modes are required (#2500) * Provide Migration script making sure it is always defined for previous sync configs	2021-03-26 20:23:48 +01:00
Christophe Duong	41e8b6a824	Source support primary keys (#2488) * Source support primary keys	2021-03-17 19:28:56 +01:00
Charles	aadfae24bd	Iterator-based JDBC Source (and Redshift bugfix) (#1887)	2021-02-02 17:14:14 -08:00
Jared Rhizor	227e709a48	add postgres (source and destination) field titles (#1765) * add postgres titles * fix other conflict * fix other usage	2021-01-25 10:48:56 -08:00
Charles	6c5d1b2340	Assert Best Practices for JdbcDestinations (#1680)	2021-01-21 14:12:04 -08:00
Charles	13c5eef93a	Fix JdbcSource Incremental OOM (#1655)	2021-01-14 14:33:44 -08:00
Sherif A. Nada	68ecf991d6	Handle invalid numeric values in JDBC source (#1588)	2021-01-13 13:31:53 -08:00
Charles	102b432a5b	Migrate Postgres and MySql to use new JdbcSource (#1307)	2021-01-08 14:15:34 -08:00
Sherif A. Nada	93674f6b4d	Respect sync mode regardless of input state in mailchimp (#1213)	2020-12-11 13:04:45 -08:00
Charles	25689eea56	add incremental to jooq source (and postgres) (#1172)	2020-12-08 21:14:11 -08:00
Sherif A. Nada	e8a332ae65	Standard source incremental tests (#1175)	2020-12-04 09:54:10 -08:00
Christophe Duong	a49e7834f8	Change jdbc sources to discover more than standard schemas (#1038) Change jdbc sources to discover more than standard schemas	2020-11-30 17:36:54 +01:00
Charles	02819a4b87	Incremental Docs and Data Model Update (#1021)	2020-11-19 22:07:32 -08:00
Sherif A. Nada	f4c3ac70f9	annotate secret fields (#1012)	2020-11-19 15:13:23 -08:00
Charles	e7edb2c858	Adding incremental to the catalog data model (#998) * Add ConfiguredAirbyteCatalog and ConfiguredAirbyteStream	2020-11-18 14:15:59 -08:00
Jared Rhizor	ae25781fd9	prevent NPEs when password isn't set for jdbc integrations (#927)	2020-11-11 20:48:52 -08:00
Charles	d507f0f95b	Fix names for standard tests (#862)	2020-11-09 21:43:51 -08:00
Sherif A. Nada	7059825804	Add postgres JDBC source (#794)	2020-11-03 10:56:36 -08:00

39 Commits