mirror of synced 2026-01-06 06:04:16 -05:00
Commit Graph

47 Commits

Author SHA1 Message Date
Andrii Leonets
107f5b8d61 🎉 Abstract level for SQL relational database sources (#4123)
Abstract level for SQL relational database sources
2021-07-05 17:18:07 +03:00
Christophe Duong
7c26305865 Add supportsDBT and supportsNormalization to API objects (#4031)
* Add supportsDBT and supportsNormalization to API objects
2021-06-11 10:11:42 +02:00
Andrii Leonets
213fae17a1 MySQL source: Add comprehensive data type test (#3810) 2021-06-07 14:01:02 +03:00
Subodh Kant Chaturvedi
6adad7d98e destination-specification: add supportsNormalization and supportsDBT attributes (#3779)
* destination-specification: add supportsNormalization and supportsDBT attributes

* address review comment

* missed this one

* output after gradle format
2021-06-01 17:40:28 +05:30
Subodh Kant Chaturvedi
601ea5eed7 destination: add implementation for mysql as destination (#3242)
* destination: add implementation for mysql as destination

* Fix formatting errors.

* address review comments + fix flaky test

* fix formatting

* address Davin's review comments

* add missing todo

* enable namespace test + only provide test user the minimum permissions required

Co-authored-by: Davin Chia <davinchia@gmail.com>
2021-05-07 11:44:58 +05:30
Davin Chia
b9014acfca 🎉 Namespace support. Supported source-destination pairs will now sync data into the same namespace as the source. (#2862)
This PR introduces the following behavior for JDBC sources:
Instead of streamName = schema.tableName, this is now streamName = tableName and namespace = schema. This means that, when replicating from these sources, data will be replicated into a form matching the source, e.g. public.users (postgres source) -> public.users (postgres destination) instead of the current behaviour of public.public_users. Since MySQL does not have schemas, the MySQL source uses the database as its namespace.

To do so:
- Make namespace a first-class concept in Airbyte Protocol. This allows the source to propagate a namespace and destinations to write to a source-defined namespace. It also sets us up for future namespace-related configurability.
- Add an optional namespace field to the AirbyteRecordMessage. This field will be set by sources that support namespace.
- Introduce AirbyteStreamNameNamespacePair as a type-safe way of identifying streams throughout our code base.
- Modify base_normalisation to better support source-defined namespaces, specifically allowing normalisation of tables with the same name to different schemas.
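
The naming change described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the project's code: `StreamNameNamespacePair` is a hypothetical stand-in for AirbyteStreamNameNamespacePair, and `destination_table` with its default-schema fallback and dot-to-underscore replacement is an assumption chosen only to reproduce the public.public_users vs. public.users example from the message.

```python
from typing import NamedTuple, Optional


class StreamNameNamespacePair(NamedTuple):
    """Hypothetical stand-in for AirbyteStreamNameNamespacePair."""
    name: str
    namespace: Optional[str]


def old_style_stream(schema: str, table: str) -> StreamNameNamespacePair:
    # Before the change: the schema is folded into the stream name
    # and no namespace is set.
    return StreamNameNamespacePair(name=f"{schema}.{table}", namespace=None)


def new_style_stream(schema: str, table: str) -> StreamNameNamespacePair:
    # After the change: the stream name is the bare table name and
    # the schema travels separately as the namespace.
    return StreamNameNamespacePair(name=table, namespace=schema)


def destination_table(pair: StreamNameNamespacePair, default_schema: str) -> str:
    # Assumed destination behaviour: use the source-defined namespace when
    # present, otherwise fall back to the destination's default schema.
    # Dots in the stream name are replaced so it is a valid table name.
    schema = pair.namespace if pair.namespace is not None else default_schema
    return f"{schema}.{pair.name.replace('.', '_')}"


if __name__ == "__main__":
    print(destination_table(old_style_stream("public", "users"), "public"))
    print(destination_table(new_style_stream("public", "users"), "public"))
```

Keeping name and namespace in one immutable pair means two tables with the same name in different schemas remain distinct keys, which is the case the normalisation bullet refers to.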
2021-04-17 15:33:22 +08:00
Davin Chia
e11ccfd0a1 Revert "Remove schema from stream name. (#2807)" (#2857)
This reverts commit 6e9d6fce59.
2021-04-12 14:56:11 -07:00
Davin Chia
6e9d6fce59 Remove schema from stream name. (#2807)
Last step (besides documentation) of the namespace changes. This is a follow-up to #2767.

After this change, the following JDBC sources will change their behaviour to the behaviour described in the above document.

Namely, instead of streamName = schema.tableName, this will become streamName = tableName and namespace = schema. This means that, when replicating from these sources, data will be replicated into a form matching the source, e.g. public.users (postgres source) -> public.users (postgres destination) instead of the current behaviour of public.public_users. Since MySQL does not have schemas, the MySQL source uses the database as its namespace.

I cleaned up some bits of the CatalogHelpers. This affected the destinations, so I'm also running the destination tests.
2021-04-12 21:02:29 +08:00
Davin Chia
58062faccb Discover Schema sets Namespace field. (#2767)
This PR is step 5 of this tech spec - https://docs.google.com/document/d/1qFk4YqnwxE4MCGeJ9M2scGOYej6JnDy9A0zbICP_zjI/edit.

The first of (at least) 2 PRs to implement this on the source side. I made some headway before deciding to break the changes into one PR implementing this for discover schema job, and another PR implementing this for read. The combined PR would have been too big otherwise.

Also refactor MoreResources, as the test method was attempting to write to the location the classes were loaded from. The issue is that we cannot guarantee that this location is writable. The test now writes to a random folder in the temp directory instead.
2021-04-07 11:53:03 +08:00
Davin Chia
9f16651840 Add Namespace Field. (#2704)
Add a namespace field to the Airbyte Stream in preparation for propagating a source-defined namespace to the Destination.

This namespace field is then consumed as the destination schema the table is written to.

This only applies to JDBC destinations.

This is Steps 1 - 4 of the namespace tech spec, seen at https://docs.google.com/document/d/1qFk4YqnwxE4MCGeJ9M2scGOYej6JnDy9A0zbICP_zjI/edit.

Some minor refactoring and commenting as I go.
* Remove unnecessary test classes, as they match the integration tests in terms of what is being tested. They have no real value since the corresponding integration test can be run locally without additional credentials. The main value the classes bring is letting us run tests without building the docker image (the integration tests require doing so). However, I feel this benefit is not worth the additional maintenance cost.
* Centralise DataArgumentProvider into its own class for easier maintenance and usability.
2021-04-06 11:50:47 +08:00
Christophe Duong
6c6ea54bb8 Add SupportedDestinationSyncModes to destination specs objects (#2668)
* Add SupportedDestinationSyncModes to destination specs objects

* Bumpversions of destination connectors
2021-03-31 15:20:01 +02:00
Christophe Duong
8a29584125 ☝🏼 Destinations support destination sync mode (#2460)
* Handle destination sync mode in destinations

* Source & Destination sync modes are required (#2500)

* Provide Migration script making sure it is always defined for previous sync configs
2021-03-26 20:23:48 +01:00
Jared Rhizor
0530f85e54 separate the retrieval of incremental and full refresh streams in the jdbc abstract source (#2582)
* separate full refresh and incremental iterator creation

* fix underspecified test

* fmt

* fmt

* respond to nit and fix test
2021-03-24 08:14:22 -07:00
Christophe Duong
b1e911e255 Back-end support destination sync modes #2370 (#2375)
* Add new fields for destination_sync_modes
2021-03-10 20:01:12 +01:00
Christophe Duong
070575ffdf Protocol allows future / unknown properties (#2238)
* Allow new extra properties in validation
* Create migration script to upgrade all connectors versions
* Bumpversion of all connectors
2021-03-09 13:36:36 +01:00
Charles
9a81bd6e5c MeiliSearch Destination (#1964) 2021-02-08 18:44:55 -08:00
Charles
8347a69c77 Add Incremental to AbstractJdbcSource (#1306)
* Add standard tests for sources that use the JdbcSource to guarantee that changes do not break any sources that rely on JdbcSource.

* Add JdbcStressTest to verify that we stream / chunk data properly (i.e. can handle more data in any JdbcSource than fits in memory)

* Migrate MSSQL and Redshift to use the new base source
2020-12-18 14:17:56 -08:00
Charles
25689eea56 add incremental to jooq source (and postgres) (#1172) 2020-12-08 21:14:11 -08:00
Christophe Duong
b9fbf24ea6 Add Supports Incremental Flag to Destination Specs #1124 (#1193)
* Add Supports Incremental Flag to Destination Specs #1124
2020-12-04 18:19:09 +01:00
Charles
9b2c946255 revert stream / field filtering in sources (#1095) 2020-11-25 17:19:21 -08:00
Charles
daa40e8357 Pipe Incremental Configuration to API (#1079) 2020-11-25 15:25:39 -08:00
Charles
be1691ec81 fix duplicate key exception in mssql (#983) 2020-11-20 11:10:50 -08:00
Jared Rhizor
72596f0d80 only look at properties for validation (#1034)
* only look at properties for validation

* fmt
2020-11-20 09:06:08 -08:00
Charles
02819a4b87 Incremental Docs and Data Model Update (#1021) 2020-11-19 22:07:32 -08:00
Charles
e7edb2c858 Adding incremental to the catalog data model (#998)
* Add ConfiguredAirbyteCatalog and ConfiguredAirbyteStream
2020-11-18 14:15:59 -08:00
Jared Rhizor
bd14fffe69 stream/field name validation/filtering (#1004)
* add stream and field name validation in discovery

* fix resource handling

* filter out streams with invalid stream or field names in the sync worker

* remove accidental commit

* add unicode key to test

* add another unicode test to isValidIdentifier

* update docs to reflect this change
2020-11-18 10:28:44 -08:00
Charles
c2b68f1a20 hotfix: duplicate column in discover of mssql (#982) 2020-11-15 15:34:35 -08:00
Michel Tricot
4d1dc69b55 Hides SQL dialects + fix some bugs in the jdbc source (#767) 2020-10-31 11:18:53 -07:00
Charles
88052c3968 Add JDBC Base Source (#757) 2020-10-30 10:47:43 -07:00
Charles
e68ee481a4 Push Schema Struct out of Worker (#682) 2020-10-22 14:22:48 -07:00
Jared Rhizor
08737335ce remove airbytetype (#653) 2020-10-20 11:01:53 -07:00
Charles
b9e20e504c Migrate python-base, singer-base to AirbyteProtocol (#608) 2020-10-19 10:23:02 -07:00
Charles
8a43371a6c => succeeded, => failed (#601) 2020-10-16 18:45:28 -07:00
Charles
15bb5aeeb1 Converter between AirbyteCatalog and Schema (#600) 2020-10-16 17:46:39 -07:00
Jared Rhizor
f260c672ee use static python classes (#596)
* switch to datamodel-codegen

* fix stripe issue

* fix build

* add newline

* fix tests

* use connection status and fix log method

* fix typo

* possibly fix build?

* make build generate files

* fix build
2020-10-16 16:55:31 -07:00
Charles
a9796111f6 Move all airbyte protocol into a single file (#598)
* Issue to revert this out: https://github.com/airbytehq/airbyte/issues/599
2020-10-16 14:21:25 -07:00
Charles
97be04bce2 airbyte message as envelope (#597) 2020-10-16 14:05:07 -07:00
Charles
63098063cd add connection status message to airbyte protocol (#594) 2020-10-16 09:09:57 -07:00
Charles
ba9bf4a88d migrate destinations to use airbyte protocol (#557) 2020-10-14 11:18:50 -07:00
Charles
deff95a9fe add Catalog to DefaultAirbyteDestination (#563) 2020-10-14 10:40:03 -07:00
Michel Tricot
364eb1eafc Add enum for Airbyte Json Schemas (#543) 2020-10-12 10:07:18 -07:00
Michel Tricot
a07b8edfd9 Change AirbyteMessage emitted_at type (#542) 2020-10-12 09:44:47 -07:00
Michel Tricot
9e4a74b206 Prepare worker migration to AirbyteProtocol (#525) 2020-10-09 14:21:13 -07:00
Jared Rhizor
6e51c8e0c9 add stream name support in AirbyteRecordMessage (#532)
* add stream name support in AirbyteRecordMessage

* remove unnecessary deletes

* fmt

* also fix state structure
2020-10-09 09:55:32 -07:00
Jared Rhizor
88527df94a python source jsonschema support (#520)
* support using jsonschema types within sources + spec/read support for exchangeratesapi

* remove unnecessary key sorting

* projectDir/../.. -> rootDir

* better log and record handling

* output state messages and exclude outputting schema as message

* fix test

* hide warnings

* fmt
2020-10-09 08:45:23 -07:00
Jared Rhizor
ee487c145e clean up protocol definition files & change to yamlschema (#514)
* clean up protocol files

* fmt

* convert to yaml

* fix extensions

* revert quote changes

* convert rest to yaml

* fix builds

* fix build v2

* fmt

* fix build v3
2020-10-07 15:24:36 -07:00
Michel Tricot
0380cfb60a Implementation data protocol (#489) 2020-10-06 09:05:33 -07:00