mirror of synced 2026-01-04 18:04:31 -05:00
Commit Graph

125 Commits

Author SHA1 Message Date
Charles
3f00a3e4c5 improve javadocs in replication worker (#7942) 2021-11-18 17:01:40 -08:00
Jenny Brown
fcb2ff485b Resolve linting errors in the javadoc contents (#7612)
* Javadoc cleanup
2021-11-11 17:13:20 -06:00
Sherif A. Nada
5f03d32797 fix buffered stream consumer tests (#7834) 2021-11-11 08:25:52 -08:00
Sherif A. Nada
62992bff8b Fix BufferedStreamConsumer tests (#7773) 2021-11-09 12:49:59 -08:00
Jared Rhizor
109461b722 Bump Airbyte version from 0.30.35-alpha to 0.30.36-alpha (#7772)
Co-authored-by: sherifnada <sherifnada@users.noreply.github.com>
2021-11-08 20:29:53 -08:00
Sherif A. Nada
efb5151011 🐛 Make all JDBC destinations (SF, RS, PG, MySQL, MSSQL, Oracle) handle wide rows by using byte-based record buffering (#7719) 2021-11-08 19:26:32 -08:00
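As a rough illustration of what byte-based record buffering means in the commit above, the sketch below flushes a batch when its accumulated UTF-8 size would exceed a limit, rather than when a record count is reached. The class `ByteBoundedBuffer`, its constructor parameters, and the flush callback are hypothetical names chosen for this example; they are not taken from the Airbyte code base.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: a buffer bounded by bytes instead of record count.
final class ByteBoundedBuffer {
  private final long maxBytes;
  private final Consumer<List<String>> flusher;
  private final List<String> buffer = new ArrayList<>();
  private long bufferedBytes = 0;

  ByteBoundedBuffer(long maxBytes, Consumer<List<String>> flusher) {
    this.maxBytes = maxBytes;
    this.flusher = flusher;
  }

  void add(String record) {
    long size = record.getBytes(StandardCharsets.UTF_8).length;
    // Flush first if adding this record would push the batch over the limit.
    // A single oversized record still goes through as its own batch.
    if (!buffer.isEmpty() && bufferedBytes + size > maxBytes) {
      flush();
    }
    buffer.add(record);
    bufferedBytes += size;
  }

  void flush() {
    if (buffer.isEmpty()) {
      return;
    }
    // Hand a copy to the callback so clearing the buffer does not affect it.
    flusher.accept(new ArrayList<>(buffer));
    buffer.clear();
    bufferedBytes = 0;
  }
}
```

With a count-based limit, a few very wide rows can exhaust memory before the count is reached; bounding by bytes keeps each batch roughly the same size regardless of row width.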
itaseskii
f53fd5e66b 🎉 New destination: Cassandra (#7186)
* add cassandra destination connector

* refactor and docs.

* delete test dockerfile

* revert Dockerfile rm change

* refactor & fix acceptance tests & format

* revert stream peek

* remove get pip

* add address example

* improved copy and code refactor

* add docker-compose and improved docs

Co-authored-by: itaseski <ivica.taseski@seavus.com>
2021-11-05 19:02:01 -03:00
Charles
58902f3df8 add cli commons to factor out common parsing code (#7301) 2021-10-29 18:44:22 -07:00
Eugene
46a249e5c9 🎉Source Clickhouse: added option to connect via SSH tunnel (aka Bastion server) (#7327)
Source-Clickhouse: added support for connection via ssh tunnel
2021-10-26 21:39:18 +03:00
Harsha Teja Kanna
3e7f95c25a 🎉 Support build on MacOS M1 (Apple Silicon) (#7104)
- See this doc for details: https://github.com/airbytehq/airbyte/blob/master/docs/contributing-to-airbyte/developing-locally.md
- Unit test does not work yet.
2021-10-19 11:20:21 -07:00
Charles
ba44f700b9 add final for params, local variables, and fields (#7084) 2021-10-15 16:41:04 -07:00
Jenny Brown
2e5fbba434 Clarify ssh private key format for ssh tunnels (#6585)
* Clarify ssh private key format for ssh tunnels
* Improved SSH Tunnel key generation steps, fixed formatting
* Modified wording 'app' to 'connector' for consistency
* Ran format
2021-10-04 13:59:47 -05:00
Charles
5e750164ac Publish SSL-only version of Postgres Destination (#6496)
* try to publish new normalization version

* default to using ssl in postgres destination

* tidy up

* Run normalization tests using postgres DB with SSL support

* bump version

Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
2021-09-30 12:55:26 +02:00
Charles
f30869001a Exposing SSL-only version of Postgres Source (#6362) 2021-09-27 16:46:39 -07:00
Michel Tricot
1773e41e47 Shorten our headers + adds contributors file (#6478) 2021-09-27 10:45:50 -07:00
VitaliiMaltsev
ec3951ba62 🎉 Integration Testing for SSH using a docker container | Postgres Source and Destination update integration tests using ssh bastion in docker container (#6312)
* ssh-test

* add authentication via ssh tunnel with bastion docker host and postgres testcontainer

* created SshBastion class in base-java module

* implement Postgres source basic ssh tunneling connection for integration tests

* implement Postgres source ssh tunneling connection and refactoring SshBastion

* generate keys inside a bastion container

* remove throwing Exception from startTestContainers method

* fix checkstyle

* add documentation and changelog for Postgres source and destination

* update documentation for ssh readme.md | update version of Postgres source and destination to 0.3.12

* update version of Postgres source and destination to 0.3.12

* removed static variables, removed version bump, rename class to SshBastionContainer, removed ci credentials for ssh Postgres Source and Destination

Co-authored-by: vmaltsev <vitalii.maltsev@globallogic.com>
2021-09-24 20:00:06 +03:00
George Claireaux
0b82f7f695 add details for testing ssh normalization (#6376) 2021-09-22 15:33:58 +01:00
Charles
8ad43afb07 SSH for Postgres Destination (#5743)
Co-authored-by: George Claireaux <phlair@users.noreply.github.com>
2021-09-07 17:06:25 -07:00
Charles
7bf531a967 SSH for Postgres Source (#5742) 2021-09-02 11:32:04 -07:00
Andrii Leonets
b18bd439d0 🐛 Destination Postgres: fix \u0000(NULL) value processing (#5336)
* fix \u0000(NULL) value processing for Postgres + move postgres impl of SqlOperations to PostgresSqlOperations.

* changelog + format

* incr release version

* Add generic solution to adopt messages for a destination + remove unnecessary serialization

* revert version for build

* minor review fixes

* format

* add comments

* format

* incr version
2021-08-30 21:41:02 +03:00
Christophe Duong
76c1f3465c Add sanitized column name in some destinations' raw table outputs (#5026)
* Add sanitized column name in some destinations' raw table outputs
2021-07-28 14:38:08 +02:00
Christophe Duong
5cdc7f8517 🐛 (contribution) Fix SQL model to build a Type 2 SCD to handle NULL cursor_field values correctly (#4881)
* Update SQL model to build a Type 2 Slowly Changing Dimension (#4802)

* Make SQL more portable

* Bumpversion of normalization

Co-authored-by: Daniel Diamond <33811744+danieldiamond@users.noreply.github.com>
2021-07-22 16:27:54 +02:00
Mario Molina
fc3c692fb4 🎉 New Destination: Kafka (#3746) 2021-07-21 19:01:15 -07:00
Charles
9a13c792cf Checkpointing: Partial Success in BufferedStreamConsumer (Destination) (#3555) 2021-07-21 15:26:40 -07:00
Sherif A. Nada
356ca18b67 🐛 Fix Oracle spec to declare sid instead of database param, Redshift to allow additionalProperties, MSSQL test and spec to declare spec type correctly (#4874) 2021-07-20 17:04:36 -07:00
Eugene
7f4315fd7f 🎉 All java connectors: Added configValidator to check, discover, read and write calls (#4699)
* Added configValidator to java connectors
2021-07-16 18:59:40 +03:00
Charles
1187e9e687 remove unused deps (#4512)
Co-authored-by: Davin Chia <davinchia@gmail.com>
2021-07-12 13:55:47 -07:00
Sherif A. Nada
aca70d0f0d 🐛 platform: Fix silent failures in sources (#4617) 2021-07-07 22:39:41 -07:00
Christophe Duong
75a1dda07e 🎉 New BigQuery destination with Structured/Repeated Records (#4176) 2021-06-23 16:19:36 +02:00
Jared Rhizor
b4793b2510 add AIRBYTE_ENTRYPOINT for kubernetes support (#3973)
* add AIRBYTE_ENTRYPOINT for kubernetes support

* bump versions

* bump version in seed

* Update generic template

* keep scaffold sources at 0.1.0

* add missing newline

* handle python base versions correctly

* re-bump mysql and postgres sources

* re-bump snowflake destination

* add skip tests option

* switch to running tests

* reverse conditional to make it safer

* fix publish to include the test running

* fix iterable version

* fix file generation

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
2021-06-09 13:01:45 -07:00
Andrii Leonets
213fae17a1 MySQL source: Add comprehensive data type test (#3810) 2021-06-07 14:01:02 +03:00
masonwheeler
8dadd1cebd Oracle destination implementation (#3498)
Working implementation of Oracle destination

Co-authored-by: cgardens <giardina.charles@gmail.com>
2021-06-03 16:27:09 -06:00
LiRen Tu
c13b9883e8 🎉 New destination: S3 (#3672)
* Update README icon links

* Update airbyte-specification doc

* Extend base connector

* Remove redundant region

* Separate warning from info

* Implement s3 destination

* Run format

* Clarify logging message

* Rename variables and functions

* Update documentation

* Rename and annotate interface

* Inject formatter factory

* Remove part size

* Fix spec field names and add unit tests

* Add unit tests for csv output formatter

* Format code

* Complete acceptance test and fix bugs

* Fix uuid

* Remove generator template files

They belong to another PR.

* Add unhappy test case

* Checkin airbyte state message

* Adjust stream transfer manager parameters

* Use underscore in filename

* Create csv sheet generator to handle data processing

* Format code

* Add partition id to filename

* Rename date format variable
2021-06-03 09:40:51 -07:00
Charles
aa6afb7282 Checkpointing: Worker use destination (instead of source) for state (#3290)
* Migrate BufferedStreamConsumer users (e.g. all JDBC destinations, MeiliSearch) (#3473)

* Add checkpointing test cases in Acceptance Tests (#3473)

* Add testing for emitting state in Destination Standard Test (#3546)

* Migrate BQ to support checkpointing (#3546)

* Migrate copy destinations support checkpointing (#3547)

* Checkpointing: Migrate CSV and JSON destinations (#3551)
2021-05-25 16:47:40 -07:00
Davin Chia
99c7ac27ca Bugfix: BufferedStreamConsumer. (#3387)
* Format.

* Bump versions.
2021-05-13 16:38:23 +08:00
Jared Rhizor
6fd8e00ad8 don't split lines on LSEP unicode characters when reading lines in destinations (#3327)
* use strict JSONL definition of new lines in destinations

* failing test case

* use next instead of nextLine

* add \n in string for test

* bump destination versions

* bump to even newer version

* bump versions in dockerfiles as well

* force mysql test to pass
2021-05-10 12:57:12 -07:00
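The "use next instead of nextLine" step in the commit above can be illustrated with `java.util.Scanner`, assuming that is the class reading the lines (the commit message does not say so explicitly). `Scanner.nextLine()` treats U+2028 (LSEP) as a line terminator, so a JSON record containing that character is split in two. Setting the delimiter to `"\n"` and calling `next()` splits only on the newline that JSONL defines. The class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical sketch comparing two ways of reading JSONL records from a string.
final class JsonlLineReader {

  // Splits on every line terminator Scanner recognizes, including U+2028.
  static List<String> readWithNextLine(String input) {
    List<String> lines = new ArrayList<>();
    try (Scanner scanner = new Scanner(input)) {
      while (scanner.hasNextLine()) {
        lines.add(scanner.nextLine());
      }
    }
    return lines;
  }

  // Splits only on "\n", so LSEP inside a record is preserved.
  static List<String> readWithNewlineDelimiter(String input) {
    List<String> lines = new ArrayList<>();
    try (Scanner scanner = new Scanner(input).useDelimiter("\n")) {
      while (scanner.hasNext()) {
        lines.add(scanner.next());
      }
    }
    return lines;
  }
}
```

For an input of two records where the first contains U+2028 inside a string value, the first method yields three fragments while the second yields the two original records.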
Charles
06599d475d Write output records from destination to STDOUT (#3274) 2021-05-07 13:32:39 -07:00
Charles
e4d0707781 Destination Checkpointing: Add StateMessage handling to BufferedStreamConsumer (#3230) 2021-05-07 13:05:52 -07:00
Davin Chia
ae9e48a321 Add missing EU region and bump Redshift version. (#3262)
* Add missing EU region and bump Redshift version.

* Fix error where we were checking the entire AirbyteRecordMessage instead of just the data field.
2021-05-06 18:26:09 +08:00
Charles
c512a7ed83 Pre-Work for adding Checkpointing (take 2) (#3188) 2021-05-03 12:51:39 -07:00
Davin Chia
75dd66438a Fix master build. (#3174)
* Fix master build.

* Revert "Pre-Work for adding Checkpointing (#3163)"

This reverts commit 4cf69ac699.

* Add Mason to auto-assign list.
2021-05-03 11:07:34 +08:00
Charles
4cf69ac699 Pre-Work for adding Checkpointing (#3163) 2021-05-01 17:21:11 -07:00
Christophe Duong
77ffd74b32 Ignore records that are too big in Redshift destinations (instead of failing) (#2988)
* Abort sync if one of the parts fails to copy to temp table

* Check for record size when copying data from s3 to redshift

* Handle big record in RedshiftInsertDestination too
2021-04-30 21:04:03 +02:00
Jared Rhizor
a3b4444372 snowflake s3 copy & redshift s3 refactor (#2921)
* snowflake s3 copy

* refactor (some tests still need updating)

* revert accidentally removing files

* re-add purge

* use baseconnector

* getconnection logs error

* use generic configs for copiers/suppliers/consumers

* use stream copier terminology

* remove weird delegate generics

* some test changes

* remove non-ci test that doesn't have a good equivalent atm

* misc

* finally fixed

* tests and fix

* add credentials

* fix redshift build

* respond to comments

* fix check

* bump versions for redshift and snowflake

* fix creds
2021-04-26 09:41:53 -07:00
Charles
159a27f989 fix file clean up for BigQueue (#2570) 2021-04-23 09:39:34 -07:00
Davin Chia
b9014acfca 🎉 Namespace support. Supported source-destination pairs will now sync data into the same namespace as the source. (#2862)
This PR introduces the following behavior for JDBC sources:
Instead of streamName = schema.tableName, this is now streamName = tableName and namespace = schema. This means that, when replicating from these sources, data will be replicated into a form matching the source, e.g. public.users (postgres source) -> public.users (postgres destination) instead of the current behaviour of public.public_users. Since MySQL does not have schemas, the MySQL source uses the database as its namespace.

To do so:
- Make namespace a first-class concept in Airbyte Protocol. This allows the source to propagate namespace and destinations to write to a source-defined namespace. Also sets us up for future namespace related configurability.
- Add an optional namespace field to the AirbyteRecordMessage. This field will be set by sources that support namespace.
- Introduce AirbyteStreamNameNamespacePair as a type-safe manner of identifying streams throughout our code base.
- Modify base_normalisation to better support source defined namespace, specifically allowing normalisation of tables with the same name to different schemas.
2021-04-17 15:33:22 +08:00
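The idea of identifying a stream by a (name, namespace) pair, as described in the commit above, might look roughly like the following. `StreamKey` and `fromLegacyName` are hypothetical names for this sketch; this is not the actual `AirbyteStreamNameNamespacePair` class, whose API is not shown in the commit message.

```java
import java.util.Objects;

// Hypothetical stand-in for a type-safe (name, namespace) stream identifier.
final class StreamKey {
  private final String name;
  private final String namespace; // null when the source has no namespace

  StreamKey(String name, String namespace) {
    this.name = Objects.requireNonNull(name);
    this.namespace = namespace;
  }

  // Splits an old-style "schema.table" stream name at the first dot.
  static StreamKey fromLegacyName(String legacyName) {
    int dot = legacyName.indexOf('.');
    if (dot < 0) {
      return new StreamKey(legacyName, null);
    }
    return new StreamKey(legacyName.substring(dot + 1), legacyName.substring(0, dot));
  }

  String name() {
    return name;
  }

  String namespace() {
    return namespace;
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof StreamKey)) {
      return false;
    }
    StreamKey that = (StreamKey) other;
    return name.equals(that.name) && Objects.equals(namespace, that.namespace);
  }

  @Override
  public int hashCode() {
    return Objects.hash(name, namespace);
  }
}
```

Because equality covers both fields, two tables named `users` in different schemas stay distinct when used as map keys, which a plain stream-name string cannot guarantee.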
Davin Chia
9f16651840 Add Namespace Field. (#2704)
Add namespace field to the Airbyte Stream in preparation to propagate a source defined namespace to the Destination.

This namespace field is then consumed as the destination schema the table is written to.

This only applies to JDBC destinations.

This is Steps 1 - 4 of the namespace tech spec, seen at https://docs.google.com/document/d/1qFk4YqnwxE4MCGeJ9M2scGOYej6JnDy9A0zbICP_zjI/edit.

Some minor refactoring and commenting as I go.
* Remove unnecessary test classes as they match Integration tests in terms of what is being tested. They have no real value since the corresponding integration test can be run locally without additional credentials. The main value the classes bring is letting us run tests without building the docker image (the integration tests require doing so). However, I feel this benefit is not worth the additional maintenance cost.
* Centralise DataArgumentProvider into its own class for easier maintenance and usability.
2021-04-06 11:50:47 +08:00
Christophe Duong
6c6ea54bb8 Add SupportedDestinationSyncModes to destination specs objects (#2668)
* Add SupportedDestinationSyncModes to destination specs objects

* Bumpversions of destination connectors
2021-03-31 15:20:01 +02:00
Davin Chia
e6be760a8f AcceptTracked should accept an AirbyteRecordMessage. (#2662)
The acceptTracked method should accept an AirbyteRecordMessage instead of a generic AirbyteMessage. This allows us to centralise checking for a record and makes the interface easier to understand.

We can also consolidate checking if a received message has a corresponding stream. However that's more involved and can be revisited at a later date.
2021-03-31 14:35:52 +08:00
Davin Chia
51ee38845b Refine Destination Interfaces (#2637)
Clean up Destination interfaces, with the goal of less repeated code and hopefully better readability for the next person.

* Rename the write method in the Destination interface to getConsumer to better reflect that the method is not writing itself, but returning a consumer that will write upon accepting a message. This was consuming me when I was reading the code.
* Remove generics from the FailureTrackingConsumer and the DestinationConsumer. Besides tests, there are no generic uses of the FailureTrackingConsumer. Replace this with the AirbyteMessage to make explicit that this is what the FailureTrackingConsumer is used for. Although this does restrict our future use of the FailureTrackingConsumer class, I'd rather limit this now and re-inject generics once we have more use cases. This was also confusing me - I kept on wondering what other data type can be consuming this interface.
* Rename FailureTrackingConsumer to FailureTrackingAirbyteMessageConsumer to better reflect how the consumer is meant to be used strictly as a DestinationConsumer (it implements the interface).
* Rename DestinationConsumer to AirbyteMessageConsumer.

In a subsequent PR, I plan to consolidate logic to error if the received Airbyte message is not of Record type, into the FailureTrackingAirbyteMessageConsumer class.
2021-03-30 13:30:15 +08:00
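The shape described in the commit above, where a destination returns a consumer instead of writing directly and a wrapper tracks whether any message failed, could be sketched as follows. All type names here (`MessageConsumer`, `DestinationSketch`, `FailureTrackingConsumerSketch`, `RecordingConsumer`) are hypothetical, and messages are modelled as plain strings for brevity; the real interfaces operate on Airbyte message types and may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical consumer interface: accepts messages, then is closed.
interface MessageConsumer {
  void accept(String message);

  void close();
}

// The destination does not write itself; it returns a consumer that writes on accept().
interface DestinationSketch {
  MessageConsumer getConsumer();
}

// Remembers whether any accept() call threw, and reports that to close(boolean).
abstract class FailureTrackingConsumerSketch implements MessageConsumer {
  private boolean hasFailed = false;

  protected abstract void acceptTracked(String message);

  protected abstract void close(boolean hasFailed);

  @Override
  public final void accept(String message) {
    try {
      acceptTracked(message);
    } catch (RuntimeException e) {
      hasFailed = true;
      throw e;
    }
  }

  @Override
  public final void close() {
    close(hasFailed);
  }
}

// Example subclass: stores messages in memory and rejects empty ones.
final class RecordingConsumer extends FailureTrackingConsumerSketch {
  final List<String> written = new ArrayList<>();
  Boolean closedWithFailure = null;

  @Override
  protected void acceptTracked(String message) {
    if (message.isEmpty()) {
      throw new IllegalArgumentException("empty message");
    }
    written.add(message);
  }

  @Override
  protected void close(boolean hasFailed) {
    closedWithFailure = hasFailed;
  }
}
```

Subclasses implement only `acceptTracked` and `close(boolean)`, so the decision to commit or discard written data on close can depend on whether the run failed.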