mirror of synced 2026-01-08 21:05:13 -05:00
162 Commits

Author SHA1 Message Date
George Claireaux
2af780db3e base-java: Add utility for AirbyteTraceMessage and naively emit on any connector error (#12614)
* added AirbyteLoggedException class

* adding in int runr

* changes

* refactored to AirbyteTracedException to align with python impl.

* added catch for Exceptions that are already AirbyteTracedException

* refactor to static class & catch with UncaughtExceptionHandler

* testing ExceptionHandler

* add tests

* added docs section on using AirbyteTraceMessageUtility

* made AirbyteMessage maker methods more intuitive

* fix spotbugs errors

* format
2022-05-12 11:08:52 +01:00
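
The approach this commit describes (catching otherwise-unhandled connector errors with an UncaughtExceptionHandler and emitting a trace message) might look roughly like the sketch below. The class name `TraceOnCrashHandler`, the JSON field names, and the output sink are illustrative assumptions, not the actual AirbyteTraceMessageUtility API.

```java
import java.io.PrintStream;

// Hypothetical sketch; not the real AirbyteTraceMessageUtility.
public class TraceOnCrashHandler implements Thread.UncaughtExceptionHandler {

  private final PrintStream out;

  public TraceOnCrashHandler(PrintStream out) {
    this.out = out;
  }

  // Builds a minimal JSON-like trace line. The field names are assumptions,
  // and only backslashes and double quotes are escaped in this sketch.
  public static String toTraceLine(Throwable t) {
    String message = t.getMessage() == null ? "" : t.getMessage();
    String escaped = message.replace("\\", "\\\\").replace("\"", "\\\"");
    return "{\"type\":\"TRACE\",\"error\":{\"class\":\"" + t.getClass().getName()
        + "\",\"message\":\"" + escaped + "\"}}";
  }

  @Override
  public void uncaughtException(Thread thread, Throwable t) {
    // Emit the trace line instead of letting the error vanish with the thread.
    out.println(toTraceLine(t));
  }

  public static void main(String[] args) throws InterruptedException {
    Thread worker = new Thread(() -> {
      throw new IllegalStateException("connector failed");
    });
    worker.setUncaughtExceptionHandler(new TraceOnCrashHandler(System.out));
    worker.start();
    worker.join();
  }
}
```

Installing the handler once at startup means individual connectors need no try/catch of their own, which seems to be the "naively emit on any connector error" behaviour the commit title refers to.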
Jonathan Pearlin
ebb9f3e1ac Prepare Database Access Layer for Dependency Injection (#12546)
* Prepare database access objects for dependency injection

* Replace duplicate code

* Remove unused imports

* Remove redundant validation call

* Remove unused imports

* Use constants

* Disable fast fail during connection pool initialization

* Remove typo

* Add missing test dependency

* Add missing test dependency

* Add missing test dependency

* Fix issue caused by rebase

* Add method for cloud

* Autoclose DSL context during migration

* Better connection close handling

* Fix typo in dependency

* Fix SpotBugs issue

* React to rebase

* Fix typo

* Update JavaDoc

* Fix database close calls

* Pass configs to getServer

* Fix typo

* Fix call to removed method

* Fix typo

* Use catalog to manage versions

* PR feedback

* Centralize shutdown hook

* Fix rebase issues

* Document test cases

* Document test cases

* Formatting

* Properly close database resources

* Rebase cleanup
2022-05-09 15:26:54 -04:00
Edward Gao
3d416129c7 🐛 Prevent sources from hanging if they have orphaned threads (#12544) 2022-05-03 18:48:43 -07:00
Greg Solovyev
53e625a511 Bump mina-sshd from 2.7.0 to 2.8.0 (#12376)
This is an attempt to merge the main change
from https://github.com/airbytehq/airbyte/pull/11514,
which now has multiple conflicts.

The gist of the change

When creating a Postgres destination connector with the SSH tunnel method 'SSH Key Authentication', one is required to provide an RSA key. Using an rsa-sha2-256 or rsa-sha2-512 key results in the error SshException: KeyExchange signature verification failed for key type=ssh-rsa, unless ssh-rsa is enabled in the SSH server's host key algorithms.

mina-sshd in version 2.7.0 uses the wrong server key signature algorithm during DH group key exchange. https://issues.apache.org/jira/browse/SSHD-1163.

Bumping mina-sshd to version 2.8.0 addresses this issue. Changelog https://github.com/apache/mina-sshd/blob/master/docs/changes/2.8.0.md.
2022-04-26 14:37:50 -07:00
Parker Mossman
884a94ed29 Un-Revert OSS branch build for Cloud workflow (#11808)
* Revert "Revert "Build OSS branch for deploying to Cloud env (#11474)""

This reverts commit 55e3c1e051.

* add action to get dev branch tag to OSS project instead of doing it in cloud

* remove dev branch version action, going to do this in cloud after all
2022-04-08 15:17:04 -07:00
LiRen Tu
8bd2d9b518 🎉 BigQuery destination: use serialized buffer for gcs staging (#11776)
* Rebase bigquery changes to master

* Add comments

* Uncomment test code

* Format code

* Bump versions

* Fix denormalized destination target table name

* Fix avro schema for denormalized destination

* Remove unnecessary params from consumer factory

* Add back previous version

* Add warning about standard mode

* auto-bump connector version

* Bump version for bigquery in seed

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-04-07 16:59:19 -07:00
lmossman
55e3c1e051 Revert "Build OSS branch for deploying to Cloud env (#11474)"
This reverts commit 189efe7b42.
2022-04-05 15:44:31 -07:00
Parker Mossman
189efe7b42 Build OSS branch for deploying to Cloud env (#11474)
* add VERSION buildArg to Dockerfiles, default to current airbyte version but overwritable

* use VERSION env var consistently as Dockerfile buildArg, jar version, and tag

pass version and image_tag into docker build task function

* add github action for building and pushing an OSS branch for Cloud to consume

* allow AirbyteVersion to validate versions containing 'oss-branch' prefix

* change oss-branch prefix to dev for branch-based versions

* better action name

* add docker-compose-cloud.build.yaml to define minimum set of cloud images that are pushed by oss branch action

* update local dev docs to describe optional usage of VERSION env var

* make branch_version_tag input optional, if not provided, generates dev-<commit_hash>

* fix typo

* fix missed merge conflict

* update docker docs

* update integrationRunner isDev check
2022-04-05 15:06:17 -07:00
Christophe Duong
848bb349b5 🎉 Change destination-s3 buffering to reduce/stabilize memory/thread consumption (#11294)
* Refactor destination-s3 to use the new serialization strategy and get memory usage under control
2022-03-28 17:40:44 +02:00
LiRen Tu
21ec23cc31 🐞 Fix invalid char in snowflake & bigquery namespace (#10793)
* Add namespace test for snowflake

* Enable namespace test for bigquery

* Format code

* Capitalize test case id

* Update exception message to point to test case file

* Update snowflake name transformer to prepend underscore

* Override convertStreamName instead of getIdentifier

* Add missing state message

* Remove unused import

* Disable more namespace test cases

We don't want to introduce changes that will affect existing connections for now.

* Dry method that mutates namespace

* Pass through null

* Normalize namespace

* Fix test case

* Revert consumer factory changes

* Normalize namespace in catalog

* Revert catalog normalization

* Enable namespace test for all snowflake destination tests

* Test namespace for both bigquery destination tests

* Add unit test for bigquery name transformer

* Transform bigquery schema name

* Fix avro name transformer

* Normalize avro namespace

* Standardize namespace in gcs utils

* Bump version for snowflake and bigquery

* Enable namespace test for bigquery denormalized

* Dry bigquery denormalized acceptance test

* Revert some of the variable scope change

* Fix unit test

* Bump version

* Introduce getNamespace method

* Implement getNamespace method for bigquery

* Switch to getNamespace methods

* Update comments

* Fix bigquery denormalized acceptance test

* Format code

* Dry bigquery destination test

* Skip partition test for gcs mode

* Bump version
2022-03-19 17:47:24 -07:00
Christophe Duong
298551d501 🎉 Change destination-snowflake buffering when staging to reduce/stabilize memory/thread consumption (#10866)
* Refactor Snowflake internal Staging as model to share staging abilities in jdbc destinations

* Switch Snowflake Copy Destination for Staging destination based off Internal Staging

Co-authored-by: LiRen Tu <tuliren.git@outlook.com>

* Bump version of destination-snowflake
2022-03-19 00:13:59 +01:00
LiRen Tu
462cdd6aad Remove sentry flag in integration runner (#11224)
When Sentry is not initialized it will just do nothing. So it is always safe to call the captureMessage method.
2022-03-17 02:32:44 -07:00
Charles
5fde59fdbd add spotbugs (#10522) 2022-03-11 12:05:17 -08:00
Christophe Duong
744e0d5f13 Refactor Snowflake internal Staging as a base class for other staging classes (#10865)
* Refactor Snowflake internal Staging as model to share staging abilities in jdbc destinations
2022-03-11 15:29:12 +01:00
LiRen Tu
81417e6728 Add connector metadata as sentry tags (#10475)
* Pass worker metadata to connector

* Fix compilation

* Pass in job id and image from worker

* Remove application version

* Add default job environment variables

* Add back removed comment

* Rename env map to job metadata

* Fix env configs

* Read connector from application

* Use empty string

* Remove println

* Fix unit test

* Fix compilation error

* Introduce constants for worker env

* Add worker env to ENV_VARS_TO_TRANSFER

* Pass into getWorkerMetadata map to all constructions

* Format code

* Format octavia cli

* Fix test compilation

* Fix typos
2022-03-09 07:36:03 -08:00
girarda
adea13cea7 Fix redshift and oracle acceptance tests (#10855)
* parse jdbc parameters

* Also fix redshift

* other oracle source acceptance test

* This is & now

* This is & now

* This is & now

* This is & now

* This is & now

* also update nne

* increase sleep to 11 seconds

* Bump to 15 seconds

* gradlew format

* try to reformat

* gradlew format

* Run ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates --scan

* reset to master

* Revert "reset to master"

This reverts commit d6141ed933.
2022-03-04 16:55:44 -08:00
Christophe Duong
0d38f276bf Surface any active child thread of dying connectors (#10660)
* Interrupt child thread of dying connectors to avoid getting stuck

* Catch and print stacktrace

* Add test on interrupt/kill time outs

* Send message to sentry too
2022-03-03 12:17:59 +01:00
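
The idea in this commit (interrupting child threads that would otherwise keep a dying connector from exiting) can be sketched as follows. This is a minimal illustration under assumed names (`OrphanThreadInterrupter`, `interruptOthers`), not Airbyte's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative, not Airbyte's actual code.
public class OrphanThreadInterrupter {

  // Interrupts every live, non-daemon thread other than the caller and
  // returns the names of the threads it interrupted.
  public static List<String> interruptOthers(Iterable<Thread> threads) {
    List<String> interrupted = new ArrayList<>();
    Thread current = Thread.currentThread();
    for (Thread t : threads) {
      if (t != current && t.isAlive() && !t.isDaemon()) {
        t.interrupt();
        interrupted.add(t.getName());
      }
    }
    return interrupted;
  }

  public static void main(String[] args) throws InterruptedException {
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(60_000);
      } catch (InterruptedException e) {
        // Exit promptly once interrupted.
      }
    }, "orphan-worker");
    worker.start();
    // A real shutdown path could enumerate threads with
    // Thread.getAllStackTraces().keySet(); an explicit list keeps the demo small.
    List<String> names = interruptOthers(List.of(worker, Thread.currentThread()));
    worker.join(5_000);
    System.out.println("interrupted: " + names + ", worker alive: " + worker.isAlive());
  }
}
```

Only non-daemon threads are targeted because daemon threads do not prevent the JVM from exiting.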
Subodh Kant Chaturvedi
c14260aa8d close ssh tunnel in case of exception in destination consumer (#10686)
* close ssh tunnel in case of exception

* format
2022-03-02 03:56:32 +05:30
Subodh Kant Chaturvedi
f71754d836 close ssh in case of exception during check in Postgres connector (#10620)
* close ssh in case of exception

* remove unwanted change

* remove comment

* format

* do not close scanner

* fix semi-colon

* format
2022-02-28 17:13:33 +05:30
Subodh Kant Chaturvedi
9e22a558b9 add logs in FailureTrackingAirbyteMessageConsumer for debug (#10455) 2022-02-18 23:06:58 +05:30
LiRen Tu
049a11b2bc 🎉 Snowflake destination: reduce memory footprint (#10394)
* Add detailed logging for flushing

* Log sentry transaction event id

* Adjust logging

* Log memory usage

* Add jvm monitoring

* Remove log

* Remove port 9010

* Remove host network mode

* Sample record size

* Remove profiling code

* Add unit tests

* Use average estimation

* Rename variable

* Format code

* Bump version

* Revert unnecessary change

* Update doc

* Fix format

* Bump version in seed
2022-02-17 12:55:35 -08:00
Jared Rhizor
f94f42775c fix formatting on master (#10360) 2022-02-15 14:41:41 -08:00
andriikorotkov
b3916c987a 🐛 Snowflake Destination: use better file size with S3 staging files (#9920)
* split s3 staging files to files by 100 Mb and removed legacyS3StreamCopier

* split s3 staging files to files by 100 Mb and removed legacyS3StreamCopier

* updated code style

* fix remarks

* fix remarks

* fix code style

* fix remarks

* fix remarks

* fix remarks

* updated documentations and images versions

* updated documentation
2022-02-15 22:20:22 +02:00
LiRen Tu
6301cfa91f 🎉 Destination snowflake: reduce memory consumption (#10297)
* Avoid redundant adapter construction

* Remove unused logger

* Avoid redundant creation of buffer map

* Decrease max batch byte size to 128 mb

* Format code

* Move data adapter to an instance variable

* Bump version

* Bump version in seed
2022-02-14 23:37:54 -08:00
VitaliiMaltsev
e30d8348b2 Change JsonSchemaPrimitive to a class (#9913)
* fix for jdk 17

* add JsonSchemaType class

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* fix Oracle tests

* fix Redshift tests

* fix Redshift tests

* fix checkstyle

* fix MSSQL tests

* fix cockroachdb tests

* fix checkstyle

* fix checkstyle

* replace star imports

* replace star imports

* replace star imports

* update JsonSchemaType | fixed checkstyle

* Remove unused variables in test

* Fix imports

* Expand imports

* Fix more imports

Co-authored-by: vmaltsev <vitalii.maltsev@globallogic.com>
Co-authored-by: Liren Tu <tuliren.git@outlook.com>
2022-02-14 02:12:37 -08:00
LiRen Tu
39049cbf24 🎉 End-to-end test source: support stream duplication (#10298)
* Add support to duplicate one stream multiple times

* Update cloud version

* Bump version

* Remove unused logger

* Add trace for each stream in mock source

* Fix cloud version

* Bump version in seed
2022-02-12 00:50:48 -08:00
LiRen Tu
5133ce6f4c 🐛 Destination snowflake & bigquery: fix null pointer exception (#9959)
* Prevent null exception

* Check nullable schema name

* Bump version

* Bump version in seed
2022-02-01 04:27:07 -08:00
LiRen Tu
8e8f402b8a 🎉 Destination snowflake & bigquery: integrate with sentry (#9945)
* Update doc

* Use empty dsn when sentry is not enabled

* Bump version in seed
2022-01-31 20:27:52 -08:00
LiRen Tu
a4b8edffbf Add detailed sentry tracing for JDBC destination stream consumer (#9898)
* Refactor airbyte sentry

* Add more sentry monitoring for jdbc destination

* Profile operations from failure tracking consumer

* Remove redundant method call

* Update operation names

* Trace snowflake starting process

* Trace snowflake copying step

* Move tracing to sql operation
2022-01-31 11:11:53 -08:00
Alexander Tsukanov
479f0d7c8d [MVP] Integrate sentry to all java-based connectors (#9745)
* airbyte-9328: Added Sentry integration to BigQuery and BigQuery denormalized connector.

* airbyte-5050: Added strategy for INSERT ROW.

* airbyte-9328: Added Sentry integration to Snowflake.

* airbyte-9328: Fix Sentry config.

* airbyte-9328: Fixed PR comments.

* airbyte-9328: Fixed PR comments.

* airbyte-9328: Fix PR comments.

* airbyte-9328: Fixed PR comments.

* airbyte-9328: Fixed PR comments.

* airbyte-9328: Fixed PR comments.

* airbyte-9328: Small changes.

* airbyte-9328: Small changes.

* airbyte-9328: Move SENTRY DSN keys to Dockerfiles.

* Use new dsn

* Revert format

* Remove sentry dsn from compose temporarily

* Log sentry event id

* Move sentry to java base

* Remove sentry code from bigquery

* Update dockerfiles

* Fix build

* Update release tag format

* Bump version

* Add env to dockerfiles

* Fix e2e test connector dockerfile

* Fix snowflake bigquery dockerfile

* Mark new versions as unpublished

Co-authored-by: LiRen Tu <tuliren@gmail.com>
Co-authored-by: Liren Tu <tuliren.git@outlook.com>
2022-01-29 16:58:35 -08:00
Eugene
4534703589 🎉Source Postgres: Set up connection - add schema selection (#9360)
* [1435] Source Postgres: Set up connection - added schema selection
2022-01-13 16:24:38 +02:00
Serhii Chvaliuk
c0a46c1987 BufferedStreamConsumerTest: remove non-determinism in size of generated test records (#9274)
* generate records fixed 40 bytes of size

* fix buffer flush

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
2022-01-04 17:04:36 +02:00
Eugene
8c84ad2976 🎉Source-postgres\mssql\mysql added a HEAP dump capturing on outOfMemory Error (if any) (#8811)
* Updated entrypoint script for source-mysql\mssql\postgres connectors to capture a heap dump when the connector fails with an OutOfMemoryError
2021-12-28 17:34:35 +02:00
Alexander Tsukanov
eea41b4fc8 🎉 Destination Snowflake and RedShift: Implement the Byte-buffered logic (#8869)
* airbyte-8336: Byte based approach.

* test-commit

* airbyte-8336: Split file by chunks.

* airbyte-8336: Renamed variable.

* airbyte-8336: make snowflake DEFAULT_MAX_BATCH_SIZE_BYTES_SNOWFLAKE constant.

* airbyte-8336: make snowflake DEFAULT_MAX_BATCH_SIZE_BYTES_SNOWFLAKE constant.

* airbyte-8336: make snowflake DEFAULT_MAX_BATCH_SIZE_BYTES_SNOWFLAKE constant.

* airbyte-8336: fix of unit tests

* airbyte-8336: Changed to default buffer size in SnowFlake.

* airbyte-8336: Changed 15 GB to 1 GB for max size.

* airbyte-8336: Changed to default buffer size in SnowFlake.

* airbyte-8336: Bumped connector version.

* airbyte-8336: Bumped connector version.

* airbyte-8336: Bumped connector version.
2021-12-24 12:32:10 +02:00
Charles
3a6902963d remove validation from source delete (#8724) 2021-12-16 09:35:15 -08:00
Jared Rhizor
25674fc306 upgrade to Gradle 7.3.1 / Java 17 (#7964)
* upgrade gradle

* upgrade to Java 17 (and fix a few of the node versioning misses)

* oops

* try to run a different format version

* fix spotless by upgrading / reformatting some files

* fix ci settings

* upgrade mockito to avoid other errors

* undo bad format

* fix "incorrect" sql comments

* fmt

* add debug flag

* remove

* bump

* bump jooq to a version that has a java 17 dist

* fix

* remove logs

* oops

* revert jooq upgrade

* fix

* set up java for connector test

* fix yaml

* generate std source tests

* fail zombie job attempts and add failure reason (#8709)

* fail zombie job attempts and add failure reason

* remove failure reason

* bump gcp dependencies to pick up grpc update (#8713)

* Bump Airbyte version from 0.33.9-alpha to 0.33.10-alpha (#8714)

Co-authored-by: jrhizor <jrhizor@users.noreply.github.com>

* Change CDK "Caching" header to "nested streams & caching"

* Update fields in source-connectors specifications: file, freshdesk, github, google-directory, google-workspace-admin-reports, iterable (#8524)

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

Co-authored-by: Serhii Chvaliuk <grubberr@gmail.com>
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* move S3Config into destination-s3; update dependencies accordingly (#8562)

Co-authored-by: Lake Mossman <lake@airbyte.io>
Co-authored-by: jrhizor <jrhizor@users.noreply.github.com>
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: Iryna Grankova <87977540+igrankova@users.noreply.github.com>
Co-authored-by: Serhii Chvaliuk <grubberr@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2021-12-10 16:57:54 -08:00
oneshcheret
fcf7491fee 🐛 Validate incorrect handling '\n' symbols in ssh key (#8371)
* 🐛 Validate \n in ssh key

* bump versions for ssh key connectors

* update versions for ssh key connectors

* temporal fix for checking failed tests

* revert temp changes and destination oracle version

* bump versions in config for ssh key related connectors
2021-12-03 20:11:51 +02:00
Charles
3f00a3e4c5 improve javadocs in replication worker (#7942) 2021-11-18 17:01:40 -08:00
Jenny Brown
fcb2ff485b Resolve linting errors in the javadoc contents (#7612)
* Javadoc cleanup
2021-11-11 17:13:20 -06:00
Sherif A. Nada
5f03d32797 fix buffered stream consumer tests (#7834) 2021-11-11 08:25:52 -08:00
Sherif A. Nada
62992bff8b Fix BufferedStreamConsumer tests (#7773) 2021-11-09 12:49:59 -08:00
Jared Rhizor
109461b722 Bump Airbyte version from 0.30.35-alpha to 0.30.36-alpha (#7772)
Co-authored-by: sherifnada <sherifnada@users.noreply.github.com>
2021-11-08 20:29:53 -08:00
Sherif A. Nada
efb5151011 🐛 Make all JDBC destinations (SF, RS, PG, MySQL, MSSQL, Oracle) handle wide rows by using byte-based record buffering (#7719) 2021-11-08 19:26:32 -08:00
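
Byte-based record buffering, as named in this commit, means flushing on accumulated size rather than on record count, so a handful of very wide rows cannot exhaust memory. A minimal sketch, assuming string records and hypothetical names (`ByteBoundedBuffer`, `maxBytes`, `flusher`); it is not Airbyte's BufferedStreamConsumer:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of byte-based buffering; not Airbyte's BufferedStreamConsumer.
public class ByteBoundedBuffer {

  private final long maxBytes;
  private final Consumer<List<String>> flusher;
  private final List<String> records = new ArrayList<>();
  private long bufferedBytes = 0;

  public ByteBoundedBuffer(long maxBytes, Consumer<List<String>> flusher) {
    this.maxBytes = maxBytes;
    this.flusher = flusher;
  }

  // Flushes first if adding the record would push the buffer past the byte
  // limit. A single record larger than the limit is still accepted alone.
  public void accept(String record) {
    long size = record.getBytes(StandardCharsets.UTF_8).length;
    if (!records.isEmpty() && bufferedBytes + size > maxBytes) {
      flush();
    }
    records.add(record);
    bufferedBytes += size;
  }

  // Hands a copy of the buffered records to the flusher and resets the buffer.
  public void flush() {
    if (records.isEmpty()) {
      return;
    }
    flusher.accept(new ArrayList<>(records));
    records.clear();
    bufferedBytes = 0;
  }

  public static void main(String[] args) {
    ByteBoundedBuffer buffer = new ByteBoundedBuffer(10, batch -> System.out.println(batch));
    for (String r : List.of("aaaa", "bbbb", "cccc", "dddddddddddd")) {
      buffer.accept(r);
    }
    buffer.flush();
  }
}
```

With a count-based limit the same four records would always land in one batch regardless of width; the byte limit splits them according to how much memory they actually occupy.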
itaseskii
f53fd5e66b 🎉 New destination: Cassandra (#7186)
* add cassandra destination connector

* refactor and docs.

* delete test dockerfile

* revert Dockerfile rm change

* refactor & fix acceptance tests & format

* revert stream peek

* remove get pip

* add address example

* improved copy and code refactor

* add docker-compose and improved docs

Co-authored-by: itaseski <ivica.taseski@seavus.com>
2021-11-05 19:02:01 -03:00
Charles
58902f3df8 add cli commons to factor out common parsing code (#7301) 2021-10-29 18:44:22 -07:00
Eugene
46a249e5c9 🎉Source Clickhouse: added option to connect via SSH tunnel (aka Bastion server) (#7327)
Source-Clickhouse: added support for connection via ssh tunnel
2021-10-26 21:39:18 +03:00
Harsha Teja Kanna
3e7f95c25a 🎉 Support build on MacOS M1 (Apple Silicon) (#7104)
- See this doc for details: https://github.com/airbytehq/airbyte/blob/master/docs/contributing-to-airbyte/developing-locally.md
- Unit test does not work yet.
2021-10-19 11:20:21 -07:00
Charles
ba44f700b9 add final for params, local variables, and fields (#7084) 2021-10-15 16:41:04 -07:00
Jenny Brown
2e5fbba434 Clarify ssh private key format for ssh tunnels (#6585)
* Clarify ssh private key format for ssh tunnels
* Improved SSH Tunnel key generation steps, fixed formatting
* Modified wording 'app' to 'connector' for consistency
* Ran format
2021-10-04 13:59:47 -05:00
Charles
5e750164ac Publish SSL-only version of Postgres Destination (#6496)
* try to publish new normalization version

* default to using ssl in postgres destination

* tidy up

* Run normalization tests using postgres DB with SSL support

* bump version

Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
2021-09-30 12:55:26 +02:00