mirror of synced 2026-01-02 21:02:43 -05:00
Commit Graph

125 Commits

Author SHA1 Message Date
Yevhen Sukhomud
de057533fb Reverted changes in SshBastionContainer (#13934) 2022-06-20 19:14:30 +07:00
Yevhen Sukhomud
174f15d0c0 13546 Fix integration tests source-postgres Mac OS (#13872)
* 13546 Fix integration tests source-postgres Mac OS
2022-06-20 15:09:01 +07:00
George Claireaux
da95f50555 updated stacktrace format in java trace messages (#13847)
* updated stacktrace format in java trace messages

* test checks specifically on stacktrace in trace message

* remove unused import
2022-06-16 16:58:53 +01:00
LiRen Tu
973f0b1165 Make connector adaptable based on deployment mode (#13522)
* Add deployment mode to env shared with jobs

* Add adaptive runners

* Migrate postgres source to use adaptive runner

* Add an array of specs in docker image spec definition

* Add copyright

* Parse docker image spec with specs list

* Update spec yaml files

* Pass in DEPLOYMENT_MODE to docker compose file

* Revert "Parse docker image spec with specs list"

This reverts commit 8fe41dd3b7.

* Revert changes in docker image spec

* Read cloud specific spec files based on deployment mode

* Revert "Update spec yaml files"

This reverts commit 059f326432.

* Publish cloud spec file if necessary

* Fix upload script

* Move test files

* Update docker compose file

* Format code

* Add comment about spec filename

* Add unit tests

* Remove redundant jdbc acceptance test

When running `PostgresStrictEncryptJdbcSourceAcceptanceTest`, the `discover` method tests always fail because there are unexpected columns in the catalog:
- `wakeup_at`
- `last_visited_at`
- `last_comment_at`

These columns only exist in `PostgresJdbcSourceAcceptanceTest`. And this failure cannot be reproduced locally.

The hypothesis is that when the JDBC unit tests run on CI, they run in parallel and the same testcontainer is used for both tests. That would explain why the strict encrypt test can discover columns from the ordinary unit test.

Given that the JDBC strict encrypt test is basically redundant, it is removed.
2022-06-15 08:23:54 -07:00
Charles
dd3178ed77 Update destinations to handle new state messages (#13670) 2022-06-14 12:31:58 -07:00
Charles
0886ee06d4 Refactor state management out of BufferStrategy (#13669)
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2022-06-10 17:30:08 -07:00
Jonathan Pearlin
2b31011bce Separate platform and connector testcontainer versions (#13642)
* Separate platform and connector testcontainer versions

* Fix dependency

* Fix dependency

* Fix dependency usage

* Prevent leaking testcontainer dependencies
2022-06-10 09:34:31 -04:00
terencecho
0e06496d61 Fix build: run gradlew format (#13556) 2022-06-07 10:57:38 -04:00
Yevhen Sukhomud
3ad489eefc 13547 Fixed integration tests source-sftp Mac OS (#13551) 2022-06-07 17:44:15 +07:00
Davin Chia
eb99f47746 Fat Jar: Rename Dir Part 2 (#13478)
## What
Part 2 of https://github.com/airbytehq/airbyte/pull/13122.

Follow up to #13476 .

Explanation for what is happening:

Identically named subprojects have the following issues:

* publishing as is leads to classpath confusion when jars with the same names are placed in the Java distribution. This leads to NoClassDefFound errors at runtime.
* deconflicting the jar names without changing directory names leads to dependency errors, because the OSS jar pom files are generated from project dependencies (that is, a dependency on a sibling subproject in the same repo) that use the subproject's group and name as a reference. This means the generated jars look for jars that do not exist (as their names have been changed) and cannot compile.
* the workaround for changing a subproject's name involves resetting the subproject's name in settings.gradle and depending on the new name in each build.gradle. This increases the configuration burden and hurts readability, since one has to check settings.gradle to know what the right subproject name is. See https://github.com/gradle/gradle/issues/847 for more info.
* given that Gradle itself doesn't support identically named subprojects (see the linked issue), the simplest solution is to not allow duplicated directory names. I've only renamed conflicting directories here to keep things simple. I will create a follow-up issue to enforce non-identical subproject names in our builds.

* Rename airbyte-config:models to airbyte-config:config-models.
* Rename airbyte-config:persistence to airbyte-config:config-persistence.
2022-06-06 02:21:54 +08:00
Davin Chia
83a89aa843 Fat Jar: Rename Dir Part 1 (#13476)
Part 1 of #13122.

Rename airbyte-db:lib to airbyte-db:db-lib.
Rename airbyte-metrics:lib to airbyte-metrics:metrics-lib
Rename airbyte-protocol:models to airbyte-protocol:protocol-models.

Explanation for what is happening:

Identically named subprojects have the following issues:
- publishing as is leads to classpath confusion when jars with the same names are placed in the Java distribution. This leads to NoClassDefFound errors at runtime.
- deconflicting the jar names without changing directory names leads to dependency errors, because the OSS jar pom files are generated from project dependencies (that is, a dependency on a sibling subproject in the same repo) that use the subproject's group and name as a reference. This means the generated jars look for jars that do not exist (as their names have been changed) and cannot compile.
- the workaround for changing a subproject's name involves resetting the subproject's name in settings.gradle and depending on the new name in each build.gradle. This increases the configuration burden and hurts readability, since one has to check settings.gradle to know what the right subproject name is. See gradle/gradle#847 ("Projects with same name lead to unintended conflict resolution") for more info.
- given that Gradle itself doesn't support identically named subprojects (see the linked issue), the simplest solution is to not allow duplicated directory names. I've only renamed conflicting directories here to keep things simple. I will create a follow-up issue to enforce non-identical subproject names in our builds.
2022-06-06 00:35:43 +08:00
Alexandre Girard
3894134d11 Bump year in license short to 2022 (#13191)
* Bump to 2022

* format
2022-05-25 17:56:49 -07:00
Evan Tahler
91d6d29085 AirbyteExceptionHandler should exit with a non-0 exit code (#12856) 2022-05-16 11:17:40 -07:00
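As an illustration of what the entry above (#12856) describes, here is a minimal sketch of an uncaught-exception handler that forces a non-zero exit code. It is not Airbyte's `AirbyteExceptionHandler`: the class name `ExitOnUncaughtSketch`, the exit code value, and the injectable exit callback are all assumptions made so the example is self-contained and testable.

```java
import java.util.function.IntConsumer;

// Hypothetical sketch; not the real AirbyteExceptionHandler.
public class ExitOnUncaughtSketch implements Thread.UncaughtExceptionHandler {

  // Assumed value; the commit only says the code must be non-zero.
  public static final int FAILURE_EXIT_CODE = 1;

  // The exit action is injected so tests can observe the code
  // without terminating the JVM.
  private final IntConsumer exit;

  public ExitOnUncaughtSketch(IntConsumer exit) {
    this.exit = exit;
  }

  @Override
  public void uncaughtException(Thread thread, Throwable throwable) {
    // Report the failure, then signal it to the caller of the process.
    System.err.println("Uncaught exception in thread " + thread.getName() + ": " + throwable);
    exit.accept(FAILURE_EXIT_CODE);
  }
}
```

In a real process the handler would presumably be installed once at startup with `Thread.setDefaultUncaughtExceptionHandler(new ExitOnUncaughtSketch(System::exit))`, so that a failure on any thread ends the process with a non-zero status even if other non-daemon threads are still running.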
George Claireaux
2af780db3e base-java: Add utility for AirbyteTraceMessage and naively emit on any connector error (#12614)
* added AirbyteLoggedException class

* adding in int runr

* changes

* refactored to AirbyteTracedException to align with python impl.

* added catch for Exceptions that are already AirbyteTracedException

* refactor to static class & catch with UncaughtExceptionHandler

* testing ExceptionHandler

* add tests

* added docs section on using AirbyteTraceMessageUtility

* made AirbyteMessage maker methods more intuitive

* fix spotbugs errors

* format
2022-05-12 11:08:52 +01:00
Jonathan Pearlin
ebb9f3e1ac Prepare Database Access Layer for Dependency Injection (#12546)
* Prepare database access objects for dependency injection

* Replace duplicate code

* Remove unused imports

* Remove redundant validation call

* Remove unused imports

* Use constants

* Disable fast fail during connection pool initialization

* Remove typo

* Add missing test dependency

* Add missing test dependency

* Add missing test dependency

* Fix issue caused by rebase

* Add method for cloud

* Autoclose DSL context during migration

* Better connection close handling

* Fix typo in dependency

* Fix SpotBugs issue

* React to rebase

* Fix typo

* Update JavaDoc

* Fix database close calls

* Pass configs to getServer

* Fix typo

* Fix call to removed method

* Fix typo

* Use catalog to manage versions

* PR feedback

* Centralize shutdown hook

* Fix rebase issues

* Document test cases

* Document test cases

* Formatting

* Properly close database resources

* Rebase cleanup
2022-05-09 15:26:54 -04:00
Edward Gao
3d416129c7 🐛 Prevent sources from hanging if they have orphaned threads (#12544) 2022-05-03 18:48:43 -07:00
Greg Solovyev
53e625a511 Bump mina-sshd from 2.7.0 to 2.8.0 (#12376)
This is an attempt to merge the main change from https://github.com/airbytehq/airbyte/pull/11514, which now has multiple conflicts.

The gist of the change

When creating a Postgres destination connector with the SSH tunnel method 'SSH Key Authentication', one is required to provide an RSA key. Using an rsa-sha2-256 or rsa-sha2-512 key results in the error `SshException: KeyExchange signature verification failed for key type=ssh-rsa` unless ssh-rsa is enabled in the SSH server's host key algorithms.

mina-sshd in version 2.7.0 uses the wrong server key signature algorithm during DH group key exchange. https://issues.apache.org/jira/browse/SSHD-1163.

Bumping mina-sshd to version 2.8.0 addresses this issue. Changelog https://github.com/apache/mina-sshd/blob/master/docs/changes/2.8.0.md.
2022-04-26 14:37:50 -07:00
Parker Mossman
884a94ed29 Un-Revert OSS branch build for Cloud workflow (#11808)
* Revert "Revert "Build OSS branch for deploying to Cloud env (#11474)""

This reverts commit 55e3c1e051.

* add action to get dev branch tag to OSS project instead of doing it in cloud

* remove dev branch version action, going to do this in cloud after all
2022-04-08 15:17:04 -07:00
LiRen Tu
8bd2d9b518 🎉 BigQuery destination: use serialized buffer for gcs staging (#11776)
* Rebase bigquery changes to master

* Add comments

* Uncomment test code

* Format code

* Bump versions

* Fix denormalized destination target table name

* Fix avro schema for denormalized destination

* Remove unnecessary params from consumer factory

* Add back previous version

* Add warning about standard mode

* auto-bump connector version

* Bump version for bigquery in seed

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-04-07 16:59:19 -07:00
lmossman
55e3c1e051 Revert "Build OSS branch for deploying to Cloud env (#11474)"
This reverts commit 189efe7b42.
2022-04-05 15:44:31 -07:00
Parker Mossman
189efe7b42 Build OSS branch for deploying to Cloud env (#11474)
* add VERSION buildArg to Dockerfiles, default to current airbyte version but overwritable

* use VERSION env var consistently as Dockerfile buildArg, jar version, and tag

pass version and image_tag into docker build task function

* add github action for building and pushing an OSS branch for Cloud to consume

* allow AirbyteVersion to validate versions containing 'oss-branch' prefix

* change oss-branch prefix to dev for branch-based versions

* better action name

* add docker-compose-cloud.build.yaml to define minimum set of cloud images that are pushed by oss branch action

* update local dev docs to describe optional usage of VERSION env var

* make branch_version_tag input optional, if not provided, generates dev-<commit_hash>

* fix typo

* fix missed merge conflict

* update docker docs

* update integrationRunner isDev check
2022-04-05 15:06:17 -07:00
Christophe Duong
848bb349b5 🎉 Change destination-s3 buffering to reduce/stabilize memory/thread consumption (#11294)
* Refactor destination-s3 to use the new serialization strategy and get memory usage under control
2022-03-28 17:40:44 +02:00
LiRen Tu
21ec23cc31 🐞 Fix invalid char in snowflake & bigquery namespace (#10793)
* Add namespace test for snowflake

* Enable namespace test for bigquery

* Format code

* Capitalize test case id

* Update exception message to point to test case file

* Update snowflake name transformer to prepend underscore

* Override convertStreamName instead of getIdentifier

* Add missing state message

* Remove unused import

* Disable more namespace test cases

We don't want to introduce changes that will affect existing connections for now.

* Dry method that mutates namespace

* Pass through null

* Normalize namespace

* Fix test case

* Revert consumer factory changes

* Normalize namespace in catalog

* Revert catalog normalization

* Enable namespace test for all snowflake destination tests

* Test namespace for both bigquery destination tests

* Add unit test for bigquery name transformer

* Transform bigquery schema name

* Fix avro name transformer

* Normalize avro namespace

* Standardize namespace in gcs utils

* Bump version for snowflake and bigquery

* Enable namespace test for bigquery denormalized

* Dry bigquery denormalized acceptance test

* Revert some of the variable scope change

* Fix unit test

* Bump version

* Introduce getNamespace method

* Implement getNamespace method for bigquery

* Switch to getNamespace methods

* Update comments

* Fix bigquery denormalized acceptance test

* Format code

* Dry bigquery destination test

* Skip partition test for gcs mode

* Bump version
2022-03-19 17:47:24 -07:00
Christophe Duong
298551d501 🎉 Change destination-snowflake buffering when staging to reduce/stabilize memory/thread consumption (#10866)
* Refactor Snowflake internal Staging as model to share staging abilities in jdbc destinations

* Switch Snowflake Copy Destination for Staging destination based off Internal Staging

Co-authored-by: LiRen Tu <tuliren.git@outlook.com>

* Bump version of destination-snowflake
2022-03-19 00:13:59 +01:00
LiRen Tu
462cdd6aad Remove sentry flag in integration runner (#11224)
When Sentry is not initialized it will just do nothing. So it is always safe to call the captureMessage method.
2022-03-17 02:32:44 -07:00
Charles
5fde59fdbd add spotbugs (#10522) 2022-03-11 12:05:17 -08:00
Christophe Duong
744e0d5f13 Refactor Snowflake internal Staging as a base class for other staging classes (#10865)
* Refactor Snowflake internal Staging as model to share staging abilities in jdbc destinations
2022-03-11 15:29:12 +01:00
LiRen Tu
81417e6728 Add connector metadata as sentry tags (#10475)
* Pass worker metadata to connector

* Fix compilation

* Pass in job id and image from worker

* Remove application version

* Add default job environment variables

* Add back removed comment

* Rename env map to job metadata

* Fix env configs

* Read connector from application

* Use empty string

* Remove println

* Fix unit test

* Fix compilation error

* Introduce constants for worker env

* Add worker env to ENV_VARS_TO_TRANSFER

* Pass into getWorkerMetadata map to all constructions

* Format code

* Format octavia cli

* Fix test compilation

* Fix typos
2022-03-09 07:36:03 -08:00
girarda
adea13cea7 Fix redshift and oracle acceptance tests (#10855)
* parse jdbc parameters

* Also fix redshift

* other oracle source acceptance test

* This is & now

* This is & now

* This is & now

* This is & now

* This is & now

* also update nne

* increase sleep to 11 seconds

* Bump to 15 seconds

* gradlew format

* try to reformat

* gradlew format

* Run ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates --scan

* reset to master

* Revert "reset to master"

This reverts commit d6141ed933.
2022-03-04 16:55:44 -08:00
Christophe Duong
0d38f276bf Surface any active child thread of dying connectors (#10660)
* Interrupt child thread of dying connectors to avoid getting stuck

* Catch and print stacktrace

* Add test on interrupt/kill time outs

* Send message to sentry too
2022-03-03 12:17:59 +01:00
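The idea in the entry above (#10660), listing any child thread that is still active when a connector is shutting down, interrupting it, and waiting a bounded time, can be sketched as follows. This is an assumption-laden illustration rather than the code from that commit: the class `OrphanThreadSketch`, its method names, and the timeout handling are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the implementation from the commit.
public class OrphanThreadSketch {

  /** Returns live, non-daemon threads other than the caller; such threads keep a JVM from exiting. */
  public static List<Thread> findOrphans() {
    List<Thread> orphans = new ArrayList<>();
    for (Thread t : Thread.getAllStackTraces().keySet()) {
      if (t != Thread.currentThread() && t.isAlive() && !t.isDaemon()) {
        orphans.add(t);
      }
    }
    return orphans;
  }

  /**
   * Logs and interrupts each thread, then waits up to timeoutMillis for it to finish.
   * timeoutMillis must be positive, because Thread.join(0) waits forever.
   * Returns the threads that are still alive after the wait.
   */
  public static List<Thread> interruptAndWait(List<Thread> threads, long timeoutMillis)
      throws InterruptedException {
    List<Thread> stuck = new ArrayList<>();
    for (Thread t : threads) {
      System.err.println("Active non-daemon thread: " + t.getName());
      t.interrupt();
      t.join(timeoutMillis);
      if (t.isAlive()) {
        stuck.add(t);
      }
    }
    return stuck;
  }
}
```

Threads that ignore interruption end up in the returned list; a caller could then report them and exit the process explicitly instead of hanging.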
Subodh Kant Chaturvedi
c14260aa8d close ssh tunnel in case of exception in destination consumer (#10686)
* close ssh tunnel in case of exception

* format
2022-03-02 03:56:32 +05:30
Subodh Kant Chaturvedi
f71754d836 close ssh in case of exception during check in Postgres connector (#10620)
* close ssh in case of exception

* remove unwanted change

* remove comment

* format

* do not close scanner

* fix semi-colon

* format
2022-02-28 17:13:33 +05:30
Subodh Kant Chaturvedi
9e22a558b9 add logs in FailureTrackingAirbyteMessageConsumer for debug (#10455) 2022-02-18 23:06:58 +05:30
LiRen Tu
049a11b2bc 🎉 Snowflake destination: reduce memory footprint (#10394)
* Add detailed logging for flushing

* Log sentry transaction event id

* Adjust logging

* Log memory usage

* Add jvm monitoring

* Remove log

* Remove port 9010

* Remove host network mode

* Sample record size

* Remove profiling code

* Add unit tests

* Use average estimation

* Rename variable

* Format code

* Bump version

* Revert unnecessary change

* Update doc

* Fix format

* Bump version in seed
2022-02-17 12:55:35 -08:00
Jared Rhizor
f94f42775c fix formatting on master (#10360) 2022-02-15 14:41:41 -08:00
andriikorotkov
b3916c987a 🐛 Snowflake Destination: use better file size with S3 staging files (#9920)
* split s3 staging files to files by 100 Mb and removed legacyS3StreamCopier

* split s3 staging files to files by 100 Mb and removed legacyS3StreamCopier

* updated code style

* fix remarks

* fix remarks

* fix code style

* fix remarks

* fix remarks

* fix remarks

* updated documentations and images versions

* updated documentation
2022-02-15 22:20:22 +02:00
LiRen Tu
6301cfa91f 🎉 Destination snowflake: reduce memory consumption (#10297)
* Avoid redundant adapter construction

* Remove unused logger

* Avoid redundant creation of buffer map

* Decrease max batch byte size to 128 mb

* Format code

* Move data adapter to an instance variable

* Bump version

* Bump version in seed
2022-02-14 23:37:54 -08:00
VitaliiMaltsev
e30d8348b2 Change JsonSchemaPrimitive to a class (#9913)
* fix for jdk 17

* add JsonSchemaType class

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* fix Oracle tests

* fix Redshift tests

* fix Redshift tests

* fix checkstyle

* fix MSSQL tests

* fix cockroachdb tests

* fix checkstyle

* fix checkstyle

* replace star imports

* replace star imports

* replace star imports

* update JsonSchemaType | fixed checkstyle

* Remove unused variables in test

* Fix imports

* Expand imports

* Fix more imports

Co-authored-by: vmaltsev <vitalii.maltsev@globallogic.com>
Co-authored-by: Liren Tu <tuliren.git@outlook.com>
2022-02-14 02:12:37 -08:00
LiRen Tu
39049cbf24 🎉 End-to-end test source: support stream duplication (#10298)
* Add support to duplicate one stream multiple times

* Update cloud version

* Bump version

* Remove unused logger

* Add trace for each stream in mock source

* Fix cloud version

* Bump version in seed
2022-02-12 00:50:48 -08:00
LiRen Tu
5133ce6f4c 🐛 Destination snowflake & bigquery: fix null pointer exception (#9959)
* Prevent null exception

* Check nullable schema name

* Bump version

* Bump version in seed
2022-02-01 04:27:07 -08:00
LiRen Tu
8e8f402b8a 🎉 Destination snowflake & bigquery: integrate with sentry (#9945)
* Update doc

* Use empty dsn when sentry is not enabled

* Bump version in seed
2022-01-31 20:27:52 -08:00
LiRen Tu
a4b8edffbf Add detailed sentry tracing for JDBC destination stream consumer (#9898)
* Refactor airbyte sentry

* Add more sentry monitoring for jdbc destination

* Profile operations from failure tracking consumer

* Remove redundant method call

* Update operation names

* Trace snowflake starting process

* Trace snowflake copying step

* Move tracing to sql operation
2022-01-31 11:11:53 -08:00
Alexander Tsukanov
479f0d7c8d [MVP] Integrate sentry to all java-based connectors (#9745)
* airbyte-9328: Added Sentry integration to BigQuery and BigQuery denormalized connector.

* airbyte-5050: Added strategy for INSERT ROW.

* airbyte-9328: Added Sentry integration to Snowflake.

* airbyte-9328: Fix Sentry config.

* airbyte-9328: Fixed PR comments.

* airbyte-9328: Fixed PR comments.

* airbyte-9328: Fix PR comments.

* airbyte-9328: Fixed PR comments.

* airbyte-9328: Fixed PR comments.

* airbyte-9328: Fixed PR comments.

* airbyte-9328: Small changes.

* airbyte-9328: Small changes.

* airbyte-9328: Move SENTRY DSN keys to Dockerfiles.

* Use new dsn

* Revert format

* Remove sentry dsn from compose temporarily

* Log sentry event id

* Move sentry to java base

* Remove sentry code from bigquery

* Update dockerfiles

* Fix build

* Update release tag format

* Bump version

* Add env to dockerfiles

* Fix e2e test connector dockerfile

* Fix snowflake bigquery dockerfile

* Mark new versions as unpublished

Co-authored-by: LiRen Tu <tuliren@gmail.com>
Co-authored-by: Liren Tu <tuliren.git@outlook.com>
2022-01-29 16:58:35 -08:00
Eugene
4534703589 🎉Source Postgres: Set up connection - add schema selection (#9360)
* [1435] Source Postgres: Set up connection - added schema selection
2022-01-13 16:24:38 +02:00
Serhii Chvaliuk
c0a46c1987 BufferedStreamConsumerTest: remove non-determinism in size of generated test records (#9274)
* generate records with a fixed size of 40 bytes

* fix buffer flush

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
2022-01-04 17:04:36 +02:00
Eugene
8c84ad2976 🎉Source-postgres\mssql\mysql added a HEAP dump capturing on outOfMemory Error (if any) (#8811)
* Updated entrypoint script for source-mysql\mssql\postgres connectors to capture a HEAP dump when connector fails with outOfMemory error
2021-12-28 17:34:35 +02:00
Alexander Tsukanov
eea41b4fc8 🎉 Destination Snowflake and RedShift: Implement the Byte-buffered logic (#8869)
* airbyte-8336: Byte based approach.

* test-commit

* airbyte-8336: Split file by chunks.

* airbyte-8336: Renamed variable.

* airbyte-8336: make snowflake DEFAULT_MAX_BATCH_SIZE_BYTES_SNOWFLAKE constant.

* airbyte-8336: make snowflake DEFAULT_MAX_BATCH_SIZE_BYTES_SNOWFLAKE constant.

* airbyte-8336: make snowflake DEFAULT_MAX_BATCH_SIZE_BYTES_SNOWFLAKE constant.

* airbyte-8336: fix of unit tests

* airbyte-8336: Changed to default buffer size in SnowFlake.

* airbyte-8336: Changed 15 GB to 1 GB for max size.

* airbyte-8336: Changed to default buffer size in SnowFlake.

* airbyte-8336: Bumped connector version.

* airbyte-8336: Bumped connector version.

* airbyte-8336: Bumped connector version.
2021-12-24 12:32:10 +02:00
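The "byte-based approach" named in the entry above (#8869) presumably means that a batch is flushed when its accumulated size in bytes reaches a limit, rather than after a fixed number of records. The sketch below illustrates that general technique only. The class `ByteBufferedSketch`, the use of UTF-8 string length as the size estimate, and the flush callback are assumptions, not the connector's actual code or its `DEFAULT_MAX_BATCH_SIZE_BYTES_SNOWFLAKE` handling.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of byte-based batching; not Airbyte's buffered consumer.
public class ByteBufferedSketch {

  private final long maxBytes;
  private final Consumer<List<String>> flusher;
  private final List<String> buffer = new ArrayList<>();
  private long bufferedBytes = 0;

  public ByteBufferedSketch(long maxBytes, Consumer<List<String>> flusher) {
    this.maxBytes = maxBytes;
    this.flusher = flusher;
  }

  /** Adds a record, flushing the current batch first if the record would exceed the byte budget. */
  public void accept(String record) {
    // Assumed size estimate: the UTF-8 encoded length of the serialized record.
    long size = record.getBytes(StandardCharsets.UTF_8).length;
    if (!buffer.isEmpty() && bufferedBytes + size > maxBytes) {
      flush();
    }
    buffer.add(record);
    bufferedBytes += size;
  }

  /** Hands the buffered records to the flusher and resets the byte counter. */
  public void flush() {
    if (buffer.isEmpty()) {
      return;
    }
    flusher.accept(new ArrayList<>(buffer));
    buffer.clear();
    bufferedBytes = 0;
  }
}
```

A record larger than the limit still forms a batch of its own in this sketch. Callers need a final `flush()` after the last record so the trailing partial batch is not lost.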
Charles
3a6902963d remove validation from source delete (#8724) 2021-12-16 09:35:15 -08:00
Jared Rhizor
25674fc306 upgrade to Gradle 7.3.1 / Java 17 (#7964)
* upgrade gradle

* upgrade to Java 17 (and fix a few of the node versioning misses)

* oops

* try to run a different format version

* fix spotless by upgrading / reformatting some files

* fix ci settings

* upgrade mockito to avoid other errors

* undo bad format

* fix "incorrect" sql comments

* fmt

* add debug flag

* remove

* bump

* bump jooq to a version that has a java 17 dist

* fix

* remove logs

* oops

* revert jooq upgrade

* fix

* set up java for connector test

* fix yaml

* generate std source tests

* fail zombie job attempts and add failure reason (#8709)

* fail zombie job attempts and add failure reason

* remove failure reason

* bump gcp dependencies to pick up grpc update (#8713)

* Bump Airbyte version from 0.33.9-alpha to 0.33.10-alpha (#8714)

Co-authored-by: jrhizor <jrhizor@users.noreply.github.com>

* Change CDK "Caching" header to "nested streams & caching"

* Update fields in source-connectors specifications: file, freshdesk, github, google-directory, google-workspace-admin-reports, iterable (#8524)

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

Co-authored-by: Serhii Chvaliuk <grubberr@gmail.com>
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* move S3Config into destination-s3; update dependencies accordingly (#8562)

Co-authored-by: Lake Mossman <lake@airbyte.io>
Co-authored-by: jrhizor <jrhizor@users.noreply.github.com>
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: Iryna Grankova <87977540+igrankova@users.noreply.github.com>
Co-authored-by: Serhii Chvaliuk <grubberr@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2021-12-10 16:57:54 -08:00
oneshcheret
fcf7491fee 🐛 Validate incorrect handling '\n' symbols in ssh key (#8371)
* 🐛 Validate \n in ssh key

* bump versions for ssh key connectors

* update versions for ssh key connectors

* temporal fix for checking failed tests

* revert temp changes and destination oracle version

* bump versions in config for ssh key related connectors
2021-12-03 20:11:51 +02:00