mirror of synced 2026-01-10 09:04:48 -05:00
Commit Graph

52 Commits

Author SHA1 Message Date
Davin Chia
83a89aa843 Fat Jar: Rename Dir Part 1 (#13476)
Part 1 of #13122.

Rename airbyte-db:lib to airbyte-db:db-lib.
Rename airbyte-metrics:lib to airbyte-metrics:metrics-lib
Rename airbyte-protocol:models to airbyte-protocol:protocol-models.

Explanation for what is happening:

Identically named subprojects have the following issues:
- publishing as is leads to classpath confusion when jars with the same name are placed in the Java distribution. This leads to NoClassDefFound errors at runtime.
- deconflicting the jar names without changing directory names leads to dependency errors, as the OSS jar pom files are generated using project dependencies (that is, a dependency on a sibling subproject in the same repo) that use the subproject's group and name as a reference. This means the generated jars look for jars that do not exist (as their names have been changed) and cannot compile.
- the workaround for changing a subproject's name involves resetting the subproject's name in settings.gradle and depending on the new name in each build.gradle. This increases configuration burden and makes the build harder to read, since one has to check settings.gradle to know the right subproject name. See gradle/gradle#847 ("Projects with same name lead to unintended conflict resolution") for more info.
- given that Gradle itself doesn't support identically named subprojects (see the linked issue), the simplest solution is to not allow duplicated directory names. I've only renamed conflicting directories here to keep things simple. I will create a follow-up issue to enforce non-identical subproject names in our builds.
2022-06-06 00:35:43 +08:00
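
The naming conflict described in the commit above can be illustrated with a small check that groups project paths by their last segment. This is only a sketch in Python, not part of the Airbyte build: the function name `find_duplicate_leaf_names` is hypothetical, and the project paths are taken from the rename list in the commit message.

```python
from collections import defaultdict


def find_duplicate_leaf_names(project_paths):
    """Group Gradle-style project paths (e.g. ':airbyte-db:lib') by last segment."""
    by_leaf = defaultdict(list)
    for path in project_paths:
        leaf = path.rstrip(":").split(":")[-1]
        by_leaf[leaf].append(path)
    # Keep only leaf names that more than one project claims.
    return {leaf: sorted(paths) for leaf, paths in by_leaf.items() if len(paths) > 1}


# Paths before and after the rename, as listed in the commit message.
before = [":airbyte-db:lib", ":airbyte-metrics:lib", ":airbyte-protocol:models"]
after = [
    ":airbyte-db:db-lib",
    ":airbyte-metrics:metrics-lib",
    ":airbyte-protocol:protocol-models",
]

duplicates_before = find_duplicate_leaf_names(before)
duplicates_after = find_duplicate_leaf_names(after)
print(duplicates_before)
print(duplicates_after)
```

A check of this kind could, in principle, back the follow-up enforcement the commit mentions; how the real build enforces it is not stated here.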
Alexandre Girard
3894134d11 Bump year in license short to 2022 (#13191)
* Bump to 2022

* format
2022-05-25 17:56:49 -07:00
Topher Lubaway
013a886f4f Fixes Spotless and runs spotless (#13040)
zipped files with JSON extension made this task sad
2022-05-20 07:26:55 -05:00
Jonathan Pearlin
ebb9f3e1ac Prepare Database Access Layer for Dependency Injection (#12546)
* Prepare database access objects for dependency injection

* Replace duplicate code

* Remove unused imports

* Remove redundant validation call

* Remove unused imports

* Use constants

* Disable fast fail during connection pool initialization

* Remove typo

* Add missing test dependency

* Add missing test dependency

* Add missing test dependency

* Fix issue caused by rebase

* Add method for cloud

* Autoclose DSL context during migration

* Better connection close handling

* Fix typo in dependency

* Fix SpotBugs issue

* React to rebase

* Fix typo

* Update JavaDoc

* Fix database close calls

* Pass configs to getServer

* Fix typo

* Fix call to removed method

* Fix typo

* Use catalog to manage versions

* PR feedback

* Centralize shutdown hook

* Fix rebase issues

* Document test cases

* Document test cases

* Formatting

* Properly close database resources

* Rebase cleanup
2022-05-09 15:26:54 -04:00
LiRen Tu
35f2aa9aed 🎉 Jdbc sources: publish new version with adaptive fetch size (#12480)
* Default scaffold to use adaptive streaming config

* Switch more connectors to use adaptive streaming config

* Bump version for cockroach db

* Bump version for db2

* Bump mssql version

* Bump mysql version

* Bump oracle version

* Bump postgres version

* Bump redshift version

* Bump snowflake version

* Bump tidb version

* auto-bump connector version

* Fix db2 findbug issue

* auto-bump connector version

* auto-bump connector version

* auto-bump connector version

* auto-bump connector version

* Fix more findbug issues

* auto-bump connector version

* auto-bump connector version

* auto-bump connector version

* Fix findbug issue for mysql-strict-encrypt

* Fix findbugs issue for oracle source

* auto-bump connector version

* Remove suppress warnings annotation

* Fix oracle encrypt tests

* Fix oracle encrypt acceptance test

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-04-29 23:14:58 -07:00
LiRen Tu
55a0db7c67 🎉 JDBC source: adjust streaming query fetch size dynamically (#12400)
* Merge all streaming configs to one

* Implement new streaming query config

* Format code

* Fix comparison

* Use double for mean byte size

* Update fetch size only when changed

* Calculate mean size by sampling n rows

* Add javadoc

* Change min fetch size to 1

* Add comment by buffer size

* Update java connector template

* Perform division first

* Add unit test for fetching large rows

* Format code

* Fix connector compilation error
2022-04-28 22:36:17 -07:00
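
The bullet points in the commit above (sample rows, keep a floating-point mean row size, clamp to a minimum of 1, update only on change) suggest roughly the following shape. This is a hedged sketch in Python rather than the connector's Java code: the class name, the `target_buffer_bytes` and `max_fetch_size` parameters, and the sampling interval are all assumptions made for illustration.

```python
class AdaptiveFetchSizeEstimator:
    """Illustrative estimator: fetch size ~= target buffer bytes / mean row bytes."""

    def __init__(self, target_buffer_bytes, initial_fetch_size,
                 min_fetch_size=1, max_fetch_size=100_000, sample_every=10):
        self.target_buffer_bytes = target_buffer_bytes  # assumed memory budget
        self.fetch_size = initial_fetch_size
        self.min_fetch_size = min_fetch_size
        self.max_fetch_size = max_fetch_size            # assumed upper bound
        self.sample_every = sample_every                # sample every n-th row
        self.mean_row_bytes = 0.0                       # float, not int
        self.seen = 0
        self.sampled = 0

    def observe(self, row_bytes):
        """Record one row; only every n-th row contributes to the mean."""
        self.seen += 1
        if self.seen % self.sample_every != 0:
            return
        self.sampled += 1
        # Incremental mean: divide the delta first instead of summing all sizes.
        self.mean_row_bytes += (row_bytes - self.mean_row_bytes) / self.sampled

    def next_fetch_size(self):
        """Return the new fetch size if it changed, otherwise None."""
        if self.sampled == 0 or self.mean_row_bytes <= 0:
            return None
        proposed = int(self.target_buffer_bytes / self.mean_row_bytes)
        proposed = max(self.min_fetch_size, min(self.max_fetch_size, proposed))
        if proposed == self.fetch_size:
            return None
        self.fetch_size = proposed
        return proposed


small_rows = AdaptiveFetchSizeEstimator(1_000_000, 10, sample_every=1)
for _ in range(5):
    small_rows.observe(1_000)
first_update = small_rows.next_fetch_size()
second_update = small_rows.next_fetch_size()

large_rows = AdaptiveFetchSizeEstimator(1_000_000, 10, sample_every=1)
large_rows.observe(5_000_000)
large_update = large_rows.next_fetch_size()
```

Returning `None` when nothing changed lets a caller skip a redundant call on the underlying statement; a single row larger than the budget still yields a fetch size of 1 rather than 0.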
girarda
adea13cea7 Fix redshift and oracle acceptance tests (#10855)
* parse jdbc parameters

* Also fix redshift

* other oracle source acceptance test

* This is & now

* This is & now

* This is & now

* This is & now

* This is & now

* also update nne

* increase sleep to 11 seconds

* Bump to 15 seconds

* gradlew format

* try to reformat

* gradlew format

* Run ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates --scan

* reset to master

* Revert "reset to master"

This reverts commit d6141ed933.
2022-03-04 16:55:44 -08:00
Augustin
04521d063f 🎉 Source redshift: implement privileges check (#9744) 2022-03-01 14:45:04 +01:00
Lake Mossman
3d8a0dc048 Add ExitOnOutOfMemoryError to java connectors and bump versions (#10256) 2022-02-14 15:49:15 -08:00
VitaliiMaltsev
e30d8348b2 Change JsonSchemaPrimitive to a class (#9913)
* fix for jdk 17

* add JsonSchemaType class

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* fix Oracle tests

* fix Redshift tests

* fix Redshift tests

* fix checkstyle

* fix MSSQL tests

* fix cockroachdb tests

* fix checkstyle

* fix checkstyle

* replace star imports

* replace star imports

* replace star imports

* update JsonSchemaType | fixed checkstyle

* Remove unused variables in test

* Fix imports

* Expand imports

* Fix more imports

Co-authored-by: vmaltsev <vitalii.maltsev@globallogic.com>
Co-authored-by: Liren Tu <tuliren.git@outlook.com>
2022-02-14 02:12:37 -08:00
LiRen Tu
2f41810cca Verify source redshift schema selection in tests (#9862)
* Verify catalog in redshift source acceptance test

* Dry code

* Fix tests
2022-01-28 11:37:44 -08:00
LiRen Tu
e4661fb92a Remove regex check from Java source acceptance test (#9829)
* Move getRegexTests to python source acceptance test

* Remove unused imports

* Update test template
2022-01-26 17:51:37 -08:00
Eugene
5bce24c469 🎉 Source-redshift: added an optional field for schema(s) selection (#9721)
* [9525] source-redshift: added schema selection
2022-01-26 22:09:52 +02:00
Iryna Grankova
579923d2f8 Update fields in source-connectors specifications: posthog, recurly, redshift, salesforce, salesloft (#8617)
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
Co-authored-by: Sergey Chvalyuk <grubberr@gmail.com>
2022-01-25 16:14:02 +02:00
Yurii Bidiuk
cd30cf4fca bump version for affected sources from #8749 (#8958)
* bump versions

* fix tests for cockroachdb

* update changelog
2021-12-24 14:41:22 +02:00
Serhii Chvaliuk
844dd93122 Use multi-stage builds in dockerfiles to reduce java images (#9077)
* use multi-stage to reduce image size

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
2021-12-23 22:27:05 +02:00
Oleksandr Sheheda
f0a9945f80 Revert "Change copy to add in dockerfiles to reduce container size (#8516)" (#8997)
This reverts commit 8ac2c6f4
2021-12-21 16:01:17 +02:00
Haoran Yu
8ac2c6f4a7 Change copy to add in dockerfiles to reduce container size (#8516)
Co-authored-by: Oleksandr Sheheda <alexandr-shegeda@users.noreply.github.com>
2021-12-16 23:52:59 -03:00
LiRen Tu
6843bc1d1f 🎉 Source MySQL: support all MySQL 8.0 types (#7970)
* Add jdbc compatible layer

* Support routine mysql types

* Format code

* Fix build

* Refactor abstract jdbc source and operation classes

* Update mysql source operations

* Test discover command for mysql

* Remove abstract jdbc compatible source layer

* Format code

* Update template

* Fix more types

* Bump version

* Log original field type

* Update comments

* Bump version in seed
2021-12-11 21:49:32 -08:00
VitaliiMaltsev
0c932749d4 🎉 Redshift Source and Destination set SSL as default option (#7234)
* Redshift Source and Destination set SSL as default option

* add changelog

* remove SSL test| add more documentation

* bump new version

* bump new version

Co-authored-by: vmaltsev <vitalii.maltsev@globallogic.com>
2021-10-22 12:28:54 +03:00
Charles
ba44f700b9 add final for params, local variables, and fields (#7084) 2021-10-15 16:41:04 -07:00
VitaliiMaltsev
ae9048cfaa 🎉 Redshift Source/Destination SSL Support (#6965)
* add tls option to spec

* Redshift Source add acceptance test

* Redshift Destination add ssl field to spec

* add RedshiftDestinationAcceptanceTestSSL

* fix checkstyle

* added changelog

* update docs

* bump versions of Redshift Source and Destination \ changed default tls to true

Co-authored-by: vmaltsev <vitalii.maltsev@globallogic.com>
2021-10-14 12:29:00 +03:00
Andrii Leonets
404d673aae Move SourceJdbcUtils.java to JdbcSourceOperations.java (#6397)
* move SourceJdbcUtils.java to JdbcSourceOperations.java

* format

* add proxy methods for overwriting

* Move SourceJdbcUtils.java to JdbcSourceOperations.java #6397
resolve conflicts

Co-authored-by: Oleksandr Sheheda <alexandrshegeda@gmail.com>
2021-10-04 14:32:37 +03:00
Charles
f30869001a Exposing SSL-only version of Postgres Source (#6362) 2021-09-27 16:46:39 -07:00
Michel Tricot
1773e41e47 Shorten our headers + adds contributors file (#6478) 2021-09-27 10:45:50 -07:00
Subodh Kant Chaturvedi
7591324351 introduce jvm flag MaxRAMPercentage for java connectors (#6001)
* introduce jvm flag MaxRAMPercentage for java connectors

* temporary commit to test this out on GKE via kube acceptance test

* undo temp commit
2021-09-22 19:59:39 +05:30
LiRen Tu
b9e1997d2f Split airbyte-db and move db dev commands to gradle (#5616)
# Summary

- A follow-up PR for #5543.
- This PR separates the `airbyte-db` project into two modules:
  - `lib` is the original `airbyte-db`.
  - `jooq` is for jOOQ code generation.
- This is necessary because the jOOQ generator requires a custom database implementation that can run Flyway migration. So the code generator logic needs to depend on the compilation of the original `airbyte-db` project.

# Commits
* Separate db to lib and jooq modules
* Update dependencies
* Add jobs db migrator test
* Fix compose build
* Add migration dev center
* Add schema dump task
* Update airbyte-db/lib/README.md
  * Co-authored-by: Davin Chia <davinchia@gmail.com>
* Update readme
* Remove bom dependency
* Update readme
* Use jooq code in db config persistence
* Remove AirbyteConfigsTable

Co-authored-by: Davin Chia <davinchia@gmail.com>
2021-08-26 10:44:09 -07:00
Eugene
a78efe090b 🎉 JAVA-Based connectors: Bumped version for some javabased connector to start using Config Validator from core module (#5398)
* Updated some java-based connectors version to start using new json config validator from java core
2021-08-17 22:30:16 +03:00
Andrii Leonets
107f5b8d61 🎉 Abstract level for SQL relational database sources (#4123)
Abstract level for SQL relational database sources
2021-07-05 17:18:07 +03:00
Davin Chia
b04c080c95 Kube Queueing POC (#3464)
* Use CDK to generate source that can be configured to emit a certain number of records and always works.

* Checkpoint: socat works from inside the docker container.

* Override the entry point.

* Clean up and add ReadMe.

* Clean up socat.

* Checkpoint: connect to Kube cluster and list all the pods.

* Checkpoint: Sync worker pod is able to send output to the destination pod.

* Checkpoint: Sync worker creates Dest pod if none existed previously. It also waits for the pod to be ready before doing anything else. Sync worker will also remove the pod on termination.

* update readme

* Checkpoint: Dest pod does not restart after finishing. Comment out delete command in Sync worker.

* working towards named pipes

* named pipes working

* update readme

* WIP named pipe / socat sidecar kube port forwarding (#3518)

* nearly working sources

* update

* stdin example

* move all kube testing yamls into the airbyte-workers directories. sort the airbyte-workers resource folder; place all the poc yamls together.

* Format.

* Put back the original KubeProcessBuilderFactory.

* Fix slight errors.

* Checkpoint: Worker pod knows its own IP. Successfully starts and writes to Dest pod after refactor.

* remove unused file and update readme

* Dest pod loops back into worker pod. However, the right messages do not seem to be passing in.

* Switch back to worker ip.

* SWEET VICTORY!.

* wrap kube pod in process (#3540)

also clean up kubernetes deploys.

* More clean up. (#3586)

The first 6 points of #3464.

The only interesting thing about this PR is the kube pod shutdown. For whatever reason, the OkHttpPool isn't respecting the evictAll call and 1 idle thread remains. So instead of shutting down immediately, the worker pod shuts down after 5 minutes, when the idle thread is reaped. There isn't an easy way to modify the pool's idle reap configuration now. I do not think this issue is blocking since it's relatively benign, so I vote we create a ticket and come back to this once we do an e2e test.

* Implements redirecting standard error as well. (#3623)

* Clean up before next implementation.

* kube process launching (#3790)

* processes must handle file mounting

* remove comment

* default to base entrypoint

* use process builder factory / select stdin / use a pool of ports

* fix up

* add super hacky copying example

* Checkpoint: Works end to end!

* Checkpoint: Use API to make sure init container is ready instead of blind sleep. Propagate exception in DefaultCheckConnectionWorker.

* Refactor KubePodProcess. Checked to make sure everything still works.

* Format.

* Clean up code. Begin putting this into variables and breaking up long constructor function.

* Add comments to explain what is happening.

* fix normalization test

* increase timeout for initcontainer

Co-authored-by: Davin Chia <davinchia@gmail.com>

* facepalm moment

* clean up kube poc pr (#3834)

* clean up

* remove source-always-works

* create separate commons-docker

* fix test

* enable kube e2e tests (#3866)

* enable kube e2e tests

* use more generally accepted env definition

* use new runners

* use its own runner and install minikube differently

* update name

* use kubectl alias

* use link instead of alias that doesn't propagate

* start minikube

* use driver=none

* go back to using action

* mess with versions

* revert runner

* install socat

* print logs after run

* also try re-running tasks

* always wait for file transfer

* use ports

* increase wait timeout for kube

* use different localhost ips and bump normalization to include an entrypoint

* proposed fix

* all working locally

* revert temporary changes

* revert normalization image change that's happening in a separate pr

* readability

* final comment

* Working Kube Cancel. (#3983)

* Port over the basic changes.

* Add logic to return proper exit code in the event of termination. Add comments to explain why.

* revert envs change and merge master to fix kube acceptance tests (#4012)

* use older env format

* fix build

Co-authored-by: jrhizor <me@jaredrhizor.com>
Co-authored-by: Jared Rhizor <jared@dataline.io>
2021-06-09 18:12:39 -07:00
Jared Rhizor
b4793b2510 add AIRBYTE_ENTRYPOINT for kubernetes support (#3973)
* add AIRBYTE_ENTRYPOINT for kubernetes support

* bump versions

* bump version in seed

* Update generic template

* keep scaffold sources at 0.1.0

* add missing newline

* handle python base versions correctly

* re-bump mysql and postgres sources

* re-bump snowflake destination

* add skip tests option

* switch to running tests

* reverse conditional to make it safer

* fix publish to include the test running

* fix iterable version

* fix file generation

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
2021-06-09 13:01:45 -07:00
Andrii Leonets
213fae17a1 MySQL source: Add comprehensive data type test (#3810) 2021-06-07 14:01:02 +03:00
Charles
8983f09aea normalize connector acceptance test names (#3539)
* Rename standard tests to acceptance tests

* Normalize the names so that the nouns are always in the same order so it is easier to find tests
2021-05-22 13:40:40 -07:00
Davin Chia
42686add8a Release connectors with namespace change. (#2990)
Release all connectors affected by namespace change. Includes all JDBC sources and destinations.

Also add documentation for normalisation. Prerequisite to actually releasing 0.21.0-alpha.
2021-04-21 11:35:08 +08:00
Davin Chia
b9014acfca 🎉 Namespace support. Supported source-destination pairs will now sync data into the same namespace as the source. (#2862)
This PR introduces the following behavior for JDBC sources:
Instead of streamName = schema.tableName, this is now streamName = tableName and namespace = schema. This means that, when replicating from these sources, data will be replicated into a form matching the source, e.g. public.users (postgres source) -> public.users (postgres destination) instead of the current behaviour of public.public_users. Since MySQL does not have schemas, the MySQL source uses the database as its namespace.

To do so:
- Make namespace a first-class concept in Airbyte Protocol. This allows the source to propagate namespace and destinations to write to a source-defined namespace. Also sets us up for future namespace-related configurability.
- Add an optional namespace field to the AirbyteRecordMessage. This field will be set by sources that support namespace.
- Introduce AirbyteStreamNameNamespacePair as a type-safe manner of identifying streams throughout our code base.
- Modify base_normalisation to better support source defined namespace, specifically allowing normalisation of tables with the same name to different schemas.
2021-04-17 15:33:22 +08:00
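
The name/namespace split described in the commit above can be sketched as a small immutable pair type. This is an illustration in Python, not the Java `AirbyteStreamNameNamespacePair` class itself: the field layout and the helper `from_legacy_stream_name` are assumptions, and the helper naively splits on the first dot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StreamNameNamespacePair:
    """Hashable identifier for a stream: table name plus optional namespace."""
    name: str
    namespace: Optional[str] = None


def from_legacy_stream_name(legacy_name):
    """Convert an old-style 'schema.tableName' stream name into a pair.

    Assumption: the first dot separates schema from table. Names without a
    dot are treated as having no namespace.
    """
    schema, separator, table = legacy_name.partition(".")
    if not separator:
        return StreamNameNamespacePair(name=legacy_name)
    return StreamNameNamespacePair(name=table, namespace=schema)


users = from_legacy_stream_name("public.users")
bare = from_legacy_stream_name("users")

# Frozen dataclasses are hashable, so pairs can key per-stream state.
record_counts = {users: 0, bare: 0}
record_counts[StreamNameNamespacePair("users", "public")] += 1
```

Using a dedicated pair instead of a concatenated string keeps `public.users` and a namespace-less `users` distinct without any string parsing at lookup time.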
Davin Chia
e11ccfd0a1 Revert "Remove schema from stream name. (#2807)" (#2857)
This reverts commit 6e9d6fce59.
2021-04-12 14:56:11 -07:00
Davin Chia
6e9d6fce59 Remove schema from stream name. (#2807)
Last step (besides documentation) of namespace changes. This is a follow-up to #2767.

After this change, the following JDBC sources will change their behaviour to the behaviour described in the above document.

Namely, instead of streamName = schema.tableName, this will become streamName = tableName and namespace = schema. This means that, when replicating from these sources, data will be replicated into a form matching the source, e.g. public.users (postgres source) -> public.users (postgres destination) instead of the current behaviour of public.public_users. Since MySQL does not have schemas, the MySQL source uses the database as its namespace.

I cleaned up some bits of the CatalogHelpers. This affected the destinations, so I'm also running the destination tests.
2021-04-12 21:02:29 +08:00
Davin Chia
e8190ff860 🎉 Add NCHAR and NVCHAR support to DB and cursor type casting. (#2600) 2021-03-29 08:09:06 +08:00
Christophe Duong
8a29584125 ☝🏼Destinations supports destination sync mode (#2460)
* Handle destination sync mode in destinations

* Source & Destination sync modes are required (#2500)

* Provide Migration script making sure it is always defined for previous sync configs
2021-03-26 20:23:48 +01:00
Christophe Duong
41e8b6a824 Source support primary keys (#2488)
* Source support primary keys
2021-03-17 19:28:56 +01:00
Christophe Duong
070575ffdf Protocol allows future / unknown properties (#2238)
* Allow new extra properties in validation
* Create migration script to upgrade all connectors versions
* Bumpversion of all connectors
2021-03-09 13:36:36 +01:00
Charles
aadfae24bd Iterator-based JDBC Source (and Redshift bugfix) (#1887) 2021-02-02 17:14:14 -08:00
Charles
f2f3b4ec37 Fix NPE in State Decorator (#1746) 2021-01-25 17:31:23 -08:00
Charles
3670545995 Fix JdbcSource handling of tables with same names in different schemas (#1724)
* Fix JdbcSource handling of tables with same names in different schemas

* Previously the JdbcSource was combining the columns of any tables with the same name across different schemas into a single stream in the catalog.

* This was caught because in those tables there were columns of the same name with different types which triggered a precondition to check for this.

* The fix makes sure we group by both schema name and table name.

* Adds test to the standard jdbc tests to catch this case.

* This test does NOT run for MySQL, as MySQL has no concept of schemas.
2021-01-19 18:45:53 -08:00
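
The bug and fix described in the commit above come down to the choice of grouping key. The following Python sketch is illustrative only (the real code is Java, and the row layout and `group_columns` helper are hypothetical); it shows how keying by table name alone merges columns from two schemas, while keying by schema and table keeps them apart.

```python
from collections import defaultdict


def group_columns(rows, key):
    """Group (column, type) tuples by the key computed from each metadata row."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[key(row)].append((row["column"], row["type"]))
    return dict(grouped)


# Two tables named 'users' in different schemas, with a conflicting 'id' type.
rows = [
    {"schema": "public", "table": "users", "column": "id", "type": "integer"},
    {"schema": "audit", "table": "users", "column": "id", "type": "varchar"},
]

# Buggy grouping: both tables collapse into one stream with two 'id' columns.
by_table_only = group_columns(rows, key=lambda r: r["table"])

# Fixed grouping: one entry per (schema, table) pair.
by_schema_and_table = group_columns(rows, key=lambda r: (r["schema"], r["table"]))
```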
Charles
13c5eef93a Fix JdbcSource Incremental OOM (#1655) 2021-01-14 14:33:44 -08:00
Sherif A. Nada
cd08188d70 handle non standard types in jdbc sources (#1576) 2021-01-07 10:31:33 -08:00
Sherif A. Nada
107bb54143 Hotfix redshift maven repository (#1445) 2020-12-24 20:42:31 -08:00
Charles
8347a69c77 Add Incremental to AbstractJdbcSource (#1306)
* Add standard tests for sources that use the JdbcSource to guarantee that changes do not break any sources that rely on JdbcSource.

* Add JdbcStressTest to verify that we stream / chunk data properly (a.k.a can handle more data in any JdbcSource than fits in memory)

* Migrate MSSQL and Redshift to use the new base source
2020-12-18 14:17:56 -08:00
Sherif A. Nada
93674f6b4d Respect sync mode regardless of input state in mailchimp (#1213) 2020-12-11 13:04:45 -08:00
Jared Rhizor
1bd19d1bae put all integration test tasks under integrationTest (#1231)
* always re-run standardSourceTestPython

* rename and regroup to integrationTest

* add comment
2020-12-07 10:10:26 -08:00