Mirror synced 2026-01-08 03:06:34 -05:00
178 Commits

Author SHA1 Message Date
Jimmy Ma
a600f6ae47 Migrate StateDB to support per stream states (#13731)
* Update StateDB to support per Stream states.
* Add `StateType` type
* Add `stream_name`, `namespace` and `type` to `state` table.
* Set the default StateType to LEGACY
2022-06-14 14:27:38 -07:00
VitaliiMaltsev
f5a6a28211 🐛 Postgres Source: fixed truncated precision if the value of the milliseconds or seconds is 0 (#13549)
* Postgres Source: fixed truncated precision if the value of the millisecond or second is 0

* check CI with 1.15.3 testcontainer

* check CI with 1.15.3 testcontainer

* returned latest version of testcontainer

* fixed checkstyle

* fixed checkstyle

* returned latest testcontainer version

* updated CHANGELOG

* bump version

* auto-bump connector version

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-06-14 23:39:01 +03:00
Jimmy Ma
029085a56f Keep doc up-to-date with changes (#13667)
`airbyte-db/lib` has been renamed to `airbyte-db/db-lib`
2022-06-13 10:31:34 -07:00
Edward Gao
04d88f4760 fix build - reformat (#13697) 2022-06-10 14:56:29 -07:00
Serhii Chvaliuk
2daaf5b4c3 Normalization - BigQuery use json_extract_string_array for array of simple types (#13289)
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
Co-authored-by: andrii.leonets <aleonets@gmail.com>
Co-authored-by: Andrii Leonets <30464745+DoNotPanicUA@users.noreply.github.com>
2022-06-10 23:31:32 +03:00
Jonathan Pearlin
2b31011bce Separate platform and connector testcontainer versions (#13642)
* Separate platform and connector testcontainer versions

* Fix dependency

* Fix dependency

* Fix dependency usage

* Prevent leaking testcontainer dependencies
2022-06-10 09:34:31 -04:00
LiRen Tu
402ec62c43 Fix db-lib unit tests (#13582) 2022-06-07 15:25:56 -07:00
LiRen Tu
545a7a3eb6 🎉 JDBC source: adjust fetch size based on max memory and max row size (#13435)
* Switch to measure max row byte size

* Reduce fetch size change logs

* Update unit tests

* Determine jdbc buffer size based on max memory

* Bump postgres version

* Bump postgres version

* Bump mysql version

* Bump mssql version

* Format java code

* Increase hikari connection timeout

* Update data source default parameters

* auto-bump connector version

* Mark postgres 0.4.21 as not published

* Revert "Bump mysql version"

This reverts commit ad9135258c.

* Fix unit test

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-06-07 14:29:34 -07:00
Davin Chia
7788594e22 Start publishing proper artifacts. (#13484)
## What
Finale of https://github.com/airbytehq/airbyte/pull/13122.

We've renamed all directories in previous PRs. Here we remove the fat jar configuration and add publishing to all subprojects.

Explanation for what is happening:

Identically named subprojects have the following issues:
* publishing as is leads to classpath confusion when jars with the same names are placed in the Java distribution. This leads to NoClassDefFound errors at runtime.
* deconflicting the jar names without changing directory names leads to dependency errors, as the OSS jar pom files are generated using project dependencies (suggesting a dependency on a sibling subproject in the same repo) that use the subproject's group and name as a reference. This means the generated jars look for jars that do not exist (as their names have been changed) and cannot compile.
* the workaround for changing a subproject's name involves resetting the subproject's name in settings.gradle and depending on the new name in each build.gradle. This increases the configuration burden and hurts readability, since one has to check settings.gradle to know the right subproject name. See https://github.com/gradle/gradle/issues/847 for more info.
* given that Gradle itself doesn't have support for identically named subprojects (see the linked issue), the simplest solution is to not allow duplicated directories. I've only renamed conflicting directories here to keep things simple. I will create a follow-up issue to enforce non-identical subproject names in our builds.

## How
* Remove fat jar configuration.
* Add publishing to all subprojects.
2022-06-06 17:15:25 +08:00
Davin Chia
eb99f47746 Fat Jar: Rename Dir Part 2 (#13478)
## What
Part 2 of https://github.com/airbytehq/airbyte/pull/13122.

Follow up to #13476 .

Explanation for what is happening:

Identically named subprojects have the following issues:

* publishing as is leads to classpath confusion when jars with the same names are placed in the Java distribution. This leads to NoClassDefFound errors at runtime.
* deconflicting the jar names without changing directory names leads to dependency errors, as the OSS jar pom files are generated using project dependencies (suggesting a dependency on a sibling subproject in the same repo) that use the subproject's group and name as a reference. This means the generated jars look for jars that do not exist (as their names have been changed) and cannot compile.
* the workaround for changing a subproject's name involves resetting the subproject's name in settings.gradle and depending on the new name in each build.gradle. This increases the configuration burden and hurts readability, since one has to check settings.gradle to know the right subproject name. See https://github.com/gradle/gradle/issues/847 for more info.
* given that Gradle itself doesn't have support for identically named subprojects (see the linked issue), the simplest solution is to not allow duplicated directories. I've only renamed conflicting directories here to keep things simple. I will create a follow-up issue to enforce non-identical subproject names in our builds.

* Rename airbyte-config:models to airbyte-config:config-models.
* Rename airbyte-config:persistence to airbyte-config:config-persistence.
2022-06-06 02:21:54 +08:00
Davin Chia
83a89aa843 Fat Jar: Rename Dir Part 1 (#13476)
Part 1 of #13122.

Rename airbyte-db:lib to airbyte-db:db-lib.
Rename airbyte-metrics:lib to airbyte-metrics:metrics-lib
Rename airbyte-protocol:models to airbyte-protocol:protocol-models.

Explanation for what is happening:

Identically named subprojects have the following issues:
- publishing as is leads to classpath confusion when jars with the same names are placed in the Java distribution. This leads to NoClassDefFound errors at runtime.
- deconflicting the jar names without changing directory names leads to dependency errors, as the OSS jar pom files are generated using project dependencies (suggesting a dependency on a sibling subproject in the same repo) that use the subproject's group and name as a reference. This means the generated jars look for jars that do not exist (as their names have been changed) and cannot compile.
- the workaround for changing a subproject's name involves resetting the subproject's name in settings.gradle and depending on the new name in each build.gradle. This increases the configuration burden and hurts readability, since one has to check settings.gradle to know the right subproject name. See gradle/gradle#847 ("Projects with same name lead to unintended conflict resolution") for more info.
- given that Gradle itself doesn't have support for identically named subprojects (see the linked issue), the simplest solution is to not allow duplicated directories. I've only renamed conflicting directories here to keep things simple. I will create a follow-up issue to enforce non-identical subproject names in our builds.
2022-06-06 00:35:43 +08:00
Marcos Marx
adf4b6df25 run gradlew add headers, correct formatting (#13460) 2022-06-03 16:17:03 -03:00
Jonathan Pearlin
9fd4bba73c Cleaner migration wait logic (#13290) 2022-06-03 14:08:43 -04:00
Davin Chia
815b2e1d4d Update various platform images to be M1 compatible. (#13411)
Update socat image for ARM compatibility. Update curl image as best practice.
2022-06-02 17:36:46 +08:00
Anne
2714ee0af8 Create stream resets table (#13237)
* Add stream resets migration
2022-06-01 19:23:15 -07:00
VitaliiMaltsev
0a56b1d43f Postgres Source: add timezone awareness and handle BC dates (#13166)
* Postgres Source: add timezone awareness and handle BC dates

* fixed checkstyle

* add tests

* updated changelog

* removed star import

* fixed tests

* refactoring

* removed star import

* fixed bytea type

* created final static constants

* bump version

* auto-bump connector version

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-06-01 10:46:39 +03:00
Jonathan Pearlin
c955b01e2b Wait for migration before checking version (#13285) 2022-05-27 13:20:03 -04:00
Jonathan Pearlin
e06a9de60f Use Database Availability/Initialization Check (#13178)
* Use isolated database initialization logic

* Address PMD warnings

* Use test provider where possible

* Initialize database on bootloader load

* Combine availability and migration checks

* Ensure env vars are set

* Fix typo

* Avoid duplicate literals

* Add log message

* Use correct data source

* Revert change

* Update copyright

* Remove redundant exception catch/throw
2022-05-27 09:47:33 -04:00
Jonathan Pearlin
880c759cac Use "generated" in generated code package names (#13183) 2022-05-26 11:11:23 -04:00
Alexandre Girard
3894134d11 Bump year in license short to 2022 (#13191)
* Bump to 2022

* format
2022-05-25 17:56:49 -07:00
nahal99
8b9fa334fa PgLsn New Test Case for .toString() (#12899)
* PgLsn New Test Case for .toString()

* format

Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
2022-05-23 16:08:32 -07:00
Jonathan Pearlin
9d9968804d Refactor database initialization logic (#12961)
* Refactor database initialization logic

* Formatting

* Move database constants

* PR feedback

* PR feedback (use custom exceptions)

* PR feedback

* Formatting
2022-05-23 12:23:34 -04:00
Topher Lubaway
f106642cd2 Adds schedule_data json column (#13039)
* Adds schedule_data json column

updates the version test thingy
schema dumped

* adds schema dump

* formatting
2022-05-20 07:57:08 -05:00
Topher Lubaway
3aca043d01 Adds a new string column to configs for cron (#12416)
* Adds a new string column to configs for cron

closes #11418
i'm new to this task in Java please be brutal

* Adds airbyte header

* WIP

* Rebase a week of commits

* WIP for davin

* deps update

* Reorganize code for better readability. Also add a schema.

* Update tests.

* Correct bad test.

* Adds note for testing version change

* formatting change

Co-authored-by: Davin Chia <davinchia@gmail.com>
2022-05-18 09:03:32 -05:00
LiRen Tu
7c35b03595 🎉 Databricks destination: use new jdbc driver and open source the connector (#12861)
* Migrate to public databricks jdbc driver

* Update documentation

* Bump version

* Format code

* Check in databricks in seed file

* Check in databricks 0.2.0

* auto-bump connector version

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-05-16 14:48:52 -07:00
Jonathan Pearlin
fdaf335279 Better database connection handling for connectors (#12743)
* Better database connection handling for connectors

* Log connection error

* Properly close connection

* Remove unused method

* Close data source

* Use utility to close data source

* Use utility to close data source

* PR feedback

* Add Databricks driver

* Use driver class enum

* Use correct config

* Ensure config created before use

* Fix failing integration test

* Create DSLContext before use

* Address integration test failures

* Ensure DSLContext is closed

* Fix compile error

* Use correct datasource

* Use correct connection properties

* Close DSLContext

* Close DSLContext

* Fix integration test failures

* Properly close datasource

* Fix compilation issues

* Use existing database object

* Wrap close in try/finally

* Update test

* Wrap close in try/finally

* Ensure DSLContext is created

* Revert change to test

* Use correct data source

* Remove unused import

* More cleanup

* Add missing annotation

* Only initialize data source once

* Remove unused import

* Force testcontainers version

* Fix testcontainer issue

* Fix failing test

* Properly close all data sources

* Clear data sources after closing

* Fix compile error

* Fix compilation error

* Add missing method
2022-05-13 16:28:38 -04:00
Jonathan Pearlin
1a999b7191 Close underlying connections during migration (#12710) 2022-05-11 16:12:49 -04:00
Jonathan Pearlin
ebb9f3e1ac Prepare Database Access Layer for Dependency Injection (#12546)
* Prepare database access objects for dependency injection

* Replace duplicate code

* Remove unused imports

* Remove redundant validation call

* Remove unused imports

* Use constants

* Disable fast fail during connection pool initialization

* Remove typo

* Add missing test dependency

* Add missing test dependency

* Add missing test dependency

* Fix issue caused by rebase

* Add method for cloud

* Autoclose DSL context during migration

* Better connection close handling

* Fix typo in dependency

* Fix SpotBugs issue

* React to rebase

* Fix typo

* Update JavaDoc

* Fix database close calls

* Pass configs to getServer

* Fix typo

* Fix call to removed method

* Fix typo

* Use catalog to manage versions

* PR feedback

* Centralize shutdown hook

* Fix rebase issues

* Document test cases

* Document test cases

* Formatting

* Properly close database resources

* Rebase cleanup
2022-05-09 15:26:54 -04:00
Yurii Bidiuk
b3194b2200 🎉🐛: Source mongoDB: implement building JsonSchema with 'properties' for fields with type 'object' (#12428)
* mongodb: build JsonSchema with 'properties'

* add tests

* bump version

* auto-bump connector version

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-05-05 12:56:07 +03:00
LiRen Tu
e15ae56389 Close all unsafe queries (#12495)
* Add helper methods to return lists of json nodes

* Close all unsafe queries

* Add one more helper method

* Simplify helper names

* Format code
2022-05-03 13:45:02 -07:00
Jonathan Pearlin
de7035171d Add utility classes for database object creation (#12445)
* Add utility classes for database object creation

* Remove unused variable
2022-05-02 14:08:18 -04:00
LiRen Tu
35f2aa9aed 🎉 Jdbc sources: publish new version with adaptive fetch size (#12480)
* Default scaffold to use adaptive streaming config

* Switch more connectors to use adaptive streaming config

* Bump version for cockroach db

* Bump version for db2

* Bump mssql version

* Bump mysql version

* Bump oracle version

* Bump postgres version

* Bump redshift version

* Bump snowflake version

* Bump tidb version

* auto-bump connector version

* Fix db2 findbug issue

* auto-bump connector version

* auto-bump connector version

* auto-bump connector version

* auto-bump connector version

* Fix more findbug issues

* auto-bump connector version

* auto-bump connector version

* auto-bump connector version

* Fix findbug issue for mysql-strict-encrypt

* Fix findbugs issue for oracle source

* auto-bump connector version

* Remove suppress warnings annotation

* Fix oracle encrypt tests

* Fix oracle encrypt acceptance test

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-04-29 23:14:58 -07:00
LiRen Tu
55a0db7c67 🎉 JDBC source: adjust streaming query fetch size dynamically (#12400)
* Merge all streaming configs to one

* Implement new streaming query config

* Format code

* Fix comparison

* Use double for mean byte size

* Update fetch size only when changed

* Calculate mean size by sampling n rows

* Add javadoc

* Change min fetch size to 1

* Add comment by buffer size

* Update java connector template

* Perform division first

* Add unit test for fetching large rows

* Format code

* Fix connector compilation error
2022-04-28 22:36:17 -07:00
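
The bullets in the commit above (sample every n rows, keep the mean as a double, minimum fetch size of 1, update only when the value changes) can be illustrated with a minimal sketch. This is not the code from the commit: the class name, the sampling interval, and the upper cap are assumptions chosen for illustration only.

```java
/**
 * Hypothetical sketch of an adaptive fetch-size estimator.
 * All names and constants are illustrative assumptions, not Airbyte's classes.
 */
final class AdaptiveFetchSizeSketch {

  static final int MIN_FETCH_SIZE = 1;        // from the commit message
  static final int MAX_FETCH_SIZE = 100_000;  // assumed upper cap
  static final int SAMPLE_EVERY_N_ROWS = 10;  // assumed sampling interval

  private final long bufferBytes;   // memory budget for one fetch
  private double meanRowBytes = 0.0;
  private long seenRows = 0;
  private long sampledRows = 0;
  private int currentFetchSize;

  AdaptiveFetchSizeSketch(long bufferBytes, int initialFetchSize) {
    this.bufferBytes = bufferBytes;
    this.currentFetchSize = initialFetchSize;
  }

  /** Records one row's size; returns true only when the fetch size changed. */
  boolean accept(long rowBytes) {
    seenRows++;
    if (seenRows % SAMPLE_EVERY_N_ROWS != 0) {
      return false; // only every n-th row is sampled
    }
    sampledRows++;
    // Incremental mean, kept as a double to avoid integer truncation.
    meanRowBytes += (rowBytes - meanRowBytes) / sampledRows;
    int next = computeFetchSize(bufferBytes, meanRowBytes);
    if (next == currentFetchSize) {
      return false; // avoid redundant updates and log noise
    }
    currentFetchSize = next;
    return true;
  }

  /** Rows that fit in the buffer, clamped to [MIN_FETCH_SIZE, MAX_FETCH_SIZE]. */
  static int computeFetchSize(long bufferBytes, double meanRowBytes) {
    if (meanRowBytes <= 0.0) {
      return MAX_FETCH_SIZE;
    }
    double rows = bufferBytes / meanRowBytes;
    return (int) Math.max(MIN_FETCH_SIZE, Math.min(MAX_FETCH_SIZE, rows));
  }

  int fetchSize() {
    return currentFetchSize;
  }
}
```

In a real JDBC reader, a caller would presumably pass the new value to `java.sql.Statement#setFetchSize` or `java.sql.ResultSet#setFetchSize` whenever `accept` returns true. Whether and when a driver honors a changed fetch size is driver-specific.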
Subodh Kant Chaturvedi
367b863ed2 implement migration to create workspace_service_account table (#11943)
* implement migration to create workspace_service_account table

* make all columns non nullable

* introduce persistence code for service account table (#11944)

* implement persistence code for workspace_service_account table

* update yaml

* implement secret handling for workspace_service_account table (#11946)

* implement secret handling for workspace_service_account table

* add new line to the mock json

* get rid of file

* address review comments

* update method name and add comment
2022-04-26 19:49:50 +05:30
Serhii Chvaliuk
7023fbd48e Redshift SUPER type (#12064)
* 🎉 Destination Redshift: Use SUPER data type on Redshift destination for raw JSON data (#9407)

Co-authored-by: Oleksandr Tsukanov <alexander.tsukanovvv@gmail.com>
Co-authored-by: Sergey Chvalyuk <grubberr@gmail.com>
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
2022-04-20 15:11:22 +03:00
Edward Gao
c1381cde2c Revert Redshift SUPER PRs (#12041) 2022-04-14 12:36:26 -07:00
Alexander Tsukanov
674221c07a 🎉 Destination Redshift: Use SUPER data type on Redshift destination for raw JSON data (#9407)
* airbyte-5050: Added support of SUPER datatype for destination-redshift on the Java side

Co-authored-by: Oleksandr Tsukanov <alexander.tsukanovvv@gmail.com>
Co-authored-by: user <user@HRK1-LMC-A13537.local>
Co-authored-by: Sergey Chvalyuk <grubberr@gmail.com>
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
2022-04-12 16:33:26 +03:00
Parker Mossman
884a94ed29 Un-Revert OSS branch build for Cloud workflow (#11808)
* Revert "Revert "Build OSS branch for deploying to Cloud env (#11474)""

This reverts commit 55e3c1e051.

* add action to get dev branch tag to OSS project instead of doing it in cloud

* remove dev branch version action, going to do this in cloud after all
2022-04-08 15:17:04 -07:00
lmossman
55e3c1e051 Revert "Build OSS branch for deploying to Cloud env (#11474)"
This reverts commit 189efe7b42.
2022-04-05 15:44:31 -07:00
Parker Mossman
189efe7b42 Build OSS branch for deploying to Cloud env (#11474)
* add VERSION buildArg to Dockerfiles, default to current airbyte version but overwritable

* use VERSION env var consistently as Dockerfile buildArg, jar version, and tag

pass version and image_tag into docker build task function

* add github action for building and pushing an OSS branch for Cloud to consume

* allow AirbyteVersion to validate versions containing 'oss-branch' prefix

* change oss-branch prefix to dev for branch-based versions

* better action name

* add docker-compose-cloud.build.yaml to define minimum set of cloud images that are pushed by oss branch action

* update local dev docs to describe optional usage of VERSION env var

* make branch_version_tag input optional, if not provided, generates dev-<commit_hash>

* fix typo

* fix missed merge conflict

* update docker docs

* update integrationRunner isDev check
2022-04-05 15:06:17 -07:00
Malik Diarra
152af7cce6 Add new indices on the jobs table (#11590)
* Add new indices on the `jobs` table

* Format

* Fix build
2022-03-30 18:01:00 -07:00
Malik Diarra
b8b727f9a9 Migrate queries on OAuth table to use direct sql (#11370)
* Move buildDestinationOAuthParameter to DbConverter

* Migrate getDestinationOAuthParamByDefinitionIdOptional to use direct SQL

* Move buildSourceOAuthParameter to DbConverter

* Change getSourceOAuthParamByDefinitionIdOptional to use direct SQL

* Add tests

* Add new indices for oauth param table

* Fix index creation statement
2022-03-25 16:13:56 -07:00
Peter Hu
e0501774a5 Migrations for scoped connectors (#11305)
* migrations for supporting actor grants

* seed yaml definitions are always public, never custom

* add public and custom to actor definition models

* move methods to DbConverter

* add includeTombstones helper

to avoid having conditionals in every method with the tombstone boolean

* remove migration TODO comments

* assert custom is false when loading seeds

* format
2022-03-23 14:02:49 -07:00
LiRen Tu
e5ae3b3990 Rename jdbc db methods that require manual closure (#11300)
* Rename jdbc db methods to raise awareness of potential connection leak

* Format code
2022-03-21 15:00:10 -07:00
Malik Diarra
7d715f6ae6 Change getWorkspaceBySlugOptional to use direct sql statement (#11039)
* Move Workspace conversion function to DbConverter

* Migrate getWorkspaceBySlug to use direct SQL statement

* Move extraction of notification to converter function

* Add additional indices on workspace table

* Add tests for getWorkspaceBySlug
2022-03-18 08:37:40 -07:00
Harshith Mullapudi
fa8cd83e30 Harshith/connection updates (#11153)
* Feat: first cut to allow naming for connections

* fix

* fix: migration

* fix: migration

* fix: formatting

* fix: formatting

* fix: tests

* fix: -> is a bit outside of what we do generally

* fix: tests are failing

* fix: tests are failing

* fix: tests are failing

* fix: tests are failing

* fix: tests are failing
2022-03-16 03:05:18 +05:30
Harshith Mullapudi
0afc31baa3 Revert "Feat: first cut to allow naming for connections (#10889)" (#11152)
This reverts commit 6225eecf29.
2022-03-15 15:38:49 +05:30
Harshith Mullapudi
6225eecf29 Feat: first cut to allow naming for connections (#10889)
* Feat: first cut to allow naming for connections

* fix

* fix: migration

* fix: migration

* fix: formatting

* fix: formatting

* fix: tests

* fix: -> is a bit outside of what we do generally
2022-03-15 14:05:31 +05:30
Charles
65572f68ad add helper method for creating postgres db (#6244) 2022-03-13 14:54:44 -07:00
Parker Mossman
74007e2749 Remove deprecated FailureReason enum values (#10773)
* Remove deprecated/unused enum values from json schema, migration to update records to use corrected values

* make migration-specific classes handle any string, and remove extraneous comments/annotations for readability. also test that an unrecognized enum value is left alone and doesn't cause deserialization errors

* gradle format

* fix test

* fix jobTrackerTest with new enum values
2022-03-11 12:50:16 -08:00