* sweep all scheduler application code and new-scheduler conditional logic
* remove airbyte-scheduler from deployments and docs
* format
* remove 'v2' from github actions
* add back scheduler in delete deployment command
* remove scheduler parameters from helm chart values
* add back job cleaner + test and add comment
* remove now-unused env vars from code and docs
* format
* remove feature flags from web backend connection handler as it is no longer needed
* remove feature flags from config api as it is no longer needed
* remove feature flags input from config api test
* format + shorter url
* remove scheduler parameters from helm chart readme
## What
Part 3 of https://github.com/airbytehq/airbyte/pull/13122.
Follow up to #13478.
Explanation for what is happening:
Identically named subprojects have the following issues:
* publishing as is leads to classpath confusion when jars with the same name are placed in the Java distribution. This leads to NoClassDefFound errors at runtime.
* deconflicting the jar names without changing directory names leads to dependency errors, because the OSS jar pom files are generated from project dependencies (a dependency on a sibling subproject in the same repo) that use the subproject's group and name as a reference. The generated jars therefore look for jars that do not exist (their names have been changed) and cannot compile.
* the workaround for changing a subproject's name involves resetting the subproject's name in settings.gradle and depending on the new name in each build.gradle. This increases the configuration burden and hurts readability, since one has to check settings.gradle to know the right subproject name. See https://github.com/gradle/gradle/issues/847 for more info.
* given that Gradle itself doesn't support identically named subprojects (see the linked issue), the simplest solution is to not allow duplicated directory names. I've only renamed conflicting directories here to keep things simple. I will create a follow-up issue to enforce non-identical subproject names in our builds.
## How
Rename airbyte-scheduler:models to airbyte-scheduler:scheduler-models.
Rename airbyte-scheduler:persistence to airbyte-scheduler:scheduler-persistence.
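The follow-up enforcement mentioned above could start from a check like the one below. This is only a minimal sketch, not code from the Airbyte build: the class `DuplicateSubprojectNames` and its methods are hypothetical, and the sample input simply reuses project paths named in this series of PRs. It groups Gradle project paths by their last segment and reports any name that is used more than once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of the Airbyte build.
final class DuplicateSubprojectNames {

  // Returns the last segment of a Gradle project path, e.g. ":airbyte-db:lib" -> "lib".
  static String leafName(String projectPath) {
    return projectPath.substring(projectPath.lastIndexOf(':') + 1);
  }

  // Groups project paths by leaf name and keeps only the names used more than once.
  static Map<String, List<String>> findDuplicates(List<String> projectPaths) {
    Map<String, List<String>> byLeaf = new TreeMap<>();
    for (String path : projectPaths) {
      byLeaf.computeIfAbsent(leafName(path), k -> new ArrayList<>()).add(path);
    }
    byLeaf.values().removeIf(paths -> paths.size() < 2);
    return byLeaf;
  }

  public static void main(String[] args) {
    // Sample input only: pre-rename paths mentioned in this series of PRs.
    List<String> paths = List.of(
        ":airbyte-config:models",
        ":airbyte-scheduler:models",
        ":airbyte-config:persistence",
        ":airbyte-scheduler:persistence");
    System.out.println(findDuplicates(paths));
  }
}
```

With prefixed names such as `scheduler-models` and `config-models`, the same check returns an empty map, which is the condition a build-time guard would assert.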
## What
Part 2 of https://github.com/airbytehq/airbyte/pull/13122.
Follow up to #13476.
Explanation for what is happening:
Identically named subprojects have the following issues:
* publishing as is leads to classpath confusion when jars with the same name are placed in the Java distribution. This leads to NoClassDefFound errors at runtime.
* deconflicting the jar names without changing directory names leads to dependency errors, because the OSS jar pom files are generated from project dependencies (a dependency on a sibling subproject in the same repo) that use the subproject's group and name as a reference. The generated jars therefore look for jars that do not exist (their names have been changed) and cannot compile.
* the workaround for changing a subproject's name involves resetting the subproject's name in settings.gradle and depending on the new name in each build.gradle. This increases the configuration burden and hurts readability, since one has to check settings.gradle to know the right subproject name. See https://github.com/gradle/gradle/issues/847 for more info.
* given that Gradle itself doesn't support identically named subprojects (see the linked issue), the simplest solution is to not allow duplicated directory names. I've only renamed conflicting directories here to keep things simple. I will create a follow-up issue to enforce non-identical subproject names in our builds.
* Rename airbyte-config:models to airbyte-config:config-models.
* Rename airbyte-config:persistence to airbyte-config:config-persistence.
Part 1 of #13122.
Rename airbyte-db:lib to airbyte-db:db-lib.
Rename airbyte-metrics:lib to airbyte-metrics:metrics-lib.
Rename airbyte-protocol:models to airbyte-protocol:protocol-models.
Explanation for what is happening:
Identically named subprojects have the following issues:
- publishing as is leads to classpath confusion when jars with the same name are placed in the Java distribution. This leads to NoClassDefFound errors at runtime.
- deconflicting the jar names without changing directory names leads to dependency errors, because the OSS jar pom files are generated from project dependencies (a dependency on a sibling subproject in the same repo) that use the subproject's group and name as a reference. The generated jars therefore look for jars that do not exist (their names have been changed) and cannot compile.
- the workaround for changing a subproject's name involves resetting the subproject's name in settings.gradle and depending on the new name in each build.gradle. This increases the configuration burden and hurts readability, since one has to check settings.gradle to know the right subproject name. See gradle/gradle#847 ("Projects with same name lead to unintended conflict resolution") for more info.
- given that Gradle itself doesn't support identically named subprojects (see the linked issue), the simplest solution is to not allow duplicated directory names. I've only renamed conflicting directories here to keep things simple. I will create a follow-up issue to enforce non-identical subproject names in our builds.
This PR adds the 'version catalog' plugin, where the catalog is defined using the dependency list declared in 'build.gradle'.
In this PR, all dependencies were moved to 'libs' using 'dependencyResolutionManagement -> versionCatalogs -> libs'.
* Switch json-avro-converter to jitpack
* Switch jsongenerator to jitpack
* Fix default config
* Fix one more default config use case
* Fix jitpack dependency
* Move jitpack repo to root build.gradle
Part 1 of https://docs.google.com/document/d/11pEUsHyKUhh4CtV3aReau3SUG-ncEvy6ROJRVln6YB4/edit?usp=sharing.
This is the initial setup of the reporter. Actual metrics will be added in the follow-up PR.
Since the reporter is primarily for Cloud use and limited to DD to begin with, we are keeping it outside the regular Airbyte docker and Kube deploys for now.
Add ReadMes to better define various modules.
Part 1 of #8303.
Essentially, create a separate application that will run all set up operations for Airbyte, including database migrations and metadata setup.
Part 2 will remove these operations from the server and modify the startup conditions to look for migration versions (#8302). I'll then release a minor version.
This sequencing felt the most natural to me, since it allows for the new code to exist in master without any breaking changes. It also preserves the ServerApp class for comparison.
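The startup condition described for Part 2 (the server waiting until the setup application has migrated the database far enough) might look roughly like the sketch below. Everything here is an assumption made for illustration: `MigrationVersionGate`, its method names, the retry parameters, and the dot-separated numeric version format are hypothetical and are not taken from the Airbyte code.

```java
import java.util.function.Supplier;

// Hypothetical sketch of a startup gate; not the actual Airbyte implementation.
final class MigrationVersionGate {

  // Compares dot-separated numeric versions (assumed format, e.g. "0.35.1.001").
  // Missing trailing segments are treated as 0.
  static int compareVersions(String a, String b) {
    String[] as = a.split("\\.");
    String[] bs = b.split("\\.");
    int n = Math.max(as.length, bs.length);
    for (int i = 0; i < n; i++) {
      int x = i < as.length ? Integer.parseInt(as[i]) : 0;
      int y = i < bs.length ? Integer.parseInt(bs[i]) : 0;
      if (x != y) {
        return Integer.compare(x, y);
      }
    }
    return 0;
  }

  // Polls the current migration version until it reaches the minimum, or gives up.
  // The supplier may return null while no migration has been recorded yet.
  static boolean waitForMigration(Supplier<String> currentVersion,
                                  String minimumVersion,
                                  int maxAttempts,
                                  long sleepMillis) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      String current = currentVersion.get();
      if (current != null && compareVersions(current, minimumVersion) >= 0) {
        return true;
      }
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // A fixed supplier stands in for a real query against the migration history table.
    boolean ready = waitForMigration(() -> "0.35.1.001", "0.35.0.000", 5, 100L);
    System.out.println("database ready: " + ready);
  }
}
```

Keeping the version lookup behind a `Supplier` keeps the gate independent of how the version is read, so the same loop can be exercised in tests without a database.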
This is a custom auto-setup script for the temporal environment. Unfortunately there is no other way to properly update the DB without copy-pasting parts of the temporal auto-setup script. Ideally temporal would provide a dedicated container for its DB, but that is not the case right now.
* add specs module with logic to fetch specs on build
* format + build and add gradle dependency for new script
* check seed file for existing specs + refactor
* add tests + a bit more refactoring
* run gw format
* update yaml config persistence to merge specs into definitions
* add comment
* add dep
* add tests for GcsBucketSpecFetcher
* get rid of static block + format
* DRY up parse call
* add GCS details to comment
* formatting + fix test
* update comment
* do not format seed specs files
* change signature of run to allow cloud to reuse this script
* run gw format
* revert commits that change signature of run
* fix comment typo
Co-authored-by: Davin Chia <davinchia@gmail.com>
* rename enum to be distinct from the enum in cloud
* add missing dependencies between modules
* add readme for seed connector spec generator
* reword
* reference readme in comment
* ignore 'spec' field in newFields logic
* rearrange dependencies so that CONNECTORS_BASE build does not depend on SeedConnectorSpecGenerator
* run format
* add some more helpful info to the GCS fetch failure message
* add more info
* get rid of unnecessary static block
* Fix publishing docs (#7589)
* Fix publishing docs
* Reorder steps and add a comment about rebuilding the platform
* Update README.md
Co-authored-by: Lake Mossman <lake@airbyte.io>
* add dependency and rebuild
* update PR template with seed connector generation steps
* revert formatting changes to PR template
* Update build.gradle
* Remove unnecessary dep
Co-authored-by: Davin Chia <davinchia@gmail.com>
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
* add specs module with logic to fetch specs on build
* format + build and add gradle dependency for new script
* check seed file for existing specs + refactor
* add tests + a bit more refactoring
* run gw format
* update yaml config persistence to merge specs into definitions
* add comment
* delete secrets migration to be consistent with master
* add dep
* add tests for GcsBucketSpecFetcher
* get rid of static block + format
* DRY up parse call
* add GCS details to comment
* formatting + fix test
* update comment
* do not format seed specs files
* change signature of run to allow cloud to reuse this script
* run gw format
* revert commits that change signature of run
* fix comment typo
Co-authored-by: Davin Chia <davinchia@gmail.com>
* rename enum to be distinct from the enum in cloud
* add missing dependencies between modules
* add readme for seed connector spec generator
* reword
* reference readme in comment
* ignore 'spec' field in newFields logic
Co-authored-by: Davin Chia <davinchia@gmail.com>
Wrapper around Prometheus lib to interface with Datadog.
We use prometheus because:
- Future-proofing, as it uses the general open metrics format.
- Prometheus makes its metrics available to a scraper, so it lends itself better to the OSS setup.
- Datadog automatically converts Prometheus metrics into DD metrics, so we don't lose much.
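To make the "available to a scraper" point concrete, the sketch below hand-renders one counter in the Prometheus text exposition format, which is what a scraper reads from a metrics endpoint. It deliberately does not use the Prometheus client library or the wrapper added in this PR; `CounterExposition`, the metric name, and the label are hypothetical and exist only for illustration.

```java
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical illustration of the text exposition format; not the wrapper from this PR.
final class CounterExposition {

  // Escapes backslash, double quote and newline inside a label value.
  static String escape(String value) {
    return value.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
  }

  // Renders a single counter sample with HELP and TYPE lines.
  // Labels are sorted by key so the output is deterministic.
  static String render(String name, String help, Map<String, String> labels, double value) {
    StringBuilder sb = new StringBuilder();
    sb.append("# HELP ").append(name).append(' ').append(help).append('\n');
    sb.append("# TYPE ").append(name).append(" counter\n");
    sb.append(name);
    if (!labels.isEmpty()) {
      StringJoiner joiner = new StringJoiner(",", "{", "}");
      for (Map.Entry<String, String> e : new TreeMap<>(labels).entrySet()) {
        joiner.add(e.getKey() + "=\"" + escape(e.getValue()) + "\"");
      }
      sb.append(joiner);
    }
    sb.append(' ').append(value).append('\n');
    return sb.toString();
  }

  public static void main(String[] args) {
    // Metric name and label are made up for this example.
    System.out.print(render("jobs_created_total", "Jobs created.", Map.of("app", "scheduler"), 3.0));
  }
}
```

Because the output is plain text pulled by whoever scrapes it, the same endpoint can serve an OSS Prometheus setup and a Datadog agent without application changes.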
* Created skeleton of a migration utility module.
* Secrets migration hello world functional
* Added dependencies for secrets migration
* Create Secrets store migration utility and related tests.
* Make secrets migration work in kube
* Make pod for secrets migration give right health result to kube
* docker-compose split of scheduler and worker
* fix heartbeat location bug + add support for kubernetes
* use two workers in integration tests
* capture logs in AirbyteTestContainer
* add waiting
* rename to make it easier to review
* rename module
* fix remaining conflicts
* allow configuring max workers of each type and document usage
* fix build
* remove comment
* add worker resource requirements
* try to fix for connector build
* fix regression in build
* add env comments for SUBMITTER_NUM_THREADS
* Update airbyte-workers/src/main/java/io/airbyte/workers/WorkerApp.java
Co-authored-by: Davin Chia <davinchia@gmail.com>
* Update airbyte-workers/src/main/java/io/airbyte/workers/temporal/TemporalPool.java
Co-authored-by: Davin Chia <davinchia@gmail.com>
* merge temporalpool into workerapp
* output docker system info
* move check to before
* remove unnecessary parts of the patch
* could this be the problem? i thought i added this
* show disk usage
* add print statements
* add pruning
* fix prune option
* use force
Co-authored-by: Davin Chia <davinchia@gmail.com>
* oracle normalization
* correct dbt_project function for oracle
* unit tests
* run format
* correct ephemeral tests
* add gradle dependency for oracle destination
* run int tests
* add oracle in settings.gradle for normalization run
* use default airbyte columns
* format
* test all destination ephemeral
* correct unit test
* correct unit test
* destination docs update
* correct mypy
* integration test all dest
* refactor oracle function
* merge master
* run all destinations
* flake8 escape regex
* surrogate key function
* correct few minor comments
* refactor scd sql function
* refactor scd function
* revert test
* refactor minor details
* revert tests
* revert ephemeral test
* revert unit test table_registry
* revert airbyte_protocol format
* format
* bump normalization version in worker
* minor changes
* minor changes
* correct json_column for other destinations
* gradlew format
* revert tests
* remove comments
* add Oracle destination explicit in safe_cast_str
* add quote_in_parenthesis inside if clause
* gradlew format
# Summary
- A follow-up PR for #5543.
- This PR separates the `airbyte-db` project into two modules:
  - `lib` is the original `airbyte-db`.
  - `jooq` is for jOOQ code generation.
- This is necessary because the jOOQ generator requires a custom database implementation that can run Flyway migration. So the code generator logic needs to depend on the compilation of the original `airbyte-db` project.
# Commits
* Separate db to lib and jooq modules
* Update dependencies
* Add jobs db migrator test
* Fix compose build
* Add migration dev center
* Add schema dump task
* Update airbyte-db/lib/README.md
Co-authored-by: Davin Chia <davinchia@gmail.com>
* Update readme
* Remove bom dependency
* Update readme
* Use jooq code in db config persistence
* Remove AirbyteConfigsTable
Co-authored-by: Davin Chia <davinchia@gmail.com>
* wip
* add file
* final structure
* few more updates
* undo unwanted changes
* add abstract test + more refinement
* remove CDC metadata to debezium
* rename class + add missing property
* move debezium to bases + upgrade debezium version + review comments
* downgrade version + minor fixes
* reset to minutes
* fix build
* address review comments
* should return Optional
* use common abstraction for CDC via debezium for mysql (#4604)
* use new cdc abstraction for mysql
* undo unwanted change
* pull in latest changes
* use renamed class + move constants to MySqlSource
* bring in latest changes from cdc abstraction
* format
* bring in latest changes
* pull in latest changes
* use common abstraction for CDC via debezium for postgres (#4607)
* use cdc abstraction for postgres
* add files
* ready
* use renamed class + move constants to PostgresSource
* bring in the latest changes
* bring in latest changes
* pull in latest changes