mirror of synced 2026-01-06 15:03:36 -05:00
144 Commits

Author SHA1 Message Date
Parker Mossman
9403c28b50 Parker/temporal cloud (#13243)
* switch to temporal cloud client for now

* format

* use client cert/key env secret instead of path to secret

* add TODO comments

* format

* add logging to debug timeout issue

* add more logging

* change workflow task timeout

* PR feedback: consolidate as much as possible, add missing javadoc

* fix acceptance test, needs to specify localhost

* add internal-use only comments

* format

* refactor to clean up TemporalClient and prepare it for future dependency injection framework

* remove extraneous log statements

* PR feedback

* fix test

* return isInitialized true in test
2022-06-21 11:37:27 -07:00
Jonathan Pearlin
2b31011bce Separate platform and connector testcontainer versions (#13642)
* Separate platform and connector testcontainer versions

* Fix dependency

* Fix dependency

* Fix dependency usage

* Prevent leaking testcontainer dependencies
2022-06-10 09:34:31 -04:00
Davin Chia
50778521d9 Split acceptance tests into Basic and Advanced Acceptance test classes for readability. (#13508)
Refactor the acceptance tests for readability & speed by splitting acceptance tests into a basic and advanced test class.

- Basic test class: Contains all tests around functionality that stays constant regardless of deployment. e.g. api changes. Only run for Docker.
- Advanced test class: Contains all tests around functionality that changes due to deployment. e.g. how we handle processes between Docker and Kubernetes. Runs for both Docker and Kubernetes.

The benefits are:
- Breaks up the huge monolith tests we have today for better readability. Preps us to run these tests on Cloud.
- Clarifies what tests run on what deployment.
- Gradle parallelises at the level of the test class, so we get some speed up. Anecdotally, this is faster by ~3 mins over the old Kubernetes acceptance tests.

There is some test fixture duplication, but I figured we can tackle that in a follow up PR.
2022-06-08 16:15:58 +08:00
Davin Chia
40cb78e5c7 Fix the acceptance tests. (#13501)
We've seen errors like:
- https://github.com/airbytehq/airbyte/runs/6758654948?check_suite_focus=true#step:11:52833 - from trying to list bootloader logs. This is no longer possible since we remove the bootloader pod.
- https://github.com/airbytehq/airbyte/runs/6746572522?check_suite_focus=true#step:11:52164 - errors while tearing down the test.

I will do a follow up PR to refactor tests to hopefully speed things up:
- split Kube from normal tests.
- explore not recreating the db each time.

In the meantime, this should stabilise the tests and get us back to a green build.
2022-06-07 02:38:19 +08:00
Lake Mossman
73034c64da Sweep old scheduler code (#13400)
* sweep all scheduler application code and new-scheduler conditional logic

* remove airbyte-scheduler from deployments and docs

* format

* remove 'v2' from github actions

* add back scheduler in delete deployment command

* remove scheduler parameters from helm chart values

* add back job cleaner + test and add comment

* remove now-unused env vars from code and docs

* format

* remove feature flags from web backend connection handler as it is no longer needed

* remove feature flags from config api as it is no longer needed

* remove feature flags input from config api test

* format + shorter url

* remove scheduler parameters from helm chart readme
2022-06-06 10:49:17 -07:00
Davin Chia
eb99f47746 Fat Jar: Rename Dir Part 2 (#13478)
## What
Part 2 of https://github.com/airbytehq/airbyte/pull/13122.

Follow up to #13476.

Explanation for what is happening:

Identically named subprojects have the following issues:

* publishing as is leads to classpath confusion when jars with the same names are placed in the Java distribution. This leads to NoClassDefFound errors at runtime.
* deconflicting the jar names without changing directory names leads to dependency errors, as the OSS jar pom files are generated using project dependencies (suggesting a dependency on a sibling subproject in the same repo) that use the subproject's group and name as a reference. This means the generated jars look for jars that do not exist (as their names have been changed) and cannot compile.
* the workaround to changing a subproject's name involves resetting the subproject's name in the settings.gradle and depending on the new name in each build.gradle. This increases configuration burden and decreases the ease of reading, since one will have to check the settings.gradle to know what the right subproject name is. See https://github.com/gradle/gradle/issues/847 for more info.
* given that Gradle itself doesn't have support for identically named subprojects (see the linked issue), the simplest solution is to not allow duplicated directories. I've only renamed conflicting directories here to keep things simple. I will create a follow-up issue to enforce non-identical subproject names in our builds.

* Rename airbyte-config:models to airbyte-config:config-models.
* Rename airbyte-config:persistence to airbyte-config:config-persistence.
2022-06-06 02:21:54 +08:00
Davin Chia
83a89aa843 Fat Jar: Rename Dir Part 1 (#13476)
Part 1 of #13122.

Rename airbyte-db:lib to airbyte-db:db-lib.
Rename airbyte-metrics:lib to airbyte-metrics:metrics-lib
Rename airbyte-protocol:models to airbyte-protocol:protocol-models.

Explanation for what is happening:

Identically named subprojects have the following issues:
- publishing as is leads to classpath confusion when jars with the same names are placed in the Java distribution. This leads to NoClassDefFound errors at runtime.
- deconflicting the jar names without changing directory names leads to dependency errors, as the OSS jar pom files are generated using project dependencies (suggesting a dependency on a sibling subproject in the same repo) that use the subproject's group and name as a reference. This means the generated jars look for jars that do not exist (as their names have been changed) and cannot compile.
- the workaround to changing a subproject's name involves resetting the subproject's name in the settings.gradle and depending on the new name in each build.gradle. This increases configuration burden and decreases the ease of reading, since one will have to check the settings.gradle to know what the right subproject name is. See gradle/gradle#847 ("Projects with same name lead to unintended conflict resolution") for more info.
- given that Gradle itself doesn't have support for identically named subprojects (see the linked issue), the simplest solution is to not allow duplicated directories. I've only renamed conflicting directories here to keep things simple. I will create a follow-up issue to enforce non-identical subproject names in our builds.
2022-06-06 00:35:43 +08:00
Jonathan Pearlin
880c759cac Use "generated" in generated code package names (#13183) 2022-05-26 11:11:23 -04:00
Alexandre Girard
3894134d11 Bump year in license short to 2022 (#13191)
* Bump to 2022

* format
2022-05-25 17:56:49 -07:00
Lake Mossman
26ed3856e1 Migrate OSS to temporal scheduler (#12757)
* Migrate OSS to temporal scheduler

* add comment about migration being performed in server

* add comments about removing migration logic

* formatting and add tests for migration logic

* rm duplicated test

* remove more duplicated build task

* remove retry

* disable acceptance tests that call temporal directly when on kube

* set NEW_SCHEDULER and CONTAINER_ORCHESTRATOR_ENABLED env vars to true to be consistent

* set default value of container orchestrator enabled to true

* Revert "set default value of container orchestrator enabled to true"

This reverts commit 21b36703a9.

* Revert "set NEW_SCHEDULER and CONTAINER_ORCHESTRATOR_ENABLED env vars to true to be consistent"

This reverts commit 6dd2ec04a2.

* Revert "Revert "set NEW_SCHEDULER and CONTAINER_ORCHESTRATOR_ENABLED env vars to true to be consistent""

This reverts commit 2f40f9da50.

* Revert "Revert "set default value of container orchestrator enabled to true""

This reverts commit 26068d5b31.

* fix sync workflow test

* remove defunct cancellation tests due to internal temporal error

* format - remove unused imports

* revert changes that set container orchestrator enabled to true everywhere

* remove NEW_SCHEDULER feature flag from .env files, and set CONTAINER_ORCHESTRATOR_ENABLED flag to true for kube .env files

Co-authored-by: Benoit Moriceau <benoit@airbyte.io>
2022-05-18 17:05:42 -07:00
Lake Mossman
e8084c0189 Repair temporal state when performing manual actions (#12289)
* Repair temporal state when performing manual actions

* refactor temporal client and fix tests

* add unreachable workflow exception

* format

* test repeated deletion

* add acceptance tests for automatic workflow repair

* rename and DRY up manual operation methods in SchedulerHandler

* refactor temporal client to batch signal and start requests together in repair case

* add comment

* remove main method

* fix job id fetching

* only overwrite workflowState if reset flags are true on input

* fix test

* fix cancel endpoint

* Clean job state before creating new jobs in connection manager workflow (#12589)

* first working iteration of cleaning job state on first workflow run

* second iteration, with tests

* undo local testing changes

* move method

* add comment explaining placement of clean job state logic

* change connection_workflow failure origin value to platform

* remove cast from new query

* create static var for non terminal job statuses

* change failure origin value to airbyte_platform

* tweak external message wording

* remove unused variable

* reword external message

* fix merge conflict

* remove log lines

* move cleaning job state to beginning of workflow

* do not clean job state if there is already a job id for this workflow, and add test

* see if sleeping fixes test on CI

* add repeated test annotation to protect from flakiness

* fail jobs before creating new ones to protect from quarantined state

* update external message for cleaning job state error
2022-05-12 17:43:19 -07:00
Evan Tahler
5ff96ab946 Use host.docker.internal for acceptance testing on macs (#12791)
* Use `host.docker.internal` for acceptance testing

* Lint

* conditional hostname for mac only
2022-05-11 16:22:16 -07:00
Jonathan Pearlin
ebb9f3e1ac Prepare Database Access Layer for Dependency Injection (#12546)
* Prepare database access objects for dependency injection

* Replace duplicate code

* Remove unused imports

* Remove redundant validation call

* Remove unused imports

* Use constants

* Disable fast fail during connection pool initialization

* Remove typo

* Add missing test dependency

* Add missing test dependency

* Add missing test dependency

* Fix issue caused by rebase

* Add method for cloud

* Autoclose DSL context during migration

* Better connection close handling

* Fix typo in dependency

* Fix SpotBugs issue

* React to rebase

* Fix typo

* Update JavaDoc

* Fix database close calls

* Pass configs to getServer

* Fix typo

* Fix call to removed method

* Fix typo

* Use catalog to manage versions

* PR feedback

* Centralize shutdown hook

* Fix rebase issues

* Document test cases

* Document test cases

* Formatting

* Properly close database resources

* Rebase cleanup
2022-05-09 15:26:54 -04:00
Augustin
07068b0fe0 🐙 octavia-cli: use model serialization (#12133) 2022-05-06 18:55:22 +02:00
Subodh Kant Chaturvedi
405bf4daad workspaceId should be part of spec request (#12112)
* workspaceId should be part of spec request

* address review comment

* fix test

* format

* update octavia according to API changes

* create integration test for definition generation

* fix test

* fix test

Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
2022-04-22 19:30:06 +05:30
Edward Gao
f32c5fa8ca 🐛 Connector exit code should still be detected if resourceVersion is updated (#11861) 2022-04-20 16:38:25 -07:00
Peter Hu
37a510c1ce Configure test retries for JUnit Acceptance Tests (#11818)
* acceptance tests with retries

* kube acceptance tests with retries

* javadoc to say we prefer not retrying

* no retries for tests that don't run on k8s
2022-04-14 09:45:59 -07:00
Augustin
06c902c357 Format AcceptanceTests.java (#11863) 2022-04-09 23:26:27 +08:00
Parker Mossman
84436b01a0 Unexpected Temporal State: Start a new workflow if no reachable workflow exists during update (#11771)
* start new workflow if not running during update

* add unit tests for update method

* remove comment

* just test update when temporal workflow does not exist

* put test inside usesNewScheduler conditional

* remove unused import

* add comment explaining acceptance test
2022-04-08 15:15:44 -07:00
terencecho
568f11242d Add acceptance test for deleting connection (#11563)
* Add acceptance test for deleting connection in bad temporal state

* disable new test on kube

* try using different temporalHost

* add log info line to give time for temporal client to spin up

* try avoiding missing row in temporal db

* check if temporal workflow is reachable

* check if temporal workflow is reachable

* check if temporal workflow is reachable

* try waiting for connection state

* try using airbyte-temporal hostname

* Revert "try using airbyte-temporal hostname"

This reverts commit 0e53a27622.

* Revert back to using localhost

* Add 5 second wait

* only enable test for new scheduler

* only enable test for new scheduler 2

* refactor test to cover normal and unexpected temporal state
2022-04-04 10:51:03 -07:00
Jared Rhizor
493f0ea9f6 use Kubernetes watch api for retrieving exit codes (#11083)
* use kubernetes api for retrieving exit codes

* undelete test

* clean up more status check interval

* fmt

* wip

* clean up

* smarter filtering

* reordering

* exception handling

* better logging for test + speed up acceptance tests temp

* re-enable running on branch

* fix race condition in test

* add log

* trigger build

* trigger build

* re-run tests with everything enabled

* run tests

* run tests

* clean up

* respond to comments

* fix formatting

* fix whitespace

* remove comment

* 10 -> 5

* log exit code error message
2022-03-31 04:01:55 -07:00
Parker Mossman
672b347aca Avoid double-calling the deleteConnection activity (#11246)
* avoid double-calling the deleteConnection activity

* delete connections before sources in test teardown to avoid double-delete
2022-03-18 11:09:37 -07:00
Malik Diarra
3d9f9ec5a8 Cache schema during discoverSchema (#10820)
* Make SchedulerHandler store schema after fetching it

* Add `disable_cache` parameter to discover_schema API

* Return cached catalog if it already exists

* Address code review comments

* Add tests for caching of catalog in SchedulerHandler

* Format fixes

* Fix Acceptance tests

* New code review fixes

- Use upper case for global variable
- Inline definition and assignment of variable
2022-03-17 06:40:58 -07:00
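
The caching behaviour described in this commit (store the catalog after fetching it, return the cached catalog if one exists, bypass it with `disable_cache`) can be sketched roughly as follows. This is a minimal illustration, not the actual `SchedulerHandler` code: the class name, the in-memory map, and the string-typed catalog are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the handler that serves discover_schema requests.
final class DiscoverSchemaCacheSketch {

  // Catalogs keyed by source id (assumption: a real implementation would persist these).
  private final Map<String, String> cachedCatalogs = new HashMap<>();

  // Runs the actual, slow discover job for a source id.
  private final Function<String, String> discoverJob;

  DiscoverSchemaCacheSketch(Function<String, String> discoverJob) {
    this.discoverJob = discoverJob;
  }

  // Returns the cached catalog unless disableCache is set; always stores a fresh result.
  String discoverSchema(String sourceId, boolean disableCache) {
    if (!disableCache) {
      String cached = cachedCatalogs.get(sourceId);
      if (cached != null) {
        return cached;
      }
    }
    String catalog = discoverJob.apply(sourceId);
    cachedCatalogs.put(sourceId, catalog);
    return catalog;
  }
}
```

In this sketch a call with `disableCache = true` still refreshes the stored catalog, so later cached reads see the newest result; whether the real handler does the same is not stated in the commit message.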
Harshith Mullapudi
fa8cd83e30 Harshith/connection updates (#11153)
* Feat: first cut to allow naming for connections

* fix

* fix: migration

* fix: migration

* fix: formatting

* fix: formatting

* fix: tests

* fix: -> is bit outside of what we do generally

* fix: tests are failing

* fix: tests are failing

* fix: tests are failing

* fix: tests are failing

* fix: tests are failing
2022-03-16 03:05:18 +05:30
Charles
c1c8675366 Add readmes to all modules (#8893) 2022-03-13 14:45:36 -07:00
Charles
5fde59fdbd add spotbugs (#10522) 2022-03-11 12:05:17 -08:00
Marcos Marx
44e8f6fcdf Auto-upgrade connectors when they are in use only a patch version update (#10515)
* auto-upgrade connectors that are in use with patch version only

* update check version docstring

* remove try/catch from hasNewPatchVersion

* refactor write std defs function

* run format

* add unit test and change exception

* update airbyte version function name to be more clear

* correct unit test in migration tests

* run format
2022-03-09 18:43:48 -03:00
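
The commit mentions a `hasNewPatchVersion` check that decides whether an in-use connector may be auto-upgraded. One plausible reading, sketched here under the assumption that versions are plain `MAJOR.MINOR.PATCH` strings, is "same major and minor, strictly higher patch". The class name and parsing rules are illustrative only and may differ from the real implementation.

```java
// Hypothetical sketch of a patch-only upgrade check; not the Airbyte implementation.
final class PatchVersionSketch {

  // Splits "MAJOR.MINOR.PATCH" into three integers (assumption: no pre-release suffixes).
  private static int[] parse(String version) {
    String[] parts = version.split("\\.");
    if (parts.length != 3) {
      throw new IllegalArgumentException("expected MAJOR.MINOR.PATCH, got: " + version);
    }
    return new int[] {
      Integer.parseInt(parts[0]), Integer.parseInt(parts[1]), Integer.parseInt(parts[2])
    };
  }

  // True only when candidate differs from current by a higher patch number.
  static boolean hasNewPatchVersion(String current, String candidate) {
    int[] cur = parse(current);
    int[] cand = parse(candidate);
    return cur[0] == cand[0] && cur[1] == cand[1] && cand[2] > cur[2];
  }
}
```

Comparing the segments numerically rather than as strings matters here: `1.0.10` is newer than `1.0.9`, which a lexicographic comparison would get wrong.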
Parker Mossman
ed19bcc9c2 Parker/acceptance test off benoit branch (#10843)
* Revert "Rm flaky test (#9628)"

This reverts commit 16133cf5e7.

* Restore the acceptance test running with the new scheduler

* Add timeout

* Isolate the new acceptance test

* Update github action name

* Attempt to fix checkpointing

* Check the state retrieval instead of the existence of the workflow

* fix build

* Add concurrent list for test

* Do not wait for the workflow to be potentially destroyed

* Silently ignore the cancel exception

* Format

* Trigger build

* format

* Remove unrelated changes

* Update acceptance

* Try to fix race condition

* Try to slow down the connection

* Disable test

* Move the sleep

* Rm useless sleep

* Fix missing return

* add repeated

* try using infinite feed source for cancellation test

* set limits on infinite feed source

* misunderstood waitForJob, now correctly waiting for job to be in RUNNING

* clean up PR, DRY create definition methods, clearer method name for waiting on job

* fix acceptance tests action name

* fix imports

* more cleanup

* revert temporalClient do-while change

* fix workflow step names

Co-authored-by: Benoit Moriceau <benoit@airbyte.io>
2022-03-04 08:49:59 -08:00
Jared Rhizor
db8053fb6d upgrade temporal sdk to 1.8.1 (#10648)
* upgrade temporal from mostly 1.6.0 to 1.8.1

* try bumping GSM to get newer grpc dep

* Revert "try bumping GSM to get newer grpc dep"

This reverts commit d837650284.

* upgrade temporal-testing as well

* don't change version for temporal-testing-junit5
2022-02-24 18:57:41 -08:00
Jared Rhizor
9bf67dd91d fix orchestrator restart problem for cloud (#10565)
* test time ranges for cancellations

* try with wait

* fix cancellation on worker restart

* revert for CI testing that the test fails without the retry policy

* revert testing change

* matrix test the different possible cases

* re-enable new retry policy

* switch to no_retry

* switch back to new retry

* parameterize correctly

* revert to no-retry

* re-enable new retry policy

* speed up test + fixes

* significantly speed up test

* fix ordering

* use multiple task queues in connection manager test

* use versioning for task queue change

* remove sync workflow registration for the connection manager queue

* use more specific example

* respond to parker's comments
2022-02-23 22:11:39 -08:00
Benoit Moriceau
22e4f6cd54 Change the block logic and block after the job creation (#10597)
This changes the check for whether a connection exists to make it more performant and more accurate. It makes sure that the workflow is reachable by trying to query it.
2022-02-23 15:24:24 -08:00
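
The idea of treating a workflow as reachable only if a query against it succeeds can be sketched generically. This is an assumption-laden illustration: `stateQuery` stands in for whatever query the Temporal client issues against the workflow, and the class and method names are hypothetical.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch: a workflow counts as reachable only if querying it does not fail.
final class WorkflowReachabilitySketch {

  // stateQuery is a placeholder for a real workflow query call.
  static boolean isReachable(Callable<?> stateQuery) {
    try {
      stateQuery.call();
      return true;
    } catch (Exception e) {
      // Any failure to answer the query is treated as "not reachable".
      return false;
    }
  }
}
```

Probing with a query is stricter than checking that a workflow record exists, since a workflow can exist in storage yet be unable to respond.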
Jared Rhizor
9e23e96794 enable backpressure test on kube (#10527) 2022-02-22 09:14:20 -08:00
Jared Rhizor
a66d8be03a continue workflows on restarts (#10294)
* fix normalization output processing in container orchestrator

* add full scheduler v2 acceptance tests

* speed up tests

* fixes

* clean up

* wip handle worker restarts

* only downtime during sync test not passing

* commit temp

* mostly cleaned up

* add attempt count check

* remove todo

* switch all pending checks to running checks

* use ++

* Update airbyte-container-orchestrator/src/main/java/io/airbyte/container_orchestrator/ContainerOrchestratorApp.java

Co-authored-by: Charles <giardina.charles@gmail.com>

* Update airbyte-workers/src/main/java/io/airbyte/workers/temporal/sync/LauncherWorker.java

Co-authored-by: Charles <giardina.charles@gmail.com>

* add more context

* remove unused arg

* test on CI that no_retry is insufficient

* revert back to orchestrator retry

* test for retry logic

* remove failing test and switch back activity config to just no retry

Co-authored-by: Charles <giardina.charles@gmail.com>
2022-02-17 15:14:51 -08:00
Jared Rhizor
6829e86771 full scheduler v2 CI testing (#10199)
* fix normalization output processing in container orchestrator

* add full scheduler v2 acceptance tests

* speed up tests

* fixes

* clean up
2022-02-11 16:47:17 -08:00
Benoit Moriceau
e7da9232bb Fix record count and add acceptance test to the new scheduler (#9487)
* Add a job notification

The new scheduler was missing a notification step after the job is done.

This is needed in order to report the number of records of a sync.

* Acceptance test with the new scheduler

Add a new github action task to run the acceptance tests with the new scheduler

* Retry if the failure

* PR comments
2022-01-19 18:16:19 -08:00
Jared Rhizor
2a600bebef fix migration test snowflake version comparison error (#9370)
* fix migration test again

* disable acceptance tests

* re-enable acceptance tests

* bring snowflake version back

* fix

* fix how we compare versions for migration tests
2022-01-09 18:25:21 -08:00
Jared Rhizor
ed46b2db78 remove health query for migration test (#9338)
* remove health query for migration test

* fmt
2022-01-06 10:59:27 -08:00
Jared Rhizor
ee26499d7d fix build error introduced by health check change (#9292) 2022-01-04 11:09:16 -08:00
Augustin
c51fb7afe6 Improve JOB_POD variable naming + improve doc about memory management (#9048) 2021-12-23 18:42:13 +01:00
Charles
b920dc8bb0 fix auto migration test (#8970) 2021-12-20 12:51:44 -08:00
LiRen Tu
0de30f5e4d 🎉 Testing destination: multiple logging modes (#8824)
* Implement destination null

* Update existing testing destinations

* Merge in logging consumer

* Remove old destination null

* Add documentation

* Add destination to build and summary

* Fix test

* Update acceptance test

* Log state message

* Remove unused variable

* Remove extra statement

* Remove old null doc

* Add dev null destination

* Update doc to include changelog for dev null

* Format code

* Fix doc

* Register e2e test destination in seed
2021-12-19 00:33:42 -08:00
Lake Mossman
464c485b94 Add acceptance tests for source close timeout (#8217)
* add test connectors and bump versions

* add failure timeout acceptance test

* run gw format

* include exception in runtime exception

* mark as disabled and add comment
2021-12-16 15:21:26 -08:00
Davin Chia
60e32373e8 Revert "Revert "Switch to use Bootloader. (#8584)" (#8778)" (#8790)
This reverts commit 216501b4fa.

Turn this back on since this was originally reverted for logging update changes.
2021-12-15 14:09:43 +08:00
Davin Chia
216501b4fa Revert "Switch to use Bootloader. (#8584)" (#8778)
This reverts commit 5cf3967424.
2021-12-14 21:36:12 +08:00
Davin Chia
5cf3967424 Switch to use Bootloader. (#8584)
- Add the CONFIGS_DATABASE_MINIMUM_FLYWAY_MIGRATION_VERSION and JOBS_DATABASE_MINIMUM_FLYWAY_MIGRATION_VERSION. These are env vars that will determine if the database is ready for an application to start.
- Add the CONFIGS_DATABASE_INITIALIZATION_TIMEOUT_MS and the JOBS_DATABASE_INITIALIZATION_TIMEOUT_MS env vars to determine how long an application should wait for the DB before giving up.
- Create the MinimumFlywayMigrationVersionCheck class. This class contains all the assertions to check if 1) a database is initialised. 2) a database meets the minimum migration version.
- Remove all set up operations from the ServerApp. Use MinimumFlywayMigrationVersionCheck operations instead.
- I also had to modify the Databases and BaseDatabaseInstance classes to support connecting to a database with timeouts. We would previously try forever.
- Add Bootloader to the relevant docker files and Kube files.
- Clean up the migration acceptance tests so it's clear what is happening.
2021-12-14 21:30:18 +08:00
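
The startup gate described in this commit (wait for the database up to a timeout, then require a minimum Flyway migration version) might look roughly like the sketch below. It is not the actual `MinimumFlywayMigrationVersionCheck` class: the supplier-based lookup, the dotted-number version comparison, and all names are assumptions. The `minimumVersion` and `timeoutMs` parameters correspond to values such as `CONFIGS_DATABASE_MINIMUM_FLYWAY_MIGRATION_VERSION` and `CONFIGS_DATABASE_INITIALIZATION_TIMEOUT_MS`.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch of a "minimum migration version" startup gate.
final class MinimumMigrationVersionSketch {

  // Compares dotted numeric versions such as "0.35.1.001"; missing segments count as 0.
  static int compareVersions(String a, String b) {
    String[] as = a.split("\\.");
    String[] bs = b.split("\\.");
    int n = Math.max(as.length, bs.length);
    for (int i = 0; i < n; i++) {
      int x = i < as.length ? Integer.parseInt(as[i]) : 0;
      int y = i < bs.length ? Integer.parseInt(bs[i]) : 0;
      if (x != y) {
        return Integer.compare(x, y);
      }
    }
    return 0;
  }

  // Polls currentVersion (empty while the database is not initialised) until it
  // reaches minimumVersion, and gives up once timeoutMs has elapsed.
  static void waitForMinimumVersion(
      Supplier<Optional<String>> currentVersion, String minimumVersion, long timeoutMs, long pollMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      Optional<String> version = currentVersion.get();
      if (version.isPresent() && compareVersions(version.get(), minimumVersion) >= 0) {
        return;
      }
      if (System.nanoTime() - deadline >= 0) {
        throw new IllegalStateException("database did not reach migration version " + minimumVersion);
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for the database", e);
      }
    }
  }
}
```

Bounding the wait is the point of the change: the message notes that connections were previously retried forever, whereas a gate like this fails loudly once the timeout passes.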
Davin Chia
341f505a94 Rename env vars for better readability. (#8447)
* Rename GcsStorageBucket to GcsLogBucket.

* Update all references to GCP_STORAGE_BUCKET to GCS_LOG_BUCKET.

* Undo this for configuration files for older Airbyte versions.

* Clean up Job env vars. (#8462)

* Rename MAX_SYNC_JOB_ATTEMPTS to SYNC_JOB_MAX_ATTEMPTS.

* Rename MAX_SYNC_TIMEOUT_DAYS to SYNC_JOB_MAX_TIMEOUT_DAYS.

* Rename WORKER_POD_TOLERATIONS to JOB_POD_TOLERATIONS.

* Rename WORKER_POD_NODE_SELECTORS to JOB_POD_NODE_SELECTORS.

* Rename JOB_IMAGE_PULL_POLICY to JOB_POD_MAIN_CONTAINER_IMAGE_PULL_POLICY.

* Rename JOBS_IMAGE_PULL_SECRET to JOB_POD_MAIN_CONTAINER_IMAGE_PULL_SECRET.

* Rename JOB_SOCAT_IMAGE to JOB_POD_SOCAT_IMAGE.

* Rename JOB_BUSYBOX_IMAGE to JOB_POD_BUSYBOX_IMAGE.

* Rename JOB_CURL_IMAGE to JOB_POD_CURL_IMAGE.

* Rename KUBE_NAMESPACE to JOB_POD_KUBE_NAMESPACE.

* Rename RESOURCE_CPU_REQUEST to JOB_POD_MAIN_CONTAINER_CPU_REQUEST.

* Rename RESOURCE_CPU_LIMIT to JOB_POD_MAIN_CONTAINER_CPU_LIMIT.

* Rename RESOURCE_MEMORY_REQUEST to JOB_POD_MAIN_CONTAINER_MEMORY_REQUEST.

* Rename RESOURCE_MEMORY_LIMIT to JOB_POD_MAIN_CONTAINER_MEMORY_LIMIT.

* Remove worker suffix from created pods to reduce confusion with actual worker pods.

* Use sync instead of worker to name job pods.
2021-12-03 23:28:48 +08:00
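
For deployments that still carry the old variable names, a lookup that prefers the new name and falls back to the legacy one is one way to bridge a rename like this. The sketch below is purely illustrative: the commit message does not say that Airbyte itself reads the old names, and the class, map, and method are hypothetical. Only a few of the renames listed above are included.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper mapping new env var names to their pre-rename names.
final class EnvVarRenameSketch {

  // new name -> legacy name, taken from the rename list in the commit message.
  static final Map<String, String> LEGACY_NAMES = new LinkedHashMap<>();

  static {
    LEGACY_NAMES.put("GCS_LOG_BUCKET", "GCP_STORAGE_BUCKET");
    LEGACY_NAMES.put("SYNC_JOB_MAX_ATTEMPTS", "MAX_SYNC_JOB_ATTEMPTS");
    LEGACY_NAMES.put("SYNC_JOB_MAX_TIMEOUT_DAYS", "MAX_SYNC_TIMEOUT_DAYS");
    LEGACY_NAMES.put("JOB_POD_TOLERATIONS", "WORKER_POD_TOLERATIONS");
    LEGACY_NAMES.put("JOB_POD_KUBE_NAMESPACE", "KUBE_NAMESPACE");
  }

  // Prefers the new name; falls back to the legacy name when only that one is set.
  static Optional<String> resolve(Map<String, String> env, String newName) {
    String value = env.get(newName);
    if (value == null) {
      String legacy = LEGACY_NAMES.get(newName);
      if (legacy != null) {
        value = env.get(legacy);
      }
    }
    return Optional.ofNullable(value);
  }
}
```

Passing the environment in as a map (for example `System.getenv()`) keeps the helper easy to exercise without touching the real process environment.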
Charles
ada2e1724a Refactor MigrationAcceptanceTest to test for major version bumps (#8154) 2021-11-29 20:14:18 -08:00
Charles
817ec6db7a handle major version bump in automigration test (#8152) 2021-11-19 17:13:44 -08:00
Charles
d67afa7654 hard code auto migration test version (#8146) 2021-11-19 13:16:17 -08:00
Christophe Duong
c5a7267378 🐛🐌 Optimize incremental normalization runtime with snowflake (#8088) 2021-11-19 15:03:52 +01:00