* add AttemptSyncConfig, move info out of JobSyncConfig
* get build working
* add db migration
* load config when building attempts
* persist AttemptSyncConfig
* it compiles
* fix job persistence test
* implement submitSync with attempt config
* fix TemporalClientTest
* reorganizing some code
* add GenerateInputActivity test
* verify AttemptSyncConfig is persisted
* add test for persistence changes
* add test for getAttemptByNumber
* use apis rather than direct db access
* fix compatibility with master
* copy update
* fix tests for allowed hosts addition
* remove debug logging
* fix: handle when state is not set on the connection
* fix: handle unset state (on the server this time)
* set state type when converting to internal representation
* Suggested Streams in the Actor Definition
* Fix stream addition
* fix tests
* enable faker streams
* test resilience
* lint
* fix configLookup
* fixup definition load in webBackend
* fix build/tests
* Include 'suggested' in discover API response
* fix test
* Update airbyte-api/src/main/openapi/config.yaml
Co-authored-by: Pedro S. Lopez <pedroslopez@me.com>
* update build with typo
* fix test
* remove comment
* add more context
---------
Co-authored-by: Pedro S. Lopez <pedroslopez@me.com>
* wip: return whether configuration was updated
* updated outputs working
* fix pmd
* update description, format
* add didUpdateConfiguration to metadata, rm unneeded generics
* add didUpdateConfiguration to api response
* update name to fix pmd
* not required
* rename to match api response
* remove unused field
* match naming
* track latest config message
* pass new config as part of outputs
* persist new config
* persist config as the messages come through, dont set output
* clean up old implementation
* accept control messages for destinations
* get api client from micronaut
* mask instance-wide oauth params when updating configs
* defaultreplicationworker tests
* formatting
* tests for source/destination handlers
* rm todo
* refactor test a bit to fix pmd
* fix pmd
* fix test
* add PersistConfigHelperTest
* update message tracker comment
* fix pmd
* format
* move ApiClientBeanFactory to commons-worker, use in container-orchestrator
* pull out config updating to separate methods
* add jitter
* rename PersistConfigHelper -> UpdateConnectorConfigHelper, docs
* fix exception type
* fmt
* move message type check into runnable
* formatting
* pass api client env vars to container orchestrator
* pass micronaut envs to container orchestrator
* print stacktrace for debugging
* different api host for container orchestrator
* fix default env var
* format
* fix errors after merge
* set source and destination actor id as part of the sync input
* fix: get destination definition
* fix null ptr
* remove "actor" from naming
* fix missing change from rename
* revert ContainerOrchestratorConfigBeanFactory changes
* inject sourceapi/destinationapi directly rather than airbyteapiclient
* UpdateConnectorConfigHelper -> ConnectorConfigUpdater
* rm log
* fix test
* dont fail on config update error
* process control messages for discover jobs
* process control messages for CHECK
* persist config updates on check_connection_for_update
* get last config message rather than first
* fix pmd
* fix failing tests
* add tests
* source id not required for check connection (create case)
* suppress pmd warning for BusyWait literal
* source id not required for check connection (create case) (p2)
* pass id, not full config to runnables/accept control message
* add new config required for api client
* add test file
* remove debugging logs
* rename method (getLast -> getMostRecent)
* rm version check (re-added this in by mistake on merge)
* fix test compatibility
* simplify
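
The config-update commits above (persist config as the messages come through, keep the most recent config message, don't fail on a config update error, expose didUpdateConfiguration) can be illustrated with a minimal sketch. This is not the actual implementation: the `ConfigTracker` class, the flat message shape (`type` / `config` keys), and the `persist` callback are all hypothetical names chosen for illustration.

```python
import logging

logger = logging.getLogger(__name__)


class ConfigTracker:
    """Hypothetical helper: persists connector config updates as they arrive."""

    def __init__(self, persist, initial_config):
        self._persist = persist          # assumed callback that saves a config
        self._current = initial_config   # most recent config known so far
        self.did_update_configuration = False

    def accept_control_message(self, message):
        # Assumed, simplified message shape: {"type": ..., "config": {...}}.
        if message.get("type") != "CONNECTOR_CONFIG":
            return
        new_config = message.get("config")
        if new_config == self._current:
            return  # nothing changed, so nothing to persist
        try:
            self._persist(new_config)
        except Exception:
            # A failed config update is logged but must not fail the job.
            logger.exception("failed to persist updated config; continuing")
            return
        self._current = new_config
        self.did_update_configuration = True
```

Persisting on each message, rather than once at the end, means the most recent config survives even if the job dies partway through.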
Implement the persistence layer changes following #19191.
This PR handles writing and reading stats to the new stream_stats table and the new columns in the existing sync_stats table.
At the same time we introduce upserts of stats records - i.e. merge updates into a single record - in preparation for real time stats updates vs the current approach where a new stat record is always written.
There will be two remaining PRs after this:
- First PR will be to fully wire up and test the API.
- Second PR will be to actually save stats while jobs are running.
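
For readers unfamiliar with upserts, here is a minimal sketch of merging updates into a single stats record. It uses SQLite purely for illustration. The table name `demo_stream_stats`, its columns, and the `(attempt_id, stream_name)` uniqueness key are assumptions and do not reflect the actual schema or persistence code.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical stats table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE demo_stream_stats (
            attempt_id INTEGER NOT NULL,
            stream_name TEXT NOT NULL,
            records_emitted INTEGER,
            bytes_emitted INTEGER,
            UNIQUE (attempt_id, stream_name)
        )
        """
    )
    return conn


def upsert_stream_stats(conn, attempt_id, stream_name, records_emitted, bytes_emitted):
    """Insert a stats row, or overwrite the existing row for the same key."""
    conn.execute(
        """
        INSERT INTO demo_stream_stats
            (attempt_id, stream_name, records_emitted, bytes_emitted)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (attempt_id, stream_name) DO UPDATE SET
            records_emitted = excluded.records_emitted,
            bytes_emitted = excluded.bytes_emitted
        """,
        (attempt_id, stream_name, records_emitted, bytes_emitted),
    )
```

Calling `upsert_stream_stats` twice for the same attempt and stream leaves one row holding the latest values, which is the behavior real-time stats updates would rely on.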
* database migration to add column for field selection info
* add field selection info to standard sync persistence
* fix around persistence of field selection info
* API changes to support configuring column selection
* style and testing improvements around column selection api impl
* acceptance test fix for field selection api changes
* Remove the Change Management section as it was outdated. Added Authentication section to clarify how to use the API in OSS.
Co-authored-by: swyx <shawnthe1@gmail.com>
When auto-detect schema changes feature flag is on, disable connections that have breaking schema changes and connections that have any schema changes where the user has set their preference to disable.
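
The rule described above can be written as a small decision function. This is only a sketch: the function name, the parameter names, and the `NonBreakingChangePreference` enum are hypothetical stand-ins for whatever the handler actually uses.

```python
from enum import Enum


class NonBreakingChangePreference(Enum):
    """Hypothetical per-connection preference for handling schema changes."""
    IGNORE = "ignore"
    DISABLE = "disable"


def should_disable_connection(flag_enabled, has_breaking_change, has_any_change, preference):
    """Return True when the connection should be disabled after a schema check."""
    if not flag_enabled:
        return False  # feature flag off: never auto-disable
    if has_breaking_change:
        return True   # breaking changes always disable
    # Otherwise disable only if the user opted in for any schema change.
    return has_any_change and preference is NonBreakingChangePreference.DISABLE
```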
Today we often see 'HTTP/1.1 header parser received no bytes' errors during syncs, especially in the Data Plane.
This PR attempts to fix this by adding naive retries.
Add a basic retry wrapper with the unique ability to wait for a much longer period before the last retry. This is particularly useful for us as most of our jobs are long-running workflows, and the benefit of not having to restart the entire job outweighs the added wait time.
Alternative solutions I explored:
- Switching the underlying HTTP client to a more fully featured HTTP client, e.g. Apache or OkHttp. Issues with this:
  - These clients do not support the ability to configure the retry policy we want.
  - These clients do not support the ability to inject application aware logging.
  - Most importantly, because this changes the interface, the resulting change set is big and affects many unrelated classes. I do think we eventually want to switch the underlying libraries out. However I don't think we should do this as part of OC work.
- Exploring pairing retry libraries such as https://resilience4j.readme.io/docs with the native http clients. The main issue here is the lack of ability to configure the last retry period.
Since the hand-rolled wrapper is simple + gets the job done, my thoughts are to run with this for the time being and revisit this if additional requirements around the clients come up.
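
As a rough illustration of the idea (not the wrapper added in this PR), a retry helper with a longer wait before the final attempt might look like the sketch below. The function name, parameter names, and default values are assumptions. The `sleep` parameter is injectable only so the behavior can be checked without real waiting.

```python
import time


def retry_with_long_final_wait(call, max_tries=5, wait_seconds=1.0,
                               final_wait_seconds=60.0, sleep=time.sleep):
    """Call `call()` up to `max_tries` times, waiting longer before the last try."""
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(1, max_tries + 1):
        try:
            return call()
        except Exception:
            if attempt == max_tries:
                raise  # out of tries: surface the last error to the caller
            # The wait that precedes the final try is the long one.
            before_final_try = attempt == max_tries - 1
            sleep(final_wait_seconds if before_final_try else wait_seconds)
```

With `max_tries=4`, a call that keeps failing waits 1 s, 1 s, then 60 s before the last attempt, trading extra wait time for a chance to avoid restarting a long job.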
* add structured dbt cloud information to the operations api
* remove unused webhook features, test updates
* update tests to use structured dbt cloud operation api
* add missing webhook operator type
API changes to support the progress bar.
- The eventual idea is for the save_stats route to be called by the workers during replication. Workers will save stats for a job id and attempt number.
- Make modifications to the /jobs/list and the /jobs/get_debug_info routes to also return estimated bytes/records.
We need both estimated metadata, as well as running states to calculate progress bar and throughput.
- add the save_stats route. This is the route that will be called by workers. I've done my best to reuse existing openapi bodies to reduce duplication.
- add the estimatedRecords and estimatedBytes fields to the AttemptStats body. This is part of the AttemptRead and the AttemptStreamStats objects. This eventually filters up to the jobs/list and jobs/get_debug_info objects. This also adds these to all the endpoints that were previously returning stats information. I think the duplicated data is a small issue and don't think it's worth splitting out new API objects, though I will gladly do so if folks feel strongly.
- minor changes to the AttemptApiController to support the new route.
- I've stubbed out the handlers for now since the backend is not yet implemented.
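
To show why both the estimates and the running stats are needed, here is a minimal sketch of how a client could derive progress and throughput from them. The function and parameter names are hypothetical, and the actual UI calculation may differ.

```python
def progress_fraction(emitted, estimated):
    """Return progress in [0, 1], or None when no usable estimate exists."""
    if estimated is None or estimated <= 0:
        return None
    # Estimates can undershoot, so cap the value at 100%.
    return min(emitted / estimated, 1.0)


def throughput_per_second(emitted, elapsed_seconds):
    """Return records (or bytes) per second, or None before any time has passed."""
    if elapsed_seconds <= 0:
        return None
    return emitted / elapsed_seconds
```

The same two functions apply to either records or bytes, depending on which pair of emitted and estimated values is passed in.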
* Tmp
* Extract the Attempt API from the V1 API
* Add comments
* Move Connection API out of configuration API
* format
* format
* Rename to Controller
* Rename to Controller
* Add values to the factory
* Change the constructor to use handler instead of objects needed by the handler
* Update with new tags.
* tmp
* Fix PMD errors
* Extract DB migrator
* Add something that I forgot
* extract destination definition api
* restore destination factory initialization
* extract destination definition specification api
* format
* format
* format
* extract health check api
* extract jobs api
* fix test
* format
* Extract logs api
* Add missing declaration
* Fix build
* Tmp
* format and PR comments
* Extract notification API
* re-organize tags
* Extract all Oauth
* Fix PMD
* add schemaChange
* merge conflict
* frontend tests
* tests
* l
* fix source catalog id
* test
* formatting
* move schema change to build backend web connection
* check if actor catalog id is different
* fix
* tests and fixes
* remove extra var
* remove logging
* continue to pass back new catalog id
* api updates
* fix mockdata
* tests
* tests
* optional of nullable
* Tmp
* For diff
* Add test
* More test
* Fix test and add some
* Fix merge and test
* Fix PMD
* Fix test
* Rm dead code
* Fix pmd
* Address PR comments
* RM unused column
Co-authored-by: alovew <anne@airbyte.io>
* ensure workspace webhook configs can be correctly passed between API and persistence layers
* remove unnecessary logging
* add unit tests to workspace webhook config handling
* additional testing and style cleanup around workspace webhook config handling
* introduce webhook operations to the operations API and persistence
* add unit tests for webhooks in operations endpoint handler
* fixes and additional testing in webhook operations handler
* cleanup refactor around operations handling to reduce duplicate code
* wip
* handle webhook configs in workspaces endpoint and split/hydrate secrets
* style improvements to documentation around webhook configs
* Clarify documentation around webhook auth tokens
* More documentation clarification around webhook configs
* Format.
* unit test coverage for webhook config handling
* use common json parsing libraries around webhook configs
* clean up around testing webhook operation configs
Co-authored-by: Davin Chia <davinchia@gmail.com>
* progress on adding geography throughout api
* fix workspace handler test
* more progress
* implement workspace defaulting and add/update more tests
* fix bootloader tests
* set defaultGeography in missing places
* add Geography column when reading Connection record from DB
* fix pmd
* add more comments/description
* format
* description