Follow-up to #26366.
Pull the async consumer changes into the Consumer Factory. Also take the chance to split up the StagingConsumerFactory, with the goal of clarifying the various general, serial, and async functions.
Instead of one massive factory file, split into:
- GeneralStagingFunction.java
- AsyncFlush.java
- SerialFlush.java
representing the general buckets of code we have today.
I'm sure we can do smarter things here. This is the bare minimum to unblock us + 'leave things better than we found them'.
Follow-up to #26324. Here we split up the BufferManager and add tests and comments.
- Split the buffer manager class into BufferManager, BufferEnqueue and BufferDequeue.
- Move all buffer-related code to the buffers package.
- Rename test classes to match this split.
- Add Javadocs and tests as part of this split.
- Simplify the BufferDequeue interface to return a set of streams representing the buffered streams instead of the underlying map of buffers. This lets us keep the memory queue package-private.
- All getY methods now return Optionals for better error handling. Previously, a missing value would have resulted in an NPE.
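The last two points can be illustrated with a minimal sketch. The class name `BufferDequeueSketch` and the methods `enqueue`, `getBufferedStreams`, and `getQueueSizeInRecords` are assumptions made up for this example; they are not taken from the PR and may not match the real BufferDequeue.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch; names and signatures are illustrative, not the PR's actual API.
class BufferDequeueSketch {

  // The underlying map of per-stream buffers stays private to this class.
  private final Map<String, Queue<String>> buffers = new ConcurrentHashMap<>();

  // Enqueue side, included only so the sketch is self-contained.
  void enqueue(final String stream, final String record) {
    buffers.computeIfAbsent(stream, s -> new ConcurrentLinkedQueue<>()).add(record);
  }

  // Callers see only the stream names, never the map of buffers.
  Set<String> getBufferedStreams() {
    return Set.copyOf(buffers.keySet());
  }

  // Returns Optional.empty() for an unknown stream instead of null.
  Optional<Integer> getQueueSizeInRecords(final String stream) {
    return Optional.ofNullable(buffers.get(stream)).map(Queue::size);
  }
}
```

With this shape, a caller asking about a stream that was never buffered gets an empty Optional and has to handle that case explicitly, rather than dereferencing a null.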
Split out the smallest set of reasonable changes from #26086 .
My goal was to split out the interface, as well as show how the interface is meant to be used.
Follow up PRs:
- Split out classes from BufferManager and add more tests there.
- Add in the AsyncConsumer with tests.
- Add in the StagingConsumer factory.
* Revert "Splits bases and updates build.gradle files (#25649)"
This reverts commit c673b0a692.
* Bumps branch to prevent a conflict with publishing
* Forward fixes Snowflake to use singular base-java and develop within a new package within the same module
* Forcing automated change to merge changes
* try this?
* fix tests
* assert cdc values
* handle case where we have lsn but no updated_at
* readability improvements
* tweaks to test
* version bumps + changelogs
* Automated Change
---------
Co-authored-by: edgao <edgao@users.noreply.github.com>
* Splits bases and updates build.gradle files
* Fixed changelog out of sync
* Bumps version number and metadata files
* auto-bump connector version
* Downgraded untouched connector bumps
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
* map the other integer schema to long
* fix test + add test
* delete_public_access_block for bucket if public (#25663)
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
* 🐛 Source Facebook Marketing: fix `expected records` for CAT (#25604)
* publish normalization (#25591)
* publish normalization
* bump normalization container version in all the destinations that use it
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: edgao <edgao@users.noreply.github.com>
* Bump Airbyte version from 0.44.2 to 0.44.3
* Destination Bigquery: update AIRBYTE_ENTRYPOINT env var for kube process (#25588)
* add AIRBYTE_ENTRYPOINT env var for kube
* amazing, absolute genius
* version bump + changelog
* derp, no need to publish denormalizeid
* fix changelog entry
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
* version bumps + changelog
* whoops
* bump metadata
* bump metadatas
* auto-bump connector version
* auto-bump connector version
* auto-bump connector version
* auto-bump connector version
* auto-bump connector version
---------
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
Co-authored-by: Serhii Chvaliuk <grubberr@gmail.com>
Co-authored-by: Baz <oleksandr.bazarnov@globallogic.com>
Co-authored-by: Jeff Cowan (Airbyte) <4992320+jcowanpdx@users.noreply.github.com>
Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: benmoriceau <benmoriceau@users.noreply.github.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
* Removes defunct Azure Blob Storage loading option for Snowflake
* Bumps to major version and removes documentation that references AzureBlobStorage
* Updates the destination_definitions.yaml
* Run ProcessResources to match version of 1.0.0 mismatched spec
* Pinning urllib to older version since the 2.0 version removed classes
Changes in this refactor PR
* Use the proper interface name for the OnStartFunction
* Use the proper interface name for the OnCloseFunction
* Create and use a proper interface name for the FlushBufferFunction
* Create and use a proper interface name for the BufferCreateFunction
* Mostly naming consistency changes. These are things caught in static, compile time checks so should be low risk.
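As a rough illustration of why named interfaces help, here is a minimal sketch. The interface names come from the list above, but every method signature below is an assumption, since this log does not show them. BufferCreateFunction is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical signatures: the log names these interfaces but does not show their methods.
class NamedFunctionsSketch {

  @FunctionalInterface
  interface OnStartFunction {
    void call();
  }

  @FunctionalInterface
  interface OnCloseFunction {
    void accept(boolean hasFailed);
  }

  @FunctionalInterface
  interface FlushBufferFunction {
    void flush(String stream, List<String> records);
  }

  // Wires the three named functions together and records the order of calls.
  static List<String> demo() {
    final List<String> events = new ArrayList<>();
    final OnStartFunction onStart = () -> events.add("start");
    final FlushBufferFunction flush =
        (stream, records) -> events.add("flush " + stream + " x" + records.size());
    final OnCloseFunction onClose = hasFailed -> events.add("close failed=" + hasFailed);

    onStart.call();
    flush.flush("users", List.of("r1", "r2"));
    onClose.accept(false);
    return events;
  }
}
```

Because each parameter has its own named type, passing a close handler where a flush function is expected is rejected by the compiler, which is consistent with the note that these changes are caught by static checks.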
---------
Co-authored-by: jcowanpdx <jcowanpdx@users.noreply.github.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
We were running into a CI/CD system-only bug with dbt that requires this workaround to get it working
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
* super hacky start
* also check that we're writing
* v0 convert normalization logs to airbytemessage
* add start+end logs
* aggregate errors into a single trace?
* pipefail; quick tweaks to log parser
* make spotbugs happy
* more comments, uncomment env var check
* copy in SentryExceptionHelper
* final fixes
* write tests + fix bugs
* move to base-java
* remove outdated comment
* fix spotbugs
* Automated Change
* minor version bump
* changelog
* fix behavior when env var not set
* run normalization even if destination fails
* better logic
* better logging
* oops
* move to base-java
* rebump version
* Automated Change
* auto-bump connector version
* wtf how did this work previously
* auto-bump connector version
---------
Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
* Fix LSN parsing from Integer to Long
* rebasing
* Rebase
* Rebase
* Other casting
* Lock the file only when reading, so the file is free when parsing the object.
Increased from 1 to 166 checkpoints, and went from skipping hundreds of checkpoints to never skipping a state.
* Update load function documentation
* bump mysql and mssql
* cdc: refactor to remove debezium dependency from connector packages
* use gradle's shared dependency
* more refactoring
* upgrade docker version
* resolve master merge conflicts
* Automated Change
* minor changes
* resolve merge conflicts
* avoid deserializing multiple times
* simplify
* enable checkpointing for Postgres
* more improvements
* enable assertions
* changelog + bump version
* auto-bump connector version
* auto-bump connector version
* manual bump
---------
Co-authored-by: subodh <subodh1810@gmail.com>
Co-authored-by: subodh1810 <subodh1810@users.noreply.github.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
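The "Fix LSN parsing from Integer to Long" change above can be sketched as follows. This assumes the LSN arrives as a decimal string; the class `LsnParseSketch` and its methods are hypothetical and are not the connector's actual code.

```java
// Hypothetical helper; the connector's actual parsing code is not shown in this log.
class LsnParseSketch {

  // Parses the LSN as a 64-bit value, so values above Integer.MAX_VALUE are accepted.
  static long parseLsn(final String raw) {
    return Long.parseLong(raw);
  }

  // Shows the failure mode of 32-bit parsing: large values throw NumberFormatException.
  static boolean fitsInInt(final String raw) {
    try {
      Integer.parseInt(raw);
      return true;
    } catch (final NumberFormatException e) {
      return false;
    }
  }
}
```

For example, `"3000000000"` is above `Integer.MAX_VALUE` (2147483647), so `fitsInInt` returns false for it while `parseLsn` returns the full value.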
* Pass argument along, add test that should pass and test that should fail
* Add tests with additionalProperties
* Set additionalProperties=false when not set
* Parametrize test cases
* Make the behavior 'optional'
* Fix parametrization for all combinations
* Improve CI credentials README and rename param
* Update naming to be clearer about columns only
* record_has_unexpected_field > record_has_unexpected_column
* Automated Change
* Hacking the CAT dockerfile and run script to test my changes specifically
* First crack at running CAT on all connectors
* Write during instead of after all tests
* Async-ify it
* Add ability to define max concurrency
* Write successes
* ci_credentials: fix overwriting 'data' before getting nextPageToken
* Adjustable num_semaphores, check to make sure it's an airbyte connector first
* Automated Change
* Make create_issues and create_prs more configurable, add issue for fail_on_extra_columns
* Add ability to pass in sources as a list or from a txt file
* Add logs to issue, make project nullable
* Migrate multiple connectors
* Add cli args
* use ruamel.yaml to preserve ordering
* Separate config loading from config migration
* Add ability to pass in lists of sources to test. Sort output by exit codes. Fix max_concurrency flag
* Default to testing only beta and GA connectors
* Always write test output when available
* Revert "Add cli args"
This reverts commit b538a8c696.
* Remove slash
* Don't run on alpha connectors, handle older config style
* Don't migrate to new format, preserve quotes and long lines
* Automated Change
* Update issue, don't run for alpha connectors
* Automated Change
* Add bypass for extra fields test
* Add bypass for extra fields test
* Rename run_tests script
* Rename module
* Update args usage, small changes
* Refactor create_issues.py
* Clean up run_tests.py
* Sort out arg parsers
* Pull out get_valid_definitions_from_args
* Import definitions module instead of methods
* Use config files to provide constants for each migration
* Handle FileNotFoundError in create_issues.py, improve logging
* Rename to migrations, reference name of folder via utils
* Update readmes for migration modules, add script for getting outputs
* Use tmp dir, correct path for issue reference
* Fix bash script
* Fix create command, pull out test results insertion
* Update call to update_configuration
* add precommit to requirements
* Reorder README
* README cleanup for test and create issues
* README cleanup for create_prs and config_migration
* More readmes! Readmes galore
* allow_beta
* Restore hacky changes to dockerfile and acceptance-test-docker
* Handle 'other' release stages
* Update readme
* Remove TODO, add comments to shell script
* format according to gradle
* format
* Fix formatting
---------
Co-authored-by: marcosmarxm <marcosmarxm@users.noreply.github.com>
Co-authored-by: erohmensing <erohmensing@users.noreply.github.com>
* fix-cdc: errors retry debezium property should be less than max retry
* add comment
* version bump + changelog
* auto-bump connector version
* Update source-alloydb versions to match source-postgres
* rebump to 2.0.15
* auto-bump connector version
* definitions + regenerate manually
---------
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: nguyenaiden <duy@airbyte.io>
* test docker behavior on CI env
* Automated Change
* test docker behavior on CI env
* Make all unit and integration tests in source-postgres pass locally
* Fix mysql ssh integration test
* Fix failing test
* Fix source-mssql build
* source-mssql runs tests locally.
Fix compilation errors
---------
Co-authored-by: rodireich <rodireich@users.noreply.github.com>
* use standard retry mode
* Automated Change
* dest-s3 version bump + changelog
* also in redshift + snowflake
* auto-bump connector version
* version bumps
---------
Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
* Fix the error reading the offset file while Debezium is writing.
Enable CDC checkpointing to Postgres.
Minor change in the variable name to fit the type.
* Add final statement on exception ;)
* Add comments to CDC Checkpoint tests.
Clean a bit.
* Bump connector versioning
* Add log message
* Fix changelog
* auto-bump connector version
* Manually generate definitions
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
* MS SQL does not support schema change in incremental model
* make schema change test optional
* fix compilation errors in postgres-strict-encrypt destination
* deparallelize integration tests for destination postgres
* deparallelize MS SQL integration tests
* remove broken SSH tunnel test from destination-postgres-strict-encrypt
* Pass argument along, add test that should pass and test that should fail
* Add tests with additionalProperties
* Set additionalProperties=false when not set
* Parametrize test cases
* Make the behavior 'optional'
* Fix parametrization for all combinations
* Improve CI credentials README and rename param
* Update naming to be clearer about columns only
* record_has_unexpected_field > record_has_unexpected_column
* Automated Change
* Add bypass for beta + ga connectors that failed
* Update docs and TODOs
* Update changelog and dockerfile
* Update TODO
* Update a few neglected connectors
* Remove uploaded file
* Update dockerfile after merge conflict
---------
Co-authored-by: marcosmarxm <marcosmarxm@users.noreply.github.com>
* copy tests from other branch
* switch to >
* [wip] wire up tests
* make tests work
* fixes
* nicer test structure
* maybe add feature flag?
* pattern matching
* also add version check
* formatting
* refactor test also
* extract test + fix method call
* minor tweaks
* add context to log message
* put workspace id in normalization input
* use non-semver tag
* add flag for version of normalization
* also flag old version
* add test
* missed part of the commit
* format
* add test for null workspace ID
* Revert "also flag old version"
This reverts commit 3be601d16c.
* Revert "missed part of the commit"
This reverts commit 47a67b4631.
* always apply flag, even if we're behind a version
* derp
* Add more logging to the normalization activity
* Update charts and kustomize for the feature flag
* fix clickhouse integration test
* remove replace_identifiers
* Revert "remove replace_identifiers"
This reverts commit 0e7ded5a7b.
* fix replace_identifiers
* garbage debug logs
* stop trying to setup duckdb test
* wake up and choose violence
* fix mssql
* exclude duckdb from tests
* make snowflake happy
* uncomment tests
* derp
* derpderp
* format
* format
* also fix redshift???
* maybe now everything works???
* remove debug logs
* use special docker tag
* bump to new tag
* use random test schema in publish also
* properly cleanup
* remove feature flag stuff
* version bump + changelog
* Automated Commit - Formatting Changes
* bump definitions
---------
Co-authored-by: Jimmy Ma <gosusnp@users.noreply.github.com>
Co-authored-by: Jimmy Ma <jimmy@airbyte.io>
Co-authored-by: octavia-squidington-iii <octavia-bot@airbyte.io>
Co-authored-by: edgao <edgao@users.noreply.github.com>
* more tests
* format
* format
* check orders separately per group
* typo
* add examples to docstring
* prepare release
---------
Co-authored-by: Augustin <augustin@airbyte.io>
* add grouping and collapsing fields to postgres source
* add auth group to github source connector
* revert postgres field order changes and adjust group of schemas field
* inject group into ssh tunnel spec for postgres only, through overloaded methods
* Automated Change
* bump Dockerfile versions and update changelogs
* bump strict encrypt version as well
* fix postgres acceptance test
* fix acceptance test again
* fix all postgres acceptance tests
* add newline
* undo other changes to postgres readme file
* add security group to tunnel_method in expected_spec.json
* bump version of strict encrypt
* manually bump versions in seed files
---------
Co-authored-by: lmossman <lmossman@users.noreply.github.com>
* fix-postgres-cdc-npe:do not put null in properties
* version bump + change log
* auto-bump connector version
* manual bump
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
* add grouping and collapsing fields to postgres source
* add auth group to github source connector
* revert postgres field order changes and adjust group of schemas field
* inject group into ssh tunnel spec for postgres only, through overloaded methods
* Automated Change
* bump Dockerfile versions and update changelogs
* bump strict encrypt version as well
* fix postgres acceptance test
* fix acceptance test again
---------
Co-authored-by: lmossman <lmossman@users.noreply.github.com>