* This commit adds new functionality that generates checkpoints during CDC synchronization.
To do that, we wrap the AirbyteMessage iterator in a new iterator that handles the
checkpoint messaging.
* Reformat code
* Reformat code
* Reformat code
* Reformat code
* Second attempt with ugly if statement
* Add `isRecordBehindOffset` function to make sure it is safe to send the state.
Tests are failing because there are now more state messages:
expected: <1> but was: <3>
* Code formatting
* Add an additional check that skips the state message if the record is part of the snapshot load.
* Remove comments
* Fix imports
* Fix format
* Add check if the iterator has extra elements so we don't send state message twice (edge case)
* Add a new check to avoid sending multiple state messages with same offset.
Fix PR comments.
Not sending checkpoints... figuring out
* Modify MSSQL and MySQL implementations
* Add better control over Maps and include a test for time checkpoint.
Also adds extra assert to verify there are no duplicate states
* Formatting
* Improve code documentation and use default for CdcStateHandler new functions
* Sort out missing `final` and types from comments
* Minor improvement in checkpoint validation
* format files
* It's 2023!
* Import issues
* Changes after merging master
* Upgrade Debezium version in MySQL
* Bump Postgres and Alloydb
* auto-bump connector version
* Manually bumping version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
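
The checkpointing behavior described in the bullets above can be sketched roughly as follows. This is a minimal Python illustration, not the connector's actual Java code: the `Record` and `State` types, the `with_checkpoints` function, and the count-based `every` interval are all hypothetical names and assumptions chosen for the example (the commit also mentions a time-based checkpoint, which is not modeled here).

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Union


@dataclass(frozen=True)
class Record:
    """Hypothetical CDC record carrying the offset it was read at."""
    data: str
    offset: int
    from_snapshot: bool = False


@dataclass(frozen=True)
class State:
    """Hypothetical checkpoint (state) message for a given offset."""
    offset: int


def with_checkpoints(
    records: Iterable[Record], every: int
) -> Iterator[Union[Record, State]]:
    """Wrap a record iterator and interleave State messages.

    Assumptions for this sketch: a State is emitted after every `every`
    streamed (non-snapshot) records, snapshot records never trigger a
    State, and the same offset is never checkpointed twice.
    """
    last_emitted: Optional[int] = None  # offset of the last State sent
    last_seen: Optional[int] = None     # offset of the last streamed record
    since_checkpoint = 0
    for record in records:
        yield record
        if record.from_snapshot:
            continue  # skip checkpointing while in the snapshot load
        last_seen = record.offset
        since_checkpoint += 1
        if since_checkpoint >= every and last_seen != last_emitted:
            yield State(last_seen)
            last_emitted = last_seen
            since_checkpoint = 0
    # Final State, unless that offset was already checkpointed (edge case).
    if last_seen is not None and last_seen != last_emitted:
        yield State(last_seen)
```

For example, `list(with_checkpoints([Record("a", 1, True), Record("b", 2), Record("c", 3)], every=2))` yields the three records followed by a single `State(3)`, with no duplicate state at the end of the stream.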
* Added support for periodic buffer flush with tests and uses env variable
* Improves code readability and encapsulates testing logic
* Removed demo changes and created const for tests
* Updated constructor to reuse method signature
* Increases Snowflake parallel integration forks
* Bumps version number, fixes linting issues and constant format
* Generate seed spec
* Split test for readability and increase waiting time as possible culprit of random failure
* Improve testDataContent() output and test all the types instead of stopping the test at the first one.
* Format and add documentation
* Adds additional logging when flushing buffer and writing records
* Removes logging for writeRecord since this will explode log lines
* Added logging when uploading records to stage/bucket
* Fixes log lines to properly capture when records have been uploaded
* Bumps version and fixes logging message to more accurately reflect logic
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Scripts to
* Run CATs against the local CDK for one connector
* Run CATs against the local CDK for multiple connectors
* Create a connector image with the local CDK
---------
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
* source-snowflake: use a safer method for parsing a BigInteger cursor value (#22358)
* use a safer method for parsing a BigInteger cursor value
* Add testing
* fix format change
* Fix failing integration tests
* Try removing the failing incremental test
* Try removing the failing incremental test
* Fix failing test
* Add metadata to connector logs (log level, class name, method name and line number) (#23105)
* Issue #17861 Add labels, class, method name and line numbers to connector logs
* Refactored unit test
* fix for warning about UTF8 charset in test class
---------
Co-authored-by: prateekmukhedkar <prateek@airbyte.io>
* This commit fixes the issue when permission is granted at ROLE level instead of USER level.
Missing revoke privileges in the tests.
* Change the query to recursively look for all roles assigned.
Also improve testing.
* Add test for subrole with replication access
* formatting
* Roles don't share attributes, only accesses.
That means that REPLICATION or SUPERUSER cannot be inherited by the user. Because of that, we need to give the user REPLICATION access directly.
* Bump versions and update alloydb docs
* Roles don't share attributes, only accesses.
That means that REPLICATION or SUPERUSER cannot be inherited by the user. Because of that, we need to give the user REPLICATION access directly.
* improve comment
* Typo
* Change from checking the permissions in `pg_users` to executing `createConnection` and verifying the connection is fine for CDC.
* Remove unneeded import
* format
* Rename ReplicationConnection class
* Revert "source-snowflake: use a safer method for parsing a BigInteger cursor value (#22358)"
This reverts commit e9efd9878a.
* Revert "Add metadata to connector logs (log level, class name, method name and line number) (#23105)"
This reverts commit a2c80a1fdb.
* Change ConfigError throw point
* Include in try to autoclose the connection
* Bump versions
* auto-bump connector version
* fix SSL failure on check
* format + undo spec changes
* auto-bump connector version
* Manual interaction for source definitions
---------
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: Prateek Mukhedkar <123108018+prateekmukhedkar@users.noreply.github.com>
Co-authored-by: prateekmukhedkar <prateek@airbyte.io>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: subodh <subodh1810@gmail.com>
* source-snowflake: use a safer method for parsing a BigInteger cursor value (#22358)
* use a safer method for parsing a BigInteger cursor value
* Add testing
* fix format change
* Fix failing integration tests
* Try removing the failing incremental test
* Try removing the failing incremental test
* Fix failing test
* Add metadata to connector logs (log level, class name, method name and line number) (#23105)
* Issue #17861 Add labels, class, method name and line numbers to connector logs
* Refactored unit test
* fix for warning about UTF8 charset in test class
---------
Co-authored-by: prateekmukhedkar <prateek@airbyte.io>
* Update docker image and release notes
* auto-bump connector version
* manually bump version on spec
---------
Co-authored-by: Prateek Mukhedkar <123108018+prateekmukhedkar@users.noreply.github.com>
Co-authored-by: prateekmukhedkar <prateek@airbyte.io>
Co-authored-by: Sergio Ropero <sergio@airbyte.io>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
* bump dbt-clickhouse to 1.4.0
* fix clickhouse integration test
* exclude duckdb from tests
* add to changelog
* bump normalization version in definitions
---------
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
* download latest connector catalog on build
* format
* Add small todos
* Add skeleton for local provider change
* Load catalog from local
* Remove seed class from localdefprovider
* Get catalog path from config
* refactor arguments out of download catalog
* Move to a constant file method
* Fix arguments
* Rename to CatalogDefinitionsConfig
* Add todos
* Update todo
* Apply formatting
* remove unnecessary try catch
* Refactor IconValidationTask
* Fix null value issue
* Only build oss catalog via github action
* Add LOCAL_CONNECTOR_CATALOG_PATH env var override
* Format
* Add logger
* Copy file directly to avoid circular dep
* Format
* Ensure run before tests
* Move airbyte:init:spec to general dependency
* LOCAL_CATALOG_PATH -> LOCAL_CONNECTOR_CATALOG_PATH
---------
Co-authored-by: Ben Church <ben@airbyte.io>
Support conversion of the airbyte-server-wrapped Cloud module to Micronaut
- Remove @Requires rule from controllers so they will be used in both OSS and Cloud
- Remove airbyte-commons-worker dependency from airbyte-server by moving shared code to new airbyte-commons-converters module
Co-authored-by: Davin Chia <davinchia@gmail.com>
* add airbyte-protocol to deps.toml
* use published protocol jar for platform
* use published protocol jar for connectors
* point at published jar
* fix dep
* bump gcs storage
* fix build failures in standard-source-test
* fix deps
* downgrade alloy db because it is missing strictness tests
* Revert "downgrade alloy db because it is missing strictness tests"
This reverts commit cc6089d053.
---------
Co-authored-by: cgardens <charles@airbyte.io>
* Catch state being empty
* Update test_two_sequential_reads to catch empty state on first read
* Add integration test of empty state
* Fix legacy state test
* Move state_name to variable
* Clean up
* Format
* Fix rogue test
* fix test
* remove unused var
* add converter into test
* use converters to convert client catalog to proto
* remove cdk related changes
* more cdk remove
* Minor format changes
* remove untrue comment
* Minor format changes
---------
Co-authored-by: Sergio Ropero <42538006+sergio-ropero@users.noreply.github.com>
Co-authored-by: Sergio Ropero <sergio@airbyte.io>
This is the first version of the DuckDB destination. There are potential edge cases that still need to be taken care of. But looking forward to your feedback.
* Revert "Normalization: handle non-object top-level schemas; treat binary data as string (#22165)"
This reverts commit 8276d03359.
* Revert "Normalization: check for ref type existence (#22161)"
This reverts commit dbe56d6fc2.
* Revert "🎉Updated normalization to handle new datatypes (#19721)"
This reverts commit c1d7736639.
* revert dest definitions
* also dockerfile
* re-add to changelog
* add comment in dockerfile
* api changes for writing discover catalog
* api changes
* format
* worker change 1
* change return type of the API to return catalogId
* worker to call api
* typo
* 🎉 Source GoogleSheets - migrated SAT to strictness level (#21399)
* migrated SAT to strictness level
* fixed expected records
* revert file from another source
* changed extension to txt
* changed extension to txt
* 🐛Destination-Bigquery: Added an explicit error message if sync fails due to a config issue (#21144)
* [19998] Destination-Bigquery: Added an explicit error message if sync fails due to a config issue
* ci-connector-ops: split workflows (#21474)
* CI: nightly build alpha sources and destinations (#21562)
* Revert "Change main class in strict-encrypt destination and bump versions on both destinations to keep them in sync (#21509)" (#21567)
This reverts commit 1d202d1707.
* Fixes webhook updating logic (#21519)
* ci_credentials: disable tooling test run by tox (#21580)
* disable tox
* rename steps
* revert changes on experimental workflow
* do not install tox
* Revert "CI: nightly build alpha sources and destinations (#21562)" (#21589)
This reverts commit 61f88f3013.
* Security update of default docker images (#21407)
Because there are a lot of CVEs in those releases.
Co-authored-by: Topher Lubaway <asimplechris@gmail.com>
* 📝 add docs for how to add normalization (#21563)
* add docs
* add schema link
* update based on feedback
* 🪟🚦 E2E tests: clean up matchers (#20887)
* improve serviceTypeDropdownOption selector
* add test ids to PathPopout component(s)
* add unique id's to table dropdowns
* extend submitButtonClick to support optional click options
* update dropdown(pathPopout) matchers
* add test-id to Overlay component
* remove redundant function brackets
* revert changes onSubmit button click
* fix dropDown overlay issue
* move all duplicated interceptors to beforeEach
* add test id's to Connections, Sources and Destinations tables
* add table helper functions
* update source page actions
* interceptor fixes
* update createTestConnection function with optional replication settings
* remove extra Connection name check
* replace "cypress-postgres" with "pg-promise" npm package
* update cypress config
* Revert "update createTestConnection function with optional replication settings"
This reverts commit 8e47c7837b.
* Revert "remove extra Connection name check"
This reverts commit dfb19c7dd4.
* replace openSourceDestinationFromGrid with specific selector
* replace openSourceDestinationFromGrid with specific selector
* turn on test
* add test-id's
* fix selectors
* update test
* update test snapshots
* fix lost data-testid after resolve merge conflicts
* remove extra check
* move clickOnCellInTable helper to common.ts file
* remove empty line and comments
* fix dropdownType
* replace partial string check with exact
* extract interceptors and waiters to separate file
* fix selector for predefined PK
* fix selector
* add comment regarding dropdown
* 🪟🎨 [Free connectors] Update modal copy (#21600)
* move start/end time options out of optional block (#21541)
* lingering fix
* reflecting api changes
* test fix
* worker to call api to do discover work
* recovered deleted html
* self review
* more converters refactor
* fix connector test
* fix test
* fix
* fix integration test
* add unit test for converter
* static fix
* api client needs to have a timeout in case a request does not get a response
---------
Co-authored-by: midavadim <midavadim@yahoo.com>
Co-authored-by: Eugene <etsybaev@gmail.com>
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: Greg Solovyev <grishick@users.noreply.github.com>
Co-authored-by: Yatsuk Bogdan <yatsukbogdan@gmail.com>
Co-authored-by: Hervé Commowick <github@herve.commowick.fr>
Co-authored-by: Topher Lubaway <asimplechris@gmail.com>
Co-authored-by: Pedro S. Lopez <pedroslopez@me.com>
Co-authored-by: Vladimir <volodymyr.s.petrov@globallogic.com>
Co-authored-by: Joey Marshment-Howell <josephkmh@users.noreply.github.com>
Co-authored-by: Lake Mossman <lake@airbyte.io>
* Refactor the job log json to include the docker_version
* Output to versioned folder
* Handle the case where people call the action without connector prefixed
* Retrieve status of each connector
* Use build report statuses in the QA Engine
* Cast build status as an enum
* Add Airbyte Protocol V1 support.
* Fix VersionedAirbyteStreamFactoryTest
* Remove AirbyteMessageMigrationV0 example
* Add Protocol Version constants
* 🎉Updated normalization to handle new datatypes (#19721)
* Updated normalization simple stream processing to handle new datatypes
* Updated normalization nested stream processing to handle new datatypes
* Updated normalization nested stream processing to handle new datatypes
* Updated normalization drop_scd_catalog processing to handle new datatypes
* Updated normalization ephemeral test processing to handle new datatypes
* fixed more tests for normalization
* fixed more tests for normalization
* fixed more tests for normalization
* fixed more tests for normalization
* fixed more issues
* fixed more issues (clickhouse)
* fixed more issues
* fixed more issues
* fixed more issues
* added binary type processing for some DBs
* cleared commented code and moved some hardcodes to processing as macro
* fixed codestyle and cleared commented code
* minor refactor
* minor refactor
* minor refactor
* fixed bool cast error
* fixed dict->str cast error
* fixed is_combining_node cast py check
* removed commented code
* removed commented code
* committed autogenerated normalization_test_output files
* committed autogenerated normalization_test_output files (new files)
* refactored utils.py
* Updated utils.py to use Callable functions and get rid of property_type in is_number and is_bool functions
* committed autogenerated normalization_test_output files (new files)
* fixed typo in TIMESTAMP_WITH_TIMEZONE_TYPE
* updated stream_processor to handle string type first as a wider type
* fixed arrays normalization by updating is_simple_property method as per new approaches
* format
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
* Update airbyte protocol migration (#20745)
* Extract MigrationContainer from AirbyteMessageMigrator
* Add ConfiguredAirbyteCatalogMigrations
* Add ConfiguredAirbyteCatalog to AirbyteMessageMigrations
* Enable ConfiguredAirbyteCatalog migration
* Fix tests
* Remove extra this.
* Add missing docs
* Typo
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
* Data types update: Implement protocol message migrations (#19240)
* Extract MigrationContainer from AirbyteMessageMigrator
* Add ConfiguredAirbyteCatalogMigrations
* Add ConfiguredAirbyteCatalog to AirbyteMessageMigrations
* Enable ConfiguredAirbyteCatalog migration
* set up scaffolding
* [wip] more scaffolding, basic unit test
* minimal green code
* [wip] add failing test for other primitive types
* correct version number
* handle basic primitive type decls
* add implicit cases
* add recursive schema
* formatting
* comment
* support not
* fix indentation
* handle all nested schema cases
* handle boolean schemas
* verify empty schema handling
* cleanup
* extract map
* code organization
* extract method
* reformat
* [wip] more tests, minor fix type array handling
* corrected test
* cleanup
* reformat
* switch to v1
* add support for multityped fields
* missed test case
* nested test class
* basic record upgrade
* implement record upgrades
* slight refactor
* comments+clarifications
* extract constants
* (partly) correct model classes
* add de/ser
* formatting
* extract constants
* fix json reference
* update docs
* switch to v1 models
* fix compile+test
* add base64 handling
* use vnull
* Data types update: Implement protocol message downgrade path (#19909)
* rough skeleton for passing catalog into migration
* basic test
* more scaffolding
* basic implementation
* add primitives test
* add in other tests (nested fields currently failing)
* add formats
* implement oneOf handling
* formatting
* oneOf handling
* better tests
* comments + organization
* progress
* basic test case
* downgrade objects, ish
* basic array implementation
* handle numeric failure
* test for new type
* handle array items
* empty schema handling
* first pass at oneof handling
* add more tests+handling
* more tests
* comments
* add empty oneof test case
* format + reorganize
* more reorganize
* fix name
* also downgrade binary data
* only import vnull
* move migrations into v1 package
* extract schema mutation code
* comment
* extract schema migration to new class
* extract record downgrade logic for future use
* format
* fix build after rebase
* rename private method for consistency
* also implement configuredcatalog migrations >.>
* quick and dirty tests
* slight cleanup
* fix tests
* pmd
* pmd test
* null check on message objects
* maybe fix acceptance tests?
* fix name
* extract constants
* more fixes
* tmp
* meh
* fix cdc acc tests
* revert to master source-postgres
* remove log messages
* revert other misc hacks
* integers are valid cursors
* remove unrelated change
* fix build
* fix build more?
* [MUST REVERT] use dev normalization
* capture kube logs
* also here?
* no debug logs?
* delete dup from merging
* add final everywhere
* revert test changes
Co-authored-by: Jimmy Ma <jimmy@airbyte.io>
* On-the-fly migrations of persisted catalogs (#21757)
* On the fly catalog migration for normalization activity
* On the fly catalog migration for job persistence
* On the fly migration for standard sync persistence
* On the fly migration for airbyte catalogs
* Refactor code to share JsonSchema traversal
* Add V0 Data type search function
* PMD and Format
* Fix getOrInsertActorCatalog and ConfigRepositoryE2E tests
* Null-proofing CatalogMigrationV1Helper
* More null checks
* Fix test
* Format
* Add data type v1 support to the FE
* Changes AC test check to check exited ps (#21672)
Some docker compose changes no longer show exited
processes. This broke our test.
This change should fix master.
Tested in a runner that failed.
* Move wellknown types mapping to the utility function
* use protocolv1 normalization
---------
Co-authored-by: Topher Lubaway <asimplechris@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
* Update protocol support range (#21996)
* bump normalization version to 0.3.0
* Add version check on normalization (#22048)
* Add normalization min version check
* Add visible for testing
---------
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: Eugene <etsybaev@gmail.com>
Co-authored-by: Topher Lubaway <asimplechris@gmail.com>
* 21908 Base Java S3: Update Avro TimeWithTimezone schema mapping
* 21908 Base Java S3: Formatting
* 21908 Base Java S3: fix integration test gcs + S3
* 21908 Base Java S3: fix unit test
* 21908 Base Java S3: fix format
* Checkpointing flush/commit and emit STATE message
* Fixed tests for SerializedBufferingStrategy
* Updates BigQuery to support checkpointing and consolidates method naming for uploading from staging (#21028)
* Updates BigQuery to support checkpointing and consolidates method naming for uploading from staging
* Updated messages to reflect method changes
* Updates createTable to include mimic replication by calling createPartitionTable and removes unused copyIntoTargetTable
* Updates the COPY INTO methods to match writing to table
* Fixed comments and non-executed path
* Fixed BufferedStreamConsumerTest to support new logic for checkpointing
* Removed cleanup logic that no longer applies with checkpointing changes
* Checkpointing flush/commit and emit STATE message
* Updates BigQuery to support checkpointing and consolidates method naming for uploading from staging (#21028)
* Updates BigQuery to support checkpointing and consolidates method naming for uploading from staging
* Updated messages to reflect method changes
* Updates createTable to include mimic replication by calling createPartitionTable and removes unused copyIntoTargetTable
* Updates the COPY INTO methods to match writing to table
* Fixed comments and non-executed path
* Resolved BigQuery partitioning tests and parameterized GCS Staging test
* Fixed review comments and bumps version number
* Definition generation