## What
Migrate the protocol messages to Pydantic V2 to speed up emitting records. This gives us a 2.5x boost over V1.
Close https://github.com/airbytehq/airbyte-internal-issues/issues/8333
## How
- Switch to the protocol models generated for pydantic_v2, published in a new (temporary) package, `airbyte-protocol-models-pdv2`.
- Update the CDK's pydantic dependency to v2 accordingly.
- To minimize impact, keep using the `pydantic.v1` compatibility module in all airbyte-cdk pydantic code that does not interact with the protocol models.
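
As a rough illustration of the last point, code that does not touch the protocol models can keep its V1-style models by importing from `pydantic.v1`. This is a minimal sketch, not code from this PR: `ExampleConnectorConfig` and its fields are hypothetical, and the `ImportError` fallback is an assumption for environments with an older pydantic 1.x that lacks the `pydantic.v1` module.

```python
# Compatibility import: under pydantic 2.x, `pydantic.v1` exposes the legacy V1 API.
try:
    from pydantic.v1 import BaseModel, Field, ValidationError
except ImportError:  # assumption: an older pydantic 1.x without the `pydantic.v1` module
    from pydantic import BaseModel, Field, ValidationError


class ExampleConnectorConfig(BaseModel):
    """Hypothetical connector config; not a real CDK class."""

    api_key: str = Field(..., description="Secret used to authenticate")
    page_size: int = Field(100, ge=1)


config = ExampleConnectorConfig(api_key="secret", page_size=50)

# V1-style methods such as .dict() keep working through the compatibility module.
config_dict = config.dict()

# V1-style validation still applies: page_size must be >= 1.
try:
    ExampleConnectorConfig(api_key="secret", page_size=0)
    rejected = False
except ValidationError:
    rejected = True
```

Only the import line changes for existing V1 models; the model definitions themselves stay as they were.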
## Review guide
1. Check out the code and clear your CDK virtual env (either `rm -rf .venv && python -m venv .venv` or `poetry env list; poetry env remove <env>`). This is necessary to fully clean out the `airbyte_protocol` library, for some reason. Then run `poetry lock --no-update && poetry install --all-extras`. This should install the CDK with the new models.
2. Run unit tests on the CDK
3. Take your favorite connector, point its `pyproject.toml` at the local CDK (see the example in `source-s3`), and try running its tests and its regression tests.
## User Impact
> [!warning]
> This is a major CDK change because the pydantic dependency changes. Connectors that use pydantic 1.10 will break and will need similar `from pydantic.v1` import updates to get running again. Therefore, we should release this as a major CDK version bump.
## Can this PR be safely reverted and rolled back?
- [x] YES 💚
- [ ] NO ❌
Even if sources migrate to this version, state format should not change, so a revert should be possible.
## Follow up work - Ella to move into issues
<details>
### Source-s3 - turn this into an issue
- [ ] Update source s3 CDK version and any required code changes
- [ ] Fix source-s3 unit tests
- [ ] Run source-s3 regression tests
- [ ] Merge and release source-s3 by June 21st
### Docs
- [ ] Update documentation on how to build with CDK
### CDK pieces
- [ ] Update file-based CDK format validation to use Pydantic V2
  - This is doable, but it requires a breaking change to `OneOfOptionConfig`. A few unhandled test cases raise issues that we are not yet sure how to handle.
- [ ] Update low-code component generators to use Pydantic V2
  - This is doable, but a few issues around custom component generation are still unhandled.
### Further CDK performance work - create issues for these
- [ ] Research if we can replace prints with buffered output (write to byte buffer and then flush to stdout)
- [ ] Replace `json` with `orjson`
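
For the buffered-output idea, one possible shape is sketched below using only the standard library. This is an assumption about how it could look, not an existing CDK function: `emit_buffered` and `flush_every` are hypothetical names, and the records are plain dicts rather than protocol messages.

```python
import io
import json
from typing import Any, BinaryIO, Iterable, Mapping


def emit_buffered(
    records: Iterable[Mapping[str, Any]],
    out: BinaryIO,
    flush_every: int = 1000,
) -> int:
    """Write records as JSON lines through an in-memory byte buffer.

    Hypothetical helper: accumulates up to `flush_every` serialized records,
    then writes them to `out` in one call. Returns the number of bytes written.
    """
    buffer = io.BytesIO()
    pending = 0
    written = 0
    for record in records:
        buffer.write(json.dumps(record).encode("utf-8"))
        buffer.write(b"\n")
        pending += 1
        if pending >= flush_every:
            data = buffer.getvalue()
            out.write(data)
            written += len(data)
            # Reset the buffer so it can be reused for the next batch.
            buffer.seek(0)
            buffer.truncate()
            pending = 0
    if pending:
        # Write whatever is left over after the last full batch.
        data = buffer.getvalue()
        out.write(data)
        written += len(data)
    out.flush()
    return written


# Demonstration against an in-memory sink instead of the real stdout.
sink = io.BytesIO()
total = emit_buffered(({"id": i} for i in range(5)), sink, flush_every=2)
```

In a real process the sink would be `sys.stdout.buffer`. Whether this beats per-record `print` calls is exactly what the research item would need to measure.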
...
</details>
* set test_read_limit_reached to true if we hit the max records limit
* rename slice to _slice to avoid shadowing a builtin name
* newline
* fix some of the typing issues
* fix some more typing issues
* another fix
* fix last typing issue
* format
* Automated Commit - Formatting Changes
* reset type
* fix the type
* Update for clarity
* Update types
---------
Co-authored-by: girarda <girarda@users.noreply.github.com>
* [ISSUE #27494] fix type issue caused by connector builder logging
* [ISSUE #21014] log request/response for oauth as 'global_requests'
* formatcdk
* [ISSUE #21014] support DeclarativeOauth2Authenticator as well
* [ISSUE #21014] improving message grouper tests
* formatcdk
* Test solution with logic in MessageRepository (#27990)
* Test solution with logic in MessageRepository
* Solution without creating a new ModelToComponentFactory
* [ISSUE #21014] adding tests
* [ISSUE #21014] add title and description to global requests
* Revert "Solution without creating a new ModelToComponentFactory"
This reverts commit f17799ecff.
* Automated Commit - Formatting Changes
* [ISSUE #21014] code review
* [ISSUE #21014] do not break on log appender conflict
* Automated Commit - Formatting Changes
* [ISSUE #21014] code review
* formatcdk
* [ISSUE #21014] moving is_global to is_auxiliary
* add the request filters and integration test fixtures
* pr feedback and some tweaks to the testing framework
* optimize the cache for more hits
* formatting
* remove cache
* Move condition for yielding the slice message to an overwritable method
* Automated Commit - Formatting Changes
* yield the slice log messages
* same for incremental
* refactor
* Revert "refactor"
This reverts commit c594365bd8.
* move flag from factory to source
* set the flag
* remove debug print
* halfmock
* clean up
* Add a test for a single page
* Add another test
* Pass the flag
* rename
---------
Co-authored-by: girarda <girarda@users.noreply.github.com>
* wip
* fix unit test
* fix other unit test
* format
* reset
* format
* missing unit test
* yield a LogMessage on error
* format
* format
* fix unit tests
* yield a trace message instead of a log message
* format
* fix bad merge
* enforce manifest version correctness against the CDK package being used
* parse versions into parts for better comparisons and error checking
* fix pr feedback and derp forgot to actually add the commit with the low-code manifests updated to the beta version
* pr feedback and fix new tests since last rebase
* New connector_builder module for handling requests from the Connector Builder.
Also implements `resolve_manifest` handler
* Automated Commit - Formatting Changes
* Rename ConnectorBuilderSource to ConnectorBuilderHandler
* Update source_declarative_manifest README
* Reorganize
* read records
* paste unit tests from connector builder server
* compiles but tests fail
* first test passes
* Second test passes
* 3rd test passes
* one more test
* another test
* one more test
* test
* return StreamRead
* test
* test
* rename
* test
* test
* test
* main seems to work
* Update
* Update
* Update
* Update
* update
* error message
* rename
* update
* Update
* CR improvements
* fix test_source_declarative_manifest
* fix tests
* Update
* Update
* Update
* Update
* rename
* rename
* rename
* format
* Give connector_builder its own main.py
* Update
* reset
* delete dead code
* remove debug print
* update test
* Update
* set right stream
* Add --catalog argument
* Remove unneeded preparse
* Update README
* handle error
* tests pass
* more explicit test
* reset
* format
* fix merge
* raise exception
* fix
* black format
* raise with config
* update
* fix flake
* __test_read_config is optional
* fix
* Automated Commit - Formatting Changes
* fix
* exclude_unset
---------
Co-authored-by: Catherine Noll <noll.catherine@gmail.com>
Co-authored-by: clnoll <clnoll@users.noreply.github.com>
Co-authored-by: girarda <girarda@users.noreply.github.com>