mirror of synced 2026-01-30 07:01:56 -05:00
Commit Graph

34 Commits

Author SHA1 Message Date
Ella Rohm-Ensing
fc12432305 airbyte-cdk: only update airbyte-protocol-models to pydantic v2 (#39524)
## What

Migrating to Pydantic V2 for protocol messages to speed up emitting records. This gives us a 2.5x boost over V1.

Close https://github.com/airbytehq/airbyte-internal-issues/issues/8333

## How
- Switch to using protocol models generated for pydantic_v2, in a new (temporary) package, `airbyte-protocol-models-pdv2`.
- Update the CDK's pydantic dependency to v2 accordingly.
- For minimal impact, still use the compatibility code `pydantic.v1` in all of our pydantic code from airbyte-cdk that does not interact with the protocol models.

## Review guide
1. Check out the code and clear your CDK virtual env (either `rm -rf .venv && python -m venv .venv` or `poetry env list; poetry env remove <env>`). This is necessary to fully clean out the `airbyte_protocol` library, for some reason. Then run `poetry lock --no-update && poetry install --all-extras`. This should install the CDK with the new models.
2. Run unit tests on the CDK
3. Take your favorite connector, point its `pyproject.toml` at the local CDK (see the example in `source-s3`), and try running its tests and its regression tests.

## User Impact

> [!warning]
> This is a major CDK change due to the pydantic dependency change - if connectors use pydantic 1.10, they will break and will need to do similar `from pydantic.v1` updates to get running again. Therefore, we should release this as a major CDK version bump.

## Can this PR be safely reverted and rolled back?
- [x] YES 💚
- [ ] NO 

Even if sources migrate to this version, state format should not change, so a revert should be possible.

## Follow up work - Ella to move into issues

<details>

### Source-s3 - turn this into an issue
- [ ] Update source s3 CDK version and any required code changes
- [ ] Fix source-s3 unit tests
- [ ] Run source-s3 regression tests
- [ ] Merge and release source-s3 by June 21st

### Docs
- [ ] Update documentation on how to build with CDK 

### CDK pieces
- [ ] Update file-based CDK format validation to use Pydantic V2
  - This is doable but requires a breaking change to `OneOfOptionConfig`. A few unhandled test cases present issues we're unsure how to handle so far.
- [ ] Update low-code component generators to use Pydantic V2
  - This is doable, but a few issues around custom component generation are still unhandled.

### Further CDK performance work - create issues for these
- [ ] Research if we can replace prints with buffered output (write to byte buffer and then flush to stdout)
- [ ] Replace `json` with `orjson`
...

</details>
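
The buffered-output idea from the follow-up list (write to a byte buffer, then flush to stdout) could look roughly like the sketch below. It is only an illustration under stated assumptions: `emit_buffered` and `flush_every` are hypothetical names, not CDK APIs, and the stdlib `json` module stands in for whatever serializer the CDK ends up using.

```python
import io
import json
import sys
from typing import BinaryIO, Iterable


def emit_buffered(records: Iterable[dict], out: BinaryIO, flush_every: int = 1000) -> int:
    """Write records as JSON lines to `out` in batches.

    Hypothetical helper: each record is serialized into an in-memory byte
    buffer, and the buffer is written to `out` once per `flush_every`
    records instead of once per record.
    """
    buffer = io.BytesIO()
    pending = 0
    total = 0
    for record in records:
        buffer.write(json.dumps(record).encode("utf-8"))
        buffer.write(b"\n")
        pending += 1
        total += 1
        if pending >= flush_every:
            out.write(buffer.getvalue())
            out.flush()
            # Reset the buffer so it can be reused for the next batch.
            buffer.seek(0)
            buffer.truncate()
            pending = 0
    if pending:
        # Flush whatever is left after the last full batch.
        out.write(buffer.getvalue())
        out.flush()
    return total


if __name__ == "__main__":
    # sys.stdout.buffer is the binary stream underneath sys.stdout.
    emit_buffered(({"id": i} for i in range(3)), sys.stdout.buffer)
```

Whether this actually beats per-record `print` calls would need to be measured, which is what the research item asks for.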
2024-06-21 01:53:44 +02:00
Patrick Nilan
18e82d949a [airbyte-cdk] - Integrate HttpClient into HttpRequester (#38906) 2024-06-18 01:03:15 +00:00
Anatolii Yatsuk
4863c9fbd6 fix(connector-builder): hide secrets in errors (#38759) 2024-05-30 15:06:28 +03:00
Alexandre Girard
86ee91ed5d Connector builder: read input state if it exists (#37495) 2024-04-24 15:53:09 -07:00
Alexandre Girard
a2e908dc17 connector builder: Set state on stream slices (#37109) 2024-04-18 16:16:02 -07:00
Ella Rohm-Ensing
b7819d9f6c python: assert actual == expected ordering (#36980) 2024-04-11 15:16:33 +00:00
Alexandre Girard
b27ddfe19e connector-builder: return full url-encoded URL instead of separating parameters (#36680) 2024-03-28 18:49:35 -07:00
Maxime Carbonneau-Leclerc
2f34f084e4 [ISSUE #6548] make all fields nullable except from pk and cursor field (#36201) 2024-03-20 09:48:38 -04:00
Alex Birdsall
385a70d89d Support user-specified test read limits in connector_builder code (#35312) 2024-02-16 15:53:26 -08:00
Artem Inzhyyants
9c6aea19cd Airbyte CDK: handle private network exception as config error (#33751) 2024-01-10 15:20:40 +01:00
Marius Posta
7ae97175a6 gradle: fix repo wide behaviour (#30607) 2023-09-28 05:01:13 -07:00
Maxime Carbonneau-Leclerc
b335880fda jira invalid user-provided urls generating sentry issues (#30672) 2023-09-21 15:01:17 -04:00
Joe Reuter
f8de9d12df CDK: Remove list endpoint (#29581) 2023-08-21 12:43:44 +02:00
Joe Reuter
df3b1d9c8d 🚨🚨 Low code CDK: Decouple SimpleRetriever and HttpStream (#28657)
* fix tests

* format

* review comments

* Automated Commit - Formatting Changes

* review comments

* review comments

* review comments

* log all messages

* log all message

* review comments

* review comments

* Automated Commit - Formatting Changes

* add comment

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-08-03 12:30:59 +02:00
Maxime Carbonneau-Leclerc
48bf520d87 Fix stream read given stream doesn't have any slice (#28746)
* Fix stream read given stream doesn't have any slice

* Not return slices if there are none

* Fix test
2023-07-27 10:05:35 -04:00
Alexandre Girard
3ae73fb0ff connector builder: Set test_read_limit_reached to true if we hit the max records limit (#28293)
* set test_read_limit_reached to true if we hit the max records limit

* rename slice to _slice to avoid shadowing the `slice` builtin

* newline

* fix some of the typing issues

* fix some more typing issues

* another fix

* fix last typing issue

* format

* Automated Commit - Formatting Changes

* reset type

* fix the type

* Update for clarity

* Update types

---------

Co-authored-by: girarda <girarda@users.noreply.github.com>
2023-07-18 15:53:53 -07:00
Maxime Carbonneau-Leclerc
df2a6e50bb Issue 21014/oauth requests (#27973)
* [ISSUE #27494] fix type issue caused by connector builder logging

* [ISSUE #21014] log request/response for oauth as 'global_requests'

* formatcdk

* [ISSUE #21014] support DeclarativeOauth2Authenticator as well

* [ISSUE #21014] improving message grouper tests

* formatcdk

* Test solution with logic in MessageRepository (#27990)

* Test solution with logic in MessageRepository

* Solution without creating a new ModelToComponentFactory

* [ISSUE #21014] adding tests

* [ISSUE #21014] add title and description to global requests

* Revert "Solution without creating a new ModelToComponentFactory"

This reverts commit f17799ecff.

* Automated Commit - Formatting Changes

* [ISSUE #21014] code review

* [ISSUE #21014] do not break on log appender conflict

* Automated Commit - Formatting Changes

* [ISSUE #21014] code review

* formatcdk

* [ISSUE #21014] moving is_global to is_auxiliary
2023-07-11 13:37:38 -04:00
Brian Lai
02e4bd07f7 [26989] Add request filter for cloud and integration test fixtures for e2e sync testing (#27534)
* add the request filters and integration test fixtures

* pr feedback and some tweaks to the testing framework

* optimize the cache for more hits

* formatting

* remove cache
2023-06-22 12:14:07 -04:00
Maxime Carbonneau-Leclerc
f48849fdb4 [ISSUE #26909] adding message repository (#27158)
* [ISSUE #26909] adding message repository

* Automated Commit - Formatting Changes

* [ISSUE #26909] improve entrypoint error handling

* format CDK

* [ISSUE #26909] adding an integration test
2023-06-13 08:40:55 -04:00
Joe Reuter
d6512dea2c CDK: Datetime format inferrer (#27071)
* datetime inferrer class

* format

* pass inferred date formats along

* review comments
2023-06-09 10:33:54 +02:00
Maxime Carbonneau-Leclerc
4625cef571 [ISSUE #26909] add latest connector config control message to connect… (#26922)
* [ISSUE #26909] add latest connector config control message to connector builder API

* [ISSUE #26909] flake

* Automated Commit - Formatting Changes

* [ISSUE #26909] fallback on in-memory dict if no config control message

* [ISSUE #26909] update and add tests
2023-06-07 08:31:45 -04:00
Maxime Carbonneau-Leclerc
d54a68640f Improving error messages to have better messaging in datadog and the … (#26860)
* Improving error messages to have better messaging in datadog and the frontend

* fixing tests
2023-05-31 15:36:27 -04:00
Maxime Carbonneau-Leclerc
0efc18a114 [ISSUE #24720] connector builder set slice descriptor (#25677) 2023-05-01 12:18:22 -04:00
Maxime Carbonneau-Leclerc
3cc67a6d9e [ISSUE #23382] ignore backoff configuration on test reads (#25429) 2023-04-26 08:36:59 -04:00
Maxime Carbonneau-Leclerc
4d65fa1b98 [ISSUE #23994] make MessageGrouper use AirbyteEntrypoint (#25402)
* [ISSUE #23994] make MessageGrouper use AirbyteEntrypoint

* [ISSUE #23994] code review
2023-04-24 11:24:15 -04:00
Alexandre Girard
f3799280f2 connector builder: Emit message at start of slice (#25180)
* Move condition for yielding the slice message to an overwritable method

* Automated Commit - Formatting Changes

* yield the slice log messages

* same for incremental

* refactor

* Revert "refactor"

This reverts commit c594365bd8.

* move flag from factory to source

* set the flag

* remove debug print

* halfmock

* clean up

* Add a test for a single page

* Add another test

* Pass the flag

* rename

---------

Co-authored-by: girarda <girarda@users.noreply.github.com>
2023-04-14 10:23:59 -07:00
Alexandre Girard
71fc3dd517 Connector builder: set pages and slices limits (#25121)
* Set limits

* refactor and add unit tests

* Update as per comments
2023-04-12 14:46:43 -07:00
Alexandre Girard
edfc59533d Connector builder: Port "send stacktrace when error on read" to CDK connector builder module (#24173)
* wip

* fix unit test

* fix other unit test

* format

* reset

* format

* missing unit test

* yield a LogMessage on error

* format

* format

* fix unit tests

* yield a trace message instead of a log message

* format

* fix bad merge
2023-03-21 17:22:08 -07:00
Catherine Noll
f4fd4d98a2 Connector Builder: Make connector_builder part of the CDK package (#24280) 2023-03-21 13:31:16 -04:00
Maxime Carbonneau-Leclerc
98719cf3f3 [ISSUE #23794] CDK's read command handler supports Connector Builder … (#24204)
* [ISSUE #23794] CDK's read command handler supports Connector Builder list_streams requests

* [ISSUE #23794] code review
2023-03-21 09:01:33 -04:00
Catherine Noll
e890d01d55 Connector builder: handle empty catalog (#24184) 2023-03-17 12:51:10 -04:00
Brian Lai
903d34e5f1 [Low-Code CDK] Enforce manifest against the airbyte-cdk version and the Beta version 0.29.0 (#23796)
* enforce manifest version correctness against the CDK package being used

* parse versions into parts for better comparisons and error checking

* fix pr feedback and derp forgot to actually add the commit with the low-code manifests updated to the beta version

* pr feedback and fix new tests since last rebase
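
As a rough illustration of "parse versions into parts for better comparisons", the sketch below splits a `major.minor.patch` string into an integer tuple so versions compare numerically rather than as strings. The function names and the exact rule (the manifest version must be at least 0.29.0 and no newer than the installed CDK) are assumptions made for this example, not the CDK's actual implementation.

```python
from typing import Tuple

# Assumed lower bound, taken from the Beta version named in the commit title.
MINIMUM_MANIFEST_VERSION = "0.29.0"


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split 'major.minor.patch' into a tuple of ints (hypothetical helper)."""
    parts = version.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"Expected 'major.minor.patch', got {version!r}")
    try:
        major, minor, patch = (int(part) for part in parts)
    except ValueError as exc:
        raise ValueError(f"Non-numeric version part in {version!r}") from exc
    return major, minor, patch


def check_manifest_version(manifest_version: str, cdk_version: str) -> None:
    """Raise ValueError if the manifest version falls outside the assumed range."""
    manifest = parse_version(manifest_version)
    if manifest < parse_version(MINIMUM_MANIFEST_VERSION):
        raise ValueError(
            f"Manifest version {manifest_version} is older than {MINIMUM_MANIFEST_VERSION}"
        )
    if manifest > parse_version(cdk_version):
        raise ValueError(
            f"Manifest version {manifest_version} is newer than CDK version {cdk_version}"
        )
```

Comparing tuples avoids the string-comparison trap where `"0.100.0" < "0.29.0"` evaluates to true.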
2023-03-16 00:50:30 -04:00
Alexandre Girard
bb5741a0c0 Connector builder: support for test read with message grouping per slices (#23925)
* New connector_builder module for handling requests from the Connector Builder.

Also implements `resolve_manifest` handler

* Automated Commit - Formatting Changes

* Rename ConnectorBuilderSource to ConnectorBuilderHandler

* Update source_declarative_manifest README

* Reorganize

* read records

* paste unit tests from connector builder server

* compiles but tests fail

* first test passes

* Second test passes

* 3rd test passes

* one more test

* another test

* one more test

* test

* return StreamRead

* test

* test

* rename

* test

* test

* test

* main seems to work

* Update

* Update

* Update

* Update

* update

* error message

* rename

* update

* Update

* CR improvements

* fix test_source_declarative_manifest

* fix tests

* Update

* Update

* Update

* Update

* rename

* rename

* rename

* format

* Give connector_builder its own main.py

* Update

* reset

* delete dead code

* remove debug print

* update test

* Update

* set right stream

* Add --catalog argument

* Remove unneeded preparse

* Update README

* handle error

* tests pass

* more explicit test

* reset

* format

* fix merge

* raise exception

* fix

* black format

* raise with config

* update

* fix flake

* __test_read_config is optional

* fix

* Automated Commit - Formatting Changes

* fix

* exclude_unset

---------

Co-authored-by: Catherine Noll <noll.catherine@gmail.com>
Co-authored-by: clnoll <clnoll@users.noreply.github.com>
Co-authored-by: girarda <girarda@users.noreply.github.com>
2023-03-15 17:12:37 -07:00
Catherine Noll
8ee32b1132 New connector_builder module for handling requests from the Connector Builder (#23888)
Also implements `resolve_manifest` handler
2023-03-14 13:51:27 -04:00