mirror of synced 2025-12-26 14:02:10 -05:00
Commit Graph

317 Commits

Author SHA1 Message Date
Anatolii Yatsuk
fb769bd0a0 feat(airbyte-cdk): Add Per Partition with Global fallback Cursor (#45125) 2024-10-28 14:03:41 +02:00
Anatolii Yatsuk
569ed5c45a fix(airbyte-cdk): Fix yielding parent records in SubstreamPartitionRouter (#46918) 2024-10-18 16:19:29 +03:00
Maxime Carbonneau-Leclerc
5fb3fc24f3 Add warning on AsyncRetriever (#46952) 2024-10-16 21:39:33 -04:00
Chandler Prall
f8ac7f9fd7 fix: typo in XmlDecoder description (#46714) 2024-10-10 09:44:34 -06:00
Anatolii Yatsuk
7ed64be86e feat(airbyte-cdk): Add extra fields to StreamSlice (#46311) 2024-10-10 14:49:52 +03:00
Patrick Nilan
5381fbab8d [airbyte-cdk] - removes class_types_registry and default_implementation_registry (#46693) 2024-10-09 16:40:03 -07:00
Patrick Nilan
9249347736 [airbyte-cdk] - Add XmlDecoder component to low code CDK (#46360) 2024-10-09 11:18:32 -07:00
Brian Lai
c2923bd095 [concurrent low-code] Add concurrency_level to manifest and allow it to be parsed into a runtime object (#45943) 2024-10-08 17:04:11 -04:00
Patrick Nilan
99f94674f6 [airbyte-cdk] - Consolidate decoder selection in low-code CDK (#46313) 2024-10-07 15:16:49 -07:00
Maxime Carbonneau-Leclerc
b0cac50b9e feat(airbyte-cdk) Async jobs - Limit memory usage (#46286) 2024-10-02 08:32:52 -04:00
Maxime Carbonneau-Leclerc
0e7f3bcdff feat(airbyte-cdk) - Async job salesforce (#45673) 2024-10-01 08:48:44 -04:00
Brian Lai
199a8078f2 [airbyte-cdk] Decouple request_options_provider from datetime_based_cursor + concurrent_cursor features for low-code (#45413) 2024-09-17 14:06:41 -04:00
Maxime Carbonneau-Leclerc
6baf254b5d feat(cdk): add async job components (#45178) 2024-09-10 08:59:12 -04:00
Artem Inzhyyants
e3ce82e476 feat(airbyte-cdk): add global_state => per_partition transformation (#45122)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-09-09 22:29:00 +02:00
Serhii Lazebnyi
41b858514d fix(connector-builder): add flag to disable cache (#45095) 2024-09-09 21:51:56 +02:00
Anatolii Yatsuk
2fa35ab30b feat(airbyte-cdk): Add Global Parent State Cursor (#39593) 2024-09-06 16:44:34 +03:00
Anatolii Yatsuk
03b7e1ad22 feat(airbyte-cdk): Add limitation for number of partitions to PerPartitionCursor (#42406) 2024-09-06 14:55:14 +03:00
Artem Inzhyyants
df34893b63 feat(airbyte-cdk): replace pydantic BaseModel with dataclasses + serpyco-rs in protocol (#44444)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-09-02 17:48:17 +02:00
Artem Inzhyyants
7644dcd2a3 feat(airbyte-cdk): use orjson to speed up parsing (#44829)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-08-29 21:16:23 +02:00
Tomasz Szuba
9e35a88bbc Improve performance of interpolation in declarative sources (#44027)
Co-authored-by: Natik Gadzhi <natik@respawn.io>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2024-08-21 17:27:13 -07:00
Baz
7b2e0be012 🎉 CDK (Low-Code): Add RFR for Sub-streams (Low-code) (#42974) 2024-08-19 12:40:02 +03:00
Maxime Carbonneau-Leclerc
61c07e8bf6 feat(airbyte-cdk): Have better fallback error message on HTTP error (#43399) 2024-08-12 20:43:23 -04:00
Maxime Carbonneau-Leclerc
f8dfb52af9 fix(python-cdk): Ensure at least one element returned by decoder (#43043) 2024-08-05 10:41:25 -04:00
Brian Lai
197cb810b0 [RFR for API Sources] Add SubstreamResumableFullRefreshCursor to the Python CDK (#42429) 2024-08-01 16:39:14 -04:00
Maxime Carbonneau-Leclerc
36a6f35a61 feat(airbyte-cdk) Align BackoffStrategy interfaces to take attempt_count as a full-fled… (#42889) 2024-07-31 11:41:15 -04:00
Maxime Carbonneau-Leclerc
9a1520bd58 feat(airbyte-cdk) Add ability to stop stream when retry-after is greater than a duration (#42865) 2024-07-30 09:15:11 -04:00
Artem Inzhyyants
328be4b565 fix(airbyte-cdk): fix declarative schema refs (#42844)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-07-29 12:36:32 +02:00
Natik Gadzhi
813ad995f6 feat(python-cdk): add description to declarative source schema (#42392)
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2024-07-25 14:26:35 -07:00
Serhii Lazebnyi
a066968ff5 fix[low-code] follow up #38829 (#42475) 2024-07-24 13:47:41 +02:00
Artem Inzhyyants
f5e5a9768b fix(airbyte-cdk): fix OOM on predicate for streamable responses (#42448)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-07-23 21:41:03 +02:00
Artem Inzhyyants
5056e67826 refactor!(airbyte-cdk): deprecate availability strategy (#42039)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-07-23 10:22:11 +02:00
Artem Inzhyyants
3cfe199b6b feat(airbyte-cdk): add new Decoders: JsonlDecoder and IterableDecoder (#38829)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
Co-authored-by: Natik Gadzhi <natik@respawn.io>
2024-07-19 13:45:12 +02:00
Serhii Lazebnyi
da51b20a88 [low-code cdk]: fix overwrite for default backoff strategy (#42048) 2024-07-18 19:21:28 +02:00
Brian Lai
b2d53f552d [RFR for Python Sources] Make it easier for Python sources to automatically use RFR for eligible streams (#39450) 2024-07-18 01:17:14 -04:00
Serhii Lazebnyi
39b8e3da19 [airbyte-cdk] not exiting when rate limited (#41333) 2024-07-17 00:56:59 +02:00
Daryna Ishchenko
a49e779c59 feat(airbyte-cdk): add failure_type to HttpResponseFilter (raise config error in low-code) (#40676) 2024-07-16 16:50:29 +03:00
Brian Lai
9e23b3f89b 🐛 [airbyte-cdk] Fix bug where substreams depending on an RFR parent stream don't paginate or use existing state (#40671) 2024-07-11 02:53:20 -04:00
Artem Inzhyyants
02c5f59ccf ref(airbyte-cdk): use http_client inside HttpStream (#39811)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-07-09 12:01:03 +02:00
Cristina Mariscal
c4b8212ba7 CDK: Add support for input format parsing at jinja macro format_datetime (#40759)
Co-authored-by: cristina.mariscal <cristina.mariscal@cristina.mariscal--MacBook-Pro---DFJ27FJFXX>
2024-07-08 08:42:09 +00:00
Cristina Mariscal
a8e985b7a0 Revert "CDK: Add jinja macro format_datetime_string" (#40747) 2024-07-05 15:04:32 +00:00
Cristina Mariscal
b9c213a473 CDK: Add jinja macro format_datetime_string (#40744) 2024-07-05 13:19:43 +00:00
Natik Gadzhi
4a06230436 feat(python cdk): Allow regex_search in jinja interpolations (#40696)
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2024-07-03 23:21:51 +00:00
Maxime Carbonneau-Leclerc
2b7ef3fb25 Validate error handler fallback (#40570)
Co-authored-by: Serhii Lazebnyi <serhii.lazebnyi@globallogic.com>
2024-06-27 17:03:03 -04:00
Ella Rohm-Ensing
fc12432305 airbyte-cdk: only update airbyte-protocol-models to pydantic v2 (#39524)
## What

Migrate the protocol messages to Pydantic V2 to speed up emitting records. This gives us a 2.5x boost over V1.

Close https://github.com/airbytehq/airbyte-internal-issues/issues/8333

## How
- Switch to using protocol models generated for pydantic_v2, in a new (temporary) package, `airbyte-protocol-models-pdv2` .
- Update pydantic dependency of the CDK accordingly to v2.
- For minimal impact, still use the compatibility code `pydantic.v1` in all of our pydantic code from airbyte-cdk that does not interact with the protocol models.

## Review guide
1. Check out the code and clear your CDK virtual env (either `rm -rf .venv && python -m venv .venv` or `poetry env list; poetry env remove <env>`). This is necessary to fully clean out the `airbyte_protocol` library, for some reason. Then run `poetry lock --no-update && poetry install --all-extras`. This should install the CDK with the new models.
2. Run the unit tests on the CDK.
3. Take your favorite connector, point its `pyproject.toml` at the local CDK (see the example in `source-s3`), and try running its tests and its regression tests.

## User Impact

> [!warning]
> This is a major CDK change due to the pydantic dependency change - if connectors use pydantic 1.10, they will break and will need to do similar `from pydantic.v1` updates to get running again. Therefore, we should release this as a major CDK version bump.

## Can this PR be safely reverted and rolled back?
- [x] YES 💚
- [ ] NO 

Even if sources migrate to this version, state format should not change, so a revert should be possible.

## Follow up work - Ella to move into issues

<details>

### Source-s3 - turn this into an issue
- [ ] Update source s3 CDK version and any required code changes
- [ ] Fix source-s3 unit tests
- [ ] Run source-s3 regression tests
- [ ] Merge and release source-s3 by June 21st

### Docs
- [ ] Update documentation on how to build with CDK 

### CDK pieces
- [ ] Update file-based CDK format validation to use Pydantic V2
  - This is doable, and requires a breaking change to change `OneOfOptionConfig`. There are a few unhandled test cases that present issues we're unsure of how to handle so far.
- [ ] Update low-code component generators to use Pydantic V2
  - This is doable, but a few issues around custom component generation remain unhandled.

### Further CDK performance work - create issues for these
- [ ] Research if we can replace prints with buffered output (write to byte buffer and then flush to stdout)
- [ ] Replace `json` with `orjson`
...

</details>
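
The buffered-output idea in the follow-up list (write to a byte buffer, then flush to stdout) could look roughly like the sketch below. It is an illustration only, not code from the CDK: `emit_records` and `flush_threshold` are hypothetical names, and the standard-library `json` module stands in for whatever serializer is actually used.

```python
import io
import json
import sys
from typing import Any, BinaryIO, Iterable, Mapping


def emit_records(
    records: Iterable[Mapping[str, Any]],
    out: BinaryIO,
    flush_threshold: int = 64 * 1024,  # hypothetical default, in bytes
) -> int:
    """Write records as JSON lines through an in-memory byte buffer.

    Returns the number of writes made to `out`, which is what buffering
    is meant to reduce compared with one print() per record.
    """
    buffer = io.BytesIO()
    writes = 0
    for record in records:
        buffer.write(json.dumps(record).encode("utf-8"))
        buffer.write(b"\n")
        # Push the accumulated bytes downstream once the buffer is large enough.
        if buffer.tell() >= flush_threshold:
            out.write(buffer.getvalue())
            writes += 1
            buffer.seek(0)
            buffer.truncate()
    # Write whatever is left after the last record.
    if buffer.tell():
        out.write(buffer.getvalue())
        writes += 1
    out.flush()
    return writes


if __name__ == "__main__":
    sample = ({"type": "RECORD", "record": {"id": i}} for i in range(3))
    emit_records(sample, sys.stdout.buffer)
```

Writing to `sys.stdout.buffer` bypasses the text layer, so the bytes are encoded once; whether this is faster than plain `print` for real connector workloads is exactly what the follow-up item proposes to research.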
2024-06-21 01:53:44 +02:00
Augustin
6d42ecafb0 Augustin/protocolv2 (#39863)
## What

Separate the `datamodel-codegen` workflow out into a dagger workflow. This lets us, upstack, properly generate the same v1 models as before. Unfortunately, the "pydantic v1" output of datamodel-codegen's v2 versions is not what one would expect; see [issue](https://github.com/koxudaxi/datamodel-code-generator/issues/1950) (thanks AJ!).

## How
* Convert the script from bash to python (in dagger) and run it via a shell script (to install dagger)

## User Impact
None. The development experience is also unchanged.

## Can this PR be safely reverted and rolled back?
- [x] YES 💚
- [ ] NO 
2024-06-21 01:36:43 +02:00
Serhii Lazebnyi
a284676a4d feat(airbyte-cdk): add DatetimeIntervalCursor (#39603) 2024-06-20 01:11:13 +02:00
Maxime Carbonneau-Leclerc
0386ca21ae Exclude airbyte-cdk modules from schema discovery (#39586) 2024-06-19 09:29:45 -04:00
Maxime Carbonneau-Leclerc
5b4bd3485d Allow access to _partition for source-jira (#39576) 2024-06-18 21:47:30 -04:00
Maxime Carbonneau-Leclerc
7d56e19ac7 Improve error message on state initialization (#39553) 2024-06-18 12:58:11 -04:00
Artem Inzhyyants
46f8d4e2fc fix(airbyte-cdk): client_side_incremental fix end_datetime comparison (#38874)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-06-18 11:35:42 +02:00