mirror of synced 2026-01-20 12:07:14 -05:00
Commit Graph

231 Commits

Author SHA1 Message Date
Brian Lai
9e23b3f89b 🐛 [airbyte-cdk] Fix bug where substreams depending on an RFR parent stream don't paginate or use existing state (#40671) 2024-07-11 02:53:20 -04:00
Artem Inzhyyants
02c5f59ccf ref(airbyte-cdk): use http_client inside HttpStream (#39811)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-07-09 12:01:03 +02:00
Cristina Mariscal
c4b8212ba7 CDK: Add support for input format parsing at jinja macro format_datetime (#40759)
Co-authored-by: cristina.mariscal <cristina.mariscal@cristina.mariscal--MacBook-Pro---DFJ27FJFXX>
2024-07-08 08:42:09 +00:00
Cristina Mariscal
a8e985b7a0 Revert "CDK: Add jinja macro format_datetime_string" (#40747) 2024-07-05 15:04:32 +00:00
Cristina Mariscal
b9c213a473 CDK: Add jinja macro format_datetime_string (#40744) 2024-07-05 13:19:43 +00:00
Natik Gadzhi
4a06230436 feat(python cdk): Allow regex_search in jinja interpolations (#40696)
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2024-07-03 23:21:51 +00:00
Boris Staal
6dd9b7ab25 chore(cdk): Avoid using time.sleep in unit tests for backoff of http stream (#40239)
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2024-07-02 11:47:56 -07:00
Maxime Carbonneau-Leclerc
2b7ef3fb25 Validate error handler fallback (#40570)
Co-authored-by: Serhii Lazebnyi <serhii.lazebnyi@globallogic.com>
2024-06-27 17:03:03 -04:00
Ella Rohm-Ensing
fc12432305 airbyte-cdk: only update airbyte-protocol-models to pydantic v2 (#39524)
## What

Migrating the protocol messages to Pydantic V2 to speed up emitting records. This gives us a 2.5x boost over V1.

Close https://github.com/airbytehq/airbyte-internal-issues/issues/8333

## How
- Switch to using protocol models generated for pydantic_v2, in a new (temporary) package, `airbyte-protocol-models-pdv2`.
- Update pydantic dependency of the CDK accordingly to v2.
- For minimal impact, still use the compatibility code `pydantic.v1` in all of our pydantic code from airbyte-cdk that does not interact with the protocol models.

## Review guide
1. Check out the code and clear your CDK virtual env (either `rm -rf .venv && python -m venv .venv` or `poetry env list; poetry env remove <env>`). This is necessary to fully clean out the `airbyte_protocol` library, for some reason. Then run `poetry lock --no-update && poetry install --all-extras`. This should install the CDK with the new models.
2. Run unit tests on the CDK
3. Take your favorite connector, point its `pyproject.toml` at the local CDK (see the example in `source-s3`), and try running its tests and its regression tests.

## User Impact

> [!warning]
> This is a major CDK change due to the pydantic dependency change - if connectors use pydantic 1.10, they will break and will need to do similar `from pydantic.v1` updates to get running again. Therefore, we should release this as a major CDK version bump.

## Can this PR be safely reverted and rolled back?
- [x] YES 💚
- [ ] NO 

Even if sources migrate to this version, state format should not change, so a revert should be possible.

## Follow up work - Ella to move into issues

<details>

### Source-s3 - turn this into an issue
- [ ] Update source s3 CDK version and any required code changes
- [ ] Fix source-s3 unit tests
- [ ] Run source-s3 regression tests
- [ ] Merge and release source-s3 by June 21st

### Docs
- [ ] Update documentation on how to build with CDK 

### CDK pieces
- [ ] Update file-based CDK format validation to use Pydantic V2
  - This is doable, and requires a breaking change to change `OneOfOptionConfig`. There are a few unhandled test cases that present issues we're unsure of how to handle so far.
- [ ] Update low-code component generators to use Pydantic V2
  - This is doable, there are a few issues around custom component generation that are unhandled.

### Further CDK performance work - create issues for these
- [ ] Research if we can replace prints with buffered output (write to byte buffer and then flush to stdout)
- [ ] Replace `json` with `orjson`
...

</details>
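
The buffered-output idea listed under the performance follow-ups (write to a byte buffer, then flush to stdout) could look roughly like the sketch below. This is an illustration only, not code from the CDK: `emit_records` and `flush_every` are hypothetical names, and the standard `json` module stands in for whatever serializer is actually used.

```python
import io
import json
from typing import BinaryIO, Iterable


def emit_records(records: Iterable[dict], out: BinaryIO, flush_every: int = 1000) -> int:
    """Serialize records into an in-memory byte buffer and write them to `out` in batches.

    Hypothetical helper: batching replaces one write per record with one
    write per `flush_every` records. Returns the number of records emitted.
    """
    buffer = io.BytesIO()
    pending = 0
    total = 0
    for record in records:
        # One JSON document per line, encoded once into the buffer.
        buffer.write(json.dumps(record).encode("utf-8"))
        buffer.write(b"\n")
        pending += 1
        total += 1
        if pending >= flush_every:
            out.write(buffer.getvalue())
            out.flush()
            buffer = io.BytesIO()
            pending = 0
    # Write whatever is left over from the last partial batch.
    if pending:
        out.write(buffer.getvalue())
        out.flush()
    return total
```

To write to standard output, pass `sys.stdout.buffer` as `out`. Whether batching helps in practice depends on how the output is consumed, which is what the research item is meant to establish.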
2024-06-21 01:53:44 +02:00
Serhii Lazebnyi
a284676a4d feat(airbyte-cdk): add DatetimeIntervalCursor (#39603) 2024-06-20 01:11:13 +02:00
Maxime Carbonneau-Leclerc
0386ca21ae Exclude airbyte-cdk modules from schema discovery (#39586) 2024-06-19 09:29:45 -04:00
Maxime Carbonneau-Leclerc
5b4bd3485d Allow access to _partition for source-jira (#39576) 2024-06-18 21:47:30 -04:00
Maxime Carbonneau-Leclerc
7d56e19ac7 Improve error message on state initialization (#39553) 2024-06-18 12:58:11 -04:00
Artem Inzhyyants
46f8d4e2fc fix(airbyte-cdk): client_side_incremental fix end_datetime comparison (#38874)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-06-18 11:35:42 +02:00
Patrick Nilan
18e82d949a [airbyte-cdk] - Integrate HttpClient into HttpRequester (#38906) 2024-06-18 01:03:15 +00:00
Patrick Nilan
7a639660cf [airbyte-cdk] Updates Low Code CDK ErrorHandlers and BackoffStrategies to align with Python CDK Interfaces (#38743) 2024-06-17 17:41:14 -07:00
Anatolii Yatsuk
258603e907 low-code: Add Incremental Parent State Handling to SubstreamPartitionRouter (#38211) 2024-06-14 10:57:42 -07:00
Anatolii Yatsuk
2da71a3707 feat(low-code): add new format float_s (#38869)
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2024-06-05 12:34:20 +03:00
Natik Gadzhi
8b82caa4df [airbyte-cdk] Fix dpath.util.* deprecation warnings (#38847)
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2024-06-03 12:51:48 -07:00
Natik Gadzhi
dfd61c52ff [airbyte-cdk] Python 3.11 dataclass compatibility (#38846) 2024-06-03 12:14:36 -07:00
Artem Inzhyyants
b9a421ba15 feat(airbyte-cdk): add client side incremental sync (#38099)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-06-03 17:20:34 +02:00
Brian Lai
040f1415e5 [low-code CDK] Resumable full refresh support for low-code streams (#38300) 2024-05-22 16:23:31 -04:00
Alexandre Girard
fb11ca22fe low-code: Yield records from generators instead of keeping them in in-memory lists (#36406) 2024-05-14 18:00:03 -07:00
Brian Lai
8fdd9818ec [airbyte-cdk] Promote low-code types and cursor interface into Python CDK (#38077) 2024-05-13 15:51:50 -04:00
Artem Inzhyyants
5fe60b7fb8 Airbyte CDK: use pytz.utc instead of datetime.utc (#38026)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-05-09 13:01:46 +02:00
Brian Lai
d74125bf10 [RFR for API Sources] New Python interfaces to support resumable full refresh (#37429) 2024-05-06 18:41:29 -04:00
Patrick Nilan
b20cd1bd1d airbyte-cdk - Adds JwtAuthenticator to low-code (#37005) 2024-04-19 09:17:54 -07:00
Ella Rohm-Ensing
b7819d9f6c python: assert actual == expected ordering (#36980) 2024-04-11 15:16:33 +00:00
Serhii Lazebnyi
033decc8c2 add backward compatibility for an old close slice logic (#36774) 2024-04-03 03:13:22 +02:00
Alexandre Girard
7676892ca9 low-code: Fix cursor pagination instantiation if the stop_condition is a string (#36760) 2024-04-02 14:03:27 -07:00
Alexandre Girard
4af69fc20d low-code: Add last_page_size and last_record to pagination context (#36408) 2024-04-02 10:43:30 -07:00
Serhii Lazebnyi
604a2dfee8 fix wrong partition key definition after legacy state migration (#36719) 2024-04-01 17:08:03 +02:00
Serhii Lazebnyi
c3c87ea1a5 follow up to #36294: allow migrate sub stream state with custom partition router (#36590) 2024-03-28 23:39:26 +01:00
Alexandre Girard
634db576dc Python CDK: rename a unit test (#36556) 2024-03-27 17:00:38 -07:00
Ella Rohm-Ensing
0c367680b0 Fix E721 errors in the CDK (#36490) 2024-03-26 18:05:33 +00:00
Alexandre Girard
118a864ea2 low-code: Add string filter (#36393)
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
2024-03-25 17:02:31 -07:00
Alexandre Girard
28591c3481 per partition state and custom state migrations (#36294) 2024-03-25 16:05:55 -07:00
Roman Yermilov [GL]
242dd6a425 Airbyte CDK: request options allowed to be an array (#36357) 2024-03-22 07:07:44 +01:00
Alex Birdsall
44f784e200 Remove most_recent_record arg from Cursor.close_slice (#36216) 2024-03-18 18:28:50 -07:00
Artem Inzhyyants
240aa0180d Airbyte CDK (low code): add refresh_token_error handler to DeclarativeOauth2Authenticator (#36058)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2024-03-18 18:52:58 +01:00
Alexandre Girard
875e5dfacb low-code: Allow developers to use a custom schema loader (#36040) 2024-03-18 09:42:16 -07:00
Alex Birdsall
a6a1b3c0c3 Base datetime cursor state off latest observed record (#35843) 2024-03-15 15:06:43 -07:00
Alexandre Girard
15b954546f raise exception with the full class name if a class for a custom comp… (#35868) 2024-03-12 12:18:38 -07:00
Alexandre Girard
4a808ee178 🐛 follow up to #35471: update the cartesian stream slicer (#35865) 2024-03-07 08:20:02 -08:00
Alexandre Girard
f55abc1fdc 🐛 low-code: Fix incremental substreams (#35471) 2024-03-05 18:50:42 -08:00
Alexandre Girard
5f48da9a67 [low-code] allow page size to be defined with string interpolation (#35735)
Co-authored-by: Dan Lecocq <dlecocq@sofi.org>
2024-03-05 16:03:16 -08:00
Artem Inzhyyants
0954ad3d3a Airbyte CDK: add interpolation for request options (#35485)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2024-02-22 19:40:44 +01:00
Alexandre Girard
5724ca0cf0 Add ignore_stream_slicer_parameters_on_paginated_requests flag (#35462) 2024-02-21 14:14:37 -08:00
Artem Inzhyyants
3355c5c432 Airbyte CDK: add filter to RemoveFields (#35326)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-02-21 13:49:10 +01:00
Alexandre Girard
fc87183905 🐛 python cdk: mask oauth access key (#34931) 2024-02-14 22:25:18 -08:00