Brian Lai
e456bca099
🐛 [RFR for API Sources] Fix bug where checkpoint reader stops syncing too early if first partition is complete (#41658)
2024-07-12 19:00:25 -04:00
Anton Karpets
6c439a8859
[file-based cdk]: add config option to limit number of files for schema discover (#39317)
...
Co-authored-by: askarpets <anton.karpets@globallogic.com>
Co-authored-by: Serhii Lazebnyi <serhii.lazebnyi@globallogic.com>
Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com>
2024-07-11 15:16:09 +02:00
Brian Lai
9e23b3f89b
🐛 [airbyte-cdk] Fix bug where substreams depending on an RFR parent stream don't paginate or use existing state (#40671)
2024-07-11 02:53:20 -04:00
Serhii Lazebnyi
f9b5d5b1a7
[airbyte-cdk] add incomplete status to availability check during read (#41034)
2024-07-10 23:18:28 +02:00
Serhii Lazebnyi
f00ed4a925
[airbyte-cdk] add running stream status with rate limit reason to backoff approach (#40681)
2024-07-10 14:00:01 +02:00
Serhii Lazebnyi
bc60a740a2
[airbyte-cdk] add incomplete stream status to nonexistent stream handling (#40568)
2024-07-10 13:10:30 +02:00
Artem Inzhyyants
02c5f59ccf
ref(airbyte-cdk): use http_client inside HttpStream (#39811)
...
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-07-09 12:01:03 +02:00
Cristina Mariscal
c4b8212ba7
CDK: Add support for input format parsing at jinja macro format_datetime (#40759)
...
Co-authored-by: cristina.mariscal <cristina.mariscal@cristina.mariscal--MacBook-Pro---DFJ27FJFXX>
2024-07-08 08:42:09 +00:00
Baz
0c237d81d0
🐛 [CDK, Declarative Source]: fix bug when type is missing for anyOf in nested arrays (#40667)
2024-07-08 10:09:02 +03:00
Cristina Mariscal
a8e985b7a0
Revert "CDK: Add jinja macro format_datetime_string" (#40747)
2024-07-05 15:04:32 +00:00
Aldo Gonzalez
1422786282
feat(Airbyte CDK): add with_json_schema method to ConfiguredAirbyteStreamBuilder (#40737)
2024-07-05 08:27:57 -06:00
Cristina Mariscal
b9c213a473
CDK: Add jinja macro format_datetime_string (#40744)
2024-07-05 13:19:43 +00:00
Natik Gadzhi
4a06230436
feat(python cdk): Allow regex_search in jinja interpolations (#40696)
...
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2024-07-03 23:21:51 +00:00
Boris Staal
6dd9b7ab25
chore(cdk): Avoid using time.sleep in unit tests for backoff of http stream (#40239)
...
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2024-07-02 11:47:56 -07:00
Maxime Carbonneau-Leclerc
2b7ef3fb25
Validate error handler fallback (#40570)
...
Co-authored-by: Serhii Lazebnyi <serhii.lazebnyi@globallogic.com>
2024-06-27 17:03:03 -04:00
Ella Rohm-Ensing
fc12432305
airbyte-cdk: only update airbyte-protocol-models to pydantic v2 (#39524)
...
## What
Migrate the protocol messages to Pydantic V2 to speed up emitting records. This gives us a 2.5x boost over V1.
Closes https://github.com/airbytehq/airbyte-internal-issues/issues/8333
## How
- Switch to using protocol models generated for pydantic_v2, in a new (temporary) package, `airbyte-protocol-models-pdv2`.
- Update the CDK's pydantic dependency to v2 accordingly.
- For minimal impact, keep using the `pydantic.v1` compatibility code in all of the airbyte-cdk pydantic code that does not interact with the protocol models.
## Review guide
1. Check out the code and clear your CDK virtual env (either `rm -rf .venv && python -m venv .venv` or `poetry env list; poetry env remove <env>`). This is necessary to fully clean out the `airbyte_protocol` library, for some reason. Then run `poetry lock --no-update && poetry install --all-extras`. This should install the CDK with the new models.
2. Run the unit tests on the CDK.
3. Take your favorite connector, point its `pyproject.toml` at the local CDK (see the example in `source-s3`), and try running its tests and its regression tests.
## User Impact
> [!warning]
> This is a major CDK change due to the pydantic dependency change. Connectors that use pydantic 1.10 will break and will need similar `from pydantic.v1` updates to get running again. Therefore, we should release this as a major CDK version bump.
## Can this PR be safely reverted and rolled back?
- [x] YES 💚
- [ ] NO ❌
Even if sources migrate to this version, state format should not change, so a revert should be possible.
## Follow up work - Ella to move into issues
<details>
### Source-s3 - turn this into an issue
- [ ] Update source s3 CDK version and any required code changes
- [ ] Fix source-s3 unit tests
- [ ] Run source-s3 regression tests
- [ ] Merge and release source-s3 by June 21st
### Docs
- [ ] Update documentation on how to build with CDK
### CDK pieces
- [ ] Update file-based CDK format validation to use Pydantic V2
  - This is doable, and requires a breaking change to `OneOfOptionConfig`. A few unhandled test cases present issues that we are unsure how to handle so far.
- [ ] Update low-code component generators to use Pydantic V2
  - This is doable, but a few issues around custom component generation are still unhandled.
### Further CDK performance work - create issues for these
- [ ] Research if we can replace prints with buffered output (write to byte buffer and then flush to stdout)
- [ ] Replace `json` with `orjson`
...
</details>
2024-06-21 01:53:44 +02:00
Serhii Lazebnyi
a284676a4d
feat(airbyte-cdk): add DatetimeIntervalCursor (#39603)
2024-06-20 01:11:13 +02:00
Maxime Carbonneau-Leclerc
0386ca21ae
Exclude airbyte-cdk modules from schema discovery (#39586)
2024-06-19 09:29:45 -04:00
Artem Inzhyyants
f49c8054ad
feat(airbyte-cdk): add json_schema from ConfiguredCatalog to Stream (#39522)
...
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-06-19 11:12:33 +02:00
Maxime Carbonneau-Leclerc
5b4bd3485d
Allow access to _partition for source-jira (#39576)
2024-06-18 21:47:30 -04:00
Maxime Carbonneau-Leclerc
7d56e19ac7
Improve error message on state initialization (#39553)
2024-06-18 12:58:11 -04:00
Artem Inzhyyants
46f8d4e2fc
fix(airbyte-cdk): client_side_incremental fix end_datetime comparison (#38874)
...
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-06-18 11:35:42 +02:00
Patrick Nilan
18e82d949a
[airbyte-cdk] - Integrate HttpClient into HttpRequester (#38906)
2024-06-18 01:03:15 +00:00
Patrick Nilan
7a639660cf
[airbyte-cdk] Updates Low Code CDK ErrorHandlers and BackoffStrategies to align with Python CDK Interfaces (#38743)
2024-06-17 17:41:14 -07:00
Anatolii Yatsuk
258603e907
✨ low-code: Add Incremental Parent State Handling to SubstreamPartitionRouter (#38211)
2024-06-14 10:57:42 -07:00
Maxime Carbonneau-Leclerc
9b1d7205f3
Add discover to entrypoint wrapper (#39396)
2024-06-11 10:18:31 -04:00
Gergely Imreh
d55995deb5
[cdk]: correctly raise unsupported logical type errors when parsing avro (#36888)
...
Co-authored-by: Natik Gadzhi <natik@respawn.io>
2024-06-06 04:17:02 +00:00
Anatolii Yatsuk
2da71a3707
feat(low-code): add new format float_s (#38869)
...
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2024-06-05 12:34:20 +03:00
Natik Gadzhi
8b82caa4df
[airbyte-cdk] Fix dpath.util.* deprecation warnings (#38847)
...
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2024-06-03 12:51:48 -07:00
Natik Gadzhi
dfd61c52ff
[airbyte-cdk] Python 3.11 dataclass compatibility (#38846)
2024-06-03 12:14:36 -07:00
Artem Inzhyyants
b9a421ba15
feat(airbyte-cdk): add client side incremental sync (#38099)
...
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-06-03 17:20:34 +02:00
Bindi Pankhudi
700b1708d7
Fix: Vector-db-based CDK - Updated unstructured file type and removed experimental from file type (#38722)
2024-05-30 10:34:20 -07:00
Anatolii Yatsuk
4863c9fbd6
fix(connector-builder): hide secrets in errors (#38759)
2024-05-30 15:06:28 +03:00
Ignas Vyšniauskas
82a2283a76
🐛 [cdk] Return correct type during 'check' if config does not match schema (#37398)
...
Co-authored-by: Natik Gadzhi <natik@respawn.io>
Co-authored-by: Augustin <augustin@airbyte.io>
2024-05-24 17:32:37 -07:00
Brian Lai
29d615080a
[airbyte-cdk] Fix a bug so that successful Python RFR streams are not synced on subsequent attempts (#38608)
2024-05-24 15:58:18 -04:00
Brian Lai
040f1415e5
[low-code CDK] Resumable full refresh support for low-code streams (#38300)
2024-05-22 16:23:31 -04:00
Anton Karpets
50f4965324
File-based CDK: avoid error on empty stream when running discover (#38230)
2024-05-21 15:10:34 +03:00
Alexandre Girard
0ceb76920a
refactor!(airbyte-cdk): Delete deprecated AirbyteLogger, AirbyteSpec, and Authenticators + move public classes to the top level init file (#37805)
...
Co-authored-by: Ella Rohm-Ensing <erohmensing@gmail.com>
2024-05-20 07:18:37 -07:00
Patrick Nilan
8396fd2d7f
airbyte-cdk: Improve Error Handling in Legacy CDK (#37576)
2024-05-16 19:07:21 -07:00
Alexandre Girard
fb11ca22fe
low-code: Yield records from generators instead of keeping them in in-memory lists (#36406)
2024-05-14 18:00:03 -07:00
Aldo Gonzalez
e7508a4572
Airbyte CDK: Add delete method to HttpMocker (#38169)
2024-05-14 08:06:18 -06:00
Brian Lai
8fdd9818ec
[airbyte-cdk] Promote low-code types and cursor interface into Python CDK (#38077)
2024-05-13 15:51:50 -04:00
Artem Inzhyyants
5fe60b7fb8
Airbyte CDK: use pytz.utc instead of datetime.utc (#38026)
...
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-05-09 13:01:46 +02:00
Tobias Macey
18c9ebc64d
[airbyte-cdk] Increase the maximum parseable field size for CSV files (#36320)
2024-05-07 20:08:01 -03:00
Brian Lai
d74125bf10
[RFR for API Sources] New Python interfaces to support resumable full refresh (#37429)
2024-05-06 18:41:29 -04:00
Anton Karpets
8ec438acf0
File-based CDK: allow to merge schemas with nullable object values (#37773)
2024-05-02 17:43:14 +03:00
Anton Karpets
2cfa6ea2c8
File-based CDK: fix schemas merge for nullable object types (#37619)
2024-05-02 10:40:20 +03:00
Alexandre Girard
86ee91ed5d
Connector builder: read input state if it exists (#37495)
2024-04-24 15:53:09 -07:00
Maxime Carbonneau-Leclerc
48af92ad78
Concurrent CDK: if exception is AirbyteTracedException, raise this an… (#37443)
2024-04-19 20:32:46 +00:00
Patrick Nilan
b20cd1bd1d
✨ airbyte-cdk - Adds JwtAuthenticator to low-code (#37005)
2024-04-19 09:17:54 -07:00