Maxime Carbonneau-Leclerc
d0fd57ebf8
[ISSUE-32072] increase connection pool size (#32246)
2023-11-08 09:38:41 -05:00
Maxime Carbonneau-Leclerc
71d50635cc
[ISSUE #32070] concurrent cdk improve futures handling (#32277)
2023-11-08 09:16:39 -05:00
Alexandre Girard
139deeb081
Implement max_time on error handler (#32272)
2023-11-08 00:46:26 +00:00
Eugene Kulak
6c7ba28d75
API Call Rate limiter (#31276)
...
Co-authored-by: Eugene Kulak <kulak.eugene@gmail.com>
Co-authored-by: keu <keu@users.noreply.github.com>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2023-11-07 23:32:53 +02:00
Catherine Noll
4f44e33f5c
Concurrent CDK: handle legacy state messages (#31964)
2023-11-02 08:21:08 -04:00
Joe Reuter
66dd29f764
File CDK unstructured parser: Improve file type detection (#31997)
2023-11-02 12:19:27 +01:00
Maxime Carbonneau-Leclerc
32fdd7fd72
[ISSUE #29573] Concurrent CDK: incremental syncs (#31466)
...
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
2023-11-01 12:00:25 -04:00
Martin Hwasser
bc4b7198a9
✨ Add pptx support in file based cdk (#31912)
...
Co-authored-by: Joe Reuter <joe@airbyte.io>
2023-10-30 14:42:39 +01:00
Alexandre Girard
b8ad0c6a91
🐛 CDK: use in memory caching if ENV_REQUEST_CACHE_PATH is not set (#31887)
...
Co-authored-by: Eugene Kulak <widowmakerreborn@gmail.com>
Co-authored-by: girarda <girarda@users.noreply.github.com>
2023-10-26 19:28:39 -07:00
Joe Reuter
e3793c1491
Move over unstructured parser (#31390)
...
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-10-26 17:50:57 +02:00
Anatolii Yatsuk
c719137df3
🐛 Airbyte CDK: Fix flake errors in file-based CDK (#31771)
2023-10-24 16:15:11 +03:00
Anatolii Yatsuk
ce2342dde8
🎉 Airbyte CDK: Add CustomFileBasedException for custom errors in file-based CDK (#31704)
2023-10-24 11:09:50 +00:00
Alexandre Girard
7a764f8bbc
✨ low-code CDK: Allow connector developers to specify the type of an added field (#31638)
...
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: erohmensing <erohmensing@gmail.com>
2023-10-23 14:12:59 -07:00
Alexandre Girard
7da2822488
Concurrent CDK: catch exceptions from worker thread and add integration test scenarios (#31245)
...
Co-authored-by: girarda <girarda@users.noreply.github.com>
2023-10-23 08:39:58 -07:00
Joe Reuter
d474827068
File CDK: Don't fetch full file list for availability check (#31651)
...
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-10-23 16:14:41 +02:00
Joe Reuter
bb07939646
File CDK: Add analytics messages for parser usage (#31498)
...
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-10-19 15:42:51 +02:00
Yevhenii
b951898c20
CDK: Support base64 encode and decode in Jinja Interpolation (#31387)
2023-10-19 13:55:45 +03:00
Alexandre Girard
ef9bd72a7e
Parameterize ScenarioBuilder on Source type (#31244)
...
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
Co-authored-by: Maxime Carbonneau-Leclerc <maxi297@users.noreply.github.com>
2023-10-16 17:12:18 -07:00
Alexandre Girard
04c4fea5cc
🐛 Concurrent CDK bug fixes (#31402)
2023-10-16 12:06:35 -07:00
Anton Karpets
51fa2b3c31
🐛 Airbyte CDK: wrap HTTP error with status code 400 in AirbyteTracedException (#31207)
2023-10-16 11:15:04 +03:00
Joe Reuter
e35a1f2cd9
File CDK: Allow configuration of parsed records during check and discover from parser (#31281)
...
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-10-13 09:50:22 +02:00
Catherine Noll
8536725944
CDK: URL-encode query parameters and request body (#30407)
2023-10-12 09:56:55 -04:00
Alexandre Girard
25fc396cdf
CDK: ThreadBasedConcurrentStream skeleton and top-level AbstractStream (#30111)
...
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: Maxime Carbonneau-Leclerc <maxi297@users.noreply.github.com>
Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
2023-10-11 16:46:02 -07:00
Yevhenii
17136a0c8a
CDK: Fix initialize of token_expiry_is_time_of_expiration field (#31279)
2023-10-11 16:35:56 +00:00
Yevhenii
c17fae5855
CDK: create new method for parsing refresh token lifespan (#30698)
...
Co-authored-by: yevhenii-ldv <yevhenii-ldv@users.noreply.github.com>
2023-10-10 17:08:41 +03:00
Ben Church
4c97b2994a
CDK: coerce read records to an iterator (#31122)
...
Co-authored-by: bnchrch <bnchrch@users.noreply.github.com>
2023-10-06 10:01:29 -07:00
Yevhenii
00452c9bd3
CDK: Enable Page Number/Offset to be set on the first request (#30978)
...
Co-authored-by: yevhenii-ldv <yevhenii-ldv@users.noreply.github.com>
2023-10-05 15:31:30 +03:00
Roman Yermilov [GL]
e561d5d432
Airbyte CDK: fix none type binary error in parquet parser (#31073)
2023-10-05 15:56:02 +04:00
Anton Karpets
767800d2d7
🐛 Airbyte CDK: fix parsing of UUID fields in avro files (#31096)
2023-10-05 10:53:18 +03:00
Eugene Kulak
5eba3c3b57
CDK: Fix request_cache clearing and move it to tmp folder (#30719)
...
Co-authored-by: Eugene Kulak <kulak.eugene@gmail.com>
2023-09-28 21:27:40 +03:00
Marius Posta
7ae97175a6
gradle: fix repo wide behaviour (#30607)
2023-09-28 05:01:13 -07:00
Yevhenii
8cdafabd82
Airbyte CDK: Change Error message if stream is not found (#30723)
...
Co-authored-by: Yevhenii Kurochkin <ykurochkin@flyaps.com>
2023-09-25 18:13:19 +03:00
Maxime Carbonneau-Leclerc
b6836ad950
[ISSUE #30353] remove file_type from stream config (#30453)
2023-09-18 08:50:00 -04:00
Maxime Carbonneau-Leclerc
48e8816b6b
[oncall #2838] migrate parsing errors as config errors (#30209)
2023-09-06 13:38:48 -04:00
Maxime Carbonneau-Leclerc
5b653676aa
Update spec and fix autogenerated headers with skip after (#30123)
2023-09-03 09:26:53 -04:00
Maxime Carbonneau-Leclerc
399b4d1fca
File-based CDK: ensure no errors in Sentry given empty CSV (#29944)
2023-09-02 09:40:08 -04:00
Alexandre Girard
7264b3e1d7
Fix mypy issues in AbstractSource + minor refactoring (#29927)
...
Co-authored-by: girarda <girarda@users.noreply.github.com>
2023-08-31 07:35:17 -07:00
Maxime Carbonneau-Leclerc
e2fb04f72d
File-based CDK: allow user to provided column names (#29868)
2023-08-28 18:00:19 -04:00
Maxime Carbonneau-Leclerc
82a96e0c69
File-based CDK: allow for extension mismatch (#29835)
2023-08-25 11:44:49 -04:00
Maxime Carbonneau-Leclerc
40b76a7813
✨ Source S3: v4 rollout/feature parity (#29753)
2023-08-23 11:30:08 -04:00
Maxime Carbonneau-Leclerc
b801a3d24f
Do not stop processing file on parsing error (#29679)
2023-08-21 15:56:01 -04:00
Joe Reuter
d293e1cce4
Embedded CDK: run a check before starting to load (#29079)
2023-08-21 12:42:58 +02:00
Alexandre Girard
b4ce532762
low-code: Allow formatting datetimes as milliseconds since unix epoch (#29504)
...
Co-authored-by: girarda <girarda@users.noreply.github.com>
2023-08-17 18:49:28 -07:00
Maxime Carbonneau-Leclerc
e9d99630ed
Removing validation on skip rows and autogenerated headers (#29488)
2023-08-17 16:14:19 -04:00
Catherine Noll
7c1d6081de
File-based CDK: handle legacy path_prefix + globs (#29389)
2023-08-15 12:18:25 -04:00
Brian Lai
5908b85e69
[file-based cdk] Remove CSV quoting_behavior config option (#29388)
...
* remove CSV quoting_behavior config option
* cleanup after getting latest master
2023-08-14 20:37:38 -04:00
Alexandre Girard
b512fa4628
file-based CDK: Configurable strings_can_be_null (#29298)
...
* [ISSUE #28893] infer csv schema
* [ISSUE #28893] align with pyarrow
* Automated Commit - Formatting Changes
* [ISSUE #28893] legacy inference and infer only when needed
* [ISSUE #28893] fix scenario tests
* [ISSUE #28893] using discovered schema as part of read
* [ISSUE #28893] self-review + cleanup
* [ISSUE #28893] fix test
* [ISSUE #28893] code review part #1
* [ISSUE #28893] code review part #2
* Fix test
* formatcdk
* first pass
* [ISSUE #28893] code review
* fix mypy issues
* comment
* rename for clarity
* Add a scenario test case
* this isn't optional anymore
* FIX test log level
* Re-adding failing tests
* [ISSUE #28893] improve inferrence to consider multiple types per value
* Automated Commit - Formatting Changes
* [ISSUE #28893] remove InferenceType.PRIMITIVE_AND_COMPLEX_TYPES
* Code review
* Automated Commit - Formatting Changes
* fix unit tests
---------
Co-authored-by: maxi297 <maxime@airbyte.io>
Co-authored-by: maxi297 <maxi297@users.noreply.github.com>
2023-08-14 12:51:27 -07:00
Maxime Carbonneau-Leclerc
12f1304a67
Issue 28893/infer schema csv (#29099)
2023-08-14 15:14:46 -04:00
Alexandre Girard
1a120ecd4b
File-CDK (Avro) Set double_as_string to false by default (#29339)
...
* set double_as_string to false by default
* Use default config when irrelevant to the test
* Update description
* Update the description again
2023-08-10 14:31:52 -07:00
Maxime Carbonneau-Leclerc
cfbd0b8219
[ISSUE #26764] support brute force multiline json objects for JSONL (#29331)
...
* [ISSUE #26764] support brute force multiline json objects for JSONL
* [ISSUE #26764] infer_schema to support multiline json objects as well
* [ISSUE #26764] code review
2023-08-10 15:54:46 -04:00