mirror of synced 2025-12-22 03:21:25 -05:00

66 Commits

Author SHA1 Message Date
Francis Genet
4b367de700 [S3-DL] Update version (#69232) 2025-11-07 09:33:03 -08:00
octavia-bot[bot]
c20056167d chore: upgrade destination-s3-data-lake to bulk CDK 0.1.61 (#69133)
Co-authored-by: edgao <5741425+edgao@users.noreply.github.com>
Co-authored-by: octavia-bot[bot] <108746235+octavia-bot[bot]@users.noreply.github.com>
Co-authored-by: Subodh Kant Chaturvedi <subodh1810@gmail.com>
2025-11-05 12:26:25 +05:30
Subodh Kant Chaturvedi
5b3de4f456 feat(s3-datalake-destination): implement support for polaris catalog (#68108)
Issue: https://github.com/airbytehq/airbyte-internal-issues/issues/14734
2025-10-15 22:08:35 +05:30
Davin Chia
0120d16f21 fix: fix S3 Data Lake connector empty role_arn validation (#67005)
- Change role_arn validation to check for both null and blank strings
- Add validation for mismatched AWS credentials (only one of
access_key_id or secret_access_key provided)
- Add comprehensive tests for empty string role_arn, null role_arn, and
credential validation
- Improve error messages for better debugging
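
The validation rules listed above can be sketched roughly as follows. This is a minimal Python illustration, not the connector's actual code: only the field names role_arn, access_key_id, and secret_access_key come from the commit message, while the function names, the return convention, and the error text are assumptions made for the example.

```python
from typing import Optional


def is_blank(value: Optional[str]) -> bool:
    """Return True for None, empty, or whitespace-only strings."""
    return value is None or value.strip() == ""


def validate_aws_auth(
    role_arn: Optional[str],
    access_key_id: Optional[str],
    secret_access_key: Optional[str],
) -> Optional[str]:
    """Hypothetical helper: validate the credential fields.

    Returns the role ARN to assume, or None when no usable role ARN
    was supplied. Raises ValueError when only one half of the access
    key pair is present.
    """
    has_key = not is_blank(access_key_id)
    has_secret = not is_blank(secret_access_key)

    # Mismatched credentials: exactly one of the pair was provided.
    if has_key != has_secret:
        raise ValueError(
            "access_key_id and secret_access_key must be provided together"
        )

    # A blank role_arn is treated the same as a missing (null) one.
    if is_blank(role_arn):
        return None
    return role_arn.strip()
```

Treating blank and null identically means an empty string submitted by a form is not mistaken for a real role ARN.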

Fixes airbytehq/airbyte-internal-issues#14643

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-10-07 18:38:53 +00:00
Davin Chia
278ddd4e42 fix(destination-s3-data-lake): use unique table name in check operation (#67150)
Use UUID-based table names for check operation test tables to prevent conflicts with:

- Stale metadata from previous check runs
- Concurrent check operations
- User tables named airbyte_test_table

This fixes the integration test failure caused by corrupted Glue catalog metadata pointing to non-existent S3 files.
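
As a rough illustration of the idea, a per-run table name can be derived from a random UUID. This is a sketch only: the helper name check_table_name and the prefix_suffix naming scheme are assumptions, not the connector's actual implementation.

```python
import uuid


def check_table_name(prefix: str = "airbyte_test_table") -> str:
    """Hypothetical helper: build a table name unique to one check run.

    uuid4().hex is 32 lowercase hex characters with no hyphens, which
    keeps the name free of characters a catalog might reject.
    """
    return f"{prefix}_{uuid.uuid4().hex}"
```

Because each call yields a different suffix, two concurrent checks never target the same table, and a leftover table from an earlier run is never reused.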

🤖 Generated with Claude Code


Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2025-10-07 18:17:20 +00:00
Edward Gao
4b45f00b37 Destination S3 Data Lake: improve docs (#66748) 2025-10-03 16:12:15 +00:00
Edward Gao
91f57e904b Destination S3 Data Lake: use default namespace in check (#66711) 2025-09-26 01:23:54 +00:00
Edward Gao
885cc13c4f Destination S3 Data Lake: remove unnecessary properties on table (#63746) 2025-07-23 20:51:20 +00:00
Francis Genet
ccdb3ef36b [S3-Data-Lake] Pin the CDK and update the version. (#62952) 2025-07-15 08:46:59 -07:00
Francis Genet
c388d1877e [S3-Data-Lake] Handle compaction and file delete on truncate refresh (#62888) 2025-07-11 16:46:23 -07:00
Edward Gao
a47f9861f4 Destination S3 Data Lake: revert accidental archiving (#62852) 2025-07-08 15:08:26 +00:00
Francis Genet
f29c9c5fec [S3-Data-Lake] Pin to the latest CDK version (#62835) 2025-07-07 19:48:44 +00:00
Francis Genet
df5cbfbff6 [S3 Data Lake] Replace branch instead of fast forwarding (#62105) 2025-07-07 10:46:48 -07:00
Davin Chia
0af493bed1 chore: add readme warning. (#61596)
After #61588, we discovered the local CDK version has a bug. These versions should not be used.
2025-06-13 15:05:06 -07:00
Davin Chia
2c573f8e4e chore: preemptively bump certified connectors. (#61588)
Following #61584. Bumping certified connector versions to make sure the version and code commits align. Doing this in 2 parts.

Bump BQ, SF, S3, S3-data-lake.
2025-06-13 17:34:40 +00:00
Johnny Schmidt
0543ad60ba Load-CDK: Bugfix: Correct jackson-to-input-string ratio (#59710) 2025-05-07 20:11:38 +00:00
Ian Alton
01cd16654e 11059 multi-instance, versioned docs (#58095)
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2025-04-24 02:58:09 +03:00
Edward Gao
38c86073c0 Destination MSSQL: Improve numeric handling (#58146) 2025-04-22 01:26:58 +03:00
Francis Genet
49161af438 Chore: Make the config region for S3 a string as opposed to the enum value (#58104) 2025-04-17 21:03:14 +00:00
Edward Gao
40774c816f Bulk load CDK: streamloader.close() knows whether there was any data in the sync (#58085) 2025-04-17 02:38:23 +03:00
Davin Chia
6f70b7bded Remove confusing iceberg and s3-glue docs. (#57002)
Since we've switched over to s3-datalake, remove the Iceberg and S3-Glue destinations entirely.

Also make it clear that S3-datalake is the official Airbyte Iceberg implementation for S3.
2025-04-04 12:51:51 -04:00
Edward Gao
a22756f7e7 Destination s3 data lake: handle numbers correctly (#56435) 2025-03-27 21:01:06 +00:00
Edward Gao
0d301a6510 Destination S3 Data Lake: fix coercing nested array value (#56395) 2025-03-25 22:10:33 +00:00
devin-ai-integration[bot]
ceaf322225 chore(destination): Upgrade all Java destination connectors to use java-connector-base:2.0.1 (#56355)
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: davin@airbyte.io <davin@airbyte.io>
Co-authored-by: Davin Chia <davinchia@gmail.com>
2025-03-24 18:59:36 -04:00
Johnny Schmidt
027b8efaad [Dest-S3DataLake]: Bugfix: New interface setup does not always await … (#56347) 2025-03-24 21:34:01 +02:00
Edward Gao
2b2991383d Destination MSSQL: use new typing interface (#55849)
Co-authored-by: Francis Genet <francis.genet@airbyte.io>
2025-03-24 11:50:14 -07:00
Francis Genet
7ef44752cc CDK Improvements: Typing (#55798)
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2025-03-20 16:26:48 -07:00
Edward Gao
092680e12c Bulk load CDK: DestinationRecord has full DestinationStream (#55811)
Co-authored-by: Francis Genet <francis.genet@airbyte.io>
2025-03-19 19:09:06 +02:00
Ian Alton
e2ea32ec4e 11526 second pass through iceberg documentation (#55736) 2025-03-13 16:02:26 -07:00
Francis Genet
cd7c0aa802 Update CDK to pass DestinationRecordRaw around (#55737)
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2025-03-13 22:48:01 +02:00
Edward Gao
9f9bcdcbdf Destination S3 Data Lake: Handle number in primary key (#55755) 2025-03-13 21:07:02 +02:00
Edward Gao
04d153eb38 Destination S3 Data Lake: certify (#54724)
Co-authored-by: Francis Genet <francis.genet@airbyte.io>
2025-03-03 11:39:17 -06:00
Johnny Schmidt
b88b37bed7 Load-CDK/Destination-S3DataLake: DirectLoader (no spill2disk, partitioning) (#53241) 2025-02-24 17:28:54 -08:00
Edward Gao
3a47e8bd26 Destination S3 Data Lake: extract AWS-specific pieces; move generic stuff to toolkit (#53697) 2025-02-15 00:17:43 +02:00
Edward Gao
3b97dbfff7 Destination S3 Data Lake: Document potential pitfall (#53678) 2025-02-14 09:33:21 -08:00
Edward Gao
0b62b7cf23 Destination S3 Data Lake: more docs (#53170) 2025-02-13 01:31:34 +02:00
Edward Gao
3985e69caf Destination S3 Data Lake: Force schema change in truncate sync (#53216) 2025-02-12 18:57:54 +00:00
Francis Genet
ebedcbe987 [S3-data-lake] Re-enable the Nessie integration tests (#53622)
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2025-02-11 19:28:23 +00:00
Edward Gao
0ba9e3a137 Destination S3 Data Lake: very basic usability improvements (#53165)
Co-authored-by: Francis Genet <francis.genet@airbyte.io>
2025-02-10 22:27:46 +00:00
Francis Genet
e67c240372 [s3-data-lake] Using batches of 1.5Gb (#52666)
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2025-02-10 23:51:07 +02:00
Francis Genet
85fab29ad8 [S3-data-lake] Add rest catalog integration tests (#53141)
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2025-02-07 14:14:01 -08:00
Edward Gao
04bbe45a7d Destination S3 Data Lake: move per-stream setup into start() (#53172) 2025-02-07 00:49:27 +00:00
Edward Gao
6dd9683886 Destination S3 Data Lake: improve error message on null PK (#53164) 2025-02-07 02:17:45 +02:00
Edward Gao
cee03d52b4 Destinations S3 / S3 Data Lake: explicitly state that arn spec option is only usable in cloud (#53173) 2025-02-05 14:53:57 -08:00
Edward Gao
62d661cacf Destination S3 Data Lake: fix+document temporal types (#53176) 2025-02-05 13:00:30 -08:00
Edward Gao
f2470864a1 Destination S3 Data Lake: handle more weird characters in stream name/ns (#52690) 2025-02-04 10:25:39 -08:00
Edward Gao
18798f6328 Destination S3 Data Lake: fix dedup (#52633)
Co-authored-by: Francis Genet <francis.genet@airbyte.io>
2025-02-03 09:19:42 -08:00
Francis Genet
e56fe8a1d9 [S3-Data-Lake] Make the Namespace/Database a required field (#52639) 2025-01-31 09:14:35 -08:00
Edward Gao
7c12755f35 Bulk Load CDK: refactor micronaut property handling (#51600) 2025-01-28 09:21:19 -08:00
Subodh Kant Chaturvedi
e92a3d2bfa feat: implement rest catalog for s3 data lake (#52081)
Co-authored-by: Francis Genet <francis.genet@airbyte.io>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2025-01-27 21:53:48 +00:00