Mirror, synced 2025-12-21 19:11:14 -05:00

149 Commits

Author SHA1 Message Date
Joe Reuter
2edcfb36fb Source S3: Convert to airbyte-lib (#33937) 2024-01-09 11:31:00 +01:00
Joe Reuter
5324da8303 S3, Google Drive, Azure Blob Storage: Update cdk (#33411)
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
2023-12-15 14:51:59 +01:00
Joe Reuter
e2789b4f03 Update docusaurus to 3 (#33041) 2023-12-11 17:03:18 +01:00
Catherine Noll
f73827eb43 File-based CDK Sources: Hide primary key (#33187) 2023-12-06 13:01:47 -05:00
Joe Reuter
8f7abc2cc0 S3, Azure Blob Storage, GCS, Weviate, Milvus, Chroma, Qdrant: Bump cdk version (#32608)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-11-30 09:48:18 +01:00
Anatolii Yatsuk
a41d11b4a0 🐛 Source S3: Fix discovery for zip file (#32677)
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2023-11-21 05:10:16 -06:00
Joe Reuter
b3396626ee S3, Azure Blob Storage, GCS, Pinecone, Weaviate, Milvus, Chroma, Qdrant: Update CDK to improve spec generation (#32357)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-11-14 16:53:27 +01:00
Anatolii Yatsuk
21cfb2a083 Source S3: Add HTTPS validation for S3 endpoint (#32109) 2023-11-10 12:33:38 +02:00
Joe Reuter
7c7acade71 S3 and Azure Blob Storage: Update File CDK to support document file types (#31904)
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
2023-10-31 11:21:22 +01:00
Joe Reuter
68e99ce224 🎉 Source S3: Reduce image size and add acceptance test (#31654)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-10-25 06:44:00 -04:00
Joe Reuter
b7b5d9e7e5 S3: Call out unstructured in documentation (#31544)
Co-authored-by: Aaron ("AJ") Steers <aj@airbyte.io>
2023-10-25 10:34:58 +00:00
Joe Reuter
2b0b0cee1e S3 and Apify-dataset: Use fieldanchors in documentation (#31556) 2023-10-25 10:33:40 +00:00
Anatolii Yatsuk
053d08e404 Source S3: Add handling NoSuchBucket error (#31383) 2023-10-25 01:34:48 +03:00
Augustin
781dc3144d source-s3: migrate to the python base image (#31601) 2023-10-19 19:17:44 +02:00
Anatolii Yatsuk
951605ae8a Source S3: Add reading files inside zip archive (#31340) 2023-10-18 11:53:01 +03:00
Joe Reuter
0a01bc26f4 S3: Basic unstructured file support (#31209)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-10-17 15:18:27 +02:00
Christo Grabowski
fd742448b9 Docs: Source S3 clarify bucket-level permissions (#30529) 2023-09-18 16:40:00 -04:00
Maxime Carbonneau-Leclerc
2954cbb7ce Source S3: remove streams.*.file_type from source-s3 configuration (#30476) 2023-09-18 09:34:26 -04:00
Catherine Noll
8f214efe28 Source S3: bump airbyte-cdk version to pick up error message improvement (#30387) 2023-09-14 15:03:24 -04:00
Maxime Carbonneau-Leclerc
2b8748c074 Realign documentation with implementation (#30339)
Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
2023-09-12 09:44:46 -04:00
Christo Grabowski
5f990f4afe Source S3: always show S3 Access Key fields (#28639)
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
2023-09-11 14:56:33 -04:00
Maxime Carbonneau-Leclerc
526be63fa6 Source S3: ensure parsing errors are consider as config errors to avoid Sentry alerts (#30217) 2023-09-06 15:15:08 -04:00
Maxime Carbonneau-Leclerc
4e7c70f767 Source S3: v4 rollout - take 3 (#30153)
Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
2023-09-05 14:33:36 -04:00
Daryna Ishchenko
5a60b9d0b3 🐛 Source S3: added config error for conversion error (#29986) 2023-09-04 16:04:44 +03:00
Daryna Ishchenko
d0bc7ba936 🐛 Source S3: added handling for arrow invalid error (#29943) 2023-08-30 12:44:57 +03:00
Maxime Carbonneau-Leclerc
40b76a7813 Source S3: v4 rollout/feature parity (#29753) 2023-08-23 11:30:08 -04:00
Catherine Noll
620a941d21 Source S3: don't require history to be present to identify legacy state format (#29520) 2023-08-18 17:35:10 +00:00
Catherine Noll
fe005caa2d Source S3: StreamReader and Cursor fixes (#29505) 2023-08-17 06:48:42 -04:00
Artem Inzhyyants
6d49df712c Source S3: update Pyarrow to latest version 12.0.1 (#29480) 2023-08-17 00:37:48 +02:00
Alexandre Girard
7e95c1d175 🐛 Source S3 (V4): Ensure all files are not resync'd when migrating from v3 to v4 (#29418) 2023-08-15 18:11:15 -07:00
Catherine Noll
a29dbdfe04 Source S3: handle legacy path_prefix + path_patterns (#29382) 2023-08-15 18:45:43 -04:00
Catherine Noll
6946052513 Source S3: maintain backwards compatibility between V3 & V4 state messages (#29028) 2023-08-11 11:38:43 -04:00
Catherine Noll
57d3dafe16 Source S3: basic structure using file-based CDK (#28786) 2023-08-01 12:45:17 -04:00
Roman Yermilov [GL]
9a714db326 Source S3: encoding validation fix, refactor and test (#28730)
* Source S3: encoding validation fix, refactor and test

* Source S3: bump verson, update changelog

* Source S3: format imports

* Source S3: fix W291 trailing whitespace
2023-07-27 16:00:06 +04:00
Christo Grabowski
56a7f07a92 📝 Docs: Source S3 documentation update (#28229)
* add detailed setup steps for s3 bucket

* complete s3 setup steps

* compress versioned setup steps

* update S3 Provider Settings section

* update CSV and Parquet sections

* update file format settings section

* final edits/fixes

* maintain typecase of True/False

* Update docs/integrations/sources/s3.md

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Update docs/integrations/sources/s3.md

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Update docs/integrations/sources/s3.md

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Update docs/integrations/sources/s3.md

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* add example to escape character field description

---------

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
2023-07-12 14:27:58 -04:00
Evan Tahler
79dba56923 S3 and GCS connector license to Elv2 (#27725)
* S3 and GCS connector license to Elv2

* docs update

* docs
2023-06-26 18:27:18 -05:00
Artem Inzhyyants
c68afefdf0 Source S3: handle Bucket Access Errors (#27651)
* Source S3: handle bucket access errors

* Source S3: update docs
2023-06-23 13:22:57 +02:00
Artem Inzhyyants
0c3d4499d6 Source S3: fix start date (#27611)
* Source S3: fix start date

* Source S3: update docs

* Source S3: bump version
2023-06-22 17:17:52 +02:00
Artem Inzhyyants
eef872e9f3 Source S3: Add logging for file reading (#27604)
* Source S3: Add logging for file reading

* Source S3: update docs
2023-06-22 10:53:32 +02:00
hehe
8f35bc45c7 docs: remove extra space in sources/s3.md (#26527) 2023-05-25 07:25:09 -05:00
Artem Inzhyyants
93f3286a0d 🚨🚨Source S3: use platform-handled schema evolution (#25127)
* Source S3: Remove match_target_schema; use platform-handled schema evolution instead

* Source S3: Remove ab_additional_col

* Source S3: update docs; bump version

* Source S3: fix unit tests

* Source S3: fix expected_records

* Source S3: revert _match_target_schema

* Source S3: update expected records for parquet dataset

* Source S3: update metadata

* auto-bump connector version

---------

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2023-05-15 17:14:26 +02:00
Artem Inzhyyants
f74d96f9e2 Source S3: support parquet dataset (#25937)
* Source S3: support parquet dataset

* Source S3: update docs

* Source S3: Fix expected records

* Source S3: Fix expected records

* Source S3: update sem version

* auto-bump connector version

---------

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2023-05-10 11:25:00 +02:00
Alex
f43cc9f3fd 📝 Add info on egress costs for Cloud storage connectors (#25935)
* add info blurb to Cloud Bucket Storage sources and destinations

* Apply suggestions from code review

Remove extra colon

Co-authored-by: Ben Church <ben@airbyte.io>

---------

Co-authored-by: Ben Church <ben@airbyte.io>
2023-05-09 17:33:49 -05:00
Artem Inzhyyants
64726c7413 Source S3: Parse nested avro schemas (#25361)
* Source S3: Parse nested avro schemas

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
Co-authored-by: Sergey Chvalyuk <grubberr@gmail.com>
2023-05-01 22:31:25 +03:00
Artem Inzhyyants
7ce322552e 🐛 Source S3: remove minimum block size (#25706)
* Source S3: remove minimum block size

* Source S3: update docs

* auto-bump connector version

---------

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2023-05-01 15:00:42 +02:00
Artem Inzhyyants
e22f9e4cc0 Source S3: handle block size related errors (#25067)
* Source S3: handle pyarrow block size errors

* Source S3: bump version

* Automated Change

* Source S3: fix null field check

* Revert "Automated Change"

This reverts commit dc707f729d.

* Automated Change

* Source S3: bump version + update docs

* auto-bump connector version

---------

Co-authored-by: artem1205 <artem1205@users.noreply.github.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2023-04-18 16:08:23 +02:00
Artem Inzhyyants
3080f65429 Source S3: Add start date filter for files (#25010)
* Source S3: Add start date filter for files

* Source S3: add docs

* Source S3: add unittest

* Source S3: add unittest

* Source S3: add unittest

* Source S3: Fix spec test

* Source S3: bump version

* Source S3: fix tests

* Source S3: fix description

* auto-bump connector version

* Source S3: refactor start_date filtering

* Source S3: update setup

* Source S3: serialize state for cache

* Source S3: refactor skip file filter

* Source S3: bump version + update docs

* auto-bump connector version

---------

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2023-04-18 14:07:15 +02:00
Denys Davydov
13ac15130d Source S3: read a single record on check (#24429)
* #1697 source S3: read a single record on check

* #1697 source s3: upd changelog

* #1697 source s3: fix unit_tests

* auto-bump connector version

---------

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2023-03-27 12:56:48 +03:00
Denys Davydov
6a88625cca Source s3: fix datetime conversion (#24178)
* #1669 source s3: fix datetime conversion

* #1669 source s3: review fixes

* auto-bump connector version

---------

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2023-03-17 20:14:08 +02:00
Denys Davydov
db45f05814 Source S3: fix discovery issues (#24157)
* #1652 #1664 Source S3: fix discovery issues

* #1652 #1664 source s3: upd changelog

* #1652 #1664 source s3: review comments

* auto-bump connector version

---------

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2023-03-16 22:39:29 +02:00