mirror of synced 2025-12-22 03:21:25 -05:00
Commit Graph

228 Commits

Author SHA1 Message Date
Ella Rohm-Ensing
ac3eb28de2 airbyte-ci: add format commands (#31831)
Co-authored-by: Ben Church <ben@airbyte.io>
Co-authored-by: bnchrch <bnchrch@users.noreply.github.com>
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: Marius Posta <marius@airbyte.io>
Co-authored-by: alafanechere <alafanechere@users.noreply.github.com>
2023-11-14 02:17:48 -06:00
Anatolii Yatsuk
21cfb2a083 Source S3: Add HTTPS validation for S3 endpoint (#32109) 2023-11-10 12:33:38 +02:00
Augustin
368ba78b64 🧹doc: update connectors README and remove acceptance-test-docker.sh (#32209) 2023-11-06 12:36:07 -06:00
Joe Reuter
7c7acade71 S3 and Azure Blob Storage: Update File CDK to support document file types (#31904)
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
2023-10-31 11:21:22 +01:00
Joe Reuter
68e99ce224 🎉 Source S3: Reduce image size and add acceptance test (#31654)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-10-25 06:44:00 -04:00
Anatolii Yatsuk
053d08e404 Source S3: Add handling NoSuchBucket error (#31383) 2023-10-25 01:34:48 +03:00
Augustin
781dc3144d source-s3: migrate to the python base image (#31601) 2023-10-19 19:17:44 +02:00
Anatolii Yatsuk
951605ae8a Source S3: Add reading files inside zip archive (#31340) 2023-10-18 11:53:01 +03:00
Joe Reuter
0a01bc26f4 S3: Basic unstructured file support (#31209)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-10-17 15:18:27 +02:00
Marius Posta
7ae97175a6 gradle: fix repo wide behaviour (#30607) 2023-09-28 05:01:13 -07:00
Ben Church
5c56ac1d84 Airbyte-ci: Remove gradle task connectorAcceptanceTest (#30326) 2023-09-19 15:16:37 -05:00
Maxime Carbonneau-Leclerc
2954cbb7ce Source S3: remove streams.*.file_type from source-s3 configuration (#30476) 2023-09-18 09:34:26 -04:00
Catherine Noll
8f214efe28 Source S3: bump airbyte-cdk version to pick up error message improvement (#30387) 2023-09-14 15:03:24 -04:00
Christo Grabowski
5f990f4afe Source S3: always show S3 Access Key fields (#28639)
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
2023-09-11 14:56:33 -04:00
Maxime Carbonneau-Leclerc
526be63fa6 Source S3: ensure parsing errors are consider as config errors to avoid Sentry alerts (#30217) 2023-09-06 15:15:08 -04:00
Maxime Carbonneau-Leclerc
8b6fe6cffe Source S3: v4 release - update metadata (#30171) 2023-09-06 08:29:05 -04:00
Ben Church
4d5e17bc90 [skip ci] Update test_incremental to be unaware when source defines the cursor (#27872)
Co-authored-by: bnchrch <bnchrch@users.noreply.github.com>
2023-09-05 15:57:18 -07:00
Maxime Carbonneau-Leclerc
4e7c70f767 Source S3: v4 rollout - take 3 (#30153)
Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
2023-09-05 14:33:36 -04:00
Daryna Ishchenko
5a60b9d0b3 🐛 Source S3: added config error for conversion error (#29986) 2023-09-04 16:04:44 +03:00
Daryna Ishchenko
d0bc7ba936 🐛 Source S3: added handling for arrow invalid error (#29943) 2023-08-30 12:44:57 +03:00
Marius Posta
f5c7c1c0b8 chore: get ./gradlew format to pass for the whole repo (same java style) (#29786) 2023-08-24 05:09:42 -05:00
Maxime Carbonneau-Leclerc
40b76a7813 Source S3: v4 rollout/feature parity (#29753) 2023-08-23 11:30:08 -04:00
Catherine Noll
620a941d21 Source S3: don't require history to be present to identify legacy state format (#29520) 2023-08-18 17:35:10 +00:00
Catherine Noll
fe005caa2d Source S3: StreamReader and Cursor fixes (#29505) 2023-08-17 06:48:42 -04:00
Artem Inzhyyants
6d49df712c Source S3: update Pyarrow to latest version 12.0.1 (#29480) 2023-08-17 00:37:48 +02:00
Alexandre Girard
7e95c1d175 🐛 Source S3 (V4): Ensure all files are not resync'd when migrating from v3 to v4 (#29418) 2023-08-15 18:11:15 -07:00
Ben Church
40781313da Update Internal Fields: update ql levels to better resemble previous high strictness (#29450) 2023-08-15 18:03:10 -05:00
Catherine Noll
a29dbdfe04 Source S3: handle legacy path_prefix + path_patterns (#29382) 2023-08-15 18:45:43 -04:00
Alexandre Girard
690479d221 Source S3 (v4): Set decimal_as_float to True for parquet files (#29342)
* [ISSUE #28893] infer csv schema

* [ISSUE #28893] align with pyarrow

* Automated Commit - Formatting Changes

* [ISSUE #28893] legacy inference and infer only when needed

* [ISSUE #28893] fix scenario tests

* [ISSUE #28893] using discovered schema as part of read

* [ISSUE #28893] self-review + cleanup

* [ISSUE #28893] fix test

* [ISSUE #28893] code review part #1

* [ISSUE #28893] code review part #2

* Fix test

* formatcdk

* [ISSUE #28893] code review

* FIX test log level

* Re-adding failing tests

* [ISSUE #28893] improve inferrence to consider multiple types per value

* set decimal_as_float to True

* update

* Automated Commit - Formatting Changes

* add file adapters for avro, csv, jsonl, and parquet

* fix try catch

* update

* format

* pr feedback with a few additional default options set

---------

Co-authored-by: maxi297 <maxime@airbyte.io>
Co-authored-by: maxi297 <maxi297@users.noreply.github.com>
Co-authored-by: brianjlai <brian.lai@airbyte.io>
2023-08-14 20:13:52 -05:00
Brian Lai
82b8274063 [file-based cdk] S3 file format adapter (#29353)
* [ISSUE #28893] infer csv schema

* [ISSUE #28893] align with pyarrow

* Automated Commit - Formatting Changes

* [ISSUE #28893] legacy inference and infer only when needed

* [ISSUE #28893] fix scenario tests

* [ISSUE #28893] using discovered schema as part of read

* [ISSUE #28893] self-review + cleanup

* [ISSUE #28893] fix test

* [ISSUE #28893] code review part #1

* [ISSUE #28893] code review part #2

* Fix test

* formatcdk

* [ISSUE #28893] code review

* FIX test log level

* Re-adding failing tests

* [ISSUE #28893] improve inferrence to consider multiple types per value

* Automated Commit - Formatting Changes

* add file adapters for avro, csv, jsonl, and parquet

* fix try catch

* pr feedback with a few additional default options set

* fix things from the rebase of master

---------

Co-authored-by: maxi297 <maxime@airbyte.io>
Co-authored-by: maxi297 <maxi297@users.noreply.github.com>
2023-08-14 18:47:08 -04:00
Alexandre Girard
45c1de3c39 Source 3: Add a few CAT tests to verify backwards compatibility (#29368)
* Add a few legacy source-s3 tests

* remove unused file

* reset
2023-08-11 14:22:54 -07:00
Alexandre Girard
84a113f0d0 Source S3: Set CAT exact_order to true (#29351)
* set exact_order to true

* newline
2023-08-11 10:02:00 -07:00
Augustin
00d9462216 cat/connectors-ci: replace docker runner with dagger runner in CAT (#28000) 2023-08-11 17:58:48 +02:00
Catherine Noll
6946052513 Source S3: maintain backwards compatibility between V3 & V4 state messages (#29028) 2023-08-11 11:38:43 -04:00
Brian Lai
0543099b4d [file based cdk] S3 legacy config adapter (#29145)
* s3 adapter

* pr feedback and updates after rebasing master

* add comment

* formatting
2023-08-09 19:09:47 -04:00
Alexandre Girard
0aa86cf156 File-based CDK + Source S3 (v4): Pass configured file encoding to stream reader (#29110)
* Add encoding to open_file interface

* pass the encoding set in the config

* cleanup

* cleanup

* Automated Commit - Formatting Changes

* Add missing test

* Automated Commit - Formatting Changes

* Update infer_schema too

* Automated Commit - Formatting Changes

* Update unit test

* add a unit test

* fix

* format

* format

* remove newline

* use a mock

* fix

* format

---------

Co-authored-by: girarda <girarda@users.noreply.github.com>
2023-08-09 09:05:06 -05:00
Brian Lai
b8d5ca77db 🐛 [file based cdk] Fix S3 and abstract spec to be compatible with Airbyte UI and CAT (#29075)
* remove version, make validation_policy enum, fix input_schema for s3 and abstract file based configs

* remove multiple file format options from stream config

* pr feedback

* fix tests after rebase

* additional spec changes to work with the UI

* fix tests post-rebase

* fix tests post-rebase and cleanup

* formatting
2023-08-08 18:10:05 -04:00
Alexandre Girard
9f8ce9d265 Source S3: Add a few more CAT tests (#29069)
* Add a few more CAT tests

* rename

* Add missing file
2023-08-04 15:54:37 -07:00
Catherine Noll
694e883c0e Source S3: add defaults to file types to fix spec CATs (#29065) 2023-08-04 09:28:24 -04:00
Ben Church
2f7deaee02 [skip ci] Metadata: Remove leading underscore (#29024)
* DNC

* Add test models

* Add model test

* Remove underscore from metadata files

* Regenerate models

* Add test to check for key transformation

* Allow additional fields on metadata

* Delete transform
2023-08-03 10:56:13 -07:00
Ben Church
e9490e3fb6 Connector Levels: Add new internal metadata fields (#28904)
* Add airbyte internal

* Add tests

* First pass

* Set destinations to same levels as sources

* Best guess at missing statuses

* Best guess at _ql

* Add separate enum class

* Fix support level name

* Update templates

* Add one more test
2023-08-01 18:08:33 -05:00
Catherine Noll
57d3dafe16 Source S3: basic structure using file-based CDK (#28786) 2023-08-01 12:45:17 -04:00
Roman Yermilov [GL]
9a714db326 Source S3: encoding validation fix, refactor and test (#28730)
* Source S3: encoding validation fix, refactor and test

* Source S3: bump verson, update changelog

* Source S3: format imports

* Source S3: fix W291 trailing whitespace
2023-07-27 16:00:06 +04:00
Evan Tahler
79dba56923 S3 and GCS connector license to Elv2 (#27725)
* S3 and GCS connector license to Elv2

* docs update

* docs
2023-06-26 18:27:18 -05:00
Artem Inzhyyants
c68afefdf0 Source S3: handle Bucket Access Errors (#27651)
* Source S3: handle bucket access errors

* Source S3: update docs
2023-06-23 13:22:57 +02:00
Artem Inzhyyants
0c3d4499d6 Source S3: fix start date (#27611)
* Source S3: fix start date

* Source S3: update docs

* Source S3: bump version
2023-06-22 17:17:52 +02:00
Artem Inzhyyants
eef872e9f3 Source S3: Add logging for file reading (#27604)
* Source S3: Add logging for file reading

* Source S3: update docs
2023-06-22 10:53:32 +02:00
Serhii Lazebnyi
da67a60b7c Source Confluence, Source Greenhouse, Source Hubspot, Source Stripe, Source Close com, Source Klaviyo, Source Notion, Source Pinterest, Source Snapchat Marketin, Source S3, Source Airtable and Source Posthog: fix builds (#27135)
* Fix SAT tests for confluence, greenhouse, hubspot, stripe

* Fix CAT for close, klaviyo, notion, pinterest and snapchat marketing

* Fix CAT for source s3

* Fix CAT for airtable and posthog

* Bump posthog version
2023-06-09 03:05:44 +02:00
Augustin
7ca0d2e476 source-s3: delete integration tests using minio (#26908) 2023-06-01 18:03:36 +02:00
Ben Church
1dabc6208e Metadata: add tags field (#26320)
* Add optional tags field

* Remove duplicate icons

* Add programming tags to all

* Update docs

* supportUrl -> documentationUrl

* Ensure one language tag is applied

* Add keyvalue check

* rebase and fix tests

* Format

* Add cache buster

* Improve test

* Automated Commit - Formatting Changes

* Update error

* Fix missing tags

* Fix scaffold

---------

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@sers.noreply.github.com>
Co-authored-by: bnchrch <bnchrch@users.noreply.github.com>
2023-05-26 16:13:09 -07:00