Christo Grabowski
56a7f07a92
📝 Docs: Source S3 documentation update ( #28229 )
...
* add detailed setup steps for s3 bucket
* complete s3 setup steps
* compress versioned setup steps
* update S3 Provider Settings section
* update CSV and Parquet sections
* update file format settings section
* final edits/fixes
* maintain typecase of True/False
* Update docs/integrations/sources/s3.md
Co-authored-by: Sherif A. Nada <snadalive@gmail.com >
* Update docs/integrations/sources/s3.md
Co-authored-by: Sherif A. Nada <snadalive@gmail.com >
* Update docs/integrations/sources/s3.md
Co-authored-by: Sherif A. Nada <snadalive@gmail.com >
* Update docs/integrations/sources/s3.md
Co-authored-by: Sherif A. Nada <snadalive@gmail.com >
* add example to escape character field description
---------
Co-authored-by: Sherif A. Nada <snadalive@gmail.com >
2023-07-12 14:27:58 -04:00
Evan Tahler
79dba56923
S3 and GCS connector license to Elv2 ( #27725 )
...
* S3 and GCS connector license to Elv2
* docs update
* docs
2023-06-26 18:27:18 -05:00
Artem Inzhyyants
c68afefdf0
Source S3: handle Bucket Access Errors ( #27651 )
...
* Source S3: handle bucket access errors
* Source S3: update docs
2023-06-23 13:22:57 +02:00
Artem Inzhyyants
0c3d4499d6
Source S3: fix start date ( #27611 )
...
* Source S3: fix start date
* Source S3: update docs
* Source S3: bump version
2023-06-22 17:17:52 +02:00
Artem Inzhyyants
eef872e9f3
Source S3: Add logging for file reading ( #27604 )
...
* Source S3: Add logging for file reading
* Source S3: update docs
2023-06-22 10:53:32 +02:00
hehe
8f35bc45c7
docs: remove extra space in sources/s3.md ( #26527 )
2023-05-25 07:25:09 -05:00
Artem Inzhyyants
93f3286a0d
🚨 🚨 Source S3: use platform-handled schema evolution ( #25127 )
...
* Source S3: Remove match_target_schema; use platform-handled schema evolution instead
* Source S3: Remove ab_additional_col
* Source S3: update docs; bump version
* Source S3: fix unit tests
* Source S3: fix expected_records
* Source S3: revert _match_target_schema
* Source S3: update expected records for parquet dataset
* Source S3: update metadata
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-05-15 17:14:26 +02:00
Artem Inzhyyants
f74d96f9e2
Source S3: support parquet dataset ( #25937 )
...
* Source S3: support parquet dataset
* Source S3: update docs
* Source S3: Fix expected records
* Source S3: Fix expected records
* Source S3: update sem version
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-05-10 11:25:00 +02:00
Alex
f43cc9f3fd
📝 Add info on egress costs for Cloud storage connectors ( #25935 )
...
* add info blurb to Cloud Bucket Storage sources and destinations
* Apply suggestions from code review
Remove extra colon
Co-authored-by: Ben Church <ben@airbyte.io >
---------
Co-authored-by: Ben Church <ben@airbyte.io >
2023-05-09 17:33:49 -05:00
Artem Inzhyyants
64726c7413
Source S3: Parse nested avro schemas ( #25361 )
...
* Source S3: Parse nested avro schemas
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com >
Co-authored-by: Sergey Chvalyuk <grubberr@gmail.com >
2023-05-01 22:31:25 +03:00
Artem Inzhyyants
7ce322552e
🐛 Source S3: remove minimum block size ( #25706 )
...
* Source S3: remove minimum block size
* Source S3: update docs
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-05-01 15:00:42 +02:00
Artem Inzhyyants
e22f9e4cc0
Source S3: handle block size related errors ( #25067 )
...
* Source S3: handle pyarrow block size errors
* Source S3: bump version
* Automated Change
* Source S3: fix null field check
* Revert "Automated Change"
This reverts commit dc707f729d.
* Automated Change
* Source S3: bump version + update docs
* auto-bump connector version
---------
Co-authored-by: artem1205 <artem1205@users.noreply.github.com >
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-04-18 16:08:23 +02:00
Artem Inzhyyants
3080f65429
Source S3: Add start date filter for files ( #25010 )
...
* Source S3: Add start date filter for files
* Source S3: add docs
* Source S3: add unittest
* Source S3: add unittest
* Source S3: add unittest
* Source S3: Fix spec test
* Source S3: bump version
* Source S3: fix tests
* Source S3: fix description
* auto-bump connector version
* Source S3: refactor start_date filtering
* Source S3: update setup
* Source S3: serialize state for cache
* Source S3: refactor skip file filter
* Source S3: bump version + update docs
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-04-18 14:07:15 +02:00
Denys Davydov
13ac15130d
Source S3: read a single record on check ( #24429 )
...
* #1697 source S3: read a single record on check
* #1697 source s3: upd changelog
* #1697 source s3: fix unit_tests
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-03-27 12:56:48 +03:00
Denys Davydov
6a88625cca
Source s3: fix datetime conversion ( #24178 )
...
* #1669 source s3: fix datetime conversion
* #1669 source s3: review fixes
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-03-17 20:14:08 +02:00
Denys Davydov
db45f05814
Source S3: fix discovery issues ( #24157 )
...
* #1652 #1664 Source S3: fix discovery issues
* #1652 #1664 source s3: upd changelog
* #1652 #1664 source s3: review comments
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-03-16 22:39:29 +02:00
Oliver Meyer
5975c323d8
🐛 Source S3: fix datetime format string in FileStream ( #23195 )
...
* Fix datetime format string in FileStream
* Update changelog
* Fix integration tests
* Localize datetime objects
* Bump Dockerfile version
* auto-bump connector version
---------
Co-authored-by: Nataly Merezhuk <65251165+natalyjazzviolin@users.noreply.github.com >
Co-authored-by: sh4sh <6833405+sh4sh@users.noreply.github.com >
Co-authored-by: Evan Tahler <evan@airbyte.io >
Co-authored-by: Augustin <augustin@airbyte.io >
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-03-16 11:40:31 -07:00
Denys Davydov
3eecf5408c
Source S3: infer schema of the first file only ( #23189 )
...
* #1470 Source S3: infer schema of the first file
* #1470 source s3: upd changelog
* #1470 source s3: review fixes
* #1470 source s3: review fixes
* #1470 source s3: bump version
* #1470 source s3: review fixes
* auto-bump connector version
---------
Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com >
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-03-14 20:09:15 +02:00
Sophia Wiley
5512befeb1
Docs: updated links from .io to .com ( #23652 )
...
* updated links
* edited contributors link
* deleted line about CDK in docs
2023-03-06 17:27:55 +01:00
Baz
6a6039bbc5
🐛 Source S3: Make Advanced Reader Options and Advanced Options truly Optional ( #23669 )
2023-03-03 15:12:49 +02:00
Artem Inzhyyants
f83621ae05
Source S3: fix error handling: raise error on guessing file schema ( #23502 )
...
* Source S3: fix error handling: raise error on guessing file schema
* Source S3: update docs
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-02-27 19:19:52 +01:00
Denys Davydov
e17464703d
Source s3: fix avro discovery ( #23198 )
...
* #23197 source s3: fix avro discovery
* #23197 source s3: upd changelog
* #23197 source s3: add allowed hosts
* #23197 source s3: fix tests
* #23197 - fix build: formatting
* auto-bump connector version
---------
Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com >
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-02-24 12:37:00 +02:00
Denys Davydov
3dc79f5a99
Source S3: speed up discovery ( #22500 )
...
* #1470 source S3: speed up discovery
* #1470 source s3: upd changelog
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-02-09 21:44:48 +02:00
Denys Davydov
fcd3b0334e
Source S3: validate CSV read options and convert options ( #22550 )
...
* #1467 source S3: validate CSV read options and convert options
* #1467 source S3: upd changelog
* #1467 source s3: review fixes
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-02-09 18:27:25 +02:00
Joe Reuter
6e373435f2
Small spec fixes to make sure they work with connector form UI ( #21587 )
2023-01-25 19:43:26 +01:00
Roman Yermilov [GL]
04a77ad3aa
Source S3: keep processing but warn if OSError happens ( #21604 )
...
* Source S3: keep processing but warn if OSError happens
* Source S3: bump version and update changelog
* auto-bump connector version
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-01-24 20:00:51 +04:00
Artem Inzhyyants
31edbd8bae
Source S3: update block size for json ( #21210 )
...
* Source S3: update block size for json
* Source S3: update docs
* auto-bump connector version
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2023-01-10 19:53:42 +00:00
Amruta Ranade
cae63965bd
Deployment docs and sidebar cleanup ( #20965 )
2023-01-03 19:18:35 +05:30
Artem Inzhyyants
09cfcbf599
🐛 Source S3: Check config settings for CSV file format ( #20262 )
...
* Source S3: get master schema on check connection
* Source S3: bump version
* Source S3: update docs
* Source S3: fix test
* Source S3: add fields validation for CSV source
* Source S3: add test
* Source S3: Refactor config validation
* Source S3: update docs
* Source S3: format
* Source S3: format
* Source S3: fix tests
* Source S3: fix tests
* Source S3: fix tests
* auto-bump connector version
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2022-12-14 21:53:06 +01:00
Arnaud Jeannin
0164355635
🎨 Add oss/cloud tags on doc for GA connectors ( #19118 )
...
* feat: add cloud and oss tags
* put headers back
* fix: rm prettier style
* fix: aws styles
2022-11-17 17:01:20 +01:00
Xingyuan-Chen
425cc91c85
Source S3: Add virtual-hosted-style option ( #19006 )
...
* add virtual-hosted-style option for S3 source
* update s3 version
* auto-bump connector version
Co-authored-by: Vincent Koc <vincentkoc@ieee.org >
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2022-11-08 10:48:16 -05:00
Denys Davydov
6a40ac52fe
Source S3: use AirbyteTracedException ( #18602 )
...
* #750 #837 #904 Source S3: use AirbyteTracedException
* source s3: upd changelog
* auto-bump connector version
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2022-10-29 07:23:39 +03:00
Denys Davydov
5aa25a1e1a
Source S3 - fix schema inference ( #17991 )
...
* #678 oncall. Source S3 - fix schema inference
* source s3: upd changelog
* auto-bump connector version [ci skip]
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2022-10-14 14:53:39 +03:00
Serhii Lazebnyi
5df66cd572
Source S3: Connector does not enforce SSL/TLS for non-S3 endpoints ( #17800 )
...
* Deleted ssl/tls flag from config
* Updated PR number
* auto-bump connector version [ci skip]
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2022-10-12 16:07:22 +02:00
Augustin
ff4ea3961a
Republish connectors using CDK 0.1.88 to 0.1.89 ( #17304 )
2022-09-28 18:18:59 +02:00
Denys Davydov
9054468c21
Source s3: upgrade pyarrow ( #16921 )
...
* #423 oncall source s3: upgrade pyarrow
* source s3: upd changelog
* auto-bump connector version [ci skip]
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2022-09-20 19:24:07 +03:00
Denys Davydov
4dc394cb9a
Source S3: fix reading jsonl files with nested data ( #16607 )
...
* #531 source s3: fix reading nested jsonl files
* #531 source s3: upd changelog
* oncall #531 source s3: fix sample file
* auto-bump connector version [ci skip]
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2022-09-19 12:09:40 +03:00
Denys Davydov
73ba7b63d5
Source S3: choose between data types when merging master schema ( #16631 )
...
* #422 source s3: choose broadest data type when there is a mismatch during merging json schemas
* #422 source s3: upd changelog
2022-09-19 10:50:18 +03:00
Bhupesh Varshney
498d70089f
Source S3: Doc fix grammar & typo describing parquet file source ( #16264 )
2022-09-05 17:53:41 -03:00
Liren Tu
a475235d89
📝 S3 source: update doc about path_prefix + path_pattern ( #16206 )
...
* Add more explanation for path_prefix + path_pattern
* Simplify wording
* Update wording
2022-08-31 20:51:00 -07:00
Jagruti Tiwari
8288c16485
fix: replace airbyte oss with airbyte open source ( #15885 )
...
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com >
2022-08-24 01:01:53 -03:00
sivankumar86
3d499557b7
source-S3: Support JSON format ( #14213 )
...
* json format support added
* json format support added
* code formatted
* format conversion changed
* format naming conversion changed
* test case issue fixed
* test case issue resolved
* sample file and config added for integration tests
* Json doc added
Json doc added
* update
* sample file and config added for integration tests
* sample file and config added for integration tests
* update jsonl files
* review 1
* review 1
* review 1
* pyarrow version upgrade
* clean integration test folder architecture
* add timestamp record to simple_test.jsonl
* fixed integration test and parser review change
* simplify table read
* doc update
* fix specs
* user sample files
* fix sample files
* add newlines at end of files
* rename json parser
* rename jsonfile to jsonlfile
* schema inference added
* patch review fix
* Update docs/integrations/sources/s3.md
doc update
Co-authored-by: George Claireaux <george@airbyte.io >
* changing the version
* changing the title to sync with other type
* fix expected csv records
* fix expected records for avro and parquet
* review fix
* fixed master schema handling
* remove sample configs
* fix expected records
* json doc update
added more details on json parser
* fixed api name
* bump version
* auto-bump connector version [ci skip]
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com >
Co-authored-by: George Claireaux <george@airbyte.io >
Co-authored-by: George Claireaux <george@claireaux.co.uk >
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2022-08-01 15:48:23 +01:00
Serhii Chvaliuk
29d6a61a21
🐛 Source S3: "decimal" type added for parquet ( #14911 )
...
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com >
2022-07-22 01:04:44 +03:00
Baz
7cf67e2c85
🐛 Source S3: fixed bug when extra columns not in master schema ( #14669 )
2022-07-13 22:56:03 +03:00
Topher Lubaway
9c6c092a22
Revert "Improving docusaurus sidebar generation ( #1927 ) ( #14369 )" ( #14596 )
...
This reverts commit a2c194a11f.
2022-07-11 15:27:14 -05:00
Mykyta Serbynevskiy
a2c194a11f
Improving docusaurus sidebar generation ( #1927 ) ( #14369 )
...
* Improving docusaurus sidebar generation (#1927 )
* Added "Career & open positions" folder to sidebar, adjusted "Project overview" folder
* Deleted "career-and-open-positions" folder from sidebar
2022-07-08 14:18:27 -05:00
Serhii Lazebnyi
f896c574d1
🎉 Source Amazon S3: Fix docs link issue ( #14397 )
...
* Fix UI connector name and link issue
* Revert name to S3
2022-07-08 15:13:00 +03:00
Serhii Lazebnyi
f9348b2251
🐛 Source Amazon S3: solve possible case of files being missed during incremental syncs ( #12568 )
...
* Added history to state
* Deleted unused import
* Rollback abnormal state file
* Rollback abnormal state file
* Fixed type error issue
* Fix state issue
* Updated after review
* Bumped version
2022-05-31 21:39:10 +03:00
Serhii Lazebnyi
91326749d9
🎉 Source Amazon S3: increase unit test coverage to at least 90% ( #11967 )
...
* Increased unittest coverage
* #11676 test coverage 85%
* #11676 unit tests 90%
* #11676 two more unit tests
* #11676 bump version
* auto-bump connector version
Co-authored-by: Denys Davydov <denys.i.davydov@globallogic.com >
Co-authored-by: Denys Davydov <davydov.den18@gmail.com >
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2022-05-23 13:37:27 +03:00
Serhii Lazebnyi
225aecd37c
🐛 Source Amazon S3: Fixed empty options issue ( #12730 )
...
* Fixed empty options issue
* Update airbyte-integrations/connectors/source-s3/source_s3/utils.py
Co-authored-by: Denis Davydov <denys.i.davydov@globallogic.com >
* Bumped version
* Fix typo
* Bumped seed version
* Fix changelog
* Bumped version in docker file
* auto-bump connector version
Co-authored-by: Denis Davydov <denys.i.davydov@globallogic.com >
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com >
2022-05-11 21:21:54 +03:00