Serhii Lazebnyi
f9348b2251
🐛 Source Amazon S3: solve possible case of files being missed during incremental syncs ( #12568 )
...
* Added history to state
* Deleted unused import
* Rollback abnormal state file
* Rollback abnormal state file
* Fixed type error issue
* Fix state issue
* Updated after review
* Bumped version
2022-05-31 21:39:10 +03:00
Serhii Lazebnyi
91326749d9
🎉 Source Amazon S3: increase unit test coverage to at least 90% ( #11967 )
...
* Increased unittest coverage
* #11676 test coverage 85%
* #11676 unit tests 90%
* #11676 two more unit tests
* #11676 bump version
* auto-bump connector version
Co-authored-by: Denys Davydov <denys.i.davydov@globallogic.com>
Co-authored-by: Denys Davydov <davydov.den18@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-05-23 13:37:27 +03:00
Serhii Lazebnyi
225aecd37c
🐛 Source Amazon S3: Fixed empty options issue ( #12730 )
...
* Fixed empty options issue
* Update airbyte-integrations/connectors/source-s3/source_s3/utils.py
Co-authored-by: Denis Davydov <denys.i.davydov@globallogic.com>
* Bumped version
* Fix typo
* Bumped seed version
* Fix changelog
* Bumped version in docker file
* auto-bump connector version
Co-authored-by: Denis Davydov <denys.i.davydov@globallogic.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-05-11 21:21:54 +03:00
Melker Öhrman
f9188590cc
Add avro parser to s3 source ( #12602 )
...
* added MVP avro parser running fine locally
* added unit tests for avro
* added wip state of avro integration test setup
* deleted unused files
* added avro specific config path
* fixed comments. Added nested record support, simplify code and minor fixes
* bumped version + docs update
* Added working acceptance tests + format
* auto-bump connector version
Co-authored-by: George Claireaux <george@claireaux.co.uk>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-05-11 16:59:58 +01:00
Davin Chia
f8a35eaa80
Add Java Catalog documentation. ( #12751 )
...
Clean up and add better guidelines on how to use the Java catalogs we recently added.
Took the chance to move existing documentation to improve reading flow.
2022-05-11 13:02:07 +08:00
Serhii Lazebnyi
27e6ce2ca8
Source Amazon S3: Refactored docs ( #12534 )
...
* Refactored spec and docs
* Updated spec.json
* Rollback spec formatting
* Rollback spec formatting
* Rollback spec formatting
2022-05-09 14:56:52 +03:00
Sherif A. Nada
b8e147538c
Update various connector input configs & docs copy ( #12500 )
2022-05-04 23:37:10 -07:00
Maksym Pavlenok
91eff1dffd
🐛 Source S3: Loading of files' metadata ( #8252 )
2022-02-02 00:49:18 +02:00
Serhii Chvaliuk
c7021e6f30
🐛 Source S3: work-around for format.delimiter change '\\t' -> '\t' ( #9163 )
...
* work-around for format.delimiter '\\t' -> '\t'
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
2022-01-06 20:49:55 +02:00
Augustin
14b301ce37
Update S3 and file sources docs: we do not support unstructured data ( #9192 )
2021-12-29 18:22:59 +01:00
Vadym
504580d833
Remove base-python gradle dependencies in connectors where base-python is not used ( #7499 )
...
* Remove base-python references.
* Add requirements.txt
* Fix requirements.txt blank line
* Fix source-exchange rates to common CDK approach
* Fix source-smartsheets SAT.
Fix source-exchange-rates build.gradle.
* Bump docker version
* Update source-dixa SAT config
* Fix source-exchange-rates SAT config
* Revert bump scaffold sources version
* Fix source-shortio SAT config
* Fix source-square invalid_config.json
* Fix source-us-census invalid_config.json
* Fix source-intercom versioning
2021-11-10 13:12:29 +02:00
George Claireaux
1d3a17a8fb
🎉 Source S3 - memory & performance optimisations + advanced CSV options ( #6615 )
...
* memory & performance optimisations
* address comments
* version bump
* added advanced_options for reading csv without header, and more custom pyarrow ReadOptions
* updated to use the latest airbyte-cdk
* updated docs
* bump source-s3 to 0.1.6
* remove unneeded lines
* Use the all dep ami for python builds.
* ec2-instance-id should be ec2-image-id
* ec2-instance-id should be ec2-image-id
Co-authored-by: Jingkun Zhuang <Jingkun.Zhuang@icims.com>
Co-authored-by: Davin Chia <davinchia@gmail.com>
2021-10-19 16:50:51 +01:00
Abhi Vaidyanatha
ae32ecbb27
GitBook: [master] 186 pages and 77 assets modified
2021-10-08 21:17:47 +00:00
Dmytro
6767424b6d
🎉 S3 source: add support for non-AWS S3 Storage ( #6398 )
2021-09-27 16:40:24 +03:00
Maksym Pavlenok
e5c44e64b1
🎉 Source S3: support of Parquet format ( #5305 )
...
* add parquet parser
* add integration tests for parquet formats
* add unit tests for parquet
* update docs and secrets
* fix incorrect import for tests
* add lib pandas for unit tests
* revert changes of foreign connectors
* update secret settings
* fix config values
* Update airbyte-integrations/connectors/source-s3/source_s3/source_files_abstract/formats/parquet_spec.py
Co-authored-by: George Claireaux <george@claireaux.co.uk>
* Update airbyte-integrations/connectors/source-s3/source_s3/source_files_abstract/formats/parquet_spec.py
Co-authored-by: George Claireaux <george@claireaux.co.uk>
* remove some unused default options
* update tests
* update docs
* bump its version
* fix expected test
Co-authored-by: Maksym Pavlenok <maksym.pavlenok@globallogic.com>
Co-authored-by: George Claireaux <george@claireaux.co.uk>
2021-09-05 02:40:49 +03:00
George Claireaux
137257b62b
🐛 Source S3: fixed bug where sync could hang indefinitely ( #5197 )
...
* infer schema in multi process
* use dill to pickle function
* moved funcs
* Revert "moved funcs"
This reverts commit c1739ad988.
* Revert "use dill to pickle function"
This reverts commit 52404a9f1b.
* Revert "infer schema in multi process"
This reverts commit f0fb6f66f9.
* multiprocess in csv schema infer
* simplify what happens in the multiprocess to offending code
* try this
* using tempfile
* formatting
* version bump
* changelog + formatting
* addressed review comments
* re-trigger checks
* ran testScaffoldTemplates to fix breaking check
2021-08-06 00:07:46 +01:00
George Claireaux
9e529545c2
🐛 Source S3: fixed bug in spec so that Format field displays in UI correctly ( #5135 )
...
* fixed bug in spec so that Format field displays in UI correctly
* newline & changelog
2021-08-02 17:23:10 +01:00
George Claireaux
d9f11bcf6a
🎉 New Source: S3 (+ abstract files source) ( #4990 )
...
* minor line length changes
* cdk generated source + oop structure + start of implementation
* fixed some broken syntax stuff
* pre-pyarrow convert
* introducing pyarrow
* skeleton for unit tests
* read working on multiple files
* incremental first draft
* blobfile -> fileclient
* change references of 'blob' to 'file'
* minor tidy to make draft PR
* fixes
* addressed review comments + more unit tests
* finished unit tests
* bugfixes and abstract integration tests framework
* remove old commented stuff
* docstrings
* restructure as source-s3
* Delete playground.py
* integration tests
* acceptance tests and some more reshuffling
* source S3 credentials
* change _airbyte_ columns to _ab_
* update spec with better descriptions and ordering
* created s3 source docs
* source definition
* reverse docstring change in cdk
* reverse docstring change
* reverse change
* reverse docstring change
* remove TODO comments
* add PR to changelog
* removed unused libraries
* formatting & address some review comments
* rename of files/classes for clarity
* addressing review comments
* address reviews
* add s3 source
* building spec with pydantic for provider-specific inheritance
* pydantic spec and improved path pattern with wcmatch.glob
* update path patterns info in doc
* formatting
* tests gzip and bz2 compression on csv
* updated compression support in doc
* forgot to upload bz2 test file
* added pattern validation to dataset
* formatting
* Format.
* ran testScaffoldTemplates & generated this diff
* bumped version because of documentationUrl fix
Co-authored-by: Davin Chia <davinchia@gmail.com>
2021-07-30 15:06:11 +01:00