* added MVP avro parser running fine locally
* added unit tests for avro
* added wip state of avro integration test setup
* deleted unused files
* added avro specific config path
* fixed comments, added nested record support, simplified code, and made minor fixes
* bumped version + docs update
* Added working acceptance tests + format
* auto-bump connector version
Co-authored-by: George Claireaux <george@claireaux.co.uk>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
* Dockerfile to 3.9
* Python version
* More python updates
* 3.9 on GitHub actions and lint updates
* Test out 3.9.11 on GitHub actions
* install python with an action
* formatting: newline
* Also has python code
* only check first level for changed modules
Previous example (source-google-search-console/credentials)
* Test failure: there is no logger.trace
* Implement flag to avoid inferring data types for CSV input files in the S3 source
* Unit tests for the flag to avoid inferring data types for CSV input files in the S3 source
Refactor parametrized tests in CSV and Parquet formats to use pytest.parametrize for better error reporting on test failure.
* S3 Source, infer_datatypes flag: additional unit tests
* wrong method signature
* Refactors
* s3-source - infer_datatypes flag, fix user message
* Update airbyte-integrations/connectors/source-s3/source_s3/source_files_abstract/formats/csv_spec.py
Co-authored-by: Eugene Kulak <widowmakerreborn@gmail.com>
* s3-source - refactor - use spec defaults instead of hardcoding them in code.
* Update airbyte-integrations/connectors/source-s3/source_s3/utils.py
Co-authored-by: Eugene Kulak <widowmakerreborn@gmail.com>
* code review changes
Co-authored-by: Eugene Kulak <widowmakerreborn@gmail.com>
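For context on the infer_datatypes flag above, here is a minimal stdlib-only sketch of the idea, not the connector's actual code. The function names (infer_schema, _guess_type) and the type names are hypothetical; the real connector may infer schemas differently. With the flag off, every column is reported as a string; with it on, a type is guessed from the values.

```python
import csv
import io


def _guess_type(values):
    """Return the narrowest type name that every value can be cast to."""
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for value in values:
                cast(value)
            return type_name
        except ValueError:
            continue
    return "string"


def infer_schema(csv_text, infer_datatypes=True):
    """Map each CSV column name to a type name (hypothetical helper).

    When infer_datatypes is False, no inference is attempted and every
    column is treated as a string.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    if not infer_datatypes:
        return {name: "string" for name in columns}
    return {name: _guess_type([row[name] for row in rows]) for name in columns}


sample = "id,price,name\n1,2.5,a\n2,3,b\n"
schema_inferred = infer_schema(sample)
schema_strings = infer_schema(sample, infer_datatypes=False)
```

Turning inference off trades typed output for predictability: values such as zip codes or IDs with leading zeros keep their original text.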
* memory & performance optimisations
* address comments
* version bump
* added advanced_options for reading csv without header, and more custom pyarrow ReadOptions
* updated to use the latest airbyte-cdk
* updated docs
* bump source-s3 to 0.1.6
* remove unneeded lines
* Use the all dep ami for python builds.
* ec2-instance-id should be ec2-image-id
Co-authored-by: Jingkun Zhuang <Jingkun.Zhuang@icims.com>
Co-authored-by: Davin Chia <davinchia@gmail.com>
* infer schema in multi process
* use dill to pickle function
* moved funcs
* Revert "moved funcs"
This reverts commit c1739ad988.
* Revert "use dill to pickle function"
This reverts commit 52404a9f1b.
* Revert "infer schema in multi process"
This reverts commit f0fb6f66f9.
* multiprocess in csv schema infer
* reduce what runs in the subprocess to just the offending code
* try this
* using tempfile
* formatting
* version bump
* changelog + formatting
* addressed review comments
* re-trigger checks
* ran testScaffoldTemplates to fix breaking check
The source-acceptance-test framework no longer requires the json_schema
parameter from the catalog file. This parameter is verbose, makes the config
file harder to read, and could be misleading when debugging acceptance test issues.
Co-authored-by: Dmytro Rezchykov <dmitry.rezchykov@zazmic.com>