* Add encoding to open_file interface
* pass the encoding set in the config
* cleanup
* cleanup
* Automated Commit - Formatting Changes
* Add missing test
* Automated Commit - Formatting Changes
* Update infer_schema too
* Automated Commit - Formatting Changes
* Update unit test
* add a unit test
* fix
* format
* format
* remove newline
* use a mock
* fix
* format
---------
Co-authored-by: girarda <girarda@users.noreply.github.com>
* Try running only on modified files
* make a change
* return something with the wrong type
* Revert "return something with the wrong type"
This reverts commit 23b828371e.
* fix typing in file-based
* format
* Mypy
* fix
* leave as Mapping
* Revert "leave as Mapping"
This reverts commit 908f063f70.
* Use Dict
* update
* move dict()
* Revert "move dict()"
This reverts commit fa347a8236.
* Revert "Revert "move dict()""
This reverts commit c9237df2e4.
* Revert "Revert "Revert "move dict()"""
This reverts commit 5ac1616414.
* use Mapping
* point to config file
* comment
* strict = False
* remove --
* Revert "comment"
This reverts commit 6000814a82.
* install types
* install types in same command as mypy runs
* non-interactive
* freeze version
* pydantic plugin
* plugins
* update
* ignore missing import
* Revert "ignore missing import"
This reverts commit 1da7930fb7.
* Install pydantic instead
* fix
* this passes locally
* strict = true
* format
* explicitly import models
* Update
* remove old mypy.ini config
* temporarily disable mypy
* format
* any
* format
* fix tests
* format
* Automated Commit - Formatting Changes
* Revert "temporarily disable mypy"
This reverts commit eb8470fa3f.
* implicit reexport
* update test
* fix mypy
* Automated Commit - Formatting Changes
* fix some errors in tests
* more type fixes
* more fixes
* more
* .
* done with tests
* fix last files
* format
* Update gradle
* change source-stripe
* only run mypy on cdk
* remove strict
* Add more rules
* update
* ignore missing imports
* cast to string
* Allow untyped decorator
* reset to master
* move to the cdk
* derp
* move explicit imports around
* Automated Commit - Formatting Changes
* Revert "move explicit imports around"
This reverts commit 56e306b72f.
* move explicit imports around
* Upgrade mypy version
* point to config file
* Update readme
* Ignore errors in the models module
* Automated Commit - Formatting Changes
* move check to gradle build
* Any
* try checking out master too
* Revert "try checking out master too"
This reverts commit 8a8f3e373c.
* fetch master
* install mypy
* try without origin
* fetch from the script
* checkout master
* ls the branches
* remotes/origin/master
* remove some cruft
* comment
* remove pydantic types
* unpin mypy
* fetch from the script
* Update connectors base too
* modify a non-cdk file to confirm it doesn't get checked by mypy
* run mypy after generateComponentManifestClassFiles
* run from the venv
* pass files as arguments
* update
* fix when running without args
* with subdir
* path
* try without /
* ./
* remove filter
* try resetting
* Revert "try resetting"
This reverts commit 3a54c424de.
* exclude autogen file
* do not use the github action
* works locally
* remove extra fetch
* run on connectors base
* try bad typing
* Revert "try bad typing"
This reverts commit 33b512a3e4.
* reset stripe
* Revert "reset stripe"
This reverts commit 28f23fc6dd.
* Revert "Revert "reset stripe""
This reverts commit 5bf5dee371.
* missing return type
* do not ignore the autogen file
* remove extra installs
* run from venv
* Only check files modified on current branch
* Revert "Only check files modified on current branch"
This reverts commit b4b728e654.
* use merge-base
* Revert "use merge-base"
This reverts commit 3136670cbf.
* try with updated mypy
* bump
* run other steps after mypy
* reset task ordering
* run mypy though
* looser config
* tests pass
* fix mypy issues
* type: ignore
* optional
* this is always a bool
* ignore
* fix typing issues
* remove ignore
* remove mapping
* Automated Commit - Formatting Changes
* Revert "remove ignore"
This reverts commit 9ffeeb6cb1.
* update config
---------
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: Joe Bell <joseph.bell@airbyte.io>
* New file-based CDK module scaffolding
* Address code review comments
* Formatting
* Automated Commit - Formatting Changes
* Apply suggestions from code review
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
* Automated Commit - Formatting Changes
* address CR comments
* Update tests to use builder pattern
* Move max files for schema inference onto the discovery policy
* Reorganize stream & its dependencies
* File CDK: error handling for CSV parser (#27176)
* Add file URL and updated_at timestamp to the state's history field
* Address CR comments
* Address CR comments
* Use stream_slice to determine which files to sync
* fix
* test with no input state
* test with multiple files
* filter out older files
* group by timestamp
* Add another test
* comment
* use min time
* skip files that are already in the history
* move the code around
* include files that are not in the history
* remove start_timestamp
* cleanup
* sync missing recent files even if history is more recent
* remove old files if history is full
* resync files if history is incomplete
* sync recent files
* comment
* configurable history size
* configurable days to sync if history is full
* move to a stateful object
* Only update state once per file
* two unit tests
* Unit tests
* missing files
* remove inner state
* fix tests
* fix interface
* fix constructor
* Update interface
* cleanup
* format
* Update
* cleanup
* Add timestamp and source file to schema
* set file uri on record
* format
* comment
* reset
* notes
* delete dead code
* format
* remove dead code
* remove dead code
* warning if history is not complete
* always set is_history_partial in the state
* rename
* Add a readme
* format
* Update
* rename
* rename
* missing files
* get instead of compute
* sort alphabetically, and sync everything if the history is not partial
* unit tests
* Update airbyte-cdk/python/airbyte_cdk/sources/file_based/README.md
Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
* Update docs
* reset
* Test to verify we remove files sorted (datetime, alphabetically)
* comment
* Update scenario
* Rename method to get_state
* If the file's ts is equal to the earliest ts, only sync it if it's alphabetically greater than the file
* add missing test
* rename
* rename and update comments
* Update comment for clarity
* inject the cursor
* add interface
* comment
* Handle the case where the file has been modified since it was synced
* Only inject from AbstractFileSource
* keep the remote files in the stream slices
* Use file_based typedefs
* format
* Update the comment
* simplify the logic, update comment, and add a test
* Add a comment
* slightly cleaner
* clean up
* typing
* comment
* I think this is simpler to reason about
* create the cursor in the source
* update
* Remove methods from FileBasedStreamReader and AbstractFileBasedStream interface (#27736)
* update the interface
* Add a comment
* rename
---------
Co-authored-by: Catherine Noll <noll.catherine@gmail.com>
Co-authored-by: clnoll <clnoll@users.noreply.github.com>
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>