1
0
mirror of synced 2025-12-30 21:02:43 -05:00
Files
airbyte/airbyte-cdk/python/airbyte_cdk/sources/file_based
Brian Lai 59300093b1 [file-based cdk] Add avro parser for inferring schema and reading records (#28500)
* add avro parser for inferring schema and reading records

* fix mypy check not caught locally

* pr feedback and some additional types

* add decimal_as_float for avro

* formatting + mypy
2023-07-25 12:54:16 -04:00
..

Incremental syncs

The file-based connectors supports the following sync modes:

Feature Supported?
Full Refresh Sync Yes
Incremental Sync Yes
Replicate Incremental Deletes No
Replicate Multiple Files (pattern matching) Yes
Replicate Multiple Streams (distinct tables) Yes
Namespaces No

We recommend you do not

Incremental sync

After the initial sync, the connector only pulls files that were modified since the last sync.

The connector checkpoints the connection states when it is done syncing all files for a given timestamp. The connection's state only keeps track of the last 10 000 files synced. If more than 10 000 files are synced, the connector won't be able to rely on the connection state to deduplicate files. In this case, the connector will initialize its cursor to the minimum between the earliest file in the history, or 3 days ago.

Both the maximum number of files, and the time buffer can be configured by connector developers.