airbyte

mirror of synced 2026-01-04 18:04:31 -05:00

Author	SHA1	Message	Date
Alexandre Girard	6ebabdc2fa	File-based CDK: Support for incremental syncs (#27382 ) * New file-based CDK module scaffolding * Address code review comments * Formatting * Automated Commit - Formatting Changes * Apply suggestions from code review Co-authored-by: Sherif A. Nada <snadalive@gmail.com> Co-authored-by: Alexandre Girard <alexandre@airbyte.io> * Automated Commit - Formatting Changes * address CR comments * Update tests to use builder pattern * Move max files for schema inference onto the discovery policy * Reorganize stream & its dependencies * File CDK: error handling for CSV parser (#27176) * file url and updated_at timestamp is added to state's history field * Address CR comments * Address CR comments * Use stream_slice to determine which files to sync * fix * test with no input state * test with multiple files * filter out older files * group by timestamp * Add another test * comment * use min time * skip files that are already in the history * move the code around * include files that are not in the history * remove start_timestamp * cleanup * sync missing recent files even if history is more recent * remove old files if history is full * resync files if history is incomplete * sync recent files * comment * configurable history size * configurable days to sync if history is full * move to a stateful object * Only update state once per file * two unit tests * Unit tests * missing files * remove inner state * fix tests * fix interface * fix constructor * Update interface * cleanup * format * Update * cleanup * Add timestamp and source file to schema * set file uri on record * format * comment * reset * notes * delete dead code * format * remove dead code * remove dead code * warning if history is not complete * always set is_history_partial in the state * rename * Add a readme * format * Update * rename * rename * missing files * get instead of compute * sort alphabetically, and sync everything if the history is not partial * unit tests * Update airbyte-cdk/python/airbyte_cdk/sources/file_based/README.md Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com> * Update docs * reset * Test to verify we remove files sorted (datetime, alphabetically) * comment * Update scenario * Rename method to get_state * If the file's ts is equal to the earliest ts, only sync it if it's alphabetically greater than the file * add missing test * rename * rename and update comments * Update comment for clarity * inject the cursor * add interface * comment * Handle the case where the file has been modified since it was synced * Only inject from AbstractFileSource * keep the remote files in the stream slices * Use file_based typedefs * format * Update the comment * simplify the logic, update comment, and add a test * Add a comment * slightly cleaner * clean up * typing * comment * I think this is simpler to reason about * create the cursor in the source * update * Remove methods from FiledBasedStreamReader and AbstractFileBasedStream interface (#27736) * update the interface * Add a comment * rename --------- Co-authored-by: Catherine Noll <noll.catherine@gmail.com> Co-authored-by: clnoll <clnoll@users.noreply.github.com> Co-authored-by: Sherif A. Nada <snadalive@gmail.com>	2023-06-27 15:58:26 -07:00
Joe Reuter	8aba48810c	Low-code CDK: Serialize request body as string for connector builder module (#27657 ) * serialize request body as string * fix some bugs	2023-06-27 08:27:16 +02:00
midavadim	c44c3eae48	✨ CDK: availability check - handle HttpErrors which happen during slice extraction (#26630 ) * for availability check - handle HttpError that happens during slice extraction (reading of parent stream), updated reason messages, moved check availability call under common try/except which handles errors during usual stream read, moved log messages which indicate start of the stream sync before availability check to make it easier to understand which stream is the source of errors * why do we return here and not try next stream? * fixed bug in CheckStream, now we try to check availability for all streams	2023-06-23 13:15:25 -04:00
Joe Reuter	c53d1fa29d	Datetime inferrer: Improve detected formats (#27546 ) * consolidate formats * Automated Commit - Formatting Changes * consolidate formats * consolidate formats --------- Co-authored-by: flash1293 <flash1293@users.noreply.github.com>	2023-06-23 05:23:33 -04:00
Maxime Carbonneau-Leclerc	a45a1e3341	Maxi297/refactoring declarative state management (#27445 ) * [ISSUE #26581] per partition cursor * [ISSUE #26581] format * [ISSUE #26581] clean up state management * [ISSUE #26581] improving Hashabledict * [ISSUE #26581] format cdk * [ISSUE #26581] fix tests * [ISSUE #26581] code review from girarda * Retrigger pipeline * Decouple cursor and stream slicer and pushing state management as far up cursor as possible * Format cdk * Small fixes/comments * DatetimeBasedCursor should not update state based on slice (for now at least since it wasn't doing this before) * [ISSUE #26581] code review * Automated Commit - Formatting Changes * [ISSUE #26581] validation overlapping keys * [ISSUE #26581] add typing * [ISSUE #26581] code review * Remove SyncMode from stream_slices * Removing SyncMode from stream_slices up until SimpleRetriever and fixing typing * format cdk	2023-06-22 12:54:36 -04:00
Alexandre Girard	d548587161	[ISSUE #27289 ] Document macros output in the manifest schema (#27600 ) * Add example for macros * Update changelog * Revert "Update changelog" This reverts commit `2993e5820e`.	2023-06-22 09:14:17 -07:00
Brian Lai	02e4bd07f7	[26989] Add request filter for cloud and integration test fixtures for e2e sync testing (#27534 ) * add the request filters and integration test fixtures * pr feedback and some tweaks to the testing framework * optimize the cache for more hits * formatting * remove cache	2023-06-22 12:14:07 -04:00
Catherine Noll	a8e99a46e6	File CDK: define streams via glob list (#27476 )	2023-06-22 11:50:35 -04:00
Maxime Carbonneau-Leclerc	d9a5e2d873	🐛 Source zenloop: update to state per partition (#27556 ) * [ISSUE #26581] per partition cursor * [ISSUE #26581] format * [ISSUE #26581] clean up state management * [ISSUE #26581] improving Hashabledict * [ISSUE #26581] format cdk * [ISSUE #26581] fix tests * [ISSUE #26581] code review from girarda * Retrigger pipeline * [ISSUE #26581] code review * Automated Commit - Formatting Changes * [ISSUE #26581] validation overlapping keys * [ISSUE #26581] add typing * [ISSUE #26581] code review * [ISSUE #26607] zenloop migration (#27243) * [ISSUE #26607] zenloop migration implementation without tests * [ISSUE #26607] zenloop migration adding edge cases * [ISSUE #26607] add cursor field for state * [ISSUE #26607] update abnormal state * [ISSUE #26607] ensure default state * [ISSUE #26607] updating CATs state * [ISSUE #26607] revert migrating cursor * [ISSUE #26607] remove default cursor value * [ISSUE #26607] improve error message * [ISSUE #26607] changelog --------- Co-authored-by: Augustin <augustin@airbyte.io> * 🤖 Auto format source-zenloop code [skip ci] * Automated Commit - Formatting Changes * [ISSUE #26581] move partition serialization to JSON * Revert "[ISSUE #26607] zenloop migration (#27243)" This reverts commit `5c6f19b775`. * Revert "Revert "[ISSUE #26607] zenloop migration (#27243)"" This reverts commit `e363fd6cb8`. * [ISSUE #26607] update zenloop version * TMP specify cdk version * [ISSUE #26607] do not lock zenloop airbyte_cdk version * trigger pipeline * Automated Commit - Formatting Changes * trigger pipeline --------- Co-authored-by: Augustin <augustin@airbyte.io> Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>	2023-06-22 08:42:20 -05:00
Maxime Carbonneau-Leclerc	8926970c86	Issue 26581/per partition cursor (#27223 ) * [ISSUE #26581] per partition cursor * [ISSUE #26581] format * [ISSUE #26581] clean up state management * [ISSUE #26581] improving Hashabledict * [ISSUE #26581] format cdk * [ISSUE #26581] fix tests * [ISSUE #26581] code review from girarda * Retrigger pipeline * [ISSUE #26581] code review * Automated Commit - Formatting Changes * [ISSUE #26581] validation overlapping keys * [ISSUE #26581] add typing * [ISSUE #26581] code review * [ISSUE #26607] zenloop migration (#27243) * [ISSUE #26607] zenloop migration implementation without tests * [ISSUE #26607] zenloop migration adding edge cases * [ISSUE #26607] add cursor field for state * [ISSUE #26607] update abnormal state * [ISSUE #26607] ensure default state * [ISSUE #26607] updating CATs state * [ISSUE #26607] revert migrating cursor * [ISSUE #26607] remove default cursor value * [ISSUE #26607] improve error message * [ISSUE #26607] changelog --------- Co-authored-by: Augustin <augustin@airbyte.io> * 🤖 Auto format source-zenloop code [skip ci] * Automated Commit - Formatting Changes * [ISSUE #26581] move partition serialization to JSON * Revert "[ISSUE #26607] zenloop migration (#27243)" This reverts commit `5c6f19b775`. * [ISSUE #26607] revert zenloop --------- Co-authored-by: Augustin <augustin@airbyte.io> Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>	2023-06-21 12:59:11 -04:00
Alexandre Girard	dee2e9a905	🐛 Use url encoding in oauth refresh request (#27523 ) * Revert "🐛 CDK: replace `data` with `json` when making OAuth calls (#27350)" This reverts commit `780f4415d9`. * Revert "Set content-type header on oauth request (#27225)" This reverts commit `2864f72ff4`.	2023-06-20 14:41:06 -07:00
Catherine Noll	f464a330f8	File-based CDK module scaffolding (#27122 ) Includes CSV schema inference & record parser (#27176) --------- Co-authored-by: Sherif A. Nada <snadalive@gmail.com> Co-authored-by: Alexandre Girard <alexandre@airbyte.io>	2023-06-19 11:01:11 -04:00
Denys Davydov	780f4415d9	🐛 CDK: replace `data` with `json` when making OAuth calls (#27350 ) * Connector health: source hubspot, gitlab, snapchat-marketing: fix builds * Airbyte CDK: replace data with json when making OAuth calls	2023-06-14 22:51:37 +03:00
Maxime Carbonneau-Leclerc	f48849fdb4	[ISSUE #26909 ] adding message repository (#27158 ) * [ISSUE #26909] adding message repository * Automated Commit - Formatting Changes * [ISSUE #26909] improve entrypoint error handling * format CDK * [ISSUE #26909] adding an integration test	2023-06-13 08:40:55 -04:00
Alexandre Girard	2864f72ff4	Set content-type header on oauth request (#27225 ) * Set content-type header on oauth authenticator * Revert "Set content-type header on oauth authenticator" This reverts commit `1e6815e9bb`. * Set header on oauth request * Fix test * Verify header is set * Automated Commit - Formatting Changes --------- Co-authored-by: girarda <girarda@users.noreply.github.com>	2023-06-12 13:29:53 -04:00
Joe Reuter	d6512dea2c	CDK: Datetime format inferrer (#27071 ) * datetime inferrer class * format * pass inferred date formats along * review comments	2023-06-09 10:33:54 +02:00
Maxime Carbonneau-Leclerc	13834395bb	[ISSUE #26568 ] make DatetimeBasedCursor.end_datetime optional (#27031 ) * [ISSUE #26568] make DatetimeBasedCursor.end_datetime optional * [ISSUE #26568] ensure model_to_component_factory create end_datetime properly * [ISSUE #26568] fix typing * [ISSUE #26568] improve end_datetime documentation	2023-06-08 09:51:40 -04:00
Joe Reuter	c4981a72db	Low code CDK: Allow query param / body injection for api key authenticator (#26953 ) * make authenticator more flexible * format * format * format * format * format * fix problem * code review	2023-06-07 15:08:13 +02:00
Maxime Carbonneau-Leclerc	4625cef571	[ISSUE #26909 ] add latest connector config control message to connect… (#26922 ) * [ISSUE #26909] add latest connector config control message to connector builder API * [ISSUE #26909] flake * Automated Commit - Formatting Changes * [ISSUE #26909] fallback on in-memory dict if no config control message * [ISSUE #26909] update and add tests	2023-06-07 08:31:45 -04:00
Joe Reuter	b34fb00660	Extend low code OAuthAuthenticator with token refresh capabilities (#26966 ) * wip * Automated Commit - Formatting Changes * add documentation * tests and fixes * fix tests * more documentation * revert * changes as discussed * fix case * add docstring * add details to schema * format * fix bug --------- Co-authored-by: flash1293 <flash1293@users.noreply.github.com>	2023-06-07 10:51:59 +02:00
Maxime Carbonneau-Leclerc	b5c0ac15ec	[ISSUE #26570 ] make step and cursor_granularity optional (#26952 ) * [ISSUE #26570] make step and cursor_granularity optional * [ISSUE #26570] fix typos	2023-06-05 09:10:48 -04:00
Maxime Carbonneau-Leclerc	d54a68640f	Improving error messages to have better messaging in datadog and the … (#26860 ) * Improving error messages to have better messaging in datadog and the frontend * fixing tests	2023-05-31 15:36:27 -04:00
Joe Reuter	ec5aa7bab6	CDK: Improve schema detection (#26741 ) * improve schema detection * improve schema detection * review comment * Automated Commit - Formatting Changes --------- Co-authored-by: flash1293 <flash1293@users.noreply.github.com>	2023-05-31 09:57:08 -04:00
Joe Reuter	4a041bf77d	Low code CDK: Allow nested objects for request_body_json (#26474 ) * allow nested JSON * add test for boolean * review comment * change for testing * try fix * try another fix * Revert "change for testing" This reverts commit `931b935778`. * Revert "try fix" This reverts commit `6f1c6c0e4b`.	2023-05-26 10:52:24 +02:00
Brian Lai	5707e477ad	low-code cdk: make page_size optional for offset and page increment strategies (#26056 ) * make page_size optional * Automated Commit - Formatting Changes --------- Co-authored-by: brianjlai <brianjlai@users.noreply.github.com>	2023-05-24 17:21:41 -04:00
Joe Reuter	dbb766c255	Low code CDK: Make refresh token in oauth authenticator optional (#26031 ) * make refresh_token optional * format * add back type annotation	2023-05-15 14:47:48 +02:00
Catherine Noll	051c656aba	[lowcode] OAuth Authenticator: cast token expiry time to number (#26020 )	2023-05-12 14:31:20 -04:00
Alexandre Girard	7443970de3	low-code: Use Jinja sandbox environment and prevent use of range method (#25589 ) * secure the jinja environment * format * Update comment * remove extra test * remove lambda * Update * Raise an error on undefined variables * remove unused import * add unit tests to missing context vars and adjust error message --------- Co-authored-by: brianjlai <brian.lai@airbyte.io> Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>	2023-05-08 19:34:53 -04:00
Maxime Carbonneau-Leclerc	0efc18a114	[ISSUE #24720 ] connector builder set slice descriptor (#25677 )	2023-05-01 12:18:22 -04:00
Maxime Carbonneau-Leclerc	b26c897a8b	[ISSUE #25646 ] support parsing of non utc dates (#25665 ) * [ISSUE #25646] support parsing of non utc dates * [ISSUE #25646] improve parsing * [ISSUE #25646] removing timezone for DatetimeParser interface * [ISSUE #25646] fix tests	2023-05-01 12:16:44 -04:00
Alexandre Girard	e41060c02c	low-code: Fix type check in DeclarativeStream (#25533 ) * Set right type * Update the comment * Update * format * Update comment	2023-04-26 15:49:10 -07:00
Jonathan Pearlin	2ebfa459cf	Publish stream status messages in CDK (#24994 ) * Publish stream status messages in CDK * Automated Commit - Formatting Changes * Convert to StreamDescriptor * Automated Commit - Formatting Changes * Bump to latest protocol model * Automated Commit - Formatting Changes * Bump protocol version * Add tests for stream status message creation * Formatting * Formatting * Fix failing test * Actually emit state message * Automated Commit - Formatting Changes * Bump airbyte-protocol * PR feedback * Fix parameter input * Correctly yield status message * PR feedback * Formatting * Fix failing tests * Automated Commit - Formatting Changes * Revert accidental change * Automated Change * Replace STOPPED with COMPLETE/INCOMPLETE * Update source-facebook-marketing changelog * Revert "Update source-facebook-marketing changelog" This reverts commit `709edb800c`. --------- Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>	2023-04-26 10:30:36 -04:00
Maxime Carbonneau-Leclerc	3cc67a6d9e	[ISSUE #23382 ] ignore backoff configuration on test reads (#25429 )	2023-04-26 08:36:59 -04:00
Alexandre Girard	250c3b1c87	low-code: Delete now_local macro (#25404 ) * Delete now_local macro * Remove from reference docs * remove example	2023-04-25 20:40:46 -07:00
Alexandre Girard	645763588c	low-code: Alias stream_interval and stream_partition to stream_slice in interpolation context (#25373 ) * add aliases * Raise error if the alias is found in the context * format * Comment * Automated Commit - Formatting Changes * rename to stream partition in greenhouse manifest * Revert "rename to stream partition in greenhouse manifest" This reverts commit `d513ef418f`. * Clean up test * Other test * last test --------- Co-authored-by: girarda <girarda@users.noreply.github.com>	2023-04-24 18:25:54 -07:00
Maxime Carbonneau-Leclerc	4d65fa1b98	[ISSUE #23994 ] make MessageGrouper use AirbyteEntrypoint (#25402 ) * [ISSUE #23994] make MessageGrouper use AirbyteEntrypoint * [ISSUE #23994] code review	2023-04-24 11:24:15 -04:00
Alexandre Girard	15f90e3a2f	Fix and document macros and interpolation variables (#25305 ) * Fix and document macros * cleanup * dots * Add tests and refactor * Update * Add an example * Document variables * Mention now_local is not recommended	2023-04-21 10:58:53 -07:00
Alexandre Girard	1e8cf8f5d5	low-code: Do not apply transforms on AirbyteLogMessages and AirbyteTraceMessages (#25290 ) * Check the input type before applying transformations * format * remove debug prints	2023-04-20 14:12:22 -07:00
Alexandre Girard	fc3655c12a	low-code: Clean up SessionTokenAuthenticator interface (#25086 ) * Username and session token are optional fields * update * Add titles, descriptions, and examples * Automated Commit - Formatting Changes * fix a small typo --------- Co-authored-by: girarda <girarda@users.noreply.github.com>	2023-04-17 14:42:49 -07:00
Alexandre Girard	3841141913	Fix manifest_declarative_source + add unit tests (#25217 ) * Fix + unit test * Add a test with pagination * Add a test with partition router * Make sure _fetch_next_page is called with the right arguments * Automated Commit - Formatting Changes * pagination with partitions * refactor * clean up * format --------- Co-authored-by: girarda <girarda@users.noreply.github.com>	2023-04-14 14:05:22 -07:00
Alexandre Girard	f3799280f2	connector builder: Emit message at start of slice (#25180 ) * Move condition for yielding the slice message to an overwritable method * Automated Commit - Formatting Changes * yield the slice log messages * same for incremental * refactor * Revert "refactor" This reverts commit `c594365bd8`. * move flag from factory to source * set the flag * remove debug print * halfmock * clean up * Add a test for a single page * Add another test * Pass the flag * rename --------- Co-authored-by: girarda <girarda@users.noreply.github.com>	2023-04-14 10:23:59 -07:00
Alexandre Girard	71fc3dd517	Connector builder: set pages and slices limits (#25121 ) * Set limits * refactor and add unit tests * Update as per comments	2023-04-12 14:46:43 -07:00
Alexandre Girard	f969ebd63b	low-code: Add some unit tests for CompositeErrorHandler (#24930 ) * Add some unit tests * update * fix indent * fix warnings	2023-04-10 12:12:36 -07:00
Brian Lai	3ba15b5fcb	Decouple flags for debug messages from connector builder log messages (#24881 ) * decouple debug message flags from connector builder grouping messages * Automated Commit - Formatting Changes * pr feedback simplifying configs a bit --------- Co-authored-by: brianjlai <brianjlai@users.noreply.github.com>	2023-04-06 12:16:05 -04:00
Alexandre Girard	4b324c3084	Low-code: fix duplicate stream_slicer update (#24827 ) * first tentative fix * cleaner fix * refactor test * format * format * move to utils file * use simpler implementation	2023-04-04 15:40:33 -07:00
Serhii Chvaliuk	032f9b8045	Low-Code CDK: make RecordFilter.filter_records a generator (#24772 ) Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>	2023-04-03 13:29:24 +03:00
Alexandre Girard	c3b017c7b5	Add auth flow to declarative manifest schema (#24441 ) * Add auth flow to declarative manifest schema * Rename * fix rename * set advanced_auth * Automated Commit - Formatting Changes * update unit test * format * Add examples * example -> examples * add missing examples * Automated Commit - Formatting Changes --------- Co-authored-by: girarda <girarda@users.noreply.github.com>	2023-03-29 16:06:56 -05:00
Augustin	bad5bce8ce	CDK: remove unexpected error swallowing on abstract source's `check` method (#24240 )	2023-03-23 13:04:51 +00:00
Alexandre Girard	edfc59533d	Connector builder: Port "send stacktrace when error on read" to CDK connector builder module (#24173 ) * wip * fix unit test * fix other unit test * format * reset * format * missing unit test * yield a LogMessage on error * format * format * fix unit tests * yield a trace message instead of a log message * format * fix bad merge	2023-03-21 17:22:08 -07:00
Catherine Noll	f4fd4d98a2	Connector Builder: Make connector_builder part of the CDK package (#24280 )	2023-03-21 13:31:16 -04:00

266 Commits