mirror of synced 2026-01-20 12:07:14 -05:00
Commit Graph

430 Commits

Author SHA1 Message Date
Alexandre Girard
5f48da9a67 [low-code] allow page size to be defined with string interpolation (#35735)
Co-authored-by: Dan Lecocq <dlecocq@sofi.org>
2024-03-05 16:03:16 -08:00
Ella Rohm-Ensing
a090088594 file cdk: handle scalar values that resolve to None (#35688)
## What
* Closes https://github.com/airbytehq/airbyte/issues/34151
* Closes https://github.com/airbytehq/oncall/issues/4386

## How
Handle cases where the Python value of a pyarrow scalar is `None`. This can happen because of null values in the data, or because of null-like values such as `NaT` (similar to `NaN`). Previously this was handled only for `None` binary values; it is now handled for `None` of any type.
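
A minimal sketch of the guard described above, not the CDK's actual implementation. `StubScalar` and `to_record_value` are hypothetical names introduced for illustration; the stub only mimics the `as_py()` method of a pyarrow scalar so that the example runs without pyarrow installed. The point is that the `None` check comes before any type-specific conversion.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass
class StubScalar:
    """Hypothetical stand-in for a pyarrow scalar; exposes only as_py()."""

    value: Any
    type_name: str  # assumed labels: "timestamp", "binary", or anything else

    def as_py(self) -> Any:
        return self.value


def to_record_value(scalar: StubScalar) -> Optional[Any]:
    """Convert a scalar to a JSON-friendly value, short-circuiting on None."""
    py_value = scalar.as_py()
    # Guard first: a null (or null-like) scalar resolves to None whatever its
    # declared type, so no type-specific branch below should ever see it.
    if py_value is None:
        return None
    if scalar.type_name == "timestamp":
        return py_value.isoformat()
    if scalar.type_name == "binary":
        return py_value.decode("utf-8")
    return py_value


# A null timestamp no longer reaches the .isoformat() branch.
print(to_record_value(StubScalar(None, "timestamp")))
print(to_record_value(StubScalar(datetime(2024, 3, 5), "timestamp")))
```

Without the early return, the timestamp branch would call `.isoformat()` on `None` and raise an `AttributeError`, which is the kind of failure a binary-only check would leave open for other types.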

## 🚨 User Impact 🚨
No breaking changes. After this CDK version is released, we should update the CDK dependency in S3 and in any other file sources that parse Parquet.


## Pre-merge Actions
*Expand the relevant checklist and delete the others.*

<details><summary><strong>Updating the Python CDK</strong></summary>

### Airbyter

Before merging:
- Pull Request description explains what problem it is solving
- Code change is unit tested
- Build and mypy checks pass
- Smoke test the change on at least one affected connector
   - On GitHub: Run [this workflow](https://github.com/airbytehq/airbyte/actions/workflows/connectors_tests.yml), passing `--use-local-cdk --name=source-<connector>` as options
   - Locally: `airbyte-ci connectors --use-local-cdk --name=source-<connector> test`
- PR is reviewed and approved
      
After merging:
- [Publish the CDK](https://github.com/airbytehq/airbyte/actions/workflows/publish-cdk-command-manually.yml)
   - The CDK does not follow proper semantic versioning. Choose minor if the change has significant user impact or is a breaking change. Choose patch otherwise.
   - Write a thoughtful changelog message so we know what was updated.
- Merge the platform PR that was auto-created for updating the Connector Builder's CDK version
   - This step is optional if the change does not affect the connector builder or declarative connectors.

</details>
2024-03-05 09:07:02 -08:00
Brian Lai
ef98194673 Emit final state message for full refresh syncs and consolidate read flows (#35622) 2024-03-05 01:05:06 -05:00
Danny Tiesling
e671aa320d 🐛 Source S3: fix exception when setting CSV stream delimiter to \t. (#35246)
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
2024-02-23 14:34:29 -03:00
Artem Inzhyyants
0954ad3d3a Airbyte CDK: add interpolation for request options (#35485)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2024-02-22 19:40:44 +01:00
Alexandre Girard
5724ca0cf0 Add ignore_stream_slicer_parameters_on_paginated_requests flag (#35462) 2024-02-21 14:14:37 -08:00
Artem Inzhyyants
3355c5c432 Airbyte CDK: add filter to RemoveFields (#35326)
Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
2024-02-21 13:49:10 +01:00
Alex Birdsall
385a70d89d Support user-specified test read limits in connector_builder code (#35312) 2024-02-16 15:53:26 -08:00
Brian Lai
2b87164b89 Emit multiple error trace messages and continue syncs by default (#35129) 2024-02-15 02:16:02 -05:00
Alexandre Girard
fc87183905 🐛 python cdk: mask oauth access key (#34931) 2024-02-14 22:25:18 -08:00
Maxime Carbonneau-Leclerc
60a2618154 [ISSUE #34910] add headers to HttpResponse for test framework (#35105) 2024-02-09 12:19:29 -05:00
Catherine Noll
e8910e427a File-based CDK: make incremental syncs concurrent (#34540) 2024-02-07 20:41:04 -05:00
Brian Lai
60686505f3 Revert "Emit multiple error trace messages and continue syncs by default" (#34990) 2024-02-07 19:47:15 -05:00
Maxime Carbonneau-Leclerc
3d9f70f9b0 [ISSUE #34755] do not propagate parameters on InlineSchemaLoader (#34853) 2024-02-07 15:41:03 -05:00
Brian Lai
cc2a6e229f Emit multiple error trace messages and continue syncs by default (#34636) 2024-02-07 13:34:43 -05:00
Catherine Noll
7f97f245bc CDK: fix flaky scenario-based tests by sorting on k & v (#34912) 2024-02-06 18:55:39 -05:00
Maxime Carbonneau-Leclerc
ca8590e2b4 Have StateBuilder return our actual state object and not simply a dict (#34625) 2024-01-30 08:46:03 -05:00
Maxime Carbonneau-Leclerc
2c8b47b100 Emit state when no partitions are generated for ccdk (#34605) 2024-01-30 08:45:49 -05:00
Catherine Noll
eb31e4d2ba File-based CDK: make full refresh concurrent (#34411) 2024-01-29 19:33:50 -05:00
Maxime Carbonneau-Leclerc
b9c1897cfc Fix concurrent deadlock (#34454) 2024-01-24 12:35:21 -05:00
Catherine Noll
e3e58cc063 Concurrent CDK: fix state message ordering (#34131) 2024-01-18 11:35:40 -05:00
Alexandre Girard
0faa69d899 concurrent cdk: improve resource usage and stop waiting on the main thread (#33669)
Co-authored-by: Augustin <augustin@airbyte.io>
2024-01-17 23:54:02 -08:00
Baz
cf7f700bbb 🎉 Airbyte CDK (File-based CDK): Stop the sync if the record could not be parsed (#32589) 2024-01-11 21:26:23 +02:00
Artem Inzhyyants
9c6aea19cd Airbyte CDK: handle private network exception as config error (#33751) 2024-01-10 15:20:40 +01:00
Anton Karpets
8305d05e52 Airbyte CDK: add POST method to HttpMocker (#34001)
Co-authored-by: maxi297 <maxime@airbyte.io>
2024-01-10 12:44:58 +02:00
Alexandre Girard
c8ca4b13ff 🐛 fix declarative oauth initialization (#32967)
Co-authored-by: girarda <girarda@users.noreply.github.com>
2024-01-08 17:40:48 -08:00
Eugene Kulak
4061f08f3d CDK: Add schema normalization to declarative stream (#32786)
Co-authored-by: Eugene Kulak <kulak.eugene@gmail.com>
Co-authored-by: Yevhenii Kurochkin <ykurochkin@flyaps.com>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2023-12-21 14:51:17 +02:00
Catherine Noll
86321b7945 Concurrent CDK: add state converter for ISO timestamps with milliseco… (#33531) 2023-12-20 11:53:59 -05:00
Eugene Kulak
25bdd30fd5 CDK: add SelectiveAuthenticator (#33526)
Co-authored-by: Eugene Kulak <kulak.eugene@gmail.com>
Co-authored-by: Yevhenii Kurochkin <ykurochkin@flyaps.com>
2023-12-20 01:59:50 +02:00
Joe Reuter
9065181e77 Unstructured parser: Support txt (#32929)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-12-15 11:31:45 +01:00
Maxime Carbonneau-Leclerc
66edb4b0f0 Issue 32871/more integration test tooling to test events stream (#33305)
Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
2023-12-14 09:23:19 -05:00
Yevhenii
0aea0eb560 CDK: Raise error on passing unsupported value formats as query parameters (#33060)
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2023-12-13 14:01:34 +00:00
Joe Reuter
55d5345bff Vector DB CDK: Refactor to improve readability (#33255)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-12-13 11:23:39 +00:00
Joe Reuter
c1e428f35c File CDK: Handle 422 errors separately (#33300) 2023-12-13 11:03:36 +00:00
Maxime Carbonneau-Leclerc
0c2d43fdf9 Issue 32871/extract trace message creation (#33227) 2023-12-11 09:20:45 -05:00
Maxime Carbonneau-Leclerc
d3f2aa548a [ISSUE #33202] allow for loose query params validation (#33226) 2023-12-11 08:57:41 -05:00
Augustin
0b33caecda Revert "[skip ci] formatting: add missing license headers (#33250)" (#33289) 2023-12-11 11:38:37 +01:00
Augustin
60c1cc01ad [skip ci] formatting: add missing license headers (#33250) 2023-12-11 10:15:18 +01:00
Joe Reuter
aa220fc515 Stop sync on traced exception (#33246)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-12-08 18:07:25 +01:00
Joe Reuter
f5ac5cfd80 File CDK: Add file processing via API to document file type parser (#32781)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-12-08 15:48:37 +01:00
Joe Reuter
7fd92e2a03 File CDK: Parser defined primary key (#33009)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-12-08 15:15:33 +01:00
Yevhenii
32ebd88402 CDK: low-code enable caching for parent streams (#32726) 2023-12-08 13:54:31 +00:00
Joe Reuter
5b682ef74f Unstructured parser: Handle parsing errors better (#32700)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-12-08 11:47:05 +01:00
Joe Reuter
21b3b2f638 Vector DB CDK: Fix special tokens (#33065) 2023-12-08 11:46:46 +01:00
Brian Lai
f8182bc18e airbyte-cdk: failed stream does not end the sync (#33136)
Co-authored-by: brianjlai <brianjlai@users.noreply.github.com>
2023-12-07 14:11:39 -05:00
Catherine Noll
7ed47ee7d9 File-based CDK: hide the primary key field from config (#33172) 2023-12-06 11:12:50 -05:00
Maxime Carbonneau-Leclerc
ba83309bb1 [ISSUE #32870] Adding entrypoint wrapper and migrating file based and… (#33103) 2023-12-06 08:46:38 -05:00
Maxime Carbonneau-Leclerc
69cb3a571e [ISSUE #32868] create HttpMocker (#32937) and [ISSUE #32869] response builder (#32983)
Co-authored-by: octavia-approvington <octavia-approvington@users.noreply.github.com>
2023-12-05 08:48:45 -05:00
Joe Reuter
28e8692624 Vector DB CDK: Add omit_raw_text flag (#32698)
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
2023-11-30 09:49:02 +01:00
Alexandre Girard
a84902e8be concurrent cdk: Read multiple streams concurrently (#32411) 2023-11-28 15:00:00 -08:00