mirror of synced 2026-01-06 06:04:16 -05:00

Alexandre Girard c98f196d64 [low-code connectors] Rename decode_response reference to response (#14877 )

* checkout files from test branch

* read_incremental works

* reset to master

* remove dead code

* comment

* fix

* Add test

* comments

* utc

* format

* small fix

* Add test with rfc3339

* remove unused param

* fix test

* configurable state checkpointing

* update test

* start working on retrier

* retry predicate

* return response status

* look in error message

* cleanup test

* constant backoff strategy

* chain backoff strategy

* chain retrier

* Add to class types registry

* extract backoff time from header

* wait until

* update

* split file

* parse_records

* classmethod

* delete dead code

* comment

* comment

* comments

* fix

* test for instantiating chain retrier

* fix parsing

* cleanup

* fix

* reset

* never raise on http error

* remove print

* comment

* comment

* comment

* comment

* remove prints

* add declarative stream to registry

* start working on limit paginator

* support for offset pagination

* tests

* move limit value

* extract request option

* boilerplate

* page increment

* delete offset paginator

* update conditional paginator

* refactor and fix test

* fix test

* small fix

* Delete dead code

* Add docstrings

* quick fix

* exponential backoff

* fix test

* fix

* delete unused properties

* fix

* missing unit tests

* uppercase

* docstrings

* rename to success

* compare full request instead of just url

* renmae module

* rename test file

* rename interface

* rename default retrier

* rename to compositeerrorhandler

* fix missing renames

* move action to filter

* str -> minmaxdatetime

* small fixes

* plural

* add example

* handle header variations

* also fix wait time from

* allow using a regex to extract the value

* group()

* docstring

* add docs

* update comment

* docstrings

* fix tests

* rename param

* cleanup stop_condition

* cleanup

* Add examples

* interpolated pagination strategy

* dont need duplicate class

* docstrings

* more docstrings

* docstrings

* update comment

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/http_requester.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* version: Update Parquet library to latest release (#14502)

The upstream Parquet library that is currently pinned for use in the S3 destination plugin is over a year old. The current version is generating invalid schemas for date-time with time-zone fields which appears to be addressed in the `1.12.3` release of the library in commit c72862b613

* merge

* 🎉 Source Github: improve schema for stream `pull_request_commits` added "null" (#14613)

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* Docs: Fixed broken links (#14622)

* fixing broken links

* more broken links

* source-hubspot: change mentioning of Mailchimp into HubSpot  doc (#14620)

* Helm Chart: Add external temporal option (#14597)

* conflict env configmap and chart lock

* reverting lock

* add eof lines and documentation on values yaml

* conflict json file

* rollback json

* solve conflict

* correct minio with new version

Co-authored-by: Guy Feldman <gfeldman@86labs.com>

* 🎉 Add YAML format to source-file reader (#14588)

* Add yaml reader

* Update docs

* Bumpversion of connector

* bump docs

* Update pyarrow dependency

* Upgrade pandas dependency

* auto-bump connector version

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* 🎉 Source Okta: add GroupMembers stream (#14380)

* add Group_Members stream to okta source

- Group_Members return a list of users, the same schema of Users stream.
- Create a shared schema users, and both group_members and users sechema use it as a reference.
- Add Group_Members stream to source connector

* add tests and fix logs schema

- fix the test error: None is not one of enums though the enum type includes both string and null, it comes from json schema validator
ddb87afad8/jsonschema/_validators.py (L279-L285)
- change grouop_members to use id as the cursor field since `filter` is not supported in the query string
- fix the abnormal state test on logs stream, when since is abnormally large, until has to defined, an equal or a larger value
- remove logs stream from full sync test, because 2 full sync always has a gap -- at least a new log about users or groups api.

* last polish before submit the PR

- bump docker version
- update changelog
- add the right abnormal value for logs stream
- correct the sample catalog

* address comments::

- improve comments for until parameter under the logs stream
- add use_cache on groupMembers

* add use_cache to Group_Members

* change configured_catalog to test

* auto-bump connector version

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* split test files

* renames

* missing unit test

* add missing unit tests

* rename

* assert isinstance

* start extracting to their own files

* use final instead of classmethod

* assert we retry 429 errors

* Add log

* replace asserts with valueexceptions

* delete superfluous print statement

* fix factory so we don't need to union everything with strings

* get class_name from type

* remove from class types registry

* process error handlers one at a time

* sort

* delete print statement

* comment

* comment

* format

* delete unused file

* comment

* interpolatedboolean

* comment

* not optional

* not optional

* unit tests

* fix request body data

* add test

* move file to right module

* update

* reset to master

* format

* rename to pass_by

* rename to page size

* fix

* add test

* fix body data

* delete extra newlines

* move to subpackage

* fix imports

* handle str body data

* simplify

* fix typing

* always return a map

* rename to inject_into

* only accept enum

* delete conditional paginator

* only return body data

* rename decoded response to response

* decoded_response -> response

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: Tobias Macey <tmacey@boundlessnotions.com>
Co-authored-by: Serhii Chvaliuk <grubberr@gmail.com>
Co-authored-by: Amruta Ranade <11484018+Amruta-Ranade@users.noreply.github.com>
Co-authored-by: Bas Beelen <bjgbeelen@gmail.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Guy Feldman <gfeldman@86labs.com>
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Yiyang Li <yiyangli2010@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

2022-07-21 02:58:22 -07:00

.github

fix: fix helm publish workflow (#14884 )

2022-07-20 18:02:34 +03:00

.vscode

Adding VS Code settings to project workspace (#12319 )

2022-05-10 14:23:19 -04:00

airbyte-analytics

Start publishing proper artifacts. (#13484 )

2022-06-06 17:15:25 +08:00

airbyte-api

update open api in correct place (#14652 )

2022-07-13 10:05:05 -07:00

airbyte-bootloader

Apply more Best Practices PMD rules (#14772 )

2022-07-20 14:28:47 -07:00

airbyte-cdk

[low-code connectors] Rename decode_response reference to response (#14877 )

2022-07-21 02:58:22 -07:00

airbyte-cli

Un-Revert OSS branch build for Cloud workflow (#11808 )

2022-04-08 15:17:04 -07:00

airbyte-commons

Apply more Best Practices PMD rules (#14772 )

2022-07-20 14:28:47 -07:00

airbyte-commons-cli

Start publishing proper artifacts. (#13484 )

2022-06-06 17:15:25 +08:00

airbyte-commons-docker

Start publishing proper artifacts. (#13484 )

2022-06-06 17:15:25 +08:00

airbyte-config

Remove additionalProperties from JDBC source connectors (#14574 )

2022-07-21 11:01:34 +03:00

airbyte-container-orchestrator

Bump Airbyte version from 0.39.36-alpha to 0.39.37-alpha (#14719 )

2022-07-14 10:05:50 -07:00

airbyte-db

Apply more Best Practices PMD rules (#14772 )

2022-07-20 14:28:47 -07:00

airbyte-integrations

Remove additionalProperties from JDBC source connectors (#14574 )

2022-07-21 11:01:34 +03:00

airbyte-json-validation

Start publishing proper artifacts. (#13484 )

2022-06-06 17:15:25 +08:00

airbyte-metrics

Apply more Best Practices PMD rules (#14772 )

2022-07-20 14:28:47 -07:00

airbyte-notification

Apply more Best Practices PMD rules (#14772 )

2022-07-20 14:28:47 -07:00

airbyte-oauth

Apply more Best Practices PMD rules (#14772 )

2022-07-20 14:28:47 -07:00

airbyte-protocol

Apply Best Practices PMD rules (#14753 )

2022-07-15 15:01:04 -07:00

airbyte-queue

Apply Best Practices PMD rules (#14753 )

2022-07-15 15:01:04 -07:00

airbyte-scheduler

Apply more Best Practices PMD rules (#14772 )

2022-07-20 14:28:47 -07:00

airbyte-server

Apply more Best Practices PMD rules (#14772 )

2022-07-20 14:28:47 -07:00

airbyte-temporal

Un-Revert OSS branch build for Cloud workflow (#11808 )

2022-04-08 15:17:04 -07:00

airbyte-test-utils

Apply more Best Practices PMD rules (#14772 )

2022-07-20 14:28:47 -07:00

airbyte-tests

Apply more Best Practices PMD rules (#14772 )

2022-07-20 14:28:47 -07:00

airbyte-webapp

🪟 🎨 replication settings table column width fix (#13797 )

2022-07-21 12:44:39 +03:00

airbyte-webapp-e2e-tests

🪟 Per-Stream state new flow (#14634 )

2022-07-19 20:01:56 +01:00

airbyte-workers

Apply more Best Practices PMD rules (#14772 )

2022-07-20 14:28:47 -07:00

buildSrc

Java integration tests also depend on spotbugsMain (#14755 )

2022-07-15 15:21:30 -07:00

charts

Refactor OSS Helm Charts (#14794 )

2022-07-20 17:32:24 +03:00

docs

Remove additionalProperties from JDBC source connectors (#14574 )

2022-07-21 11:01:34 +03:00

docusaurus

fixing broken things in docs (#14685 )

2022-07-13 13:53:23 -04:00

gradle/wrapper

upgrade gradle from 7.3.3 -> 7.4 (#10645 )

2022-02-24 15:32:48 -08:00

kube

pass USE_STREAM_CAPABLE_STATE env var to containers/deployments (#14737 )

2022-07-15 10:53:34 -07:00

octavia-cli

Bump Airbyte version from 0.39.36-alpha to 0.39.37-alpha (#14719 )

2022-07-14 10:05:50 -07:00

resources/examples/airflow

Quality of life changes to Airflow Demo (#4895 )

2021-07-22 12:00:45 -07:00

temporal/dynamicconfig

improve temporal configuration on kubernetes (#4183 )

2021-06-17 17:50:50 -07:00

terraform

fix 'cannot reach server' error on demo instance (#10020 )

2022-06-28 09:17:28 -03:00

tools

Apply more Best Practices PMD rules (#14772 )

2022-07-20 14:28:47 -07:00

.bumpversion.cfg

Refactor OSS Helm Charts (#14794 )

2022-07-20 17:32:24 +03:00

.dockerignore

Prepare clean mount management (#126 )

2020-08-28 14:04:51 -07:00

.editorconfig

fix editorconfig for non-python (#9235 )

2021-12-31 12:39:22 +02:00

.env

pass USE_STREAM_CAPABLE_STATE env var to containers/deployments (#14737 )

2022-07-15 10:53:34 -07:00

.env.dev

[MVP] Integrate sentry to all java-based connectors (#9745 )

2022-01-29 16:58:35 -08:00

.gitignore

Adds symmary.md to gitignore (#14078 )

2022-06-23 09:23:16 -05:00

.pre-commit-config.yaml

Fix pre-commit hook failing because of black (#11737 )

2022-04-05 21:49:37 +01:00

.prettierignore

Add some dev-facing normalization docs (#13780 )

2022-06-15 08:21:14 -07:00

.python-version

Upgrade to Python 3.9 (#11763 )

2022-04-11 20:51:37 -07:00

.readthedocs.yaml

add reference docs for declarative source (#14501 )

2022-07-08 07:26:27 -07:00

.root

OSS Setup (#4 )

2020-07-29 10:45:16 -07:00

build.gradle

Upgrade spotless version and remove jvmargs workaround (#13705 )

2022-06-28 18:39:18 +02:00

CODE_OF_CONDUCT.md

Fix more links + use relative links (#799 )

2020-11-02 19:51:38 -08:00

codecov.yml

Relax codecov target to 90% (#12038 )

2022-04-14 11:51:11 -07:00

CONTRIBUTING.md

Fix more links + use relative links (#799 )

2020-11-02 19:51:38 -08:00

CONTRIBUTORS.md

Shorten our headers + adds contributors file (#6478 )

2021-09-27 10:45:50 -07:00

deps.toml

13524 Resolved host port for mac os (#14663 )

2022-07-13 18:36:49 +07:00

docker-compose-cloud.build.yaml

Sweep old scheduler code (#13400 )

2022-06-06 10:49:17 -07:00

docker-compose-cloud.buildx.yaml

fix missing db image in cloud KIND step (#14773 )

2022-07-15 15:54:53 -07:00

docker-compose.build.yaml

Sweep old scheduler code (#13400 )

2022-06-06 10:49:17 -07:00

docker-compose.debug.yaml

Faux Major Version Bump (#7876 )

2021-11-11 13:40:09 -08:00

docker-compose.yaml

pass USE_STREAM_CAPABLE_STATE env var to containers/deployments (#14737 )

2022-07-15 10:53:34 -07:00

gradle.properties

Upgrade spotless version and remove jvmargs workaround (#13705 )

2022-06-28 18:39:18 +02:00

gradlew

upgrade to Gradle 7.3.1 / Java 17 (#7964 )

2021-12-10 16:57:54 -08:00

gradlew.bat

upgrade to Gradle 7.2 (#7070 )

2021-10-15 14:03:30 -07:00

LICENSE

🎉 Update license for Core (#6479 )

2021-09-27 11:17:17 -07:00

LICENSE_SHORT

Bump year in license short to 2022 (#13191 )

2022-05-25 17:56:49 -07:00

publish-repositories.gradle

Prepare to remove fat jar. (#13427 )

2022-06-03 00:01:37 +08:00

pyproject.toml

🎉 Source Salesforce: speed up discovery >20x by leveraging parallel API calls (#10516 )

2022-02-27 19:03:39 -08:00

pytest.ini

SAT: DX improvements, better error handling and more (#4260 )

2021-06-22 03:42:10 -04:00

README.md

doc: fix deadlink in repo top level readme.md (#14564 )

2022-07-11 12:15:46 +02:00

settings.gradle

upgrade debezium version for postgres to 1.9.2 (#13368 )

2022-06-15 16:17:36 +05:30

spotbugs-exclude-filter-file.xml

add spotbugs (#10522 )

2022-03-11 12:05:17 -08:00

README.md

Introduction

Data integration made simple, secure and extensible. The new open-source standard to sync data from applications, APIs & databases to warehouses, lakes & other destinations.

Airbyte is on a mission to make data integration pipelines a commodity.

Maintenance-free connectors you can use in minutes. Just authenticate your sources and warehouse, and get connectors that adapt to schema and API changes for you.
Building new connectors made trivial. We make it very easy to add new connectors that you need, using the language of your choice, by offering scheduling and orchestration.
Designed to cover the long tail of connectors and needs. Benefit from the community's battle-tested connectors and adapt them to your specific needs.
Your data stays in your cloud. Have full control over your data, and the costs of your data transfers.
No more security compliance process to go through as Airbyte is self-hosted.
No more pricing indexed on volume, as cloud-based solutions offer.

Here's a list of our connectors with their health status.

Quick start

git clone https://github.com/airbytehq/airbyte.git
cd airbyte
docker-compose up

Now visit http://localhost:8000

Here is a step-by-step guide showing you how to load data from an API into a file, all on your computer.

Features

Built for extensibility: Adapt an existing connector to your needs or build a new one with ease.
Optional normalized schemas: Entirely customizable; start with raw data or with a suggested normalized schema.
Full-grade scheduler: Automate your replications with the frequency you need.
Real-time monitoring: We log all errors in full detail to help you diagnose them.
Incremental updates: Automated replications are based on incremental updates to reduce your data transfer costs.
Manual full refresh: Sometimes, you need to re-sync all your data to start again.
Debugging autonomy: Modify and debug pipelines as you see fit, without waiting.

See more on our website.

Contributing

We love contributions to Airbyte, big or small.

See our Contributing guide on how to get started. Not sure where to start? We’ve listed some good first issues to start with. If you have any questions, please open a draft PR or visit our Slack channel, where the core team can help answer them.

Note that you can create connectors in the language of your choice, as Airbyte connectors run as Docker containers.

Also, we will never ask you to maintain your connector. The goal is for the Airbyte team and the community to help maintain it. Let's call it crowdsourced maintenance!

Community support

For general help using Airbyte, please refer to the official Airbyte documentation. For additional help, you can use one of these channels to ask a question:

Slack (For live discussion with the Community and Airbyte team)
Forum (For deeper conversations about features, connectors, or problems)
GitHub (Bug reports, Contributions)
Twitter (Get the news fast)
Weekly office hours (Live informal 30-minute video call sessions with the Airbyte team)

Roadmap

Check out our roadmap to see what we are currently working on and what we have in mind for the coming weeks, months, and years.

License

See the LICENSE file for licensing information, and our FAQ for any questions you may have on that topic.

Languages

Python 52.6%

Kotlin 35.9%

Java 8.8%

MDX 0.9%

JavaScript 0.7%

Other 0.8%
