
GitBook: [master] 84 pages and 72 assets modified

Author: Marcos Marx
Date: 2021-05-03 12:14:16 +00:00
Committed by: gitbook-bot
Parent: 9a45f7ca04
Commit: 7c70eb02cd
130 changed files with 445 additions and 397 deletions

View File

@@ -1,4 +1,4 @@
-# Overview
+# Introduction

![GitHub Workflow Status](https://img.shields.io/github/workflow/status/airbytehq/airbyte/Airbyte%20CI) ![License](https://img.shields.io/github/license/airbytehq/airbyte)
@@ -32,7 +32,7 @@ docker-compose up
Now visit [http://localhost:8000](http://localhost:8000)

-Here is a [step-by-step guide](docs/quickstart/getting-started.md) showing you how to load data from an API into a file, all on your computer.
+Here is a [step-by-step guide](https://github.com/airbytehq/airbyte/tree/e378d40236b6a34e1c1cb481c8952735ec687d88/docs/quickstart/getting-started.md) showing you how to load data from an API into a file, all on your computer.

## Features

[Binary image assets modified in this commit: previews and file names are not shown in this view.]
View File

@@ -10,11 +10,11 @@
* [Browsing Output Logs](tutorials/browsing-output-logs.md)
* [Upgrading Airbyte](tutorials/upgrading-airbyte.md)
* [Using the Airflow Airbyte Operator](tutorials/using-the-airflow-airbyte-operator.md)
-* [Contributing to Airbyte](contributing-to-airbyte/tutorials/README.md)
-* [A Beginner's Guide to the AirbyteCatalog](contributing-to-airbyte/tutorials/beginners-guide-to-catalog.md)
-* [Building a Toy Connector](contributing-to-airbyte/tutorials/toy-connector.md)
-* [Adding Incremental to a Source](contributing-to-airbyte/tutorials/adding-incremental-sync.md)
-* [Building a Python Source](contributing-to-airbyte/tutorials/building-a-python-source.md)
+* [Contributing to Airbyte](tutorials/tutorials/README.md)
+* [A Beginner's Guide to the AirbyteCatalog](tutorials/tutorials/beginners-guide-to-catalog.md)
+* [Building a Toy Connector](tutorials/tutorials/toy-connector.md)
+* [Adding Incremental to a Source](tutorials/tutorials/adding-incremental-sync.md)
+* [Building a Python Source](tutorials/tutorials/building-a-python-source.md)
* [Transformations and Normalization](tutorials/transformation-and-normalization/README.md)
* [Transformations with SQL \(Part 1/2\)](tutorials/transformation-and-normalization/transformations-with-sql.md)
* [Transformations with DBT \(Part 2/2\)](tutorials/transformation-and-normalization/transformations-with-dbt.md)
@@ -95,20 +95,20 @@
* [Monorepo Python Development](contributing-to-airbyte/building-new-connector/monorepo-python-development.md)
* [Testing Connectors](contributing-to-airbyte/building-new-connector/testing-connectors.md)
* [Standard Source Test Suite](contributing-to-airbyte/building-new-connector/standard-source-tests.md)
-* [Using the Airbyte CDK](contributing-to-airbyte/tutorials/cdk-tutorial-alpha/README.md)
-* [Getting Started](contributing-to-airbyte/tutorials/cdk-tutorial-alpha/0-getting-started.md)
-* [Step 1: Creating the Source](contributing-to-airbyte/tutorials/cdk-tutorial-alpha/1-creating-the-source.md)
-* [Step 2: Install Dependencies](contributing-to-airbyte/tutorials/cdk-tutorial-alpha/2-install-dependencies.md)
-* [Step 3: Define Inputs](contributing-to-airbyte/tutorials/cdk-tutorial-alpha/3-define-inputs.md)
-* [Step 4: Connection Checking](contributing-to-airbyte/tutorials/cdk-tutorial-alpha/4-connection-checking.md)
-* [Step 5: Declare the Schema](contributing-to-airbyte/tutorials/cdk-tutorial-alpha/5-declare-schema.md)
-* [Step 6: Read Data](contributing-to-airbyte/tutorials/cdk-tutorial-alpha/6-read-data.md)
-* [Step 7: Use the Connector in Airbyte](contributing-to-airbyte/tutorials/cdk-tutorial-alpha/7-use-connector-in-airbyte.md)
-* [Step 8: Test Connector](contributing-to-airbyte/tutorials/cdk-tutorial-alpha/8-test-your-connector.md)
+* [Using the Airbyte CDK](contributing-to-airbyte/cdk-tutorial-alpha/README.md)
+* [Getting Started](contributing-to-airbyte/cdk-tutorial-alpha/0-getting-started.md)
+* [Step 1: Creating the Source](contributing-to-airbyte/cdk-tutorial-alpha/1-creating-the-source.md)
+* [Step 2: Install Dependencies](contributing-to-airbyte/cdk-tutorial-alpha/2-install-dependencies.md)
+* [Step 3: Define Inputs](contributing-to-airbyte/cdk-tutorial-alpha/3-define-inputs.md)
+* [Step 4: Connection Checking](contributing-to-airbyte/cdk-tutorial-alpha/4-connection-checking.md)
+* [Step 5: Declare the Schema](contributing-to-airbyte/cdk-tutorial-alpha/5-declare-schema.md)
+* [Step 6: Read Data](contributing-to-airbyte/cdk-tutorial-alpha/6-read-data.md)
+* [Step 7: Use the Connector in Airbyte](contributing-to-airbyte/cdk-tutorial-alpha/7-use-connector-in-airbyte.md)
+* [Step 8: Test Connector](contributing-to-airbyte/cdk-tutorial-alpha/8-test-your-connector.md)
* [Code Style](contributing-to-airbyte/code-style.md)
* [Updating Documentation](contributing-to-airbyte/updating-documentation.md)
* [Templates](contributing-to-airbyte/templates/README.md)
* [Connector Doc Template](contributing-to-airbyte/templates/integration-documentation-template.md)
* [Understanding Airbyte](understanding-airbyte/README.md)
* [AirbyteCatalog & ConfiguredAirbyteCatalog](understanding-airbyte/catalog.md)
* [Airbyte Specification](understanding-airbyte/airbyte-specification.md)
@@ -121,9 +121,9 @@
* [High-level View](understanding-airbyte/high-level-view.md)
* [Workers & Jobs](understanding-airbyte/jobs.md)
* [Technical Stack](understanding-airbyte/tech-stack.md)
* [Change Data Capture \(CDC\)](understanding-airbyte/cdc.md)
* [Namespaces](understanding-airbyte/namespaces.md)
* [API documentation](api-documentation.md)
* [Project Overview](project-overview/README.md)
* [Roadmap](project-overview/roadmap.md)
* [Changelog](project-overview/changelog/README.md)
@@ -131,7 +131,7 @@
* [Connectors](project-overview/changelog/connectors.md)
* [License](project-overview/license.md)
* [Careers & Open Positions](career-and-open-positions/README.md)
* [Senior Software Engineer](career-and-open-positions/senior-software-engineer.md)
* [FAQ](faq/README.md)
* [Technical Support](faq/technical-support.md)
* [Getting Started](faq/getting-started.md)
@@ -144,3 +144,4 @@
* [Singer vs Airbyte](faq/differences-with/singer-vs-airbyte.md)
* [Pipelinewise vs Airbyte](faq/differences-with/pipelinewise-vs-airbyte.md)
* [Meltano vs Airbyte](faq/differences-with/meltano-vs-airbyte.md)

View File

@@ -1,10 +1,10 @@
-# Career & Open Positions
+# Careers & Open Positions

## **Who we are**

[Airbyte](http://airbyte.io) is the upcoming open-source standard for EL\(T\). We enable data teams to replicate data from applications, APIs, and databases to data warehouses, lakes, and other destinations. We believe only an open-source approach can solve the problem of data integration, as it enables us to cover the long tail of integrations while enabling teams to adapt prebuilt connectors to their needs.

Airbyte is remote friendly, with most of the team still based in Silicon Valley. We're fully open as a company. Our [**company handbook**](https://handbook.airbyte.io), [**culture & values**](https://handbook.airbyte.io/company/culture-and-values), [**strategy**](https://handbook.airbyte.io/strategy/strategy) and [**roadmap**](../project-overview/roadmap.md) are open to all.

We're backed by some of the world's [top investors](./#our-investors) and believe in product-led growth, where we build something awesome and let our product bring the users, rather than an outbound sales engine with cold calls.

@@ -50,12 +50,12 @@ If the written interview is a success, we might set you up with one or 2 additio

Once all of this is done, we will discuss the process internally and get back to you very fast \(velocity is everything here\)! So about 2-3 calls and one written interview, that's it!

## [**Our Benefits**](https://handbook.airbyte.io/people/benefits)

* **Flexible work environment as fully remote** - we don't look at when you log in, log out or how much time you work. We trust you, it's the only way remote can actually work.
* [**Unlimited vacation policy**](https://handbook.airbyte.io/people/time-off) with mandatory minimum time off - so you can fit work around your life.
* [**Co-working space stipend**](https://handbook.airbyte.io/people/expense-policy#work-space) - we provide everyone with $200/month to use on a coworking space of their choice, if any.
* [**Parental leave**](https://handbook.airbyte.io/people/time-off#parental-leave) \(for both parents, after one year spent with the company\) - so those raising families can do so while still working for us.
* **Open book policy** - we reimburse books that employees want to purchase for their professional and career development.
* **Continuous learning / training policy** - we sponsor the conferences and training programs you feel would add to your development in the company.
* **Health insurance** for those from countries that do not provide this freely. Through Savvy in the US, which means you can choose the insurance you want and will receive a stipend from the company.

View File

@@ -39,9 +39,9 @@ Wherever you want!
## **Perks!!!**

* **Flexible work environment as fully remote** - we don't look at when you log in, log out or how much time you work. We trust you, it's the only way remote can actually work.
* [**Unlimited vacation policy**](https://handbook.airbyte.io/people/time-off) with mandatory minimum time off - so you can fit work around your life.
* [**Co-working space stipend**](https://handbook.airbyte.io/people/expense-policy#work-space) - we provide everyone with $200/month to use on a coworking space of their choice, if any.
* [**Parental leave**](https://handbook.airbyte.io/people/time-off#parental-leave) \(for both parents, after one year spent with the company\) - so those raising families can do so while still working for us.
* **Open book policy** - we reimburse books that employees want to purchase for their professional and career development.
* **Continuous learning / training policy** - we sponsor the conferences and training programs you feel would add to your development in the company.
* **Health insurance** for those from countries that do not provide this freely. Through Savvy in the US, which means you can choose the insurance you want and will receive a stipend from the company.

View File

@@ -29,7 +29,7 @@ Here is a list of easy [good first issues](https://github.com/airbytehq/airbyte/
It's easy to add your own connector to Airbyte! **Since Airbyte connectors are encapsulated within Docker containers, you can use any language you like.** Here are some links on how to add sources and destinations. We haven't built the documentation for all languages yet, so don't hesitate to reach out to us if you'd like help developing connectors in other languages.

* See [Building new connectors](building-new-connector/) to get started.
-* Since we frequently build connectors in Python, on top of Singer or in Java, we've created generator libraries to get you started quickly: [Build Python Source Connectors](tutorials/building-a-python-source.md) and [Build Java Connectors](building-new-connector/java-connectors.md)
+* Since we frequently build connectors in Python, on top of Singer or in Java, we've created generator libraries to get you started quickly: [Build Python Source Connectors](../tutorials/tutorials/building-a-python-source.md) and [Build Java Connectors](building-new-connector/java-connectors.md)
* Integration tests \(tests that run a connector's image against an external resource\) can be run one of three ways, as detailed [here](building-new-connector/testing-connectors.md)

**Please note that, at no point in time, we will ask you to maintain your connector.** The goal is that the Airbyte team and the community help maintain the connector.

View File

@@ -42,9 +42,9 @@ npm run generate
and choose the relevant template. This will generate a new connector in the `airbyte-integrations/connectors/<your-connector>` directory.

Search the generated directory for "TODO"s and follow them to implement your connector.

-If you are developing a Python connector, you may find the [building a Python connector tutorial](../tutorials/building-a-python-source.md) helpful.
+If you are developing a Python connector, you may find the [building a Python connector tutorial](../../tutorials/tutorials/building-a-python-source.md) helpful.

### 2. Integration tests

@@ -54,14 +54,14 @@ At a minimum, your connector must implement the standard tests described in [Tes

If you're writing in Python or Java, skip this section -- it is provided automatically.

If you're writing in another language, please document the commands needed to:

1. Build your connector docker image \(usually this is just `docker build .` but let us know if there are necessary flags, gotchas, etc.\)
2. Run any unit or integration tests _in a Docker image_.

Your integration and unit tests must be runnable entirely within a Docker image. This is important to guarantee consistent build environments.

When you submit a PR to Airbyte with your connector, the reviewer will use the commands you provide to integrate your connector into Airbyte's build system as follows:

1. `:airbyte-integrations:connectors:source-<name>:build` should run unit tests and build the integration's Docker image
2. `:airbyte-integrations:connectors:source-<name>:integrationTest` should run integration tests including Airbyte's Standard test suite.

View File

@@ -4,7 +4,7 @@ This guide contains instructions on how to setup Python with Gradle within the A
## Python Connector Development

-Before working with connectors written in Python, we recommend running `./gradlew :airbyte-integrations:connectors:<connector directory name>:build` (e.g. `./gradlew :airbyte-integrations:connectors:source-postgres:build`) from the root project directory. This will create a `virtualenv` and install dependencies for the connector you want to work on as well as any internal Airbyte python packages it depends on.
+Before working with connectors written in Python, we recommend running `./gradlew :airbyte-integrations:connectors:<connector directory name>:build` \(e.g. `./gradlew :airbyte-integrations:connectors:source-postgres:build`\) from the root project directory. This will create a `virtualenv` and install dependencies for the connector you want to work on as well as any internal Airbyte python packages it depends on.

When iterating on a single connector, you will often iterate by running

View File

@@ -4,9 +4,10 @@
To ensure a minimum quality bar, Airbyte runs all connectors against the same set of integration tests \(sources & destinations have two different test suites\). Those tests ensure that each connector adheres to the [Airbyte Specification](../../understanding-airbyte/airbyte-specification.md) and responds correctly to Airbyte commands when provided valid \(or invalid\) inputs.

-*Note: If you are looking for reference documentation for the deprecated first version of test suites, see [Standard Tests (Legacy)](legacy-standard-source-tests.md).*
+_Note: If you are looking for reference documentation for the deprecated first version of test suites, see_ [_Standard Tests \(Legacy\)_](https://github.com/airbytehq/airbyte/tree/e378d40236b6a34e1c1cb481c8952735ec687d88/docs/contributing-to-airbyte/building-new-connector/legacy-standard-source-tests.md)_._

### Architecture of standard tests

The Standard Test Suite runs its tests against the connector's Docker image. It takes as input the configuration file `acceptance-tests-config.yml`.

![Standard test sequence diagram](../../.gitbook/assets/standard_tests_sequence_diagram.png)
@@ -15,42 +16,46 @@ The Standard Test Suite use pytest as a test runner and was built as pytest plug
Each test suite has a timeout and will fail if the limit is exceeded.

-See all the test cases, their description, and inputs in [Source Acceptance Tests](source-acceptance-tests.md).
+See all the test cases, their description, and inputs in [Source Acceptance Tests](https://github.com/airbytehq/airbyte/tree/e378d40236b6a34e1c1cb481c8952735ec687d88/docs/contributing-to-airbyte/building-new-connector/source-acceptance-tests.md).

### Setting up standard tests for your connector

Create `acceptance-test-config.yml`. In most cases, your connector already has this file in its root folder. Here is an example of the minimal `acceptance-test-config.yml`:

```yaml
connector_image: airbyte/source-some-connector:dev
tests:
  spec:
    - spec_path: "some_folder/spec.json"
```

Build your connector image if needed.

```text
docker build .
```

Run one of the two scripts in the root of the connector:

* `python -m pytest -p integration_tests.acceptance` - to run tests inside a virtual environment
* `./acceptance-test-docker.sh` - to run tests from a docker container

If the test fails you will see detail about the test and where to find its inputs and outputs to reproduce it. You can also debug failed tests by adding `--pdb --last-failed`:

```text
python -m pytest -p integration_tests.acceptance --pdb --last-failed
```

See other useful pytest options [here](https://docs.pytest.org/en/stable/usage.html)

### Dynamically managing inputs & resources used in standard tests

Since the inputs to standard tests are often static, the file-based runner is sufficient for most connectors. However, in some cases, you may need to run pre or post hooks to dynamically create or destroy resources for use in standard tests. For example, if we need to spin up a Redshift cluster to use in the test then tear it down afterwards, we need the ability to run code before and after the tests, as well as customize the Redshift cluster URL we pass to the standard tests. If you have need for this use case, please reach out to us via [Github](https://github.com/airbytehq/airbyte) or [Slack](https://slack.airbyte.io). We currently support it for Java & Python, and other languages can be made available upon request.

#### Python

Create a pytest yield-fixture with your custom setup/teardown code and place it in `integration_tests/acceptance.py`. For example, here is a fixture that starts a docker container before the tests and stops it before exit:

```python
@pytest.fixture(scope="session", autouse=True)
def connector_setup():
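    # The body of this fixture is cut off by the hunk above; the lines below are an
    # illustrative sketch only, assuming the `docker` Python SDK (`pip install docker`).
    import docker

    client = docker.from_env()
    # Start whatever external resource the tests need, e.g. a throwaway container.
    container = client.containers.run("your/resource-image:tag", detach=True)
    yield  # the acceptance tests run here
    # Teardown: stop the container once the test session finishes.
    container.stop()
```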
@@ -108,3 +113,4 @@ Note that integration tests can be triggered with a slightly different syntax fo
Commits to `master` attempt to launch integration tests. Two workflows launch for each commit: one is a launcher for integration tests, the other is the core build \(the same as the default for PR and branch builds\).

Since some of our connectors use rate-limited external resources, we don't want to overload from multiple commits to master. If a certain threshold of `master` integration tests are running, the integration test launcher passes but does not launch any tests. This can manually be re-run if necessary. The `master` build also runs every few hours automatically, and will launch the integration tests at that time.

View File

@@ -1,19 +1,19 @@
-# Building a Python Source for an HTTP API
+# Getting Started

## Summary

This is a step-by-step guide for how to create an Airbyte source in Python to read data from an HTTP API. We'll be using the Exchange Rates API as an example since it is simple and demonstrates a lot of the capabilities of the CDK.

## Requirements

* Python >= 3.7
* Docker
* NodeJS \(only used to generate the connector\). We'll remove the NodeJS dependency soon.

All the commands below assume that `python` points to a version of python >=3.7.9. On some systems, `python` points to a Python2 installation and `python3` points to Python3. If this is the case on your machine, substitute all `python` commands in this guide with `python3`.

## Checklist

* Step 1: Create the source using the template
* Step 2: Install dependencies for the new source
* Step 3: Define the inputs needed by your connector
@@ -23,4 +23,5 @@ All the commands below assume that `python` points to a version of python >=3.7.
* Step 7: Use the connector in Airbyte
* Step 8: Write unit tests or integration tests

Each step of the Creating a Source checklist is explained in more detail in the following steps. We also mention how you can submit the connector to be included with the general Airbyte release at the end of the tutorial.

View File

@@ -1,4 +1,4 @@
-# Step 1: Create the source using template
+# Step 1: Creating the Source

Airbyte provides a code generator which bootstraps the scaffolding for our connector.
@@ -11,5 +11,5 @@ $ npm run generate
Select the `Python HTTP API Source` template and then input the name of your connector. For this walk-through we will refer to our source as `python-http-example`. The finalized source code for this tutorial can be found [here](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-python-http-tutorial).

-The source we will build in this tutorial will pull data from the [Rates API](ratesapi.io), a free and open API which documents historical exchange rates for fiat currencies.
+The source we will build in this tutorial will pull data from the [Rates API](https://github.com/airbytehq/airbyte/tree/d940c78307f09f38198e50e54195052d762af944/docs/contributing-to-airbyte/tutorials/cdk-tutorial-alpha/ratesapi.io), a free and open API which documents historical exchange rates for fiat currencies.

View File

@@ -1,4 +1,5 @@
-# Step 2: Install dependencies for the new source
+# Step 2: Install Dependencies

Now that you've generated the module, let's navigate to its directory and install dependencies:

```text
@@ -12,12 +13,13 @@ This step sets up the initial python environment. **All** subsequent `python` or
Let's verify everything is working as intended. Run:

```text
python main_dev.py spec
```

You should see some output:

```text
{"type": "SPEC", "spec": {"documentationUrl": "https://docsurl.com", "connectionSpecification": {"$schema": "http://json-schema.org/draft-07/schema#", "title": "Python Http Tutorial Spec", "type": "object", "required": ["TODO"], "additionalProperties": false, "properties": {"TODO: This schema defines the configuration required for the source. This usually involves metadata such as database and/or authentication information.": {"type": "string", "description": "describe me"}}}}}
```
@@ -26,6 +28,7 @@ We just ran Airbyte Protocol's `spec` command! We'll talk more about this later,
Note that the `main_dev.py` file is a simple script that makes it easy to run your connector. Its invocation format is `python main_dev.py <command> [args]`. See the module's generated `README.md` for the commands it supports.

## Notes on iteration cycle

### Dependencies

Python dependencies for your source should be declared in `airbyte-integrations/connectors/source-<source-name>/setup.py` in the `install_requires` field. You will notice that a couple of Airbyte dependencies are already declared there. Do not remove these; they give your source access to the helper interfaces provided by the generator.
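For illustration, a `setup.py` with an extra runtime dependency in `install_requires` might look like the sketch below. The package name and the added dependency are placeholders, not the generated file's contents, and the generator's own Airbyte dependencies are omitted here.

```python
# Illustrative sketch of airbyte-integrations/connectors/source-<source-name>/setup.py (placeholders only).
from setuptools import find_packages, setup

setup(
    name="source_python_http_example",  # hypothetical package name
    packages=find_packages(),
    install_requires=[
        # The Airbyte helper packages declared by the generator live here -- do not remove them.
        "requests",  # example of an extra runtime dependency you might add
    ],
)
```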
@@ -33,9 +36,11 @@ Python dependencies for your source should be declared in `airbyte-integrations/
You may notice that there is a `requirements.txt` in your source's directory as well. Don't edit this. It is autogenerated and used to provide Airbyte dependencies. All your dependencies should be declared in `setup.py`.

### Development Environment

The commands we ran above created a [Python virtual environment](https://docs.python.org/3/tutorial/venv.html) for your source. If you want your IDE to auto complete and resolve dependencies properly, point it at the virtual env `airbyte-integrations/connectors/source-<source-name>/.venv`. Also anytime you change the dependencies in the `setup.py` make sure to re-run `pip install -r requirements.txt`.

### Iterating on your implementation

There are two ways we recommend iterating on a source. Consider using whichever one matches your style.

**Run the source using python**
@@ -70,3 +75,4 @@ docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/sample_files:/sample_files
Note: Each time you make a change to your implementation you need to re-build the connector image via `docker build . -t airbyte/source-<name>:dev`. This ensures the new python code is added into the docker container.

The nice thing about this approach is that you are running your source exactly as it will be run by Airbyte. The tradeoff is iteration is slightly slower, as the connector is re-built between each change.

View File

@@ -1,4 +1,4 @@
-# Step 3: Define the inputs required by your connector
+# Step 3: Define Inputs

Each connector declares the inputs it needs to read data from the underlying data source. This is the Airbyte Protocol's `spec` operation.
@@ -10,7 +10,7 @@ The generated code that Airbyte provides, handles implementing the `spec` method
Given that we'll be pulling currency data for our example source, we'll define the following `spec.json`:

```text
{
  "documentationUrl": "https://docs.airbyte.io/integrations/sources/exchangeratesapi",
  "connectionSpecification": {
@@ -37,5 +37,7 @@ Given that we'll pulling currency data for our example source, we'll define the
```

In addition to metadata, we define two inputs:

* `start_date`: The beginning date to start tracking currency exchange rates from
* `base`: The currency whose rates we're interested in tracking
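The middle of `spec.json` is elided by the hunk above. Purely as an illustration of where those two inputs live, a sketch of the corresponding `properties` block is shown below as a Python dict; the descriptions and any validation constraints are assumptions, not copied from the file.

```python
# Illustrative sketch only -- mirrors the shape of the JSON in spec.json, not its exact contents.
connection_specification_properties = {
    "start_date": {
        "type": "string",
        "description": "The beginning date to start tracking currency exchange rates from",  # from the bullet above
    },
    "base": {
        "type": "string",
        "description": "The currency whose rates we're interested in tracking",  # from the bullet above
    },
}
```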

View File

@@ -1,4 +1,5 @@
-# Step 4: Implement connection checking
+# Step 4: Connection Checking

The second operation in the Airbyte Protocol that we'll implement is the `check` operation.

This operation verifies that the input configuration supplied by the user can be used to connect to the underlying data source. Note that this user-supplied configuration has the values described in the `spec.json` filled in. In other words if the `spec.json` said that the source requires a `username` and `password` the config object might be `{ "username": "airbyte", "password": "password123" }`. You should then implement something that returns a json object reporting, given the credentials in the config, whether we were able to connect to the source.
@@ -38,7 +39,7 @@ Following the docstring instructions, we'll change the implementation to verify
Let's test out this implementation by creating two objects: a valid and an invalid config, and attempting to give them as input to the connector:

```text
echo '{"start_date": "2021-04-01", "base": "USD"}' > sample_files/config.json
echo '{"start_date": "2021-04-01", "base": "BTC"}' > sample_files/invalid_config.json
python main_dev.py check --config sample_files/config.json
@@ -47,7 +48,7 @@ python main_dev.py check --config sample_files/invalid_config.json
You should see output like the following:

```text
> python main_dev.py check --config sample_files/config.json
{"type": "CONNECTION_STATUS", "connectionStatus": {"status": "SUCCEEDED"}}
@@ -55,4 +56,5 @@ You should see output like the following:
{"type": "CONNECTION_STATUS", "connectionStatus": {"status": "FAILED", "message": "Input currency BTC is invalid. Please input one of the following currencies: {'DKK', 'USD', 'CZK', 'BGN', 'JPY'}"}} {"type": "CONNECTION_STATUS", "connectionStatus": {"status": "FAILED", "message": "Input currency BTC is invalid. Please input one of the following currencies: {'DKK', 'USD', 'CZK', 'BGN', 'JPY'}"}}
``` ```
While developing, we recommend storing configs which contain secrets in `secrets/config.json` because the `secrets` directory is gitignored by default. While developing, we recommend storing configs which contain secrets in `secrets/config.json` because the `secrets` directory is gitignored by default.
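For reference, a minimal sketch of the verification described above, using the valid-currency check implied by the error message in the sample output. The function name and the way the generated template wires it into the source class are assumptions, not the tutorial's exact code.

```python
# Illustrative sketch only -- the generated template wires a check like this into the source class.
from typing import Any, Mapping, Tuple

VALID_CURRENCIES = {"DKK", "USD", "CZK", "BGN", "JPY"}  # the set shown in the sample output above


def check_connection(logger, config: Mapping[str, Any]) -> Tuple[bool, Any]:
    """Return (True, None) on success, or (False, an error message) on failure."""
    base = config["base"]
    if base not in VALID_CURRENCIES:
        return False, f"Input currency {base} is invalid. Please input one of the following currencies: {VALID_CURRENCIES}"
    return True, None
```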

View File

@@ -1,14 +1,13 @@
-# Step 5: Declare the schema of your streams
+# Step 5: Declare the Schema

The `discover` method of the Airbyte Protocol returns an `AirbyteCatalog`: an object which declares all the streams output by a connector and their schemas. It also declares the sync modes supported by the stream \(full refresh or incremental\). See the [catalog tutorial](https://docs.airbyte.io/tutorials/tutorials/beginners-guide-to-catalog) for more information.

This is a simple task with the Airbyte CDK. For each stream in our connector we'll need to:

1. Create a python `class` in `source.py` which extends `HttpStream`
2. Place a `<stream_name>.json` file in the `source_<name>/schemas/` directory. The name of the file should be the snake\_case name of the stream whose schema it describes, and its contents should be the JsonSchema describing the output from that stream.

Let's create a class in `source.py` which extends `HttpStream`. You'll notice there are classes with extensive comments describing what needs to be done to implement various connector features. Feel free to read these classes as needed. But for the purposes of this tutorial, let's assume that we are adding classes from scratch either by deleting those generated classes or editing them to match the implementation below.

We'll begin by creating a stream to represent the data that we're pulling from the Exchange Rates API:

```python
class ExchangeRates(HttpStream):
    url_base = "https://api.ratesapi.io/"
@@ -32,8 +31,7 @@ class ExchangeRates(HttpStream):
        stream_slice: Mapping[str, Any] = None,
        next_page_token: Mapping[str, Any] = None,
    ) -> Iterable[Mapping]:
        return None # TODO
```

Note that this implementation is entirely empty -- we haven't actually done anything. We'll come back to this in the next step. But for now we just want to declare the schema of this stream. We'll declare this as a stream that the connector outputs by returning it from the `streams` method:
@@ -52,22 +50,23 @@ class SourcePythonHttpTutorial(AbstractSource):
# Other authenticators are available for API token-based auth and Oauth2. # Other authenticators are available for API token-based auth and Oauth2.
auth = NoAuth() auth = NoAuth()
return [ExchangeRates(authenticator=auth)] return [ExchangeRates(authenticator=auth)]
``` ```
Having created this stream in code, we'll put a file `exchange_rates.json` in the `schemas/` folder. You can download the JSON file describing the output schema [here](https://github.com/airbytehq/airbyte/tree/d940c78307f09f38198e50e54195052d762af944/docs/contributing-to-airbyte/tutorials/cdk-tutorial-alpha/http_api_source_assets/exchange_rates.json) for convenience and place it in `schemas/`.

With the `.json` schema file in place, let's see if the connector can now find this schema and produce a valid catalog:

```text
python main_dev.py discover --config sample_files/config.json
```

You should see some output like:

```text
{"type": "CATALOG", "catalog": {"streams": [{"name": "exchange_rates", "json_schema": {"$schema": "http://json-schema.org/draft-04/schema#", "type": "object", "properties": {"base": {"type": "string"}, "rates": {"type": "object", "properties": {"GBP": {"type": "number"}, "HKD": {"type": "number"}, "IDR": {"type": "number"}, "PHP": {"type": "number"}, "LVL": {"type": "number"}, "INR": {"type": "number"}, "CHF": {"type": "number"}, "MXN": {"type": "number"}, "SGD": {"type": "number"}, "CZK": {"type": "number"}, "THB": {"type": "number"}, "BGN": {"type": "number"}, "EUR": {"type": "number"}, "MYR": {"type": "number"}, "NOK": {"type": "number"}, "CNY": {"type": "number"}, "HRK": {"type": "number"}, "PLN": {"type": "number"}, "LTL": {"type": "number"}, "TRY": {"type": "number"}, "ZAR": {"type": "number"}, "CAD": {"type": "number"}, "BRL": {"type": "number"}, "RON": {"type": "number"}, "DKK": {"type": "number"}, "NZD": {"type": "number"}, "EEK": {"type": "number"}, "JPY": {"type": "number"}, "RUB": {"type": "number"}, "KRW": {"type": "number"}, "USD": {"type": "number"}, "AUD": {"type": "number"}, "HUF": {"type": "number"}, "SEK": {"type": "number"}}}, "date": {"type": "string"}}}, "supported_sync_modes": ["full_refresh"]}]}}
```
It's that simple! The connector now knows how to declare your stream's schema. We declare only one stream since our source is simple, but the principle is exactly the same if you had many streams.

You can also dynamically define schemas, but that's beyond the scope of this tutorial. See the [schema docs](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/bases/base-python/docs/schemas.md) for more information.
# Step 6: Read Data
Describing schemas is good and all, but at some point we have to start reading data! So let's get to work. But before we do, let's describe what we're about to do:

The `HttpStream` superclass, as described in the [concepts documentation](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/bases/base-python/README.md), facilitates reading data from HTTP endpoints. It contains built-in functions and helpers for:

* authentication
* pagination
* handling rate limiting or transient errors
* and other useful functionality

For it to be able to do this, we have to provide it with a few inputs:

* the URL base and path of the endpoint we'd like to hit
* how to parse the response from the API
* how to perform pagination

Optionally, we can provide additional inputs to customize requests:

* request parameters and headers
* how to recognize rate limit errors, and how long to wait \(by default it retries 429 and 5XX errors using exponential backoff\)
* HTTP method and request body if applicable

There are many other customizable options - you can find them in the [`base_python.cdk.streams.http.HttpStream`](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/bases/base-python/base_python/cdk/streams/http.py) class.
Let's begin by pulling data for the last day's rates by using the `/latest` endpoint:
```python
class ExchangeRates(HttpStream):
    url_base = "https://api.ratesapi.io/"

    def __init__(self, base: str, **kwargs):
        super().__init__()
        self.base = base

    def path(
        self,
        stream_state: Mapping[str, Any] = None,
        stream_slice: Mapping[str, Any] = None,
        next_page_token: Mapping[str, Any] = None,
    ) -> str:
        # Hit the /latest endpoint to fetch the most recent exchange rates
        return "latest"

    # request_params and parse_response are elided in this excerpt;
    # a sketch of them follows the numbered list below.

    def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
        # The API does not offer pagination,
        # so we return None to indicate there are no more pages in the response
        return None
```
This may look big, but that's just because there are lots of \(unused, for now\) parameters in these methods \(those can be hidden with Python's `**kwargs`, but don't worry about it for now\). Really we just added a few lines of "significant" code:
1. Added a constructor `__init__` which stores the `base` currency to query for.
2. `return {'base': self.base}` to add the `?base=<base-value>` query parameter to the request based on the `base` input by the user.
3. `return [response.json()]` to parse the response from the API to match the schema of our schema `.json` file.
4. `return "latest"` to indicate that we want to hit the `/latest` endpoint of the API to get the latest exchange rate data.
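For reference, here is a minimal sketch of what those methods can look like. The exact signatures below are assumptions based on the `HttpStream` base class rather than a verbatim copy of the tutorial's code, but the return values match the numbered list above:

```python
# Inside the ExchangeRates class:

def request_params(
    self,
    stream_state: Mapping[str, Any],
    stream_slice: Mapping[str, Any] = None,
    next_page_token: Mapping[str, Any] = None,
) -> MutableMapping[str, Any]:
    # Adds the ?base=<base-value> query parameter to every request
    return {"base": self.base}

def parse_response(
    self,
    response: requests.Response,
    **kwargs,
) -> Iterable[Mapping]:
    # The API returns a single JSON object, which already matches our schema
    return [response.json()]
```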
Let's also pass the `base` parameter input by the user to the stream class:
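A minimal sketch of the corresponding `streams` method on the source class \(the signature comes from the elided hunk of the diff; the only change being described is passing `config['base']` through to the stream\):

```python
def streams(self, config: Mapping[str, Any]) -> List[Stream]:
    # Hand the user-configured base currency to the stream
    return [ExchangeRates(base=config["base"])]
```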
We're now ready to query the API!

To do this, we'll need a [ConfiguredCatalog](https://docs.airbyte.io/tutorials/tutorials/beginners-guide-to-catalog). We've prepared one [here](https://github.com/airbytehq/airbyte/tree/d940c78307f09f38198e50e54195052d762af944/docs/contributing-to-airbyte/tutorials/cdk-tutorial-alpha/http_api_source_assets/configured_catalog.json) -- download this and place it in `sample_files/configured_catalog.json`. Then run:

```text
python main_dev.py read --config sample_files/config.json --catalog sample_files/configured_catalog.json
```

You should see some output lines, one of which is a record from the API:

```text
{"type": "RECORD", "record": {"stream": "exchange_rates", "data": {"base": "USD", "rates": {"GBP": 0.7196938353, "HKD": 7.7597848573, "IDR": 14482.4824162185, "ILS": 3.2412081092, "DKK": 6.1532478279, "INR": 74.7852709971, "CHF": 0.915763343, "MXN": 19.8439387671, "CZK": 21.3545717832, "SGD": 1.3261894911, "THB": 31.4398014067, "HRK": 6.2599917253, "EUR": 0.8274720728, "MYR": 4.0979726934, "NOK": 8.3043442284, "CNY": 6.4856433595, "BGN": 1.61836988, "PHP": 48.3516756309, "PLN": 3.770872983, "ZAR": 14.2690111709, "CAD": 1.2436905254, "ISK": 124.9482829954, "BRL": 5.4526272238, "RON": 4.0738932561, "NZD": 1.3841125362, "TRY": 8.3101365329, "JPY": 108.0182043856, "RUB": 74.9555647497, "KRW": 1111.7583781547, "USD": 1.0, "AUD": 1.2840711626, "HUF": 300.6206040546, "SEK": 8.3829540753}, "date": "2021-04-26"}, "emitted_at": 1619498062000}} {"type": "RECORD", "record": {"stream": "exchange_rates", "data": {"base": "USD", "rates": {"GBP": 0.7196938353, "HKD": 7.7597848573, "IDR": 14482.4824162185, "ILS": 3.2412081092, "DKK": 6.1532478279, "INR": 74.7852709971, "CHF": 0.915763343, "MXN": 19.8439387671, "CZK": 21.3545717832, "SGD": 1.3261894911, "THB": 31.4398014067, "HRK": 6.2599917253, "EUR": 0.8274720728, "MYR": 4.0979726934, "NOK": 8.3043442284, "CNY": 6.4856433595, "BGN": 1.61836988, "PHP": 48.3516756309, "PLN": 3.770872983, "ZAR": 14.2690111709, "CAD": 1.2436905254, "ISK": 124.9482829954, "BRL": 5.4526272238, "RON": 4.0738932561, "NZD": 1.3841125362, "TRY": 8.3101365329, "JPY": 108.0182043856, "RUB": 74.9555647497, "KRW": 1111.7583781547, "USD": 1.0, "AUD": 1.2840711626, "HUF": 300.6206040546, "SEK": 8.3829540753}, "date": "2021-04-26"}, "emitted_at": 1619498062000}}
``` ```
There we have it - a stream which reads data in just a few lines of code!
We theoretically _could_ stop here and call it a connector. But let's give adding incremental sync a shot.

## Adding incremental sync
To add incremental sync, we'll do a few things:
1. Pass the `start_date` param input by the user into the stream.
2. Declare the stream's `cursor_field`.
3. Implement the `get_updated_state` method.
4. Implement the `stream_slices` method.
5. Update the `path` method to specify the date to pull exchange rates for.
6. Update the configured catalog to use `incremental` sync when we're testing the stream.
We'll describe what each of these methods does below. Before we begin, it may help to familiarize yourself with how incremental sync works in Airbyte by reading the [docs on incremental](https://docs.airbyte.io/architecture/connections/incremental-append).
Let's do this by implementing the `get_updated_state` method inside the `ExchangeRates` class:

```python
    def get_updated_state(self, current_stream_state: MutableMapping[str, Any], latest_record: Mapping[str, Any]) -> Mapping[str, Any]:
        if current_stream_state is not None and 'date' in current_stream_state:
            current_parsed_date = datetime.strptime(current_stream_state['date'], '%Y-%m-%d')
            latest_record_date = datetime.strptime(latest_record['date'], '%Y-%m-%d')
            return {'date': max(current_parsed_date, latest_record_date).strftime('%Y-%m-%d')}
        else:
            return {'date': self.start_date.strftime('%Y-%m-%d')}
```
This implementation compares the date from the latest record with the date in the current state and takes the maximum as the "new" state object.
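The stream also needs to declare its cursor field \(step 2 in the list above\). A minimal sketch, assuming the record's `date` field is used as the cursor, which is consistent with the state object shown above:

```python
class ExchangeRates(HttpStream):
    # The field in each record that incremental syncs are tracked against
    cursor_field = "date"

    # ... __init__, path, parse_response, get_updated_state, etc. as before
```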
We'll implement the `stream_slices` method to return a list of the dates for which we should pull data:

```python
    def stream_slices(self, sync_mode, cursor_field: List[str] = None, stream_state: Mapping[str, Any] = None) -> Iterable[
            Optional[Mapping[str, any]]]:
        start_date = datetime.strptime(stream_state['date'], '%Y-%m-%d') if stream_state and 'date' in stream_state else self.start_date
        return self._chunk_date_range(start_date)
```
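The `_chunk_date_range` helper used above is not shown in this excerpt. A minimal sketch of one possible implementation, assuming one slice per day from the start date up to today \(it needs `datetime` and `timedelta` from the standard library\):

```python
def _chunk_date_range(self, start_date: datetime) -> List[Mapping[str, any]]:
    """Return a list of {'date': 'YYYY-MM-DD'} slices, one per day from start_date until now."""
    dates = []
    while start_date < datetime.now():
        dates.append({"date": start_date.strftime("%Y-%m-%d")})
        start_date += timedelta(days=1)
    return dates
```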
Each slice will cause an HTTP request to be made to the API. We can then use the information present in the `stream_slice` parameter \(a single element from the list we constructed in `stream_slices` above\) to set other configurations for the outgoing request like `path` or `request_params`. For more info about stream slicing, see [the slicing docs](https://github.com/airbytehq/airbyte/tree/d940c78307f09f38198e50e54195052d762af944/docs/contributing-to-airbyte/tutorials/concepts/stream_slices.md).
In order to pull data for a specific date, the Exchange Rates API requires that we pass the date as the path component of the URL. Let's override the `path` method to achieve this:
```python
    def path(self, stream_state: Mapping[str, Any] = None, stream_slice: Mapping[str, Any] = None, next_page_token: Mapping[str, Any] = None) -> str:
        return stream_slice['date']
```
With these changes, your implementation should look like the file [here](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-python-http-tutorial/source_python_http_tutorial/source.py).

The last thing we need to do is change the `sync_mode` field in the `sample_files/configured_catalog.json` to `incremental`:

```text
"sync_mode": "incremental",
```
We should now have a working implementation of incremental sync!
Let's try it out:

```text
python main_dev.py read --config sample_files/config.json --catalog sample_files/configured_catalog.json
```

You should see a bunch of `RECORD` messages and `STATE` messages. To verify that incremental sync is working, pass the input state back to the connector and run it again:

```text
# Save the latest state to sample_files/state.json
python main_dev.py read --config sample_files/config.json --catalog sample_files/configured_catalog.json | grep STATE | tail -n 1 | jq .state.data > sample_files/state.json
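
# Then read again, passing the saved state back in (this assumes the generated main_dev.py
# accepts a --state argument for the read command, which is the standard CDK behavior)
python main_dev.py read --config sample_files/config.json --catalog sample_files/configured_catalog.json --state sample_files/state.json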
```
You should see that only the record from the last date is being synced! This is acceptable behavior, since Airbyte requires at-least-once delivery of records, so repeating the last record twice is OK.

With that, we've implemented incremental sync for our connector!
# Step 7: Use the Connector in Airbyte
To use your connector in your own installation of Airbyte, build the docker image for your container by running `docker build . -t airbyte/source-python-http-example:dev`. Then, follow the instructions from the [building a toy source tutorial](https://docs.airbyte.io/tutorials/tutorials/toy-connector#use-the-connector-in-the-airbyte-ui) for using the connector in the Airbyte UI, replacing the name as appropriate.
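As a quick smoke test, once the image is built you can ask the container for its spec \(assuming the image tag from the `docker build` command above\):

```text
docker run --rm airbyte/source-python-http-example:dev spec
```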
Note: your built docker image must be accessible to the `docker` daemon running on the Airbyte node. If you're doing this tutorial locally, these instructions are sufficient. Otherwise you may need to push your Docker image to Dockerhub.
# Step 8: Test Connector
## Unit Tests

Add any relevant unit tests to the `unit_tests` directory. Unit tests should **not** depend on any secrets.

You can run the tests using `python -m pytest -s unit_tests`.
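For example, a minimal unit test for the incremental logic above might look like the following. This is a sketch only -- it assumes the package is named `source_python_http_example` \(use whatever name you generated\) and that the stream's constructor accepts the `start_date` parameter added for incremental sync:

```python
# unit_tests/test_streams.py (hypothetical file name)
from datetime import datetime

from source_python_http_example.source import ExchangeRates  # adjust to your package name


def test_get_updated_state_keeps_the_latest_date():
    stream = ExchangeRates(base="USD", start_date=datetime(2021, 4, 1))

    state = stream.get_updated_state(
        current_stream_state={"date": "2021-04-20"},
        latest_record={"date": "2021-04-26"},
    )

    # The newer of the two dates should win
    assert state == {"date": "2021-04-26"}
```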
## Integration Tests

Place any integration tests in the `integration_tests` directory such that they can be [discovered by pytest](https://docs.pytest.org/en/reorganize-docs/new-docs/user/naming_conventions.html).
## Standard Tests

Standard tests are a fixed set of tests Airbyte provides that every Airbyte source connector must pass. While they're only required if you intend to submit your connector to Airbyte, you might find them helpful in any case. See [Testing your connectors](https://docs.airbyte.io/contributing-to-airbyte/building-new-connector/testing-connectors).

If you want to submit this connector to become a default connector within Airbyte, follow steps 8 onwards from the [Python source checklist](https://docs.airbyte.io/tutorials/tutorials/building-a-python-source#step-8-set-up-standard-tests).
# Using the Airbyte CDK
# Tutorials
# CDK Python Source Tutorial (Alpha)
# On AWS ECS \(Coming Soon\)

{% hint style="info" %}
We do not currently support deployment on ECS.
{% endhint %}

The current iteration is not compatible with ECS. Airbyte currently relies on docker containers being able to create other docker containers. ECS does not permit containers to do this. We will be revising this strategy soon, so that we can be compatible with ECS and other container services.
# Example Use Cases
![](../.gitbook/assets/02_setting-zoom-connector-name.png)

The Zoom connector for Airbyte requires you to provide it with a Zoom JWT token. Let's take a detour and look at how to obtain one from Zoom.

### Obtaining a Zoom JWT Token

To obtain a Zoom JWT Token, login to your Zoom account and go to the [Zoom Marketplace](https://marketplace.zoom.us/). If this is your first time in the marketplace, you will need to agree to Zoom's marketplace terms of use.

Once you are in, you need to click on the **Develop** dropdown and then click on **Build App.**

![](../.gitbook/assets/03_click.png)

Clicking on **Build App** for the first time will display a modal for you to accept Zoom's API license and terms of use. Do accept if you agree and you will be presented with the below screen.

![](../.gitbook/assets/zoom-marketplace-build-screen%20%281%29.png)

Select **JWT** as the app you want to build and click on the **Create** button on the card. You will be presented with a modal to enter the app name; type in `airbyte-zoom`.
After copying it, click on the **Continue** button.
![](../.gitbook/assets/08_activate-webhook.png)

You will be taken to a screen to activate **Event Subscriptions**. Just leave it as is, as we won't be needing Webhooks. Click on **Continue**, and your app should be marked as activated.

### Connecting Zoom on Airbyte

So let's go back to the Airbyte web UI and provide it with the JWT token we copied from our Zoom app.

Now click on the **Set up source** button. You will see the below success message when the connection is made successfully.

![](../.gitbook/assets/setup-successful%20%282%29.png)

And you will be taken to the page to add your destination.
![](../.gitbook/assets/10_destination.png)

For our destination, we will be using a PostgreSQL database, since Tableau supports PostgreSQL as a data source. Click on the **add destination** button, and then in the drop down click on **+ add a new destination**. In the page that presents itself, add the destination name and choose the Postgres destination.

![](../.gitbook/assets/11_choose-postgres-destination.png)

To supply Airbyte with the PostgreSQL configuration parameters needed to make a PostgreSQL destination, we will spin off a PostgreSQL container with Docker using the following command in our terminal.

`docker run --rm --name airbyte-zoom-db -e POSTGRES_PASSWORD=password -v airbyte_zoom_data:/var/lib/postgresql/data -p 2000:5432 -d postgres`

This will spin up a Docker container and persist the data we will be replicating in the PostgreSQL database in a Docker volume `airbyte_zoom_data`.

Now, let's supply the above credentials to the Airbyte UI.

![](../.gitbook/assets/postgres_credentials.png)
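Given the `docker run` command above, the values to supply will look roughly like the following \(the host is an assumption and depends on where Airbyte is running; when Airbyte itself runs in Docker, `host.docker.internal` is typically needed instead of `localhost`\):

```text
Host:     host.docker.internal   (or localhost / your machine's IP, depending on your setup)
Port:     2000
Database: postgres               (the default database of the postgres image)
Username: postgres
Password: password
```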
Then click on the **Set up destination** button.
After the connection has been made to your PostgreSQL database successfully, Airbyte will display the schema of the data to be replicated.
Leave all the fields checked.

![](../.gitbook/assets/schema.png)

Select a **Sync frequency** of **manual** and then click on **Set up connection**.

After successfully making the connection, you will see your PostgreSQL destination. Click on the Launch button to start the data replication.

![](../.gitbook/assets/launch%20%281%29.png)

Then click on the **airbyte-zoom-destination** to see the Sync page.

![](../.gitbook/assets/sync-screen%20%283%29.png)

Syncing should take a few minutes or longer depending on the size of the data being replicated. Once Airbyte is done replicating the data, you will get a **succeeded** status.

Then, you can run the following SQL command on the PostgreSQL container to confirm that the sync was done successfully.

`docker exec airbyte-zoom-db psql -U postgres -c "SELECT * FROM public.users;"`
Go ahead and install Tableau on your machine. After the installation is complete, activate it.
Once your activation is successful, you will see your Tableau dashboard.

![](../.gitbook/assets/tableau-dashboard%20%281%29.png)

On the sidebar menu under the **To a Server** section, click on the **More…** menu. You will see a list of datasource connectors you can connect Tableau with.

![](../.gitbook/assets/datasources%20%284%29.png)

Select **PostgreSQL** and you will be presented with a connection credentials modal.

Fill in the same details of the PostgreSQL database we used as the destination in Airbyte.

![](../.gitbook/assets/18_fill-in-connection-details.png)
Next, click on the **Sign In** button. If the connection was made successfully, you will be connected to the PostgreSQL database in Tableau.
_Note: If you are having trouble connecting PostgreSQL with Tableau, it might be because the driver Tableau comes with for PostgreSQL might not work for newer versions of PostgreSQL. You can download the JDBC driver for PostgreSQL_ [_here_](https://www.tableau.com/support/drivers?_ga=2.62351404.1800241672.1616922684-1838321730.1615100968) _and follow the setup instructions._

Now that we have replicated our Zoom data into a PostgreSQL database using Airbyte's Zoom connector, and connected Tableau with our PostgreSQL database containing our Zoom data, let's proceed to creating the charts we need to visualize the time spent by a team in Zoom calls.

## Step 3: Create the charts on Tableau with the Zoom data
To create this chart, we will need to use the count of the meetings and the **Created At** field of the **meetings** table.
![](../.gitbook/assets/19_tableau-view-with-all-tables.png)

Drag the **meetings** table from the sidebar onto the space with the prompt.

Now that we have the meetings table, we can start building out the chart by clicking on **Sheet 1** at the bottom left of Tableau.

![](../.gitbook/assets/20_empty-meeting-sheet.png)

As stated earlier, we need **Created At**, but currently it's a String data type. Let's change that by converting it to a date & time. So right click on **Created At**, then select `ChangeDataType` and choose Date & Time. And that's it! That field is now of type **Date** & **Time**.

![](../.gitbook/assets/21_change-to-date-time.png)

Next, drag **Created At** to **Columns**.

![](../.gitbook/assets/22_drag-created-at.png)

Currently, we get the Created At in **YEAR**, but per our requirement we want them in Weeks, so right click on the **YEAR\(Created At\)** and choose **Week Number**.

![](../.gitbook/assets/change-to-per-week.png)

Tableau should now look like this:

Now, to finish up, we need to add the **meetings\(Count\) measure** Tableau already calculated for us in the **Rows** section. So drag **meetings\(Count\)** onto the Columns section to complete the chart.

![](../.gitbook/assets/evolution-of-meetings-per-week%20%282%29.png)

And now we are done with the very first chart. Let's save the sheet and create a new Dashboard that we will add this sheet to as well as the others we will be creating.
### Evolution of the number of participants for all meetings per week

For this chart, we will need to have a calculated field called **\# of meetings attended**, which will be an aggregate of the counts of rows matching a particular user's email in the `report_meeting_participants` table plotted against the **Created At** field of the **meetings** table. To get this done, right click on the **User Email** field. Select **create** and click on **calculatedField**, then enter the title of the field as **\# of meetings attended**. Next, enter the below formula:

`COUNT(IF [User Email] == [User Email] THEN [Id (Report Meeting Participants)] END)`

Then click on apply. Finally, drag the **Created At** fields \(make sure it's on the **Weekly** number\) and the calculated field you just created to match the below screenshot:

![](../.gitbook/assets/number_of_participants_per_weekly_meetings.png)

### Listing of team members with the number of meetings per week and number of hours spent in meetings, ranked.

To get this chart, we need to create a relationship between the **meetings table** and the `report_meeting_participants` table. You can do this by dragging the `report_meeting_participants` table in as a source alongside the **meetings** table and relate both via the **meeting id**. Then you will be able to create a new worksheet that looks like this:

![](../.gitbook/assets/meetings-participant-ranked.png)

Note: To achieve the ranking, we simply use the sort menu icon on the top menu bar.
The rest of the charts will need the **webinars** and `report_webinar_participants` tables.
For this chart, as for the meetings counterpart, we will get a calculated field off the Duration field to get the **Webinar Duration in Hours**, and then plot **Created At** against the **Sum of Webinar Duration in Hours**, as shown in the screenshot below. Note: Make sure you create a new sheet for each of these graphs.

![](../.gitbook/assets/duration-spent-in-weekly-webinars%20%283%29.png)

### Evolution of the number of participants for all webinars per week

Below is the chart:

![](../.gitbook/assets/32_number_of_webinar_attended_per_week.png)

#### Listing of team members with the number of webinars per week and number of hours spent in meetings, ranked

Below is the chart with these specs.

## Conclusion

In this article, we saw how we can use Airbyte to get data off the Zoom API onto a PostgreSQL database, and then use that data to create some chart visualizations in Tableau.

You can leverage Airbyte and Tableau to produce graphs on any collaboration tool. We just used Zoom to illustrate how it can be done. Hope this is helpful!
# Meltano vs Airbyte

We wrote an article, “[The State of Open-Source Data Integration and ETL](https://airbyte.io/articles/data-engineering-thoughts/the-state-of-open-source-data-integration-and-etl/),” in which we list and compare all ETL-related open-source projects, including Meltano and Airbyte. Don't hesitate to check it out for more detailed arguments. As a summary, here are the differences:

![](https://airbyte.io/wp-content/uploads/2020/10/Landscape-of-open-source-data-integration-platforms-4.png)

## **Airbyte:**

In contrast, Airbyte is a company fully committed to the open-source MIT project and has a [business model](https://github.com/airbytehq/airbyte/tree/428e10e727c05e5aed4235610ab86f0e5b304864/docs/company-handbook/business-model.md) in mind around this project. Our [team](https://github.com/airbytehq/airbyte/tree/428e10e727c05e5aed4235610ab86f0e5b304864/docs/company-handbook/team.md) are data integration experts who have built more than 1,000 integrations collectively at large scale. The team now counts 20 engineers working full-time on Airbyte.

* **Airbyte supports more than 60 connectors after only 8 months since its inception**, 20% of which were built by the community. Our ambition is to support **200+ connectors by the end of 2021.**
* Airbyte's connectors are **usable out of the box through a UI and API,** with monitoring, scheduling and orchestration. Airbyte was built on the premise that a user, whatever their background, should be able to move data in 2 minutes. Data engineers might want to use raw data and their own transformation processes, or to use Airbyte's API to include data integration in their workflows. On the other hand, analysts and data scientists might want to use normalized consolidated data in their database or data warehouses. Airbyte supports all these use cases.
If you are running into connection refused errors when running Airbyte via Docker Compose on Mac, try using `host.docker.internal` as the host. On Linux, you may have to modify `docker-compose.yml` and add a host that maps to your local machine using [`extra_hosts`](https://docs.docker.com/compose/compose-file/compose-file-v3/#extra_hosts).
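For example, on Linux the addition to `docker-compose.yml` might look roughly like this \(a sketch only -- the service name and whether you need it at all depend on your setup, and `host-gateway` requires a reasonably recent Docker version\):

```text
services:
  scheduler:   # or whichever Airbyte service needs to reach your local database
    extra_hosts:
      - "host.docker.internal:host-gateway"
```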
## **Do you support change data capture \(CDC\) or logical replication for databases?**

We currently support [CDC for Postgres 10+](../integrations/sources/postgres.md). We are adding support for a few other databases April/May 2021.

## **Can I disable analytics in Airbyte?**

Yes, you can control what's sent outside of Airbyte for analytics purposes.

We instrumented some parts of Airbyte for the following reasons:
* measure usage of Airbyte
* measure usage of features & connectors
* collect connector telemetry to measure stability
* reach out to our users if they opt-in
* ...
To disable telemetry, modify the `.env` file and define the two following environment variables:

```text
TRACKING_STRATEGY=logging
PAPERCUPS_STORYTIME=disabled
```
Airbyte continues to sync data using the configured schema until that schema is updated.
For now, the schema can only be updated manually in the UI \(by clicking "Update Schema" in the settings page for the connection\). When a schema is updated Airbyte will re-sync all data for that source using the new schema.

## **How does Airbyte handle namespaces \(or schemas for the DB-inclined\)?**

Airbyte respects source-defined namespaces when syncing data with a namespace-supported destination. See [this](../understanding-airbyte/namespaces.md) for more details.
Each stream will be output into its own file. Each file will contain 3 columns.
#### Features

| Feature | Supported |
| :--- | :--- |
| Full Refresh Sync | Yes |
| Incremental - Append Sync | Yes |
| Namespaces | No |

#### Performance considerations
#### Features

| Feature | Supported |
| :--- | :--- |
| Full Refresh Sync | Yes |
| Incremental - Append Sync | Yes |
| Namespaces | No |

#### Performance considerations
The Airbyte Redshift destination allows you to sync data to Redshift.

This Redshift destination connector has two replication strategies:
1) INSERT: Replicates data via SQL INSERT queries. This is built on top of the destination-jdbc code base and is configured to rely on JDBC 4.2 standard drivers provided by Amazon via Mulesoft [here](https://mvnrepository.com/artifact/com.amazon.redshift/redshift-jdbc42) as described in Redshift documentation [here](https://docs.aws.amazon.com/redshift/latest/mgmt/jdbc20-install.html). Not recommended for production workloads as this does not scale well.
2) COPY: Replicates data by first uploading data to an S3 bucket and issuing a COPY command. This is the recommended loading approach described by Redshift [best practices](https://docs.aws.amazon.com/redshift/latest/dg/c_loading-data-best-practices.html). Requires an S3 bucket and credentials.
Airbyte automatically picks an approach depending on the given configuration - if S3 configuration is present, Airbyte will use the COPY strategy and vice versa.
1. Active Redshift cluster
2. Allow connections from Airbyte to your Redshift cluster \(if they exist in separate VPCs\)
3. A staging S3 bucket with credentials \(for the COPY strategy\).

### Setup guide
You should have all the requirements needed to configure Redshift as a destination.
* **Database**
  * This database needs to exist within the cluster provided.

#### 2a. Fill up S3 info \(for COPY strategy\)

Provide the required S3 info.
For more information, see the Facebook Insights API documentation.
| Feature | Supported?\(Yes/No\) | Notes |
| :--- | :--- | :--- |
| Full Refresh Sync | Yes | |
| Incremental Sync | Yes | except AdCreatives |
| Namespaces | No | |

### Rate Limiting & Performance Considerations
When you apply for a token, you need to mention:
* That you have full access to the server running the code \(because you're self-hosting Airbyte\)

If for any reason the request gets denied, let us know and we will be able to unblock you.
# Google Workspace Admin Reports

## Overview

This Source is capable of syncing the following Streams:
* [drive](https://developers.google.com/admin-sdk/reports/v1/guides/manage-audit-drive)
* [logins](https://developers.google.com/admin-sdk/reports/v1/guides/manage-audit-login)
* [mobile](https://developers.google.com/admin-sdk/reports/v1/guides/manage-audit-mobile)
* [oauth\_tokens](https://developers.google.com/admin-sdk/reports/v1/guides/manage-audit-tokens)

### Data type mapping
## Getting started

### Requirements

* Credentials to a Google Service Account with delegated Domain Wide Authority
* Email address of the workspace admin who created the Service Account

### Create a Service Account with delegated domain wide authority
Follow the Google Documentation for performing [Domain Wide Delegation of Authority](https://developers.google.com/admin-sdk/reports/v1/guides/delegation) to create a Service account with delegated domain wide authority. This account must be created by an administrator of the Google Workspace. Please make sure to grant the following OAuth scopes to the service user:

1. `https://www.googleapis.com/auth/admin.reports.audit.readonly`
2. `https://www.googleapis.com/auth/admin.reports.usage.readonly`

At the end of this process, you should have JSON credentials to this Google Service Account.

You should now be ready to use the Google Workspace Admin Reports API connector in Airbyte.
## Overview

The Iterable source supports full refresh and incremental sync.

This source can sync data for the [Iterable API](https://api.iterable.com/api/docs).

Several output streams are available from this source:
* [Campaigns](https://api.iterable.com/api/docs#campaigns_campaigns)
* [Channels](https://api.iterable.com/api/docs#channels_channels)
* [Email Bounce](https://api.iterable.com/api/docs#export_exportDataJson) \(Incremental sync\)
* [Email Click](https://api.iterable.com/api/docs#export_exportDataJson) \(Incremental sync\)
* [Email Complaint](https://api.iterable.com/api/docs#export_exportDataJson) \(Incremental sync\)
* [Email Open](https://api.iterable.com/api/docs#export_exportDataJson) \(Incremental sync\)
* [Email Send](https://api.iterable.com/api/docs#export_exportDataJson) \(Incremental sync\)
* [Email Send Skip](https://api.iterable.com/api/docs#export_exportDataJson) \(Incremental sync\)
* [Email Subscribe](https://api.iterable.com/api/docs#export_exportDataJson) \(Incremental sync\)
* [Email Unsubscribe](https://api.iterable.com/api/docs#export_exportDataJson) \(Incremental sync\)
* [Lists](https://api.iterable.com/api/docs#lists_getLists)
* [List Users](https://api.iterable.com/api/docs#lists_getLists_0)
* [Message Types](https://api.iterable.com/api/docs#messageTypes_messageTypes)
* [Metadata](https://api.iterable.com/api/docs#metadata_list_tables)
* [Templates](https://api.iterable.com/api/docs#templates_getTemplates) \(Incremental sync\)
* [Users](https://api.iterable.com/api/docs#export_exportDataJson) \(Incremental sync\)

If there are more endpoints you'd like Airbyte to support, please [create an issue.](https://github.com/airbytehq/airbyte/issues/new/choose)
View File

@@ -30,12 +30,12 @@ If you do not see a type in this list, assume that it is coerced into a string.
| Feature | Supported | Notes | | Feature | Supported | Notes |
| :--- | :--- | :--- | | :--- | :--- | :--- |
| Full Refresh Sync | Yes | | Full Refresh Sync | Yes | |
| Incremental Sync - Append | Yes | | Incremental Sync - Append | Yes | |
| Replicate Incremental Deletes | Coming soon | | Replicate Incremental Deletes | Coming soon | |
| Logical Replication \(WAL\) | Coming soon | | Logical Replication \(WAL\) | Coming soon | |
| SSL Support | Yes | | SSL Support | Yes | |
| SSH Tunnel Connection | Coming soon | | SSH Tunnel Connection | Coming soon | |
| Namespaces | Yes | Enabled by default | | Namespaces | Yes | Enabled by default |
## Getting started ## Getting started

View File

@@ -19,12 +19,12 @@ MySQL data types are mapped to the following data types when synchronizing data:
| `date` | string | | | `date` | string | |
| `datetime` | string | | | `datetime` | string | |
| `enum` | string | | | `enum` | string | |
| `tinyint` | number | | | `tinyint` | number | |
| `smallint` | number | | | `smallint` | number | |
| `mediumint` | number | | | `mediumint` | number | |
| `int` | number | | | `int` | number | |
| `bigint` | number | | | `bigint` | number | |
| `numeric` | number | | | `numeric` | number | |
| `string` | string | | | `string` | string | |
If you do not see a type in this list, assume that it is coerced into a string. We are happy to take feedback on preferred mappings. If you do not see a type in this list, assume that it is coerced into a string. We are happy to take feedback on preferred mappings.
@@ -35,12 +35,12 @@ If you do not see a type in this list, assume that it is coerced into a string.
| Feature | Supported | Notes | | Feature | Supported | Notes |
| :--- | :--- | :--- | | :--- | :--- | :--- |
| Full Refresh Sync | Yes | | Full Refresh Sync | Yes | |
| Incremental - Append Sync | Yes | | Incremental - Append Sync | Yes | |
| Replicate Incremental Deletes | Coming soon | | Replicate Incremental Deletes | Coming soon | |
| Logical Replication \(WAL\) | Coming soon | | Logical Replication \(WAL\) | Coming soon | |
| SSL Support | Yes | | SSL Support | Yes | |
| SSH Tunnel Connection | Coming soon | | SSH Tunnel Connection | Coming soon | |
| Namespaces | Yes | Enabled by default | | Namespaces | Yes | Enabled by default |
## Getting started ## Getting started

View File

@@ -1,17 +1,12 @@
# Oracle # Oracle DB
## Overview ## Overview
The Oracle Database source supports both Full Refresh and Incremental syncs. You can choose if this connector will copy only the new or updated data, or all rows in the tables and columns you set up for replication, every time a sync is run.
### Resulting schema ### Resulting schema
The Oracle source does not alter the schema present in your database. Depending on the destination connected to this source, however, the schema may be altered. See the destination's documentation for more details.
### Data type mapping ### Data type mapping
@@ -19,27 +14,26 @@ Oracle data types are mapped to the following data types when synchronizing data
| Oracle Type | Resulting Type | Notes | | Oracle Type | Resulting Type | Notes |
| :--- | :--- | :--- | | :--- | :--- | :--- |
| `number` | number | | | `number` | number | |
| `integer` | number | | | `integer` | number | |
| `decimal` | number | | | `decimal` | number | |
| `float` | number | | | `float` | number | |
| everything else | string | | | everything else | string | |
If you do not see a type in this list, assume that it is coerced into a string. We are happy to take feedback on preferred mappings.
### Features ### Features
| Feature | Supported | Notes | | Feature | Supported | Notes |
| :--- | :--- | :--- | | :--- | :--- | :--- |
| Full Refresh Sync | Yes | | Full Refresh Sync | Yes | |
| Incremental - Append Sync | Yes | | Incremental - Append Sync | Yes | |
| Replicate Incremental Deletes | Coming soon | | Replicate Incremental Deletes | Coming soon | |
| Logical Replication \(WAL\) | Coming soon | | Logical Replication \(WAL\) | Coming soon | |
| SSL Support | Coming soon | | SSL Support | Coming soon | |
| SSH Tunnel Connection | Coming soon | | SSH Tunnel Connection | Coming soon | |
| LogMiner | Coming soon | | LogMiner | Coming soon | |
| Flashback | Coming soon | | Flashback | Coming soon | |
| Namespaces | Yes | Enabled by default | | Namespaces | Yes | Enabled by default |
## Getting started ## Getting started
@@ -74,9 +68,11 @@ GRANT SELECT ANY TABLE TO airbyte;
``` ```
Or you can be more granular: Or you can be more granular:
```sql ```sql
GRANT SELECT ON "<schema_a>"."<table_1>" TO airbyte; GRANT SELECT ON "<schema_a>"."<table_1>" TO airbyte;
GRANT SELECT ON "<schema_b>"."<table_2>" TO airbyte; GRANT SELECT ON "<schema_b>"."<table_2>" TO airbyte;
``` ```
Your database user should now be ready for use with Airbyte. Your database user should now be ready for use with Airbyte.
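As an optional sanity check \(not something Airbyte requires\), you can connect as the `airbyte` user and confirm which tables the grants above actually expose:

```sql
-- Run while connected as the airbyte user: lists every table this account can read.
SELECT owner, table_name FROM all_tables ORDER BY owner, table_name;
```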

View File

@@ -49,12 +49,12 @@ Postgres data types are mapped to the following data types when synchronizing da
| Feature | Supported | Notes | | Feature | Supported | Notes |
| :--- | :--- | :--- | | :--- | :--- | :--- |
| Full Refresh Sync | Yes | | Full Refresh Sync | Yes | |
| Incremental - Append Sync | Yes | | Incremental - Append Sync | Yes | |
| Replicate Incremental Deletes | Yes | | Replicate Incremental Deletes | Yes | |
| Logical Replication \(WAL\) | Yes | | Logical Replication \(WAL\) | Yes | |
| SSL Support | Yes | | SSL Support | Yes | |
| SSH Tunnel Connection | Coming soon | | SSH Tunnel Connection | Coming soon | |
| Namespaces | Yes | Enabled by default | | Namespaces | Yes | Enabled by default |
## Getting started ## Getting started
@@ -100,33 +100,36 @@ ALTER DEFAULT PRIVILEGES IN SCHEMA <schema_name> GRANT SELECT ON TABLES TO airby
#### 3. Set up CDC \(Optional\) #### 3. Set up CDC \(Optional\)
Please read [the section on CDC below](#setting-up-cdc-for-postgres) for more information. Please read [the section on CDC below](postgres.md#setting-up-cdc-for-postgres) for more information.
#### 4. That's it! #### 4. That's it!
Your database user should now be ready for use with Airbyte. Your database user should now be ready for use with Airbyte.
## Change Data Capture (CDC) / Logical Replication / WAL Replication ## Change Data Capture \(CDC\) / Logical Replication / WAL Replication
We use [logical replication](https://www.postgresql.org/docs/10/logical-replication.html) of the Postgres write-ahead log (WAL) to incrementally capture deletes using the `pgoutput` plugin.
We use [logical replication](https://www.postgresql.org/docs/10/logical-replication.html) of the Postgres write-ahead log \(WAL\) to incrementally capture deletes using the `pgoutput` plugin.
We do not require installing custom plugins like `wal2json` or `test_decoding`. We use `pgoutput`, which is included in Postgres 10+ by default. We do not require installing custom plugins like `wal2json` or `test_decoding`. We use `pgoutput`, which is included in Postgres 10+ by default.
Please read the [CDC docs](../../understanding-airbyte/cdc.md) for an overview of how Airbyte approaches CDC. Please read the [CDC docs](../../understanding-airbyte/cdc.md) for an overview of how Airbyte approaches CDC.
### Should I use CDC for Postgres? ### Should I use CDC for Postgres?
* If you need a record of deletions and can accept the limitations posted below, you should use CDC for Postgres. * If you need a record of deletions and can accept the limitations posted below, you should use CDC for Postgres.
* If your data set is small and you just want a snapshot of your table in the destination, consider using Full Refresh replication for your table instead of CDC. * If your data set is small and you just want a snapshot of your table in the destination, consider using Full Refresh replication for your table instead of CDC.
* If the limitations prevent you from using CDC and your goal is to maintain a snapshot of your table in the destination, consider using non-CDC incremental and occasionally reset the data and re-sync. * If the limitations prevent you from using CDC and your goal is to maintain a snapshot of your table in the destination, consider using non-CDC incremental and occasionally reset the data and re-sync.
* If your table has a primary key but doesn't have a reasonable cursor field for incremental syncing (e.g. `updated_at`), CDC allows you to sync your table incrementally. * If your table has a primary key but doesn't have a reasonable cursor field for incremental syncing \(e.g. `updated_at`\), CDC allows you to sync your table incrementally.
### CDC Limitations ### CDC Limitations
* Make sure to read our [CDC docs](../../understanding-airbyte/cdc.md) to see limitations that impact all databases using CDC replication. * Make sure to read our [CDC docs](../../understanding-airbyte/cdc.md) to see limitations that impact all databases using CDC replication.
* CDC is only available for Postgres 10+. * CDC is only available for Postgres 10+.
* Airbyte requires a replication slot configured only for its use. Only one source should be configured that uses this replication slot. Instructions on how to set up a replication slot can be found below. * Airbyte requires a replication slot configured only for its use. Only one source should be configured that uses this replication slot. Instructions on how to set up a replication slot can be found below.
* Log-based replication only works for master instances of Postgres. * Log-based replication only works for master instances of Postgres.
* Using logical replication increases disk space used on the database server. The additional data is stored until it is consumed. * Using logical replication increases disk space used on the database server. The additional data is stored until it is consumed.
* We recommend setting frequent syncs for CDC in order to ensure that this data doesn't fill up your disk space. * We recommend setting frequent syncs for CDC in order to ensure that this data doesn't fill up your disk space.
* If you stop syncing a CDC-configured Postgres instance to Airbyte, you should delete the replication slot. Otherwise, it may fill up your disk space. * If you stop syncing a CDC-configured Postgres instance to Airbyte, you should delete the replication slot. Otherwise, it may fill up your disk space.
* Our CDC implementation uses at-least-once delivery for all change records. * Our CDC implementation uses at-least-once delivery for all change records.
### Setting up CDC for Postgres ### Setting up CDC for Postgres
@@ -134,17 +137,20 @@ Please read the [CDC docs](../../understanding-airbyte/cdc.md) for an overview o
#### Enable logical replication #### Enable logical replication
Follow one of these guides to enable logical replication: Follow one of these guides to enable logical replication:
* [Bare Metal, VMs \(EC2/GCE/etc\), Docker, etc.](postgres.md#setting-up-cdc-on-bare-metal-vms-ec2gceetc-docker-etc)
* [AWS Postgres RDS or Aurora](postgres.md#setting-up-cdc-on-aws-postgres-rds-or-aurora)
* [Azure Database for Postgres](postgres.md#setting-up-cdc-on-azure-database-for-postgres)

#### Add user-level permissions

We recommend using a user specifically for Airbyte's replication so you can minimize access. This Airbyte user for your instance needs to be granted `REPLICATION` and `LOGIN` permissions. You can create a role with `CREATE ROLE <name> REPLICATION LOGIN;` and grant that role to the user. You still need to make sure the user can connect to the database, use the schema, and use `SELECT` on tables \(the same are required for non-CDC incremental syncs and all full refreshes\).
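One way this could look, granting the attributes directly to an assumed `airbyte` user rather than via a separate role, with tables in the `public` schema \(adjust names to your environment\):

```sql
-- Give the existing airbyte user the attributes needed for logical replication.
ALTER USER airbyte REPLICATION LOGIN;

-- Plain read access is still needed for initial snapshots and non-CDC syncs.
GRANT USAGE ON SCHEMA public TO airbyte;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO airbyte;
```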
#### Create replication slot #### Create replication slot
Next, you will need to create a replication slot. Here is the query used to create a replication slot called `airbyte_slot`:

```text
SELECT pg_create_logical_replication_slot('airbyte_slot', 'pgoutput');
```
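As an optional check \(not required by Airbyte\), you can confirm the slot exists by querying the `pg_replication_slots` view:

```sql
SELECT slot_name, plugin, slot_type, active
FROM pg_replication_slots;
```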
@@ -157,10 +163,12 @@ For each table you want to replicate with CDC, you will need to run `CREATE PUBL
The UI currently allows selecting any tables for CDC. If a table is selected that is not part of the publication, it will not replicate even though it is selected. If a table is part of the publication but does not have a replication identity, that replication identity will be created automatically on the first run if the Airbyte user has the necessary permissions. The UI currently allows selecting any tables for CDC. If a table is selected that is not part of the publication, it will not replicate even though it is selected. If a table is part of the publication but does not have a replication identity, that replication identity will be created automatically on the first run if the Airbyte user has the necessary permissions.
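For illustration, assuming a publication named `airbyte_publication` and placeholder table names \(use the publication and tables you actually configured\), the publication setup described above could look like:

```sql
-- Placeholder names; substitute your own schema, tables, and publication name.
CREATE PUBLICATION airbyte_publication FOR TABLE public.table_one, public.table_two;

-- Optional: set a replication identity manually for a table without a primary key.
-- (Airbyte can also create it automatically on the first run, as noted above.)
ALTER TABLE public.table_two REPLICA IDENTITY FULL;
```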
#### Start syncing #### Start syncing
When configuring the source, select CDC and provide the replication slot and publication you just created. You should be ready to sync data with CDC! When configuring the source, select CDC and provide the replication slot and publication you just created. You should be ready to sync data with CDC!
### Setting up CDC on Bare Metal, VMs (EC2/GCE/etc), Docker, etc. ### Setting up CDC on Bare Metal, VMs \(EC2/GCE/etc\), Docker, etc.
Some settings must be configured in the `postgresql.conf` file for your database. You can find the location of this file using `psql -U postgres -c 'SHOW config_file'` with the correct `psql` credentials specified. Alternatively, a custom file can be specified when running postgres with the `-c` flag. For example `postgres -c config_file=/etc/postgresql/postgresql.conf` runs Postgres with the config file at `/etc/postgresql/postgresql.conf`.
If you are syncing data from a server using the `postgres` Docker image, you will need to mount a file and change the command to run Postgres with the set config file. If you're just testing CDC behavior, you may want to use a modified version of a [sample `postgresql.conf`](https://github.com/postgres/postgres/blob/master/src/backend/utils/misc/postgresql.conf.sample). If you are syncing data from a server using the `postgres` Docker image, you will need to mount a file and change the command to run Postgres with the set config file. If you're just testing CDC behavior, you may want to use a modified version of a [sample `postgresql.conf`](https://github.com/postgres/postgres/blob/master/src/backend/utils/misc/postgresql.conf.sample).
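For example, here is a rough sketch of running the official image with a mounted config file \(the host path, password, and image tag are illustrative\):

```text
docker run -d \
  -e POSTGRES_PASSWORD=password \
  -v /path/to/postgresql.conf:/etc/postgresql/postgresql.conf \
  postgres:13 \
  -c config_file=/etc/postgresql/postgresql.conf
```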
@@ -169,7 +177,8 @@ If you are syncing data from a server using the `postgres` Docker image, you wil
* `max_replication_slots` is the maximum number of replication slots that are allowed to stream WAL changes. This must be one if Airbyte will be the only service subscribing to WAL changes, or more if other services are also reading from the WAL. * `max_replication_slots` is the maximum number of replication slots that are allowed to stream WAL changes. This must be one if Airbyte will be the only service subscribing to WAL changes, or more if other services are also reading from the WAL.
Here is what these settings would look like in `postgresql.conf`: Here is what these settings would look like in `postgresql.conf`:
```text
wal_level = logical wal_level = logical
max_wal_senders = 1 max_wal_senders = 1
max_replication_slots = 1 max_replication_slots = 1
@@ -177,27 +186,32 @@ max_replication_slots = 1
After setting these values you will need to restart your instance. After setting these values you will need to restart your instance.
Finally, [follow the rest of the steps above](#setting-up-cdc-for-postgres). Finally, [follow the rest of the steps above](postgres.md#setting-up-cdc-for-postgres).
### Setting up CDC on AWS Postgres RDS or Aurora ### Setting up CDC on AWS Postgres RDS or Aurora
* Go to the `Configuration` tab for your DB cluster. * Go to the `Configuration` tab for your DB cluster.
* Find your cluster parameter group. You will either edit the parameters for this group or create a copy of this parameter group to edit. If you create a copy you will need to change your cluster's parameter group before restarting. * Find your cluster parameter group. You will either edit the parameters for this group or create a copy of this parameter group to edit. If you create a copy you will need to change your cluster's parameter group before restarting.
* Within the parameter group page, search for `rds.logical_replication`. Select this row and click on the `Edit parameters` button. Set this value to `1`. * Within the parameter group page, search for `rds.logical_replication`. Select this row and click on the `Edit parameters` button. Set this value to `1`.
* Wait for a maintenance window to automatically restart the instance or restart it manually. * Wait for a maintenance window to automatically restart the instance or restart it manually.
* Finally, [follow the rest of the steps above](#setting-up-cdc-for-postgres). * Finally, [follow the rest of the steps above](postgres.md#setting-up-cdc-for-postgres).
### Setting up CDC on Azure Database for Postgres ### Setting up CDC on Azure Database for Postgres
Use the Azure CLI to: Use the Azure CLI to:
```text
az postgres server configuration set --resource-group group --server-name server --name azure.replication_support --value logical az postgres server configuration set --resource-group group --server-name server --name azure.replication_support --value logical
az postgres server restart --resource-group group --name server az postgres server restart --resource-group group --name server
``` ```
Finally, [follow the rest of the steps above](#setting-up-cdc-for-postgres). Finally, [follow the rest of the steps above](postgres.md#setting-up-cdc-for-postgres).
### Setting up CDC on Google CloudSQL ### Setting up CDC on Google CloudSQL
Unfortunately, logical replication is not configurable for Google CloudSQL. You can indicate your support for this feature on the [Google Issue Tracker](https://issuetracker.google.com/issues/120274585). Unfortunately, logical replication is not configurable for Google CloudSQL. You can indicate your support for this feature on the [Google Issue Tracker](https://issuetracker.google.com/issues/120274585).
### Setting up CDC on other platforms ### Setting up CDC on other platforms
If you are using a platform not covered above, please consider [contributing to our docs](https://github.com/airbytehq/airbyte/tree/master/docs) and providing setup instructions. If you are using a platform not covered above, please consider [contributing to our docs](https://github.com/airbytehq/airbyte/tree/master/docs) and providing setup instructions.

View File

@@ -64,6 +64,7 @@ This Source is capable of syncing the following [Streams](https://developer.intu
3. Obtain credentials 3. Obtain credentials
### Requirements ### Requirements
* Client ID * Client ID
* Client Secret * Client Secret
* Realm ID * Realm ID
@@ -71,4 +72,5 @@ This Source is capable of syncing the following [Streams](https://developer.intu
The easiest way to get these credentials is by using QuickBooks' [OAuth 2.0 playground](https://developer.intuit.com/app/developer/qbo/docs/develop/authentication-and-authorization/oauth-2.0-playground). The easiest way to get these credentials is by using QuickBooks' [OAuth 2.0 playground](https://developer.intuit.com/app/developer/qbo/docs/develop/authentication-and-authorization/oauth-2.0-playground).
**Important note:** The refresh token expires every 100 days. You will need to manually revisit the OAuth playground to obtain a new refresh token every 100 days, or your syncs will start failing. We plan on offering full OAuth support soon so you don't need to redo this process manually. **Important note:** The refresh token expires every 100 days. You will need to manually revisit the OAuth playground to obtain a new refresh token every 100 days, or your syncs will start failing. We plan on offering full OAuth support soon so you don't need to redo this process manually.
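For reference, the manual refresh described above boils down to a standard OAuth 2.0 refresh-token exchange against Intuit's token endpoint; a rough sketch \(with placeholders for your own credentials\) looks like:

```text
curl -X POST https://oauth.platform.intuit.com/oauth2/v1/tokens/bearer \
  -u "<client_id>:<client_secret>" \
  -H "Accept: application/json" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=refresh_token&refresh_token=<refresh_token>"
```

The response should contain a new access token and a new refresh token; keep the latter for the next rotation.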

View File

@@ -36,7 +36,7 @@ This Source is capable of syncing the following core Streams:
| :--- | :--- | :--- | | :--- | :--- | :--- |
| Full Refresh Sync | Yes | | | Full Refresh Sync | Yes | |
| Incremental - Append Sync | Yes | | | Incremental - Append Sync | Yes | |
| Namespaces | No | | | Namespaces | No | |
### Performance considerations ### Performance considerations

View File

@@ -1,85 +1,75 @@
# Smartsheets # Smartsheets
### Table of Contents ### Table of Contents
* [Sync Details](smartsheets.md#sync-details)
* [Column datatype mapping](smartsheets.md#column-datatype-mapping)
* [Features](smartsheets.md#Features)
* [Performance Considerations](smartsheets.md#performance-considerations)
* [Getting Started](smartsheets.md#getting-started)
* [Requirements](smartsheets.md#requirements)
* [Setup Guide](smartsheets.md#setup-guide)
* [Configuring the source in the Airbyte UI](smartsheets.md#configuring-the-source-in-the-airbyte-ui)
## Sync Details ## Sync Details
The Smartsheet Source is written to pull data from a single Smartsheet spreadsheet. Unlike Google Sheets, Smartsheets only allows one sheet per Smartsheet - so a given Airbyte connector instance can sync only one sheet at a time.
The Smartsheet Source is written to pull data from a single Smartsheet spreadsheet. Unlike Google Sheets, Smartsheets only allows one sheet per Smartsheet - so a given Airbyte connector instance can sync only one sheet at a time.
To replicate multiple spreadsheets, you can create multiple instances of the Smartsheet Source in Airbyte, reusing the API token for all your sheets that you need to sync. To replicate multiple spreadsheets, you can create multiple instances of the Smartsheet Source in Airbyte, reusing the API token for all your sheets that you need to sync.
**Note: Column headers must contain only alphanumeric characters or `_` , as specified in the** [**Airbyte Protocol**](../../understanding-airbyte/airbyte-specification.md). **Note: Column headers must contain only alphanumeric characters or `_` , as specified in the** [**Airbyte Protocol**](../../understanding-airbyte/airbyte-specification.md).
### Column datatype mapping ### Column datatype mapping
The data type mapping adopted by this connector is based on the Smartsheet [documentation](https://smartsheet.redoc.ly/tag/columnsRelated#section/Column-Types).
The data type mapping adopted by this connector is based on the Smartsheet [documentation](https://smartsheet.redoc.ly/tag/columnsRelated#section/Column-Types).
**NOTE**: For any column datatypes interpreted by Smartsheets other than `DATE` and `DATETIME`, this connector's source schema generation assumes a `string` type, in which case the `format` field is not required by Airbyte. **NOTE**: For any column datatypes interpreted by Smartsheets other than `DATE` and `DATETIME`, this connector's source schema generation assumes a `string` type, in which case the `format` field is not required by Airbyte.
| Integration Type | Airbyte Type | Airbyte Format |
| :--- | :--- | :--- |
| `TEXT_NUMBER` | `string` | |
| `DATE` | `string` | `format: date` |
| `DATETIME` | `string` | `format: date-time` |
| `anything else` | `string` | |

The remaining column datatypes supported by Smartsheets are more complex types \(e.g. Predecessor, Dropdown List\) and are not supported by this connector beyond their `string` representation.
### Features ### Features
This source connector only supports Full Refresh Sync. Since Smartsheets only allows 5000 rows per sheet, it's likely that the Full Refresh Sync Mode will suit the majority of use-cases. This source connector only supports Full Refresh Sync. Since Smartsheets only allows 5000 rows per sheet, it's likely that the Full Refresh Sync Mode will suit the majority of use-cases.
| Feature | Supported? |
| :--- | :--- |
| Full Refresh Sync | Yes |
| Incremental Sync | No |
| Namespaces | No |
### Performance considerations ### Performance considerations
At the time of writing, the [Smartsheets API rate limit](https://developers.smartsheet.com/blog/smartsheet-api-best-practices#:~:text=The%20Smartsheet%20API%20currently%20imposes,per%20minute%20per%20Access%20Token.) is 300 requests per minute per API access token. This connector makes 6 API calls per sync operation.
## Getting started ## Getting started
### Requirements ### Requirements
To configure the Smartsheet Source for syncs, you'll need the following: To configure the Smartsheet Source for syncs, you'll need the following:
* A Smartsheets API access token - generated by a Smartsheets user with at least **read** access * A Smartsheets API access token - generated by a Smartsheets user with at least **read** access
* The ID of the spreadsheet you'd like to sync * The ID of the spreadsheet you'd like to sync
### Setup guide ### Setup guide
#### Obtain a Smartsheets API access token #### Obtain a Smartsheets API access token
You can generate an API key for your account from a session of your Smartsheet webapp by clicking: You can generate an API key for your account from a session of your Smartsheet webapp by clicking:
- Account (top-right icon) * Account \(top-right icon\)
- Apps & Integrations * Apps & Integrations
- API Access * API Access
- Generate new access token * Generate new access token
For questions on advanced authorization flows, refer to [this](https://www.smartsheet.com/content-center/best-practices/tips-tricks/api-getting-started). For questions on advanced authorization flows, refer to [this](https://www.smartsheet.com/content-center/best-practices/tips-tricks/api-getting-started).
#### The spreadsheet ID of your Smartsheet #### The spreadsheet ID of your Smartsheet
You'll also need the ID of the Spreadsheet you'd like to sync. Unlike Google Sheets, this ID is not found in the URL. You can find the required spreadsheet ID from your Smartsheet app session by going to: You'll also need the ID of the Spreadsheet you'd like to sync. Unlike Google Sheets, this ID is not found in the URL. You can find the required spreadsheet ID from your Smartsheet app session by going to:
* File
* Properties
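Once you have both values, you can optionally sanity-check them against the standard Smartsheet REST API before configuring the source \(this is just a convenience, not a step the connector requires\):

```text
curl -H "Authorization: Bearer <access_token>" \
  "https://api.smartsheet.com/2.0/sheets/<spreadsheet_id>"
```

A successful response should return the sheet's metadata, including its `id`, `name`, and `columns`.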
### Configuring the source in the Airbyte UI ### Configuring the source in the Airbyte UI
To set up your new Smartsheets source, Airbyte will need: To set up your new Smartsheets source, Airbyte will need:
1. Your API access token 1. Your API access token

View File

@@ -30,7 +30,7 @@ This Source is capable of syncing the following core Streams:
* [Transcriptions](https://www.twilio.com/docs/voice/api/recording-transcription?code-sample=code-read-list-all-transcriptions&code-language=curl&code-sdk-version=json#read-multiple-transcription-resources) * [Transcriptions](https://www.twilio.com/docs/voice/api/recording-transcription?code-sample=code-read-list-all-transcriptions&code-language=curl&code-sdk-version=json#read-multiple-transcription-resources)
* [Queues](https://www.twilio.com/docs/voice/api/queue-resource#read-multiple-queue-resources) * [Queues](https://www.twilio.com/docs/voice/api/queue-resource#read-multiple-queue-resources)
* [Message media](https://www.twilio.com/docs/sms/api/media-resource#read-multiple-media-resources) * [Message media](https://www.twilio.com/docs/sms/api/media-resource#read-multiple-media-resources)
* [Messages](https://www.twilio.com/docs/sms/api/message-resource#read-multiple-message-resources) * [Messages](https://www.twilio.com/docs/sms/api/message-resource#read-multiple-message-resources)
\(stream data can only be received for the last 400 days\) \(stream data can only be received for the last 400 days\)
