
GitBook: [master] 186 pages and 77 assets modified

Abhi Vaidyanatha
2021-10-08 21:17:47 +00:00
committed by gitbook-bot
parent c8c9905b35
commit ae32ecbb27
228 changed files with 3139 additions and 2834 deletions


@@ -40,7 +40,7 @@ $ cd airbyte-integrations/connector-templates/generator # assumes you are starti
$ ./generate.sh
```
Select the `Java Destination` template and then input the name of your connector. We'll refer to the destination as `<name>-destination` in this tutorial, but you should replace `<name>` with the actual name you used for your connector, e.g. `BigQueryDestination` or `bigquery-destination`.
### Step 2: Build the newly generated destination
@@ -51,43 +51,45 @@ You can build the destination by running:
./gradlew :airbyte-integrations:connectors:destination-<name>:build
```
On Mac M1 (Apple Silicon) machines (until openjdk images natively support ARM64), set the platform variable as shown below and build:
```bash
export DOCKER_BUILD_PLATFORM=linux/amd64
# Must be run from the Airbyte project root
./gradlew :airbyte-integrations:connectors:destination-<name>:build
```
This compiles the Java code for your destination and builds a Docker image with the connector. At this point, we haven't implemented anything of value yet, but once we do, you'll use this command to compile your code and Docker image.
{% hint style="info" %}
Airbyte uses Gradle to manage Java dependencies. To add dependencies for your connector, manage them in the `build.gradle` file inside your connector's directory.
{% endhint %}
#### Iterating on your implementation
We recommend the following ways of iterating on your connector as you're making changes:
* Test-driven development (TDD) in Java
* Test-driven development (TDD) using Airbyte's Acceptance Tests
* Directly running the Docker image
#### Test-driven development in Java
This should feel like a standard flow for a Java developer: you make some code changes, then run Java tests against them. You can do this directly in your IDE, but you can also run all unit tests via Gradle by running the command to build the connector:
```bash
./gradlew :airbyte-integrations:connectors:destination-<name>:build
```
This will build the code and run any unit tests. This approach is great when you are testing local behaviors and writing unit tests.
#### TDD using acceptance tests & integration tests
Airbyte provides a standard test suite (dubbed "Acceptance Tests") that runs against every destination connector. They are "free" baseline tests to ensure the basic functionality of the destination. When developing a connector, you can simply run the tests between each change and use the feedback to guide your development.
If you want to try out this approach, check out Step 6, which describes what you need to do to set up the Acceptance Tests for your destination.
The nice thing about this approach is that you are running your destination exactly as Airbyte will run it in the CI. The downside is that the tests do not run very quickly. As such, we recommend this iteration approach only once you've implemented most of your connector and are in the finishing stages of implementation. Note that Acceptance Tests are required for every connector supported by Airbyte, so you should make sure to run them a couple of times while iterating to make sure your connector is compatible with Airbyte.
#### Directly running the destination using Docker
@@ -116,11 +118,12 @@ The nice thing about this approach is that you are running your destination exac
Each destination contains a specification written in JsonSchema that describes its inputs. Defining the specification is a good place to start when developing your destination. Check out the documentation [here](https://json-schema.org/) to learn the syntax. Here's [an example](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-postgres/src/main/resources/spec.json) of what the `spec.json` looks like for the Postgres destination.
Your generated template should have the spec file in `airbyte-integrations/connectors/destination-<name>/src/main/resources/spec.json`. The generated connector will take care of reading this file and converting it to the correct output. Edit it and you should be done with this step.
For more details on what the spec is, you can read about the Airbyte Protocol [here](../../understanding-airbyte/airbyte-specification.md).
See the `spec` operation in action:
```bash
# First build the connector
./gradlew :airbyte-integrations:connectors:destination-<name>:build
@@ -131,15 +134,15 @@ docker run --rm airbyte/destination-<name>:dev spec
### Step 4: Implement `check`
The check operation accepts a JSON object conforming to `spec.json`. In other words, if the `spec.json` says that the destination requires a `username` and `password`, the config object might be `{ "username": "airbyte", "password": "password123" }`. It returns a JSON object that reports, given the credentials in the config, whether we were able to connect to the destination.
While developing, we recommend storing any credentials in `secrets/config.json`. Any `secrets` directory in the Airbyte repo is gitignored by default.
Implement the `check` method in the generated file `<Name>Destination.java`. Here's an [example implementation](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-bigquery/src/main/java/io/airbyte/integrations/destination/bigquery/BigQueryDestination.java#L94) from the BigQuery destination.
Verify that the method is working by placing your config in `secrets/config.json` then running:
```bash
# First build the connector
./gradlew :airbyte-integrations:connectors:destination-<name>:build
@@ -148,26 +151,25 @@ docker run -v $(pwd)/secrets:/secrets --rm airbyte/destination-<name>:dev check
```
### Step 5: Implement `write`
The `write` operation is the main workhorse of a destination connector: it reads input data from the source and writes it to the underlying destination. It takes as input the config file used to run the connector as well as the configured catalog: the file used to describe the schema of the incoming data and how it should be written to the destination. Its "output" is two things:
1. Data written to the underlying destination
2. `AirbyteMessage`s of type `AirbyteStateMessage`, written to stdout to indicate which records have been written so far during a sync. It's important to output these messages when possible in order to avoid re-extracting messages from the source. See the [write operation protocol reference](https://docs.airbyte.io/understanding-airbyte/airbyte-specification#write) for more information.
To implement the `write` Airbyte operation, implement the `getConsumer` method in your generated `<Name>Destination.java` file. Here are some example implementations from different destination connectors:
* [BigQuery](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-bigquery/src/main/java/io/airbyte/integrations/destination/bigquery/BigQueryDestination.java#L188)
* [Google Pubsub](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-pubsub/src/main/java/io/airbyte/integrations/destination/pubsub/PubsubDestination.java#L98)
* [Local CSV](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-csv/src/main/java/io/airbyte/integrations/destination/csv/CsvDestination.java#L90)
* [Postgres](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-postgres/src/main/java/io/airbyte/integrations/destination/postgres/PostgresDestination.java)
{% hint style="info" %}
The Postgres destination leverages the `AbstractJdbcDestination` superclass which makes it extremely easy to create a destination for a database or data warehouse if it has a compatible JDBC driver. If the destination you are implementing has a JDBC driver, be sure to check out `AbstractJdbcDestination`.
{% endhint %}
For a brief overview on the Airbyte catalog check out [the Beginner's Guide to the Airbyte Catalog](../../understanding-airbyte/beginners-guide-to-catalog.md).
### Step 6: Set up Acceptance Tests
The Acceptance Tests are a set of tests that run against all destinations. These tests are run in the Airbyte CI to prevent regressions and verify a baseline of functionality. The test cases are contained and documented in the [following file](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/bases/standard-destination-test/src/main/java/io/airbyte/integrations/standardtest/destination/DestinationAcceptanceTest.java).
@@ -175,6 +177,7 @@ The Acceptance Tests are a set of tests that run against all destinations. These
To set up Acceptance Tests for your connector, follow the `TODO`s in the generated file `<name>DestinationAcceptanceTest.java`. Once set up, you can run the tests using `./gradlew :airbyte-integrations:connectors:destination-<name>:integrationTest`. Make sure to run this command from the Airbyte repository root.
### Step 7: Write unit tests and/or integration tests
The Acceptance Tests are meant to cover the basic functionality of a destination. Think of them as the bare minimum required for us to add a destination to Airbyte. You should add unit tests or custom integration tests to cover any additional functionality of your destination.
### Step 8: Update the docs
@@ -182,4 +185,6 @@ The Acceptance Tests are meant to cover the basic functionality of a destination
Each connector has its own documentation page. By convention, that page should live at `docs/integrations/destinations/<destination-name>.md`. For the documentation to get packaged with the docs, make sure to add a link to it in `docs/SUMMARY.md`. You can pattern-match from existing connectors.
## Wrapping up
Well done on making it this far! If you'd like your connector to ship with Airbyte by default, create a PR against the Airbyte repo and we'll work with you to get it across the finish line.


@@ -6,7 +6,7 @@ This article provides a checklist for how to create a Python destination. Each s
## Requirements
Docker and Python with the versions listed in the [tech stack section](../../understanding-airbyte/tech-stack.md). You can use any Python version between 3.7 and 3.9, but this tutorial was tested with 3.7.
## Checklist
@@ -22,7 +22,7 @@ Docker and Python with the versions listed in the [tech stack section](../../und
* Step 8: Update the docs (in `docs/integrations/destinations/<destination-name>.md`)
{% hint style="info" %}
If you need help with any step of the process, feel free to submit a PR with your progress and any questions you have, or ask us on [Slack](https://slack.airbyte.io). Also reference the KvDB Python destination implementation if you want to see an example of a working destination.
{% endhint %}
## Explaining Each Step
@@ -36,11 +36,11 @@ $ cd airbyte-integrations/connector-templates/generator # assumes you are starti
$ ./generate.sh
```
Select the `Python Destination` template and then input the name of your connector. We'll refer to the destination as `destination-<name>` in this tutorial, but you should replace `<name>` with the actual name you used for your connector, e.g. `redis` or `google-sheets`.
### Step 2: Set up the dev environment
Set up your Python virtual environment:
```bash
cd airbyte-integrations/connectors/destination-<name>
@@ -54,6 +54,7 @@ source .venv/bin/activate
# Install with the "tests" extra which provides test requirements
pip install '.[tests]'
```
This step sets up the initial Python environment. **All** subsequent `python` or `pip` commands assume you have activated your virtual environment.
If you want your IDE to auto-complete and resolve dependencies properly, point it at the Python binary in `airbyte-integrations/connectors/destination-<name>/.venv/bin/python`. Also, anytime you change the dependencies in `setup.py`, make sure to re-run the build command. The build system will handle installing all dependencies in the `setup.py` into the virtual environment.
@@ -62,14 +63,14 @@ Let's quickly get a few housekeeping items out of the way.
#### Dependencies
Python dependencies for your destination should be declared in `airbyte-integrations/connectors/destination-<name>/setup.py` in the `install_requires` field. You might notice that a couple of Airbyte dependencies are already declared there (mainly the Airbyte CDK and potentially some testing libraries or helpers). Keep those as they will be useful during development.
You may notice that there is a `requirements.txt` in your destination's directory as well. Do not touch this. It is autogenerated and used to install local Airbyte dependencies which are not published to PyPI. All your dependencies should be declared in `setup.py`.
#### Iterating on your implementation
Pretty much all it takes to create a destination is to implement the `Destination` interface. Let's briefly recap the three methods implemented by a Destination:
1. `spec`: declares the user-provided credentials or configuration needed to run the connector
2. `check`: tests if the user-provided configuration can be used to connect to the underlying data destination, and with the correct write permissions
3. `write`: writes data to the underlying destination by reading a configuration, a stream of records from stdin, and a configured catalog describing the schema of the data and how it should be written to the destination
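Concretely, here's a minimal sketch of that interface, assuming the Python CDK's `Destination` base class (the class name below is a placeholder for your connector's):
```python
from typing import Any, Iterable, Mapping

from airbyte_cdk import AirbyteLogger
from airbyte_cdk.destinations import Destination
from airbyte_cdk.models import AirbyteConnectionStatus, AirbyteMessage, ConfiguredAirbyteCatalog


class DestinationExample(Destination):
    # `spec` is inherited from the base class: it serves the packaged spec.json.

    def check(self, logger: AirbyteLogger, config: Mapping[str, Any]) -> AirbyteConnectionStatus:
        ...  # validate that `config` can reach the destination (see Step 4)

    def write(
        self, config: Mapping[str, Any], configured_catalog: ConfiguredAirbyteCatalog, input_messages: Iterable[AirbyteMessage]
    ) -> Iterable[AirbyteMessage]:
        ...  # write records, yielding state messages as data is persisted (see Step 5)
```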
@@ -98,8 +99,7 @@ cat messages.jsonl | python main.py write --config secrets/config.json --catalog
The nice thing about this approach is that you can iterate entirely in Python. The downside is that you are not quite running your destination as it will actually be run by Airbyte. Specifically, you're not running it from within the Docker container that will house it.
**Run using Docker**
If you want to run your destination exactly as it will be run by Airbyte (i.e. within a Docker container), you can use the following commands from the connector module directory (`airbyte-integrations/connectors/destination-<name>`):
```bash
# First build the container
@@ -117,7 +117,7 @@ The nice thing about this approach is that you are running your source exactly a
**TDD using standard tests**
_Note: these tests aren't yet available for Python connectors but will be very soon. Until then, you should use custom unit or integration tests for TDD_.
Airbyte provides a standard test suite that is run against every destination. The objective of these tests is to provide some "free" tests that can sanity check that the basic functionality of the destination works. One approach to developing your connector is to simply run the tests between each change and use the feedback from them to guide your development.
@@ -127,26 +127,25 @@ The nice thing about this approach is that you are running your destination exac
### Step 3: Implement `spec`
Each destination contains a specification written in JsonSchema that describes the inputs it requires and accepts. Defining the specification is a good place to start development. To do this, find the spec file generated in `airbyte-integrations/connectors/destination-<name>/src/main/resources/spec.json`. Edit it and you should be done with this step. The generated connector will take care of reading this file and converting it to the correct output.
Some notes about fields in the output spec:
* `supportsNormalization` is a boolean which indicates if this connector supports [basic normalization via DBT](https://docs.airbyte.io/understanding-airbyte/basic-normalization). If true, `supportsDBT` must also be true.
* `supportsDBT` is a boolean which indicates whether this destination is compatible with DBT. If set to true, the user can define custom DBT transformations that run on this destination after each successful sync. This must be true if `supportsNormalization` is set to true.
* `supported_destination_sync_modes`: An array of strings declaring the sync modes supported by this connector. The available options are:
  * `overwrite`: The connector can be configured to wipe any existing data in a stream before writing new data
  * `append`: The connector can be configured to append new data to existing data
  * `append_dedupe`: The connector can be configured to deduplicate (i.e. UPSERT) data in the destination based on the new data and primary keys
* `supportsIncremental`: Whether the connector supports any `append` sync mode. Must be set to true if `append` or `append_dedupe` are included in the `supported_destination_sync_modes`.
Some helpful resources:
* [**JSONSchema website**](https://json-schema.org/)
* [**Definition of Airbyte Protocol data models**](https://github.com/airbytehq/airbyte/blob/master/airbyte-protocol/models/src/main/resources/airbyte_protocol/airbyte_protocol.yaml). The output of `spec` is described by the `ConnectorSpecification` model (which is wrapped in an `AirbyteMessage`).
* [**Postgres Destination's spec.json file**](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-postgres/src/main/resources/spec.json) as an example `spec.json`.
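For intuition, the base connector serves this packaged file for you; conceptually the lookup works something like the following sketch (illustrative, not the CDK's exact code):
```python
import json
import pkgutil

from airbyte_cdk.models import ConnectorSpecification


def load_spec(package: str) -> ConnectorSpecification:
    # Read the spec.json bundled with the connector package and parse it
    # into the protocol's ConnectorSpecification model.
    raw = pkgutil.get_data(package, "spec.json")
    return ConnectorSpecification.parse_obj(json.loads(raw))
```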
Once you've edited the file, see the `spec` operation in action:
```bash
python main.py spec
@@ -154,20 +153,21 @@ python main.py spec
### Step 4: Implement `check`
The check operation accepts a JSON object conforming to `spec.json`. In other words, if the `spec.json` says that the destination requires a `username` and `password`, the config object might be `{ "username": "airbyte", "password": "password123" }`. It returns a JSON object that reports, given the credentials in the config, whether we were able to connect to the destination.
While developing, we recommend storing any credentials in `secrets/config.json`. Any `secrets` directory in the Airbyte repo is gitignored by default.
Implement the `check` method in the generated file `destination_<name>/destination.py`. Here's an [example implementation](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-kvdb/destination_kvdb/destination.py) from the KvDB destination.
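As a rough sketch of the shape this takes (the client below is hypothetical, standing in for whatever SDK your destination uses):
```python
from typing import Any, Mapping

from airbyte_cdk import AirbyteLogger
from airbyte_cdk.destinations import Destination
from airbyte_cdk.models import AirbyteConnectionStatus, Status


class DestinationExample(Destination):  # placeholder name
    def check(self, logger: AirbyteLogger, config: Mapping[str, Any]) -> AirbyteConnectionStatus:
        try:
            client = ExampleClient(api_key=config["api_key"])  # hypothetical client and config key
            client.ping()  # hypothetical cheap call that exercises credentials and permissions
            return AirbyteConnectionStatus(status=Status.SUCCEEDED)
        except Exception as e:
            return AirbyteConnectionStatus(status=Status.FAILED, message=f"An exception occurred: {repr(e)}")
```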
Verify that the method is working by placing your config in `secrets/config.json` then running:
```bash
python main.py check --config secrets/config.json
```
### Step 5: Implement `write`
The `write` operation is the main workhorse of a destination connector: it reads input data from the source and writes it to the underlying destination. It takes as input the config file used to run the connector as well as the configured catalog: the file used to describe the schema of the incoming data and how it should be written to the destination. Its "output" is two things:
1. Data written to the underlying destination
2. `AirbyteMessage`s of type `AirbyteStateMessage`, written to stdout to indicate which records have been written so far during a sync. It's important to output these messages when possible in order to avoid re-extracting messages from the source. See the [write operation protocol reference](https://docs.airbyte.io/understanding-airbyte/airbyte-specification#write) for more information.
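In code, that contract usually looks something like the sketch below (the writer client is hypothetical): buffer records, and only yield a state message once everything received before it has been persisted:
```python
from typing import Any, Iterable, Mapping

from airbyte_cdk.destinations import Destination
from airbyte_cdk.models import AirbyteMessage, ConfiguredAirbyteCatalog, Type


class DestinationExample(Destination):  # placeholder name
    def write(
        self, config: Mapping[str, Any], configured_catalog: ConfiguredAirbyteCatalog, input_messages: Iterable[AirbyteMessage]
    ) -> Iterable[AirbyteMessage]:
        writer = ExampleWriter(config)  # hypothetical buffered client
        for message in input_messages:
            if message.type == Type.RECORD:
                writer.queue(message.record.stream, message.record.data)
            elif message.type == Type.STATE:
                writer.flush()  # persist everything received so far...
                yield message  # ...then emit the state message so Airbyte can checkpoint
        writer.flush()  # persist any trailing records
```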
@@ -176,22 +176,25 @@ To implement the `write` Airbyte operation, implement the `write` method in your
### Step 6: Set up Acceptance Tests
_Coming soon. These tests are not yet available for Python destinations but will be very soon. For now, please skip this step and rely on copious amounts of integration and unit testing_.
### Step 7: Write unit tests and/or integration tests
The Acceptance Tests are meant to cover the basic functionality of a destination. Think of them as the bare minimum required for us to add a destination to Airbyte. You should add unit tests or custom integration tests to cover any additional functionality of your destination.
Add unit tests in the `unit_tests/` directory and integration tests in the `integration_tests/` directory. Run them via:
```bash
python -m pytest -s -vv integration_tests/
```
See the [KvDB integration tests](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-kvdb/integration_tests/integration_test.py) for an example of tests you can implement.
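For example, a minimal integration test for `check` might look like the sketch below (module and class names are placeholders for your connector's):
```python
import json
from pathlib import Path

from airbyte_cdk import AirbyteLogger
from airbyte_cdk.models import Status

from destination_example.destination import DestinationExample  # placeholder names


def test_check_succeeds_with_valid_config():
    # Uses real credentials from the gitignored secrets directory.
    config = json.loads(Path("secrets/config.json").read_text())
    status = DestinationExample().check(AirbyteLogger(), config)
    assert status.status == Status.SUCCEEDED
```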
### Step 8: Update the docs
Each connector has its own documentation page. By convention, that page should live at `docs/integrations/destinations/<destination-name>.md`. For the documentation to get packaged with the docs, make sure to add a link to it in `docs/SUMMARY.md`. You can pattern-match from existing connectors.
## Wrapping up
Well done on making it this far! If you'd like your connector to ship with Airbyte by default, create a PR against the Airbyte repo and we'll work with you to get it across the finish line.


@@ -6,6 +6,8 @@ This is a blazing fast guide to building an HTTP source connector. Think of it a
If you are a visual learner and want to see a video version of this guide going over each part in detail, check it out below.
{% embed url="https://www.youtube.com/watch?v=kJ3hLoNfz_E&t=3s" caption="A speedy CDK overview." %}
## Dependencies
1. Python >= 3.7
@@ -38,7 +40,7 @@ cd source_python_http_example
We're working with the PokeAPI, so we need to define our input schema to reflect that. Open the `spec.json` file here and replace it with:
```json
{
"documentationUrl": "https://docs.airbyte.io/integrations/sources/pokeapi",
"connectionSpecification": {
@@ -58,10 +60,10 @@ We're working with the PokeAPI, so we need to define our input schema to reflect
}
}
```
As you can see, our input schema has a single required field, `pokemon_name`. Normally, input schemas will contain information such as API keys and client secrets that need to get passed down to all endpoints or streams.
Ok, let's write a function that checks the inputs we just defined. Nuke the `source.py` file. Now add this code to it. For a crucial time skip, we're going to define all the imports we need in the future here. Also note that your `AbstractSource` class name must be a camel-cased version of the name you gave in the generation phase. In our case, this is `SourcePythonHttpExample`.
```python
from typing import Any, Iterable, List, Mapping, MutableMapping, Optional, Tuple
@@ -152,11 +154,9 @@ class Pokemon(HttpStream):
return None # TODO
```
Now download [this file](https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/docs/tutorials/http_api_source_assets/pokemon.json). Name it `pokemon.json` and place it in `/source_python_http_example/schemas`.
This file defines your output schema for every endpoint that you want to implement. Normally, this will likely be the most time-consuming section of the connector development process, as it requires defining the output of the endpoint exactly. This is really important, as Airbyte needs to have clear expectations for what the stream will output. Note that the name of this stream will be consistent in the naming of the JSON schema and the `HttpStream` class, as `pokemon.json` and `Pokemon` respectively in this case. Learn more about schema creation [here](https://docs.airbyte.io/connector-development/cdk-python/full-refresh-stream#defining-the-streams-schema).
Test your discover function. You should receive a fairly large JSON object in return.
@@ -213,8 +213,7 @@ class Pokemon(HttpStream):
return None
```
We now need a catalog that defines all of our streams. We only have one stream: `Pokemon`. Download that file [here](https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/docs/tutorials/http_api_source_assets/configured_catalog_pokeapi.json). Place it in `/sample_files` named as `configured_catalog.json`. More clearly, this is where we tell Airbyte all the streams/endpoints the connector supports and which sync modes Airbyte can use to run it. Learn more about the AirbyteCatalog [here](https://docs.airbyte.io/understanding-airbyte/beginners-guide-to-catalog) and learn more about sync modes [here](https://docs.airbyte.io/understanding-airbyte/connections#sync-modes).
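If you want to sanity-check the catalog you just downloaded, the CDK's models can parse it directly; here's a small sketch (the path assumes you placed the file as described above):
```python
import json

from airbyte_cdk.models import ConfiguredAirbyteCatalog

# Load the configured catalog and list the streams it declares.
with open("sample_files/configured_catalog.json") as f:
    catalog = ConfiguredAirbyteCatalog.parse_obj(json.load(f))

print([configured.stream.name for configured in catalog.streams])  # expect ["pokemon"]
```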
Let's read some data.


@@ -8,7 +8,7 @@ $ cd airbyte-integrations/connector-templates/generator # assumes you are starti
$ ./generate.sh
```
This will bring up an interactive helper application. Use the arrow keys to pick a template from the list. Select the `Python HTTP API Source` template and then input the name of your connector. The application will create a new directory in airbyte/airbyte-integrations/connectors/ with the name of your new connector.
For this walk-through we will refer to our source as `python-http-example`. The finalized source code for this tutorial can be found [here](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-python-http-tutorial).


@@ -24,9 +24,9 @@ Optionally, we can provide additional inputs to customize requests:
Backoff policy options:
* `retry_factor`: specifies the factor for the exponential backoff policy (default: 5)
* `max_retries`: specifies the maximum number of retries for the backoff policy (default: 5)
* `raise_on_http_errors`: if set to `False`, allows opting out of raising exceptions on non-success HTTP status codes (default: `True`)
There are many other customizable options - you can find them in the [`airbyte_cdk.sources.streams.http.HttpStream`](https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/airbyte_cdk/sources/streams/http/http.py) class.
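These options are exposed as properties on `HttpStream`, so a subclass can override them; here's a hedged sketch (abstract members like `url_base`, `path`, and `parse_response` are omitted for brevity):
```python
from typing import Optional

from airbyte_cdk.sources.streams.http import HttpStream


class MyApiStream(HttpStream):  # illustrative stream, not a complete implementation
    @property
    def max_retries(self) -> Optional[int]:
        return 7  # retry up to 7 times instead of the default 5

    @property
    def retry_factor(self) -> float:
        return 10  # back off more slowly than the default factor of 5

    @property
    def raise_on_http_errors(self) -> bool:
        return False  # handle non-2xx responses yourself in parse_response
```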


@@ -12,7 +12,7 @@ Place any integration tests in the `integration_tests` directory such that they
## Standard Tests
Standard tests are a fixed set of tests Airbyte provides that every Airbyte source connector must pass. While they're only required if you intend to submit your connector to Airbyte, you might find them helpful in any case. See [Testing your connectors](../../testing-connectors/).
If you want to submit this connector to become a default connector within Airbyte, follow Step 8 onwards from the [Python source checklist](../building-a-python-source.md#step-8-set-up-standard-tests).


@@ -1,2 +1,2 @@
# Python CDK: Creating an HTTP API Source