mirror of synced 2026-01-02 21:02:43 -05:00

Revert "Revert "Merge branch 'master' of github.com:airbytehq/airbyte""

This reverts commit de66bf010d.
This commit is contained in:
Sherif Nada
2021-03-23 15:12:01 -07:00
parent de66bf010d
commit 77b72dcf5b
181 changed files with 17418 additions and 9915 deletions


@@ -17,7 +17,7 @@ All the commands below assume that `python` points to a version of python 3. On
### Creating a Source
* Step 1: Create the source using template
* Step 2: Build the newly generated source `./gradlew :airbyte-integrations:connectors:source-<source-name>:build`
* Step 2: Build the newly generated source
* Step 3: Set up your Airbyte development environment
* Step 4: Implement `spec` \(and define the specification for the source `airbyte-integrations/connectors/source-<source-name>/spec.json`\)
* Step 5: Implement `check`
@@ -42,10 +42,8 @@ All `./gradlew` commands must be run from the root of the airbyte project.
* If you need help with any step of the process, feel free to submit a PR with your progress and any questions you have.
* Submit a PR.
* To run integration tests, Airbyte needs access to a test account/environment. Coordinate with an Airbyte engineer \(via the PR\) to add test credentials so that we can run tests for the integration in the CI. \(We will create our own test account once you let us know what source we need to create it for.\)
* Once the config is stored in Github Secrets, edit `.github/workflows/test-command.yml` to inject the config into the build environment.
* Once the config is stored in Github Secrets, edit `.github/workflows/test-command.yml` and `.github/workflows/publish-command.yml` to inject the config into the build environment.
* Edit the `airbyte/tools/bin/ci_credentials.sh` script to pull the config from the build environment and write it to `secrets/config.json` during the build.
* From the `airbyte` project root, run `./gradlew :airbyte-integrations:connectors:source-<source-name>:build` to make sure your module builds.
* Apply Airbyte auto formatting `./gradlew format` and commit any changes.
{% hint style="info" %}
If you have a question about a step in the Submitting a Source to Airbyte checklist, include it in your PR or ask it on [slack](https://slack.airbyte.io).
@@ -68,9 +66,16 @@ Select the `python` template and then input the name of your connector. For this
### Step 2: Build the newly generated source
Build the source using: `./gradlew :airbyte-integrations:connectors:source-<source-name>:build`
Build the source by running:
This step sets up the initial python environment. By sanity checking that the source builds at the beginning we have a good starting place for developing our source.
```
cd airbyte-integrations/connectors/source-<name>
python -m venv .venv # Create a virtual environment in the .venv directory
source .venv/bin/activate # enable the venv
pip install -r requirements.txt
```
This step sets up the initial python environment. **All** subsequent `python` or `pip` commands assume you have activated your virtual environment.
### Step 3: Set up your Airbyte development environment
@@ -85,11 +90,12 @@ The generator creates a file `source_<source_name>/source.py`. This will be wher
Python dependencies for your source should be declared in `airbyte-integrations/connectors/source-<source-name>/setup.py` in the `install_requires` field. You will notice that a couple of Airbyte dependencies are already declared there. Do not remove these; they give your source access to the helper interface that is provided by the generator.
You may notice that there is a `requirements.txt` in your source's directory as well. Do not touch this. It helps IDEs pull in local Airbyte dependencies to help with code completion. It is _not_ used outside of the development environment. All dependencies should be declared in `setup.py`.
You may notice that there is a `requirements.txt` in your source's directory as well. Do not touch this. It is autogenerated and used to provide Airbyte
dependencies. All your dependencies should be declared in `setup.py`.
#### Development Environment
Running `./gradlew :airbyte-integrations:connectors:source-<source-name>:build` creates a virtual environment for your source. If you want your IDE to auto complete and resolve dependencies properly, point it at the virtual env `airbyte-integrations/connectors/source-<source-name>/.venv`. Also anytime you change the dependencies in the `setup.py` make sure to re-run the build command. The build system will handle installing all dependencies in the `setup.py` into the virtual environment.
The commands we ran above created a virtual environment for your source. If you want your IDE to autocomplete and resolve dependencies properly, point it at the virtual env `airbyte-integrations/connectors/source-<source-name>/.venv`. Also, any time you change the dependencies in `setup.py`, make sure to re-run the build command. The build system will handle installing all dependencies in `setup.py` into the virtual environment.
Pretty much all it takes to create a source is to implement the `Source` interface. The template fills in a lot of information for you and has extensive docstrings describing what you need to do to implement each method. The next 4 steps are just implementing that interface.
@@ -117,18 +123,21 @@ The nice thing about this approach is that you can iterate completely within in
**Run the source using docker**
If you want to run your source exactly as it will be run by Airbyte \(i.e. within a docker container\), you can use the following commands:
If you want to run your source exactly as it will be run by Airbyte \(i.e. within a docker container\), you can use the following commands from the
connector module directory (`airbyte-integrations/connectors/source-example-python`):
```text
# in airbyte root directory
./gradlew :airbyte-integrations:connectors:source-example-python:airbyteDocker
# First build the container
docker build . -t airbyte/source-example-python:dev
# Then use the following commands to run it
docker run --rm airbyte/source-example-python:dev spec
docker run --rm -v $(pwd)/airbyte-integrations/connectors/source-example-python/secrets:/secrets airbyte/source-example-python:dev check --config /secrets/config.json
docker run --rm -v $(pwd)/airbyte-integrations/connectors/source-example-python/secrets:/secrets airbyte/source-example-python:dev discover --config /secrets/config.json
docker run --rm -v $(pwd)/airbyte-integrations/connectors/source-example-python/secrets:/secrets -v $(pwd)/airbyte-integrations/connectors/source-example-python/sample_files:/sample_files airbyte/source-example-python:dev read --config /secrets/config.json --catalog /sample_files/configured_catalog.json
docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-example-python:dev check --config /secrets/config.json
docker run --rm -v $(pwd)/secrets:/secrets airbyte/source-example-python:dev discover --config /secrets/config.json
docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/sample_files:/sample_files airbyte/source-example-python:dev read --config /secrets/config.json --catalog /sample_files/configured_catalog.json
```
Note: Each time you make a change to your implementation you need to re-build the connector. `./gradlew :airbyte-integrations:connectors:source-<source-name>:build`. This makes sure that the new python code is added into the docker container.
Note: Each time you make a change to your implementation, you need to re-build the connector image with `docker build . -t airbyte/source-example-python:dev`. This ensures the new python code is added into the docker container.
The nice thing about this approach is that you are running your source exactly as it will be run by Airbyte. The tradeoff is that iteration is slightly slower, because you need to re-build the connector between each change.
@@ -152,7 +161,7 @@ The generated code that Airbyte provides, handles implementing the `spec` method
As described in the template code, this method takes in a json object called config that has the values described in the `spec.json` filled in. In other words if the `spec.json` said that the source requires a `username` and `password` the config object might be `{ "username": "airbyte", "password": "password123" }`. It returns a json object that reports, given the credentials in the config, whether we were able to connect to the source. For example, with the given credentials could the source connect to the database server.
While developing, we recommend storing this object in `secrets/config.json`. All tests assume that is where credentials will be stored.
While developing, we recommend storing this object in `secrets/config.json`. The `secrets` directory is gitignored by default.
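As a rough illustration of the `check` contract described above, the sketch below uses plain dictionaries instead of the Airbyte helper classes. The names `check_connection`, `try_login`, and `fake_login`, as well as the `status` and `message` keys, are assumptions made for this example and are not taken from the generated template.

```python
import json
from typing import Any, Callable, Dict


def check_connection(
    config: Dict[str, Any],
    try_login: Callable[[str, str], None],
) -> Dict[str, str]:
    """Report whether the credentials in `config` can reach the source.

    `try_login` is a hypothetical callable standing in for the real
    connection attempt; it is expected to raise an exception on failure.
    """
    try:
        try_login(config["username"], config["password"])
    except Exception as error:  # any failure means the check did not pass
        return {"status": "FAILED", "message": str(error)}
    return {"status": "SUCCEEDED", "message": ""}


def fake_login(username: str, password: str) -> None:
    # Stand-in for a real database or API login (assumption for this sketch).
    if password != "password123":
        raise ValueError("invalid credentials")


# The same shape of object you would store in secrets/config.json.
config = json.loads('{ "username": "airbyte", "password": "password123" }')
result = check_connection(config, fake_login)
```

In a real source the connection attempt would go inside the `check` method that the template generates; the point here is only that the method reports success or failure rather than raising.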
### Step 6: Implement `discover`
@@ -162,13 +171,14 @@ For a brief overview on the catalog check out [Beginner's Guide to the Airbyte C
### Step 7: Implement `read`
As described in the template code, this method takes in the same config object as the previous methods. It also takes in a "configured catalog". This object wraps the catalog emitted by the `discover` step and includes configuration on how the data should be replicated. For a brief overview on the configured catalog check out [Beginner's Guide to the Airbyte Catalog](beginners-guide-to-catalog.md). It then returns each record that it fetches from the source as a stream \(or generator\).
As described in the template code, this method takes in the same config object as the previous methods. It also takes in a "configured catalog". This object wraps the catalog emitted by the `discover` step and includes configuration on how the data should be replicated. For a brief overview on the configured catalog check out [Beginner's Guide to the Airbyte Catalog](beginners-guide-to-catalog.md). It then returns a generator which yields each record in the stream.
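The generator behaviour of `read` can be sketched in plain Python. This is a minimal example under stated assumptions: `read_records` and `fetch_rows` are hypothetical names, the configured catalog is modelled as a dictionary with a `streams` list, and the emitted records are simple dictionaries rather than the message classes the template provides.

```python
from typing import Any, Callable, Dict, Iterable, Iterator


def read_records(
    config: Dict[str, Any],
    configured_catalog: Dict[str, Any],
    fetch_rows: Callable[[Dict[str, Any], str], Iterable[Dict[str, Any]]],
) -> Iterator[Dict[str, Any]]:
    """Yield one record at a time for every stream in the configured catalog.

    `fetch_rows` is a hypothetical callable that returns the rows of a
    single stream from the underlying source.
    """
    for configured_stream in configured_catalog["streams"]:
        name = configured_stream["stream"]["name"]
        for row in fetch_rows(config, name):
            # Records are produced lazily, so nothing is buffered in memory.
            yield {"stream": name, "data": row}


def fake_fetch_rows(config: Dict[str, Any], stream_name: str) -> Iterable[Dict[str, Any]]:
    # Stand-in data source for the sketch (assumption).
    tables = {"users": [{"id": 1}, {"id": 2}], "orders": [{"id": 10}]}
    return tables.get(stream_name, [])


catalog = {"streams": [{"stream": {"name": "users"}}, {"stream": {"name": "orders"}}]}
records = list(read_records({}, catalog, fake_fetch_rows))
```

Because the function contains `yield`, calling it returns a generator immediately and the rows are only fetched as the caller iterates.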
### Step 8: Set up Standard Tests
The Standard Tests are a set of tests that run against all sources. These tests are run in the Airbyte CI to prevent regressions. They also can help you sanity check that your source works as expected. The following [article](../contributing-to-airbyte/building-new-connector/testing-connectors.md) gives a brief overview of the Standard Tests and explains what you need to do to set up these tests.
The Standard Tests are a set of tests that run against all sources. These tests are run in the Airbyte CI to prevent regressions. They also can help you sanity check that your source works as expected. The following [article](../contributing-to-airbyte/building-new-connector/testing-connectors.md) explains Standard Tests and how to run them.
You can run the tests using `./gradlew :airbyte-integrations:connectors:source-<source-name>:integrationTest`
You can run the tests using `./gradlew :airbyte-integrations:connectors:source-<source-name>:integrationTest`. Make sure to run this command from
the Airbyte repository root.
{% hint style="info" %}
In some rare cases we make exceptions and allow a source to not need to pass all the standard tests. If for some reason you think your source cannot reasonably pass one of the test cases, reach out to us on GitHub or Slack, and we can determine whether there's a change we can make so that the test will pass or if we should skip that test for your source.
@@ -176,17 +186,19 @@ In some rare cases we make exceptions and allow a source to not need to pass all
### Step 9: Write unit tests and/or integration tests
The Standard Tests are meant to cover the basic functionality of a source. Think of it as the bare minimum required for us to add a source to Airbyte.
The Standard Tests are meant to cover the basic functionality of a source. Think of it as the bare minimum required for us to add a source to Airbyte. In case you need to test additional functionality of your source, write unit or integration tests.
#### Unit Tests
Add any relevant unit tests to the `unit_tests` directory. Unit tests should _not_ depend on any secrets.
You can run the tests using `./gradlew :airbyte-integrations:connectors:source-<source-name>:test`
You can run the tests using `python -m pytest -s unit_tests`
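A unit test file that pytest would pick up from `unit_tests` might look like the sketch below. The file name `test_records.py` and the helper `normalize_record` are hypothetical; in practice the helper would be imported from your source package rather than defined in the test file.

```python
# unit_tests/test_records.py (hypothetical file name)
from typing import Any, Dict


def normalize_record(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper that would normally live in your source package."""
    return {key.lower(): value for key, value in raw.items()}


def test_normalize_record_lowercases_keys():
    # No secrets or network access are needed, as unit tests require.
    assert normalize_record({"ID": 1, "Name": "a"}) == {"id": 1, "name": "a"}


def test_normalize_record_handles_empty_input():
    assert normalize_record({}) == {}
```

Functions whose names start with `test_`, in files whose names start with `test_`, are collected by pytest's default discovery rules.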
#### Integration Tests
_coming soon_
Place any integration tests in the `integration_tests` directory such that they can be [discovered by pytest](https://docs.pytest.org/en/reorganize-docs/new-docs/user/naming_conventions.html).
Run integration tests using `python -m pytest -s integration_tests`.
#### Step 10: Update the `README.md`
@@ -194,7 +206,7 @@ The template fills in most of the information for the readme for you. Unless the
#### Step 11: Add the connector to the API/UI
Open the following file: `airbyte-config/init/src/main/resources/seed/source_definitions.yaml`. You'll find a list of all the connectors that Airbyte displays in the UI. Pattern match to add your own connector. Make sure to generate a new _unique_ UUIDv4 for the `sourceDefinitionId` field. You can get one [here](https://www.uuidgenerator.net/). After you do, run `./gradlew :airbyte-config:init:build` \(this command generates some necessary configuration files\).
Open the following file: `airbyte-config/init/src/main/resources/seed/source_definitions.yaml`. You'll find a list of all the connectors that Airbyte displays in the UI. Pattern match to add your own connector. Make sure to generate a new _unique_ UUIDv4 for the `sourceDefinitionId` field. You can get one [here](https://www.uuidgenerator.net/).
Note that for simple and quick testing use cases, you can also do this step [using the UI](../integrations/custom-connectors.md#adding-your-connectors-in-the-ui).
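If you prefer not to use a website for the `sourceDefinitionId`, a UUIDv4 can also be generated locally with Python's standard library. This is only a convenience sketch; the variable name is an assumption.

```python
import uuid

# uuid4() produces a random (version 4) UUID.
source_definition_id = str(uuid.uuid4())
print(source_definition_id)
```

Paste the printed value into the `sourceDefinitionId` field of your new entry in `source_definitions.yaml`.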