# Defining your stream schemas
Your connector must describe the schema of each stream it can output using [JSON Schema](https://json-schema.org).
The simplest way to do this is to provide one `.json` file per stream. You can also generate a stream's schema dynamically in code, or combine both approaches: start with a `.json` file and add properties to it dynamically.
The schema of a stream is the return value of `Stream.get_json_schema`.
## Static schemas
By default, `Stream.get_json_schema` reads a `.json` file in the `schemas/` directory whose name matches the value of the `Stream.name` property. `Stream.name`, in turn, returns the class name in snake case by default. Therefore, for a class declared as `class EmployeeBenefits(HttpStream)`, the default behavior looks for a file called `schemas/employee_benefits.json`. You can override any of these behaviors as needed.
Important note: any objects referenced via `$ref` should be placed in the `shared/` directory in their own `.json` files.
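The default naming convention described above can be illustrated with a short sketch. The helper names `camel_to_snake` and `default_schema_path` are hypothetical and exist only for this illustration; this is not the CDK's own implementation, just one plausible way to map a class name to its schema file:

```python
import re


def camel_to_snake(name: str) -> str:
    # Insert an underscore before every capital letter except the first,
    # then lowercase the result. (Hypothetical helper, not the CDK's code.)
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def default_schema_path(class_name: str) -> str:
    # Mirrors the documented default: schemas/<stream name>.json
    return f"schemas/{camel_to_snake(class_name)}.json"


path = default_schema_path("EmployeeBenefits")
```

Under these assumptions, a stream class named `EmployeeBenefits` maps to `schemas/employee_benefits.json`, matching the example in the text.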
### Generating schemas from OpenAPI definitions
If you are implementing a connector that pulls data from an API which publishes an [OpenAPI/Swagger spec](https://swagger.io/specification/), you can use the provided tool to generate JSON schemas from the OpenAPI definition file. See the [openapi2jsonschema tool](https://github.com/airbytehq/airbyte/tree/master/tools/openapi2jsonschema/) for details.
## Dynamic schemas
If you'd rather define your schema in code, override `Stream.get_json_schema` in your stream class to return a `dict` describing the schema using [JSONSchema](https://json-schema.org).
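A minimal sketch of a fully dynamic schema might look like the following. To keep it runnable on its own, it uses a stand-in `Stream` base class instead of importing the CDK; in a real connector the class would extend the CDK's stream class (for example `HttpStream`). The field names `id` and `benefit_name` are assumptions made up for the example:

```python
from typing import Any, Mapping


class Stream:
    """Stand-in for the CDK's Stream base class (assumption, illustration only)."""

    def get_json_schema(self) -> Mapping[str, Any]:
        raise NotImplementedError


class EmployeeBenefits(Stream):
    def get_json_schema(self) -> Mapping[str, Any]:
        # Build the schema in code instead of reading schemas/employee_benefits.json.
        # The properties listed here are hypothetical.
        return {
            "$schema": "http://json-schema.org/draft-07/schema#",
            "type": "object",
            "properties": {
                "id": {"type": "integer"},
                "benefit_name": {"type": ["null", "string"]},
            },
        }


schema = EmployeeBenefits().get_json_schema()
```

The returned `dict` plays the same role as the contents of a static `.json` file, so it should be a complete JSON Schema object with `type` and `properties` keys.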
## Dynamically modifying static schemas
Override `Stream.get_json_schema` to run the default behavior, edit the returned value, then return the edited value:
```
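def get_json_schema(self):
    # Start from the static schema loaded from the stream's .json file
    schema = super().get_json_schema()
    # Add a property whose definition is only known at runtime
    schema["properties"]["dynamically_determined_property"] = {"type": "string"}
    return schema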
```