Record selector
The record selector is responsible for translating an HTTP response into a list of Airbyte records by extracting records from the response and optionally filtering and shaping records based on a heuristic.
The current record extraction implementation uses dpath to select records from the JSON-decoded HTTP response.
Common recipes:
Here are some common patterns:
Selecting the whole response
If the root of the response is an array containing the records, the records can be extracted using the following definition:
selector:
  extractor:
    field_pointer: [ ]
If the root of the response is a JSON object representing a single record, the same definition extracts the record and wraps it in an array.
For example, given a response body of the form
{
  "id": 1
}
and a selector
selector:
  extractor:
    field_pointer: [ ]
The selected records will be
[
  {
    "id": 1
  }
]
Selecting a field
Given a response body of the form
{
  "data": [{"id": 0}, {"id": 1}],
  "metadata": {"api-version": "1.0.0"}
}
and a selector
selector:
  extractor:
    field_pointer: [ "data" ]
The selected records will be
[
  {
    "id": 0
  },
  {
    "id": 1
  }
]
Selecting an inner field
Given a response body of the form
{
  "data": {
    "records": [
      {
        "id": 1
      },
      {
        "id": 2
      }
    ]
  }
}
and a selector
selector:
  extractor:
    field_pointer:
      - "data"
      - "records"
The selected records will be
[
  {
    "id": 1
  },
  {
    "id": 2
  }
]
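The three selection recipes above can be sketched in plain Python. This is only an illustration of the behavior described here, not the connector framework's implementation (which uses dpath); the helper name `select_records` and the choice to return an empty list for a missing path are assumptions made for the example.

```python
from typing import Any, List


def select_records(body: Any, field_pointer: List[str]) -> List[Any]:
    """Hypothetical helper: follow field_pointer into the decoded response body."""
    node = body
    for key in field_pointer:
        if not isinstance(node, dict) or key not in node:
            return []  # assumption: a missing path yields no records
        node = node[key]
    # An array is returned as-is; a single object is wrapped in a list.
    return node if isinstance(node, list) else [node]


# field_pointer: [ ] on an array root and on a single-object root
whole_array = select_records([{"id": 0}, {"id": 1}], [])
single_object = select_records({"id": 1}, [])

# field_pointer: [ "data" ]
field = select_records(
    {"data": [{"id": 0}, {"id": 1}], "metadata": {"api-version": "1.0.0"}},
    ["data"],
)

# field_pointer: [ "data", "records" ]
inner = select_records({"data": {"records": [{"id": 1}, {"id": 2}]}}, ["data", "records"])
```

An empty pointer selects the root, and each additional entry in the pointer descends one level into the response.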
Filtering records
Records can be filtered by adding a record_filter to the selector. The filter's condition is evaluated to a boolean for each record; the record is included only if the condition evaluates to true.
In this example, only records whose created_at field is less than the stream slice's start_time are kept; all other records are filtered out:
selector:
  extractor:
    field_pointer: [ ]
  record_filter:
    condition: "{{ record['created_at'] < stream_slice['start_time'] }}"
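To make the direction of the comparison concrete, here is a rough Python equivalent of that condition. It is a sketch, not the framework's filter: the function name `filter_records`, the sample records, and the use of ISO-8601 date strings (which compare correctly as plain strings) are all assumptions for illustration.

```python
from typing import Any, Dict, List


def filter_records(records: List[Dict[str, Any]], stream_slice: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical stand-in for the condition: keep a record when it evaluates to true."""
    return [r for r in records if r["created_at"] < stream_slice["start_time"]]


# Sample data (invented for this example).
records = [
    {"id": 1, "created_at": "2022-01-01"},
    {"id": 2, "created_at": "2022-03-01"},
]
stream_slice = {"start_time": "2022-02-01"}

kept = filter_records(records, stream_slice)
```

Only the record with `id` 1 passes the condition, so it is the only one kept.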
Transformations
Fields can be added to or removed from records by adding Transformations to a stream's definition.
Adding fields
Fields can be added with the AddFields transformation.
This example adds a top-level field "field1" with a value "static_value":
stream:
  <...>
  transformations:
    - type: AddFields
      fields:
        - path: [ "field1" ]
          value: "static_value"
This example adds a top-level field "start_date", whose value is evaluated from the stream slice:
stream:
  <...>
  transformations:
    - type: AddFields
      fields:
        - path: [ "start_date" ]
          value: "{{ stream_slice['start_date'] }}"
Fields can also be added in a nested object by writing the fields' path as a list.
Given a record of the following shape:
{
  "id": 0,
  "data": {
    "field0": "some_data"
  }
}
this definition will add a field in the "data" nested object:
stream:
  <...>
  transformations:
    - type: AddFields
      fields:
        - path: [ "data", "field1" ]
          value: "static_value"
resulting in the following record:
{
  "id": 0,
  "data": {
    "field0": "some_data",
    "field1": "static_value"
  }
}
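The effect of a nested path can be sketched in a few lines of Python. This is an illustration only, not the AddFields implementation: the helper name `add_field` is hypothetical, and creating missing intermediate objects along the path is an assumption of this sketch.

```python
import copy
from typing import Any, Dict, List


def add_field(record: Dict[str, Any], path: List[str], value: Any) -> Dict[str, Any]:
    """Hypothetical helper: set `value` at `path`, leaving the input record untouched."""
    result = copy.deepcopy(record)
    node = result
    for key in path[:-1]:
        # Assumption: intermediate objects are created when they do not exist yet.
        node = node.setdefault(key, {})
    node[path[-1]] = value
    return result


record = {"id": 0, "data": {"field0": "some_data"}}
updated = add_field(record, ["data", "field1"], "static_value")
```

Every entry in the path except the last names an object to descend into; the last entry is the name of the field being added.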
Removing fields
Fields can be removed from records with the RemoveFields transformation.
Given a record of the following shape:
{
  "path": {
    "to": {
      "field1": "data_to_remove",
      "field2": "data_to_keep"
    }
  },
  "path2": "data_to_remove",
  "path3": "data_to_keep"
}
this definition will remove the two instances of "data_to_remove", which are found at "path2" and "path.to.field1":
the_stream:
  <...>
  transformations:
    - type: RemoveFields
      field_pointers:
        - [ "path", "to", "field1" ]
        - [ "path2" ]
resulting in the following record:
{
  "path": {
    "to": {
      "field2": "data_to_keep"
    }
  },
  "path3": "data_to_keep"
}
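As with adding fields, the removal can be sketched in plain Python. This is not the RemoveFields implementation: the helper name `remove_fields` is hypothetical, and silently skipping pointers that do not exist in the record is an assumption of this sketch.

```python
import copy
from typing import Any, Dict, List


def remove_fields(record: Dict[str, Any], field_pointers: List[List[str]]) -> Dict[str, Any]:
    """Hypothetical helper: delete the field at each pointer, leaving the input untouched."""
    result = copy.deepcopy(record)
    for pointer in field_pointers:
        node: Any = result
        # Walk down to the object that holds the field to remove.
        for key in pointer[:-1]:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if isinstance(node, dict):
            # Assumption: a pointer to a missing field is ignored.
            node.pop(pointer[-1], None)
    return result


record = {
    "path": {"to": {"field1": "data_to_remove", "field2": "data_to_keep"}},
    "path2": "data_to_remove",
    "path3": "data_to_keep",
}
cleaned = remove_fields(record, [["path", "to", "field1"], ["path2"]])
```

Each pointer is handled independently, so a top-level field and a deeply nested field can be removed by the same transformation.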