Elasticsearch

This page contains the setup guide and reference information for the Elasticsearch source connector.

Prerequisites

Requirements

  • Elasticsearch endpoint URL
  • Elasticsearch credentials (optional)

Supported sync modes

| Feature | Supported? (Yes/No) | Notes |
| :--- | :--- | :--- |
| Full Refresh Sync | Yes | |
| Incremental Sync | No | |

This source syncs data from an Elasticsearch domain.

Supported Streams

This source automatically discovers all indices in the domain and can sync any of them.
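
As a rough illustration of what index discovery involves, the sketch below lists index names through Elasticsearch's `_cat/indices` API using only the Python standard library. The helper names and the unauthenticated request are assumptions made for this example; it is not the connector's actual implementation.

```python
import json
import urllib.request


def parse_index_names(cat_indices_body: str) -> list[str]:
    """Extract index names from a `GET /_cat/indices?format=json` response body."""
    rows = json.loads(cat_indices_body)
    # Each row is a JSON object; the index name is stored under the "index" key.
    return sorted(row["index"] for row in rows)


def list_indices(endpoint: str) -> list[str]:
    """Fetch and parse the index list from an endpoint (no authentication in this sketch)."""
    url = endpoint.rstrip("/") + "/_cat/indices?format=json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_index_names(resp.read().decode("utf-8"))
```

Keeping the parsing separate from the HTTP call makes the logic easy to check against a saved response without a running cluster.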

Performance Considerations

Elasticsearch calls may be rate limited by the underlying service. The limits are specific to each deployment.
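
One common way for client code to cope with rate limiting is to retry with exponential backoff. The sketch below is a generic pattern, not something this page says the connector does; the `RateLimited` exception, the function name, and the default delays are all hypothetical.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised by the caller when the service signals throttling."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run `call`, retrying on RateLimited with delays of base_delay * 2**attempt seconds."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            # Give up after the final attempt; otherwise wait and try again.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the retry schedule can be verified without actually waiting.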

Data type map

Elasticsearch data types: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html

Airbyte data types: https://docs.airbyte.com/understanding-airbyte/supported-data-types/

Elasticsearch has no dedicated array data type. By default, any field can contain zero or more values, but all values in the array must be of the same data type. Every field can therefore also be an array.
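
Because a field may hold either a single value or several, code that consumes synced records may want to normalize both shapes. A minimal sketch, with `as_list` as a hypothetical helper name:

```python
def as_list(value):
    """Normalize a field value that may be missing, a scalar, or a list of values."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]
```

For example, a `tags` field indexed as `"a"` in one document and `["a", "b"]` in another can then be handled uniformly.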

| Integration Type | Airbyte Type | Notes |
| :--- | :--- | :--- |
| binary | ["string", "array"] | |
| boolean | ["boolean", "array"] | |
| keyword | ["string", "array", "number", "integer"] | |
| constant_keyword | ["string", "array", "number", "integer"] | |
| wildcard | ["string", "array", "number", "integer"] | |
| long | ["integer", "array"] | |
| unsigned_long | ["integer", "array"] | |
| integer | ["integer", "array"] | |
| short | ["integer", "array"] | |
| byte | ["integer", "array"] | |
| double | ["number", "array"] | |
| float | ["number", "array"] | |
| half_float | ["number", "array"] | |
| scaled_float | ["number", "array"] | |
| date | ["string", "array"] | |
| date_nanos | ["number", "array"] | |
| object | ["object", "array"] | |
| flattened | ["object", "array"] | |
| nested | ["object", "string"] | |
| join | ["object", "string"] | |
| integer_range | ["object", "array"] | |
| float_range | ["object", "array"] | |
| long_range | ["object", "array"] | |
| double_range | ["object", "array"] | |
| date_range | ["object", "array"] | |
| ip_range | ["object", "array"] | |
| ip | ["string", "array"] | |
| version | ["string", "array"] | |
| murmur3 | ["string", "array", "number", "integer"] | |
| aggregate_metric_double | ["string", "array", "number", "integer"] | |
| histogram | ["string", "array", "number", "integer"] | |
| text | ["string", "array", "number", "integer"] | |
| alias | ["string", "array", "number", "integer"] | |
| search_as_you_type | ["string", "array", "number", "integer"] | |
| token_count | ["string", "array", "number", "integer"] | |
| dense_vector | ["string", "array", "number", "integer"] | |
| geo_point | ["string", "array", "number", "integer"] | |
| geo_shape | ["string", "array", "number", "integer"] | |
| shape | ["string", "array", "number", "integer"] | |
| point | ["string", "array", "number", "integer"] | |
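
The mapping above can be expressed as a lookup table. The sketch below copies the table into a Python dictionary for illustration; the names `AIRBYTE_TYPES` and `airbyte_types` are hypothetical and are not taken from the connector's source.

```python
# Airbyte types per Elasticsearch field type, copied from the table above.
_CATCH_ALL = ["string", "array", "number", "integer"]

AIRBYTE_TYPES = {
    **{t: ["string", "array"] for t in ("binary", "date", "ip", "version")},
    "boolean": ["boolean", "array"],
    **{t: ["integer", "array"] for t in (
        "long", "unsigned_long", "integer", "short", "byte")},
    **{t: ["number", "array"] for t in (
        "double", "float", "half_float", "scaled_float", "date_nanos")},
    **{t: ["object", "array"] for t in (
        "object", "flattened", "integer_range", "float_range",
        "long_range", "double_range", "date_range", "ip_range")},
    **{t: ["object", "string"] for t in ("nested", "join")},
    **{t: list(_CATCH_ALL) for t in (
        "keyword", "constant_keyword", "wildcard", "murmur3",
        "aggregate_metric_double", "histogram", "text", "alias",
        "search_as_you_type", "token_count", "dense_vector",
        "geo_point", "geo_shape", "shape", "point")},
}


def airbyte_types(es_type: str) -> list[str]:
    """Return the Airbyte types for an Elasticsearch field type (KeyError if unlisted)."""
    return list(AIRBYTE_TYPES[es_type])
```

Raising `KeyError` for unlisted types is a design choice of this sketch; the table does not say how types outside it are handled.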

Changelog

| Version | Date | Pull Request | Subject |
| :--- | :--- | :--- | :--- |
| 0.1.1 | 2022-12-02 | 18118 | Avoid too_long_frame_exception |
| 0.1.0 | 2022-07-12 | 14118 | Initial Release |
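
The 0.1.1 fix batches the mapping queries in chunks so that the request URL stays under a 4096-byte limit. The sketch below illustrates that batching idea using Elasticsearch's `GET /<index,...>/_mapping` path form; the function names and the default chunk size of 15 are assumptions for this example, not values taken from the connector.

```python
def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def mapping_paths(index_names, chunk_size=15):
    """Build one `/_mapping` request path per chunk of comma-joined index names."""
    return ["/" + ",".join(chunk) + "/_mapping"
            for chunk in chunked(index_names, chunk_size)]
```

A fixed chunk size is simple but does not strictly bound the URL length when index names are very long; a stricter variant would accumulate names until a byte budget is reached.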