Vector DB based destinations
Note: All helpers in this directory are experimental and subject to change.
This directory contains helpers for building a destination that processes and chunks records, embeds their text content, and loads them into a vector database. The destination connector itself defines the loading behavior; the helpers handle chunking and embedding.
To use these helpers, install the CDK with the vector-db-based extra:
pip install "airbyte-cdk[vector-db-based]"
The helpers can be used in the following way:
- Add the config models to the spec of the connector
- Implement the Indexer interface for your specific database
- In the check implementation of the destination, initialize the indexer and the embedder and call check on them
- In the write implementation of the destination, initialize the indexer and the embedder and pass them to a new instance of the writer. Then call the writer's write method with the iterable of incoming messages
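The steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the CDK API: StubEmbedder and StubWriter are hypothetical stand-ins for the helper-provided embedder and writer, the method signatures are assumptions, and messages are modeled as plain dicts with a "text" key. Check the helper modules for the real class names and constructor arguments.

```python
from typing import Iterable, List, Optional, Tuple


class StubEmbedder:
    """Hypothetical stand-in for a helper-provided embedder."""

    def check(self) -> Optional[str]:
        return None  # in this sketch, None means "no error"

    def embed(self, texts: List[str]) -> List[List[float]]:
        # Toy embedding: one number per text (its length).
        return [[float(len(text))] for text in texts]


class MyIndexer:
    """Hypothetical indexer; a real one would talk to the vector database."""

    def __init__(self) -> None:
        self.stored: List[Tuple[str, List[float]]] = []

    def check(self) -> Optional[str]:
        return None  # a real indexer would test connectivity here

    def index(self, texts: List[str], vectors: List[List[float]]) -> None:
        self.stored.extend(zip(texts, vectors))


class StubWriter:
    """Hypothetical stand-in for the helper-provided writer."""

    def __init__(self, indexer: MyIndexer, embedder: StubEmbedder) -> None:
        self.indexer = indexer
        self.embedder = embedder

    def write(self, messages: Iterable[dict]) -> None:
        texts = [message["text"] for message in messages]
        self.indexer.index(texts, self.embedder.embed(texts))


class MyDestination:
    def check(self) -> List[str]:
        # Initialize the indexer and the embedder, then call check on both.
        results = [MyIndexer().check(), StubEmbedder().check()]
        return [error for error in results if error is not None]

    def write(self, messages: Iterable[dict]) -> MyIndexer:
        # Initialize the indexer and the embedder, hand them to a new writer,
        # then pass the incoming messages to the writer's write method.
        indexer = MyIndexer()
        StubWriter(indexer, StubEmbedder()).write(messages)
        return indexer  # returned only so the sketch is easy to inspect
```

Calling MyDestination().check() returns an empty list when both components report no error, and MyDestination().write([...]) returns the indexer so the stored text and vector pairs can be inspected.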
This is how the components interact:
┌─────────────┐
│MyDestination│
└┬────────────┘
┌▽───────────────────────────────┐
│Writer                          │
└┬─────────┬──────────┬────────┬─┘
┌▽───────┐┌▽────────┐┌▽──────┐┌▽────────────────┐
│Embedder││MyIndexer││Batcher││DocumentProcessor│
└────────┘└─────────┘└───────┘└─────────────────┘
Normally, only the MyDestination class and the MyIndexer class have to be implemented specifically for the destination. The other classes are provided as is by the helpers.
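To make the diagram more concrete, here is a toy model of how a writer might delegate to the four components. Everything in it is an assumption made for illustration: the function and class names (process, embed, Batcher, ToyWriter), the fixed-size character chunking, and the dict-shaped records are hypothetical and do not reflect the actual helper implementations.

```python
from typing import Callable, Iterable, List, Tuple


def process(record: dict, chunk_size: int) -> List[str]:
    """DocumentProcessor role: split a record's text into fixed-size chunks."""
    text = record["text"]
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def embed(chunks: List[str]) -> List[List[float]]:
    """Embedder role: toy one-dimensional vector per chunk (its length)."""
    return [[float(len(chunk))] for chunk in chunks]


class Batcher:
    """Batcher role: collect records and flush them in groups of batch_size."""

    def __init__(self, batch_size: int, on_flush: Callable[[List[dict]], None]) -> None:
        self.batch_size = batch_size
        self._on_flush = on_flush
        self._items: List[dict] = []

    def add(self, item: dict) -> None:
        self._items.append(item)
        if len(self._items) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self._items:
            self._on_flush(self._items)
            self._items = []


class MyIndexer:
    """Destination-specific part: records each batch it is asked to load."""

    def __init__(self) -> None:
        self.calls: List[List[Tuple[str, List[float]]]] = []

    def index(self, chunks: List[str], vectors: List[List[float]]) -> None:
        self.calls.append(list(zip(chunks, vectors)))


class ToyWriter:
    """Writer role: batch records, then chunk, embed, and index each batch."""

    def __init__(self, indexer: MyIndexer, batch_size: int = 2, chunk_size: int = 4) -> None:
        self.indexer = indexer
        self.chunk_size = chunk_size
        self.batcher = Batcher(batch_size, self._flush_batch)

    def _flush_batch(self, records: List[dict]) -> None:
        chunks = [c for r in records for c in process(r, self.chunk_size)]
        self.indexer.index(chunks, embed(chunks))

    def write(self, records: Iterable[dict]) -> None:
        for record in records:
            self.batcher.add(record)
        self.batcher.flush()  # load whatever is left in the final partial batch
```

In this model only MyIndexer knows anything about the target database, which mirrors the split described above: the destination supplies the indexer, and the remaining roles are generic.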