Neil Macneale V 65e616877a 🎉 New source: Fauna (#15274 )

2022-09-29 10:37:03 -03:00

9.3 KiB

Fauna

This page guides you through setting up a Fauna source.

Overview

The Fauna source supports the following sync modes:

Full Sync - exports all the data from a Fauna collection.
Incremental Sync - exports data incrementally from a Fauna collection.

You need to create a separate source per collection that you want to export.

Preliminary setup

Enter the domain of the database that contains the collection you are exporting. The domain can be found in the Fauna docs.

Full sync

Follow these steps if you want this connection to perform a full sync.

Create a role that can read the collection that you are exporting. You can create the role in the Dashboard or the fauna shell with the following query:

CreateRole({
  name: "airbyte-readonly",
  privileges: [
    {
      resource: Collections(),
      actions: { read: true }
    },
    {
      resource: Indexes(),
      actions: { read: true }
    },
    {
      resource: Collection("COLLECTION_NAME"),
      actions: { read: true }
    }
  ],
})

Replace COLLECTION_NAME with the name of the collection configured for this connector. If you'd like to sync multiple collections, add an entry for each additional collection you'd like to sync. For example, to sync users and products, run this query instead:

CreateRole({
  name: "airbyte-readonly",
  privileges: [
    {
      resource: Collections(),
      actions: { read: true }
    },
    {
      resource: Indexes(),
      actions: { read: true }
    },
    {
      resource: Collection("users"),
      actions: { read: true }
    },
    {
      resource: Collection("products"),
      actions: { read: true }
    }
  ],
})

Create a key with that role. You can create a key using this query:

CreateKey({
  name: "airbyte-readonly",
  role: Role("airbyte-readonly"),
})

Copy the secret output by the CreateKey command and enter it as the "Fauna Secret" on the left. Important: The secret is displayed only once. If you lose it, you must create a new key.

Incremental sync

Follow these steps if you want this connection to perform incremental syncs.

Create the "Incremental Sync Index". This allows the connector to perform incremental syncs. You can create the index with the fauna shell or in the Dashboard with the following query:

CreateIndex({
  name: "INDEX_NAME",
  source: Collection("COLLECTION_NAME"),
  terms: [],
  values: [
    { "field": "ts" },
    { "field": "ref" }
  ]
})

Replace COLLECTION_NAME with the name of the collection configured for this connector. Replace INDEX_NAME with the name that you configured for the Incremental Sync Index.

Repeat this step for every collection you'd like to sync.
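Because the index values are ordered by ts and then ref, a reader can resume a sync from the last (ts, ref) pair it saw. The following is a minimal Python sketch of that idea, not the connector's actual implementation: the in-memory list stands in for the index, and the names read_after, cursor, and page_size are hypothetical.

```python
from bisect import bisect_right

# Hypothetical stand-in for the index: entries sorted by (ts, ref id),
# mirroring the order of `values` in the CreateIndex query above.
def read_after(index_entries, cursor, page_size=2):
    """Return up to page_size entries strictly after cursor, plus the new cursor."""
    start = bisect_right(index_entries, cursor) if cursor is not None else 0
    page = index_entries[start:start + page_size]
    # Keep the old cursor when nothing new was found.
    new_cursor = page[-1] if page else cursor
    return page, new_cursor

entries = sorted([(100, "1"), (100, "2"), (105, "3"), (110, "4")])

page1, cur = read_after(entries, None)  # first sync starts from the beginning
page2, cur = read_after(entries, cur)   # next sync resumes after the cursor
page3, cur = read_after(entries, cur)   # nothing new, cursor is unchanged
```

Including ref after ts gives each entry a unique sort position, so two documents that share a ts value are neither skipped nor read twice when a page boundary falls between them.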

Create a role that can read the collection, the index, and the metadata of all indexes. It needs access to index metadata in order to validate the index settings. You can create the role with this query:

CreateRole({
  name: "airbyte-readonly",
  privileges: [
    {
      resource: Collections(),
      actions: { read: true }
    },
    {
      resource: Indexes(),
      actions: { read: true }
    },
    {
      resource: Collection("COLLECTION_NAME"),
      actions: { read: true }
    },
    {
      resource: Index("INDEX_NAME"),
      actions: { read: true }
    }
  ],
})

Replace COLLECTION_NAME with the name of the collection configured for this connector. Replace INDEX_NAME with the name that you configured for the Incremental Sync Index.

If you'd like to sync multiple collections, add an entry for every collection and index you'd like to sync. For example, to sync users and products with Incremental Sync, run the following query:

CreateRole({
  name: "airbyte-readonly",
  privileges: [
    {
      resource: Collections(),
      actions: { read: true }
    },
    {
      resource: Indexes(),
      actions: { read: true }
    },
    {
      resource: Collection("users"),
      actions: { read: true }
    },
    {
      resource: Index("users-ts"),
      actions: { read: true }
    },
    {
      resource: Collection("products"),
      actions: { read: true }
    },
    {
      resource: Index("products-ts"),
      actions: { read: true }
    }
  ],
})
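When many collections are involved, the role query can be generated instead of written by hand. The sketch below is one possible way to do that in Python. The helper name build_role_query is hypothetical and not part of the connector. It only produces query text to paste into the fauna shell or the Dashboard.

```python
import json

def build_role_query(role_name, collections):
    """Render a CreateRole query.

    `collections` maps a collection name to its index name,
    or to None for a collection that only needs full sync.
    """
    def privilege(resource):
        return (
            f"    {{\n      resource: {resource},\n"
            f"      actions: {{ read: true }}\n    }}"
        )

    # Every role needs read access to collection and index metadata.
    resources = ["Collections()", "Indexes()"]
    for collection, index in collections.items():
        resources.append(f"Collection({json.dumps(collection)})")
        if index is not None:
            resources.append(f"Index({json.dumps(index)})")

    body = ",\n".join(privilege(r) for r in resources)
    return (
        f"CreateRole({{\n  name: {json.dumps(role_name)},\n"
        f"  privileges: [\n{body}\n  ],\n}})"
    )

query = build_role_query(
    "airbyte-readonly",
    {"users": "users-ts", "products": "products-ts"},
)
print(query)
```

Passing None as the index name for a collection omits the Index entry, which matches the role needed for a full sync.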

Create a key with that role. You can create a key using this query:

CreateKey({
  name: "airbyte-readonly",
  role: Role("airbyte-readonly"),
})

Copy the secret output by the CreateKey command and enter it as the "Fauna Secret" on the left. Important: The secret is displayed only once. If you lose it, you must create a new key.

Export formats

This section captures export formats for all special case data stored in Fauna. This list is exhaustive.

Note that the ref column in the exported database contains only the document ID from each document's reference (or "ref"). Since only one collection is involved in each connector configuration, it is inferred that the document ID refers to a document within the synced collection.

Fauna Type	Format	Note
Document Ref	`{ "id": "id", "collection": "collection-name", "type": "document" }`
Other Ref	`{ "id": "id", "type": "ref-type" }`	This includes all other refs, listed below.
Byte Array	base64 URL-safe encoded string
Timestamp	date-time (an ISO-format timestamp)
Query, SetRef	a string containing the wire protocol of this value	The wire protocol is not documented.
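To make the formats in the table concrete, here is a small Python sketch that produces values of the same shape using only the standard library. The function names are hypothetical and this is not the connector's code. In particular, whether the connector keeps the base64 padding character is an assumption.

```python
import base64
from datetime import datetime, timezone

def export_bytes(value: bytes) -> str:
    # URL-safe base64: "-" and "_" replace "+" and "/".
    # Assumption: "=" padding is kept.
    return base64.urlsafe_b64encode(value).decode("ascii")

def export_timestamp(value: datetime) -> str:
    # ISO-format timestamp, as in the "Timestamp" row.
    return value.isoformat()

def export_document_ref(doc_id: str, collection_name: str) -> dict:
    # Shape taken from the "Document Ref" row of the table above.
    return {"id": doc_id, "collection": collection_name, "type": "document"}

encoded = export_bytes(b"\xfb\xff")
stamp = export_timestamp(datetime(2022, 8, 3, 12, 0, tzinfo=timezone.utc))
ref = export_document_ref("101", "users")
```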

Ref types

Every ref is serialized as a JSON object with 2 or 3 fields, as listed above. The type field must be one of these strings:

Reference Type	`type` string
Document	`"document"`
Collection	`"collection"`
Database	`"database"`
Index	`"index"`
Function	`"function"`
Role	`"role"`
AccessProvider	`"access_provider"`
Key	`"key"`
Token	`"token"`
Credential	`"credential"`

For all other refs (for example, if you stored the result of Collections()), the type must be "unknown". There is a difference between a ref to a specific collection (retrieved with Collection("name")) and the ref to the set of all collections (retrieved with Collections()). This is why the type is "unknown" for Collections(), but not for Collection("name").
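The mapping above, including the "unknown" fallback, can be sketched as a lookup table. This Python example is illustrative only: the name export_other_ref and the ids passed to it are hypothetical.

```python
# Reference type -> `type` string, copied from the table above.
REF_TYPE_STRINGS = {
    "Document": "document",
    "Collection": "collection",
    "Database": "database",
    "Index": "index",
    "Function": "function",
    "Role": "role",
    "AccessProvider": "access_provider",
    "Key": "key",
    "Token": "token",
    "Credential": "credential",
}

def export_other_ref(ref_id: str, reference_type: str) -> dict:
    # Anything not listed in the table falls back to "unknown".
    type_string = REF_TYPE_STRINGS.get(reference_type, "unknown")
    return {"id": ref_id, "type": type_string}

specific = export_other_ref("users", "Collection")
# Illustrative id for a stored Collections() value.
all_collections = export_other_ref("collections", "Collections")
```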

To select the document ID from a ref, add "id" to the "Path" of the additional column. For example, if "Path" is ["data", "parent"], change the "Path" to ["data", "parent", "id"].

To select the collection name, add "collection", "id" to the "Path" of the additional column. For example, if "Path" is ["data", "parent"], change the "Path" to ["data", "parent", "collection", "id"]. Internally, the FQL Select is used.
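The effect of extending a "Path" can be illustrated with a small Python function that walks nested data one key at a time. This is only an approximation of what FQL Select does. The nested dictionary is a hypothetical stand-in for a Fauna document, and select_path is not a connector or driver function.

```python
def select_path(value, path, default=None):
    """Follow `path` through nested dicts and lists, or return `default`."""
    current = value
    for key in path:
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif isinstance(current, list) and isinstance(key, int) and 0 <= key < len(current):
            current = current[key]
        else:
            return default
    return current

# Hypothetical document whose data.parent field holds a document ref.
doc = {
    "data": {
        "parent": {"id": "101", "collection": {"id": "users"}},
    },
}

parent_id = select_path(doc, ["data", "parent", "id"])
parent_collection = select_path(doc, ["data", "parent", "collection", "id"])
missing = select_path(doc, ["data", "owner", "id"])
```

Here parent_id holds the document ID and parent_collection holds the collection name, matching the two "Path" variants described above. A path that does not exist yields the default value.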

Changelog

Version	Date	Pull Request	Subject
0.1.0	2022-08-03	15274	Add Fauna Source
