1
0
mirror of synced 2025-12-30 03:02:21 -05:00
Files
airbyte/docs/integrations/sources/sftp-bulk.md
Henri Blancke c469ea8c4f 🎉 New source: SFTP Bulk [python cdk] (#17691)
* [INIT] setup source-ftp

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [ADD] add logic

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [ADD] unit, integration, acceptance tests

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [UPD] update docs

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [UPD] clean up crew

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [UPD] update host to localhost

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [UPD] rename from ftp to sftp bulk

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [INIT] setup source-ftp

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [ADD] add logic

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [ADD] unit, integration, acceptance tests

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [UPD] update docs

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [UPD] clean up crew

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [UPD] update host to localhost

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [UPD] rename from ftp to sftp bulk

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* update branch

* [FIX] integration tests

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [FIX] acceptance test fixture

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* [UPD] change ftp port

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* add uuid to image build

* add source def

* [FIX] generate ssh keys

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>

* auto-bump connector version

Signed-off-by: Henri Blancke <blanckehenri@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-10-27 23:58:30 -03:00

3.7 KiB

SFTP Bulk

This page contains the setup guide and reference information for the FTP source connector.

This connector allows you to:

  • Fetch files from an FTP server matching a folder path and define an optional file pattern to bulk ingest files into a single stream
  • Incrementally load files into your destination from an FTP server based on when files were last added or modified
  • Optionally load only the latest file matching a folder path and optional pattern and overwrite the data in your destination (helpful when a snapshot file gets added on a regular basis containing the latest data)

Prerequisites

  • The Server with FTP connection type support
  • The Server host
  • The Server port
  • Username-Password/Public Key Access Rights

Setup guide

Step 1: Set up SFTP

  1. Use your username/password credential to connect the server.
  2. Alternatively generate Public Key Access

The following simple steps are required to set up public key authentication:

Key pair is created (typically by the user). This is typically done with ssh-keygen. Private key stays with the user (and only there), while the public key is sent to the server. Typically with the ssh-copy-id utility. Server stores the public key (and "marks" it as authorized). Server will now allow access to anyone who can prove they have the corresponding private key.

Step 2: Set up the SFTP connector in Airbyte

  1. In the left navigation bar, click Sources. In the top-right corner, click +new source.
  2. On the Set up the source page, enter the name for the FTP connector and select SFTP Bulk from the Source type dropdown.
  3. Enter your User Name, Host Address, Port
  4. Enter authentication details for the FTP server (Password and/or Private Key)
  5. Choose a File type
  6. Enter Folder Path (Optional) to specify server folder for sync
  7. Enter File Pattern (Optional). e.g. log-([0-9]{4})([0-9]{2})([0-9]{2}). Write your own regex
  8. Check Most recent file (Optional) if you only want to sync the most recent file matching a folder path and optional file pattern
  9. Provide a Start Date for incremental syncs to only sync files modified/added after this date
  10. Click on Check Connection to finish configuring the FTP source.

Supported sync modes

The FTP source connector supports the following sync modes:

Feature Support Notes
Full Refresh - Overwrite
Full Refresh - Append Sync
Incremental - Append
Incremental - Deduped History
Namespaces

Supported Streams

This source provides a single stream per file with a dynamic schema. The current supported type file: .csv and .json More formats (e.g. Apache Avro) will be supported in the future.

Changelog

Version Date Pull Request Subject
0.1.0 2021-24-05 Initial version