* [INIT] setup source-ftp Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [ADD] add logic Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [ADD] unit, integration, acceptance tests Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] update docs Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] clean up crew Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] update host to localhost Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] rename from ftp to sftp bulk Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [INIT] setup source-ftp Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [ADD] add logic Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [ADD] unit, integration, acceptance tests Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] update docs Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] clean up crew Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] update host to localhost Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] rename from ftp to sftp bulk Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * update branch * [FIX] integration tests Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [FIX] acceptance test fixture Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] change ftp port Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * add uuid to image build * add source def * [FIX] generate ssh keys Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * auto-bump connector version Signed-off-by: Henri Blancke <blanckehenri@gmail.com> Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
3.7 KiB
SFTP Bulk
This page contains the setup guide and reference information for the FTP source connector.
This connector allows you to:
- Fetch files from an FTP server matching a folder path and define an optional file pattern to bulk ingest files into a single stream
- Incrementally load files into your destination from an FTP server based on when files were last added or modified
- Optionally load only the latest file matching a folder path and optional pattern and overwrite the data in your destination (helpful when a snapshot file gets added on a regular basis containing the latest data)
Prerequisites
- The Server with FTP connection type support
- The Server host
- The Server port
- Username-Password/Public Key Access Rights
Setup guide
Step 1: Set up SFTP
- Use your username/password credential to connect the server.
- Alternatively generate Public Key Access
The following simple steps are required to set up public key authentication:
Key pair is created (typically by the user). This is typically done with ssh-keygen. Private key stays with the user (and only there), while the public key is sent to the server. Typically with the ssh-copy-id utility. Server stores the public key (and "marks" it as authorized). Server will now allow access to anyone who can prove they have the corresponding private key.
Step 2: Set up the SFTP connector in Airbyte
- In the left navigation bar, click
Sources. In the top-right corner, click +new source. - On the Set up the source page, enter the name for the FTP connector and select SFTP Bulk from the Source type dropdown.
- Enter your
User Name,Host Address,Port - Enter authentication details for the FTP server (
Passwordand/orPrivate Key) - Choose a
File type - Enter
Folder Path(Optional) to specify server folder for sync - Enter
File Pattern(Optional). e.g.log-([0-9]{4})([0-9]{2})([0-9]{2}). Write your own regex - Check
Most recent file(Optional) if you only want to sync the most recent file matching a folder path and optional file pattern - Provide a
Start Datefor incremental syncs to only sync files modified/added after this date - Click on
Check Connectionto finish configuring the FTP source.
Supported sync modes
The FTP source connector supports the following sync modes:
| Feature | Support | Notes |
|---|---|---|
| Full Refresh - Overwrite | ✅ | |
| Full Refresh - Append Sync | ✅ | |
| Incremental - Append | ✅ | |
| Incremental - Deduped History | ❌ | |
| Namespaces | ❌ |
Supported Streams
This source provides a single stream per file with a dynamic schema. The current supported type file: .csv and .json
More formats (e.g. Apache Avro) will be supported in the future.
Changelog
| Version | Date | Pull Request | Subject |
|---|---|---|---|
| 0.1.0 | 2021-24-05 | Initial version |