* [INIT] setup source-ftp Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [ADD] add logic Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [ADD] unit, integration, acceptance tests Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] update docs Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] clean up crew Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] update host to localhost Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] rename from ftp to sftp bulk Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [INIT] setup source-ftp Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [ADD] add logic Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [ADD] unit, integration, acceptance tests Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] update docs Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] clean up crew Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] update host to localhost Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] rename from ftp to sftp bulk Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * update branch * [FIX] integration tests Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [FIX] acceptance test fixture Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * [UPD] change ftp port Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * add uuid to image build * add source def * [FIX] generate ssh keys Signed-off-by: Henri Blancke <blanckehenri@gmail.com> * auto-bump connector version Signed-off-by: Henri Blancke <blanckehenri@gmail.com> Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
64 lines
3.7 KiB
Markdown
64 lines
3.7 KiB
Markdown
# SFTP Bulk
|
|
This page contains the setup guide and reference information for the FTP source connector.
|
|
|
|
This connector allows you to:
|
|
- Fetch files from an FTP server matching a folder path and define an optional file pattern to bulk ingest files into a single stream
|
|
- Incrementally load files into your destination from an FTP server based on when files were last added or modified
|
|
- Optionally load only the latest file matching a folder path and optional pattern and overwrite the data in your destination (helpful when a snapshot file gets added on a regular basis containing the latest data)
|
|
|
|
## Prerequisites
|
|
|
|
* The Server with FTP connection type support
|
|
* The Server host
|
|
* The Server port
|
|
* Username-Password/Public Key Access Rights
|
|
|
|
## Setup guide
|
|
### Step 1: Set up SFTP
|
|
1. Use your username/password credential to connect the server.
|
|
2. Alternatively generate Public Key Access
|
|
|
|
The following simple steps are required to set up public key authentication:
|
|
|
|
Key pair is created (typically by the user). This is typically done with ssh-keygen.
|
|
Private key stays with the user (and only there), while the public key is sent to the server. Typically with the ssh-copy-id utility.
|
|
Server stores the public key (and "marks" it as authorized).
|
|
Server will now allow access to anyone who can prove they have the corresponding private key.
|
|
|
|
### Step 2: Set up the SFTP connector in Airbyte
|
|
|
|
1. In the left navigation bar, click **`Sources`**. In the top-right corner, click **+new source**.
|
|
2. On the Set up the source page, enter the name for the FTP connector and select **SFTP Bulk** from the Source type dropdown.
|
|
3. Enter your `User Name`, `Host Address`, `Port`
|
|
4. Enter authentication details for the FTP server (`Password` and/or `Private Key`)
|
|
5. Choose a `File type`
|
|
6. Enter `Folder Path` (Optional) to specify server folder for sync
|
|
7. Enter `File Pattern` (Optional). e.g. ` log-([0-9]{4})([0-9]{2})([0-9]{2})`. Write your own [regex](https://docs.python.org/3/howto/regex.html)
|
|
8. Check `Most recent file` (Optional) if you only want to sync the most recent file matching a folder path and optional file pattern
|
|
9. Provide a `Start Date` for incremental syncs to only sync files modified/added after this date
|
|
10. Click on `Check Connection` to finish configuring the FTP source.
|
|
|
|
## Supported sync modes
|
|
|
|
The FTP source connector supports the following[ sync modes](https://docs.airbyte.com/cloud/core-concepts#connection-sync-modes):
|
|
|
|
| Feature | Support | Notes |
|
|
|:------------------------------|:--------:|:--------------------------------------------------------------------------------------|
|
|
| Full Refresh - Overwrite | ✅ | |
|
|
| Full Refresh - Append Sync | ✅ | |
|
|
| Incremental - Append | ✅ | |
|
|
| Incremental - Deduped History | ❌ | |
|
|
| Namespaces | ❌ | |
|
|
|
|
|
|
## Supported Streams
|
|
|
|
This source provides a single stream per file with a dynamic schema. The current supported type file: `.csv` and `.json`
|
|
More formats \(e.g. Apache Avro\) will be supported in the future.
|
|
|
|
## Changelog
|
|
|
|
| Version | Date | Pull Request | Subject |
|
|
|:--------|:-----------|:-------------|:----------------|
|
|
| 0.1.0 | 2021-24-05 | | Initial version |
|