mirror of synced 2025-12-25 02:09:19 -05:00

Source Microsoft SharePoint: Add parsing shared files (#35830)

Anatolii Yatsuk
2024-03-18 18:25:16 +02:00
committed by GitHub
parent f5c9a87e0f
commit 80332d8730
16 changed files with 3271 additions and 702 deletions


@@ -25,16 +25,17 @@ This page contains the setup guide and reference information for the Microsoft S
3. On the Set up the source page, select **Microsoft SharePoint** from the Source type dropdown.
4. Enter the name for the Microsoft SharePoint connector.
5. Enter **Drive Name**. To find your drive name go to settings and at the top of setting menu you can find the name of your drive.
-6. Enter **Folder Path**.
-7. The **OAuth2.0** authorization method is selected by default. Click **Authenticate your Microsoft SharePoint account**. Log in and authorize your Microsoft account.
-8. For **Start Date**, enter the date in YYYY-MM-DD format. The data added on and after this date will be replicated.
-9. Add a stream:
+6. Select **Search Scope**. Specifies the location(s) to search for files. Valid options are 'ACCESSIBLE_DRIVES' for all SharePoint drives the user can access, 'SHARED_ITEMS' for shared items the user has access to, and 'ALL' to search both. Default value is 'ALL'.
+7. Enter **Folder Path**. Leave empty to search all folders of the drives. This does not apply to shared items.
+8. The **OAuth2.0** authorization method is selected by default. Click **Authenticate your Microsoft SharePoint account**. Log in and authorize your Microsoft account.
+9. For **Start Date**, enter the date in YYYY-MM-DD format. The data added on and after this date will be replicated.
+10. Add a stream:
1. Write the **File Type**
2. In the **Format** box, use the dropdown menu to select the format of the files you'd like to replicate. The supported formats are **CSV**, **Parquet**, **Avro** and **JSONL**. Toggling the **Optional fields** button within the **Format** box will allow you to enter additional configurations based on the selected format. For a detailed breakdown of these settings, refer to the [File Format section](#file-format-settings) below.
3. Give a **Name** to the stream
4. (Optional) - If you want to enforce a specific schema, you can enter a **Input schema**. By default, this value is set to `{}` and will automatically infer the schema from the file\(s\) you are replicating. For details on providing a custom schema, refer to the [User Schema section](#user-schema).
5. Optionally, enter the **Globs** which dictates which files to be synced. This is a regular expression that allows Airbyte to pattern match the specific files to replicate. If you are replicating all the files within your bucket, use `**` as the pattern. For more precise pattern matching options, refer to the [Path Patterns section](#path-patterns) below.
-10. Click **Set up source**
+11. Click **Set up source**
<!-- /env:cloud -->
<!-- env:oss -->
@@ -86,18 +87,19 @@ This source requires **Application permissions**. Follow these [instructions](ht
3. On the **Set up** the source page, select **Microsoft SharePoint** from the Source type dropdown.
4. Enter the name for the Microsoft SharePoint connector.
5. Enter **Drive Name**. To find your drive name go to settings and at the top of setting menu you can find the name of your drive.
-6. Enter **Folder Path**.
-7. Switch to **Service Key Authentication**
-8. For **User Practical Name**, enter the [UPN](https://learn.microsoft.com/en-us/sharepoint/list-onedrive-urls) for your user.
-9. Enter **Tenant ID**, **Client ID** and **Client secret**.
-10. For **Start Date**, enter the date in YYYY-MM-DD format. The data added on and after this date will be replicated.
-11. Add a stream:
+6. Select **Search Scope**. Specifies the location(s) to search for files. Valid options are 'ACCESSIBLE_DRIVES' for all SharePoint drives the user can access, 'SHARED_ITEMS' for shared items the user has access to, and 'ALL' to search both. Default value is 'ALL'.
+7. Enter **Folder Path**. Leave empty to search all folders of the drives. This does not apply to shared items.
+8. Switch to **Service Key Authentication**
+9. For **User Practical Name**, enter the [UPN](https://learn.microsoft.com/en-us/sharepoint/list-onedrive-urls) for your user.
+10. Enter **Tenant ID**, **Client ID** and **Client secret**.
+11. For **Start Date**, enter the date in YYYY-MM-DD format. The data added on and after this date will be replicated.
+12. Add a stream:
1. Write the **File Type**
2. In the **Format** box, use the dropdown menu to select the format of the files you'd like to replicate. The supported formats are **CSV**, **Parquet**, **Avro** and **JSONL**. Toggling the **Optional fields** button within the **Format** box will allow you to enter additional configurations based on the selected format. For a detailed breakdown of these settings, refer to the [File Format section](#file-format-settings) below.
3. Give a **Name** to the stream
4. (Optional) - If you want to enforce a specific schema, you can enter a **Input schema**. By default, this value is set to `{}` and will automatically infer the schema from the file\(s\) you are replicating. For details on providing a custom schema, refer to the [User Schema section](#user-schema).
5. Optionally, enter the **Globs** which dictates which files to be synced. This is a regular expression that allows Airbyte to pattern match the specific files to replicate. If you are replicating all the files within your bucket, use `**` as the pattern. For more precise pattern matching options, refer to the [Path Patterns section](#path-patterns) below.
-12. Click **Set up source**
+13. Click **Set up source**
<!-- /env:oss -->
@@ -127,8 +129,9 @@ The connector is restricted by normal Microsoft Graph [requests limitation](http
## Changelog
-| Version | Date | Pull Request | Subject |
-|:--------|:-----------|:---------------------------------------------------------|:-----------|
-| 0.1.0 | 2024-01-25 | [33537](https://github.com/airbytehq/airbyte/pull/33537) | New source |
+| Version | Date | Pull Request | Subject |
+|:--------|:-----------|:---------------------------------------------------------|:--------------------------|
+| 0.2.0 | 2024-03-06 | [35830](https://github.com/airbytehq/airbyte/pull/35830) | Add fetching shared items |
+| 0.1.0 | 2024-01-25 | [33537](https://github.com/airbytehq/airbyte/pull/33537) | New source |
</HideInUI>
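
Reviewer note: the interaction between **Search Scope** and **Folder Path** described in the new setup steps can be sketched as below. This is a minimal illustration, not the connector's code: the three option values and the `ALL` default come from the documentation text in this diff, while `SearchScope`, `resolve_search`, and the returned dictionary keys are hypothetical names introduced only for this example.

```python
from enum import Enum


class SearchScope(str, Enum):
    """Option values as listed in the setup guide (class name is hypothetical)."""

    ACCESSIBLE_DRIVES = "ACCESSIBLE_DRIVES"
    SHARED_ITEMS = "SHARED_ITEMS"
    ALL = "ALL"


def resolve_search(scope: str = "ALL", folder_path: str = "") -> dict:
    """Describe which locations a sync would search (illustrative helper)."""
    parsed = SearchScope(scope)  # raises ValueError for an unknown option
    search_drives = parsed in (SearchScope.ACCESSIBLE_DRIVES, SearchScope.ALL)
    search_shared = parsed in (SearchScope.SHARED_ITEMS, SearchScope.ALL)
    return {
        "search_drives": search_drives,
        "search_shared_items": search_shared,
        # Folder Path only narrows drive searches; an empty value means
        # all folders, and it is ignored entirely for shared items.
        "drive_folder": (folder_path or None) if search_drives else None,
    }


if __name__ == "__main__":
    print(resolve_search())
    print(resolve_search("SHARED_ITEMS", "reports/2024"))
```

With `SHARED_ITEMS`, the folder path is dropped because the guide states it does not apply to shared items; with the default `ALL`, both locations are searched and an empty folder path means every folder on the drives.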