1
0
mirror of synced 2025-12-20 18:39:31 -05:00
Files
airbyte/docs/integrations/destinations/mssql.md
octavia-bot[bot] 12ac9737a0 chore: upgrade destination-mssql to bulk CDK 0.1.61 (#69130)
Co-authored-by: edgao <5741425+edgao@users.noreply.github.com>
Co-authored-by: octavia-bot[bot] <108746235+octavia-bot[bot]@users.noreply.github.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2025-11-05 12:27:00 +05:30

21 KiB
Raw Permalink Blame History

MSSQL

Features

Feature Supported?(Yes/No) Notes
Full Refresh Sync Yes
Incremental - Append Sync Yes
Incremental - Append + Deduped Yes
Namespaces Yes

Output Schema

Each stream will be output into its own table in SQL Server. Each table will contain the following metadata columns:

  • _airbyte_raw_id: A random UUID assigned to each incoming record. The column type in SQL Server is VARCHAR(MAX).
  • _airbyte_extracted_at: A timestamp for when the event was pulled from the data source. The column type in SQL Server is BIGINT.
  • _airbyte_meta: Additional information about the record. The column type in SQL Server is TEXT.
  • _airbyte_generation_id: Incremented each time a refresh is executed. The column type in SQL Server is TEXT.

See here for more information about these fields.

Getting Started

Setup guide

  • MS SQL Server: Azure SQL Database, SQL Server 2016 or greater

Network Access

Make sure your SQL Server database can be accessed by Airbyte. If your database is within a VPC, you may need to allow access from the IP you're using to expose Airbyte.

Permissions

You need a user configured in SQL Server that can create tables and write rows. We highly recommend creating an Airbyte-specific user for this purpose. In order to allow for normalization, please grant ALTER permissions for the user configured.

Target Database

You will need to choose an existing database or create a new database that will be used to store synced data from Airbyte.

Configuration

You'll need the following information to configure the MSSQL destination:

  • Host
    • The host name of the MSSQL database.
  • Port
    • The port of the MSSQL database.
  • Database Name
    • The name of the MSSQL database.
  • Default Schema
    • The default schema tables are written to if the source does not specify a namespace. The usual value for this field is "public".
  • Username
    • The username which is used to access the database.
  • Password
    • The password associated with this username.
  • JDBC URL Parameters
    • Additional properties to pass to the JDBC URL string when connecting to the database formatted as 'key=value' pairs separated by the symbol '&'. (example: key1=value1&key2=value2&key3=value3).
  • SSL Method
    • The SSL configuration supports three modes: Unencrypted, Encrypted (trust server certificate), and Encrypted (verify certificate).
      • Unencrypted: Do not use SSL encryption on the database connection
      • Encrypted (trust server certificate): Use SSL encryption without verifying the server's certificate. This is useful for self-signed certificates in testing scenarios, but should not be used in production.
      • Encrypted (verify certificate): Use the server's SSL certificate, after standard certificate verification.
        • Host Name In Certificate (optional): When using certificate verification, this property can be set to specify an expected name for added security. If this value is present, and the server's certificate's host name does not match it, certificate verification will fail.
  • Load Type
    • The data load type supports two modes: Insert or Bulk
      • Insert: Utilizes SQL INSERT statements to load data to the destination table.
      • Bulk: Utilizes Azure Blob Storage and the BULK INSERT command to load data to the destination table. If selected, additional configuration is required:

MSSQL with Azure Blob Storage (Bulk Upload) Setup Guide

This section describes how to set up and use the Microsoft SQL Server (MSSQL) connector with the Azure Blob Storage Bulk Upload feature. By staging data in an Azure Blob Storage container and using BULK INSERT, you can significantly improve ingestion speed and reduce network overhead for large or frequent data loads.

Why Use Azure Blob Storage Bulk Upload?

When handling high data volumes or frequent syncs, row-by-row inserts into MSSQL can become slow and resource-intensive. By staging files in Azure Blob Storage first, you can:

  1. Aggregate Data into Bulk Files: Data is written to Blob Storage in batches, reducing overhead.
  2. Perform Bulk Ingestion: MSSQL uses BULK INSERT to directly load these files, typically resulting in faster performance compared to conventional row-by-row inserts.
Prerequisites
  1. A Microsoft SQL Server Instance
    • Compatible with on-premises SQL Server or Azure SQL Database.
  2. Azure Blob Storage Account
    • A storage account and container (e.g., bulk-staging) where data files will be placed.
  3. Permissions
    • Blob Storage: ability to create, read, and delete objects within the designated container.
    • MSSQL: ability to create or modify tables, and permission to execute BULK INSERT.
Setup Guide

Follow these steps to configure MSSQL with Azure Blob Storage for bulk uploads.

1. Set Up Azure Blob Storage
  1. Create a Storage Account & Container
    • In the Azure Portal, create (or reuse) a Storage Account.
    • Within that account, create a container (e.g., bulk-staging) for staging your data files.
  2. Establish Access Credentials
    • Use a Shared Access Signature (SAS) scoped to your container.
    • Ensure the SAS token or role assignments include permissions such as Read, Write, Delete, and List.
2. Configure MSSQL

See the official Microsoft documentation for more details. Below is a simplified overview:

  1. (Optional) Create a Master Encryption Key If your environment requires a master key to store credentials securely, create one:

    CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<your_password>';
    
  2. Create a Database Scoped Credential Configure a credential that grants MSSQL access to your Blob Storage using the SAS token:

    CREATE DATABASE SCOPED CREDENTIAL <credential_name>
    WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
         SECRET = '<your_sas_token>';
    
  3. Create an External Data Source Point MSSQL to your Blob container using the credential:

    CREATE EXTERNAL DATA SOURCE <data_source_name>
    WITH (
        TYPE = BLOB_STORAGE,
        LOCATION = 'https://<storage_account>.blob.core.windows.net/<container_name>',
        CREDENTIAL = <credential_name>
    );
    

    Youll reference <data_source_name> when configuring the connector.

3. Connector Configuration

Youll need to supply:

  1. MSSQL Connection Details
    • The server hostname/IP, port, database name, and authentication (username/password).
  2. Bulk Load Data Source
    • The name of the external data source you created (e.g., <data_source_name>).
  3. Azure Storage Account & Container
    • The name of the storage account and container.
  4. SAS Token
    • The token that grants Blob Storage access.

See the Getting Started: Configuration section of this guide for more details on BULK INSERT connector configuration.

Changelog

Expand to review
Version Date Pull Request Subject
2.2.14 2025-11-01 69130 Upgrade to Bulk CDK 0.1.61.
2.2.13 2025-09-24 66684 Pin to CDK artifact
2.2.12 2025-06-26 62078 Add SSH tunnel support
2.2.11 2025-05-30 61017 Integration test fixes
2.2.10 2025-05-29 60897 Internal fixes
2.2.9 2025-05-19 60791 Fix bug in detecting schema change when stream has no columns
2.2.8 2025-05-08 59735 Cleanup: Remove unused code
2.2.7 2025-05-07 56444 CDK: Internal refactor; perf improvements
2.2.6 2025-04-21 58146 Fix numeric bounds-handling
2.2.5 2025-04-18 58140 Upgrade to latest CDK
2.2.4 2025-04-11 57563 Improve BULK INSERT documentation.
2.2.3 2025-04-16 58085 Internal refactoring
2.2.2 2025-04-07 56391 Add support for Azure blob storage auth via storage account key.
2.2.1 2025-03-27 56402 Improve Azure blob storage load logic.
2.2.0 2025-03-23 56353 Bulk Load performance improvements
2.1.2 2025-03-25 56346 Internal refactor
2.1.1 2025-03-24 56355 Upgrade to airbyte/java-connector-base:2.0.1 to be M4 compatible.
2.1.0 2025-03-24 55849 Misc. bugfixes in type-handling (esp. in complex types)
2.0.5 2025-03-24 55904 Fix handling of invalid schemas (correctly JSON-serialize values)
2.0.4 2025-03-20 55886 Internal refactor
2.0.3 2025-03-18 55811 CDK: Pass DestinationStream around vs Descriptor
2.0.2 2025-03-12 55720 Restore definition ID
2.0.1 2025-03-12 55718 Fix breaking change information in metadata.yaml
2.0.0 2025-03-11 55684 Release 2.0.0
2.0.0.rc13 2025-03-07 55252 RC13: Bugfix for OOM on Bulk Load
2.0.0.rc12 2025-03-05 54159 RC12: Support For Bulk Insert Using Azure Blob Storage
2.0.0.rc11 2025-03-04 55193 RC11: Increase decimal precision
2.0.0.rc10 2025-02-24 54648 RC10: Fix index column names with hyphens
2.0.0.rc9 2025-02-21 54197 RC9: Fix index column names with invalid characters
2.0.0.rc8 2025-02-20 54186 RC8: Fix String support
2.0.0.rc7 2025-02-11 53364 RC7: Revert deletion change
2.0.0.rc6 2025-02-11 53364 RC6: Break up deletes into loop to reduce locking
2.0.0.rc5 2025-02-07 53236 RC5: Use rowlock hint
2.0.0.rc4 2025-02-06 53192 RC4: Fix config, timehandling, performance tweak
2.0.0.rc3 2025-02-04 53174 RC3: Fix metadata.yaml for publish
2.0.0.rc2 2025-02-04 52704 RC2: Performance improvement
2.0.0.rc1 2025-01-24 52096 Release candidate
1.0.3 2025-01-10 51497 Use a non root base image
1.0.2 2024-12-18 49891 Use a base image: airbyte/java-connector-base:1.0.0
1.0.1 2024-11-04 #48134 Fix supported sync modes (destination-mssql 1.x.y does not support dedup)
1.0.0 2024-04-11 #36050 Update to Dv2 Table Format and Remove normalization
0.2.0 2023-06-27 #27781 License Update: Elv2
0.1.25 2023-06-21 #27555 Reduce image size
0.1.24 2023-06-05 #27034 Internal code change for future development (install normalization packages inside connector)
0.1.23 2023-04-04 #24604 Support for destination checkpointing
0.1.22 2022-10-21 #18275 Upgrade commons-text for CVE 2022-42889
0.1.20 2022-07-14 #14618 Removed additionalProperties: false from JDBC destination connectors
0.1.19 2022-05-25 #13054 Destination MSSQL: added custom JDBC parameters support.
0.1.18 2022-05-17 #12820 Improved 'check' operation performance
0.1.17 2022-04-05 #11729 Bump mina-sshd from 2.7.0 to 2.8.0
0.1.15 2022-02-25 #10421 Refactor JDBC parameters handling
0.1.14 2022-02-14 #10256 Add -XX:+ExitOnOutOfMemoryError JVM option
0.1.13 2021-12-28 #9158 Update connector fields title/description
0.1.12 2021-12-01 #8371 Fixed incorrect handling "\n" in ssh key
0.1.11 2021-11-08 #7719 Improve handling of wide rows by buffering records based on their byte size rather than their count
0.1.10 2021-10-11 #6877 Add normalization capability, add append+deduplication sync mode
0.1.9 2021-09-29 #5970 Add support & test cases for MSSQL Destination via SSH tunnels
0.1.8 2021-08-07 #5272 Add batch method to insert records
0.1.7 2021-07-30 #5125 Enable additionalPropertities in spec.json
0.1.6 2021-06-21 #3555 Partial Success in BufferedStreamConsumer
0.1.5 2021-07-20 #4874 declare object types correctly in spec
0.1.4 2021-06-17 #3744 Fix doc/params in specification file
0.1.3 2021-05-28 #3728 Change dockerfile entrypoint
0.1.2 2021-05-13 #3367 Fix handle symbols unicode
0.1.1 2021-05-11 #3566 MS SQL Server Destination Release!