📚 Documentation November 2023 overhaul (#32811)
Co-authored-by: Natalie Kwong <38087517+nataliekwong@users.noreply.github.com> Co-authored-by: timroes <timroes@users.noreply.github.com> Co-authored-by: nataliekwong <nataliekwong@users.noreply.github.com>
This commit is contained in:
Binary file not shown.
|
Before Width: | Height: | Size: 103 KiB |
@@ -1,645 +0,0 @@
|
||||
# Changelog
|
||||
|
||||
## 1/28/2022 Summary
|
||||
|
||||
* New Source: Chartmogul (contributyed by Titas Skrebė)
|
||||
* New Source: Hellobaton (contributed by Daniel Luftspring)
|
||||
* New Source: Flexport (contributed by Juozas)
|
||||
* New Source: PersistIq (contributed by Wadii Zaim)
|
||||
|
||||
* ✨ Postgres Source: Users can now select which schemas they wish to sync before discovery. This makes the discovery stage for large instances much more performant.
|
||||
* ✨ Shopify Source: Now verifies permissions on the token before accessing resources.
|
||||
* ✨ Snowflake Destination: Users now have access to an option to purge their staging data.
|
||||
* ✨ HubSpot Source: Added some more fields for the email_events stream.
|
||||
* ✨ Amazon Seller Partner Source: Added the GET_FLAT_FILE_ALL_ORDERS_DATA_BY_LAST_UPDATE_GENERAL report stream. (contributed by @ron-damon)
|
||||
* ✨ HubSpot Source: Added the form_submission and property_history streams.
|
||||
|
||||
* 🐛 DynamoDB Destination: The parameter dynamodb_table_name is now named dynamodb_table_name_prefix to more accurately represent it.
|
||||
* 🐛 Intercom Source: The handling of scroll param is now fixed when it is expired.
|
||||
* 🐛 S3 + GCS Destinations: Now support arrays with unknown item type.
|
||||
* 🐛 Postgres Source: Now supports handling of the Java SQL date type.
|
||||
* 🐛 Salesforce Source: No longer fails during schema generation.
|
||||
|
||||
## 1/13/2022 Summary
|
||||
|
||||
⚠️ WARNING ⚠️
|
||||
|
||||
Snowflake Source: Normalization with Snowflake now produces permanent tables. [If you want to continue creating transient tables, you will need to create a new transient database for Airbyte.]
|
||||
|
||||
* ✨ GitHub Source: PR related streams now support incremental sync.
|
||||
* ✨ HubSpot Source: We now support ListMemberships in the Contacts stream.
|
||||
* ✨ Azure Blob Storage Destination: Now has the option to add a BufferedOutputStream to improve performance and fix writing data with over 50GB in a stream. (contributed by @bmatticus)
|
||||
|
||||
* 🐛 Normalization partitioning now works as expected with FLOAT64 and BigQuery.
|
||||
* 🐛 Normalization now works properly with quoted and case sensitive columns.
|
||||
* 🐛 Source MSSQL: Added support for some missing data types.
|
||||
* 🐛 Snowflake Destination: Schema is now not created if it previously exists.
|
||||
* 🐛 Postgres Source: Now properly reads materialized views.
|
||||
* 🐛 Delighted Source: Pagination for survey_responses, bounces and unsubscribes streams now works as expected.
|
||||
* 🐛 Google Search Console Source: Incremental sync now works as expected.
|
||||
* 🐛 Recurly Source: Now does not load all accounts when importing account coupon redemptions.
|
||||
* 🐛 Salesforce Source: Now properly handles 400 when streams don't support query or queryAll.
|
||||
|
||||
## 1/6/2022 Summary
|
||||
|
||||
* New Source: 3PL Central (contributed by Juozas)
|
||||
* New Source: My Hours (contributed by Wisse Jelgersma)
|
||||
* New Source: Qualaroo (contributed by gunu)
|
||||
* New Source: SearchMetrics
|
||||
|
||||
* 💎 Salesforce Source: Now supports filtering streams at configuration, making it easier to handle large Salesforce instances.
|
||||
* 💎 Snowflake Destination: Now supports byte-buffering for staged inserts.
|
||||
* 💎 Redshift Destination: Now supports byte-buffering for staged inserts.
|
||||
* ✨ Postgres Source: Now supports all Postgres 14 types.
|
||||
* ✨ Recurly Source: Now supports incremental sync for all streams.
|
||||
* ✨ Zendesk Support Source: Added the Brands, CustomRoles, and Schedules streams.
|
||||
* ✨ Zendesk Support Source: Now uses cursor-based pagination.
|
||||
* ✨ Kustomer Source: Setup configuration is now more straightforward.
|
||||
* ✨ Hubspot Source: Now supports incremental sync on all streams where possible.
|
||||
* ✨ Facebook Marketing Source: Fixed schema for breakdowns fields.
|
||||
* ✨ Facebook Marketing Source: Added asset_feed_spec to AdCreatives stream.
|
||||
* ✨ Redshift Destination: Now has an option to toggle the deletion of staging data.
|
||||
|
||||
* 🐛 S3 Destination: Avro and Parquet formats are now processed correctly.
|
||||
* 🐛 Snowflake Destination: Fixed SQL Compliation error.
|
||||
* 🐛 Kafka Source: SASL configurations no longer throw null pointer exceptions (contributed by Nitesh Kumar)
|
||||
* 🐛 Salesforce Source: Now throws a 400 for non-queryable streams.
|
||||
* 🐛 Amazon Ads Source: Polling for report generation is now much more resilient. (contributed by Juozas)
|
||||
* 🐛 Jira Source: The filters stream now works as expected.
|
||||
* 🐛 BigQuery Destination: You can now properly configure the buffer size with the part_size config field.
|
||||
* 🐛 Snowflake Destination: You can now properly configure the buffer size with the part_size config field.
|
||||
* 🐛 CockroachDB Source: Now correctly only discovers tables the user has permission to access.
|
||||
* 🐛 Stripe Source: The date and arrival_date fields are now typed correctly.
|
||||
|
||||
## 12/16/2021 Summary
|
||||
|
||||
🎉 First off... There's a brand new CDK! Menno Hamburg contributed a .NET/C# implementation for our CDK, allowing you to write HTTP API sources and Generic Dotnet sources. Thank you so much Menno, this is huge!
|
||||
|
||||
* New Source: OpenWeather
|
||||
* New Destination: ClickHouse (contributed by @Bo)
|
||||
* New Destination: RabbitMQ (contributed by @Luis Gomez)
|
||||
* New Destination: Amazon SQS (contributed by @Alasdair Brown)
|
||||
* New Destination: Rockset (contributed by @Steve Baldwin)
|
||||
|
||||
* ✨ Facebook Marketing Source: Updated the campaign schema with more relevant fields. (contributed by @Maxime Lavoie)
|
||||
* ✨ TikTok Marketing Source: Now supports the Basic Report stream.
|
||||
* ✨ MySQL Source: Now supports all MySQL 8.0 data types.
|
||||
* ✨ Klaviyo Source: Improved performance, added incremental sync support to the Global Exclusions stream.
|
||||
* ✨ Redshift Destination: You can now specify a bucket path to stage your data in before inserting.
|
||||
* ✨ Kubernetes deployments: Sidecar memory is now 25Mi, up from 6Mi to cover all usage cases.
|
||||
* ✨ Kubernetes deployments: The Helm chart can now set up writing logs to S3 easily. (contributed by @Valentin Nourdin)
|
||||
|
||||
* 🐛 Python CDK: Now shows the stack trace of unhandled exceptions.
|
||||
* 🐛 Google Analytics Source: Fix data window input validation, fix date type conversion.
|
||||
* 🐛 Google Ads Source: Data from the end_date for syncs is now included in a sync.
|
||||
* 🐛 Marketo Source: Fixed issues around input type conversion and conformation to the schema.
|
||||
* 🐛 Mailchimp Source: Fixed schema conversion error causing sync failures.
|
||||
* 🐛 PayPal Transactions Source: Now reports full error message details on failure.
|
||||
* 🐛 Shopify Source: Normalization now works as expected.
|
||||
|
||||
## 12/9/2021 Summary
|
||||
|
||||
⚠️ WARNING ⚠️
|
||||
|
||||
v0.33.0 is a minor version with breaking changes. Take the normal precautions with upgrading safely to this version.
|
||||
v0.33.0 has a bug that affects GCS logs on Kubernetes. Upgrade straight to v0.33.2 if you are running a K8s deployment of Airbyte.
|
||||
|
||||
* New Source: Mailgun
|
||||
|
||||
🎉 Snowflake Destination: You can now stage your inserts, making them much faster.
|
||||
|
||||
* ✨ Google Ads Source: Source configuration is now more clear.
|
||||
* ✨ Google Analytics Source: Source configuration is now more clear.
|
||||
* ✨ S3 Destination: You can now write timestamps in Avro and Parquet formats.
|
||||
* ✨ BigQuery & BigQuery Denormalized Destinations: Now use byte-based buffering for batch inserts.
|
||||
* ✨ Iterable Source: Now has email validation on the list_users stream.
|
||||
|
||||
* 🐛 Incremental normalization now works properly with empty tables.
|
||||
* 🐛 LinkedIn Ads Source: 429 response is now properly handled.
|
||||
* 🐛 Intercom Source: Now handles failed pagination requests with backoffs.
|
||||
* 🐛 Intercom Source: No longer drops records from the conversation stream.
|
||||
* 🐛 Google Analytics Source: 400 errors no longer get ignored with custom reports.
|
||||
* 🐛 Marketo Source: The createdAt and updatedAt fields are now formatted correctly.
|
||||
|
||||
## 12/2/2021 Summary
|
||||
|
||||
🎃 **Hacktoberfest Submissions** 🎃
|
||||
-----------------------------------------
|
||||
* New Destination: Redis (contributed by @Ivica Taseski)
|
||||
* New Destination: MQTT (contributed by @Mario Molina)
|
||||
* New Destination: Google Firestore (contributed by @Adam Dobrawy)
|
||||
* New Destination: Kinesis (contributed by @Ivica Taseski)
|
||||
* New Source: Zenloop (contributed by @Alexander Batoulis)
|
||||
* New Source: Outreach (contributed by @Luis Gomez)
|
||||
|
||||
* ✨ Zendesk Source: The chats stream now supports incremental sync and added testing for all streams.
|
||||
* 🐛 Monday Source: Pagination now works as expected and the schema has been fixed.
|
||||
* 🐛 Postgres Source: Views are now properly listed during schema discovery.
|
||||
* 🐛 Postgres Source: Using the money type with an amount greater than 1000 works properly now.
|
||||
* 🐛 Google Search Console Search: We now set a default end_data value.
|
||||
* 🐛 Mixpanel Source: Normalization now works as expected and streams are now displayed properly in the UI.
|
||||
* 🐛 MongoDB Source: The DATE_TIME type now uses milliseconds.
|
||||
|
||||
## 11/25/2021 Summary
|
||||
Hey Airbyte Community! Let's go over all the changes from v.32.5 and prior!
|
||||
|
||||
🎃 **Hacktoberfest Submissions** 🎃
|
||||
* New Source: Airtable (contributed by Tuan Nguyen).
|
||||
* New Source: Notion (contributed by Bo Lu).
|
||||
* New Source: Pardot (contributed by Tuan Nguyen).
|
||||
|
||||
* New Source: Youtube analytics.
|
||||
|
||||
* ✨ Source Exchange Rates: add ignore_weekends option.
|
||||
* ✨ Source Facebook: add the videos stream.
|
||||
* ✨ Source Freshdesk: removed the limitation in streams pagination.
|
||||
* ✨ Source Jira: add option to render fields in HTML format.
|
||||
* ✨ Source MongoDB v2: improve read performance.
|
||||
* ✨ Source Pipedrive: specify schema for "persons" stream.
|
||||
* ✨ Source PostgreSQL: exclude tables on which user doesn't have select privileges.
|
||||
* ✨ Source SurveyMonkey: improve connection check.
|
||||
|
||||
* 🐛 Source Salesforce: improve resiliency of async bulk jobs.
|
||||
* 🐛 Source Zendesk Support: fix missing ticket_id in ticket_comments stream.
|
||||
* 🐛 Normalization: optimize incremental normalization runtime with Snowflake.
|
||||
|
||||
As usual, thank you so much to our wonderful contributors this week that have made Airbyte into what it is today: Madison Swain-Bowden, Tuan Nguyen, Bo Lu, Adam Dobrawy, Christopher Wu, Luis Gomez, Ivica Taseski, Mario Molina, Ping Yee, Koji Matsumoto, Sujit Sagar, Shadab, Juozas V.([Labanoras Tech](http://labanoras.io)) and Serhii Chvaliuk!
|
||||
|
||||
## 11/17/2021 Summary
|
||||
|
||||
Hey Airbyte Community! Let's go over all the changes from v.32.1 and prior! But first, there's an important announcement I need to make about upgrading Airbyte to v.32.1.
|
||||
|
||||
⚠️ WARNING ⚠️
|
||||
Upgrading to v.32.0 is equivalent to a major version bump. If your current version is v.32.0, you must upgrade to v.32.0 first before upgrading to any later version
|
||||
|
||||
Keep in mind that this upgrade requires your all of your connector Specs to be retrievable, or Airbyte will fail on startup. You can force delete your connector Specs by setting the `VERSION_0_32_0_FORCE_UPGRADE` environment variable to `true`. Steps to specifically check out v.32.0 and details around this breaking change can be found [here](https://docs.airbyte.com/operator-guides/upgrading-airbyte/#mandatory-intermediate-upgrade).
|
||||
|
||||
*Now back to our regularly scheduled programming.*
|
||||
|
||||
🎃 Hacktoberfest Submissions 🎃
|
||||
|
||||
* New Destination: ScyllaDB (contributed by Ivica Taseski)
|
||||
* New Source: Azure Table Storage (contributed by geekwhocodes)
|
||||
* New Source: Linnworks (contributed by Juozas V.([Labanoras Tech](http://labanoras.io)))
|
||||
|
||||
* ✨ Source MySQL: Now has basic performance tests.
|
||||
* ✨ Source Salesforce: We now automatically transform and handle incorrect data for the anyType and calculated types.
|
||||
|
||||
* 🐛 IBM Db2 Source: Now handles conversion from DECFLOAT to BigDecimal correctly.
|
||||
* 🐛 MSSQL Source: Now handles VARBINARY correctly.
|
||||
* 🐛 CockroachDB Source: Improved parsing of various data types.
|
||||
|
||||
As usual, thank you so much to our wonderful contributors this week that have made Airbyte into what it is today: Achmad Syarif Hidayatullah, Tuan Nguyen, Ivica Taseski, Hai To, Juozas, gunu, Shadab, Per-Victor Persson, and Harsha Teja Kanna!
|
||||
|
||||
## 11/11/2021 Summary
|
||||
|
||||
Time to go over changes from v.30.39! And... let's get another update on Hacktoberfest.
|
||||
|
||||
🎃 Hacktoberfest Submissions 🎃
|
||||
|
||||
* New Destination: Cassandra (contributed by Ivica Taseski)
|
||||
* New Destination: Pulsar (contributed by Mario Molina)
|
||||
* New Source: Confluence (contributed by Tuan Nguyen)
|
||||
* New Source: Monday (contributed by Tuan Nguyen)
|
||||
* New Source: Commerce Tools (contributed by James Wilson)
|
||||
* New Source: Pinterest Marketing (contributed by us!)
|
||||
|
||||
* ✨ Shopify Source: Now supports the FulfillmentOrders and Fulfillments streams.
|
||||
* ✨ Greenhouse Source: Now supports the Demographics stream.
|
||||
* ✨ Recharge Source: Broken requests should now be re-requested with improved backoff.
|
||||
* ✨ Stripe Source: Now supports the checkout_sessions, checkout_sessions_line_item, and promotion_codes streams.
|
||||
* ✨ Db2 Source: Now supports SSL.
|
||||
|
||||
* 🐛 We've made some updates to incremental normalization to fix some outstanding issues. [Details](https://github.com/airbytehq/airbyte/pull/7669)
|
||||
* 🐛 Airbyte Server no longer crashes due to too many open files.
|
||||
* 🐛 MSSQL Source: Data type conversion with smalldatetime and smallmoney works correctly now.
|
||||
* 🐛 Salesforce Source: anyType fields can now be retrieved properly with the BULK API
|
||||
* 🐛 BigQuery-Denormalized Destination: Fixed JSON parsing with $ref fields.
|
||||
|
||||
As usual, thank you to our awesome contributors that have done awesome work during the last week: Tuan Nguyen, Harsha Teja Kanna, Aaditya S, James Wilson, Vladimir Remar, Yuhui Shi, Mario Molina, Ivica Taseski, Collin Scangarella, and haoranyu!
|
||||
|
||||
## 11/03/2021 Summary
|
||||
|
||||
It's patch notes time. Let's go over the changes from 0.30.24 and before. But before we do, let's get a quick update on how Hacktober is going!
|
||||
|
||||
🎃 Hacktoberfest Submissions 🎃
|
||||
|
||||
* New Destination: Elasticsearch (contributed by Jeremy Branham)
|
||||
* New Source: Salesloft (contributed by Pras)
|
||||
* New Source: OneSignal (contributed by Bo)
|
||||
* New Source: Strava (contributed by terencecho)
|
||||
* New Source: Lemlist (contributed by Igli Koxha)
|
||||
* New Source: Amazon SQS (contributed by Alasdair Brown)
|
||||
* New Source: Freshservices (contributed by Tuan Nguyen)
|
||||
* New Source: Freshsales (contributed by Tuan Nguyen)
|
||||
* New Source: Appsflyer (contributed by Achmad Syarif Hidayatullah)
|
||||
* New Source: Paystack (contributed by Foluso Ogunlana)
|
||||
* New Source: Sentry (contributed by koji matsumoto)
|
||||
* New Source: Retently (contributed by Subhash Gopalakrishnan)
|
||||
* New Source: Delighted! (contributed by Rodrigo Parra)
|
||||
|
||||
with 18 more currently in review...
|
||||
|
||||
🎉 **Incremental Normalization is here!** 🎉
|
||||
|
||||
💎 Basic normalization no longer runs on already normalized data, making it way faster and cheaper. :gem:
|
||||
|
||||
🎉 **Airbyte Compiles on M1 Macs!**
|
||||
|
||||
Airbyte developers with M1 chips in their MacBooks can now compile the project and run the server. This is a major step towards being able to fully run Airbyte on M1. (contributed by Harsha Teja Kanna)
|
||||
|
||||
* ✨ BigQuery Destination: You can now run transformations in batches, preventing queries from hitting BigQuery limits. (contributed by Andrés Bravo)
|
||||
* ✨ S3 Source: Memory and Performance optimizations, also some fancy new PyArrow CSV configuration options.
|
||||
* ✨ Zuora Source: Now supports Unlimited as an option for the Data Query Live API.
|
||||
* ✨ Clickhouse Source: Now supports SSL and connection via SSH tunneling.
|
||||
|
||||
* 🐛 Oracle Source: Now handles the LONG RAW data type correctly.
|
||||
* 🐛 Snowflake Source: Fixed parsing of extreme values for FLOAT and NUMBER data types.
|
||||
* 🐛 Hubspot Source: No longer fails due to lengthy URI/URLs.
|
||||
* 🐛 Zendesk Source: The chats stream now pulls data past the first page.
|
||||
* 🐛 Jira Source: Normalization now works as expected.
|
||||
|
||||
As usual, thank you to our awesome contributors that have done awesome work during this productive spooky season: Tuan Nguyen, Achmad Syarif Hidayatullah, Christopher Wu, Andrés Bravo, Harsha Teja Kanna, Collin Scangarella, haoranyu, koji matsumoto, Subhash Gopalakrishnan, Jeremy Branham, Rodrigo Parra, Foluso Ogunlana, EdBizarro, Gergely Lendvai, Rodeoclash, terencecho, Igli Koxha, Alasdair Brown, bbugh, Pras, Bo, Xiangxuan Liu, Hai To, s-mawjee, Mario Molina, SamyPesse, Yuhui Shi, Maciej Nędza, Matt Hoag, and denis-sokolov!
|
||||
|
||||
## 10/20/2021 Summary
|
||||
|
||||
It's patch notes time! Let's go over changes from 0.30.16! But before we do... I want to remind everyone that Airbyte Hacktoberfest is currently taking place! For every connector that is merged into our codebase, you'll get $500, so make sure to submit before the hackathon ends on November 19th.
|
||||
|
||||
* 🎉 New Source: WooCommerce (contributed by James Wilson)
|
||||
* 🎉 K8s deployments: Worker image pull policy is now configurable (contributed by Mario Molina)
|
||||
|
||||
* ✨ MSSQL destination: Now supports basic normalization
|
||||
* 🐛 LinkedIn Ads source: Analytics streams now work as expected.
|
||||
|
||||
We've had a lot of contributors over the last few weeks, so I'd like to thank all of them for their efforts: James Wilson, Mario Molina, Maciej Nędza, Pras, Tuan Nguyen, Andrés Bravo, Christopher Wu, gunu, Harsha Teja Kanna, Jonathan Stacks, darian, Christian Gagnon, Nicolas Moreau, Matt Hoag, Achmad Syarif Hidayatullah, s-mawjee, SamyPesse, heade, zurferr, denis-solokov, and aristidednd!
|
||||
|
||||
## 09/29/2021 Summary
|
||||
|
||||
It's patch notes time, let's go over the changes from our new minor version, v0.30.0. As usual, bug fixes are in the thread.
|
||||
|
||||
* New source: LinkedIn Ads
|
||||
* New source: Kafka
|
||||
* New source: Lever Hiring
|
||||
|
||||
* 🎉 New License: Nothing changes for users of Airbyte/contributors. You just can't sell your own Airbyte Cloud!
|
||||
|
||||
* 💎 New API endpoint: You can now call connections/search in the web backend API to search sources and destinations. (contributed by Mario Molina)
|
||||
* 💎 K8s: Added support for ImagePullSecrets for connector images.
|
||||
* 💎 MSSQL, Oracle, MySQL sources & destinations: Now support connection via SSH (Bastion server)
|
||||
|
||||
* ✨ MySQL destination: Now supports connection via TLS/SSL
|
||||
* ✨ BigQuery (denormalized) destination: Supports reading BigQuery types such as date by reading the format field (contributed by Nicolas Moreau)
|
||||
* ✨ Hubspot source: Added contacts associations to the deals stream.
|
||||
* ✨ GitHub source: Now supports pulling commits from user-specified branches.
|
||||
* ✨ Google Search Console source: Now accepts admin email as input when using a service account key.
|
||||
* ✨ Greenhouse source: Now identifies API streams it has access to if permissions are limited.
|
||||
* ✨ Marketo source: Now Airbyte native.
|
||||
* ✨ S3 source: Now supports any source that conforms to the S3 protocol (Non-AWS S3).
|
||||
* ✨ Shopify source: Now reports pre_tax_price on the line_items stream if you have Shopify Plus.
|
||||
* ✨ Stripe source: Now actually uses the mandatory start_date config field for incremental syncs.
|
||||
|
||||
* 🏗 Python CDK: Now supports passing custom headers to the requests in OAuth2, enabling token refresh calls.
|
||||
* 🏗 Python CDK: Parent streams can now be configured to cache data for their child streams.
|
||||
* 🏗 Python CDK: Now has a Transformer class that can cast record fields to the data type expected by the schema.
|
||||
|
||||
* 🐛 Amplitude source: Fixed schema for date-time objects.
|
||||
* 🐛 Asana source: Schema fixed for the sections, stories, tasks, and users streams.
|
||||
* 🐛 GitHub source: Added error handling for streams not applicable to a repo. (contributed by Christopher Wu)
|
||||
* 🐛 Google Search Console source: Verifies access to sites when performing the connection check.
|
||||
* 🐛 Hubspot source: Now conforms to the V3 API, with streams such as owners reflecting the new fields.
|
||||
* 🐛 Intercom source: Fixed data type for the updated_at field. (contributed by Christian Gagnon)
|
||||
* 🐛 Iterable source: Normalization now works as expected.
|
||||
* 🐛 Pipedrive source: Schema now reflects the correct types for date/time fields.
|
||||
* 🐛 Stripe source: Incorrect timestamp formats removed for coupons and subscriptions streams.
|
||||
* 🐛 Salesforce source: You can now sync more than 10,000 records with the Bulk API.
|
||||
* 🐛 Snowflake destination: Now accepts any date-time format with normalization.
|
||||
* 🐛 Snowflake destination: Inserts are now split into batches to accommodate for large data loads.
|
||||
|
||||
Thank you to our awesome contributors. Y'all are amazing: Mario Molina, Pras, Vladimir Remar, Christopher Wu, gunu, Juliano Benvenuto Piovezan, Brian M, Justinas Lukasevicius, Jonathan Stacks, Christian Gagnon, Nicolas Moreau, aristidednd, camro, minimax75, peter-mcconnell, and sashkalife!
|
||||
|
||||
## 09/16/2021 Summary
|
||||
|
||||
Now let's get to the 0.29.19 changelog. As with last time, bug fixes are in the thread!
|
||||
|
||||
* New Destination: Databricks 🎉
|
||||
* New Source: Google Search Console
|
||||
* New Source: Close.com
|
||||
|
||||
* 🏗 Python CDK: Now supports auth workflows involving query params.
|
||||
* 🏗 Java CDK: You can now run the connector gradle build script on Macs with M1 chips! (contributed by @Harsha Teja Kanna)
|
||||
|
||||
* 💎 Google Ads source: You can now specify user-specified queries in GAQL.
|
||||
* ✨ GitHub source: All streams with a parent stream use cached parent stream data when possible.
|
||||
* ✨ Shopify source: Substantial performance improvements to the incremental sync mode.
|
||||
* ✨ Stripe source: Now supports the PaymentIntents stream.
|
||||
* ✨ Pipedrive source: Now supports the Organizations stream.
|
||||
* ✨ Sendgrid source: Now supports the SingleSendStats stream.
|
||||
* ✨ Bing Ads source: Now supports the Report stream.
|
||||
* ✨ GitHub source: Now supports the Reactions stream.
|
||||
* ✨ MongoDB source: Now Airbyte native!
|
||||
* 🐛 Facebook Marketing source: Numeric values are no longer wrapped into strings.
|
||||
* 🐛 Facebook Marketing source: Fetching conversion data now works as expected. (contributed by @Manav)
|
||||
* 🐛 Keen destination: Timestamps are now parsed correctly.
|
||||
* 🐛 S3 destination: Parquet schema parsing errors are fixed.
|
||||
* 🐛 Snowflake destination: No longer syncs unnecessary tables with S3.
|
||||
* 🐛 SurveyMonkey source: Cached responses are now decoded correctly.
|
||||
* 🐛 Okta source: Incremental sync now works as expected.
|
||||
|
||||
Also, a quick shout out to Jinni Gu and their team who made the DynamoDB destination that we announced last week!
|
||||
|
||||
As usual, thank you to all of our contributors: Harsha Teja Kanna, Manav, Maciej Nędza, mauro, Brian M, Iakov Salikov, Eliziario (Marcos Santos), coeurdestenebres, and mohammadbolt.
|
||||
|
||||
## 09/09/2021 Summary
|
||||
|
||||
We're going over the changes from 0.29.17 and before... and there's a lot of big improvements here, so don't miss them!
|
||||
|
||||
**New Source**: Facebook Pages **New Destination**: MongoDB **New Destination**: DynamoDB
|
||||
|
||||
* 🎉 You can now send notifications via webhook for successes and failures on Airbyte syncs. \(This is a massive contribution by @Pras, thank you\) 🎉
|
||||
* 🎉 Scheduling jobs and worker jobs are now separated, allowing for workers to be scaled horizontally.
|
||||
* 🎉 When developing a connector, you can now preview what your spec looks like in real time with this process.
|
||||
* 🎉 Oracle destination: Now has basic normalization.
|
||||
* 🎉 Add XLSB \(binary excel\) support to the Files source \(contributed by Muutech\).
|
||||
* 🎉 You can now properly cancel K8s deployments.
|
||||
* ✨ S3 source: Support for Parquet format.
|
||||
* ✨ Github source: Branches, repositories, organization users, tags, and pull request stats streams added \(contributed by @Christopher Wu\).
|
||||
* ✨ BigQuery destination: Added GCS upload option.
|
||||
* ✨ Salesforce source: Now Airbyte native.
|
||||
* ✨ Redshift destination: Optimized for performance.
|
||||
* 🏗 CDK: 🎉 We’ve released a tool to generate JSON Schemas from OpenAPI specs. This should make specifying schemas for API connectors a breeze! 🎉
|
||||
* 🏗 CDK: Source Acceptance Tests now verify that connectors correctly format strings which are declared as using date-time and date formats.
|
||||
* 🏗 CDK: Add private options to help in testing: \_limit and \_page\_size are now accepted by any CDK connector to minimze your output size for quick iteration while testing.
|
||||
* 🐛 Fixed a bug that made it possible for connector definitions to be duplicated, violating uniqueness.
|
||||
* 🐛 Pipedrive source: Output schemas no longer remove timestamp from fields.
|
||||
* 🐛 Github source: Empty repos and negative backoff values are now handled correctly.
|
||||
* 🐛 Harvest source: Normalization now works as expected.
|
||||
* 🐛 All CDC sources: Removed sleep logic which caused exceptions when loading data from high-volume sources.
|
||||
* 🐛 Slack source: Increased number of retries to tolerate flaky retry wait times on the API side.
|
||||
* 🐛 Slack source: Sync operations no longer hang indefinitely.
|
||||
* 🐛 Jira source: Now uses updated time as the cursor field for incremental sync instead of the created time.
|
||||
* 🐛 Intercom source: Fixed inconsistency between schema and output data.
|
||||
* 🐛 HubSpot source: Streams with the items property now have their schemas fixed.
|
||||
* 🐛 HubSpot source: Empty strings are no longer handled as dates, fixing the deals, companies, and contacts streams.
|
||||
* 🐛 Typeform source: Allows for multiple choices in responses now.
|
||||
* 🐛 Shopify source: The type for the amount field is now fixed in the schema.
|
||||
* 🐛 Postgres destination: \u0000\(NULL\) value processing is now fixed.
|
||||
|
||||
As usual... thank you to our wonderful contributors this week: Pras, Christopher Wu, Brian M, yahu98, Michele Zuccala, jinnig, and luizgribeiro!
|
||||
|
||||
## 09/01/2021 Summary
|
||||
|
||||
Got the changes from 0.29.13... with some other surprises!
|
||||
|
||||
* 🔥 There's a new way to create Airbyte sources! The team at Faros AI has created a Javascript/Typescript CDK which can be found here and in our docs here. This is absolutely awesome and give a huge thanks to Chalenge Masekera, Christopher Wu, eskrm, and Matthew Tovbin!
|
||||
* ✨ New Destination: Azure Blob Storage ✨
|
||||
|
||||
**New Source**: Bamboo HR \(contributed by @Oren Haliva\) **New Source**: BigCommerce \(contributed by @James Wilson\) **New Source**: Trello **New Source**: Google Analytics V4 **New Source**: Amazon Ads
|
||||
|
||||
* 💎 Alpine Docker images are the new standard for Python connectors, so image sizes have dropped by around 100 MB!
|
||||
* ✨ You can now apply tolerations for Airbyte Pods on K8s deployments \(contributed by @Pras\).
|
||||
* 🐛 Shopify source: Rate limit throttling fixed.
|
||||
* 📚 We now have a doc on how to deploy Airbyte at scale. Check it out here!
|
||||
* 🏗 Airbyte CDK: You can now ignore HTTP status errors and override retry parameters.
|
||||
|
||||
As usual, thank you to our awesome contributors: Oren Haliva, Pras, James Wilson, and Muutech.
|
||||
|
||||
## 08/26/2021 Summary
|
||||
|
||||
New Source: Short.io \(contributed by @Apostol Tegko\)
|
||||
|
||||
* 💎 GitHub source: Added support for rotating through multiple API tokens!
|
||||
* ✨ Syncs are now scheduled with a 3 day timeout \(contributed by @Vladimir Remar\).
|
||||
* ✨ Google Ads source: Added UserLocationReport stream \(contributed by @Max Krog\).
|
||||
* ✨ Cart.com source: Added the order\_items stream.
|
||||
* 🐛 Postgres source: Fixed out-of-memory issue with CDC interacting with large JSON blobs.
|
||||
* 🐛 Intercom source: Pagination now works as expected.
|
||||
|
||||
As always, thank you to our awesome community contributors this week: Apostol Tegko, Vladimir Remar, Max Krog, Pras, Marco Fontana, Troy Harvey, and damianlegawiec!
|
||||
|
||||
## 08/20/2021 Summary
|
||||
|
||||
Hey Airbyte community, we got some patch notes for y'all. Here's all the changes we've pushed since the last update.
|
||||
|
||||
* **New Source**: S3/Abstract Files
|
||||
* **New Source**: Zuora
|
||||
* **New Source**: Kustomer
|
||||
* **New Source**: Apify
|
||||
* **New Source**: Chargebee
|
||||
* **New Source**: Bing Ads
|
||||
|
||||
New Destination: Keen
|
||||
|
||||
* ✨ Shopify source: The `status` property is now in the `Products` stream.
|
||||
* ✨ Amazon Seller Partner source: Added support for `GET_MERCHANT_LISTINGS_ALL_DATA` and `GET_FBA_INVENTORY_AGED_DATA` stream endpoints.
|
||||
* ✨ GitHub source: Existing streams now don't minify the user property.
|
||||
* ✨ HubSpot source: Updated user-defined custom field schema generation.
|
||||
* ✨ Zendesk source: Migrated from Singer to the Airbyte CDK.
|
||||
* ✨ Amazon Seller Partner source: Migrated to the Airbyte CDK.
|
||||
* 🐛 Shopify source: Fixed the `products` schema to be in accordance with the API.
|
||||
* 🐛 S3 source: Fixed bug where syncs could hang indefinitely.
|
||||
|
||||
And as always... we'd love to shout out the awesome contributors that have helped push Airbyte forward. As a reminder, you can now see your contributions publicly reflected on our [contributors page](https://airbyte.com/contributors).
|
||||
|
||||
Thank you to Rodrigo Parra, Brian Krausz, Max Krog, Apostol Tegko, Matej Hamas, Vladimir Remar, Marco Fontana, Nicholas Bull, @mildbyte, @subhaklp, and Maciej Nędza!
|
||||
|
||||
## 07/30/2021 Summary
|
||||
|
||||
For this week's update, we got... a few new connectors this week in 0.29.0. We found that a lot of sources can pull data directly from the underlying db instance, which we naturally already supported.
|
||||
|
||||
* New Source: PrestaShop ✨
|
||||
* New Source: Snapchat Marketing ✨
|
||||
* New Source: Drupal
|
||||
* New Source: Magento
|
||||
* New Source: Microsoft Dynamics AX
|
||||
* New Source: Microsoft Dynamics Customer Engagement
|
||||
* New Source: Microsoft Dynamics GP
|
||||
* New Source: Microsoft Dynamics NAV
|
||||
* New Source: Oracle PeopleSoft
|
||||
* New Source: Oracle Siebel CRM
|
||||
* New Source: SAP Business One
|
||||
* New Source: Spree Commerce
|
||||
* New Source: Sugar CRM
|
||||
* New Source: Wordpress
|
||||
* New Source: Zencart
|
||||
* 🐛 Shopify source: Fixed the products schema to be in accordance with the API
|
||||
* 🐛 BigQuery source: No longer fails with nested array data types.
|
||||
|
||||
View the full release highlights here: [Platform](platform.md), [Connectors](connectors.md)
|
||||
|
||||
And as always, thank you to our wonderful contributors: Madison Swain-Bowden, Brian Krausz, Apostol Tegko, Matej Hamas, Vladimir Remar, Oren Haliva, satishblotout, jacqueskpoty, wallies
|
||||
|
||||
## 07/23/2021 Summary
|
||||
|
||||
What's going on? We just released 0.28.0 and here's the main highlights.
|
||||
|
||||
* New Destination: Google Cloud Storage ✨
|
||||
* New Destination: Kafka ✨ \(contributed by @Mario Molina\)
|
||||
* New Source: Pipedrive
|
||||
* New Source: US Census \(contributed by @Daniel Mateus Pires \(Earnest Research\)\)
|
||||
* ✨ Google Ads source: Now supports Campaigns, Ads, AdGroups, and Accounts streams.
|
||||
* ✨ Stripe source: All subscription types \(including expired and canceled ones\) are now returned.
|
||||
* 🐛 Facebook source: Improved rate limit management
|
||||
* 🐛 Square source: The send\_request method is no longer broken due to CDK changes
|
||||
* 🐛 MySQL destination: Does not fail on columns with JSON data now.
|
||||
|
||||
View the full release highlights here: [Platform](platform.md), [Connectors](connectors.md)
|
||||
|
||||
And as always, thank you to our wonderful contributors: Mario Molina, Daniel Mateus Pires \(Earnest Research\), gunu, Ankur Adhikari, Vladimir Remar, Madison Swain-Bowden, Maksym Pavlenok, Sam Crowder, mildbyte, avida, and gaart
|
||||
|
||||
## 07/16/2021 Summary
|
||||
|
||||
As for our changes this week...
|
||||
|
||||
* New Source: Zendesk Sunshine
|
||||
* New Source: Dixa
|
||||
* New Source: Typeform
|
||||
* 💎 MySQL destination: Now supports normalization!
|
||||
* 💎 MSSQL source: Now supports CDC \(Change Data Capture\)
|
||||
* ✨ Snowflake destination: Data coming from Airbyte is now identifiable
|
||||
* 🐛 GitHub source: Now uses the correct cursor field for the IssueEvents stream
|
||||
* 🐛 Square source: The send\_request method is no longer broken due to CDK changes
|
||||
|
||||
View the full release highlights here: [Platform](platform.md), [Connectors](connectors.md)
|
||||
|
||||
As usual, thank you to our awesome community contributors this week: Oliver Meyer, Varun, Brian Krausz, shadabshaukat, Serhii Lazebnyi, Juliano Benvenuto Piovezan, mildbyte, and Sam Crowder!
|
||||
|
||||
## 07/09/2021 Summary
|
||||
|
||||
* New Source: PayPal Transaction
|
||||
* New Source: Square
|
||||
* New Source: SurveyMonkey
|
||||
* New Source: CockroachDB
|
||||
* New Source: Airbyte-Native GitHub
|
||||
* New Source: Airbyte-Native GitLab
|
||||
* New Source: Airbyte-Native Twilio
|
||||
* ✨ S3 destination: Now supports anyOf, oneOf and allOf schema fields.
|
||||
* ✨ Instagram source: Migrated to the CDK and has improved error handling.
|
||||
* ✨ Shopify source: Add support for draft orders.
|
||||
* ✨ K8s Deployments: Now support logging to GCS.
|
||||
* 🐛 GitHub source: Fixed issue with locked breaking normalization of the pull\_request stream.
|
||||
* 🐛 Okta source: Fix endless loop when syncing data from logs stream.
|
||||
* 🐛 PostgreSQL source: Fixed decimal handling with CDC.
|
||||
* 🐛 Fixed random silent source failures.
|
||||
* 📚 New document on how the CDK handles schemas.
|
||||
* 🏗️ Python CDK: Now allows setting of network adapter args on outgoing HTTP requests.
|
||||
|
||||
View the full release highlights here: [Platform](platform.md), [Connectors](connectors.md)
|
||||
|
||||
As usual, thank you to our awesome community contributors this week: gunu, P.VAD, Rodrigo Parra, Mario Molina, Antonio Grass, sabifranjo, Jaime Farres, shadabshaukat, Rodrigo Menezes, dkelwa, Jonathan Duval, and Augustin Lafanechère.
|
||||
|
||||
## 07/01/2021 Summary
|
||||
|
||||
* New Destination: Google PubSub
|
||||
* New Source: AWS CloudTrail
|
||||
|
||||
_The risks and issues with upgrading Airbyte are now gone..._
|
||||
|
||||
* 🎉 Airbyte automatically upgrades versions safely at server startup 🎉
|
||||
* 💎 Logs on K8s are now stored in Minio by default, no S3 bucket required
|
||||
* ✨ Looker Source: Supports the Run Look output stream
|
||||
* ✨ Slack Source: is now Airbyte native!
|
||||
* 🐛 Freshdesk Source: No longer fails after 300 pages
|
||||
* 📚 New tutorial on building Java destinations
|
||||
|
||||
Starting from next week, our weekly office hours will now become demo days! Drop by to get sneak peeks and new feature demos.
|
||||
|
||||
* We added the \#careers channel, so if you're hiring, post your job reqs there!
|
||||
* We added a \#understanding-airbyte channel to mirror [this](../../understanding-airbyte/) section on our docs site. Ask any questions about our architecture or protocol there.
|
||||
* We added a \#contributing-to-airbyte channel. A lot of people ask us about how to contribute to the project, so ask away there!
|
||||
|
||||
View the full release highlights here: [Platform](platform.md), [Connectors](connectors.md)
|
||||
|
||||
As usual, thank you to our awesome community contributors this week: Harshith Mullapudi, Michael Irvine, and [sabifranjo](https://github.com/sabifranjo).
|
||||
|
||||
## 06/24/2021 Summary
|
||||
|
||||
* New Source: [IBM Db2](../../integrations/sources/db2.md)
|
||||
* 💎 We now support Avro and JSONL output for our S3 destination! 💎
|
||||
* 💎 Brand new BigQuery destination flavor that now supports denormalized STRUCT types.
|
||||
* ✨ Looker source now supports self-hosted instances.
|
||||
* ✨ Facebook Marketing source is now migrated to the CDK, massively improving async job performance and error handling.
|
||||
|
||||
View the full connector release notes [here](connectors.md).
|
||||
|
||||
As usual, thank you to some of our awesome community contributors this week: Harshith Mullapudi, Tyler DeLange, Daniel Mateus Pires, EdBizarro, Tyler Schroeder, and Konrad Schlatte!
|
||||
|
||||
## 06/18/2021 Summary
|
||||
|
||||
* New Source: [Snowflake](../../integrations/sources/snowflake.md)
|
||||
* 💎 We now support custom dbt transformations! 💎
|
||||
* ✨ We now support configuring your destination namespace at the table level when setting up a connection!
|
||||
* ✨ The S3 destination now supports Minio S3 and Parquet output!
|
||||
|
||||
View the full release notes here: [Platform](platform.md), [Connectors](connectors.md)
|
||||
|
||||
As usual, thank you to some of our awesome community contributors this week: Tyler DeLange, Mario Molina, Rodrigo Parra, Prashanth Patali, Christopher Wu, Itai Admi, Fred Reimer, and Konrad Schlatte!
|
||||
|
||||
## 06/10/2021 Summary
|
||||
|
||||
* New Destination: [S3!!](../../integrations/destinations/s3.md)
|
||||
* New Sources: [Harvest](../../integrations/sources/harvest.md), [Amplitude](../../integrations/sources/amplitude.md), [Posthog](../../integrations/sources/posthog.md)
|
||||
* 🐛 Ensure that logs from threads created by replication workers are added to the log file.
|
||||
* 🐛 Handle TINYINT\(1\) and BOOLEAN correctly and fix target file comparison for MySQL CDC.
|
||||
* Jira source: now supports all available entities in Jira Cloud.
|
||||
* 📚 Added a troubleshooting section, a gradle cheatsheet, a reminder on what the reset button does, and a refresh on our docs best practices.
|
||||
|
||||
#### Connector Development:
|
||||
|
||||
* Containerized connector code generator
|
||||
* Added JDBC source connector bootstrap template.
|
||||
* Added Java destination generator.
|
||||
|
||||
View the full release notes highlights here: [Platform](platform.md), [Connectors](connectors.md)
|
||||
|
||||
As usual, thank you to some of our awesome community contributors this week \(I've noticed that we've had more contributors to our docs, which we really appreciate\). Ping, Harshith Mullapudi, Michael Irvine, Matheus di Paula, jacqueskpoty and P.VAD.
|
||||
|
||||
## Overview
|
||||
|
||||
Airbyte is comprised of 2 parts:
|
||||
|
||||
* Platform \(The scheduler, workers, api, web app, and the Airbyte protocol\). Here is the [changelog for Platform](platform.md).
|
||||
* Connectors that run in Docker containers. Here is the [changelog for the connectors](connectors.md).
|
||||
|
||||
## Airbyte Platform Releases
|
||||
|
||||
### Production v. Dev Releases
|
||||
|
||||
The "production" version of Airbyte is the version of the app specified in `.env`. With each production release, we update the version in the `.env` file. This version will always be available for download on DockerHub. It is the version of the app that runs when a user runs `docker compose up`.
|
||||
|
||||
The "development" version of Airbyte is the head of master branch. It is the version of the app that runs when a user runs `./gradlew build &&
|
||||
VERSION=dev docker compose up`.
|
||||
|
||||
### Production Release Schedule
|
||||
|
||||
#### Scheduled Releases
|
||||
|
||||
Airbyte currently releases a new minor version of the application on a weekly basis. Generally this weekly release happens on Monday or Tuesday.
|
||||
|
||||
#### Hotfixes
|
||||
|
||||
Airbyte releases a new version whenever it discovers and fixes a bug that blocks any mission critical functionality.
|
||||
|
||||
**Mission Critical**
|
||||
|
||||
e.g. Non-ASCII characters break the Salesforce source.
|
||||
|
||||
**Non-Mission Critical**
|
||||
|
||||
e.g. Buttons in the UI are offset.
|
||||
|
||||
#### Unscheduled Releases
|
||||
|
||||
We will often release more frequently than the weekly cadence if we complete a feature that we know that a user is waiting on.
|
||||
|
||||
### Development Release Schedule
|
||||
|
||||
As soon as a feature is on master, it is part of the development version of Airbyte. We merge features as soon as they are ready to go \(have been code reviewed and tested\). We attempt to keep the development version of the app working all the time. We are iterating quickly, however, and there may be intermittent periods where the development version is broken.
|
||||
|
||||
If there is ever a feature that is only on the development version, and you need it on the production version, please let us know. We are very happy to do ad-hoc production releases if it unblocks a specific need for one of our users.
|
||||
|
||||
## Airbyte Connector Releases
|
||||
|
||||
Each connector is tracked with its own version. These versions are separate from the versions of Airbyte Platform. We generally will bump the version of a connector anytime we make a change to it. We rely on a large suite of tests to make sure that these changes do not cause regressions in our connectors.
|
||||
|
||||
When we updated the version of a connector, we usually update the connector's version in Airbyte Platform as well. Keep in mind that you might not see the updated version of that connector in the production version of Airbyte Platform until after a production release of Airbyte Platform.
|
||||
|
||||
@@ -1,776 +0,0 @@
|
||||
---
|
||||
description: Do not miss the new connectors we support!
|
||||
---
|
||||
|
||||
# Connectors
|
||||
|
||||
**You can request new connectors directly** [**here**](https://github.com/airbytehq/airbyte/issues/new?assignees=&labels=area%2Fintegration%2C+new-integration&template=new-integration-request.md&title=)**.**
|
||||
|
||||
Note: Airbyte is not built on top of Singer but is compatible with Singer's protocol. Airbyte's ambitions go beyond what Singer enables us to do, so we are building our own protocol that maintains compatibility with Singer's protocol.
|
||||
|
||||
Check out our [connector roadmap](https://github.com/airbytehq/airbyte/projects/3) to see what we're currently working on.
|
||||
|
||||
## 1/28/2022
|
||||
|
||||
New sources:
|
||||
|
||||
- [**Chartmogul**](https://docs.airbyte.com/integrations/sources/chartmogul)
|
||||
- [**Hellobaton**](https://docs.airbyte.com/integrations/sources/hellobaton)
|
||||
- [**Flexport**](https://docs.airbyte.com/integrations/sources/flexport)
|
||||
- [**PersistIq**](https://docs.airbyte.com/integrations/sources/persistiq)
|
||||
|
||||
## 1/6/2022
|
||||
|
||||
New sources:
|
||||
|
||||
- [**3PL Central**](https://docs.airbyte.com/integrations/sources/tplcentral)
|
||||
- [**My Hours**](https://docs.airbyte.com/integrations/sources/my-hours)
|
||||
- [**Qualaroo**](https://docs.airbyte.com/integrations/sources/qualaroo)
|
||||
- [**SearchMetrics**](https://docs.airbyte.com/integrations/sources/search-metrics)
|
||||
|
||||
## 12/16/2021
|
||||
|
||||
New source:
|
||||
|
||||
- [**OpenWeather**](https://docs.airbyte.com/integrations/sources/openweather)
|
||||
|
||||
New destinations:
|
||||
|
||||
- [**ClickHouse**](https://docs.airbyte.com/integrations/destinations/clickhouse)
|
||||
- [**RabbitMQ**](https://docs.airbyte.com/integrations/destinations/rabbitmq)
|
||||
- [**Amazon SQS**](https://docs.airbyte.com/integrations/destinations/amazon-sqs)
|
||||
- [**Rockset**](https://docs.airbyte.com/integrations/destinations/rockset)
|
||||
|
||||
## 12/9/2021
|
||||
|
||||
New source:
|
||||
|
||||
- [**Mailgun**](https://docs.airbyte.com/integrations/sources/mailgun)
|
||||
|
||||
## 12/2/2021
|
||||
|
||||
New destinations:
|
||||
|
||||
- [**Redis**](https://docs.airbyte.com/integrations/destinations/redis)
|
||||
- [**MQTT**](https://docs.airbyte.com/integrations/destinations/mqtt)
|
||||
- [**Google Firestore**](https://docs.airbyte.com/integrations/destinations/firestore)
|
||||
- [**Kinesis**](https://docs.airbyte.com/integrations/destinations/kinesis)
|
||||
|
||||
## 11/25/2021
|
||||
|
||||
New sources:
|
||||
|
||||
- [**Airtable**](https://docs.airbyte.com/integrations/sources/airtable)
|
||||
- [**Notion**](https://docs.airbyte.com/integrations/sources/notion)
|
||||
- [**Pardot**](https://docs.airbyte.com/integrations/sources/pardot)
|
||||
- [**Notion**](https://docs.airbyte.com/integrations/sources/linnworks)
|
||||
- [**YouTube Analytics**](https://docs.airbyte.com/integrations/sources/youtube-analytics)
|
||||
|
||||
New features:
|
||||
|
||||
- **Exchange Rates** Source: add `ignore_weekends` option.
|
||||
- **Facebook** Source: add the videos stream.
|
||||
- **Freshdesk** Source: removed the limitation in streams pagination.
|
||||
- **Jira** Source: add option to render fields in HTML format.
|
||||
- **MongoDB v2** Source: improve read performance.
|
||||
- **Pipedrive** Source: specify schema for "persons" stream.
|
||||
- **PostgreSQL** Source: exclude tables on which user doesn't have select privileges.
|
||||
- **SurveyMonkey** Source: improve connection check.
|
||||
|
||||
## 11/17/2021
|
||||
|
||||
New destination:
|
||||
|
||||
- [**ScyllaDB**](https://docs.airbyte.com/integrations/destinations/scylla)
|
||||
|
||||
New sources:
|
||||
|
||||
- [**Azure Table Storage**](https://docs.airbyte.com/integrations/sources/azure-table)
|
||||
- [**Linnworks**](https://docs.airbyte.com/integrations/sources/linnworks)
|
||||
|
||||
New features:
|
||||
|
||||
- **MySQL** Source: Now has basic performance tests.
|
||||
- **Salesforce** Source: We now automatically transform and handle incorrect data for the anyType and calculated types.
|
||||
|
||||
## 11/11/2021
|
||||
|
||||
New destinations:
|
||||
|
||||
- [**Cassandra**](https://docs.airbyte.com/integrations/destinations/cassandra)
|
||||
- [**Pulsar**](https://docs.airbyte.com/integrations/destinations/pulsar)
|
||||
|
||||
New sources:
|
||||
|
||||
- [**Confluence**](https://docs.airbyte.com/integrations/sources/confluence)
|
||||
- [**Monday**](https://docs.airbyte.com/integrations/sources/monday)
|
||||
- [**Commerce Tools**](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-commercetools)
|
||||
- [**Pinterest**](https://docs.airbyte.com/integrations/sources/pinterest)
|
||||
|
||||
New features:
|
||||
|
||||
- **Shopify** Source: Now supports the FulfillmentOrders and Fulfillments streams.
|
||||
- **Greenhouse** Source: Now supports the Demographics stream.
|
||||
- **Recharge** Source: Broken requests should now be re-requested with improved backoff.
|
||||
- **Stripe** Source: Now supports the checkout_sessions, checkout_sessions_line_item, and promotion_codes streams.
|
||||
- **Db2** Source: Now supports SSL.
|
||||
|
||||
## 11/3/2021
|
||||
|
||||
New destination:
|
||||
|
||||
- [**Elasticsearch**](https://docs.airbyte.com/integrations/destinations/elasticsearch)
|
||||
|
||||
New sources:
|
||||
|
||||
- [**Salesloft**](https://docs.airbyte.com/integrations/sources/salesloft)
|
||||
- [**OneSignal**](https://docs.airbyte.com/integrations/sources/onesignal)
|
||||
- [**Strava**](https://docs.airbyte.com/integrations/sources/strava)
|
||||
- [**Lemlist**](https://docs.airbyte.com/integrations/sources/lemlist)
|
||||
- [**Amazon SQS**](https://docs.airbyte.com/integrations/sources/amazon-sqs)
|
||||
- [**Freshservices**](https://docs.airbyte.com/integrations/sources/freshservice/)
|
||||
- [**Freshsales**](https://docs.airbyte.com/integrations/sources/freshsales)
|
||||
- [**Appsflyer**](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-appsflyer)
|
||||
- [**Paystack**](https://docs.airbyte.com/integrations/sources/paystack)
|
||||
- [**Sentry**](https://docs.airbyte.com/integrations/sources/sentry)
|
||||
- [**Retently**](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-retently)
|
||||
- [**Delighted!**](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-delighted)
|
||||
|
||||
New features:
|
||||
|
||||
- **BigQuery** Destination: You can now run transformations in batches, preventing queries from hitting BigQuery limits. (contributed by @Andrés Bravo)
|
||||
- **S3** Source: Memory and Performance optimizations, also some fancy new PyArrow CSV configuration options.
|
||||
- **Zuora** Source: Now supports Unlimited as an option for the Data Query Live API.
|
||||
- **Clickhouse** Source: Now supports SSL and connection via SSH tunneling.
|
||||
|
||||
## 10/20/2021
|
||||
|
||||
New source:
|
||||
|
||||
- [**WooCommerce**](https://docs.airbyte.com/integrations/sources/woocommerce)
|
||||
|
||||
New feature:
|
||||
|
||||
- **MSSQL** destination: Now supports basic normalization
|
||||
|
||||
## 9/29/2021
|
||||
|
||||
New sources:
|
||||
|
||||
- [**LinkedIn Ads**](https://docs.airbyte.com/integrations/sources/linkedin-ads)
|
||||
- [**Kafka**](https://docs.airbyte.com/integrations/sources/kafka)
|
||||
- [**Lever Hiring**](https://docs.airbyte.com/integrations/sources/lever-hiring)
|
||||
|
||||
New features:
|
||||
|
||||
- **MySQL** destination: Now supports connection via TLS/SSL
|
||||
- **BigQuery** (denormalized) destination: Supports reading BigQuery types such as date by reading the format field (contributed by @Nicolas Moreau)
|
||||
- **Hubspot** source: Added contacts associations to the deals stream.
|
||||
- **GitHub** source: Now supports pulling commits from user-specified branches.
|
||||
- **Google Search Console** source: Now accepts admin email as input when using a service account key.
|
||||
- **Greenhouse** source: Now identifies API streams it has access to if permissions are limited.
|
||||
- **Marketo** source: Now Airbyte native.
|
||||
- **S3** source: Now supports any source that conforms to the S3 protocol (Non-AWS S3).
|
||||
- **Shopify** source: Now reports pre_tax_price on the line_items stream if you have Shopify Plus.
|
||||
- **Stripe** source: Now actually uses the mandatory start_date config field for incremental syncs.
|
||||
|
||||
## 9/16/2021
|
||||
|
||||
New destinations:
|
||||
|
||||
- [**Databricks**](https://docs.airbyte.com/integrations/destinations/databricks)
|
||||
|
||||
New sources:
|
||||
|
||||
- [**Close.com**](https://docs.airbyte.com/integrations/sources/close-com)
|
||||
- [**Google Search Console**](https://docs.airbyte.com/integrations/sources/google-search-console)
|
||||
|
||||
New features:
|
||||
|
||||
- **Google Ads** source: You can now specify user-specified queries in GAQL.
|
||||
- **GitHub** source: All streams with a parent stream use cached parent stream data when possible.
|
||||
- **Shopify** source: Substantial performance improvements to the incremental sync mode.
|
||||
- **Stripe** source: Now supports the PaymentIntents stream.
|
||||
- **Pipedrive** source: Now supports the Organizations stream.
|
||||
- **Sendgrid** source: Now supports the SingleSendStats stream.
|
||||
- **Bing Ads** source: Now supports the Report stream.
|
||||
- **GitHub** source: Now supports the Reactions stream.
|
||||
- **MongoDB** source: Now Airbyte native!
|
||||
|
||||
## 9/9/2021
|
||||
|
||||
New source:
|
||||
|
||||
- [**Facebook Pages**](https://docs.airbyte.com/integrations/sources/facebook-pages)
|
||||
|
||||
New destinations:
|
||||
|
||||
- [**MongoDB**](https://docs.airbyte.com/integrations/destinations/mongodb)
|
||||
- [**DynamoDB**](https://docs.airbyte.com/integrations/destinations/dynamodb)
|
||||
|
||||
New features:
|
||||
|
||||
- **S3** source: Support for Parquet format.
|
||||
- **Github** source: Branches, repositories, organization users, tags, and pull request stats streams added \(contributed by @Christopher Wu\).
|
||||
- **BigQuery** destination: Added GCS upload option.
|
||||
- **Salesforce** source: Now Airbyte native.
|
||||
- **Redshift** destination: Optimized for performance.
|
||||
|
||||
Bug fixes:
|
||||
|
||||
- **Pipedrive** source: Output schemas no longer remove timestamp from fields.
|
||||
- **Github** source: Empty repos and negative backoff values are now handled correctly.
|
||||
- **Harvest** source: Normalization now works as expected.
|
||||
- **All CDC sources**: Removed sleep logic which caused exceptions when loading data from high-volume sources.
|
||||
- **Slack** source: Increased number of retries to tolerate flaky retry wait times on the API side.
|
||||
- **Slack** source: Sync operations no longer hang indefinitely.
|
||||
- **Jira** source: Now uses updated time as the cursor field for incremental sync instead of the created time.
|
||||
- **Intercom** source: Fixed inconsistency between schema and output data.
|
||||
- **HubSpot** source: Streams with the items property now have their schemas fixed.
|
||||
- **HubSpot** source: Empty strings are no longer handled as dates, fixing the deals, companies, and contacts streams.
|
||||
- **Typeform** source: Allows for multiple choices in responses now.
|
||||
- **Shopify** source: The type for the amount field is now fixed in the schema.
|
||||
- **Postgres** destination: \u0000\(NULL\) value processing is now fixed.
|
||||
|
||||
## 9/1/2021
|
||||
|
||||
New sources:
|
||||
|
||||
- [**Bamboo HR**](https://docs.airbyte.com/integrations/sources/bamboo-hr)
|
||||
- [**BigCommerce**](https://docs.airbyte.com/integrations/sources/bigcommerce)
|
||||
- [**Trello**](https://docs.airbyte.com/integrations/sources/trello)
|
||||
- [**Google Analytics V4**](https://docs.airbyte.com/integrations/sources/google-analytics-v4)
|
||||
- [**Amazon Ads**](https://docs.airbyte.com/integrations/sources/google-analytics-v4)
|
||||
|
||||
Bug fixes:
|
||||
|
||||
- **Shopify** source: Rate limit throttling fixed.
|
||||
|
||||
## 8/26/2021
|
||||
|
||||
New source:
|
||||
|
||||
- [**Short.io**](https://docs.airbyte.com/integrations/sources/shortio)
|
||||
|
||||
New features:
|
||||
|
||||
- **GitHub** source: Add support for rotating through multiple API tokens.
|
||||
- **Google Ads** source: Added `UserLocationReport` stream.
|
||||
- **Cart.com** source: Added the `order_items` stream.
|
||||
|
||||
Bug fixes:
|
||||
|
||||
- **Postgres** source: Fix out-of-memory issue with CDC interacting with large JSON blobs.
|
||||
- **Intercom** source: Pagination now works as expected.
|
||||
|
||||
## 8/18/2021
|
||||
|
||||
New source:
|
||||
|
||||
- [**Bing Ads**](https://docs.airbyte.com/integrations/sources/bing-ads)
|
||||
|
||||
New destination:
|
||||
|
||||
- [**Keen**](https://docs.airbyte.com/integrations/destinations/keen)
|
||||
|
||||
New features:
|
||||
|
||||
- **Chargebee** source: Adds support for the `items`, `item prices` and `attached items` endpoints.
|
||||
|
||||
Bug fixes:
|
||||
|
||||
- **QuickBooks** source: Now uses the number data type for decimal fields.
|
||||
- **HubSpot** source: Fixed `empty string` inside of the `number` and `float` datatypes.
|
||||
- **GitHub** source: Validation fixed on non-required fields.
|
||||
- **BigQuery** destination: Now supports processing of arrays of records properly.
|
||||
- **Oracle** destination: Fixed destination check for users without DBA role.
|
||||
|
||||
## 8/9/2021
|
||||
|
||||
New sources:
|
||||
|
||||
- [**S3/Abstract Files**](https://docs.airbyte.com/integrations/sources/s3)
|
||||
- [**Zuora**](https://docs.airbyte.com/integrations/sources/zuora)
|
||||
- [**Kustomer**](https://docs.airbyte.com/integrations/sources/kustomer-singer/)
|
||||
- [**Apify**](https://docs.airbyte.com/integrations/sources/apify-dataset)
|
||||
- [**Chargebee**](https://docs.airbyte.com/integrations/sources/chargebee)
|
||||
|
||||
New features:
|
||||
|
||||
- **Shopify** source: The `status` property is now in the `Products` stream.
|
||||
- **Amazon Seller Partner** source: Added support for `GET_MERCHANT_LISTINGS_ALL_DATA` and `GET_FBA_INVENTORY_AGED_DATA` stream endpoints.
|
||||
- **GitHub** source: Existing streams now don't minify the `user` property.
|
||||
- **HubSpot** source: Updated user-defined custom field schema generation.
|
||||
- **Zendesk** source: Migrated from Singer to the Airbyte CDK.
|
||||
- **Amazon Seller Partner** source: Migrated to the Airbyte CDK.
|
||||
|
||||
Bug fixes:
|
||||
|
||||
- **HubSpot** source: Casting exceptions are now logged correctly.
|
||||
- **S3** source: Fixed bug where syncs could hang indefinitely.
|
||||
- **Shopify** source: Fixed the `products` schema to be in accordance with the API.
|
||||
- **PayPal Transactions** source: Fixed the start date minimum to be 3 years rather than 45 days.
|
||||
- **Google Ads** source: Added the `login-customer-id` setting.
|
||||
- **Intercom** source: Rate limit corrected from 1000 requests/minute to 1000 requests/hour.
|
||||
- **S3** source: Fixed bug in spec to properly display the `format` field in the UI.
|
||||
|
||||
New CDK features:
|
||||
|
||||
- Now allows for setting request data in non-JSON formats.
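For illustration, here is a minimal sketch of what this can look like in a connector. The `ExampleFormStream` class is hypothetical, and the `request_body_data` hook is assumed to be the relevant CDK override for non-JSON request bodies.

```python
# Hypothetical stream sending a form-encoded (non-JSON) request body.
# Assumes the airbyte_cdk package and its HttpStream.request_body_data hook.
from typing import Any, Iterable, Mapping, Optional

from airbyte_cdk.sources.streams.http import HttpStream


class ExampleFormStream(HttpStream):
    url_base = "https://api.example.com/"  # placeholder API
    primary_key = "id"
    http_method = "POST"  # send a body instead of query params

    def path(self, **kwargs) -> str:
        return "reports"

    def request_body_data(self, **kwargs) -> Optional[Mapping[str, Any]]:
        # The returned mapping is sent as form-encoded data
        # rather than a JSON body.
        return {"report_type": "daily", "format": "csv"}

    def next_page_token(self, response) -> Optional[Mapping[str, Any]]:
        return None  # no pagination in this sketch

    def parse_response(self, response, **kwargs) -> Iterable[Mapping]:
        yield from response.json().get("data", [])
```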
|
||||
|
||||
## 7/30/2021
|
||||
|
||||
New sources:
|
||||
|
||||
- [**PrestaShop**](https://docs.airbyte.com/integrations/sources/prestashop)
|
||||
- [**Snapchat Marketing**](https://docs.airbyte.com/integrations/sources/snapchat-marketing)
|
||||
- [**Drupal**](https://docs.airbyte.com/integrations/sources/drupal)
|
||||
- [**Magento**](https://docs.airbyte.com/integrations/sources/magento)
|
||||
- [**Microsoft Dynamics AX**](https://docs.airbyte.com/integrations/sources/microsoft-dynamics-ax)
|
||||
- [**Microsoft Dynamics Customer Engagement**](https://docs.airbyte.com/integrations/sources/microsoft-dynamics-customer-engagement)
|
||||
- [**Microsoft Dynamics GP**](https://docs.airbyte.com/integrations/sources/microsoft-dynamics-gp)
|
||||
- [**Microsoft Dynamics NAV**](https://docs.airbyte.com/integrations/sources/microsoft-dynamics-nav)
|
||||
- [**Oracle PeopleSoft**](https://docs.airbyte.com/integrations/sources/oracle-peoplesoft)
|
||||
- [**Oracle Siebel CRM**](https://docs.airbyte.com/integrations/sources/oracle-siebel-crm)
|
||||
- [**SAP Business One**](https://docs.airbyte.com/integrations/sources/sap-business-one)
|
||||
- [**Spree Commerce**](https://docs.airbyte.com/integrations/sources/spree-commerce)
|
||||
- [**Sugar CRM**](https://docs.airbyte.com/integrations/sources/sugar-crm)
|
||||
- [**WooCommerce**](https://docs.airbyte.com/integrations/sources/woocommerce)
|
||||
- [**Wordpress**](https://docs.airbyte.com/integrations/sources/wordpress)
|
||||
- [**Zencart**](https://docs.airbyte.com/integrations/sources/zencart)
|
||||
|
||||
Bug fixes:
|
||||
|
||||
- **Shopify** source: Fixed the `products` schema to be in accordance with the API.
|
||||
- **BigQuery** source: No longer fails with `Array of Records` data types.
|
||||
- **BigQuery** destination: Improved logging; Job IDs now include the location and Project ID.
|
||||
|
||||
## 7/23/2021
|
||||
|
||||
New sources:
|
||||
|
||||
- [**Pipedrive**](https://docs.airbyte.com/integrations/sources/pipedrive)
|
||||
- [**US Census**](https://docs.airbyte.com/integrations/sources/us-census)
|
||||
- [**BigQuery**](https://docs.airbyte.com/integrations/sources/bigquery)
|
||||
|
||||
New destinations:
|
||||
|
||||
- [**Google Cloud Storage**](https://docs.airbyte.com/integrations/destinations/gcs)
|
||||
- [**Kafka**](https://docs.airbyte.com/integrations/destinations/kafka)
|
||||
|
||||
New Features:
|
||||
|
||||
- **Java Connectors**: Now have config validators for check, discover, read, and write calls
|
||||
- **Stripe** source: All subscription types are returnable \(including expired and canceled ones\).
|
||||
- **Mixpanel** source: Migrated to the CDK.
|
||||
- **Intercom** source: Migrated to the CDK.
|
||||
- **Google Ads** source: Now supports the `Campaigns`, `Ads`, `AdGroups`, and `Accounts` streams.
|
||||
|
||||
Bug Fixes:
|
||||
|
||||
- **Facebook** source: Improved rate limit management
|
||||
- **Instagram** source: Now supports old format for state and automatically updates it to the new format.
|
||||
- **Sendgrid** source: Now gracefully handles malformed responses from API.
|
||||
- **Jira** source: Fixed dbt failing to normalize schema for the labels stream.
|
||||
- **MySQL** destination: Does not fail anymore with columns that contain JSON data.
|
||||
- **Slack** source: Now does not fail stream slicing on reading threads.
|
||||
|
||||
## 7/16/2021
|
||||
|
||||
3 new sources:
|
||||
|
||||
- [**Zendesk Sunshine**](https://docs.airbyte.com/integrations/sources/zendesk-sunshine)
|
||||
- [**Dixa**](https://docs.airbyte.com/integrations/sources/dixa)
|
||||
- [**Typeform**](https://docs.airbyte.com/integrations/sources/typeform)
|
||||
|
||||
New Features:
|
||||
|
||||
- **MySQL** destination: Now supports normalization!
|
||||
- **MSSQL** source: Now supports CDC \(Change Data Capture\).
|
||||
- **Snowflake** destination: Data coming from Airbyte is now identifiable.
|
||||
- **GitHub** source: Now handles rate limiting.
|
||||
|
||||
Bug Fixes:
|
||||
|
||||
- **GitHub** source: Now uses the correct cursor field for the `IssueEvents` stream.
|
||||
- **Square** source: `send_request` method is no longer broken.
|
||||
|
||||
## 7/08/2021
|
||||
|
||||
7 new sources:
|
||||
|
||||
- [**PayPal Transaction**](https://docs.airbyte.com/integrations/sources/paypal-transaction)
|
||||
- [**Square**](https://docs.airbyte.com/integrations/sources/square)
|
||||
- [**SurveyMonkey**](https://docs.airbyte.com/integrations/sources/surveymonkey)
|
||||
- [**CockroachDB**](https://docs.airbyte.com/integrations/sources/cockroachdb)
|
||||
- [**Airbyte-native GitLab**](https://docs.airbyte.com/integrations/sources/gitlab)
|
||||
- [**Airbyte-native GitHub**](https://docs.airbyte.com/integrations/sources/github)
|
||||
- [**Airbyte-native Twilio**](https://docs.airbyte.com/integrations/sources/twilio)
|
||||
|
||||
New Features:
|
||||
|
||||
- **S3** destination: Now supports `anyOf`, `oneOf` and `allOf` schema fields.
|
||||
- **Instagram** source: Migrated to the CDK and has improved error handling.
|
||||
- **Snowflake** source: Now has comprehensive data type tests.
|
||||
- **Shopify** source: Changed the default stream cursor field to `updated_at` where possible.
|
||||
- **Shopify** source: Add support for draft orders.
|
||||
- **MySQL** destination: Now supports normalization.
|
||||
|
||||
Connector Development:
|
||||
|
||||
- **Python CDK**: Now allows setting of network adapter args on outgoing HTTP requests \(an illustrative sketch follows this list\).
|
||||
- Abstract classes for non-JDBC relational database sources.
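For a sense of what "network adapter args" means in practice, here is an illustrative sketch using the `requests` library directly, not the CDK's exact API: a transport adapter controls connection pooling and retry behavior for outgoing HTTP requests.

```python
# Illustration of the underlying idea with the requests library:
# a transport adapter configures pooling and retries for outgoing calls.
# How the CDK exposes these arguments is not reproduced here.
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
adapter = HTTPAdapter(
    pool_connections=10,  # connection pool size
    pool_maxsize=10,
    max_retries=Retry(total=3, backoff_factor=1.0),
)
session.mount("https://", adapter)

# Subsequent requests through this session use the adapter's settings.
response = session.get("https://api.example.com/items", timeout=30)
print(response.status_code)
```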
|
||||
|
||||
Bugfixes:
|
||||
|
||||
- **GitHub** source: Fixed issue with `locked` breaking normalization of the pull_request stream.
|
||||
- **PostgreSQL** source: Fixed decimal handling with CDC.
|
||||
- **Okta** source: Fix endless loop when syncing data from logs stream.
|
||||
|
||||
## 7/01/2021
|
||||
|
||||
Bugfixes:
|
||||
|
||||
- **Looker** source: Now supports the Run Look stream.
|
||||
- **Google Adwords**: CI is fixed and new version is published.
|
||||
- **Slack** source: Now Airbyte native and supports channels, channel members, messages, users, and threads streams.
|
||||
- **Freshdesk** source: Does not fail after 300 pages anymore.
|
||||
- **MSSQL** source: Now has comprehensive data type tests.
|
||||
|
||||
## 6/24/2021
|
||||
|
||||
1 new source:
|
||||
|
||||
- [**Db2**](https://docs.airbyte.com/integrations/sources/db2)
|
||||
|
||||
New features:
|
||||
|
||||
- **S3** destination: supports Avro and Jsonl output!
|
||||
- **BigQuery** destination: now supports loading JSON data as structured data.
|
||||
- **Looker** source: Now supports self-hosted instances.
|
||||
- **Facebook** source: is now migrated to the CDK.
|
||||
|
||||
## 6/18/2021
|
||||
|
||||
1 new source:
|
||||
|
||||
- [**Snowflake**](https://docs.airbyte.com/integrations/sources/snowflake)
|
||||
|
||||
New features:
|
||||
|
||||
- **Postgres** source: now has comprehensive data type tests.
|
||||
- **Google Ads** source: now uses the [Google Ads Query Language](https://developers.google.com/google-ads/api/docs/query/overview)!
|
||||
- **S3** destination: supports Parquet output!
|
||||
- **S3** destination: supports Minio S3!
|
||||
- **BigQuery** destination: credentials are now optional.
|
||||
|
||||
## 6/10/2021
|
||||
|
||||
1 new destination:
|
||||
|
||||
- [**S3**](https://docs.airbyte.com/integrations/destinations/s3)
|
||||
|
||||
3 new sources:
|
||||
|
||||
- [**Harvest**](https://docs.airbyte.com/integrations/sources/harvest)
|
||||
- [**Amplitude**](https://docs.airbyte.com/integrations/sources/amplitude)
|
||||
- [**Posthog**](https://docs.airbyte.com/integrations/sources/posthog)
|
||||
|
||||
New features:
|
||||
|
||||
- **Jira** source: now supports all available entities in Jira Cloud.
|
||||
- **ExchangeRatesAPI** source: clearer messages around unsupported currencies.
|
||||
- **MySQL** source: Comprehensive core extension to be more compatible with other JDBC sources.
|
||||
- **BigQuery** destination: Add dataset location.
|
||||
- **Shopify** source: Add order risks + new attributes to orders schema for native connector
|
||||
|
||||
Bugfixes:
|
||||
|
||||
- **MSSQL** destination: fixed handling of unicode symbols.
|
||||
|
||||
Connector development updates:
|
||||
|
||||
- Containerized connector code generator.
|
||||
- Added JDBC source connector bootstrap template.
|
||||
- Added Java destination generator.
|
||||
|
||||
## 06/3/2021
|
||||
|
||||
2 new sources:
|
||||
|
||||
- [**Okta**](https://docs.airbyte.com/integrations/sources/okta)
|
||||
- [**Amazon Seller Partner**](https://docs.airbyte.com/integrations/sources/amazon-seller-partner)
|
||||
|
||||
New features:
|
||||
|
||||
- **MySQL CDC** now only polls for 5 minutes if we haven't received any records \([\#3789](https://github.com/airbytehq/airbyte/pull/3789)\)
|
||||
- **Python CDK** now supports Python 3.7.X \([\#3692](https://github.com/airbytehq/airbyte/pull/3692)\)
|
||||
- **File** source: now supports Azure Blob Storage \([\#3660](https://github.com/airbytehq/airbyte/pull/3660)\)
|
||||
|
||||
Bugfixes:
|
||||
|
||||
- **Recurly** source: now uses type `number` instead of `integer` \([\#3769](https://github.com/airbytehq/airbyte/pull/3769)\)
|
||||
- **Stripe** source: fix types in schema \([\#3744](https://github.com/airbytehq/airbyte/pull/3744)\)
|
||||
- **Stripe** source: output `number` instead of `int` \([\#3728](https://github.com/airbytehq/airbyte/pull/3728)\)
|
||||
- **MSSQL** destination: fix issue with unicode symbols handling \([\#3671](https://github.com/airbytehq/airbyte/pull/3671)\)
|
||||
|
||||
## 05/25/2021
|
||||
|
||||
4 new sources:
|
||||
|
||||
- [**Asana**](https://docs.airbyte.com/integrations/sources/asana)
|
||||
- [**Klaviyo**](https://docs.airbyte.com/integrations/sources/klaviyo)
|
||||
- [**Recharge**](https://docs.airbyte.com/integrations/sources/recharge)
|
||||
- [**Tempo**](https://docs.airbyte.com/integrations/sources/tempo)
|
||||
|
||||
Progress on connectors:
|
||||
|
||||
- **CDC for MySQL** is now available!
|
||||
- **Sendgrid** source: support incremental sync, as rewritten using HTTP CDK \([\#3445](https://github.com/airbytehq/airbyte/pull/3445)\)
|
||||
- **Github** source bugfix: exception when parsing null date values, use `created_at` as cursor value for issue_milestones \([\#3314](https://github.com/airbytehq/airbyte/pull/3314)\)
|
||||
- **Slack** source bugfix: don't overwrite thread_ts in threads stream \([\#3483](https://github.com/airbytehq/airbyte/pull/3483)\)
|
||||
- **Facebook Marketing** source: allow configuring insights lookback window \([\#3396](https://github.com/airbytehq/airbyte/pull/3396)\)
|
||||
- **Freshdesk** source: fix discovery \([\#3591](https://github.com/airbytehq/airbyte/pull/3591)\)
|
||||
|
||||
## 05/18/2021
|
||||
|
||||
1 new destination: [**MSSQL**](https://docs.airbyte.com/integrations/destinations/mssql)
|
||||
|
||||
1 new source: [**ClickHouse**](https://docs.airbyte.com/integrations/sources/clickhouse)
|
||||
|
||||
Progress on connectors:
|
||||
|
||||
- **Shopify**: make this source more resilient to timeouts \([\#3409](https://github.com/airbytehq/airbyte/pull/3409)\)
|
||||
- **Freshdesk** bugfix: output correct schema for various streams \([\#3376](https://github.com/airbytehq/airbyte/pull/3376)\)
|
||||
- **Iterable**: update to use latest version of CDK \([\#3378](https://github.com/airbytehq/airbyte/pull/3378)\)
|
||||
|
||||
## 05/11/2021
|
||||
|
||||
1 new destination: [**MySQL**](https://docs.airbyte.com/integrations/destinations/mysql)
|
||||
|
||||
2 new sources:
|
||||
|
||||
- [**Google Search Console**](https://docs.airbyte.com/integrations/sources/google-search-console)
|
||||
- [**PokeAPI**](https://docs.airbyte.com/integrations/sources/pokeapi) \(talking about long tail and having fun ;\)\)
|
||||
|
||||
Progress on connectors:
|
||||
|
||||
- **Zoom**: bugfix on declaring correct types to match data coming from API \([\#3159](https://github.com/airbytehq/airbyte/pull/3159)\), thanks to [vovavovavovavova](https://github.com/vovavovavovavova)
|
||||
- **Smartsheets**: bugfix on gracefully handling empty cell values \([\#3337](https://github.com/airbytehq/airbyte/pull/3337)\), thanks to [Nathan Nowack](https://github.com/zzstoatzz)
|
||||
- **Stripe**: fix date property name, only add connected account header when set, and set primary key \(\#3210\), thanks to [Nathan Yergler](https://github.com/nyergler)
|
||||
|
||||
## 05/04/2021
|
||||
|
||||
2 new sources:
|
||||
|
||||
- [**Smartsheets**](https://docs.airbyte.com/integrations/sources/smartsheets), thanks to [Nathan Nowack](https://github.com/zzstoatzz)
|
||||
- [**Zendesk Chat**](https://docs.airbyte.com/integrations/sources/zendesk-chat)
|
||||
|
||||
Progress on connectors:
|
||||
|
||||
- **Appstore**: bugfix private key handling in the UI \([\#3201](https://github.com/airbytehq/airbyte/pull/3201)\)
|
||||
- **Facebook marketing**: Wait longer \(5 min\) for async jobs to start \([\#3116](https://github.com/airbytehq/airbyte/pull/3116)\), thanks to [Max Krog](https://github.com/MaxKrog)
|
||||
- **Stripe**: support reading data from connected accounts \(\#3121\), and 2 new streams with Refunds & Bank Accounts \([\#3030](https://github.com/airbytehq/airbyte/pull/3030)\) \([\#3086](https://github.com/airbytehq/airbyte/pull/3086)\)
|
||||
- **Redshift destination**: Ignore records that are too big \(instead of failing\) \([\#2988](https://github.com/airbytehq/airbyte/pull/2988)\)
|
||||
- **MongoDB**: add supporting TLS and Replica Sets \([\#3111](https://github.com/airbytehq/airbyte/pull/3111)\)
|
||||
- **HTTP sources**: bugfix on handling array responses gracefully \([\#3008](https://github.com/airbytehq/airbyte/pull/3008)\)
|
||||
|
||||
## 04/27/2021
|
||||
|
||||
- **Zendesk Talk**: fix normalization failure \([\#3022](https://github.com/airbytehq/airbyte/pull/3022)\), thanks to [yevhenii-ldv](https://github.com/yevhenii-ldv)
|
||||
- **Github**: pull_requests stream only incremental syncs \([\#2886](https://github.com/airbytehq/airbyte/pull/2886)\) \([\#3009](https://github.com/airbytehq/airbyte/pull/3009)\), thanks to [Zirochkaa](https://github.com/Zirochkaa)
|
||||
- Create streaming writes to a file and manage the issuance of copy commands for the destination \([\#2921](https://github.com/airbytehq/airbyte/pull/2921)\)
|
||||
- **Redshift**: make Redshift part size configurable. \([\#3053](https://github.com/airbytehq/airbyte/pull/3053)\)
|
||||
- **HubSpot**: fix argument error in log call \([\#3087](https://github.com/airbytehq/airbyte/pull/3087)\), thanks to [Nathan Yergler](https://github.com/nyergler)
|
||||
|
||||
## 04/20/2021
|
||||
|
||||
3 new source connectors!
|
||||
|
||||
- [**Zendesk Talk**](https://docs.airbyte.com/integrations/sources/zendesk-talk)
|
||||
- [**Iterable**](https://docs.airbyte.com/integrations/sources/iterable)
|
||||
- [**QuickBooks**](https://docs.airbyte.com/integrations/sources/quickbooks-singer)
|
||||
|
||||
Other progress on connectors:
|
||||
|
||||
- **Postgres source/destination**: add SSL option, thanks to [Marcos Marx](https://github.com/marcosmarxm) \([\#2757](https://github.com/airbytehq/airbyte/pull/2757)\)
|
||||
- **Google sheets bugfix**: handle duplicate sheet headers, thanks to [Aneesh Makala](https://github.com/makalaaneesh) \([\#2905](https://github.com/airbytehq/airbyte/pull/2905)\)
|
||||
- **Source Google Adwords**: support specifying the lookback window for conversions, thanks to [Harshith Mullapudi](https://github.com/harshithmullapudi) \([\#2918](https://github.com/airbytehq/airbyte/pull/2918)\)
|
||||
- **MongoDB improvement**: speed up mongodb schema discovery, thanks to [Yury Koleda](https://github.com/FUT) \([\#2851](https://github.com/airbytehq/airbyte/pull/2851)\)
|
||||
- **MySQL bugfix**: parsing Mysql jdbc params, thanks to [Vasily Safronov](https://github.com/gingeard) \([\#2891](https://github.com/airbytehq/airbyte/pull/2891)\)
|
||||
- **CSV bugfix**: discovery takes too much memory \([\#2089](https://github.com/airbytehq/airbyte/pull/2089)\)
|
||||
- A lot of work was done on improving the standard tests for the connectors, for better standardization and maintenance!
|
||||
|
||||
## 04/13/2021
|
||||
|
||||
- New connector: [**Oracle DB**](https://docs.airbyte.com/integrations/sources/oracle), thanks to [Marcos Marx](https://github.com/marcosmarxm)
|
||||
|
||||
## 04/07/2021
|
||||
|
||||
- New connector: [**Google Workspace Admin Reports**](https://docs.airbyte.com/integrations/sources/google-workspace-admin-reports) \(audit logs\)
|
||||
- Bugfix in the base python connector library that caused errors to be silently skipped rather than failing the sync
|
||||
- **Exchangeratesapi.io** bugfix: now points to the updated API URL
|
||||
- **Redshift destination** bugfix: quote keywords “DATETIME” and “TIME” when used as identifiers
|
||||
- **GitHub** bugfix: syncs no longer fail when a personal repository doesn’t have collaborators or team streams available
|
||||
- **Mixpanel** connector: sync at most the last 90 days of data in the annotations stream to adhere to API limits
|
||||
|
||||
## 03/29/2021
|
||||
|
||||
- We started measuring the throughput of connectors. This will help us improve performance for all connectors.
|
||||
- **Redshift**: implemented Copy strategy to improve its throughput.
|
||||
- **Instagram**: fixed a bug which caused the media and media_insights streams to stop syncing prematurely.
|
||||
- Support NCHAR and NVCHAR types in SQL-based database sources.
|
||||
- Add the ability to specify custom JDBC parameters for the MySQL source connector.
|
||||
|
||||
## 03/22/2021
|
||||
|
||||
- 2 new source connectors: [**Gitlab**](https://docs.airbyte.com/integrations/sources/gitlab) and [**Airbyte-native HubSpot**](https://docs.airbyte.com/integrations/sources/hubspot)
|
||||
- Developing connectors now requires almost no interaction with Gradle, Airbyte’s monorepo build tool. If you’re building a Python connector, you never have to worry about developing outside your typical flow. See [the updated documentation](https://docs.airbyte.com/connector-development).
|
||||
|
||||
## 03/15/2021
|
||||
|
||||
- 2 new source connectors: [**Instagram**](https://docs.airbyte.com/integrations/sources/instagram) and [**Google Directory**](https://docs.airbyte.com/integrations/sources/google-directory)
|
||||
- **Facebook Marketing**: support of API v10
|
||||
- **Google Analytics**: support incremental sync
|
||||
- **Jira**: bug fix to consistently pull all tickets
|
||||
- **HTTP Source**: bug fix to correctly parse JSON responses consistently
|
||||
|
||||
## 03/08/2021
|
||||
|
||||
- 1 new source connector: **MongoDB**
|
||||
- **Google Analytics**: Support chunked syncs to avoid sampling
|
||||
- **AppStore**: fix bug where the catalog was displayed incorrectly
|
||||
|
||||
## 03/01/2021
|
||||
|
||||
- **New native HubSpot connector** with schema folder populated
|
||||
- Facebook Marketing connector: add option to include deleted records
|
||||
|
||||
## 02/22/2021
|
||||
|
||||
- Bug fixes:
|
||||
- **Google Analytics:** add the ability to sync custom reports
|
||||
- **Apple Appstore:** bug fix to correctly run incremental syncs
|
||||
- **Exchange rates:** UI now correctly validates input date pattern
|
||||
- **File Source:** Support JSONL \(newline-delimited JSON\) format
|
||||
- **Freshdesk:** Enable controlling how many requests per minute the connector makes to avoid overclocking rate limits
|
||||
|
||||
## 02/15/2021
|
||||
|
||||
- 1 new destination connector: [MeiliSearch](https://docs.airbyte.com/integrations/destinations/meilisearch)
|
||||
- 2 new sources that support incremental append: [Freshdesk](https://docs.airbyte.com/integrations/sources/freshdesk) and [Sendgrid](https://docs.airbyte.com/integrations/sources/sendgrid)
|
||||
- Other fixes:
|
||||
- Thanks to [@ns-admetrics](https://github.com/ns-admetrics) for contributing an upgrade to the **Shopify** source connector which now provides the landing_site field containing UTM parameters in the Orders table.
|
||||
- **Sendgrid** source connector supports most available endpoints available in the API
|
||||
- **Facebook** Source connector now supports syncing Ad Insights data
|
||||
- **Freshdesk** source connector now supports syncing satisfaction ratings and conversations
|
||||
- **Microsoft Teams** source connector now gracefully handles rate limiting
|
||||
- Bug fix in **Slack** source where the last few records in a sync were sporadically dropped
|
||||
- Bug fix in **Google Analytics** source where the last few records in sync were sporadically dropped
|
||||
- In **Redshift source**, support non alpha-numeric table names
|
||||
- Bug fix in **Github Source** to fix instances where syncs didn’t always fail if there was an error while reading data from the API
|
||||
|
||||
## 02/02/2021
|
||||
|
||||
- Sources that we improved reliability for \(and that became “certified”\):
|
||||
- [Certified sources](https://docs.airbyte.com/integrations): Files and Shopify
|
||||
- Enhanced continuous testing for Tempo and Looker sources
|
||||
- Other fixes / features:
|
||||
- Correctly handle boolean types in the File Source
|
||||
- Add docs for [App Store](https://docs.airbyte.com/integrations/sources/appstore) source
|
||||
- Fix a bug in Snowflake destination where the connector didn’t check for all needed write permissions, causing some syncs to fail
|
||||
|
||||
## 01/26/2021
|
||||
|
||||
- Improved reliability with our best practices on: Google Sheets, Google Ads, Marketo, Tempo
|
||||
- Support incremental for Facebook and Google Ads
|
||||
- The Facebook connector now supports the FB marketing API v9
|
||||
|
||||
## 01/19/2021
|
||||
|
||||
- **Our new** [**Connector Health Grade**](../../integrations/) **page**
|
||||
- **1 new source:** App Store \(thanks to [@Muriloo](https://github.com/Muriloo)\)
|
||||
- Fixes on connectors:
|
||||
- Bug fix writing boolean columns to Redshift
|
||||
- Bug fix where getting a connector’s input configuration hung indefinitely
|
||||
- Stripe connector now gracefully handles rate limiting from the Stripe API
|
||||
|
||||
## 01/12/2021
|
||||
|
||||
- **1 new source:** Tempo \(thanks to [@thomasvl](https://github.com/thomasvl)\)
|
||||
- **Incremental support for 3 new source connectors:** [Salesforce](../../integrations/sources/salesforce.md), [Slack](../../integrations/sources/slack.md) and [Braintree](../../integrations/sources/braintree.md)
|
||||
- Fixes on connectors:
|
||||
- Fix a bug in MSSQL and Redshift source connectors where custom SQL types weren't being handled correctly.
|
||||
- Improvement of the Snowflake connector from [@hudsondba](https://github.com/hudsondba) \(batch size and timeout sync\)
|
||||
|
||||
## 01/05/2021
|
||||
|
||||
- **Incremental support for 2 new source connectors:** [Mixpanel](../../integrations/sources/mixpanel.md) and [HubSpot](../../integrations/sources/hubspot.md)
|
||||
- Fixes on connectors:
|
||||
- Fixed a bug in the github connector where the connector didn’t verify the provided API token was granted the correct permissions
|
||||
- Fixed a bug in the Google sheets connector where rate limits were not always respected
|
||||
- Alpha version of Facebook marketing API v9. This connector is a native Airbyte connector \(current is Singer based\).
|
||||
|
||||
## 12/30/2020
|
||||
|
||||
**New sources:** [Plaid](../../integrations/sources/plaid.md) \(contributed by [tgiardina](https://github.com/tgiardina)\), [Looker](../../integrations/sources/looker.md)
|
||||
|
||||
## 12/18/2020
|
||||
|
||||
**New sources:** [Drift](../../integrations/sources/drift.md), [Microsoft Teams](../../integrations/sources/microsoft-teams.md)
|
||||
|
||||
## 12/10/2020
|
||||
|
||||
**New sources:** [Intercom](../../integrations/sources/intercom.md), [Mixpanel](../../integrations/sources/mixpanel.md), [Jira Cloud](../../integrations/sources/jira.md), [Zoom](../../integrations/sources/zoom.md)
|
||||
|
||||
## 12/07/2020
|
||||
|
||||
**New sources:** [Slack](../../integrations/sources/slack.md), [Braintree](../../integrations/sources/braintree.md), [Zendesk Support](../../integrations/sources/zendesk-support.md)
|
||||
|
||||
## 12/04/2020
|
||||
|
||||
**New sources:** [Redshift](../../integrations/sources/redshift.md), [Greenhouse](../../integrations/sources/greenhouse.md) **New destination:** [Redshift](../../integrations/destinations/redshift.md)
|
||||
|
||||
## 11/30/2020
|
||||
|
||||
**New sources:** [Freshdesk](../../integrations/sources/freshdesk.md), [Twilio](../../integrations/sources/twilio.md)
|
||||
|
||||
## 11/25/2020
|
||||
|
||||
**New source:** [Recurly](../../integrations/sources/recurly.md)
|
||||
|
||||
## 11/23/2020
|
||||
|
||||
**New source:** [Sendgrid](../../integrations/sources/sendgrid.md)
|
||||
|
||||
## 11/18/2020
|
||||
|
||||
**New source:** [Mailchimp](../../integrations/sources/mailchimp.md)
|
||||
|
||||
## 11/13/2020
|
||||
|
||||
**New source:** [MSSQL](../../integrations/sources/mssql.md)
|
||||
|
||||
## 11/11/2020
|
||||
|
||||
**New source:** [Shopify](../../integrations/sources/shopify.md)
|
||||
|
||||
## 11/09/2020
|
||||
|
||||
**New sources:** [Files \(CSV, JSON, HTML...\)](../../integrations/sources/file.md)
|
||||
|
||||
## 11/04/2020
|
||||
|
||||
**New sources:** [Facebook Ads](connectors.md), [Google Ads](../../integrations/sources/google-ads.md), [Marketo](../../integrations/sources/marketo.md) **New destination:** [Snowflake](../../integrations/destinations/snowflake.md)
|
||||
|
||||
## 10/30/2020
|
||||
|
||||
**New sources:** [Salesforce](../../integrations/sources/salesforce.md), Google Analytics, [HubSpot](../../integrations/sources/hubspot.md), [GitHub](../../integrations/sources/github.md), [Google Sheets](../../integrations/sources/google-sheets.md), [Rest APIs](connectors.md), and [MySQL](../../integrations/sources/mysql.md)
|
||||
|
||||
## 10/21/2020
|
||||
|
||||
**New destinations:** we built our own connectors for [BigQuery](../../integrations/destinations/bigquery.md) and [Postgres](../../integrations/destinations/postgres.md), to ensure they are of the highest quality.
|
||||
|
||||
## 09/23/2020
|
||||
|
||||
**New sources:** [Stripe](../../integrations/sources/stripe.md), [Postgres](../../integrations/sources/postgres.md) **New destinations:** [BigQuery](../../integrations/destinations/bigquery.md), [Postgres](../../integrations/destinations/postgres.md), [local CSV](../../integrations/destinations/csv.md)
|
||||
@@ -1,509 +0,0 @@
|
||||
---
|
||||
description: Be sure to not miss out on new features and improvements!
|
||||
---
|
||||
|
||||
# Platform
|
||||
|
||||
This is the changelog for Airbyte Platform. For our connector changelog, please visit our [Connector Changelog](connectors.md) page.
|
||||
|
||||
## [20-12-2021 - 0.32.5](https://github.com/airbytehq/airbyte/releases/tag/v0.32.5-alpha)
|
||||
* Add an endpoint that specifies that feedback has been given after the first sync.
|
||||
|
||||
## [18-12-2021 - 0.32.4](https://github.com/airbytehq/airbyte/releases/tag/v0.32.4-alpha)
|
||||
* No major changes to Airbyte Core.
|
||||
|
||||
## [18-12-2021 - 0.32.3](https://github.com/airbytehq/airbyte/releases/tag/v0.32.3-alpha)
|
||||
* No major changes to Airbyte Core.
|
||||
|
||||
## [18-12-2021 - 0.32.2](https://github.com/airbytehq/airbyte/releases/tag/v0.32.2-alpha)
|
||||
* Improve error handling when additional sources/destinations cannot be read.
|
||||
* Implement connector config dependency for OAuth consent URL.
|
||||
* Treat oauthFlowInitParameters as hidden instead of removing them.
|
||||
* Stop using gentle close with heartbeat.
|
||||
|
||||
## [17-12-2021 - 0.32.1](https://github.com/airbytehq/airbyte/releases/tag/v0.32.1-alpha)
|
||||
* Added existing source and destination dropdowns to the new connection flow form.
|
||||
* Implement protocol change for OAuth outputs.
|
||||
* Enhance API for use by cloud to provide per-connector billing info.
|
||||
|
||||
## [11-12-2021 - 0.32.0](https://github.com/airbytehq/airbyte/releases/tag/v0.32.0-alpha)
|
||||
* This is a **MAJOR** version update. You need to [update to this version](../../operator-guides/upgrading-airbyte.md#mandatory-intermediate-upgrade) before updating to any version newer than `0.32.0`
|
||||
|
||||
## [11-11-2021 - 0.31.0](https://github.com/airbytehq/airbyte/releases/tag/v0.31.0-alpha)
|
||||
* No major changes to Airbyte Core.
|
||||
|
||||
## [11-11-2021 - 0.30.39](https://github.com/airbytehq/airbyte/releases/tag/v0.30.39-alpha)
|
||||
* We migrated our secret management to Google Secret Manager, allowing us to scale how many connectors we support.
|
||||
|
||||
## [11-09-2021 - 0.30.37](https://github.com/airbytehq/airbyte/releases/tag/v0.30.37-alpha)
|
||||
* No major changes to Airbyte Core.
|
||||
|
||||
## [11-09-2021 - 0.30.36](https://github.com/airbytehq/airbyte/releases/tag/v0.30.36-alpha)
|
||||
* No major changes to Airbyte Core.
|
||||
|
||||
## [11-08-2021 - 0.30.35](https://github.com/airbytehq/airbyte/releases/tag/v0.30.35-alpha)
|
||||
* No major changes to Airbyte Core.
|
||||
|
||||
## [11-06-2021 - 0.30.34](https://github.com/airbytehq/airbyte/releases/tag/v0.30.34-alpha)
|
||||
* No major changes to Airbyte Core.
|
||||
|
||||
## [11-06-2021 - 0.30.33](https://github.com/airbytehq/airbyte/releases/tag/v0.30.33-alpha)
|
||||
* No major changes to Airbyte Core.
|
||||
|
||||
## [11-05-2021 - 0.30.32](https://github.com/airbytehq/airbyte/releases/tag/v0.30.32-alpha)
|
||||
* Airbyte Server no longer crashes from having too many open files.
|
||||
|
||||
## [11-04-2021 - 0.30.31](https://github.com/airbytehq/airbyte/releases/tag/v0.30.31-alpha)
|
||||
* No major changes to Airbyte Core.
|
||||
|
||||
## [11-01-2021 - 0.30.25](https://github.com/airbytehq/airbyte/releases/tag/v0.30.25-alpha)
|
||||
* No major changes to Airbyte Core.
|
||||
|
||||
## [11-01-2021 - 0.30.24](https://github.com/airbytehq/airbyte/releases/tag/v0.30.24-alpha)
|
||||
* Incremental normalization is live. Basic normalization no longer runs on already normalized data, making it way faster and cheaper.
|
||||
|
||||
## [11-01-2021 - 0.30.23](https://github.com/airbytehq/airbyte/releases/tag/v0.30.23-alpha)
|
||||
* No major changes to Airbyte Core.
|
||||
|
||||
## [10-21-2021 - 0.30.22](https://github.com/airbytehq/airbyte/releases/tag/v0.30.22-alpha)
|
||||
* We now support experimental deployment of Airbyte on Macbooks with M1 chips!
|
||||
|
||||
:::info
|
||||
|
||||
This interim patch period mostly contained stability changes for Airbyte Cloud, so we skipped from `0.30.16` to `0.30.22`.
|
||||
|
||||
:::
|
||||
|
||||
## [10-07-2021 - 0.30.16](https://github.com/airbytehq/airbyte/releases/tag/v0.30.16-alpha)
|
||||
* On Kubernetes deployments, you can now configure the Airbyte Worker Pod's image pull policy.
|
||||
|
||||
:::info
|
||||
|
||||
This interim patch period mostly contained stability changes for Airbyte Cloud, so we skipped from `0.30.2` to `0.30.16`.
|
||||
|
||||
:::
|
||||
|
||||
## [09-30-2021 - 0.30.2](https://github.com/airbytehq/airbyte/releases/tag/v0.30.2-alpha)
|
||||
* Fixed a bug that would fail Airbyte upgrades for deployments with sync notifications.
|
||||
|
||||
## [09-24-2021 - 0.29.22](https://github.com/airbytehq/airbyte/releases/tag/v0.29.22-alpha)
|
||||
* We now have integration tests for SSH.
|
||||
|
||||
## [09-19-2021 - 0.29.21](https://github.com/airbytehq/airbyte/releases/tag/v0.29.21-alpha)
|
||||
* You can now [deploy Airbyte on Kubernetes with a Helm Chart](https://github.com/airbytehq/airbyte/pull/5891)!
|
||||
|
||||
## [09-16-2021 - 0.29.19](https://github.com/airbytehq/airbyte/releases/tag/v0.29.19-alpha)
|
||||
* Fixes a breaking bug that prevents Airbyte upgrading from older versions.
|
||||
|
||||
## [09-15-2021 - 0.29.18](https://github.com/airbytehq/airbyte/releases/tag/v0.29.18-alpha)
|
||||
* Building images is now optional in the CI build.
|
||||
|
||||
## [09-08-2021 - 0.29.17](https://github.com/airbytehq/airbyte/releases/tag/v0.29.17-alpha)
|
||||
|
||||
* You can now properly cancel deployments when deploying on K8s.
|
||||
|
||||
## [09-08-2021 - 0.29.16](https://github.com/airbytehq/airbyte/releases/tag/v0.29.16-alpha)
|
||||
|
||||
* You can now send notifications via webhook for successes and failures on Airbyte syncs.
|
||||
* Scheduling jobs and worker jobs are now separated, allowing for workers to be scaled horizontally.
|
||||
|
||||
## [09-04-2021 - 0.29.15](https://github.com/airbytehq/airbyte/releases/tag/v0.29.15-alpha)
|
||||
|
||||
* Fixed a bug that made it possible for connector definitions to be duplicated, violating uniqueness.
|
||||
|
||||
## [09-02-2021 - 0.29.14](https://github.com/airbytehq/airbyte/releases/tag/v0.29.14-alpha)
|
||||
|
||||
* Nothing of note.
|
||||
|
||||
## [08-27-2021 - 0.29.13](https://github.com/airbytehq/airbyte/releases/tag/v0.29.13-alpha)
|
||||
|
||||
* The scheduler now waits for the server before it creates any databases.
|
||||
* You can now apply tolerations for Airbyte Pods on K8s deployments.
|
||||
|
||||
## [08-23-2021 - 0.29.12](https://github.com/airbytehq/airbyte/releases/tag/v0.29.12-alpha)
|
||||
|
||||
* Syncs now have a `max_sync_timeout` that times them out after 3 days.
|
||||
* Fixed Kube deploys when logging with Minio.
|
||||
|
||||
## [08-20-2021 - 0.29.11](https://github.com/airbytehq/airbyte/releases/tag/v0.29.11-alpha)
|
||||
|
||||
* Nothing of note.
|
||||
|
||||
## [08-20-2021 - 0.29.10](https://github.com/airbytehq/airbyte/releases/tag/v0.29.10-alpha)
|
||||
|
||||
* Migration of Python connector template images to Alpine Docker images to reduce size.
|
||||
|
||||
## [08-20-2021 - 0.29.9](https://github.com/airbytehq/airbyte/releases/tag/v0.29.9-alpha)
|
||||
|
||||
* Nothing of note.
|
||||
|
||||
## [08-17-2021 - 0.29.8](https://github.com/airbytehq/airbyte/releases/tag/v0.29.8-alpha)
|
||||
|
||||
* Nothing of note.
|
||||
|
||||
## [08-14-2021 - 0.29.7](https://github.com/airbytehq/airbyte/releases/tag/v0.29.7-alpha)
|
||||
|
||||
* Re-release: Fixed errant ENV variable in `0.29.6`
|
||||
|
||||
## [08-14-2021 - 0.29.6](https://github.com/airbytehq/airbyte/releases/tag/v0.29.6-alpha)
|
||||
|
||||
* Connector pods no longer fail with edge case names for the associated Docker images.
|
||||
|
||||
## [08-14-2021 - 0.29.5](https://github.com/airbytehq/airbyte/releases/tag/v0.29.5-alpha)
|
||||
|
||||
* Nothing of note.
|
||||
|
||||
## [08-12-2021 - 0.29.4](https://github.com/airbytehq/airbyte/releases/tag/v0.29.4-alpha)
|
||||
|
||||
* Introduced implementation for date-time support in normalization.
|
||||
|
||||
## [08-9-2021 - 0.29.3](https://github.com/airbytehq/airbyte/releases/tag/v0.29.3-alpha)
|
||||
|
||||
* Importing configuration no longer removes available but unused connectors.
|
||||
|
||||
## [08-6-2021 - 0.29.2](https://github.com/airbytehq/airbyte/releases/tag/v0.29.2-alpha)
|
||||
|
||||
* Fixed nil pointer exception in version migrations.
|
||||
|
||||
## [07-29-2021 - 0.29.1](https://github.com/airbytehq/airbyte/releases/tag/v0.29.1-alpha)
|
||||
|
||||
* When migrating, types represented in the config archive need to be a subset of the types declared in the schema.
|
||||
|
||||
## [07-28-2021 - 0.29.0](https://github.com/airbytehq/airbyte/releases/tag/v0.29.0-alpha)
|
||||
|
||||
* Deprecated `DEFAULT_WORKSPACE_ID`; default workspace no longer exists by default.
|
||||
|
||||
## [07-28-2021 - 0.28.2](https://github.com/airbytehq/airbyte/releases/tag/v0.28.2-alpha)
|
||||
|
||||
* Backend now handles workspaceId for WebBackend operations.
|
||||
|
||||
## [07-26-2021 - 0.28.1](https://github.com/airbytehq/airbyte/releases/tag/v0.28.1-alpha)
|
||||
|
||||
* K8s: Overly-sensitive logs are now silenced.
|
||||
|
||||
## [07-22-2021 - 0.28.0](https://github.com/airbytehq/airbyte/releases/tag/v0.28.0-alpha)
|
||||
|
||||
* Acceptance test dependencies fixed.
|
||||
|
||||
## [07-22-2021 - 0.27.5](https://github.com/airbytehq/airbyte/releases/tag/v0.27.5-alpha)
|
||||
|
||||
* Fixed unreliable logging on Kubernetes deployments.
|
||||
* Introduced pre-commit to auto-format files on commits.
|
||||
|
||||
## [07-21-2021 - 0.27.4](https://github.com/airbytehq/airbyte/releases/tag/v0.27.4-alpha)
|
||||
|
||||
* Config persistence is now migrated to the internal Airbyte database.
|
||||
* Source connector ports now properly close when deployed on Kubernetes.
|
||||
* Missing dependencies added that allow acceptance tests to run.
|
||||
|
||||
## [07-15-2021 - 0.27.3](https://github.com/airbytehq/airbyte/releases/tag/v0.27.3-alpha)
|
||||
|
||||
* Fixed some minor API spec errors.
|
||||
|
||||
## [07-12-2021 - 0.27.2](https://github.com/airbytehq/airbyte/releases/tag/v0.27.2-alpha)
|
||||
|
||||
* GCP environment variable is now stubbed out to prevent noisy and harmless errors.
|
||||
|
||||
## [07-8-2021 - 0.27.1](https://github.com/airbytehq/airbyte/releases/tag/v0.27.1-alpha)
|
||||
|
||||
* New API endpoint: List workspaces \(see the sketch after this list\)
|
||||
* K8s: The server now waits until Temporal is ready to operate before starting up.
|
||||
* Silent source failures introduced by the last patch now throw exceptions.
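For illustration, here is a minimal sketch of calling the new workspaces endpoint from Python. The host, port, and `/api/v1/workspaces/list` path reflect a default local deployment and are assumptions; adjust them for your setup.

```python
# Minimal sketch: list workspaces via the Airbyte API.
# Assumes a default local deployment on localhost:8000 and the
# /api/v1/workspaces/list endpoint.
import requests

resp = requests.post(
    "http://localhost:8000/api/v1/workspaces/list",
    json={},  # the list endpoint takes an empty JSON body
    timeout=30,
)
resp.raise_for_status()

for workspace in resp.json().get("workspaces", []):
    print(workspace.get("workspaceId"), workspace.get("name"))
```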
|
||||
|
||||
## [07-1-2021 - 0.27.0](https://github.com/airbytehq/airbyte/releases/tag/v0.27.0-alpha)
|
||||
|
||||
* Airbyte now automatically upgrades on server startup!
|
||||
* Airbyte will check whether your `.env` Airbyte version is compatible with the Airbyte version in the database and upgrade accordingly.
|
||||
* When running Airbyte on K8s logs will automatically be stored in a Minio bucket unless configured otherwise.
|
||||
* CDC for MySQL now handles decimal types correctly.
|
||||
|
||||
## [06-21-2021 - 0.26.2](https://github.com/airbytehq/airbyte/releases/tag/v0.26.2-alpha)
|
||||
|
||||
* First-Class Kubernetes support!
|
||||
|
||||
## [06-16-2021 - 0.26.0](https://github.com/airbytehq/airbyte/releases/tag/v0.26.0-alpha)
|
||||
|
||||
* Custom dbt transformations!
|
||||
* You can now configure your destination namespace at the table level when setting up a connection!
|
||||
* Migrate basic normalization settings to the sync operations.
|
||||
|
||||
## [06-09-2021 - 0.24.8 / 0.25.0](https://github.com/airbytehq/airbyte/releases/tag/v0.24.8-alpha)
|
||||
|
||||
* Bugfix: Handle TINYINT\(1\) and BOOLEAN correctly and fix target file comparison for MySQL CDC.
|
||||
* Bugfix: Updating the source/destination name in the UI now works as intended.
|
||||
|
||||
## [06-04-2021 - 0.24.7](https://github.com/airbytehq/airbyte/releases/tag/v0.24.7-alpha)
|
||||
|
||||
* Bugfix: Ensure that logs from threads created by replication workers are added to the log file.
|
||||
|
||||
## [06-03-2021 - 0.24.5](https://github.com/airbytehq/airbyte/releases/tag/v0.24.5-alpha)
|
||||
|
||||
* Remove hash from table names when it's not necessary for normalization outputs.
|
||||
|
||||
## [06-03-2021 - 0.24.4](https://github.com/airbytehq/airbyte/releases/tag/v0.24.4-alpha)
|
||||
|
||||
* PythonCDK: change minimum Python version to 3.7.0
|
||||
|
||||
## [05-28-2021 - 0.24.3](https://github.com/airbytehq/airbyte/releases/tag/v0.24.3-alpha)
|
||||
|
||||
* Minor fixes to documentation
|
||||
* Reliability updates in preparation for custom transformations
|
||||
* Limit Docker log size to 500 MB \([\#3702](https://github.com/airbytehq/airbyte/pull/3702)\)
|
||||
|
||||
## [05-26-2021 - 0.24.2](https://github.com/airbytehq/airbyte/releases/tag/v0.24.2-alpha)
|
||||
|
||||
* Fix for file names being too long in Windows deployments \([\#3625](https://github.com/airbytehq/airbyte/pull/3625)\)
|
||||
* Allow users to access the API and WebApp from the same port \([\#3603](https://github.com/airbytehq/airbyte/pull/3603)\)
|
||||
|
||||
## [05-25-2021 - 0.24.1](https://github.com/airbytehq/airbyte/releases/tag/v0.24.1-alpha)
|
||||
|
||||
* **Checkpointing for incremental syncs** that will now continue where they left off even if they fail! \([\#3290](https://github.com/airbytehq/airbyte/pull/3290)\)
|
||||
|
||||
## [05-25-2021 - 0.24.0](https://github.com/airbytehq/airbyte/releases/tag/v0.24.0-alpha)
|
||||
|
||||
* Avoid dbt runtime exception "maximum recursion depth exceeded" in ephemeral materialization \([\#3470](https://github.com/airbytehq/airbyte/pull/3470)\)
|
||||
|
||||
## [05-18-2021 - 0.23.0](https://github.com/airbytehq/airbyte/releases/tag/v0.23.0-alpha)
|
||||
|
||||
* Documentation to deploy locally on Windows is now available \([\#3425](https://github.com/airbytehq/airbyte/pull/3425)\)
|
||||
* Connector icons are now displayed in the UI
|
||||
* Restart core containers if they fail automatically \([\#3423](https://github.com/airbytehq/airbyte/pull/3423)\)
|
||||
* Progress on supporting custom transformation using dbt. More updates on this soon!
|
||||
|
||||
## [05-11-2021 - 0.22.3](https://github.com/airbytehq/airbyte/releases/tag/v0.22.3-alpha)
|
||||
|
||||
* Bump K8s deployment version to latest stable version, thanks to [Coetzee van Staden](https://github.com/coetzeevs)
|
||||
* Added tutorial to deploy Airbyte on Azure VM \([\#3171](https://github.com/airbytehq/airbyte/pull/3171)\), thanks to [geekwhocodes](https://github.com/geekwhocodes)
|
||||
* Progress on checkpointing to support rate limits better
|
||||
* Upgrade normalization to use dbt from docker images \([\#3186](https://github.com/airbytehq/airbyte/pull/3186)\)
|
||||
|
||||
## [05-04-2021 - 0.22.2](https://github.com/airbytehq/airbyte/releases/tag/v0.22.2-alpha)
|
||||
|
||||
* Split replication and normalization into separate temporal activities \([\#3136](https://github.com/airbytehq/airbyte/pull/3136)\)
|
||||
* Fix normalization Nesting bug \([\#3110](https://github.com/airbytehq/airbyte/pull/3110)\)
|
||||
|
||||
## [04-27-2021 - 0.22.0](https://github.com/airbytehq/airbyte/releases/tag/v0.22.0-alpha)
|
||||
|
||||
* **Replace timeout for sources** \([\#3031](https://github.com/airbytehq/airbyte/pull/3031)\)
|
||||
* Fix UI issue where tables with the same name are selected together \([\#3032](https://github.com/airbytehq/airbyte/pull/3032)\)
|
||||
* Fix feed handling when feeds are unavailable \([\#2964](https://github.com/airbytehq/airbyte/pull/2964)\)
|
||||
* Export whitelisted tables \([\#3055](https://github.com/airbytehq/airbyte/pull/3055)\)
|
||||
* Create a contributor bootstrap script \(\#3028\) \([\#3054](https://github.com/airbytehq/airbyte/pull/3054)\), thanks to [nclsbayona](https://github.com/nclsbayona)
|
||||
|
||||
## [04-20-2021 - 0.21.0](https://github.com/airbytehq/airbyte/releases/tag/v0.21.0-alpha)
|
||||
|
||||
* **Namespace support**: supported source-destination pairs will now sync data into the same namespace as the source \(\#2862\)
|
||||
* Add **“Refresh Schema”** button \([\#2943](https://github.com/airbytehq/airbyte/pull/2943)\)
|
||||
* In the Settings, you can now **add a webhook to get notified when a sync fails**
|
||||
* Add destinationSyncModes to connection form
|
||||
* Add tooltips for connection status icons
|
||||
|
||||
## [04-12-2021 - 0.20.0](https://github.com/airbytehq/airbyte/releases/tag/v0.20.0-alpha)
|
||||
|
||||
* **Change Data Capture \(CDC\)** is now supported for Postgres, thanks to [@jrhizor](https://github.com/jrhizor) and [@cgardens](https://github.com/cgardens). We will now expand it to MySQL and MSSQL in the coming weeks.
|
||||
* When displaying the schema for a source, you can now search for table names, thanks to [@jamakase](https://github.com/jamakase)
|
||||
* Better feedback UX when manually triggering a sync with “Sync now”
|
||||
|
||||
## [04-07-2021 - 0.19.0](https://github.com/airbytehq/airbyte/releases/tag/v0.19.0-alpha)
|
||||
|
||||
* New **Connections** page where you can see the list of all your connections and their statuses.
|
||||
* New **Settings** page to update your preferences.
|
||||
* Bugfix where very large schemas caused schema discovery to fail.
|
||||
|
||||
## [03-29-2021 - 0.18.1](https://github.com/airbytehq/airbyte/releases/tag/v0.18.1-alpha)
|
||||
|
||||
* Surface the **health of each connection** so that a user can spot any problems at a glance.
|
||||
* Added support for deduplicating records in the destination by primary key, using the incremental dedupe sync mode.
|
||||
* A source’s extraction mode \(incremental, full refresh\) is now decoupled from the destination’s write mode -- so you can repeatedly append full refreshes to get repeated snapshots of data in your source.
|
||||
* New **Upgrade all** button in Admin to upgrade all your connectors at once
|
||||
* New **Cancel** job button in Connections Status page when a sync job is running, so you can stop never-ending processes.
|
||||
|
||||
## [03-22-2021 - 0.17.2](https://github.com/airbytehq/airbyte/releases/tag/v0.17.2-alpha)
|
||||
|
||||
* Improved the speed of get spec, check connection, and discover schema by migrating to the Temporal workflow engine.
|
||||
* Exposed cancellation for sync jobs in the API \(will be exposed in the UI in the next week!\).
|
||||
* Bug fix: Fix issue where migration app was OOMing.
|
||||
|
||||
## [03-15-2021 - 0.17.1](https://github.com/airbytehq/airbyte/releases/tag/v0.17.1-alpha)
|
||||
|
||||
* **Creating and deleting multiple workspaces** is now supported via the API. Thanks to [@Samuel Gordalina](https://github.com/gordalina) for contributing this feature!
|
||||
* Normalization now supports numeric types with precision greater than 32 bits
|
||||
* Normalization now supports union data types
|
||||
* Support longform text inputs in the UI for cases where you need to preserve formatting on connector inputs like .pem keys
|
||||
* Expose the latest available connector versions in the API
|
||||
* Airflow: published a new [tutorial](https://docs.airbyte.com/operator-guides/using-the-airflow-airbyte-operator/) for how to use the Airbyte operator. Thanks [@Marcos Marx](https://github.com/marcosmarxm) for writing the tutorial!
|
||||
* Connector Contributions: All connectors now describe how to contribute to them without having to touch Airbyte’s monorepo build system -- just work on the connector in your favorite dev setup!
|
||||
|
||||
## [03-08-2021 - 0.17](https://github.com/airbytehq/airbyte/releases/tag/v0.17.0-alpha)
|
||||
|
||||
* **Integration with Airflow** is here. Thanks to @Marcos Marx, you can now run Airbyte jobs from Airflow directly \(see the sketch after this list\). A tutorial is on the way and should be coming this week!
|
||||
* Add a prefix for tables, so that tables with the same name don't clobber each other in the destination
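As a preview of the upcoming tutorial, here is a hedged sketch of triggering an Airbyte sync from an Airflow DAG. It assumes the `apache-airflow-providers-airbyte` package is installed, an Airflow connection named `airbyte_default` pointing at your Airbyte server, and a placeholder Airbyte connection UUID.

```python
# Hedged sketch: trigger an Airbyte sync from an Airflow DAG.
# The connection UUID below is a placeholder; replace it with the ID of an
# Airbyte connection in your own workspace.
from datetime import datetime

from airflow import DAG
from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator

with DAG(
    dag_id="trigger_airbyte_sync",
    start_date=datetime(2021, 3, 8),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    sync = AirbyteTriggerSyncOperator(
        task_id="airbyte_sync",
        airbyte_conn_id="airbyte_default",  # Airflow connection to the Airbyte server
        connection_id="00000000-0000-0000-0000-000000000000",  # placeholder UUID
        asynchronous=False,  # block until the sync finishes
        timeout=3600,
    )
```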
|
||||
|
||||
## [03-01-2021 - 0.16](https://github.com/airbytehq/airbyte/milestone/22?closed=1)
|
||||
|
||||
* We made some progress to address **nested tables in our normalization.**
|
||||
|
||||
Previously, basic normalization would output nested tables as-is and append a number for duplicate tables. For example, Stripe’s nested address fields go from:
|
||||
|
||||
```text
|
||||
Address
|
||||
address_1
|
||||
```
|
||||
|
||||
To
|
||||
|
||||
```text
|
||||
Charges_source_owner_755_address
|
||||
customers_shipping_c70_address
|
||||
```
|
||||
|
||||
After the change, the parent tables are combined with the name of the nested table to show where the nested table originated. **This is a breaking change for the consumers of nested tables. Consumers will need to update to point at the new tables.**
|
||||
|
||||
## [02-19-2021 - 0.15](https://github.com/airbytehq/airbyte/milestone/22?closed=1)
|
||||
|
||||
* We now handle nested tables with the normalization steps. Check out the video below to see how it works.
|
||||
|
||||
{% embed url="https://youtu.be/I4fngMnkJzY" caption="" %}
|
||||
|
||||
## [02-12-2021 - 0.14](https://github.com/airbytehq/airbyte/milestone/21?closed=1)
|
||||
|
||||
* Front-end changes:
|
||||
* Display Airbyte's version number
|
||||
* Describe schemas using JsonSchema
|
||||
* Better feedback on buttons
|
||||
|
||||
## [Beta launch - 0.13](https://github.com/airbytehq/airbyte/milestone/15?closed=1) - Released 02/02/2021
|
||||
|
||||
* Add connector build status dashboard
|
||||
* Support Schema Changes in Sources
|
||||
* Support Import / Export of Airbyte Data in the Admin section of the UI
|
||||
* Bug fixes:
|
||||
* If Airbyte is closed during a sync the running job is not marked as failed
|
||||
* Airbyte should fail when deployment version doesn't match data version
|
||||
* Upgrade Airbyte Version without losing existing configuration / data
|
||||
|
||||
## [0.12-alpha](https://github.com/airbytehq/airbyte/milestone/14?closed=1) - Released 01/20/2021
|
||||
|
||||
* Ability to skip onboarding
|
||||
* Miscellaneous bug fixes:
|
||||
* A long discovery request causes a timeout in the UI type/bug
|
||||
* Out of Memory when replicating large table from MySQL
|
||||
|
||||
## 0.11.2-alpha - Released 01/18/2021
|
||||
|
||||
* Increase timeout for long running catalog discovery operations from 3 minutes to 30 minutes to avoid prematurely failing long-running operations
|
||||
|
||||
## 0.11.1-alpha - Released 01/17/2021
|
||||
|
||||
### Bugfixes
|
||||
|
||||
* Writing boolean columns to Redshift destination now works correctly
|
||||
|
||||
## [0.11.0-alpha](https://github.com/airbytehq/airbyte/milestone/12?closed=1) - Delivered 01/14/2021
|
||||
|
||||
### New features
|
||||
|
||||
* Allow skipping the onboarding flow in the UI
|
||||
* Add the ability to reset a connection's schema when the underlying data source schema changes
|
||||
|
||||
### Bugfixes
|
||||
|
||||
* Fix UI race condition which showed config for the wrong connector when rapidly choosing between different connectors
|
||||
* Fix a bug in MSSQL and Redshift source connectors where custom SQL types weren't being handled correctly. [Pull request](https://github.com/airbytehq/airbyte/pull/1576)
|
||||
* Support incremental sync for Salesforce, Slack, and Braintree sources
|
||||
* Gracefully handle invalid numeric values \(e.g. NaN or Infinity\) in MySQL, MSSQL, and Postgres DB sources
|
||||
* Fix flashing red sources/destinations fields after success submit
|
||||
* Fix a bug which caused getting a connector's specification to hang indefinitely if the connector docker image failed to download
|
||||
|
||||
### New connectors
|
||||
|
||||
* Tempo
|
||||
* Appstore
|
||||
|
||||
## [0.10.0](https://github.com/airbytehq/airbyte/milestone/12?closed=1) - delivered on 01/04/2021
|
||||
|
||||
* You can now **deploy Airbyte on** [**Kubernetes**](https://docs.airbyte.com/deploying-airbyte/on-kubernetes) \(alpha version\)
|
||||
* **Support incremental sync** for Mixpanel and HubSpot sources
|
||||
* **Fixes on connectors:**
|
||||
* Fixed a bug in the GitHub connector where the connector didn’t verify the provided API token was granted the correct permissions
|
||||
* Fixed a bug in the Google Sheets connector where rate limits were not always respected
|
||||
* Alpha version of Facebook marketing API v9. This connector is a native Airbyte connector \(current is Singer based\).
|
||||
* **New source:** Plaid \(contributed by [@tgiardina](https://github.com/tgiardina) - thanks Thomas!\)
|
||||
|
||||
## [0.9.0](https://github.com/airbytehq/airbyte/milestone/11?closed=1) - delivered on 12/23/2020
|
||||
|
||||
* **New chat app from the web app** so you can directly chat with the team for any issues you run into
|
||||
* **Debugging** has been made easier in the UI, with checks, discover logs, and sync download logs
|
||||
* Support for **Kubernetes locally**. GKE support will come in the next release.
|
||||
* **New source:** Looker
|
||||
|
||||
## [0.8.0](https://github.com/airbytehq/airbyte/milestone/10?closed=1) - delivered on 12/17/2020
|
||||
|
||||
* **Incremental - Append**
|
||||
* We now allow sources to replicate only new or modified data. This enables to avoid re-fetching data that you have already replicated from a source.
|
||||
* The delta from a sync will be _appended_ to the existing data in the data warehouse.
|
||||
* Here are [all the details of this feature](../../understanding-airbyte/connections/incremental-append.md).
|
||||
* It has been released for 15 connectors, including Postgres, MySQL, Intercom, Zendesk, Stripe, Twilio, Marketo, Shopify, GitHub, and all the destination connectors. We will expand it to all the connectors in the next couple of weeks.
|
||||
* **Other features:**
|
||||
* Improve interface for writing python sources \(should make writing new python sources easier and clearer\).
|
||||
* Add support for running Standard Source Tests with files \(making them easy to run for any language a source is written in\)
|
||||
* Add ability to reset data for a connection.
|
||||
* **Bug fixes:**
|
||||
* Update version of test containers we use to avoid pull issues while running tests.
|
||||
* Fix issue where jobs were not sorted by created at in connection detail view.
|
||||
* **New sources:** Intercom, Mixpanel, Jira Cloud, Zoom, Drift, Microsoft Teams
|
||||
|
||||
## [0.7.0](https://github.com/airbytehq/airbyte/milestone/8?closed=1) - delivered on 12/07/2020
|
||||
|
||||
* **New destination:** our own **Redshift** warehouse connector. You can also use this connector for Panoply.
|
||||
* **New sources**: 8 additional source connectors including Recurly, Twilio, Freshdesk, Greenhouse, Redshift \(source\), Braintree, Slack, Zendesk Support
|
||||
* Bug fixes
|
||||
|
||||
## [0.6.0](https://github.com/airbytehq/airbyte/milestone/6?closed=1) - delivered on 11/23/2020
|
||||
|
||||
* Support **multiple destinations**
|
||||
* **New source:** Sendgrid
|
||||
* Support **basic normalization**
|
||||
* Bug fixes
|
||||
|
||||
## [0.5.0](https://github.com/airbytehq/airbyte/milestone/5?closed=1) - delivered on 11/18/2020
|
||||
|
||||
* **New sources:** 10 additional source connectors, including Files \(CSV, HTML, JSON...\), Shopify, MSSQL, Mailchimp
|
||||
|
||||
## [0.4.0](https://github.com/airbytehq/airbyte/milestone/4?closed=1) - delivered on 11/04/2020
|
||||
|
||||
Here is what we are working on right now:
|
||||
|
||||
* **New destination**: our own **Snowflake** warehouse connector
|
||||
* **New sources:** Facebook Ads, Google Ads.
|
||||
|
||||
## [0.3.0](https://github.com/airbytehq/airbyte/milestone/3?closed=1) - delivered on 10/30/2020
|
||||
|
||||
* **New sources:** Salesforce, GitHub, Google Sheets, Google Analytics, HubSpot, Rest APIs, and MySQL
|
||||
* Integration test suite for sources
|
||||
* Improve build speed
|
||||
|
||||
## [0.2.0](https://github.com/airbytehq/airbyte/milestone/2?closed=1) - delivered on 10/21/2020
|
||||
|
||||
* **a new Admin section** to enable users to add their own connectors, in addition to upgrading the ones they currently use
|
||||
* improve the developer experience \(DX\) for **contributing new connectors** with additional documentation and a connector protocol
|
||||
* our own **BigQuery** warehouse connector
|
||||
* our own **Postgres** warehouse connector
|
||||
* simplify the process of supporting new Singer taps, ideally make it a 1-day process
|
||||
|
||||
## [0.1.0](https://github.com/airbytehq/airbyte/milestone/1?closed=1) - delivered on 09/23/2020
|
||||
|
||||
This is our very first release after 2 months of work.
|
||||
|
||||
* **New sources:** Stripe, Postgres
|
||||
* **New destinations:** BigQuery, Postgres
|
||||
* **Only one destination**: we only support one destination in that 1st release, but you will soon be able to add as many as you need.
|
||||
* **Logs & monitoring**: you can now see your detailed logs
|
||||
* **Scheduler:** you now have 10 different frequency options for your recurring syncs
|
||||
* **Deployment:** you can now deploy Airbyte via a simple Docker image, or directly on AWS and GCP
|
||||
* **New website**: this is the day we launch our website - airbyte.io. Let us know what you think
|
||||
* **New documentation:** this is the 1st day for our documentation too
|
||||
* **New blog:** we published a few articles on our startup journey, but also about our vision of making data integrations a commodity.
|
||||
|
||||
Stay tuned, we will have new sources and destinations very soon! Don't hesitate to subscribe to our [newsletter](https://airbyte.io/#subscribe-newsletter) to receive our product updates and community news.
|
||||
|
||||
@@ -1,2 +0,0 @@
|
||||
# Example Use Cases
|
||||
|
||||
@@ -1,424 +0,0 @@
|
||||
---
|
||||
description: Using Airbyte and Apache Superset
|
||||
---
|
||||
|
||||
# Build a Slack Activity Dashboard
|
||||
|
||||

|
||||
|
||||
This article will show how to use [Airbyte](http://airbyte.com) - an open-source data integration platform - and [Apache Superset](https://superset.apache.org/) - an open-source data exploration platform - in order to build a Slack activity dashboard showing:
|
||||
|
||||
* Total number of members of a Slack workspace
|
||||
* The evolution of the number of Slack workspace members
|
||||
* Evolution of weekly messages
|
||||
* Evolution of messages per channel
|
||||
* Members per time zone
|
||||
|
||||
Before we get started, let’s take a high-level look at how we are going to achieve creating a Slack dashboard using Airbyte and Apache Superset.
|
||||
|
||||
1. We will use Airbyte’s Slack connector to get the data off a Slack workspace \(we will be using Airbyte’s own Slack workspace for this tutorial\).
|
||||
2. We will save the data onto a PostgreSQL database.
|
||||
3. Finally, using Apache Superset, we will implement the various metrics we care about.
|
||||
|
||||
Got it? Now let’s get started.
|
||||
|
||||
## 1. Replicating Data from Slack to Postgres with Airbyte
|
||||
|
||||
### a. Deploying Airbyte
|
||||
|
||||
There are several easy ways to deploy Airbyte, as listed [here](https://docs.airbyte.com/). For this tutorial, I will just use the [Docker Compose method](https://docs.airbyte.com/deploying-airbyte/local-deployment) from my workstation:
|
||||
|
||||
```text
|
||||
# In your workstation terminal
|
||||
git clone https://github.com/airbytehq/airbyte.git
|
||||
cd airbyte
|
||||
docker-compose up
|
||||
```
|
||||
|
||||
The above command will make the Airbyte app available on `localhost:8000`. Visit the URL on your favorite browser, and you should see Airbyte’s dashboard \(if this is your first time, you will be prompted to enter your email to get started\).
|
||||
|
||||
If you haven’t set Docker up, follow the [instructions here](https://docs.docker.com/desktop/) to set it up on your machine.
|
||||
|
||||
### b. Setting Up Airbyte’s Slack Source Connector
|
||||
|
||||
Airbyte’s Slack connector will give us access to the data. So, we are going to kick things off by setting this connector to be our data source in Airbyte’s web app. I am assuming you already have Airbyte and Docker set up on your local machine. We will be using Docker to create our PostgreSQL database container later on.
|
||||
|
||||
Now, let’s proceed. If you already went through the onboarding, click on the “new source” button at the top right of the Sources section. If you're going through the onboarding, then follow the instructions.
|
||||
|
||||
You will be requested to enter a name for the source you are about to create. You can call it “slack-source”. Then, in the Source Type combo box, look for “Slack,” and then select it. Airbyte will then present the configuration fields needed for the Slack connector. So you should be seeing something like this on the Airbyte App:
|
||||
|
||||

|
||||
|
||||
The first thing you will notice is that this connector requires a Slack token. So, we have to obtain one. If you are not a workspace admin, you will need to ask for permission.
|
||||
|
||||
Let’s walk through how we would get the Slack token we need.
|
||||
|
||||
Assuming you are a workspace admin, open the Slack workspace and navigate to \[Workspace Name\] > Administration > Customize \[Workspace Name\]. In our case, it will be Airbyte > Administration > Customize Airbyte \(as shown below\):
|
||||
|
||||

|
||||
|
||||
In the new page that opens up in your browser, you will then need to navigate to **Configure apps**.
|
||||
|
||||

|
||||
|
||||
In the new window that opens up, click on **Build** in the top right corner.
|
||||
|
||||

|
||||
|
||||
Click on the **Create an App** button.
|
||||
|
||||

|
||||
|
||||
In the modal form that follows, give your app a name - you can name it `airbyte_superset` - then select your workspace from the Development Slack Workspace dropdown.
|
||||
|
||||

|
||||
|
||||
Next, click on the **Create App** button. You will then be presented with a screen where we are going to set permissions for our `airbyte_superset` app, by clicking on the **Permissions** button on this page.
|
||||
|
||||

|
||||
|
||||
In the next screen, navigate to the scope section. Then, click on the **Add an OAuth Scope** button. This will allow you to add permission scopes for your app. At a minimum, your app should have the following permission scopes:
|
||||
|
||||

|
||||
|
||||
Then, we are going to add our created app to the workspace by clicking the **Install to Workspace** button.
|
||||
|
||||

|
||||
|
||||
Slack will prompt you that your app is requesting permission to access your workspace of choice. Click Allow.
|
||||
|
||||

|
||||
|
||||
After the app has been successfully installed, you will be navigated to Slack’s dashboard, where you will see the Bot User OAuth Access Token.
|
||||
|
||||
This is the token you will provide back on the Airbyte page where we left off to obtain this token. So make sure to copy it and keep it in a safe place.
|
||||
|
||||
Now that we are done with obtaining a Slack token, let’s go back to the Airbyte page we left off at and add the token there.
|
||||
|
||||
We will also need to provide Airbyte with a `start_date`. This is the date from which we want Airbyte to start replicating data from the Slack API, and we define it in the format: `YYYY-MM-DDT00:00:00Z`.
|
||||
|
||||
We will specify ours as `2020-09-01T00:00:00Z`. We will also tell Airbyte to exclude archived channels and not include private channels, and also to join public channels, so the latter part of the form should look like this:
|
||||
|
||||

|
||||
|
||||
Finally, click on the **Set up source** button for Airbyte to set the Slack source up.
|
||||
|
||||
If the source was set up correctly, you will be taken to the destination section of Airbyte’s dashboard, where you will tell Airbyte where to store the replicated data.
|
||||
|
||||
### c. Setting Up Airbyte’s Postgres Destination Connector
|
||||
|
||||
For our use case, we will be using PostgreSQL as the destination.
|
||||
|
||||
Click the **add destination** button in the top right corner, then click on **add a new destination**.
|
||||
|
||||

|
||||
|
||||
In the next screen, Airbyte will validate the source, and then present you with a form to give your destination a name. We’ll call this destination slack-destination. Then, we will select the Postgres destination type. Your screen should look like this now:
|
||||
|
||||

|
||||
|
||||
Great! We have a form to enter Postgres connection credentials, but we haven’t set up a Postgres database. Let’s do that!
|
||||
|
||||
Since we already have Docker installed, we can spin off a Postgres container with the following command in our terminal:
|
||||
|
||||
```text
|
||||
docker run --rm --name slack-db -e POSTGRES_PASSWORD=password -p 2000:5432 -d postgres
|
||||
```
|
||||
|
||||
\(Note that the Docker compose file for Superset ships with a Postgres database, as you can see [here](https://github.com/apache/superset/blob/master/docker-compose.yml#L40)\).
|
||||
|
||||
The above command will do the following:
|
||||
|
||||
* create a Postgres container with the name slack-db,
* set the password to password,
* expose the container’s port 5432 as our machine’s port 2000,
* create a database and a user, both called postgres \(you can verify all of this with the optional check below\).
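As an optional sanity check \(assuming the container name and credentials from the command above\), you can list the databases inside the container to make sure Postgres is up before filling in the Airbyte form:

```text
docker exec slack-db psql -U postgres -c "\l"
```

If the output includes a `postgres` database, the container is ready.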
|
||||
|
||||
With this, we can go back to the Airbyte screen and supply the information needed. Your form should look like this:
|
||||
|
||||

|
||||
|
||||
Then click on the **Set up destination** button.
|
||||
|
||||
### d. Setting Up the Replication
|
||||
|
||||
You should now see the following screen:
|
||||
|
||||

|
||||
|
||||
Airbyte will then fetch the schema for the data coming from the Slack API for your workspace. You should leave all boxes checked and then choose the sync frequency - this is the interval in which Airbyte will sync the data coming from your workspace. Let’s set the sync interval to every 24 hours.
|
||||
|
||||
Then click on the **Set up connection** button.
|
||||
|
||||
Airbyte will now take you to the destination dashboard, where you will see the destination you just set up. Click on it to see more details about this destination.
|
||||
|
||||

|
||||
|
||||
You will see Airbyte running the very first sync. Depending on the size of the data Airbyte is replicating, it might take a while before syncing is complete.
|
||||
|
||||

|
||||
|
||||
When it’s done, you will see the **Running status** change to **Succeeded**, and the size of the data Airbyte replicated as well as the number of records being stored on the Postgres database.
|
||||
|
||||

|
||||
|
||||
To test if the sync worked, run the following in your terminal:
|
||||
|
||||
```text
|
||||
docker exec slack-db psql -U postgres -c "SELECT * FROM public.users;"
|
||||
```
|
||||
|
||||
This should output the rows in the users table.
|
||||
|
||||
To get the row count of the users table as well, you can run:
|
||||
|
||||
```text
|
||||
docker exec slack-db psql -U postgres -c "SELECT count(*) FROM public.users;"
|
||||
```
|
||||
|
||||
Now that we have the data from the Slack workspace in our Postgres destination, we will head on to creating the Slack dashboard with Apache Superset.
|
||||
|
||||
## 2. Setting Up Apache Superset for the Dashboards
|
||||
|
||||
### a. Installing Apache Superset
|
||||
|
||||
Apache Superset, or simply Superset, is a modern data exploration and visualization platform. To get started using it, we will be cloning the Superset repo. Navigate to the directory where you want to clone the Superset repo and run:
|
||||
|
||||
```text
|
||||
git clone https://github.com/apache/superset.git
|
||||
```
|
||||
|
||||
It’s recommended to check out Superset’s `latest` branch, so run:
|
||||
|
||||
```text
|
||||
cd superset
|
||||
```
|
||||
|
||||
And then run:
|
||||
|
||||
```text
|
||||
git checkout latest
|
||||
```
|
||||
|
||||
Superset needs you to install and build its frontend dependencies and assets. In recent versions of the repo, the frontend code lives in the `superset-frontend` directory, so we will move into it and install the frontend dependencies:

```text
cd superset-frontend
npm install
```
|
||||
|
||||
Note: The above commands assume you have both Node and NPM installed on your machine.
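If you are not sure whether they are installed, a quick check from the same terminal will tell you \(any reasonably recent versions should work for this tutorial\):

```text
node -v
npm -v
```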
|
||||
|
||||
Finally, for the frontend, we will build the assets by running:
|
||||
|
||||
```text
|
||||
npm run build
|
||||
```
|
||||
|
||||
After that, go back up one directory into the Superset directory by running:
|
||||
|
||||
```text
|
||||
cd ..
|
||||
```
|
||||
|
||||
Then run:
|
||||
|
||||
```text
|
||||
docker-compose up
|
||||
```
|
||||
|
||||
This will download the Docker images Superset needs, build the containers, and start the services required to run Superset locally on your machine.
|
||||
|
||||
Once that’s done, you should be able to access Superset on your browser by visiting [`http://localhost:8088`](http://localhost:8088), and you should be presented with the Superset login screen.
|
||||
|
||||
Enter the username **admin** and password **admin** to be taken to your Superset dashboard.
|
||||
|
||||
Great! You’ve got Superset set up. Now let’s tell Superset about our Postgres Database holding the Slack data from Airbyte.
|
||||
|
||||
### b. Setting Up a Postgres Database in Superset
|
||||
|
||||
To do this, on the top menu in your Superset dashboard, hover on the Data dropdown and click on **Databases**.
|
||||
|
||||

|
||||
|
||||
In the page that opens up, click on the **+ Database** button in the top right corner.
|
||||
|
||||

|
||||
|
||||
Then, you will be presented with a modal to add your Database Name and the connection URI.
|
||||
|
||||

|
||||
|
||||
Let’s call our Database `slack_db`, and then add the following URI as the connection URI:
|
||||
|
||||
```text
|
||||
postgresql://postgres:password@docker.for.mac.localhost:2000/postgres
|
||||
```
|
||||
|
||||
If you are on a Windows Machine, yours will be:
|
||||
|
||||
```text
|
||||
postgresql://postgres:password@docker.for.win.localhost:2000/postgres
|
||||
```
|
||||
|
||||
Note: We are using `docker.for.[mac|win].localhost` in order to access the localhost of your machine, because using just localhost will point to the Docker container network and not your machine’s network.
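If you have the `psql` client installed on your workstation, you can optionally verify the same credentials before saving the connection in Superset. Note that from your own machine \(outside the Docker network\) plain `localhost` works here, because port 2000 is mapped to the container:

```text
psql "postgresql://postgres:password@localhost:2000/postgres" -c "\dt"
```

If this lists the replicated Slack tables, the connection details are correct.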
|
||||
|
||||
Your Superset UI should look like this:
|
||||
|
||||

|
||||
|
||||
We will need to enable some settings on this connection. Click on the **SQL LAB SETTINGS** and check the following boxes:
|
||||
|
||||

|
||||
|
||||
Afterwards, click on the **ADD** button, and you will see your database on the data page of Superset.
|
||||
|
||||

|
||||
|
||||
### c. Importing our dataset
|
||||
|
||||
Now that you’ve added the database, you will need to hover over the data menu again; now click on **Datasets**.
|
||||
|
||||

|
||||
|
||||
Then, you will be taken to the datasets page:
|
||||
|
||||

|
||||
|
||||
We want to only see the datasets that are in our `slack_db` database, so in the Database filter that currently shows All, select `slack_db`, and you will see that we don’t have any datasets at the moment.
|
||||
|
||||

|
||||
|
||||

|
||||
|
||||
You can fix this by clicking on the **+ DATASET** button and adding the following datasets.
|
||||
|
||||
Note: Make sure you select the public schema under the Schema dropdown.
|
||||
|
||||

|
||||
|
||||
Now that we have set up Superset and given it our Slack data, let’s proceed to creating the visualizations we need.
|
||||
|
||||
Still remember them? Here they are again:
|
||||
|
||||
* Total number of members of a Slack workspace
|
||||
* The evolution of the number of Slack workspace members
|
||||
* Evolution of weekly messages
|
||||
* Evolution of weekly threads created
|
||||
* Evolution of messages per channel
|
||||
* Members per time zone
|
||||
|
||||
## 3. Creating Our Dashboards with Superset
|
||||
|
||||
### a. Total number of members of a Slack workspace
|
||||
|
||||
To get this, we will first click on the users’ dataset of our `slack_db` on the Superset dashboard.
|
||||
|
||||

|
||||
|
||||
Next, change **untitled** at the top to **Number of Members**.
|
||||
|
||||

|
||||
|
||||
Now change the **Visualization Type** to **Big Number,** remove the **Time Range** filter, and add a Subheader named “Slack Members.” So your UI should look like this:
|
||||
|
||||

|
||||
|
||||
Then, click on the **RUN QUERY** button, and you should now see the total number of members.
|
||||
|
||||
Pretty cool, right? Now let’s save this chart by clicking on the **SAVE** button.
|
||||
|
||||

|
||||
|
||||
Then, in the **ADD TO DASHBOARD** section, type in “Slack Dashboard”, click on the “Create Slack Dashboard” button, and then click the **Save** button.
|
||||
|
||||
Great! We have successfully created our first Chart, and we also created the Dashboard. Subsequently, we will be following this flow to add the other charts to the created Slack Dashboard.
|
||||
|
||||
### b. Casting the ts column
|
||||
|
||||
Before we proceed with the rest of the charts for our dashboard, if you inspect the **ts** column on either the **messages** table or the **threads** table, you will see it’s of the type `VARCHAR`. We can’t really use this for our charts, so we have to cast both the **messages** and **threads**’ **ts** column as `TIMESTAMP`. Then, we can create our charts from the results of those queries. Let’s do this.
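If you'd like to preview what the cast produces before touching the charts, you can run a quick query against the database directly \(this assumes the Slack data landed in the `public` schema of the slack-db container we created earlier\):

```text
docker exec slack-db psql -U postgres -c "SELECT CAST(ts AS TIMESTAMP) AS ts, text FROM public.messages LIMIT 5;"
```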
|
||||
|
||||
First, navigate to the **Data** menu, and click on the **Datasets** link. In the list of datasets, click the **Edit** button for the **messages** table.
|
||||
|
||||

|
||||
|
||||
You’re now in the Edit Dataset view. Click the **Lock** button to enable editing of the dataset. Then, navigate to the **Columns** tab, expand the **ts** dropdown, and then tick the **Is Temporal** box.
|
||||
|
||||

|
||||
|
||||
Persist the changes by clicking the Save button.
|
||||
|
||||
### c. The evolution of the number of Slack workspace members
|
||||
|
||||
In the exploration page, let’s first get the chart showing the evolution of the number of Slack members. To do this, make your settings on this page match the screenshot below:
|
||||
|
||||

|
||||
|
||||
Save this chart onto the Slack Dashboard.
|
||||
|
||||
### d. Evolution of weekly messages posted
|
||||
|
||||
Now, we will look at the evolution of weekly messages posted. Let’s configure the chart settings on the same page as the previous one.
|
||||
|
||||

|
||||
|
||||
Remember, your visualization will differ based on the data you have.
|
||||
|
||||
### e. Evolution of weekly threads created
|
||||
|
||||
Now, we are finished with creating the message chart. Let's go over to the thread chart. You will recall that we will need to cast the **ts** column as stated earlier. So, do that and get to the exploration page, and make it match the screenshot below to achieve the required visualization:
|
||||
|
||||

|
||||
|
||||
### f. Evolution of messages per channel
|
||||
|
||||
For this visualization, we will need a more complex SQL query. Here’s the query we used \(as you can see in the screenshot below\):
|
||||
|
||||
```text
|
||||
SELECT CAST(m.ts as TIMESTAMP), c.name, m.text
|
||||
FROM public.messages m
|
||||
INNER JOIN public.channels c
|
||||
ON m.channel_id = c.id
|
||||
```
|
||||
|
||||

|
||||
|
||||
Next, click on **EXPLORE** to be taken to the exploration page; make it match the screenshot below:
|
||||
|
||||

|
||||
|
||||
Save this chart to the dashboard.
|
||||
|
||||
### g. Members per time zone
|
||||
|
||||
Finally, we will be visualizing members per time zone. To do this, instead of casting in the SQL lab as we’ve previously done, we will explore another method to achieve casting by using Superset’s Virtual calculated column feature. This feature allows us to write SQL queries that customize the appearance and behavior of a specific column.
|
||||
|
||||
For our use case, we will need the updated column of the users table to be a `TIMESTAMP`, in order to perform the visualization we need for Members per time zone. Let’s start by clicking the edit icon on the users table in Superset.
|
||||
|
||||

|
||||
|
||||
You will be presented with a modal like so:
|
||||
|
||||

|
||||
|
||||
Click on the **CALCULATED COLUMNS** tab:
|
||||
|
||||

|
||||
|
||||
Then, click on the **+ ADD ITEM** button, and make your settings match the screenshot below.
|
||||
|
||||

|
||||
|
||||
Then, go to the **exploration** page and make it match the settings below:
|
||||
|
||||

|
||||
|
||||
Now save this last chart, and head over to your Slack Dashboard. It should look like this:
|
||||
|
||||

|
||||
|
||||
Of course, you can edit how the dashboard looks to fit what you want on it.
|
||||
|
||||
## Conclusion
|
||||
|
||||
In this article, we looked at using Airbyte’s Slack connector to get the data from a Slack workspace into a Postgres database, and then used Apache Superset to craft a dashboard of visualizations. If you have any questions about Airbyte, don’t hesitate to ask them on our [Slack](https://slack.airbyte.io)! If you have questions about Superset, you can join the [Superset Community Slack](https://superset.apache.org/community/)!
|
||||
|
||||
@@ -1,116 +0,0 @@
|
||||
---
|
||||
description: Start syncing data in minutes with Airbyte
|
||||
---
|
||||
|
||||
# Postgres Replication
|
||||
|
||||
Let's see how you can spin up a local instance of Airbyte and sync data from one Postgres database to another.
|
||||
|
||||
Here's a 6-minute video showing you how you can do it.
|
||||
|
||||
{% embed url="https://www.youtube.com/watch?v=Rcpt5SVsMpk" caption="" %}
|
||||
|
||||
First of all, make sure you have Docker and Docker Compose installed. If this isn't the case, follow the [guide](../../deploying-airbyte/local-deployment.md) for the recommended approach to install Docker.
|
||||
|
||||
Once Docker is installed successfully, run the following commands:
|
||||
|
||||
```text
|
||||
git clone https://github.com/airbytehq/airbyte.git
|
||||
cd airbyte
|
||||
docker-compose up
|
||||
```
|
||||
|
||||
Once you see an Airbyte banner, the UI is ready to go at [http://localhost:8000/](http://localhost:8000/).
|
||||
|
||||
## 1. Set up your preferences
|
||||
|
||||
You should see an onboarding page. Enter your email and continue.
|
||||
|
||||

|
||||
|
||||
## 2. Set up your first connection
|
||||
|
||||
We support a growing [list of source connectors](https://docs.airbyte.com/category/sources). For now, we will start out with a Postgres source and destination.
|
||||
|
||||
**If you don't have a readily available Postgres database to sync, here are some quick instructions:**
|
||||
Run the following commands in a new terminal window to start backgrounded source and destination databases:
|
||||
|
||||
```text
|
||||
docker run --rm --name airbyte-source -e POSTGRES_PASSWORD=password -p 2000:5432 -d postgres
|
||||
docker run --rm --name airbyte-destination -e POSTGRES_PASSWORD=password -p 3000:5432 -d postgres
|
||||
```
|
||||
|
||||
Add a table with a few rows to the source database:
|
||||
|
||||
```text
|
||||
docker exec -it airbyte-source psql -U postgres -c "CREATE TABLE users(id SERIAL PRIMARY KEY, col1 VARCHAR(200));"
|
||||
docker exec -it airbyte-source psql -U postgres -c "INSERT INTO public.users(col1) VALUES('record1');"
|
||||
docker exec -it airbyte-source psql -U postgres -c "INSERT INTO public.users(col1) VALUES('record2');"
|
||||
docker exec -it airbyte-source psql -U postgres -c "INSERT INTO public.users(col1) VALUES('record3');"
|
||||
```
|
||||
|
||||
You now have a Postgres database ready to be replicated!
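As an optional check, you can list the rows you just inserted to confirm what Airbyte will see on the source side:

```text
docker exec airbyte-source psql -U postgres -c "SELECT * FROM public.users;"
```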
|
||||
|
||||
### **Connect the Postgres database**
|
||||
|
||||
In the UI, you will see a wizard that allows you to choose the data you want to send through Airbyte.
|
||||
|
||||

|
||||
|
||||
Use `airbyte-source` for the name and `Postgres` as the type. If you used our instructions to create a Postgres database, fill in the configuration fields as follows:
|
||||
|
||||
```text
|
||||
Host: localhost
|
||||
Port: 2000
|
||||
User: postgres
|
||||
Password: password
|
||||
DB Name: postgres
|
||||
```
|
||||
|
||||
Click on `Set Up Source` and the wizard should move on to allow you to configure a destination.
|
||||
|
||||
We support a growing list of data warehouses, lakes and databases. For now, use the name `airbyte-destination`, and configure the destination Postgres database:
|
||||
|
||||
```text
|
||||
Host: localhost
|
||||
Port: 3000
|
||||
User: postgres
|
||||
Password: password
|
||||
DB Name: postgres
|
||||
```
|
||||
|
||||
After adding the destination, you can choose what tables and columns you want to sync.
|
||||
|
||||

|
||||
|
||||
For this demo, we recommend leaving the defaults and selecting "Every 5 Minutes" as the frequency. Click `Set Up Connection` to finish setting up the sync.
|
||||
|
||||
## 3. Check the logs of your first sync
|
||||
|
||||
You should now see a list of sources with the source you just added. Click on it to find more information about your connection. This is the page where you can update any settings about this source and how it syncs. There should be a `Completed` job under the history section. If you click on that run, it will show logs from that run.
|
||||
|
||||

|
||||
|
||||
One of the biggest problems we've seen in tools like Fivetran is the lack of visibility when debugging. In Airbyte, allowing full log access and the ability to debug and fix connector problems is one of our highest priorities. We'll be working hard to make these logs accessible and understandable.
|
||||
|
||||
## 4. Check if the syncing actually worked
|
||||
|
||||
Now let's verify that this worked. Let's output the contents of the destination db:
|
||||
|
||||
```text
|
||||
docker exec airbyte-destination psql -U postgres -c "SELECT * FROM public.users;"
|
||||
```
|
||||
|
||||
:::info
|
||||
|
||||
Don't worry about the awkward `public_users` name for now; we are currently working on an update to allow users to configure their destination table names!
|
||||
|
||||
:::
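If you're curious about the exact table names Airbyte created in the destination, you can list them; the names you see may differ slightly depending on your Airbyte version:

```text
docker exec airbyte-destination psql -U postgres -c "\dt public.*"
```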
|
||||
|
||||
You should see the rows from the source database inside the destination database!
|
||||
|
||||
And there you have it. You've taken data from one database and replicated it to another. All of the actual configuration for this replication only took place in the UI.
|
||||
|
||||
That's it! This is just the beginning of Airbyte. If you have any questions at all, please reach out to us on [Slack](https://slack.airbyte.io/). We’re still in alpha, so if you see any rough edges or want to request a connector you need, please create an issue on our [Github](https://github.com/airbytehq/airbyte) or leave a thumbs up on an existing issue.
|
||||
|
||||
Thank you and we hope you enjoy using Airbyte.
|
||||
@@ -1,109 +0,0 @@
|
||||
---
|
||||
description: Using Airbyte and MeiliSearch
|
||||
---
|
||||
|
||||
# Save and Search Through Your Slack History on a Free Slack Plan
|
||||
|
||||

|
||||
|
||||
The [Slack free tier](https://slack.com/pricing/paid-vs-free) saves only the last 10K messages. For social Slack instances, it may be impractical to upgrade to a paid plan to retain these messages. Similarly, for an open-source project like [Airbyte](../../understanding-airbyte/airbyte-protocol.md#catalog) where we interact with our community through a public Slack instance, the cost of paying for a seat for every Slack member is prohibitive.
|
||||
|
||||
However, searching through old messages can be really helpful. Losing that history feels like some advanced form of memory loss. What was that joke about Java 8 Streams? This contributor question sounds familiar—haven't we seen it before? But you just can't remember!
|
||||
|
||||
This tutorial will show you how you can, for free, use Airbyte to save these messages \(even after Slack removes access to them\). It will also provide you a convenient way to search through them.
|
||||
|
||||
Specifically, we will export messages from your Slack instance into an open-source search engine called [MeiliSearch](https://github.com/meilisearch/meilisearch). We will be focusing on getting this setup running from your local workstation. We will mention at the end how you can set up a more productionized version of this pipeline.
|
||||
|
||||
We want to make this process easy, so while we will link to some external documentation for further exploration, we will provide all the instructions you need here to get this up and running.
|
||||
|
||||
## 1. Set Up MeiliSearch
|
||||
|
||||
First, let's get MeiliSearch running on our workstation. MeiliSearch has extensive docs for [getting started](https://docs.meilisearch.com/reference/features/installation.html#download-and-launch). For this tutorial, however, we will give you all the instructions you need to set up MeiliSearch using Docker.
|
||||
|
||||
```text
|
||||
docker run -it --rm \
|
||||
-p 7700:7700 \
|
||||
-v $(pwd)/data.ms:/data.ms \
|
||||
getmeili/meilisearch
|
||||
```
|
||||
|
||||
That's it!
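To confirm MeiliSearch is up before moving on, you can hit its health endpoint from another terminal; it should respond with a success status \(older versions simply return an empty 204 response\):

```bash
curl -i http://localhost:7700/health
```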
|
||||
|
||||
:::info
|
||||
|
||||
MeiliSearch stores data in $\(pwd\)/data.ms, so if you prefer to store it somewhere else, just adjust this path.
|
||||
|
||||
:::
|
||||
|
||||
## 2. Replicate Your Slack Messages to MeiliSearch
|
||||
|
||||
### a. Set Up Airbyte
|
||||
|
||||
Make sure you have Docker and Docker Compose installed. If you haven’t set Docker up, follow the [instructions here](https://docs.docker.com/desktop/) to set it up on your machine. Then, run the following commands:
|
||||
|
||||
```bash
|
||||
git clone https://github.com/airbytehq/airbyte.git
|
||||
cd airbyte
|
||||
docker-compose up
|
||||
```
|
||||
|
||||
If you run into any problems, feel free to check out our more extensive [Getting Started FAQ](https://discuss.airbyte.io/c/faq/15) for help.
|
||||
|
||||
Once you see an Airbyte banner, the UI is ready to go at [http://localhost:8000/](http://localhost:8000/). Once you have set your user preferences, you will be brought to a page that asks you to set up a source. In the next step, we'll go over how to do that.
|
||||
|
||||
### b. Set Up Airbyte’s Slack Source Connector
|
||||
|
||||
In the Airbyte UI, select Slack from the dropdown. We provide step-by-step instructions for setting up the Slack source in Airbyte [here](https://docs.airbyte.com/integrations/sources/slack#setup-guide). These will walk you through how to complete the form on this page.
|
||||
|
||||

|
||||
|
||||
By the end of these instructions, you should have created a Slack source in the Airbyte UI. For now, just add your Slack app to a single public channel \(you can add it to more channels later\). Only messages from that channel will be replicated.
|
||||
|
||||
The Airbyte app will now prompt you to set up a destination. Next, we will walk through how to set up MeiliSearch.
|
||||
|
||||
### c. Set Up Airbyte’s MeiliSearch Destination Connector
|
||||
|
||||
Head back to the Airbyte UI. It should still be prompting you to set up a destination. Select "MeiliSearch" from the dropdown. For the `host` field, set: `http://localhost:7700`. The `api_key` can be left blank.
|
||||
|
||||
### d. Set Up the Replication
|
||||
|
||||
On the next page, you will be asked to select which streams of data you'd like to replicate. We recommend unchecking "files" and "remote files" since you won't really be able to search them easily in this search engine.
|
||||
|
||||

|
||||
|
||||
For frequency, we recommend every 24 hours.
|
||||
|
||||
## 3. Search MeiliSearch
|
||||
|
||||
After the connection has been saved, Airbyte should start replicating the data immediately. When it completes you should see the following:
|
||||
|
||||

|
||||
|
||||
When the sync is done, you can sanity check that this is all working by making a search request to MeiliSearch. Replication can take several minutes depending on the size of your Slack instance.
|
||||
|
||||
```bash
|
||||
curl 'http://localhost:7700/indexes/messages/search' --data '{ "q": "<search-term>" }'
|
||||
```
|
||||
|
||||
For example, one of the messages that I replicated contains the text "welcome to airbyte".
|
||||
|
||||
```bash
|
||||
curl 'http://localhost:7700/indexes/messages/search' --data '{ "q": "welcome to" }'
|
||||
# => {"hits":[{"_ab_pk":"7ff9a858_6959_45e7_ad6b_16f9e0e91098","channel_id":"C01M2UUP87P","client_msg_id":"77022f01-3846-4b9d-a6d3-120a26b2c2ac","type":"message","text":"welcome to airbyte.","user":"U01AS8LGX41","ts":"2021-02-05T17:26:01.000000Z","team":"T01AB4DDR2N","blocks":[{"type":"rich_text"}],"file_ids":[],"thread_ts":"1612545961.000800"}],"offset":0,"limit":20,"nbHits":2,"exhaustiveNbHits":false,"processingTimeMs":21,"query":"test-72"}
|
||||
```
|
||||
|
||||
## 4. Search via a UI
|
||||
|
||||
Making curl requests to search your Slack History is a little clunky, so we have modified the example UI that MeiliSearch provides in [their docs](https://docs.meilisearch.com/learn/tutorials/getting_started.html#integrate-with-your-project) to search through the Slack results.
|
||||
|
||||
Download \(or copy and paste\) this [html file](https://github.com/airbytehq/airbyte/blob/master/docs/examples/slack-history/index.html) to your workstation. Then, open it using a browser. You should now be able to write search terms in the search bar and get results instantly!
|
||||
|
||||

|
||||
|
||||
## 5. "Productionizing" Saving Slack History
|
||||
|
||||
You can find instructions for how to host Airbyte on various cloud platforms [here](../../deploying-airbyte/README.md).
|
||||
|
||||
Documentation on how to host MeiliSearch on cloud platforms can be found [here](https://docs.meilisearch.com/running-production/#a-quick-introduction).
|
||||
|
||||
If you want to use the UI mentioned in the section above, we recommend statically hosting it on S3, GCS, or equivalent.
|
||||
@@ -1,77 +0,0 @@
|
||||
<!--modified version of: https://docs.meilisearch.com/learn/tutorials/getting_started.html#integrate-with-your-project-->
|
||||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8" />
|
||||
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@meilisearch/instant-meilisearch/templates/basic_search.css" />
|
||||
</head>
|
||||
<body>
|
||||
<div class="wrapper">
|
||||
<div id="searchbox" focus></div>
|
||||
<div id="hits"></div>
|
||||
<div id="hits2"></div>
|
||||
|
||||
</div>
|
||||
<script
|
||||
src="https://cdn.jsdelivr.net/npm/@meilisearch/instant-meilisearch/dist/instant-meilisearch.umd.min.js"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/meilisearch@latest/dist/bundles/meilisearch.umd.js"></script>
|
||||
<script src="https://cdn.jsdelivr.net/npm/instantsearch.js@4"></script>
|
||||
<script>
|
||||
(async function () {
|
||||
const searchMessages = instantsearch({
|
||||
indexName: "threads",
|
||||
searchClient: instantMeiliSearch(
|
||||
"http://localhost:7700"
|
||||
)
|
||||
});
|
||||
|
||||
const client = new MeiliSearch({host: "http://localhost:7700"});
|
||||
// put all users in a map so that we can display user names.
|
||||
const users = await client.getIndex("users")
|
||||
.then(usersResult => usersResult.getDocuments())
|
||||
.then(usersResult => usersResult.reduce((map, obj) => {
|
||||
map[obj.id] = obj.name;
|
||||
return map;
|
||||
}, {}));
|
||||
|
||||
// put all channels in a map so that we can display channel names.
|
||||
const channels = await client.getIndex("channels")
|
||||
.then(usersResult => usersResult.getDocuments())
|
||||
.then(usersResult => usersResult.reduce((map, obj) => {
|
||||
map[obj.id] = obj.name;
|
||||
return map;
|
||||
}, {}));
|
||||
|
||||
searchMessages.addWidgets([
|
||||
instantsearch.widgets.searchBox({
|
||||
container: "#searchbox"
|
||||
}),
|
||||
instantsearch.widgets.configure({hitsPerPage: 8}),
|
||||
instantsearch.widgets.hits({
|
||||
container: "#hits",
|
||||
templates: {
|
||||
item(hit) {
|
||||
return `
|
||||
<div>
|
||||
<div class="hit-name">
|
||||
<b>${users[hit.user] || hit.user}</b>
|
||||
(<i>${channels[hit.channel_id] || hit.channel_id}, ${instantsearch.highlight({
|
||||
attribute: 'ts',
|
||||
highlightedTagName: 'mark',
|
||||
hit
|
||||
})}</i>):
|
||||
${instantsearch.highlight({attribute: 'text', highlightedTagName: 'mark', hit})}
|
||||
</div>
|
||||
</div>
|
||||
`
|
||||
}
|
||||
}
|
||||
})
|
||||
]);
|
||||
|
||||
searchMessages.start()
|
||||
})();
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
|
||||
@@ -1,272 +0,0 @@
|
||||
---
|
||||
description: Using Airbyte and Tableau
|
||||
---
|
||||
|
||||
# Visualizing the Time Spent by Your Team in Zoom Calls
|
||||
|
||||
In this article, we will show you how you can understand how much your team leverages Zoom, or spends time in meetings, in a couple of minutes. We will be using [Airbyte](https://airbyte.com) \(an open-source data integration platform\) and [Tableau](https://www.tableau.com) \(a business intelligence and analytics software\) for this tutorial.
|
||||
|
||||
Here is what we will cover:
|
||||
|
||||
1. Replicating data from Zoom to a PostgreSQL database, using Airbyte
|
||||
2. Connecting the PostgreSQL database to Tableau
|
||||
3. Creating charts in Tableau with Zoom data
|
||||
|
||||
We will produce the following charts in Tableau:
|
||||
|
||||
* Meetings per week in a team
|
||||
* Hours a team spends in meetings per week
|
||||
* Listing of team members with the number of meetings per week and number of hours spent in meetings, ranked
|
||||
* Webinars per week in a team
|
||||
* Hours a team spends in webinars per week
|
||||
* Participants for all webinars in a team per week
|
||||
* Listing of team members with the number of webinars per week and number of hours spent in meetings, ranked
|
||||
|
||||
Let’s get started by replicating Zoom data using Airbyte.
|
||||
|
||||
## Step 1: Replicating Zoom data to PostgreSQL
|
||||
|
||||
### Launching Airbyte
|
||||
|
||||
In order to replicate Zoom data, we will need to use [Airbyte’s Zoom connector](https://docs.airbyte.com/integrations/sources/zoom). To do this, start Airbyte’s web app by opening your terminal, navigating to the Airbyte directory, and running:
|
||||
|
||||
`docker-compose up`
|
||||
|
||||
You can find more details about this in the [Getting Started FAQ](https://discuss.airbyte.io/c/faq/15) on our [Airbyte Forum](https://github.com/airbytehq/airbyte/discussions).
|
||||
|
||||
This will start up Airbyte on `localhost:8000`; open that address in your browser to access the Airbyte dashboard.
|
||||
|
||||

|
||||
|
||||
If you haven't gone through the onboarding yet, you will be prompted to connect a source and a destination. Then just follow the instructions. If you've gone through it, then you will see the screenshot above. In the top right corner of the Airbyte dashboard, click on the **+ new source** button to add a new Airbyte source. In the screen to set up the new source, enter the source name \(we will use airbyte-zoom\) and select **Zoom** as source type.
|
||||
|
||||
Choosing Zoom as **source type** will cause Airbyte to display the configuration parameters needed to set up the Zoom source.
|
||||
|
||||

|
||||
|
||||
The Zoom connector for Airbyte requires you to provide it with a Zoom JWT token. Let’s take a detour and look at how to obtain one from Zoom.
|
||||
|
||||
### Obtaining a Zoom JWT Token
|
||||
|
||||
To obtain a Zoom JWT Token, log in to your Zoom account and go to the [Zoom Marketplace](https://marketplace.zoom.us/). If this is your first time in the marketplace, you will need to agree to Zoom’s marketplace terms of use.
|
||||
|
||||
Once you are in, you need to click on the **Develop** dropdown and then click on **Build App.**
|
||||
|
||||

|
||||
|
||||
Clicking on **Build App** for the first time will display a modal for you to accept Zoom’s API license and terms of use. Accept if you agree, and you will be presented with the screen below.
|
||||
|
||||
%20(3).png)
|
||||
|
||||
Select **JWT** as the app you want to build and click on the **Create** button on the card. You will be presented with a modal to enter the app name; type in `airbyte-zoom`.
|
||||
|
||||

|
||||
|
||||
Next, click on the **Create** button on the modal.
|
||||
|
||||
You will then be taken to the **App Information** page of the app you just created. Fill in the required information.
|
||||
|
||||

|
||||
|
||||
After filling in the needed information, click on the **Continue** button. You will be taken to the **App Credentials** page. Here, click on the **View JWT Token** dropdown.
|
||||
|
||||

|
||||
|
||||
There you can set the expiration time of the token \(we will leave the default 90 minutes\), and then you click on the **Copy** button of the **JWT Token**.
|
||||
|
||||
After copying it, click on the **Continue** button.
|
||||
|
||||

|
||||
|
||||
You will be taken to a screen to activate **Event Subscriptions**. Just leave it as is, as we won’t be needing Webhooks. Click on **Continue**, and your app should be marked as activated.
|
||||
|
||||
### Connecting Zoom on Airbyte
|
||||
|
||||
So let’s go back to the Airbyte web UI and provide it with the JWT token we copied from our Zoom app.
|
||||
|
||||
Now click on the **Set up source** button. You will see the below success message when the connection is made successfully.
|
||||
|
||||
%20(2).png)
|
||||
|
||||
And you will be taken to the page to add your destination.
|
||||
|
||||
### Connecting PostgreSQL on Airbyte
|
||||
|
||||

|
||||
|
||||
For our destination, we will be using a PostgreSQL database, since Tableau supports PostgreSQL as a data source. Click on the **add destination** button, and then in the drop down click on **+ add a new destination**. In the page that presents itself, add the destination name and choose the Postgres destination.
|
||||
|
||||

|
||||
|
||||
To supply Airbyte with the PostgreSQL configuration parameters needed to make a PostgreSQL destination, we will spin off a PostgreSQL container with Docker using the following command in our terminal.
|
||||
|
||||
`docker run --rm --name airbyte-zoom-db -e POSTGRES_PASSWORD=password -v airbyte_zoom_data:/var/lib/postgresql/data -p 2000:5432 -d postgres`
|
||||
|
||||
This will spin up a Docker container and persist the data we will be replicating to the PostgreSQL database in a Docker volume named `airbyte_zoom_data`.
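If you want to confirm where that data lives on disk, you can inspect the volume Docker created \(purely optional\):

`docker volume inspect airbyte_zoom_data`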
|
||||
|
||||
Now, let’s supply the above credentials to the Airbyte UI form that requires them.
|
||||
|
||||
%20(3).png)
|
||||
|
||||
Then click on the **Set up destination** button.
|
||||
|
||||
After the connection has been made to your PostgreSQL database successfully, Airbyte will generate the schema of the data to be replicated in your database from the Zoom source.
|
||||
|
||||
Leave all the fields checked.
|
||||
|
||||
%20(3).png)
|
||||
|
||||
Select a **Sync frequency** of **manual** and then click on **Set up connection**.
|
||||
|
||||
After successfully making the connection, you will see your PostgreSQL destination. Click on the Launch button to start the data replication.
|
||||
|
||||
%20(3).png)
|
||||
|
||||
Then click on the **airbyte-zoom-destination** to see the Sync page.
|
||||
|
||||
%20(3).png)
|
||||
|
||||
Syncing should take a few minutes or longer depending on the size of the data being replicated. Once Airbyte is done replicating the data, you will get a **succeeded** status.
|
||||
|
||||
Then, you can run the following SQL command on the PostgreSQL container to confirm that the sync was done successfully.
|
||||
|
||||
`docker exec airbyte-zoom-db psql -U postgres -c "SELECT * FROM public.users;"`
|
||||
|
||||
Now that we have our Zoom data replicated successfully via Airbyte, let’s move on and set up Tableau to make the various visualizations and analytics we want.
|
||||
|
||||
## Step 2: Connect the PostgreSQL database to Tableau
|
||||
|
||||
Tableau helps people and organizations to get answers from their data. It’s a visual analytic platform that makes it easy to explore and manage data.
|
||||
|
||||
To get started with Tableau, you can opt in for a [free trial period](https://www.tableau.com/products/trial) by providing your email and clicking the **DOWNLOAD FREE TRIAL** button to download the Tableau desktop app. The download should automatically detect your machine type \(Windows/Mac\).
|
||||
|
||||
Go ahead and install Tableau on your machine. After the installation is complete, you will need to fill in some more details to activate your free trial.
|
||||
|
||||
Once your activation is successful, you will see your Tableau dashboard.
|
||||
|
||||
%20(3).png)
|
||||
|
||||
On the sidebar menu under the **To a Server** section, click on the **More…** menu. You will see a list of datasource connectors you can connect Tableau with.
|
||||
|
||||
%20(4).png)
|
||||
|
||||
Select **PostgreSQL** and you will be presented with a connection credentials modal.
|
||||
|
||||
Fill in the same details of the PostgreSQL database we used as the destination in Airbyte.
|
||||
|
||||

|
||||
|
||||
Next, click on the **Sign In** button. If the connection was made successfully, you will see the Tableau dashboard for the database you just connected.
|
||||
|
||||
_Note: If you are having trouble connecting PostgreSQL with Tableau, it might be because the driver Tableau comes with for PostgreSQL might not work for newer versions of PostgreSQL. You can download the JDBC driver for PostgreSQL_ [_here_](https://www.tableau.com/support/drivers?_ga=2.62351404.1800241672.1616922684-1838321730.1615100968) _and follow the setup instructions._
|
||||
|
||||
Now that we have replicated our Zoom data into a PostgreSQL database using Airbyte’s Zoom connector, and connected Tableau with our PostgreSQL database containing our Zoom data, let’s proceed to creating the charts we need to visualize the time spent by a team in Zoom calls.
|
||||
|
||||
## Step 3: Create the charts on Tableau with the Zoom data
|
||||
|
||||
### Meetings per week in a team
|
||||
|
||||
To create this chart, we will need to use the count of the meetings and the **createdAt** field of the **meetings** table. Currently, we haven’t selected a table to work on in Tableau. So you will see a prompt to **Drag tables here**.
|
||||
|
||||

|
||||
|
||||
Drag the **meetings** table from the sidebar onto the space with the prompt.
|
||||
|
||||
Now that we have the meetings table, we can start building out the chart by clicking on **Sheet 1** at the bottom left of Tableau.
|
||||
|
||||

|
||||
|
||||
As stated earlier, we need **Created At**, but currently it’s a String data type. Let’s change that by converting it to a date & time. So right click on **Created At**, then select **Change Data Type** and choose Date & Time. And that’s it! That field is now of type **Date & Time**.
|
||||
|
||||

|
||||
|
||||
Next, drag **Created At** to **Columns**.
|
||||
|
||||

|
||||
|
||||
Currently, we get the Created At in **YEAR**, but per our requirement we want them in Weeks, so right click on the **YEAR\(Created At\)** and choose **Week Number**.
|
||||
|
||||
%20(3).png)
|
||||
|
||||
Tableau should now look like this:
|
||||
|
||||

|
||||
|
||||
Now, to finish up, we need to add the **meetings\(Count\) measure** Tableau already calculated for us to the **Rows** section. So drag **meetings\(Count\)** onto the Rows shelf to complete the chart.
|
||||
|
||||
%20(3).png)
|
||||
|
||||
And now we are done with the very first chart. Let's save the sheet and create a new Dashboard that we will add this sheet to as well as the others we will be creating.
|
||||
|
||||
Currently the sheet shows **Sheet 1**; right click on **Sheet 1** at the bottom left and rename it to **Weekly Meetings**.
|
||||
|
||||
To create our Dashboard, we can right click on the sheet we just renamed and choose **new Dashboard**. Rename the Dashboard to Zoom Dashboard and drag the sheet into it to have something like this:
|
||||
|
||||

|
||||
|
||||
Now that we have this first chart out of the way, we just need to replicate most of the process we used for this one to create the other charts. Because the steps are so similar, we will mostly be showing the finished screenshots of the charts except when we need to conform to the chart requirements.
|
||||
|
||||
### Hours a team spends in meetings per week
|
||||
|
||||
For this chart, we need the sum of the duration spent in weekly meetings. We already have a Duration field, which is currently displaying durations in minutes. We can derive a calculated field off this field since we want the duration in hours \(we just need to divide the duration field by 60\).
|
||||
|
||||
To do this, right click on the Duration field and select **create**, then click on **calculatedField**. Change the name to **Duration in Hours**, and then the calculation should be **\[Duration\]/60**. Click ok to create the field.
|
||||
|
||||
So now we can drag the Duration in Hours and Created At fields onto your sheet like so:
|
||||
|
||||

|
||||
|
||||
Note: We are adding a filter on the Duration to filter out null values. You can do this by right clicking on the **SUM\(Duration\)** pill and clicking filter, then make sure the **include null values** checkbox is unchecked.
|
||||
|
||||
### Participants for all meetings per week
|
||||
|
||||
For this chart, we will need to have a calculated field called **\# of meetings attended**, which will be an aggregate of the counts of rows matching a particular user's email in the `report_meeting_participants` table plotted against the **Created At** field of the **meetings** table. To get this done, right click on the **User Email** field. Select **create** and click on **calculatedField**, then enter the title of the field as **\# of meetings attended**. Next, enter the below formula:
|
||||
|
||||
`COUNT(IF [User Email] == [User Email] THEN [Id (Report Meeting Participants)] END)`
|
||||
|
||||
Then click on apply. Finally, drag the **Created At** fields \(make sure it’s on the **Weekly** number\) and the calculated field you just created to match the below screenshot:
|
||||
|
||||

|
||||
|
||||
### Listing of team members with the number of meetings per week and number of hours spent in meetings, ranked.
|
||||
|
||||
To get this chart, we need to create a relationship between the **meetings table** and the `report_meeting_participants` table. You can do this by dragging the `report_meeting_participants` table in as a source alongside the **meetings** table and relate both via the **meeting id**. Then you will be able to create a new worksheet that looks like this:
|
||||
|
||||
%20(3).png)
|
||||
|
||||
Note: To achieve the ranking, we simply use the sort menu icon on the top menu bar.
|
||||
|
||||
### Webinars per week in a team
|
||||
|
||||
The rest of the charts will be needing the **webinars** and `report_webinar_participants` tables. Similar to the number of meetings per week in a team, we will be plotting the Count of webinars against the **Created At** property.
|
||||
|
||||

|
||||
|
||||
### Hours a team spends in webinars per week
|
||||
|
||||
For this chart, as for the meeting’s counterpart, we will get a calculated field off the Duration field to get the **Webinar Duration in Hours**, and then plot **Created At** against the **Sum of Webinar Duration in Hours**, as shown in the screenshot below. Note: Make sure you create a new sheet for each of these graphs.
|
||||
|
||||
### Participants for all webinars per week
|
||||
|
||||
This calculation is the same as the number of participants for all meetings per week, but instead of using the **meetings** and `report_meeting_participants` tables, we will use the webinars and `report_webinar_participants` tables.
|
||||
|
||||
Also, the formula will now be:
|
||||
|
||||
`COUNT(IF [User Email] == [User Email] THEN [Id (Report Webinar Participants)] END)`
|
||||
|
||||
Below is the chart:
|
||||
|
||||

|
||||
|
||||
#### Listing of team members with the number of webinars per week and number of hours spent in meetings, ranked
|
||||
|
||||
Below is the chart with these specs
|
||||
|
||||

|
||||
|
||||
## Conclusion
|
||||
|
||||
In this article, we saw how we can use Airbyte to get data from the Zoom API into a PostgreSQL database, and then use that data to create some chart visualizations in Tableau.
|
||||
|
||||
You can leverage Airbyte and Tableau to produce graphs on any collaboration tool. We just used Zoom to illustrate how it can be done. Hope this is helpful!
|
||||
|
||||
@@ -1,5 +0,0 @@
|
||||
# FAQ
|
||||
|
||||
Our FAQ is now a section on our Airbyte Forum. Check it out [here](https://github.com/airbytehq/airbyte/discussions)!
|
||||
|
||||
If you don't see your question answered, feel free to open up a new topic for it.
|
||||
@@ -1,124 +0,0 @@
|
||||
# Data Loading
|
||||
|
||||
## **Why don’t I see any data in my destination yet?**
|
||||
|
||||
It can take a while for Airbyte to load data into your destination. Some sources have restrictive API limits which constrain how much
|
||||
data we can sync in a given time. Large amounts of data in your source can also make the initial sync take longer. You can check your
sync status on the connection detail page, which you can access through either the destination detail page or the source one.
|
||||
|
||||
## **Why are my final tables being recreated every time?**
|
||||
|
||||
Airbyte ingests data into raw tables and applies the process of normalization if you selected it in the connection page.
The normalization runs a full refresh each sync, and for some destinations like Snowflake, Redshift, and BigQuery this may incur more
resource consumption and higher costs. You need to pay attention to the frequency at which you're retrieving your data to avoid issues.
For example, if you create a connection to sync every 5 minutes with incremental sync on, it will only retrieve new records into the raw tables, but it will apply normalization
to *all* the data in every sync! If you have tons of data, this may not be the right sync frequency for you.
|
||||
|
||||
There is a [Github issue](https://github.com/airbytehq/airbyte/issues/4286) to implement normalization using incremental, which will reduce
|
||||
costs and resources in your destination.
|
||||
|
||||
## **What happens if a sync fails?**
|
||||
|
||||
You won't lose data when a sync fails; however, no data will be added or updated in your destination.
|
||||
|
||||
Airbyte will automatically attempt to replicate data 3 times. You can see and export the logs for those attempts in the connection
|
||||
detail page. You can access this page through the Source or Destination detail page.
|
||||
|
||||
You can configure a Slack webhook to warn you when a sync fails.
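If you want to sanity-check a webhook URL before adding it to Airbyte, you can post a test message to it with curl (the URL below is a placeholder; use the one Slack generated for you):

```text
curl -X POST -H 'Content-type: application/json' \
  --data '{"text":"Test notification from Airbyte setup"}' \
  https://hooks.slack.com/services/XXX/YYY/ZZZ
```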
|
||||
|
||||
In the future, you will be able to configure other notification methods (email, Sentry) and an option to create a
GitHub issue with the logs. We’re still working on it, and the purpose is to help the community and the Airbyte team fix the
issue as soon as possible, especially if it is a connector issue.
|
||||
|
||||
Until Airbyte has this system in place, here is what you can do:
|
||||
|
||||
* File a GitHub issue: go [here](https://github.com/airbytehq/airbyte/issues/new?assignees=&labels=type%2Fbug&template=bug-report.md&title=)
|
||||
and file an issue with the detailed logs copied in the issue’s description. The team will be notified about your issue and will update
|
||||
it with any progress or comment on it.
|
||||
* Fix the issue yourself: Airbyte is open source so you don’t need to wait for anybody to fix your issue if it is important to you.
|
||||
To do so, just fork the [GitHub project](https://github.com/airbytehq/airbyte) and fix the piece of code that needs fixing. If you’re okay
|
||||
with contributing your fix to the community, you can submit a pull request. We will review it ASAP.
|
||||
* Ask on Slack: don’t hesitate to ping the team on [Slack](https://slack.airbyte.io).
|
||||
|
||||
Once all this is done, Airbyte resumes your sync from where it left off.
|
||||
|
||||
We truly appreciate any contribution you make to help the community. Airbyte will become the open-source standard only if everybody participates.
|
||||
|
||||
## **Can Airbyte support 2-way sync i.e. changes from A go to B and changes from B go to A?**
|
||||
|
||||
Airbyte does not support this right now. There are some details around how we handle schema and table names that aren't going to
work for you in the current iteration.
If you attempt to set up a circular dependency between source and destination, you'll end up with the following:
A.public.table_foo writes to B.public.public_table_foo, which writes back to A.public.public_public_table_foo. You won't be writing into your original table,
which is presumably your intention.
|
||||
|
||||
|
||||
## **What happens to data in the pipeline if the destination gets disconnected? Could I lose data, or wind up with duplicate data when the pipeline is reconnected?**
|
||||
|
||||
Airbyte is architected to prevent data loss or duplication. Airbyte will display a failure for the sync, and re-attempt it at the next scheduled sync,
according to the frequency you set.
|
||||
|
||||
## **How frequently can Airbyte sync data?**
|
||||
|
||||
You can adjust the load time to run as frequently as every hour or as infrequently as once a year using [Cron expressions](https://docs.airbyte.com/cloud/managing-airbyte-cloud/edit-stream-configuration).
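
For illustration only, here are a few Quartz-style cron expressions of the kind used for sync schedules (the exact syntax accepted by your Airbyte version is described in the linked documentation):

```text
0 0 * * * ?    # every hour, on the hour
0 0 0 * * ?    # every day at midnight
0 0 0 1 1 ?    # once a year, on January 1st at midnight
```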
|
||||
|
||||
## **Why wouldn’t I choose to load all of my data more frequently?**
|
||||
|
||||
While frequent data loads will give you more up-to-date data, there are a few reasons you wouldn’t want to load your data too frequently, including:
|
||||
|
||||
* Higher API usage may cause you to hit a limit that could impact other systems that rely on that API.
|
||||
* Higher cost of loading data into your warehouse.
|
||||
* More frequent delays, resulting in increased delay notification emails. For instance, if the data source generally takes several hours to
|
||||
update but you sync in five-minute increments, you may receive a delay notification every sync.
|
||||
|
||||
We generally recommend setting incremental loads to run every hour to help limit API calls.
|
||||
|
||||
## **Is there a way to know the estimated time to completion for the first historic sync?**
|
||||
|
||||
Unfortunately not yet.
|
||||
|
||||
## **Do you support change data capture \(CDC\) or logical replication for databases?**
|
||||
|
||||
Airbyte currently supports [CDC for Postgres and MySQL](../../understanding-airbyte/cdc.md). Airbyte is adding support for a few other
|
||||
databases, which you can check in the roadmap.
|
||||
|
||||
## Using incremental sync, is it possible to add more fields when some new columns are added to a source table, or when a new table is added?
|
||||
|
||||
For the moment, incremental sync doesn't support schema changes, so you would need to perform a full refresh whenever that happens.
|
||||
Here’s a related [Github issue](https://github.com/airbytehq/airbyte/issues/1601).
|
||||
|
||||
## Is there a limit to how many tables one connection can handle?
|
||||
|
||||
Yes. With more than 6,000 tables, the UI may have trouble loading the information.
|
||||
|
||||
There are two Github issues about this limitation: [Issue #3942](https://github.com/airbytehq/airbyte/issues/3942)
|
||||
and [Issue #3943](https://github.com/airbytehq/airbyte/issues/3943).
|
||||
|
||||
## Help, Airbyte is hanging/taking a long time to discover my source's schema!
|
||||
|
||||
This usually happens for database sources that contain a lot of tables. This should resolve itself in half an hour or so.
|
||||
|
||||
If the source contains more than 6k tables, see the [above question](#is-there-a-limit-to-how-many-tables-one-connection-can-handle).
|
||||
|
||||
There is a known issue with [Oracle databases](https://github.com/airbytehq/airbyte/issues/4944).
|
||||
|
||||
## **I see you support a lot of connectors – what about connectors Airbyte doesn’t support yet?**
|
||||
|
||||
You can either:
|
||||
|
||||
* Submit a [connector request](https://github.com/airbytehq/airbyte/issues/new?assignees=&labels=area%2Fintegration%2C+new-integration&template=new-integration-request.md&title=) on our Github project, and be notified once we or the community build a connector for it.
|
||||
* Build a connector yourself by forking our [GitHub project](https://github.com/airbytehq/airbyte) and submitting a pull request. Here
|
||||
are the [instructions on how to build a connector](../../contributing-to-airbyte/README.md).
|
||||
* Ask on Slack: don’t hesitate to ping the team on [Slack](https://slack.airbyte.io).
|
||||
|
||||
## **What kind of notifications do I get?**
|
||||
|
||||
For the moment, the UI will only display one kind of notification: when a sync fails, Airbyte will display the failure at the source/destination
|
||||
level in the list of sources/destinations, and in the connection detail page along with the logs.
|
||||
|
||||
However, there are other types of notifications:
|
||||
|
||||
* When a connector that you use is no longer up to date
|
||||
* When your connection fails
|
||||
* When core isn't up to date
|
||||
|
||||
@@ -1,40 +0,0 @@
|
||||
# Deploying Airbyte on a Non-Standard Operating System
|
||||
|
||||
## CentOS 8
|
||||
|
||||
From a clean install:
|
||||
|
||||
```
|
||||
firewall-cmd --zone=public --add-port=8000/tcp --permanent
|
||||
firewall-cmd --zone=public --add-port=8001/tcp --permanent
|
||||
firewall-cmd --zone=public --add-port=7233/tcp --permanent
|
||||
systemctl restart firewalld
|
||||
```
|
||||
OR... if you prefer iptables:
|
||||
```
|
||||
iptables -A INPUT -p tcp -m tcp --dport 8000 -j ACCEPT
|
||||
iptables -A INPUT -p tcp -m tcp --dport 8001 -j ACCEPT
|
||||
iptables -A INPUT -p tcp -m tcp --dport 7233 -j ACCEPT
|
||||
systemctl restart iptables
|
||||
```
|
||||
Set up the Docker repo:
|
||||
```
|
||||
dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
|
||||
dnf install docker-ce --nobest
|
||||
systemctl enable --now docker
|
||||
usermod -aG docker $USER
|
||||
```
|
||||
You'll need to get docker-compose separately.
|
||||
```
|
||||
dnf install wget git curl
|
||||
curl -L https://github.com/docker/compose/releases/download/1.25.0/docker-compose-`uname -s`-`uname -m` -o /usr/local/bin/docker-compose
|
||||
chmod +x /usr/local/bin/docker-compose
|
||||
```
|
||||
Now we can install Airbyte. In this example, we will install it under `/opt/`
|
||||
```
|
||||
cd /opt
|
||||
git clone https://github.com/airbytehq/airbyte.git
|
||||
cd airbyte
|
||||
docker-compose up
|
||||
docker-compose ps
|
||||
```
|
||||
@@ -1,2 +0,0 @@
|
||||
# Differences with
|
||||
|
||||
@@ -1,27 +0,0 @@
|
||||
# Fivetran vs Airbyte
|
||||
|
||||
We wrote an article, “[Open-source vs. Commercial Software: How to Solve the Data Integration Problem](https://airbyte.com/articles/data-engineering-thoughts/open-source-vs-commercial-software-how-to-better-solve-data-integration/),” in which we describe the pros and cons of Fivetran’s commercial approach and Airbyte’s open-source approach. Don’t hesitate to check it out for more detailed arguments. As a summary, here are the differences:
|
||||
|
||||

|
||||
|
||||
## **Fivetran:**
|
||||
|
||||
* **Limited high-quality connectors:** after 8 years in business, Fivetran supports 150 connectors. The more connectors, the more difficult it is for Fivetran to keep the same level of maintenance across all connectors. They will always have a ROI consideration to maintaining long-tailed connectors.
|
||||
* **Pricing indexed on usage:** Fivetran’s pricing is indexed on the number of active rows \(rows added or edited\) per month. Teams always need to keep that in mind and are not free to move data without thinking about cost, as the costs can grow fast.
|
||||
* **Security and privacy compliance:** all companies are subject to privacy compliance laws, such as GDPR, CCPA, HIPAA, etc. As a matter of fact, above a certain stage \(about 100 employees\) in a company, all external products need to go through a security compliance process that can take several months.
|
||||
* **No moving data between internal databases:** Fivetran sits in the cloud, so if you have to replicate data from an internal database to another, it makes no sense to have the data move through them \(Fivetran\) for privacy and cost reasons.
|
||||
|
||||
## **Airbyte:**
|
||||
|
||||
* **Free, as open source, so no more pricing based on usage**: learn more about our [future business model](https://handbook.airbyte.io/strategy/business-model) \(connectors will always remain open source\).
|
||||
* **Supporting 60 connectors within 8 months from inception**. Our goal is to reach 200+ connectors by the end of 2021.
|
||||
* **Building new connectors made trivial, in the language of your choice:** Airbyte makes it a lot easier to create your own connector, vs. building them yourself in-house \(with Airflow or other tools\). Scheduling, orchestration, and monitoring comes out of the box with Airbyte.
|
||||
* **Addressing the long tail of connectors:** with the help of the community, Airbyte ambitions to support thousands of connectors.
|
||||
* **Adapt existing connectors to your needs:** you can adapt any existing connector to address your own unique edge case.
|
||||
* **Using data integration in a workflow:** Airbyte’s API lets engineering teams add data integration jobs into their workflow seamlessly.
|
||||
* **Integrates with your data stack and your needs:** Airflow, Kubernetes, dbt, etc. Its normalization is optional, it gives you a basic version that works out of the box, but also allows you to use dbt to do more complicated things.
|
||||
* **Debugging autonomy:** if you experience any connector issue, you won’t need to wait for Fivetran’s customer support team to get back to you; you can fix the issue yourself quickly.
|
||||
* **No more security and privacy compliance, as self-hosted, source-available and open-sourced \(MIT\)**. Any team can directly address their integration needs.
|
||||
|
||||
Your data stays in your cloud. Have full control over your data, and the costs of your data transfers.
|
||||
|
||||
@@ -1,28 +0,0 @@
|
||||
# Meltano vs Airbyte
|
||||
|
||||
We wrote an article, “[The State of Open-Source Data Integration and ETL](https://airbyte.com/articles/data-engineering-thoughts/the-state-of-open-source-data-integration-and-etl/),” in which we list and compare all ETL-related open-source projects, including Meltano and Airbyte. Don’t hesitate to check it out for more detailed arguments. As a summary, here are the differences:
|
||||
|
||||
## **Meltano:**
|
||||
|
||||
* **Meltano is built on top of the Singer protocol, whereas Airbyte is built on top of the Airbyte protocol**. Having initially created Airbyte on top of Singer, we wrote about why we didn't move forward with it [here](https://airbyte.com/blog/why-you-should-not-build-your-data-pipeline-on-top-of-singer) and [here](https://airbyte.com/blog/airbyte-vs-singer-why-airbyte-is-not-built-on-top-of-singer). Summarized, the reasons were: Singer connectors didn't always adhere to the Singer protocol, had poor standardization and visibility in terms of quality, and community governance and support was abandoned by Stitch. By contrast, we aim to make Airbyte a product that ["just works"](https://airbyte.com/blog/our-truth-for-2021-airbyte-just-works) and always plan to maximize engagement within the Airbyte community.
|
||||
* **CLI-first approach:** Meltano was primarily built with a command line interface in mind. In that sense, they seem to target engineers with a preference for that interface.
|
||||
* **Integration with Airflow for orchestration:** You can either use Meltano alone for orchestration or with Airflow; Meltano works both ways.
|
||||
* All connectors must use Python.
|
||||
* Meltano works with any of Singer's 200+ available connectors. However, in our experience, quality has been hit or miss.
|
||||
|
||||
## **Airbyte:**
|
||||
|
||||
In contrast, Airbyte is a company fully committed to the open-source project and has a [business model](https://handbook.airbyte.io/strategy/business-model) in mind around this project. Our [team](https://airbyte.com/about-us) are data integration experts that have built more than 1,000 integrations collectively at large scale. The team now counts 20 engineers working full-time on Airbyte.
|
||||
|
||||
* **Airbyte supports more than 100 connectors after only 1 year since its inception**, 20% of which were built by the community. Our ambition is to support **200+ connectors by the end of 2021.**
|
||||
* Airbyte’s connectors are **usable out of the box through a UI and API,** with monitoring, scheduling and orchestration. Airbyte was built on the premise that a user, whatever their background, should be able to move data in 2 minutes. Data engineers might want to use raw data and their own transformation processes, or to use Airbyte’s API to include data integration in their workflows. On the other hand, analysts and data scientists might want to use normalized consolidated data in their database or data warehouses. Airbyte supports all these use cases.
|
||||
* **One platform, one project with standards:** This helps consolidate development behind a single project, with standardization and a common data protocol that can benefit all teams and use cases.
|
||||
* **Not limited by Singer’s data protocol:** In contrast to Meltano, Airbyte was not built on top of Singer, but its data protocol is compatible with Singer’s. This means Airbyte can go beyond Singer, but Meltano will remain limited.
|
||||
* **Connectors can be built in the language of your choice,** as Airbyte runs them as Docker containers.
|
||||
* **Airbyte integrates with your data stack and your needs:** Airflow, Kubernetes, dbt, etc. Its normalization is optional, it gives you a basic version that works out of the box, but also allows you to use dbt to do more complicated things.
|
||||
|
||||
## **Other noteworthy differences:**
|
||||
|
||||
* In terms of community, Meltano's Slack community got 430 new members in the last 6 months, while Airbyte got 800.
|
||||
* The difference in velocity in terms of feature progress is easily measurable as both are open-source projects. Meltano closes about 30 issues per month, while Airbyte closes about 120.
|
||||
|
||||
@@ -1,25 +0,0 @@
|
||||
# Pipelinewise vs Airbyte
|
||||
|
||||
## **PipelineWise:**
|
||||
|
||||
PipelineWise is an open-source project by Transferwise that was built with the primary goal of serving their own needs. There is no business model attached to the project, and no apparent interest in growing the community.
|
||||
|
||||
* **Supports 21 connectors,** and only adds new ones based on the needs of the mother company, Transferwise.
|
||||
* **No business model attached to the project,** and no apparent interest from the company in growing the community.
|
||||
* **As close to the original format as possible:** PipelineWise aims to reproduce the data from the source to an Analytics-Data-Store in as close to the original format as possible. Some minor load time transformations are supported, but complex mapping and joins have to be done in the Analytics-Data-Store to extract meaning.
|
||||
* **Managed Schema Changes:** When source data changes, PipelineWise detects the change and alters the schema in your Analytics-Data-Store automatically.
|
||||
* **YAML based configuration:** Data pipelines are defined as YAML files, ensuring that the entire configuration is kept under version control.
|
||||
* **Lightweight:** No daemons or database setup are required.
|
||||
|
||||
## **Airbyte:**
|
||||
|
||||
In contrast, Airbyte is a company fully committed to the open-source project and has a [business model in mind](https://handbook.airbyte.io/) around this project.
|
||||
|
||||
* Our ambition is to support **300+ connectors by the end of 2021.** We already supported about 50 connectors at the end of 2020, just 5 months after its inception.
|
||||
* Airbyte’s connectors are **usable out of the box through a UI and API,** with monitoring, scheduling and orchestration. Airbyte was built on the premise that a user, whatever their background, should be able to move data in 2 minutes. Data engineers might want to use raw data and their own transformation processes, or to use Airbyte’s API to include data integration in their workflows. On the other hand, analysts and data scientists might want to use normalized consolidated data in their database or data warehouses. Airbyte supports all these use cases.
|
||||
* **One platform, one project with standards:** This helps consolidate development behind a single project, with standardization and a common data protocol that can benefit all teams and use cases.
|
||||
* **Connectors can be built in the language of your choice,** as Airbyte runs them as Docker containers.
|
||||
* **Airbyte integrates with your data stack and your needs:** Airflow, Kubernetes, dbt, etc. Its normalization is optional, it gives you a basic version that works out of the box, but also allows you to use dbt to do more complicated things.
|
||||
|
||||
The data protocols for both projects are compatible with Singer’s. So it is easy to migrate a Singer tap or target onto Airbyte or PipelineWise.
|
||||
|
||||
@@ -1,28 +0,0 @@
|
||||
# Singer vs Airbyte
|
||||
|
||||
If you want to understand the difference between Airbyte and Singer, you might be interested in 2 articles we wrote:
|
||||
|
||||
* “[Airbyte vs. Singer: Why Airbyte is not built on top of Singer](https://airbyte.com/articles/data-engineering-thoughts/airbyte-vs-singer-why-airbyte-is-not-built-on-top-of-singer/).”
|
||||
* “[The State of Open-Source Data Integration and ETL](https://airbyte.com/articles/data-engineering-thoughts/the-state-of-open-source-data-integration-and-etl/),” in which we list and compare all ETL-related open-source projects, including Singer and Airbyte. As a summary, here are the differences:
|
||||
|
||||

|
||||
|
||||
## **Singer:**
|
||||
|
||||
* **Supports 96 connectors after 4 years.**
|
||||
* **Increasingly outdated connectors:** Talend \(acquirer of StitchData\) seems to have stopped investing in maintaining Singer’s community and connectors. As most connectors see schema changes several times a year, more and more Singer’s taps and targets are not actively maintained and are becoming outdated.
|
||||
* **Absence of standardization:** each connector is its own open-source project. So you never know the quality of a tap or target until you have actually used it. There is no guarantee whatsoever about what you’ll get.
|
||||
* **Singer’s connectors are standalone binaries:** you still need to build everything around to make them work \(e.g. UI, configuration validation, state management, normalization, schema migration, monitoring, etc\).
|
||||
* **No full commitment to open sourcing all connectors,** as some connectors are only offered by StitchData under a paid plan.
|
||||
|
||||
## **Airbyte:**
|
||||
|
||||
* Our ambition is to support **300+ connectors by the end of 2021.** We already supported about 50 connectors at the end of 2020, just 5 months after its inception.
|
||||
* Airbyte’s connectors are **usable out of the box through a UI and API**, with monitoring, scheduling and orchestration. Airbyte was built on the premise that a user, whatever their background, should be able to move data in 2 minutes. Data engineers might want to use raw data and their own transformation processes, or to use Airbyte’s API to include data integration in their workflows. On the other hand, analysts and data scientists might want to use normalized consolidated data in their database or data warehouses. Airbyte supports all these use cases.
|
||||
* **One platform, one project with standards:** This helps consolidate development behind a single project, with standardization and a common data protocol that can benefit all teams and use cases.
|
||||
* **Connectors can be built in the language of your choice,** as Airbyte runs them as Docker containers.
|
||||
* **Airbyte integrates with your data stack and your needs:** Airflow, Kubernetes, dbt, etc. Its normalization is optional, it gives you a basic version that works out of the box, but also allows you to use dbt to do more complicated things.
|
||||
* **A full commitment to the open-source MIT project** with the promise not to hide some connectors behind paid walls.
|
||||
|
||||
Note that Airbyte’s data protocol is compatible with Singer’s. So it is easy to migrate a Singer tap onto Airbyte.
|
||||
|
||||
@@ -1,29 +0,0 @@
|
||||
# StitchData vs Airbyte
|
||||
|
||||
We wrote an article, “[Open-source vs. Commercial Software: How to Solve the Data Integration Problem](https://airbyte.com/articles/data-engineering-thoughts/open-source-vs-commercial-software-how-to-better-solve-data-integration/),” in which we describe the pros and cons of StitchData’s commercial approach and Airbyte’s open-source approach. Don’t hesitate to check it out for more detailed arguments. As a summary, here are the differences:
|
||||
|
||||

|
||||
|
||||
## StitchData:
|
||||
|
||||
* **Limited deprecating connectors:** Stitch only supports 150 connectors. Talend has stopped investing in StitchData and its connectors. And on Singer, each connector is its own open-source project. So you never know the quality of a tap or target until you have actually used it. There is no guarantee whatsoever about what you’ll get.
|
||||
* **Pricing indexed on usage:** StitchData’s pricing is indexed on the connectors used and the volume of data transferred. Teams always need to keep that in mind and are not free to move data without thinking about cost.
|
||||
* **Security and privacy compliance:** all companies are subject to privacy compliance laws, such as GDPR, CCPA, HIPAA, etc. As a matter of fact, above a certain stage \(about 100 employees\) in a company, all external products need to go through a security compliance process that can take several months.
|
||||
* **No moving data between internal databases:** StitchData sits in the cloud, so if you have to replicate data from an internal database to another, it makes no sense to have the data move through their cloud for privacy and cost reasons.
|
||||
* **StitchData’s Singer connectors are standalone binaries:** you still need to build everything around to make them work. And it’s hard to update some pre-built connectors, as they are of poor quality.
|
||||
|
||||
## Airbyte:
|
||||
|
||||
* **Free, as open source, so no more pricing based on usage:** learn more about our [future business model](https://handbook.airbyte.io/strategy/business-model) \(connectors will always remain open-source\).
|
||||
* **Supporting 50+ connectors by the end of 2020** \(so in only 5 months of existence\). Our goal is to reach 300+ connectors by the end of 2021.
|
||||
* **Building new connectors made trivial, in the language of your choice:** Airbyte makes it a lot easier to create your own connector, vs. building them yourself in-house \(with Airflow or other tools\). Scheduling, orchestration, and monitoring comes out of the box with Airbyte.
|
||||
* **Maintenance-free connectors you can use in minutes.** Just authenticate your sources and warehouse, and get connectors that adapt to schema and API changes for you.
|
||||
* **Addressing the long tail of connectors:** with the help of the community, Airbyte ambitions to support thousands of connectors.
|
||||
* **Adapt existing connectors to your needs:** you can adapt any existing connector to address your own unique edge case.
|
||||
* **Using data integration in a workflow:** Airbyte’s API lets engineering teams add data integration jobs into their workflow seamlessly.
|
||||
* **Integrates with your data stack and your needs:** Airflow, Kubernetes, dbt, etc. Its normalization is optional, it gives you a basic version that works out of the box, but also allows you to use dbt to do more complicated things.
|
||||
* **Debugging autonomy:** if you experience any connector issue, you won’t need to wait for StitchData’s customer support team to get back to you; you can fix the issue yourself quickly.
|
||||
* **Your data stays in your cloud.** Have full control over your data, and the costs of your data transfers.
|
||||
* **No more security and privacy compliance, as self-hosted and open-sourced \(MIT\).** Any team can directly address their integration needs.
|
||||
* **Premium support directly on our Slack for free**. Our time to resolution is about 3-4 hours on average.
|
||||
|
||||
@@ -1,50 +0,0 @@
|
||||
# Getting Started
|
||||
|
||||
## **What do I need to get started using Airbyte?**
|
||||
|
||||
You can deploy Airbyte in several ways, as [documented here](../../deploying-airbyte/README.md). Airbyte will then help you replicate data between a source and a destination. If you don’t see the connector you need, you can [build your connector yourself](../../connector-development) and benefit from Airbyte’s optional scheduling, orchestration and monitoring modules.
|
||||
|
||||
## **How long does it take to set up Airbyte?**
|
||||
|
||||
It depends on your source and destination. Check our setup guides to see the tasks for your source and destination. Each source and destination also has a list of prerequisites for setup. To make setup faster, get your prerequisites ready before you start to set up your connector. During the setup process, you may need to contact others \(like a database administrator or AWS account owner\) for help, which might slow you down. But if you have access to the connection information, it can take 2 minutes: see the [demo video](https://www.youtube.com/watch?v=jWVYpUV9vEg).
|
||||
|
||||
## **What data sources does Airbyte offer connectors for?**
|
||||
|
||||
We already offer 100+ connectors, and will focus all our effort in ramping up the number of connectors and strengthening them. If you don’t see a source you need, you can file a [connector request here](https://github.com/airbytehq/airbyte/issues/new?assignees=&labels=area%2Fintegration%2C+new-integration&template=new-integration-request.md&title=).
|
||||
|
||||
## **Where can I see my data in Airbyte?**
|
||||
|
||||
You can’t see your data in Airbyte, because we don’t store it. The sync loads your data into your destination \(data warehouse, data lake, etc.\). While you can’t see your data directly in Airbyte, you can check your schema and sync status on the source detail page in Airbyte.
|
||||
|
||||
## **Can I add multiple destinations?**
|
||||
|
||||
Sure, you can. Just go to the "Destinations" section and click on the top right "+ new destination" button. You can have multiple destinations for the same source, and multiple sources for the same destination.
|
||||
|
||||
## Am I limited to GUI interaction or is there a way to set up / run / interact with Airbyte programmatically?
|
||||
|
||||
You can use the API to do anything you do today from the UI. Note, though, that the API is in alpha and may change. You won’t lose any functionality, but you may need to update your code to catch up with any backwards-incompatible changes in the API.
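
As a rough sketch only (the exact base URL, port, and any authentication depend on your deployment and Airbyte version; the connection ID below is a placeholder), triggering a manual sync for a connection through the API could look like this:

```
# Illustrative only: trigger a manual sync via the Airbyte API (alpha).
# Host/port and auth depend on your deployment; the UUID is a placeholder.
curl -X POST http://localhost:8000/api/v1/connections/sync \
  -H "Content-Type: application/json" \
  -d '{"connectionId": "00000000-0000-0000-0000-000000000000"}'
```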
|
||||
|
||||
## How does Airbyte handle connecting to databases that are behind a firewall / NAT?
|
||||
|
||||
We don’t. Airbyte is meant to be self-hosted in your own private cloud.
|
||||
|
||||
## Can I set a start time for my integration?
|
||||
|
||||
[Here](../../understanding-airbyte/connections#sync-schedules) is the link to the docs on scheduling syncs.
|
||||
|
||||
## **Can I disable analytics in Airbyte?**
|
||||
|
||||
Yes, you can control what's sent outside of Airbyte for analytics purposes.
|
||||
|
||||
We added the following telemetry to Airbyte to ensure the best experience for users:
|
||||
|
||||
* Measure usage of features & connectors
|
||||
* Measure failure rate of connectors to address bugs quickly
|
||||
* Reach out to our users about Airbyte community updates if they opt-in
|
||||
* ...
|
||||
|
||||
To disable telemetry, modify the `.env` file and define the following environment variable:
|
||||
|
||||
```text
|
||||
TRACKING_STRATEGY=logging
|
||||
```
|
||||
@@ -1,14 +0,0 @@
|
||||
# Security & Data Audits
|
||||
|
||||
## **How secure is Airbyte?**
|
||||
|
||||
Airbyte is an open-source, self-hosted solution, so it is as secure as your own data infrastructure.
|
||||
|
||||
## **Is Airbyte GDPR compliant?**
|
||||
|
||||
Airbyte is a self-hosted solution, so it doesn’t bring any security or privacy risk to your infrastructure. We do intend to add data quality and privacy compliance features in the future, in order to give you more visibility on that topic.
|
||||
|
||||
## **How does Airbyte charge?**
|
||||
|
||||
We don’t. All connectors are under the MIT license. If you are curious about the business model we have in mind, please check our [company handbook](https://handbook.airbyte.io/strategy/business-model).
|
||||
|
||||
@@ -1,20 +0,0 @@
|
||||
# Transformation and Schemas
|
||||
|
||||
## **Where's the T in Airbyte’s ETL tool?**
|
||||
|
||||
Airbyte is actually an ELT tool, and you have the freedom to use it as an EL-only tool. The transformation part is done by default, but it is optional. You can choose to receive the data in raw \(JSON file for instance\) in your destination.
|
||||
|
||||
We do provide normalization \(if the option is turned on\) so that data analysts / scientists / any users of the data can use it without much effort.
|
||||
|
||||
We also intend to integrate deeply with dbt to make it easier for your team to keep relying on it, if that is what you were already doing.
|
||||
|
||||
## **How does Airbyte handle replication when a data source changes its schema?**
|
||||
|
||||
Airbyte continues to sync data using the configured schema until that schema is updated. Because Airbyte treats all fields as optional, if a field is renamed or deleted in the source, that field simply will no longer be replicated, but all remaining fields will. The same is true for streams as well.
|
||||
|
||||
For now, the schema can only be updated manually in the UI \(by clicking "Update Schema" in the settings page for the connection\). When a schema is updated Airbyte will re-sync all data for that source using the new schema.
|
||||
|
||||
## **How does Airbyte handle namespaces \(or schemas for the DB-inclined\)?**
|
||||
|
||||
Airbyte respects source-defined namespaces when syncing data with a namespace-supported destination. See [this](../../understanding-airbyte/namespaces.md) for more details.
|
||||
|
||||
@@ -1,102 +0,0 @@
|
||||
# Mongo DB
|
||||
|
||||
The MongoDB source supports Full Refresh and Incremental sync strategies.
|
||||
|
||||
## Resulting schema
|
||||
|
||||
MongoDB does not have anything like a table definition, so we have to infer column types from actual attributes and their values. The discover phase has two steps:
|
||||
|
||||
### Step 1. Find all unique properties
|
||||
|
||||
The connector runs a map-reduce command which returns all unique document properties in the collection. The map-reduce approach should be sufficient even for large clusters.
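
As a minimal sketch of the kind of query involved (not the connector's actual code; `my_collection` is a placeholder), a map-reduce that collects the unique top-level property names of a collection could look like this in the `mongo` shell:

```javascript
// Illustrative sketch only: emit every top-level key of every document,
// then print the distinct keys returned by the inline map-reduce result.
var map = function () {
  for (var key in this) {
    emit(key, null);
  }
};
var reduce = function (key, values) {
  return null;
};
db.my_collection
  .mapReduce(map, reduce, { out: { inline: 1 } })
  .results.forEach(function (r) {
    print(r._id); // each _id is a unique property name
  });
```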
|
||||
|
||||
#### Note
|
||||
|
||||
To work with Atlas MongoDB, a **non-free** tier is required, as the free tier does not support the ability to perform the mapReduce operation.
|
||||
|
||||
### Step 2. Determine property types
|
||||
|
||||
For each property found, the connector selects 10k documents from the collection where this property is not empty. If all the selected values have the same type, the connector sets that type on the property. In all other cases the connector falls back to the `string` type.
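
A minimal sketch of this sampling step (again illustrative only; `my_collection` and `my_property` are placeholders) might look like this:

```javascript
// Illustrative sketch only: sample up to 10k non-empty values of a property
// and collect the JavaScript types seen among them.
var seenTypes = {};
db.my_collection
  .find({ my_property: { $exists: true, $ne: null } }, { my_property: 1 })
  .limit(10000)
  .forEach(function (doc) {
    seenTypes[typeof doc.my_property] = true;
  });
printjson(seenTypes); // one type => use it; several types => fall back to "string"
```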
|
||||
|
||||
## Features
|
||||
|
||||
| Feature | Supported |
|
||||
| :--- | :--- |
|
||||
| Full Refresh Sync | Yes |
|
||||
| Incremental - Append Sync | Yes |
|
||||
| Replicate Incremental Deletes | No |
|
||||
| Namespaces | No |
|
||||
|
||||
### Full Refresh sync
|
||||
|
||||
Works like a standard full refresh sync.
|
||||
|
||||
### Incremental sync
|
||||
|
||||
The cursor field cannot be nested. Currently only top-level document properties are supported.
|
||||
|
||||
The cursor should **never** be blank. If the cursor is blank, the incremental sync results may be unpredictable and will rely entirely on MongoDB's comparison algorithm.
|
||||
|
||||
Only `datetime` and `integer` cursor types are supported. Cursor type is determined based on the cursor field name:
|
||||
|
||||
* `datetime` - if cursor field name contains a string from: `time`, `date`, `_at`, `timestamp`, `ts`
|
||||
* `integer` - otherwise
|
||||
|
||||
## Getting started
|
||||
|
||||
This guide describes in detail how you can configure MongoDB for integration with Airbyte.
|
||||
|
||||
### Create users
|
||||
|
||||
Run the `mongo` shell, switch to the `admin` database, and create a `READ_ONLY_USER`. `READ_ONLY_USER` will be used for the Airbyte integration. Please make sure that the user has read-only privileges.
|
||||
|
||||
```javascript
|
||||
mongo
|
||||
use admin;
|
||||
db.createUser({user: "READ_ONLY_USER", pwd: "READ_ONLY_PASSWORD", roles: [{role: "read", db: "TARGET_DATABASE"}]});
|
||||
```
|
||||
|
||||
Make sure the user has appropriate access levels.
|
||||
|
||||
### Configure application
|
||||
|
||||
If your application uses MongoDB without authentication, you will have to adjust your code base and MongoDB config to enable authentication. **Otherwise your application might go down once MongoDB authentication is enabled.**
|
||||
|
||||
### Enable MongoDB authentication
|
||||
|
||||
Open `/etc/mongod.conf` and add/replace specific keys:
|
||||
|
||||
```yaml
|
||||
net:
|
||||
  bindIp: 0.0.0.0
|
||||
|
||||
security:
|
||||
  authorization: enabled
|
||||
```
|
||||
|
||||
Binding to `0.0.0.0` will allow connections to the database from any IP address.
|
||||
|
||||
The last line will enable MongoDB security. Now only authenticated users will be able to access the database.
|
||||
|
||||
### Configure firewall
|
||||
|
||||
Make sure that MongoDB is accessible from external servers. Specific commands will depend on the firewall you are using \(UFW/iptables/AWS/etc\). Please refer to the appropriate documentation.
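
For example, with UFW the rule might look like the following (illustrative only; `203.0.113.10` is a placeholder for your Airbyte host's IP, and 27017 is MongoDB's default port):

```
ufw allow from 203.0.113.10 to any port 27017 proto tcp
ufw status
```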
|
||||
|
||||
Your `READ_ONLY_USER` should now be ready for use with Airbyte.
|
||||
|
||||
|
||||
#### Possible configuration parameters
|
||||
|
||||
* [Authentication Source](https://docs.mongodb.com/manual/reference/connection-string/#mongodb-urioption-urioption.authSource)
|
||||
* Host: URL of the database
|
||||
* Port: Port to use for connecting to the database
|
||||
* User: username to use when connecting
|
||||
* Password: used to authenticate the user
|
||||
* [Replica Set](https://docs.mongodb.com/manual/reference/connection-string/#mongodb-urioption-urioption.replicaSet)
|
||||
* Whether to enable SSL
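
For illustration, these parameters typically combine into a standard MongoDB connection string such as the one below (host, port, database, and replica set name are placeholders):

```
mongodb://READ_ONLY_USER:READ_ONLY_PASSWORD@db.example.com:27017/TARGET_DATABASE?authSource=admin&replicaSet=rs0&ssl=true
```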
|
||||
|
||||
|
||||
## Changelog
|
||||
| Version | Date | Pull Request | Subject |
|
||||
| :------ | :-------- | :----- | :------ |
|
||||
| 0.2.3 | 2021-07-20 | [4669](https://github.com/airbytehq/airbyte/pull/4669) | Subscriptions Stream now returns all kinds of subscriptions (including expired and canceled)|
|
||||
@@ -1,28 +0,0 @@
|
||||
# Securing Airbyte access
|
||||
|
||||
## Reporting Vulnerabilities
|
||||
⚠️ Please do not file GitHub issues or post on our public forum for security vulnerabilities as they are public! ⚠️
|
||||
|
||||
Airbyte takes security issues very seriously. If you have any concern around Airbyte or believe you have uncovered a vulnerability, please get in touch via the e-mail address security@airbyte.io. In the message, try to provide a description of the issue and ideally a way of reproducing it. The security team will get back to you as soon as possible.
|
||||
|
||||
Note that this security address should be used only for undisclosed vulnerabilities. Dealing with fixed issues or general questions on how to use the security features should be handled regularly via the user and the dev lists. Please report any security problems to us before disclosing them publicly.
|
||||
|
||||
## Access control
|
||||
|
||||
Airbyte, in its open-source version, does not support RBAC to manage access to the UI.
|
||||
|
||||
However, multiple options exist for the operators to implement access control themselves.
|
||||
|
||||
To secure access to Airbyte you have three options:
|
||||
* Networking restrictions: deploy Airbyte in a private network or use a firewall to filter which IP is allowed to access your host.
|
||||
* Put Airbyte behind a reverse proxy and handle the access control on the reverse proxy side.
|
||||
* If you deployed Airbyte on a cloud provider:
|
||||
* GCP: use the [Identity-Aware proxy](https://cloud.google.com/iap) service
|
||||
* AWS: use the [AWS Systems Manager Session Manager](https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager.html) service
|
||||
|
||||
A **non-exhaustive** list of online resources for setting up auth on your reverse proxy (a minimal NGINX sketch follows the list):
|
||||
* [Configure HTTP Basic Auth on NGINX for Airbyte](https://shadabshaukat.medium.com/deploy-and-secure-airbyte-with-nginx-reverse-proxy-basic-authentication-lets-encrypt-ssl-72bee223a4d9)
|
||||
* [Kubernetes: Basic auth on a Nginx ingress controller](https://kubernetes.github.io/ingress-nginx/examples/auth/basic/)
|
||||
* [How to set up Okta SSO on an NGINX reverse proxy](https://developer.okta.com/blog/2018/08/28/nginx-auth-request)
|
||||
* [How to enable HTTP Basic Auth on Caddy](https://caddyserver.com/docs/caddyfile/directives/basicauth)
|
||||
* [SSO for Traefik](https://github.com/thomseddon/traefik-forward-auth)
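
As a minimal sketch of the reverse proxy approach (assuming NGINX, a password file created with `htpasswd`, and the Airbyte UI listening on `localhost:8000`), the server block could look roughly like this:

```
server {
    listen 80;
    server_name airbyte.example.com;

    location / {
        auth_basic "Airbyte";
        auth_basic_user_file /etc/nginx/.htpasswd;
        proxy_pass http://localhost:8000;
    }
}
```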
|
||||
@@ -1,106 +0,0 @@
|
||||
# Core Concepts
|
||||
|
||||
Airbyte enables you to build data pipelines and replicate data from a source to a destination. You can configure how frequently the data is synced, what data is replicated, and how the data is written to the destination.
|
||||
|
||||
This page describes the concepts you need to know to use Airbyte.
|
||||
|
||||
## Source
|
||||
|
||||
A source is an API, file, database, or data warehouse that you want to ingest data from.
|
||||
|
||||
## Destination
|
||||
|
||||
A destination is a data warehouse, data lake, database, or an analytics tool where you want to load your ingested data.
|
||||
|
||||
## Connector
|
||||
|
||||
An Airbyte component which pulls data from a source or pushes data to a destination.
|
||||
|
||||
## Connection
|
||||
|
||||
A connection is an automated data pipeline that replicates data from a source to a destination. Setting up a connection enables configuration of the following parameters:
|
||||
|
||||
| Concept | Description |
|
||||
|---------------------|---------------------------------------------------------------------------------------------------------------------|
|
||||
| Replication Frequency | When should a data sync be triggered? |
|
||||
| Destination Namespace and Stream Prefix | Where should the replicated data be written? |
|
||||
| Catalog Selection | What data (streams and columns) should be replicated from the source to the destination? |
|
||||
| Sync Mode | How should the streams be replicated (read and written)? |
|
||||
| Schema Propagation | How should Airbyte handle schema drift in sources? |
|
||||
## Stream
|
||||
|
||||
A stream is a group of related records.
|
||||
|
||||
Examples of streams:
|
||||
|
||||
- A table in a relational database
|
||||
- A resource or API endpoint for a REST API
|
||||
- The records from a directory containing many files in a filesystem
|
||||
|
||||
## Field
|
||||
|
||||
A field is an attribute of a record in a stream.
|
||||
|
||||
Examples of fields:
|
||||
|
||||
- A column in the table in a relational database
|
||||
- A field in an API response
|
||||
|
||||
## Namespace
|
||||
|
||||
Namespace is a method of grouping streams in a source or destination. Namespaces are generally used to organize data, segregate test and production data, and enforce permissions. In a relational database system, this is known as a schema.
|
||||
|
||||
In a source, the namespace is the location from where the data is replicated to the destination. In a destination, the namespace is the location where the replicated data is stored in the destination.
|
||||
|
||||
Airbyte supports the following configuration options for a connection:
|
||||
|
||||
| Destination Namespace | Description |
|
||||
| ---------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
|
||||
| Destination default | All streams will be replicated to the single default namespace defined by the Destination. For more details, see <a href="/understanding-airbyte/namespaces#--destination-connector-settings"> Destination Connector Settings</a> |
|
||||
| Mirror source structure | Some sources (for example, databases) provide namespace information for a stream. If a source provides namespace information, the destination will mirror the same namespace when this configuration is set. For sources or streams where the source namespace is not known, the behavior will default to the "Destination default" option. |
|
||||
| Custom format | All streams will be replicated to a single user-defined namespace. See<a href="/understanding-airbyte/namespaces#--custom-format"> Custom format</a> for more details |
|
||||
|
||||
## Connection sync modes
|
||||
|
||||
A sync mode governs how Airbyte reads from a source and writes to a destination. Airbyte provides different sync modes to account for various use cases.
|
||||
|
||||
- **Full Refresh | Overwrite:** Sync all records from the source and replace data in destination by overwriting it each time.
|
||||
- **Full Refresh | Append:** Sync all records from the source and add them to the destination without deleting any data. This creates a historical copy of all records each sync.
|
||||
- **Incremental Sync | Append:** Sync new records from the source and add them to the destination without deleting any data. This enables efficient historical tracking over time of data.
|
||||
- **Incremental Sync | Append + Deduped:** Sync new records from the source and add them to the destination. Also provides a de-duplicated view mirroring the state of the stream in the source. This is the most common replication use case.
|
||||
|
||||
## Normalization
|
||||
|
||||
Normalization is the process of structuring data from the source into a format appropriate for consumption in the destination. For example, when writing data from a nested, dynamically typed source like a JSON API to a relational destination like Postgres, normalization is the process which un-nests JSON from the source into a relational table format which uses the appropriate column types in the destination.
|
||||
|
||||
Note that normalization is only relevant for the following relational database & warehouse destinations:
|
||||
|
||||
- Redshift
|
||||
- Postgres
|
||||
- Oracle
|
||||
- MySQL
|
||||
- MSSQL
|
||||
|
||||
Other destinations do not support normalization as described in this section, though they may normalize data in a format that makes sense for them. For example, the S3 destination connector offers the option of writing JSON files in S3, but also offers the option of writing statically typed files such as Parquet or Avro.
|
||||
|
||||
After a sync is complete, Airbyte normalizes the data. When setting up a connection, you can choose one of the following normalization options:
|
||||
|
||||
- Raw data (no normalization): Airbyte places the JSON blob version of your data in a table called `_airbyte_raw_<stream name>`
|
||||
- Basic Normalization: Airbyte converts the raw JSON blob version of your data to the format of your destination. _Note: Not all destinations support normalization._
|
||||
- [dbt Cloud integration](https://docs.airbyte.com/cloud/managing-airbyte-cloud/dbt-cloud-integration): Airbyte's dbt Cloud integration allows you to use dbt Cloud for transforming and cleaning your data during the normalization process.
|
||||
|
||||
:::note
|
||||
|
||||
Normalizing data may cause an increase in your destination's compute cost. This cost will vary depending on the amount of data that is normalized and is not related to Airbyte credit usage.
|
||||
|
||||
:::
|
||||
|
||||
## Workspace
|
||||
|
||||
A workspace is a grouping of sources, destinations, connections, and other configurations. It lets you collaborate with team members and share resources across your team under a shared billing account.
|
||||
|
||||
When you [sign up](http://cloud.airbyte.com/signup) for Airbyte Cloud, we automatically create your first workspace where you are the only user with access. You can set up your sources and destinations to start syncing data and invite other users to join your workspace.
|
||||
|
||||
## Glossary of Terms
|
||||
|
||||
You can find an extended list of [Airbyte specific terms](https://glossary.airbyte.com/term/airbyte-glossary-of-terms/), [data engineering concepts](https://glossary.airbyte.com/term/data-engineering-concepts) or many [other data related terms](https://glossary.airbyte.com/).
|
||||
@@ -1,178 +0,0 @@
|
||||
# Getting Started with Airbyte Cloud
|
||||
|
||||
This page guides you through setting up your Airbyte Cloud account, setting up a source, destination, and connection, verifying the sync, and allowlisting an IP address.
|
||||
|
||||
## Set up your Airbyte Cloud account
|
||||
|
||||
To use Airbyte Cloud:
|
||||
|
||||
1. If you haven't already, [sign up for Airbyte Cloud](https://cloud.airbyte.com/signup?utm_campaign=22Q1_AirbyteCloudSignUpCampaign_Trial&utm_source=Docs&utm_content=SetupGuide) using your email address, Google login, or GitHub login.
|
||||
|
||||
Airbyte Cloud offers a 14-day free trial that begins after your first successful sync. For more information, see [Pricing](https://airbyte.com/pricing).
|
||||
|
||||
:::note
|
||||
If you are invited to a workspace, you currently cannot use your Google login to create a new Airbyte account.
|
||||
:::
|
||||
|
||||
2. If you signed up using your email address, Airbyte will send you an email with a verification link. On clicking the link, you'll be taken to your new workspace.
|
||||
|
||||
:::info
|
||||
A workspace lets you collaborate with team members and share resources across your team under a shared billing account.
|
||||
:::
|
||||
|
||||
## Set up a source
|
||||
|
||||
:::info
|
||||
A source is an API, file, database, or data warehouse that you want to ingest data from.
|
||||
:::
|
||||
|
||||
To set up a source:
|
||||
|
||||
1. On the Airbyte Cloud dashboard, click **Sources**.
|
||||
2. On the Set up the source page, select the source you want to set up from the **Source catalog**. Airbyte currently offers more than 200 source connectors in Cloud to choose from. Once you've selected the source, a Setup Guide will lead you through the authentication and setup of the source.
|
||||
|
||||
3. Click **Set up source**.
|
||||
|
||||
## Set up a destination
|
||||
|
||||
:::info
|
||||
A destination is a data warehouse, data lake, database, or an analytics tool where you want to load your extracted data.
|
||||
:::
|
||||
|
||||
To set up a destination:
|
||||
|
||||
1. On the Airbyte Cloud dashboard, click **Destinations**.
|
||||
2. On the Set up the Destination page, select the destination you want to set up from the **Destination catalog**. Airbyte currently offers more than 38 destination connectors in Cloud to choose from. Once you've selected the destination, a Setup Guide will lead you through the authentication and setup of the destination.
|
||||
3. Click **Set up destination**.
|
||||
|
||||
## Set up a connection
|
||||
|
||||
:::info
|
||||
A connection is an automated data pipeline that replicates data from a source to a destination.
|
||||
:::
|
||||
|
||||
Setting up a connection involves configuring the following parameters:
|
||||
|
||||
| Replication Setting | Description |
|
||||
| ---------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
|
||||
| [Destination Namespace](../understanding-airbyte/namespaces.md) and stream prefix | Where should the replicated data be written to? |
|
||||
| Replication Frequency | How often should the data sync? |
|
||||
| [Data Residency](https://docs.airbyte.com/cloud/managing-airbyte-cloud/manage-data-residency#choose-the-data-residency-for-a-connection) | Where should the data be processed? |
|
||||
| [Schema Propagation](https://docs.airbyte.com/cloud/managing-airbyte-cloud/manage-schema-changes) | Should schema drift be automated? |
|
||||
|
||||
After configuring the connection settings, you will then define specifically what data will be synced.
|
||||
|
||||
:::info
|
||||
A connection's schema consists of one or many streams. Each stream is most commonly associated with a database table or an API endpoint. Within a stream, there can be one or many fields or columns.
|
||||
:::
|
||||
|
||||
| Catalog Selection | Description |
|
||||
| ---------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
|
||||
| Stream Selection | Which streams should be replicated from the source to the destination? |
|
||||
| Column Selection | Which fields should be included in the sync? |
|
||||
| [Sync Mode](../understanding-airbyte/connections/README.md) | How should the streams be replicated (read and written)? |
|
||||
|
||||
To set up a connection:
|
||||
|
||||
:::tip
|
||||
|
||||
Set your [default data residency](https://docs.airbyte.com/cloud/managing-airbyte-cloud/manage-data-residency#choose-your-default-data-residency) before creating a new connection to ensure your data is processed in the correct region.
|
||||
|
||||
:::
|
||||
|
||||
1. On the Airbyte Cloud dashboard, click **Connections** and then click **+ New connection**.
|
||||
2. Select a source:
|
||||
|
||||
- To use a data source you've already set up with Airbyte, select from the list of existing sources. Click the source to use it.
|
||||
- To set up a new source, select **Set up a new source** and fill out the fields relevant to your source using the Setup Guide.
|
||||
|
||||
3. Select a destination:
|
||||
|
||||
- To use a data source you've already set up with Airbyte, select from the list of existing destinations. Click the destination to use it.
|
||||
- To set up a new destination, select **Set up a new destination** and fill out the fields relevant to your destination using the Setup Guide.
|
||||
|
||||
Airbyte will scan the schema of the source, and then display the **Connection Configuration** page.
|
||||
|
||||
4. From the **Replication frequency** dropdown, select how often you want the data to sync from the source to the destination. The default replication frequency is **Every 24 hours**. You can also set up [cron scheduling](http://www.quartz-scheduler.org/documentation/quartz-2.3.0/tutorials/crontrigger.html).
|
||||
|
||||
Reach out to [Sales](https://airbyte.com/company/talk-to-sales) if you require replication more frequently than once per hour.
|
||||
|
||||
5. From the **Destination Namespace** dropdown, select the format in which you want to store the data in the destination. Note: The default configuration is **Destination default**.
|
||||
|
||||
| Destination Namespace | Description |
|
||||
| ---------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
|
||||
| Destination default | All streams will be replicated to the single default namespace defined by the Destination. For more details, see<a href="/understanding-airbyte/namespaces#--destination-connector-settings"> Destination Connector Settings</a> |
|
||||
| Mirror source structure | Some sources (for example, databases) provide namespace information for a stream. If a source provides namespace information, the destination will mirror the same namespace when this configuration is set. For sources or streams where the source namespace is not known, the behavior will default to the "Destination default" option. |
|
||||
| Custom format | All streams will be replicated to a single user-defined namespace. See<a href="/understanding-airbyte/namespaces#--custom-format"> Custom format</a> for more details |
|
||||
|
||||
:::tip
|
||||
To ensure your data is synced correctly, see our examples of how to use the [Destination Namespace](../understanding-airbyte/namespaces.md#examples)
|
||||
:::
|
||||
|
||||
6. (Optional) In the **Destination Stream Prefix (Optional)** field, add a prefix to stream names. For example, adding a prefix `airbyte_` renames the stream `projects` to `airbyte_projects`. This is helpful if you are sending multiple connections to the same Destination Namespace to ensure connections do not conflict when writing to the destination.
|
||||
|
||||
7. In the **Detect and propagate schema changes** dropdown, select whether Airbyte should propagate schema changes. See more details about how we handle [schema changes](https://docs.airbyte.com/cloud/managing-airbyte-cloud/manage-schema-changes).
|
||||
|
||||
|
||||
8. Activate the streams you want to sync by toggling the **Sync** button on. Use the **Search stream name** search box to find streams quickly. If you want to sync all streams, bulk toggle to enable all streams.
|
||||
|
||||
9. Configure the stream settings:
|
||||
1. **Data Destination**: Where the data will land in the destination
|
||||
2. **Stream**: The table name in the source
|
||||
3. **Sync mode**: How the data will be replicated from the source to the destination.
|
||||
|
||||
For the source:
|
||||
|
||||
- Select **Full Refresh** to copy the entire dataset each time you sync
|
||||
- Select **Incremental** to replicate only the new or modified data
|
||||
|
||||
For the destination:
|
||||
|
||||
- Select **Overwrite** to erase the old data and replace it completely
|
||||
- Select **Append** to capture changes to your table
|
||||
**Note:** This creates duplicate records
|
||||
- Select **Append + Deduped** to mirror your source while keeping records unique (most common)
|
||||
|
||||
**Note:** Some sync modes may not yet be available for the source or destination.
|
||||
|
||||
4. **Cursor field**: Used in **Incremental** sync mode to determine which records to sync. Airbyte pre-selects the cursor field for you (example: updated date). If you have multiple cursor fields, select the one you want.
|
||||
5. **Primary key**: Used in **Append + Deduped** sync mode to determine the unique identifier.
|
||||
6. Choose which fields or columns to sync. By default, all fields are synced.
|
||||
|
||||
10. Click **Set up connection**.
|
||||
11. Airbyte tests the connection setup. If the test is successful, Airbyte will save the configuration. If the Replication Frequency uses a preset schedule or CRON, your first sync will immediately begin!
|
||||
|
||||
## Verify the sync
|
||||
|
||||
Once the first sync has completed, you can verify it by checking in Airbyte Cloud and in your destination.
|
||||
|
||||
1. On the Airbyte Cloud dashboard, click **Connections**. The list of connections is displayed. Click on the connection you just set up.
|
||||
2. The **Job History** tab shows each sync run, along with the sync summary of data and rows moved. You can also manually trigger syncs or view detailed logs for each sync here.
|
||||
3. Check the data at your destination. If you added a Destination Stream Prefix while setting up the connection, make sure to search for the stream name with the prefix.
|
||||
|
||||
## Allowlist IP addresses
|
||||
|
||||
Depending on your [data residency](https://docs.airbyte.com/cloud/managing-airbyte-cloud/manage-data-residency#choose-your-default-data-residency) location, you may need to allowlist the following IP addresses to enable access to Airbyte:
|
||||
|
||||
### United States and Airbyte Default
|
||||
|
||||
#### GCP region: us-west3
|
||||
|
||||
[comment]: # "IMPORTANT: if changing the list of IP addresses below, you must also update the connector.airbyteCloudIpAddresses LaunchDarkly flag to show the new list so that the correct list is shown in the Airbyte Cloud UI, then reach out to the frontend team and ask them to update the default value in the useAirbyteCloudIps hook!"
|
||||
|
||||
- 34.106.109.131
|
||||
- 34.106.196.165
|
||||
- 34.106.60.246
|
||||
- 34.106.229.69
|
||||
- 34.106.127.139
|
||||
- 34.106.218.58
|
||||
- 34.106.115.240
|
||||
- 34.106.225.141
|
||||
|
||||
### European Union
|
||||
|
||||
#### AWS region: eu-west-3
|
||||
|
||||
- 13.37.4.46
|
||||
- 13.37.142.60
|
||||
- 35.181.124.238
|
||||
@@ -1,6 +1,6 @@
|
||||
# Configuring connections
|
||||
|
||||
A connection links a source to a destination and defines how your data will sync. After you have created a connection, you can modify any of the [configuration settings](#configure-connection-settings) or [stream settings](#modify-streams-in-your-connection).
|
||||
A connection links a source to a destination and defines how your data will sync. After you have created a connection, you can modify any of the configuration settings or stream settings.
|
||||
|
||||
## Configure Connection Settings
|
||||
|
||||
@@ -8,7 +8,7 @@ Configuring the connection settings allows you to manage various aspects of the
|
||||
|
||||
To configure these settings:
|
||||
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com) dashboard, click **Connections** and then click the connection you want to change.
|
||||
1. In the Airbyte UI, click **Connections** and then click the connection you want to change.
|
||||
|
||||
2. Click the **Replication** tab.
|
||||
|
||||
@@ -24,25 +24,11 @@ You can configure the following settings:
|
||||
|
||||
| Setting | Description |
|
||||
|--------------------------------------|-------------------------------------------------------------------------------------|
|
||||
| Replication frequency | How often the data syncs |
|
||||
| Destination namespace | Where the replicated data is written |
|
||||
| [Replication frequency](/using-airbyte/core-concepts/sync-schedules.md) | How often the data syncs |
|
||||
| [Destination namespace](/using-airbyte/core-concepts/namespaces.md) | Where the replicated data is written |
|
||||
| Destination stream prefix | How you identify streams from different connectors |
|
||||
| [Detect and propagate schema changes](https://docs.airbyte.com/cloud/managing-airbyte-cloud/manage-schema-changes/#review-non-breaking-schema-changes) | How Airbyte handles syncs when it detects schema changes in the source |
|
||||
| Connection Data Residency | Where data will be processed |
|
||||
|
||||
To use [cron scheduling](http://www.quartz-scheduler.org/documentation/quartz-2.3.0/tutorials/crontrigger.html):
|
||||
|
||||
1. In the **Replication Frequency** dropdown, click **Cron**.
|
||||
|
||||
2. Enter a cron expression and choose a time zone to create a sync schedule.
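For example, the following Quartz cron expression (an illustrative sketch; adjust it to your own needs) schedules a sync every day at 12:00 PM in the selected time zone:

```
0 0 12 * * ?
```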
|
||||
|
||||
:::note
|
||||
|
||||
* Only one sync per connection can run at a time.
|
||||
* If a sync is scheduled to run before the previous sync finishes, the scheduled sync will start after the completion of the previous sync.
|
||||
* Reach out to [Sales](https://airbyte.com/company/talk-to-sales) if you require replication more frequently than once per hour.
|
||||
|
||||
:::
|
||||
| [Detect and propagate schema changes](/cloud/managing-airbyte-cloud/manage-schema-changes.md) | How Airbyte handles syncs when it detects schema changes in the source |
|
||||
| [Connection Data Residency](/cloud/managing-airbyte-cloud/manage-data-residency.md) | Where data will be processed |
|
||||
|
||||
## Modify streams in your connection
|
||||
|
||||
@@ -54,7 +40,7 @@ A connection's schema consists of one or many streams. Each stream is most commo
|
||||
|
||||
To modify streams:
|
||||
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com) dashboard, click **Connections** and then click the connection you want to change.
|
||||
1. In the Airbyte UI, click **Connections** and then click the connection you want to change.
|
||||
|
||||
2. Click the **Replication** tab.
|
||||
|
||||
@@ -74,7 +60,7 @@ Source-defined cursors and primary keys are selected automatically and cannot be
|
||||
|
||||
3. Click on a stream to display the stream details panel. You'll see each column we detect from the source.
|
||||
|
||||
4. Toggle individual fields or columns to include or exclude them in the sync, or use the toggle in the table header to select all fields at once.
|
||||
4. Column selection is available to protect PII or sensitive data from being synced to the destination. Toggle individual fields to include or exclude them in the sync, or use the toggle in the table header to select all fields at once.
|
||||
|
||||
:::info
|
||||
|
||||
|
||||
@@ -1,7 +1,15 @@
|
||||
# Use the dbt Cloud integration
|
||||
|
||||
<AppliesTo cloud />
|
||||
|
||||
By using the dbt Cloud integration, you can create and run dbt transformations during syncs in Airbyte Cloud. This allows you to transform raw data into a format that is suitable for analysis and reporting, including cleaning and enriching the data.
|
||||
|
||||
:::note
|
||||
|
||||
Normalizing data may cause an increase in your destination's compute cost. This cost will vary depending on the amount of data that is normalized and is not related to Airbyte credit usage.
|
||||
|
||||
:::
|
||||
|
||||
## Step 1: Generate a service token
|
||||
|
||||
Generate a [service token](https://docs.getdbt.com/docs/dbt-cloud-apis/service-tokens#generating-service-account-tokens) for your dbt Cloud transformation.
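If you want to sanity-check the token before adding it to Airbyte, one option is to call the dbt Cloud API with it. This is only an illustrative sketch; it assumes the standard `cloud.getdbt.com` host and a token with read access to your account:

```bash
# Hypothetical check: a valid service token should return your dbt Cloud accounts
curl -s -H "Authorization: Token <your-service-token>" \
  https://cloud.getdbt.com/api/v2/accounts/
```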
|
||||
@@ -17,7 +25,7 @@ Generate a [service token](https://docs.getdbt.com/docs/dbt-cloud-apis/service-t
|
||||
|
||||
To set up the dbt Cloud integration in Airbyte Cloud:
|
||||
|
||||
1. On the Airbyte Cloud dashboard, click **Settings**.
|
||||
1. In the Airbyte UI, click **Settings**.
|
||||
|
||||
2. Click **dbt Cloud integration**.
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# Manage notifications
|
||||
|
||||
This page provides guidance on how to manage notifications for Airbyte Cloud, allowing you to stay up-to-date on the activities in your workspace.
|
||||
This page provides guidance on how to manage notifications for Airbyte, allowing you to stay up-to-date on the activities in your workspace.
|
||||
|
||||
## Notification Event Types
|
||||
|
||||
@@ -12,41 +12,74 @@ This page provides guidance on how to manage notifications for Airbyte Cloud, al
|
||||
| Connection Updates Requiring Action | A connection update requires you to take action (ex. a breaking schema change is detected) |
|
||||
| Warning - Repeated Failures | A connection will be disabled soon due to repeated failures. It has failed 50 times consecutively or there were only failed jobs in the past 7 days |
|
||||
| Sync Disabled - Repeated Failures | A connection was automatically disabled due to repeated failures. It will be disabled when it has failed 100 times consecutively or has been failing for 14 days in a row |
|
||||
| Warning - Upgrade Required (email only) | A new connector version is available and requires manual upgrade |
|
||||
| Sync Disabled - Upgrade Required (email only) | One or more connections were automatically disabled due to a connector upgrade deadline passing
|
||||
|
|
||||
| Warning - Upgrade Required (Cloud only) | A new connector version is available and requires manual upgrade |
|
||||
| Sync Disabled - Upgrade Required (Cloud only) | One or more connections were automatically disabled due to a connector upgrade deadline passing
|
||||
|
||||
## Configure Notification Settings
|
||||
## Configure Email Notification Settings
|
||||
|
||||
<AppliesTo cloud />
|
||||
|
||||
To set up email notifications:
|
||||
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com) dashboard, click **Settings**.
|
||||
1. In the Airbyte UI, click **Settings** and navigate to **Notifications**.
|
||||
|
||||
2. Click **Notifications**.
|
||||
2. Toggle which messages you'd like to receive from Airbyte. All email notifications will be sent by default to the creator of the workspace. To change the recipient, edit and save the **notification email recipient**. If you would like to send email notifications to more than one recipient, you can enter an email distribution list (e.g., a Google Group) as the recipient.
|
||||
|
||||
3. Toggle which messages you'd like to receive from Airbyte. All email notifications will be sent by default to the creator of the workspace. To change the recipient, edit and save the **notification email recipient**. If you would like to send email notifications to more than one recipient, you can enter an email distribution list (e.g., a Google Group) as the recipient.
|
||||
3. Click **Save changes**.
|
||||
|
||||
:::note
|
||||
All email notifications except for Successful Syncs are enabled by default.
|
||||
:::
|
||||
|
||||
## Configure Slack Notification settings
|
||||
|
||||
To set up Slack notifications:
|
||||
|
||||
If you're more of a visual learner, just head over to [this video](https://www.youtube.com/watch?v=NjYm8F-KiFc&ab_channel=Airbyte) to learn how to do this. You can also refer to the Slack documentation on how to [create an incoming webhook for Slack](https://api.slack.com/messaging/webhooks).
|
||||
|
||||
### Create a Slack app
|
||||
|
||||
1. **Create a Slack App**: Navigate to https://api.slack.com/apps/. Select `Create an App`.
|
||||
|
||||

|
||||
|
||||
2. Select `From Scratch`. Enter your App Name (e.g. Airbyte Sync Notifications) and pick your desired Slack workspace.
|
||||
|
||||
3. **Set up the webhook URL**: In the left sidebar, click `Incoming Webhooks`. Click the slider button in the top right to turn the feature on, then click `Add New Webhook to Workspace`.
|
||||
|
||||

|
||||
|
||||
4. Pick the channel that you want to receive Airbyte notifications in (ideally a dedicated one), and click `Allow` to give it permissions to access the channel. You should see the bot show up in the selected channel now. You will see an active webhook right above the `Add New Webhook to Workspace` button.
|
||||
|
||||

|
||||
|
||||
5. Click `Copy` to copy the link to your clipboard; you will need to enter it into Airbyte.
|
||||
|
||||
Your Webhook URL should look something like this:
|
||||
|
||||

|
||||
|
||||
|
||||
### Enable the Slack notification in Airbyte
|
||||
|
||||
1. In the Airbyte UI, click **Settings** and navigate to **Notifications**.
|
||||
|
||||
2. Paste the copied webhook URL into `Webhook URL`. Using a Slack webhook is recommended. On this page, you can toggle each slider to decide whether you want notifications for each notification type.
|
||||
|
||||
3. **Test it out**: Click `Test` to send a test message to the channel, or just run a sync now and try it out! If all goes well, you should receive a notification in your selected channel that looks like this:
|
||||
|
||||

|
||||
|
||||
You're done!
|
||||
|
||||
4. Click **Save changes**.
|
||||
|
||||
To set up webhook notifications:
|
||||
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com) dashboard, click **Settings**.
|
||||
|
||||
2. Click **Notifications**.
|
||||
|
||||
3. Have a webhook URL ready if you plan to use webhook notifications. Using a Slack webhook is recommended. [Create an Incoming Webhook for Slack](https://api.slack.com/messaging/webhooks).
|
||||
|
||||
4. Toggle the type of events you are interested to receive notifications for.
|
||||
1. To enable webhook notifications, the webhook URL is required. For your convenience, we provide a 'test' function to send a test message to your webhook URL so you can make sure it's working as expected.
|
||||
|
||||
5. Click **Save changes**.
|
||||
|
||||
## Enable schema update notifications
|
||||
|
||||
To get notified when your source schema changes:
|
||||
1. Make sure you have `Automatic Connection Updates` and `Connection Updates Requiring Action` turned on for your desired notification channels; If these are off, even if you turned on schema update notifications in a connection's settings, Airbyte will *NOT* send out any notifications related to these types of events.
|
||||
To be notified of any source schema changes:
|
||||
1. Make sure you have enabled `Automatic Connection Updates` and `Connection Updates Requiring Action` notifications. If these are off, even if you turned on schema update notifications in a connection's settings, Airbyte will *NOT* send out any notifications related to these types of events.
|
||||
|
||||
2. On the [Airbyte Cloud](http://cloud.airbyte.com/) dashboard, click **Connections** and select the connection you want to receive notifications for.
|
||||
2. In the Airbyte UI, click **Connections** and select the connection you want to receive notifications for.
|
||||
|
||||
3. Click the **Settings** tab on the Connection page.
|
||||
|
||||
|
||||
@@ -3,7 +3,7 @@
|
||||
The connection state provides additional information about incremental syncs. It includes the most recent values for the global or stream-level cursors, which can aid in debugging or determining which data will be included in the next sync.
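The exact shape of the state depends on the connector and your Airbyte version, but a stream-level state entry for a connection using an `updated_at` cursor might look roughly like this (illustrative only; stream and field names are made up):

```json
[
  {
    "stream_descriptor": { "name": "orders" },
    "stream_state": { "updated_at": "2023-11-20T00:00:00Z" }
  }
]
```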
|
||||
|
||||
To review the connection state:
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com) dashboard, click **Connections** and then click the connection you want to display.
|
||||
1. In the Airbyte UI, click **Connections** and then click the connection you want to display.
|
||||
|
||||
2. Click the **Settings** tab on the Connection page.
|
||||
|
||||
|
||||
@@ -1,14 +1,16 @@
|
||||
# Manage credits
|
||||
|
||||
<AppliesTo cloud />
|
||||
|
||||
## Buy credits
|
||||
|
||||
Airbyte [credits](https://airbyte.com/pricing) are used to pay for Airbyte resources when you run a sync. You can purchase credits on Airbyte Cloud to keep your data flowing without interruption.
|
||||
|
||||
To buy credits:
|
||||
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com) dashboard, click **Billing** in the navigation bar.
|
||||
1. In the Airbyte UI, click **Billing** in the navigation bar.
|
||||
|
||||
2. If you are unsure of how many credits you need, use our [Cost Estimator](https://cost.airbyte.com/) or click **Talk to Sales** to find the right amount for your team.
|
||||
2. If you are unsure of how many credits you need, use our [Cost Estimator](https://www.airbyte.com/pricing) or click **Talk to Sales** to find the right amount for your team.
|
||||
|
||||
3. Click **Buy credits**.
|
||||
|
||||
@@ -44,7 +46,7 @@ To buy credits:
|
||||
|
||||
You can enroll in automatic top-ups of your credit balance. This is a beta feature for those who do not want to manually add credits each time.
|
||||
|
||||
To enroll, [email us](mailto:natalie@airbyte.io) with:
|
||||
To enroll, [email us](mailto:billing@airbyte.io) with:
|
||||
|
||||
1. A link to your workspace that you'd like to enable this feature for.
|
||||
2. **Recharge threshold**: The credit balance below which you would like the automatic top-up to occur.
|
||||
@@ -59,11 +61,11 @@ To take a real example, if:
|
||||
|
||||
Note that the difference between the recharge credit amount and recharge threshold must be at least 20 as our minimum purchase is 20 credits.
|
||||
|
||||
If you are enrolled and want to change your limits or cancel your enrollment, [email us](mailto:natalie@airbyte.io).
|
||||
If you are enrolled and want to change your limits or cancel your enrollment, [email us](mailto:billing@airbyte.io).
|
||||
|
||||
## View invoice history
|
||||
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com) dashboard, click **Billing** in the navigation bar.
|
||||
1. In the Airbyte UI, click **Billing** in the navigation bar.
|
||||
|
||||
2. Click **Invoice History**. You will be redirected to a Stripe portal.
|
||||
|
||||
|
||||
@@ -1,5 +1,7 @@
|
||||
# Manage data residency
|
||||
|
||||
<AppliesTo cloud />
|
||||
|
||||
In Airbyte Cloud, you can set the default data residency and choose the data residency for individual connections, which can help you comply with data localization requirements.
|
||||
|
||||
## Choose your default data residency
|
||||
@@ -12,11 +14,11 @@ While the data is processed in a data plane of the chosen residency, the cursor
|
||||
|
||||
:::
|
||||
|
||||
When you set the default data residency, it applies to new connections only. If you do not set the default data residency, the [Airbyte Default](https://docs.airbyte.com/cloud/getting-started-with-airbyte-cloud/#united-states-and-airbyte-default) region is used. If you want to change the data residency for a connection, you can do so in its [connection settings](#choose-the-data-residency-for-a-connection).
|
||||
When you set the default data residency, it applies to new connections only. If you do not set the default data residency, the [Airbyte Default](configuring-connections.md) region is used. If you want to change the data residency for a connection, you can do so in its [connection settings](configuring-connections.md).
|
||||
|
||||
To choose your default data residency:
|
||||
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com) dashboard, click **Settings**.
|
||||
1. In the Airbyte UI, click **Settings**.
|
||||
|
||||
2. Click **Data Residency**.
|
||||
|
||||
@@ -26,16 +28,16 @@ To choose your default data residency:
|
||||
|
||||
:::info
|
||||
|
||||
Depending on your network configuration, you may need to add [IP addresses](https://docs.airbyte.com/cloud/getting-started-with-airbyte-cloud/#allowlist-ip-addresses) to your allowlist.
|
||||
Depending on your network configuration, you may need to add [IP addresses](/operating-airbyte/security.md#network-security-1) to your allowlist.
|
||||
|
||||
:::
|
||||
|
||||
## Choose the data residency for a connection
|
||||
You can choose the data residency for your connection in the connection settings. You can also choose data residency when creating a [new connection](https://docs.airbyte.com/cloud/getting-started-with-airbyte-cloud#set-up-a-connection), or you can set the [default data residency](#choose-your-default-data-residency) for your workspace.
|
||||
You can choose the data residency for your connection in the connection settings. You can also choose data residency when creating a new connection, or you can set the default data residency for your workspace.
|
||||
|
||||
To choose the data residency for your connection:
|
||||
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com) dashboard, click **Connections** and then click the connection that you want to change.
|
||||
1. In the Airbyte UI, click **Connections** and then click the connection that you want to change.
|
||||
|
||||
2. Click the **Settings** tab.
|
||||
|
||||
|
||||
@@ -4,6 +4,7 @@ You can specify for each connection how Airbyte should handle any change of sche
|
||||
|
||||
Airbyte checks for any changes in your source schema immediately before syncing, at most once every 24 hours.
|
||||
|
||||
## Detect and Propagate Schema Changes
|
||||
Based on your configured settings for **Detect and propagate schema changes**, Airbyte will automatically sync those changes or ignore them:
|
||||
|
||||
| Setting | Description |
|
||||
@@ -13,6 +14,7 @@ Based on your configured settings for **Detect and propagate schema changes**, A
|
||||
| Ignore | Schema changes will be detected, but not propagated. Syncs will continue running with the schema you've set up. To propagate the detected schema changes, you will need to approve the changes manually |
|
||||
| Pause Connection | Connections will be automatically disabled as soon as any schema changes are detected |
|
||||
|
||||
## Types of Schema Changes
|
||||
When propagation is enabled, your data in the destination will automatically shift to bring in the new changes.
|
||||
|
||||
| Type of Schema Change | Propagation Behavior |
|
||||
@@ -23,6 +25,10 @@ When propagation is enabled, your data in the destination will automatically shi
|
||||
| Removal of stream | The stream will stop updating, and any existing data in the destination will remain. |
|
||||
| Column data type changes | The data in the destination will remain the same. Any new or updated rows with incompatible data types will result in a row error in the raw Airbyte tables. You will need to refresh the schema and do a full resync to ensure the data types are consistent.
|
||||
|
||||
:::tip
|
||||
Ensure you receive webhook notifications for your connection by enabling `Schema update notifications` in the connection's settings.
|
||||
:::
|
||||
|
||||
In all cases, if a breaking schema change is detected, the connection will be paused immediately for manual review to prevent future syncs from failing. Breaking schema changes occur when:
|
||||
* An existing primary key is removed from the source
|
||||
* An existing cursor is removed from the source
|
||||
@@ -33,7 +39,7 @@ To re-enable the streams, ensure the correct **Primary Key** and **Cursor** are
|
||||
|
||||
If the connection is set to **Ignore** any schema changes, Airbyte continues syncing according to your last saved schema. You need to manually approve any detected schema changes for the schema in the destination to change.
|
||||
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com/) dashboard, click **Connections**. Select a connection and navigate to the **Replication** tab. If schema changes are detected, you'll see a blue "i" icon next to the Replication tab.
|
||||
1. In the Airbyte UI, click **Connections**. Select a connection and navigate to the **Replication** tab. If schema changes are detected, you'll see a blue "i" icon next to the Replication tab.
|
||||
|
||||
2. Click **Review changes**.
|
||||
|
||||
@@ -62,7 +68,7 @@ A major version upgrade will include a breaking change if any of these apply:
|
||||
| State Changes | The format of the source’s state has changed, and the full dataset will need to be re-synced |
|
||||
|
||||
To review and fix breaking schema changes:
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com/) dashboard, click **Connections** and select the connection with breaking changes.
|
||||
1. In the Airbyte UI, click **Connections** and select the connection with breaking changes.
|
||||
|
||||
2. Review the description of what has changed in the new version. The breaking change will require you to upgrade your source or destination to a new version by a specific cutoff date.
|
||||
|
||||
@@ -74,13 +80,10 @@ In addition to Airbyte Cloud’s automatic schema change detection, you can manu
|
||||
|
||||
To manually refresh the source schema:
|
||||
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com) dashboard, click **Connections** and then click the connection you want to refresh.
|
||||
1. In the Airbyte UI, click **Connections** and then click the connection you want to refresh.
|
||||
|
||||
2. Click the **Replication** tab.
|
||||
|
||||
3. In the **Activate the streams you want to sync** table, click **Refresh source schema** to fetch the schema of your data source.
|
||||
|
||||
4. If there are changes to the schema, you can review them in the **Refreshed source schema** dialog.
|
||||
|
||||
## Manage Schema Change Notifications
|
||||
[Refer to our notification documentation](https://docs.airbyte.com/cloud/managing-airbyte-cloud/manage-airbyte-cloud-notifications#enable-schema-update-notifications) to understand how to stay updated on any schema updates to your connections.
|
||||
4. If there are changes to the schema, you can review them in the **Refreshed source schema** dialog.
|
||||
@@ -2,9 +2,9 @@
|
||||
The connection status displays information about the connection and each stream being synced. Reviewing this summary allows you to assess the connection's current status and understand when the next sync will be run.
|
||||
|
||||
To review the connection status:
|
||||
1. On the [Airbyte Cloud](http://cloud.airbyte.com/) dashboard, click **Connections**.
|
||||
1. In the Airbyte UI, click **Connections**.
|
||||
|
||||
2. Click a connection in the list to view its status.
|
||||
2. Click a connection in the list to view its status.
|
||||
|
||||
| Status | Description |
|
||||
|------------------|---------------------------------------------------------------------------------------------------------------------|
|
||||
@@ -13,10 +13,20 @@ To review the connection status:
|
||||
| Delayed | The connection has not loaded data within the scheduled replication frequency. For example, if the replication frequency is 1 hour, the connection has not loaded data for more than 1 hour |
|
||||
| Error | The connection has not loaded data in more than two times the scheduled replication frequency. For example, if the replication frequency is 1 hour, the connection has not loaded data for more than 2 hours |
|
||||
| Action Required | A breaking change related to the source or destination requires attention to resolve |
|
||||
| Pending | The connection has not been run yet, so no status exists |
|
||||
| Disabled | The connection has been disabled and is not scheduled to run |
|
||||
| In Progress | The connection is currently extracting or loading data |
|
||||
| Disabled | The connection has been disabled and is not scheduled to run |
|
||||
| Pending | The connection has not been run yet, so no status exists |
|
||||
|
||||
If the most recent sync failed, you'll see an error message that helps diagnose whether the failure is due to a source or destination configuration error. [Reach out](/community/getting-support.md) to us if you need any help to ensure your data continues syncing.
|
||||
|
||||
:::info
|
||||
If a sync starts to fail, it will automatically be disabled after 100 consecutive failures or 14 consecutive days of failure.
|
||||
:::
|
||||
|
||||
If a new major version of the connector has been released, you will also see a banner on this page indicating the cutoff date for the version. Airbyte recommends upgrading before the cutoff date to ensure your data continues syncing. If you do not upgrade before the cutoff date, Airbyte will automatically disable your connection.
|
||||
|
||||
Learn more about version upgrades in our [resolving breaking change documentation](/cloud/managing-airbyte-cloud/manage-schema-changes#resolving-breaking-changes).
|
||||
|
||||
## Review the stream status
|
||||
The stream status allows you to monitor each stream's latest status. The stream will be highlighted with a grey pending bar to indicate the sync is actively extracting or loading data.
|
||||
|
||||
@@ -28,6 +38,7 @@ The stream status allows you to monitor each stream's latest status. The stream
|
||||
|
||||
Each stream shows the last record loaded to the destination. Toggle the header to display the exact datetime the last record was loaded.
|
||||
|
||||
You can reset an individual stream without resetting all streams in a connection by clicking the three grey dots next to any stream. It is recommended to start a new sync after a reset.
|
||||
You can [reset](/operator-guides/reset.md) an individual stream without resetting all streams in a connection by clicking the three grey dots next to any stream.
|
||||
|
||||
You can also navigate directly to the stream's configuration by clicking the three grey dots next to any stream and selecting "Open details".
|
||||
|
||||
You can also navigate directly to the stream's configuration by clicking the three grey dots next to any stream and selecting "Open details".
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
The job history displays information about synced data, such as the amount of data moved, the number of records read and committed, and the total sync time. Reviewing this summary can help you monitor the sync performance and identify any potential issues.
|
||||
|
||||
To review the sync history, click a connection in the list. Sync History displays the sync status or [reset](https://docs.airbyte.com/operator-guides/reset/) status. The sync status is defined as:
|
||||
To review the sync history, click a connection in the list. Sync History displays the sync status or [reset](/operator-guides/reset.md) status. The sync status is defined as:
|
||||
|
||||
| Status | Description |
|
||||
|---------------------|---------------------------------------------------------------------------------------------------------------------|
|
||||
|
||||
@@ -1,16 +1,12 @@
|
||||
# Understand Airbyte Cloud limits
|
||||
# Airbyte Cloud limits
|
||||
|
||||
Understanding the following limitations will help you more effectively manage Airbyte Cloud.
|
||||
|
||||
* Max number of workspaces per user: 3*
|
||||
* Max number of instances of the same source connector: 10*
|
||||
* Max number of destinations in a workspace: 20*
|
||||
* Max number of consecutive sync failures before a connection is paused: 100
|
||||
* Max number of days with consecutive sync failures before a connection is paused: 14 days
|
||||
* Max number of streams that can be returned by a source in a discover call: 1K
|
||||
* Max number of streams that can be configured to sync in a single connection: 1K
|
||||
* Size of a single record: 20MB
|
||||
* Shortest sync schedule: Every 60 min (Reach out to [Sales](https://airbyte.com/company/talk-to-sales) if you require replication more frequently than once per hour)
|
||||
* Schedule accuracy: +/- 30 min
|
||||
|
||||
*Limits on workspaces, sources, and destinations do not apply to customers of [Powered by Airbyte](https://airbyte.com/solutions/powered-by-airbyte). To learn more [contact us](https://airbyte.com/talk-to-sales)!
|
||||
|
||||
91
docs/community/code-of-conduct.md
Normal file
@@ -0,0 +1,91 @@
|
||||
---
|
||||
description: Our Community Code of Conduct
|
||||
---
|
||||
|
||||
# Code of Conduct
|
||||
|
||||
## Our Pledge
|
||||
|
||||
In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to make participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
|
||||
|
||||
## Our Standards
|
||||
|
||||
Examples of behavior that contributes to creating a positive environment include:
|
||||
|
||||
* Using welcoming and inclusive language
|
||||
* Being respectful of differing viewpoints and experiences
|
||||
* Gracefully accepting constructive criticism
|
||||
* Focusing on what is best for the community
|
||||
* Showing empathy towards other community members
|
||||
|
||||
Examples of unacceptable behavior by participants include:
|
||||
|
||||
* The use of sexualized language or imagery and unwelcome sexual attention or advances
|
||||
* Trolling, insulting/derogatory comments, and personal or political attacks
|
||||
* Public or private harassment
|
||||
* Publishing others’ private information, such as a physical or electronic address, without explicit permission
|
||||
* Other conduct which could reasonably be considered inappropriate in a professional setting
|
||||
|
||||
## Our Responsibilities
|
||||
|
||||
Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
|
||||
|
||||
Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
|
||||
|
||||
## Scope
|
||||
|
||||
This Code of Conduct applies within all project spaces, and it also applies when an individual is representing the project or its community in public spaces. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.
|
||||
|
||||
## Enforcement
|
||||
|
||||
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at [conduct@airbyte.io](mailto:conduct@airbyte.io). All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
|
||||
|
||||
Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project’s leadership.
|
||||
|
||||
## Attribution
|
||||
|
||||
This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org/), version 1.4, available at [https://www.contributor-covenant.org/version/1/4/code-of-conduct.html](https://www.contributor-covenant.org/version/1/4/code-of-conduct.html)
|
||||
|
||||
## Slack Code of Conduct
|
||||
|
||||
Airbyte's Slack community is growing incredibly fast. We're home to over 1500 data professionals and are growing at an awesome pace. We are proud of our community, and have provided these guidelines to support new members in maintaining the wholesome spirit we have developed here. We appreciate your continued commitment to making this a community we are all excited to be a part of.
|
||||
|
||||
### Rule 1: Be respectful.
|
||||
|
||||
Our desire is for everyone to have a positive, fulfilling experience in Airbyte Slack, and we sincerely appreciate your help in making this happen.
|
||||
All of the guidelines we provide below are important, but there’s a reason respect is the first rule. We take it seriously, and while the occasional breach of etiquette around Slack is forgivable, we cannot condone disrespectful behavior.
|
||||
|
||||
### Rule 2: Use the most relevant channels.
|
||||
|
||||
We deliberately use topic-specific Slack channels so members of the community can opt-in on various types of conversations. Our members take care to post their messages in the most relevant channel, and you’ll often see reminders about the best place to post a message (respectfully written, of course!). If you're looking for help directly from the Community Assistance Team or other Airbyte employees, please stick to posting in the airbyte-help channel, so we know you're asking us specifically!
|
||||
|
||||
### Rule 3: Don’t double-post.
|
||||
|
||||
Please be considerate of our community members’ time. We know your question is important, but please keep in mind that Airbyte Slack is not a customer service platform but a community of volunteers who will help you as they are able around their own work schedule. You have access to all the history, so it’s easy to check if your question has already been asked.
|
||||
|
||||
### Rule 4: Check question for clarity and thoughtfulness.
|
||||
|
||||
Airbyte Slack is a community of volunteers. Our members enjoy helping others; they are knowledgeable, gracious, and willing to give their time and expertise for free. Putting some effort into a well-researched and thoughtful post shows consideration for their time and will gain more responses.
|
||||
|
||||
### Rule 5: Keep it public.
|
||||
|
||||
This is a public forum; please do not contact individual members of this community without their express permission, regardless of whether you are trying to recruit someone, sell a product, or solicit help.
|
||||
|
||||
### Rule 6: No soliciting!
|
||||
|
||||
The purpose of the Airbyte Slack community is to provide a forum for data practitioners to discuss their work and share their ideas and learnings. It is not intended as a place to generate leads for vendors or recruiters, and may not be used as such.
|
||||
|
||||
If you’re a vendor, you may advertise your product in #shameless-plugs. Advertising your product anywhere else is strictly against the rules.
|
||||
|
||||
### Rule 7: Don't spam tags, or use @here or @channel.
|
||||
|
||||
Using the @here and @channel keywords in a post will not help, as they are disabled in Slack for everyone excluding admins. Nonetheless, if you use them we will remind you with a link to this rule, to help you better understand the way Airbyte Slack operates.
|
||||
|
||||
Do not tag specific individuals for help on your questions. If someone chooses to respond to your question, they will do so. You will find that our community of volunteers is generally very responsive and amazingly helpful!
|
||||
|
||||
### Rule 8: Use threads for discussion.
|
||||
|
||||
The simplest way to keep conversations on track in Slack is to use threads. The Airbyte Slack community relies heavily on threads, and if you break from this convention, rest assured one of our community members will respectfully inform you quickly!
|
||||
|
||||
_If you see a message or receive a direct message that violates any of these rules, please contact an Airbyte team member and we will take the appropriate moderation action immediately. We have zero tolerance for intentional rule-breaking and hate speech._
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
# Airbyte Support
|
||||
# Getting Support
|
||||
|
||||
Hold up! Have you looked at [our docs](https://docs.airbyte.com/) yet? We recommend searching the wealth of knowledge in our documentation as many times the answer you are looking for is there!
|
||||
|
||||
@@ -6,14 +6,26 @@ Hold up! Have you looked at [our docs](https://docs.airbyte.com/) yet? We recomm
|
||||
|
||||
Running Airbyte Open Source and have questions that our docs could not clear up? Post your questions on our [Github Discussions](https://github.com/airbytehq/airbyte/discussions?_gl=1*70s0c6*_ga*MTc1OTkyOTYzNi4xNjQxMjQyMjA0*_ga_HDBMVFQGBH*MTY4OTY5MDQyOC4zNDEuMC4xNjg5NjkwNDI4LjAuMC4w) and also join our community Slack to connect with other Airbyte users.
|
||||
|
||||
### Community Slack
|
||||
|
||||
**Join our Slack community** [HERE](https://slack.airbyte.com/?_gl=1*1h8mjfe*_gcl_au*MTc4MjAxMDQzOS4xNjgyOTczMDYy*_ga*MTc1OTkyOTYzNi4xNjQxMjQyMjA0*_ga_HDBMVFQGBH*MTY4Nzg4OTQ4MC4zMjUuMS4xNjg3ODkwMjE1LjAuMC4w&_ga=2.58571491.813788522.1687789276-1759929636.1641242204)!
|
||||
|
||||
Ask your questions first in the #ask-ai channel and if our bot can not assist you, reach out to our community in the #ask-community-for-troubleshooting channel.
|
||||
|
||||
Ask your questions first in the #ask-ai channel and if our bot can not assist you, reach out to our community in the #ask-community-for-troubleshooting channel.
|
||||
|
||||
If you require personalized support, reach out to our sales team to inquire about [Airbyte Enterprise](https://airbyte.com/airbyte-enterprise).
|
||||
|
||||
### Airbyte Forum
|
||||
|
||||
We are driving our community support from our [forum](https://github.com/airbytehq/airbyte/discussions) on GitHub.
|
||||
|
||||
### Office Hour
|
||||
|
||||
Airbyte provides a [Daily Office Hour](https://airbyte.com/daily-office-hour) to discuss issues.
|
||||
It is a 45-minute meeting: the first 20 minutes are reserved for a weekly topic presentation about Airbyte concepts, and the other 25 minutes are for general questions. The schedule is:
|
||||
* Monday, Wednesday and Fridays: 1 PM PST/PDT
|
||||
* Tuesday and Thursday: 4 PM CEST
|
||||
|
||||
|
||||
## Airbyte Cloud Support
|
||||
|
||||
If you have questions about connector setup, error resolution, or want to report a bug, Airbyte Support is available to assist you. We recommend checking [our documentation](https://docs.airbyte.com/) and searching our [Help Center](https://support.airbyte.com/hc/en-us) before opening a support ticket.
|
||||
@@ -59,5 +71,4 @@ Although we strive to offer our utmost assistance, there are certain requests th
|
||||
* Curating unique documentation and training materials
|
||||
* Configuring Airbyte to meet security requirements
|
||||
|
||||
If you think you will need asssitance when upgrading, we recommend upgrading during our support hours, Monday-Friday 7AM - 7PM ET so we can assist if support is needed. If you upgrade outside of support hours, please submit a ticket and we will assist when we are back online.
|
||||
|
||||
If you think you will need assistance when upgrading, we recommend upgrading during our support hours, Monday-Friday 7AM - 7PM ET so we can assist if support is needed. If you upgrade outside of support hours, please submit a ticket and we will assist when we are back online.
|
||||
@@ -12,7 +12,7 @@ To use incremental syncs, the API endpoint needs to fullfil the following requir
|
||||
- If the record's cursor field is nested, you can use an "Add Field" transformation to copy it to the top-level, and a Remove Field to remove it from the object. This will effectively move the field to the top-level of the record
|
||||
- It's possible to filter/request records by the cursor field
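As a rough illustration of the nested-cursor case, the two transformations described above would conceptually turn a record like the first one below into the second (field names are made up for the example):

```
Before: {"id": 42, "meta": {"updated_at": "2023-11-20T10:00:00Z"}}
After:  {"id": 42, "updated_at": "2023-11-20T10:00:00Z", "meta": {}}
```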
|
||||
|
||||
The knowledge of a cursor value also allows the Airbyte system to automatically keep a history of changes to records in the destination. To learn more about the different modes of incremental sync, check out the [Incremental Sync - Append](/understanding-airbyte/connections/incremental-append/) and [Incremental Sync - Append + Deduped](/understanding-airbyte/connections/incremental-append-deduped) pages.
|
||||
The knowledge of a cursor value also allows the Airbyte system to automatically keep a history of changes to records in the destination. To learn more about the different modes of incremental sync, check out the [Incremental Sync - Append](/using-airbyte/core-concepts/sync-modes/incremental-append/) and [Incremental Sync - Append + Deduped](/using-airbyte/core-concepts/sync-modes/incremental-append-deduped) pages.
|
||||
|
||||
## Configuration
|
||||
|
||||
@@ -132,7 +132,7 @@ Some APIs update records over time but do not allow to filter or search by modif
|
||||
|
||||
In these cases, there are two options:
|
||||
|
||||
- **Do not use incremental sync** and always sync the full set of records to always have a consistent state, losing the advantages of reduced load and [automatic history keeping in the destination](/understanding-airbyte/connections/incremental-append-deduped)
|
||||
- **Do not use incremental sync** and always sync the full set of records to always have a consistent state, losing the advantages of reduced load and [automatic history keeping in the destination](/using-airbyte/core-concepts/sync-modes/incremental-append-deduped)
|
||||
- **Configure the "Lookback window"** to not only sync exclusively new records, but resync some portion of records before the cutoff date to catch changes that were made to existing records, trading off data consistency and the amount of synced records. In the case of the API of The Guardian, news articles tend to only be updated for a few days after the initial release date, so this strategy should be able to catch most updates without having to resync all articles.
|
||||
|
||||
Reiterating the example from above with a "Lookback window" of 2 days configured, let's assume the last encountered article looked like this:
|
||||
|
||||
@@ -321,7 +321,7 @@ Besides bringing the records in the right shape, it's important to communicate s
|
||||
|
||||
### Primary key
|
||||
|
||||
The "Primary key" field specifies how to uniquely identify a record. This is important for downstream de-duplication of records (e.g. by the [incremental sync - Append + Deduped sync mode](/understanding-airbyte/connections/incremental-append-deduped)).
|
||||
The "Primary key" field specifies how to uniquely identify a record. This is important for downstream de-duplication of records (e.g. by the [incremental sync - Append + Deduped sync mode](/using-airbyte/core-concepts/sync-modes/incremental-append-deduped)).
|
||||
|
||||
In a lot of cases, like for the EmailOctopus example from above, there is a dedicated id field that can be used for this purpose. It's important that the value of the id field is guaranteed to only occur once for a single record.
|
||||
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
## Overview
|
||||
|
||||
This tutorial will assume that you already have a working source. If you do not, feel free to refer to the [Building a Toy Connector](building-a-python-source.md) tutorial. This tutorial will build directly off the example from that article. We will also assume that you have a basic understanding of how Airbyte's Incremental-Append replication strategy works. We have a brief explanation of it [here](../../understanding-airbyte/connections/incremental-append.md).
|
||||
This tutorial will assume that you already have a working source. If you do not, feel free to refer to the [Building a Toy Connector](building-a-python-source.md) tutorial. This tutorial will build directly off the example from that article. We will also assume that you have a basic understanding of how Airbyte's Incremental-Append replication strategy works. We have a brief explanation of it [here](/using-airbyte/core-concepts/sync-modes/incremental-append.md).
|
||||
|
||||
## Update Catalog in `discover`
|
||||
|
||||
@@ -293,6 +293,6 @@ Bonus points: go to Airbyte UI and reconfigure the connection to use incremental
|
||||
|
||||
Incremental definitely requires more configurability than full refresh, so your implementation may deviate slightly depending on whether your cursor
|
||||
field is source defined or user-defined. If you think you are running into one of those cases, check out
|
||||
our [incremental](../../understanding-airbyte/connections/incremental-append.md) documentation for more information on different types of
|
||||
our [incremental](/using-airbyte/core-concepts/sync-modes/incremental-append.md) documentation for more information on different types of
|
||||
configuration.
|
||||
|
||||
|
||||
@@ -57,7 +57,7 @@ Here's the outline of what we'll do to build our connector:
|
||||
|
||||
Once we've completed the above steps, we will have built a functioning connector. Then, we'll add some optional functionality:
|
||||
|
||||
- Support [incremental sync](../../understanding-airbyte/connections/incremental-append.md)
|
||||
- Support [incremental sync](/using-airbyte/core-concepts/sync-modes/incremental-append.md)
|
||||
- Add custom integration tests
|
||||
|
||||
### 1. Bootstrap the connector package
|
||||
|
||||
@@ -132,7 +132,7 @@ To add incremental sync, we'll do a few things:
|
||||
6. Update the `path` method to specify the date to pull exchange rates for.
|
||||
7. Update the configured catalog to use `incremental` sync when we're testing the stream.
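For step 7, the configured catalog entry might look roughly like the sketch below. The stream name and cursor follow this tutorial's exchange-rates example; treat the exact field values as illustrative:

```json
{
  "streams": [
    {
      "stream": {
        "name": "exchange_rates",
        "json_schema": {},
        "supported_sync_modes": ["full_refresh", "incremental"]
      },
      "sync_mode": "incremental",
      "destination_sync_mode": "append",
      "cursor_field": ["date"]
    }
  ]
}
```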
|
||||
|
||||
We'll describe what each of these methods does below. Before we begin, it may help to familiarize yourself with how incremental sync works in Airbyte by reading the [docs on incremental](../../../understanding-airbyte/connections/incremental-append.md).
|
||||
We'll describe what each of these methods does below. Before we begin, it may help to familiarize yourself with how incremental sync works in Airbyte by reading the [docs on incremental](/using-airbyte/core-concepts/sync-modes/incremental-append.md).
|
||||
|
||||
To keep things concise, we'll only show functions as we edit them one by one.
|
||||
|
||||
|
||||
@@ -8,7 +8,7 @@ Thank you for your interest in contributing! We love community contributions.
|
||||
Read on to learn how to contribute to Airbyte.
|
||||
We appreciate first time contributors and we are happy to assist you in getting started. In case of questions, just reach out to us via [email](mailto:hey@airbyte.io) or [Slack](https://slack.airbyte.io)!
|
||||
|
||||
Before getting started, please review Airbyte's Code of Conduct. Everyone interacting in Slack, codebases, mailing lists, events, or other Airbyte activities is expected to follow [Code of Conduct](../project-overview/code-of-conduct.md).
|
||||
Before getting started, please review Airbyte's Code of Conduct. Everyone interacting in Slack, codebases, mailing lists, events, or other Airbyte activities is expected to follow [Code of Conduct](../community/code-of-conduct.md).
|
||||
|
||||
## Code Contributions
|
||||
|
||||
|
||||
@@ -13,7 +13,7 @@ The Docs team maintains a list of [#good-first-issues](https://github.com/airbyt
|
||||
|
||||
## Contributing to Airbyte docs
|
||||
|
||||
Before contributing to Airbyte docs, read the Airbyte Community [Code of Conduct](../project-overview/code-of-conduct.md).
|
||||
Before contributing to Airbyte docs, read the Airbyte Community [Code of Conduct](../community/code-of-conduct.md).
|
||||
|
||||
:::tip
|
||||
If you're new to GitHub and Markdown, complete [the First Contributions tutorial](https://github.com/firstcontributions/first-contributions) and learn [Markdown basics](https://guides.github.com/features/mastering-markdown/) before contributing to Airbyte documentation. Even if you're familiar with the basics, you may be interested in Airbyte's [custom markdown extensions for connector docs](#custom-markdown-extensions-for-connector-docs).
|
||||
@@ -276,16 +276,7 @@ Eagle-eyed readers may note that _all_ markdown should support this feature sinc
|
||||
|
||||
### Adding a redirect
|
||||
|
||||
To add a redirect, open the [`docusaurus.config.js`](https://github.com/airbytehq/airbyte/blob/master/docusaurus/docusaurus.config.js#L22) file and locate the following commented section:
|
||||
|
||||
```js
|
||||
// {
|
||||
// from: '/some-lame-path',
|
||||
// to: '/a-much-cooler-uri',
|
||||
// },
|
||||
```
|
||||
|
||||
Copy this section, replace the values, and [test the changes locally](#editing-on-your-local-machine) by going to the path you created a redirect for and verify that the address changes to the new one.
|
||||
To add a redirect, open the [`docusaurus/redirects.yml`](https://github.com/airbytehq/airbyte/blob/master/docusaurus/redirects.yml) file and add an entry from which old path to which new path a redirect should happen.
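An entry in that file might look like the following sketch (both paths are placeholders):

```yaml
# Hypothetical example entry in docusaurus/redirects.yml
- from: /some-old-path
  to: /some-new-path
```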
|
||||
|
||||
:::note
|
||||
Your path **needs** a leading slash `/` to work
|
||||
|
||||
@@ -1,15 +0,0 @@
|
||||
# Deploy Airbyte where you want to
|
||||
|
||||

|
||||
|
||||
- [Local Deployment](local-deployment.md)
|
||||
- [On Airbyte Cloud](on-cloud.md)
|
||||
- [On Aws](on-aws-ec2.md)
|
||||
- [On Azure VM Cloud Shell](on-azure-vm-cloud-shell.md)
|
||||
- [On Digital Ocean Droplet](on-digitalocean-droplet.md)
|
||||
- [On GCP.md](on-gcp-compute-engine.md)
|
||||
- [On Kubernetes](on-kubernetes-via-helm.md)
|
||||
- [On OCI VM](on-oci-vm.md)
|
||||
- [On Restack](on-restack.md)
|
||||
- [On Plural](on-plural.md)
|
||||
- [On AWS ECS (spoiler alert: it doesn't work)](on-aws-ecs.md)
|
||||
@@ -21,8 +21,8 @@ cd airbyte
|
||||
./run-ab-platform.sh
|
||||
```
|
||||
|
||||
- In your browser, just visit [http://localhost:8000](http://localhost:8000)
|
||||
- You will be asked for a username and password. By default, that's username `airbyte` and password `password`. Once you deploy Airbyte to your servers, be sure to change these:
|
||||
- In your browser, visit [http://localhost:8000](http://localhost:8000)
|
||||
- You will be asked for a username and password. By default, that's username `airbyte` and password `password`. Once you deploy Airbyte to your servers, be sure to change these in your `.env` file:
|
||||
|
||||
```yaml
|
||||
# Proxy Configuration
|
||||
@@ -66,5 +66,11 @@ bash run-ab-platform.sh
|
||||
- Start moving some data!
|
||||
|
||||
## Troubleshooting
|
||||
If you have any questions about the local setup and deployment process, head over to our [Getting Started FAQ](https://github.com/airbytehq/airbyte/discussions/categories/questions) on our Airbyte Forum that answers the following questions and more:
|
||||
|
||||
If you encounter any issues, just connect to our [Slack](https://slack.airbyte.io). Our community will help! We also have a [troubleshooting](../troubleshooting.md) section in our docs for common problems.
|
||||
- How long does it take to set up Airbyte?
|
||||
- Where can I see my data once I've run a sync?
|
||||
- Can I set a start time for my sync?
|
||||
|
||||
If you encounter any issues, check out [Getting Support](/community/getting-support) documentation
|
||||
for options how to get in touch with the community or us.
|
||||
|
||||
@@ -1,12 +1,12 @@
|
||||
# Airbyte Self-Managed
|
||||
# Airbyte Enterprise
|
||||
|
||||
[Airbyte Self-Managed](https://airbyte.com/product/airbyte-enterprise) is the best way to run Airbyte yourself. You get all 300+ pre-built connectors, data never leaves your environment, and Airbyte becomes self-serve in your organization with new tools to manage multiple users, and multiple teams using Airbyte all in one place.
|
||||
[Airbyte Enterprise](https://airbyte.com/product/airbyte-enterprise) is the best way to run Airbyte yourself. You get all 300+ pre-built connectors, data never leaves your environment, and Airbyte becomes self-serve in your organization with new tools to manage multiple users, and multiple teams using Airbyte all in one place.
|
||||
|
||||
A valid license key is required to get started with Airbyte Self-Managed. [Talk to sales](https://airbyte.com/company/talk-to-sales) to receive your license key.
|
||||
A valid license key is required to get started with Airbyte Enterprise. [Talk to sales](https://airbyte.com/company/talk-to-sales) to receive your license key.
|
||||
|
||||
The following pages outline how to:
|
||||
1. [Deploy Airbyte Self-Managed using Kubernetes](./implementation-guide.md)
|
||||
2. [Configure Okta for Single Sign-On (SSO) with Airbyte Self-Managed](./sso.md)
|
||||
1. [Deploy Airbyte Enterprise using Kubernetes](./implementation-guide.md)
|
||||
2. [Configure Okta for Single Sign-On (SSO) with Airbyte Enterprise](./sso.md)
|
||||
|
||||
| Feature | Description |
|
||||
|---------------------------|--------------------------------------------------------------------------------------------------------------|
|
||||
@@ -3,15 +3,15 @@ import TabItem from '@theme/TabItem';
|
||||
|
||||
# Implementation Guide
|
||||
|
||||
[Airbyte Self-Managed](./README.md) is in an early access stage for select priority users. Once you [are qualified for an Airbyte Self Managed license key](https://airbyte.com/company/talk-to-sales), you can deploy Airbyte with the following instructions.
|
||||
[Airbyte Enterprise](./README.md) is in an early access stage for select priority users. Once you [are qualified for an Airbyte Enterprise license key](https://airbyte.com/company/talk-to-sales), you can deploy Airbyte with the following instructions.
|
||||
|
||||
Airbyte Self Managed must be deployed using Kubernetes. This is to enable Airbyte's best performance and scale. The core components \(api server, scheduler, etc\) run as deployments while the scheduler launches connector-related pods on different nodes.
|
||||
Airbyte Enterprise must be deployed using Kubernetes. This is to enable Airbyte's best performance and scale. The core components \(api server, scheduler, etc\) run as deployments while the scheduler launches connector-related pods on different nodes.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
There are three prerequisites to deploying Self-Managed: installing [helm](https://helm.sh/docs/intro/install/), a Kubernetes cluster, and having configured `kubectl` to connect to the cluster.
|
||||
There are three prerequisites to deploying Enterprise: installing [helm](https://helm.sh/docs/intro/install/), a Kubernetes cluster, and having configured `kubectl` to connect to the cluster.
|
||||
|
||||
For production, we recommend deploying to EKS, GKE or AKS. If you are doing some local testing, follow the cluster setup instructions outlined [here](../../deploying-airbyte/on-kubernetes-via-helm.md#cluster-setup).
|
||||
For production, we recommend deploying to EKS, GKE or AKS. If you are doing some local testing, follow the cluster setup instructions outlined [here](/deploying-airbyte/on-kubernetes-via-helm.md#cluster-setup).
|
||||
|
||||
To install `kubectl`, please follow [these instructions](https://kubernetes.io/docs/tasks/tools/). To configure `kubectl` to connect to your cluster by using `kubectl use-context my-cluster-name`, see the following:
|
||||
|
||||
@@ -38,7 +38,7 @@ To install `kubectl`, please follow [these instructions](https://kubernetes.io/d
|
||||
</Tabs>
|
||||
</details>
|
||||
|
||||
## Deploy Airbyte Self-Managed
|
||||
## Deploy Airbyte Enterprise
|
||||
|
||||
### Add Airbyte Helm Repository
|
||||
|
||||
@@ -60,7 +60,7 @@ cp configs/airbyte.sample.yml configs/airbyte.yml
|
||||
|
||||
3. Add your Airbyte Enterprise license key to your `airbyte.yml`.
|
||||
|
||||
4. Add your [auth details](/enterprise-setup/self-managed/sso) to your `airbyte.yml`. Auth configurations aren't easy to modify after Airbyte is installed, so please double check them to make sure they're accurate before proceeding.
|
||||
4. Add your [auth details](/enterprise-setup/sso) to your `airbyte.yml`. Auth configurations aren't easy to modify after Airbyte is installed, so please double check them to make sure they're accurate before proceeding.
|
||||
|
||||
<details>
|
||||
<summary>Configuring auth in your airbyte.yml file</summary>
|
||||
@@ -81,7 +81,7 @@ To configure basic auth (deploy without SSO), remove the entire `auth:` section
|
||||
|
||||
</details>
|
||||
|
||||
### Install Airbyte Self Managed
|
||||
### Install Airbyte Enterprise
|
||||
|
||||
Install Airbyte Enterprise on helm using the following command:
|
||||
|
||||
@@ -92,7 +92,7 @@ Install Airbyte Enterprise on helm using the following command:
|
||||
The default release name is `airbyte-pro`. You can change this via the `RELEASE_NAME` environment
|
||||
variable.
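Since the release name is read from the environment, a minimal sketch (the value below is a placeholder) of overriding it before running the install command above would be:

```bash
# Override the default release name before running the install command shown above
# (the value below is a placeholder)
export RELEASE_NAME=airbyte-enterprise
```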
|
||||
|
||||
### Customizing your Airbyte Self Managed Deployment
|
||||
### Customizing your Airbyte Enterprise Deployment
|
||||
|
||||
In order to customize your deployment, you need to create a `values.yaml` file in a local folder and populate it with default configuration override values. An example `values.yaml` can be found in the [charts/airbyte](https://github.com/airbytehq/airbyte-platform/blob/main/charts/airbyte/values.yaml) folder of the Airbyte repository.
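As a hedged sketch of applying such overrides with Helm (the release and chart names below are assumptions and may differ for your deployment):

```bash
# Apply local overrides from values.yaml when installing or upgrading
# (release and chart names are assumptions)
helm upgrade --install airbyte-pro airbyte/airbyte \
  --values ./values.yaml
```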
|
||||
|
||||
@@ -6,7 +6,7 @@ Airbyte Self Managed currently supports SSO via OIDC with [Okta](https://www.okt
|
||||
|
||||
The following instructions walk you through:
|
||||
1. [Setting up the Okta OIDC App Integration to be used by your Airbyte instance](#setting-up-okta-for-sso)
|
||||
2. [Configuring Airbyte Self-Managed to use SSO](#deploying-airbyte-enterprise-with-okta)
|
||||
2. [Configuring Airbyte Enterprise to use SSO](#deploying-airbyte-enterprise-with-okta)
|
||||
|
||||
### Setting up Okta for SSO
|
||||
|
||||
@@ -14,13 +14,13 @@ You will need to create a new Okta OIDC App Integration for your Airbyte instanc
|
||||
|
||||
You should create an app integration with **OIDC - OpenID Connect** as the sign-in method and **Web Application** as the application type:
|
||||
|
||||

|
||||

|
||||
|
||||
#### App integration name
|
||||
|
||||
Please choose a URL-friendly app integration name without spaces or special characters, such as `my-airbyte-app`:
|
||||
|
||||

|
||||

|
||||
|
||||
Spaces or special characters in this field could result in invalid redirect URIs.
|
||||
|
||||
@@ -40,13 +40,13 @@ Sign-out redirect URIs
|
||||
<your-airbyte-domain>/auth/realms/airbyte/broker/<app-integration-name>/endpoint/logout_response
|
||||
```
|
||||
|
||||

|
||||

|
||||
|
||||
_Example values_
|
||||
|
||||
`<your-airbyte-domain>` should point to where your Airbyte instance will be available, including the http/https protocol.
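For illustration only (the domain and app integration name below are made-up values), the resulting sign-out redirect URI would look like:

```bash
# Illustrative values only:
#   <your-airbyte-domain>    = https://airbyte.internal.example.com
#   <app-integration-name>   = my-airbyte-app
# Resulting sign-out redirect URI:
#   https://airbyte.internal.example.com/auth/realms/airbyte/broker/my-airbyte-app/endpoint/logout_response
```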
|
||||
|
||||
## Deploying Airbyte Self-Managed with Okta
|
||||
## Deploying Airbyte Enterprise with Okta
|
||||
|
||||
Once your Okta app is set up, you're ready to deploy Airbyte with SSO. Take note of the following configuration values, as you will need them to configure Airbyte to use your new Okta SSO app integration:
|
||||
|
||||
@@ -10,7 +10,7 @@ Airbyte uses a two tiered system for connectors to help you understand what to e
|
||||
|
||||
**Community**: A community connector is maintained by the Airbyte community until it becomes Certified. Airbyte has over 800 code contributors and 15,000 people in the Slack community to help. The Airbyte team is continually certifying Community connectors as usage grows. As these connectors are not maintained by Airbyte, we do not offer support SLAs around them, and we encourage caution when using them in production.
|
||||
|
||||
For more information about the system, see [Product Support Levels](https://docs.airbyte.com/project-overview/product-support-levels)
|
||||
For more information about the system, see [Connector Support Levels](./connector-support-levels.md)
|
||||
|
||||
_[View the connector registries in full](https://connectors.airbyte.com/files/generated_reports/connector_registry_report.html)_
|
||||
|
||||
|
||||
39
docs/integrations/connector-support-levels.md
Normal file
@@ -0,0 +1,39 @@
|
||||
# Connector Support Levels
|
||||
|
||||
The following table describes the support levels of Airbyte connectors.
|
||||
|
||||
| | Certified | Community | Custom |
|
||||
| ------------------------------------ | ----------------------------------------- | ------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||
| **Availability** | Available to all users | Available to all users | Available to all users |
|
||||
| **Who builds them?** | Either the community or the Airbyte team. | Typically they are built by the community. The Airbyte team may upgrade them to Certified at any time. | Anyone can build custom connectors. We recommend using our [Connector Builder](https://docs.airbyte.com/connector-development/connector-builder-ui/overview) or [Low-code CDK](https://docs.airbyte.com/connector-development/config-based/low-code-cdk-overview). |
|
||||
| **Who maintains them?** | The Airbyte team | Users | Users |
|
||||
| **Production Readiness** | Guaranteed by Airbyte | Not guaranteed | Not guaranteed |
|
||||
| **Support: Cloud** | Supported* | No Support | Supported** |
|
||||
| **Support: Powered by Airbyte** | Supported* | No Support | Supported** |
|
||||
| **Support: Self-Managed Enterprise** | Supported* | No Support | Supported** |
|
||||
| **Support: Community (OSS)** | Slack Support only | No Support | Slack Support only |
|
||||
|
||||
\*For Certified connectors, Official Support SLAs are only available to customers with Premium Support included in their contract. Otherwise, please use our support portal and we will address your issues as soon as possible.
|
||||
|
||||
\*\*For Custom connectors, Official Support SLAs are only available to customers with Premium Support included in their contract. This support is provided with best efforts, and maintenance/upgrades are owned by the customer.
|
||||
|
||||
## Certified
|
||||
|
||||
A **Certified** connector is actively maintained and supported by the Airbyte team and maintains a high quality bar. It is production ready.
|
||||
|
||||
### What you should know about Certified connectors:
|
||||
|
||||
- Certified connectors are available to all users.
|
||||
- These connectors have been tested and vetted in order to be certified and are production ready.
|
||||
- Certified connectors should go through minimal breaking changes, but in the event an upgrade is needed, users will be given an adequate upgrade window.
|
||||
|
||||
## Community
|
||||
|
||||
A **Community** connector is maintained by the Airbyte community until it becomes Certified. Airbyte has over 800 code contributors and 15,000 people in the Slack community to help. The Airbyte team is continually certifying Community connectors as usage grows. As these connectors are not maintained by Airbyte, we do not offer support SLAs around them, and we encourage caution when using them in production.
|
||||
|
||||
### What you should know about Community connectors:
|
||||
|
||||
- Community connectors are available to all users.
|
||||
- Community connectors may be upgraded to Certified at any time, and we will notify users of these upgrades via our Slack Community and in our Connector Catalog.
|
||||
- Community connectors might not be feature-complete (features planned for release are under development or not prioritized) and may include backward-incompatible/breaking API changes with little or no notice.
|
||||
- Community connectors have no Support SLAs.
|
||||
@@ -17,7 +17,7 @@ Only one stream will exist to collect data from all source streams. This will be
|
||||
|
||||
For each record, a UUID string is generated and used as the document id. The embeddings generated as defined will be stored as embeddings. Data in the text fields will be stored as documents and those in the metadata fields will be stored as metadata.
|
||||
|
||||
## Getting Started \(Airbyte Open-Source\)
|
||||
## Getting Started \(Airbyte Open Source\)
|
||||
|
||||
|
||||
You can connect to a Chroma instance either in client/server mode or in a local persistent mode. For the local persistent mode, the database file will be saved in the path defined in the `path` config parameter. Note that `path` must be an absolute path, prefixed with `/local`.
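As an illustration (the exact value is an assumption and depends on your setup), a valid `path` value could look like:

```bash
# Illustrative value for the Chroma destination's `path` parameter:
#   /local/chroma/airbyte_collections
# On a default Docker deployment, /local maps to /tmp/airbyte_local on the host.
```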
|
||||
|
||||
@@ -21,7 +21,7 @@ Each stream will be output into its own table in ClickHouse. Each table will con
|
||||
|
||||
Airbyte Cloud only supports connecting to your ClickHouse instance with SSL or TLS encryption, which is supported by [ClickHouse JDBC driver](https://github.com/ClickHouse/clickhouse-jdbc).
|
||||
|
||||
## Getting Started \(Airbyte Open-Source\)
|
||||
## Getting Started \(Airbyte Open Source\)
|
||||
|
||||
#### Requirements
|
||||
|
||||
|
||||
@@ -69,7 +69,7 @@ You can also copy the output file to your host machine, the following command wi
|
||||
docker cp airbyte-server:/tmp/airbyte_local/{destination_path}/{filename}.csv .
|
||||
```
|
||||
|
||||
Note: If you are running Airbyte on Windows with Docker backed by WSL2, you have to use a similar step as above or refer to this [link](../../operator-guides/locating-files-local-destination.md) for an alternative approach.
|
||||
Note: If you are running Airbyte on Windows with Docker backed by WSL2, you have to use a similar step as above or refer to this [link](/integrations/locating-files-local-destination.md) for an alternative approach.
|
||||
|
||||
## Changelog
|
||||
|
||||
|
||||
@@ -20,7 +20,7 @@ Each stream will be output into its own table in Databend. Each table will conta
|
||||
## Getting Started (Airbyte Cloud)
|
||||
Coming soon...
|
||||
|
||||
## Getting Started (Airbyte Open-Source)
|
||||
## Getting Started (Airbyte Open Source)
|
||||
You can follow the [Connecting to a Warehouse docs](https://docs.databend.com/using-databend-cloud/warehouses/connecting-a-warehouse) to get the user, password, host etc.
|
||||
|
||||
Or you can create such a user by running:
|
||||
|
||||
@@ -98,7 +98,7 @@ You can also copy the output file to your host machine, the following command wi
|
||||
docker cp airbyte-server:/tmp/airbyte_local/{destination_path} .
|
||||
```
|
||||
|
||||
Note: If you are running Airbyte on Windows with Docker backed by WSL2, you have to use a similar step as above or refer to this [link](../../operator-guides/locating-files-local-destination.md) for an alternative approach.
|
||||
Note: If you are running Airbyte on Windows with Docker backed by WSL2, you have to use a similar step as above or refer to this [link](/integrations/locating-files-local-destination.md) for an alternative approach.
|
||||
|
||||
<!-- /env:oss -->
|
||||
|
||||
|
||||
@@ -13,7 +13,7 @@ The Airbyte GCS destination allows you to sync data to cloud storage buckets. Ea
|
||||
| Feature | Support | Notes |
|
||||
| :----------------------------- | :-----: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| Full Refresh Sync | ✅ | Warning: this mode deletes all previously synced data in the configured bucket path. |
|
||||
| Incremental - Append Sync | ✅ | Warning: Airbyte provides at-least-once delivery. Depending on your source, you may see duplicated data. Learn more [here](/understanding-airbyte/connections/incremental-append#inclusive-cursors) |
|
||||
| Incremental - Append Sync | ✅ | Warning: Airbyte provides at-least-once delivery. Depending on your source, you may see duplicated data. Learn more [here](/using-airbyte/core-concepts/sync-modes/incremental-append#inclusive-cursors) |
|
||||
| Incremental - Append + Deduped | ❌ | |
|
||||
| Namespaces | ❌ | Setting a specific bucket path is equivalent to having separate namespaces. |
|
||||
|
||||
|
||||
@@ -69,7 +69,7 @@ You can also copy the output file to your host machine, the following command wi
|
||||
docker cp airbyte-server:/tmp/airbyte_local/{destination_path}/{filename}.jsonl .
|
||||
```
|
||||
|
||||
Note: If you are running Airbyte on Windows with Docker backed by WSL2, you have to use a similar step as above or refer to this [link](../../operator-guides/locating-files-local-destination.md) for an alternative approach.
|
||||
Note: If you are running Airbyte on Windows with Docker backed by WSL2, you have to use a similar step as above or refer to this [link](/integrations/locating-files-local-destination.md) for an alternative approach.
|
||||
|
||||
## Changelog
|
||||
|
||||
|
||||
@@ -25,7 +25,7 @@ Each stream will be output into its own collection in MongoDB. Each collection w
|
||||
|
||||
Airbyte Cloud only supports connecting to your MongoDB instance with TLS encryption. Other than that, you can proceed with the open-source instructions below.
|
||||
|
||||
## Getting Started \(Airbyte Open-Source\)
|
||||
## Getting Started \(Airbyte Open Source\)
|
||||
|
||||
#### Requirements
|
||||
|
||||
|
||||
@@ -33,7 +33,7 @@ Airbyte Cloud only supports connecting to your MSSQL instance with TLS encryptio
|
||||
| Incremental - Append + Deduped | Yes | |
|
||||
| Namespaces | Yes | |
|
||||
|
||||
## Getting Started \(Airbyte Open-Source\)
|
||||
## Getting Started \(Airbyte Open Source\)
|
||||
|
||||
### Requirements
|
||||
|
||||
|
||||
@@ -27,7 +27,7 @@ Each stream will be output into its own table in MySQL. Each table will contain
|
||||
|
||||
Airbyte Cloud only supports connecting to your MySQL instance with TLS encryption. Other than that, you can proceed with the open-source instructions below.
|
||||
|
||||
## Getting Started \(Airbyte Open-Source\)
|
||||
## Getting Started \(Airbyte Open Source\)
|
||||
|
||||
### Requirements
|
||||
|
||||
|
||||
@@ -26,7 +26,7 @@ Enabling normalization will also create normalized, strongly typed tables.
|
||||
|
||||
The Oracle connector is currently in Alpha on Airbyte Cloud. Only TLS encrypted connections to your DB can be made from Airbyte Cloud. Other than that, follow the open-source instructions below.
|
||||
|
||||
## Getting Started \(Airbyte Open-Source\)
|
||||
## Getting Started \(Airbyte Open Source\)
|
||||
|
||||
#### Requirements
|
||||
|
||||
|
||||
@@ -23,7 +23,7 @@
|
||||
| api_server | string | api URL to rockset, specifying http protocol |
|
||||
| workspace | string | workspace under which rockset collections will be added/modified |
|
||||
|
||||
## Getting Started \(Airbyte Open-Source / Airbyte Cloud\)
|
||||
## Getting Started \(Airbyte Open Source / Airbyte Cloud\)
|
||||
|
||||
#### Requirements
|
||||
|
||||
|
||||
@@ -178,7 +178,7 @@ A data sync may create multiple files as the output files can be partitioned by
|
||||
| Feature | Support | Notes |
|
||||
| :----------------------------- | :-----: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| Full Refresh Sync | ✅ | Warning: this mode deletes all previously synced data in the configured bucket path. |
|
||||
| Incremental - Append Sync | ✅ | Warning: Airbyte provides at-least-once delivery. Depending on your source, you may see duplicated data. Learn more [here](/understanding-airbyte/connections/incremental-append#inclusive-cursors) |
|
||||
| Incremental - Append Sync | ✅ | Warning: Airbyte provides at-least-once delivery. Depending on your source, you may see duplicated data. Learn more [here](/using-airbyte/core-concepts/sync-modes/incremental-append#inclusive-cursors) |
|
||||
| Incremental - Append + Deduped | ❌ | |
|
||||
| Namespaces | ❌ | Setting a specific bucket path is equivalent to having separate namespaces. |
|
||||
|
||||
|
||||
@@ -174,7 +174,7 @@ A data sync may create multiple files as the output files can be partitioned by
|
||||
| Feature | Support | Notes |
|
||||
| :----------------------------- | :-----: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| Full Refresh Sync | ✅ | Warning: this mode deletes all previously synced data in the configured bucket path. |
|
||||
| Incremental - Append Sync | ✅ | Warning: Airbyte provides at-least-once delivery. Depending on your source, you may see duplicated data. Learn more [here](/understanding-airbyte/connections/incremental-append#inclusive-cursors) |
|
||||
| Incremental - Append Sync | ✅ | Warning: Airbyte provides at-least-once delivery. Depending on your source, you may see duplicated data. Learn more [here](/using-airbyte/core-concepts/sync-modes/incremental-append#inclusive-cursors) |
|
||||
| Incremental - Append + Deduped | ❌ | |
|
||||
| Namespaces | ❌ | Setting a specific bucket path is equivalent to having separate namespaces. |
|
||||
|
||||
|
||||
@@ -68,7 +68,7 @@ You can also copy the output file to your host machine, the following command wi
|
||||
docker cp airbyte-server:/tmp/airbyte_local/{destination_path} .
|
||||
```
|
||||
|
||||
Note: If you are running Airbyte on Windows with Docker backed by WSL2, you have to use a similar step as above or refer to this [link](../../operator-guides/locating-files-local-destination.md) for an alternative approach.
|
||||
Note: If you are running Airbyte on Windows with Docker backed by WSL2, you have to use a similar step as above or refer to this [link](/integrations/locating-files-local-destination.md) for an alternative approach.
|
||||
|
||||
## Changelog
|
||||
|
||||
|
||||
@@ -16,7 +16,7 @@ Each stream will be output into its own stream in Timeplus, with corresponding s
|
||||
## Getting Started (Airbyte Cloud)
|
||||
Coming soon...
|
||||
|
||||
## Getting Started (Airbyte Open-Source)
|
||||
## Getting Started (Airbyte Open Source)
|
||||
You can follow the [Quickstart with Timeplus Ingestion API](https://docs.timeplus.com/quickstart-ingest-api) to create a workspace and API key.
|
||||
|
||||
### Setup the Timeplus Destination in Airbyte
|
||||
|
||||
@@ -1,70 +0,0 @@
|
||||
# Getting Started: Destination Redshift
|
||||
|
||||
## Requirements
|
||||
|
||||
1. Active Redshift cluster
|
||||
2. Allow connections from Airbyte to your Redshift cluster \(if they exist in separate VPCs\)
|
||||
3. A staging S3 bucket with credentials \(for the COPY strategy\).
|
||||
|
||||
## Setup guide
|
||||
|
||||
### 1. Make sure your cluster is active and accessible from the machine running Airbyte
|
||||
|
||||
This is dependent on your networking setup. The easiest way to verify if Airbyte is able to connect to your Redshift cluster is via the check connection tool in the UI. You can check the AWS Redshift documentation for a tutorial on how to properly configure your cluster's access [here](https://docs.aws.amazon.com/redshift/latest/gsg/rs-gsg-authorize-cluster-access.html).
|
||||
|
||||
### 2. Fill up connection info
|
||||
|
||||
Next, provide the necessary information on how to connect to your cluster, such as the `host`, which is part of the connection string or Endpoint accessible [here](https://docs.aws.amazon.com/redshift/latest/gsg/rs-gsg-connect-to-cluster.html#rs-gsg-how-to-get-connection-string) without the `port` and `database` name \(it typically includes the cluster-id and region, and ends with `.redshift.amazonaws.com`\).
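For illustration (all values below are placeholders), a typical endpoint breaks down into the connection fields like this:

```bash
# Illustrative Redshift endpoint and how it maps to the connection fields:
#   examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev
#   Host     = examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com
#   Port     = 5439
#   Database = dev
```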
|
||||
|
||||
You should have all the requirements needed to configure Redshift as a destination in the UI. You'll need the following information to configure the destination:
|
||||
|
||||
* **Host**
|
||||
* **Port**
|
||||
* **Username**
|
||||
* **Password**
|
||||
* **Schema**
|
||||
* **Database**
|
||||
* This database needs to exist within the cluster provided.
|
||||
|
||||
### 2a. Fill up S3 info \(for COPY strategy\)
|
||||
|
||||
Provide the required S3 info.
|
||||
|
||||
* **S3 Bucket Name**
|
||||
* See [this](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) to create an S3 bucket.
|
||||
* **S3 Bucket Region**
|
||||
* Place the S3 bucket and the Redshift cluster in the same region to save on networking costs.
|
||||
* **Access Key Id**
|
||||
* See [this](https://docs.aws.amazon.com/general/latest/gr/aws-sec-cred-types.html#access-keys-and-secret-access-keys) on how to generate an access key.
|
||||
* We recommend creating an Airbyte-specific user. This user will require [read and write permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_examples_s3_rw-bucket.html) to objects in the staging bucket.
|
||||
* **Secret Access Key**
|
||||
* Corresponding key to the above key id.
|
||||
* **Part Size**
|
||||
* Affects the size limit of an individual Redshift table. Optional. Increase this if syncing tables larger than 100GB. Files are streamed to S3 in parts. This determines the size of each part, in MBs. As S3 has a limit of 10,000 parts per file, part size affects the table size. This is 10MB by default, resulting in a default table limit of 100GB. Note, a larger part size will result in larger memory requirements. A rule of thumb is to multiply the part size by 10 to get the memory requirement. Modify this with care.
|
||||
|
||||
Optional parameters:
|
||||
* **Bucket Path**
|
||||
* The directory within the S3 bucket to place the staging data. For example, if you set this to `yourFavoriteSubdirectory`, staging data will be placed inside `s3://yourBucket/yourFavoriteSubdirectory`. If not provided, defaults to the root directory.
|
||||
|
||||
## Notes about Redshift Naming Conventions
|
||||
|
||||
From [Redshift Names & Identifiers](https://docs.aws.amazon.com/redshift/latest/dg/r_names.html):
|
||||
|
||||
### Standard Identifiers
|
||||
|
||||
* Begin with an ASCII single-byte alphabetic character or underscore character, or a UTF-8 multibyte character two to four bytes long.
|
||||
* Subsequent characters can be ASCII single-byte alphanumeric characters, underscores, or dollar signs, or UTF-8 multibyte characters two to four bytes long.
|
||||
* Be between 1 and 127 bytes in length, not including quotation marks for delimited identifiers.
|
||||
* Contain no quotation marks and no spaces.
|
||||
|
||||
### Delimited Identifiers
|
||||
|
||||
Delimited identifiers \(also known as quoted identifiers\) begin and end with double quotation marks \("\). If you use a delimited identifier, you must use the double quotation marks for every reference to that object. The identifier can contain any standard UTF-8 printable characters other than the double quotation mark itself. Therefore, you can create column or table names that include otherwise illegal characters, such as spaces or the percent symbol. ASCII letters in delimited identifiers are case-insensitive and are folded to lowercase. To use a double quotation mark in a string, you must precede it with another double quotation mark character.
|
||||
|
||||
Therefore, the Airbyte Redshift destination will create tables and schemas using unquoted identifiers when possible, or fall back to quoted identifiers if the names contain special characters.
|
||||
|
||||
## Data Size Limitations
|
||||
|
||||
Redshift specifies a maximum limit of 65535 bytes to store the raw JSON record data. Thus, when a row is too big to fit, the Redshift destination fails to load such data and currently ignores that record.
|
||||
|
||||
For more information, see the [docs here.](https://docs.aws.amazon.com/redshift/latest/dg/r_Character_types.html)
|
||||
@@ -1,12 +0,0 @@
|
||||
## Getting Started: Source GitHub
|
||||
|
||||
### Requirements
|
||||
|
||||
* Github Account
|
||||
* Github Personal Access Token with the necessary permissions \(described below\)
|
||||
|
||||
### Setup guide
|
||||
|
||||
Log into Github and then generate a [personal access token](https://github.com/settings/tokens).
|
||||
|
||||
Your token should have at least the `repo` scope. Depending on which streams you want to sync, the user generating the token needs more permissions:
|
||||
@@ -1,42 +0,0 @@
|
||||
# Getting Started: Source Google Ads
|
||||
|
||||
## Requirements
|
||||
|
||||
Google Ads Account with an approved Developer Token \(note: In order to get API access to Google Ads, you must have a "manager" account. This must be created separately from your standard account. You can find more information about this distinction in the [google ads docs](https://ads.google.com/home/tools/manager-accounts/).\)
|
||||
|
||||
* developer_token
|
||||
* client_id
|
||||
* client_secret
|
||||
* refresh_token
|
||||
* start_date
|
||||
* customer_id
|
||||
|
||||
## Setup guide
|
||||
|
||||
This guide will provide information as if starting from scratch. Please skip over any steps you have already completed.
|
||||
|
||||
* Create a Google Ads Account. Here are [Google's instructions](https://support.google.com/google-ads/answer/6366720) on how to create one.
|
||||
* Create a Google Ads MANAGER Account. Here are [Google's instructions](https://ads.google.com/home/tools/manager-accounts/) on how to create one.
|
||||
* You should now have two Google Ads accounts: a normal account and a manager account. Link the Manager account to the normal account following [Google's documentation](https://support.google.com/google-ads/answer/7459601).
|
||||
* Apply for a developer token \(**make sure you follow our** [**instructions**](#how-to-apply-for-the-developer-token)\) on your Manager account. This token allows you to access your data from the Google Ads API. Here are [Google's instructions](https://developers.google.com/google-ads/api/docs/first-call/dev-token). The docs are a little unclear on this point, but you will _not_ be able to access your data via the Google Ads API until this token is approved. You cannot use a test developer token, it has to be at least a basic developer token. It usually takes Google 24 hours to respond to these applications. This developer token is the value you will use in the `developer_token` field.
|
||||
* Fetch your `client_id`, `client_secret`, and `refresh_token`. Google provides [instructions](https://developers.google.com/google-ads/api/docs/first-call/overview) on how to do this.
|
||||
* Select your `customer_id`. The `customer_id` refers to the ID of each of your Google Ads accounts. This is the 10-digit number in the top corner of the page when you are in the Google Ads UI. The source will only pull data from the accounts for which you provide an ID. If you are having trouble finding it, check out [Google's instructions](https://support.google.com/google-ads/answer/1704344).
|
||||
|
||||
Wow! That was a lot of steps. We are working on making the OAuth flow for all of our connectors simpler \(allowing you to skip needing to get a `developer_token` and a `refresh_token` which are the most painful / time-consuming steps in this walkthrough\).
|
||||
|
||||
## How to apply for the developer token
|
||||
|
||||
Google is very picky about which software and which use case can get access to a developer token. The Airbyte team has worked with the Google Ads team to whitelist Airbyte and make sure you can get one \(see [issue 1981](https://github.com/airbytehq/airbyte/issues/1981) for more information\).
|
||||
|
||||
When you apply for a token, you need to mention:
|
||||
|
||||
* Why you need the token \(eg: want to run some internal analytics...\)
|
||||
* That you will be using the Airbyte Open Source project
|
||||
* That you have full access to the code base \(because we're open source\)
|
||||
* That you have full access to the server running the code \(because you're self-hosting Airbyte\)
|
||||
|
||||
If for any reason the request gets denied, let us know and we will be able to unblock you.
|
||||
|
||||
## Understanding Google Ads Query Language
|
||||
|
||||
The Google Ads Query Language can query the Google Ads API. Check out [Google Ads Query Language](https://developers.google.com/google-ads/api/docs/query/overview)
|
||||
@@ -1,3 +1,7 @@
|
||||
---
|
||||
displayed_sidebar: docs
|
||||
---
|
||||
|
||||
# Windows - Browsing Local File Output
|
||||
|
||||
## Overview
|
||||
@@ -1,14 +0,0 @@
|
||||
# Missing an Integration?
|
||||
|
||||
If you'd like to ask for a new connector, or build a new connector and make it part of the pool of pre-built connectors on Airbyte, first a big thank you. We invite you to check our [contributing guide](../contributing-to-airbyte/).
|
||||
|
||||
If you'd like to build new connectors, or update existing ones, for your own usage, without contributing to the Airbyte codebase, read along.
|
||||
|
||||
## Developing your own connectors
|
||||
|
||||
It's easy to code your own integrations on Airbyte. Here are some links with instructions on how to code new sources and destinations.
|
||||
|
||||
* [Building new connectors](../contributing-to-airbyte/README.md)
|
||||
|
||||
While the guides above are specific to the languages used most frequently to write integrations, **Airbyte integrations can be written in any language**. Please reach out to us if you'd like help developing integrations in other languages.
|
||||
|
||||
@@ -36,7 +36,7 @@ Available filters and metrics are provided in this [page](https://developers.goo
|
||||
3. Fill out a start date and, optionally, an end date and filters (check the [Queries documentation](https://developers.google.com/bid-manager/v1.1/queries)).
|
||||
4. You're done.
|
||||
|
||||
## Getting Started \(Airbyte Open-Source\)
|
||||
## Getting Started \(Airbyte Open Source\)
|
||||
|
||||
#### Requirements
|
||||
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
## Overview
|
||||
|
||||
This is a mock source for testing the Airbyte pipeline. It can generate arbitrary data streams. It is a subset of what is in [End-to-End Testing Source](e2e-test.md) in Open-Source to avoid Airbyte Cloud users accidentally incurring a huge bill.
|
||||
This is a mock source for testing the Airbyte pipeline. It can generate arbitrary data streams. It is a subset of what is in [End-to-End Testing Source](e2e-test.md) in Open Source to avoid Airbyte Cloud users accidentally incurring a huge bill.
|
||||
|
||||
## Mode
|
||||
|
||||
|
||||
@@ -104,7 +104,7 @@ The Google Analytics (Universal Analytics) source connector can sync the followi
|
||||
|
||||
Reach out to us on Slack or [create an issue](https://github.com/airbytehq/airbyte/issues) if you need to send custom Google Analytics report data with Airbyte.
|
||||
|
||||
## Rate Limits and Performance Considerations \(Airbyte Open-Source\)
|
||||
## Rate Limits and Performance Considerations \(Airbyte Open Source\)
|
||||
|
||||
[Analytics Reporting API v4](https://developers.google.com/analytics/devguides/reporting/core/v4/limits-quotas)
|
||||
|
||||
|
||||
@@ -40,7 +40,7 @@ This connector attempts to back off gracefully when it hits Directory API's rate
|
||||
1. Click `OAuth2.0 authorization` then `Authenticate your Google Directory account`.
|
||||
2. You're done.
|
||||
|
||||
## Getting Started \(Airbyte Open-Source\)
|
||||
## Getting Started \(Airbyte Open Source\)
|
||||
|
||||
Google APIs use the OAuth 2.0 protocol for authentication and authorization. This connector supports [Web server application](https://developers.google.com/identity/protocols/oauth2#webserver) and [Service accounts](https://developers.google.com/identity/protocols/oauth2#serviceaccount) scenarios. Therefore, there are 2 options of setting up authorization for this source:
|
||||
|
||||
|
||||
@@ -25,7 +25,7 @@ Note: Currently hierarchyid and sql_variant are not processed in CDC migration t
|
||||
|
||||
On Airbyte Cloud, only TLS connections to your MSSQL instance are supported in source configuration. Other than that, you can proceed with the open-source instructions below.
|
||||
|
||||
## Getting Started \(Airbyte Open-Source\)
|
||||
## Getting Started \(Airbyte Open Source\)
|
||||
|
||||
#### Requirements
|
||||
|
||||
|
||||
@@ -24,7 +24,7 @@ This source allows you to synchronize the following data tables:
|
||||
**Requirements**
|
||||
In order to use the My Hours API you need to provide the credentials to an admin My Hours account.
|
||||
|
||||
### Performance Considerations (Airbyte Open-Source)
|
||||
### Performance Considerations (Airbyte Open Source)
|
||||
|
||||
Depending on the number of team members and time logs, the source provides a property to change the pagination size for the time logs query. Typically, a pagination of 30 days is a good balance between reliability and speed, but if you have a large number of monthly entries, you might want to lower this value.
|
||||
|
||||
|
||||
@@ -91,7 +91,7 @@ To fill out the required information:
|
||||
#### Step 4: (Airbyte Cloud Only) Allow inbound traffic from Airbyte IPs.
|
||||
|
||||
If you are on Airbyte Cloud, you will always need to modify your database configuration to allow inbound traffic from Airbyte IPs. You can find a list of all IPs that need to be allowlisted in
|
||||
our [Airbyte Security docs](../../../operator-guides/security#network-security-1).
|
||||
our [Airbyte Security docs](../../operating-airbyte/security#network-security-1).
|
||||
|
||||
Now, click `Set up source` in the Airbyte UI. Airbyte will now test connecting to your database. Once this succeeds, you've configured an Airbyte MySQL source!
|
||||
<!-- /env:cloud -->
|
||||
|
||||
@@ -20,7 +20,7 @@ The Oracle source does not alter the schema present in your database. Depending
|
||||
|
||||
On Airbyte Cloud, only TLS connections to your Oracle instance are supported. Other than that, you can proceed with the open-source instructions below.
|
||||
|
||||
## Getting Started \(Airbyte Open-Source\)
|
||||
## Getting Started \(Airbyte Open Source\)
|
||||
|
||||
#### Requirements
|
||||
|
||||
|
||||
@@ -4,7 +4,7 @@
|
||||
|
||||
The PokéAPI is primarily used as a tutorial and educational resource, as it requires zero dependencies. Learn how Airbyte and this connector work with these tutorials:
|
||||
|
||||
- [Airbyte Quickstart: An Introduction to Deploying and Syncing](../../quickstart/deploy-airbyte.md)
|
||||
- [Airbyte Quickstart: An Introduction to Deploying and Syncing](../../using-airbyte/getting-started/readme.md)
|
||||
- [Airbyte CDK Speedrun: A Quick Primer on Building Source Connectors](../../connector-development/tutorials/cdk-speedrun.md)
|
||||
- [How to Build ETL Sources in Under 30 Minutes: A Video Tutorial](https://www.youtube.com/watch?v=kJ3hLoNfz_E&t=13s&ab_channel=Airbyte)
|
||||
|
||||
@@ -24,7 +24,7 @@ This source uses the fully open [PokéAPI](https://pokeapi.co/docs/v2#info) to s
|
||||
|
||||
Currently, only one output stream is available from this source, which is the Pokémon output stream. This schema is defined [here](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-pokeapi/source_pokeapi/schemas/pokemon.json).
|
||||
|
||||
## Rate Limiting & Performance Considerations \(Airbyte Open-Source\)
|
||||
## Rate Limiting & Performance Considerations \(Airbyte Open Source\)
|
||||
|
||||
According to the API's [fair use policy](https://pokeapi.co/docs/v2#fairuse), please make sure to cache resources retrieved from the PokéAPI wherever possible. That said, the PokéAPI does not perform rate limiting.
|
||||
|
||||
|
||||
@@ -54,7 +54,7 @@ To fill out the required information:
|
||||
#### Step 3: (Airbyte Cloud Only) Allow inbound traffic from Airbyte IPs.
|
||||
|
||||
If you are on Airbyte Cloud, you will always need to modify your database configuration to allow inbound traffic from Airbyte IPs. You can find a list of all IPs that need to be allowlisted in
|
||||
our [Airbyte Security docs](../../../operator-guides/security#network-security-1).
|
||||
our [Airbyte Security docs](../../operating-airbyte/security#network-security-1).
|
||||
|
||||
Now, click `Set up source` in the Airbyte UI. Airbyte will now test connecting to your database. Once this succeeds, you've configured an Airbyte Postgres source!
|
||||
<!-- /env:cloud -->
|
||||
|
||||
@@ -58,7 +58,7 @@ If you are on Airbyte Cloud, you will always need to modify your database config
|
||||
|
||||

|
||||
|
||||
2. Add a new network, and enter Airbyte's IPs, which you can find in our [Airbyte Security documentation](../../../operator-guides/security#network-security-1).
|
||||
2. Add a new network, and enter Airbyte's IPs, which you can find in our [Airbyte Security documentation](../../../operating-airbyte/security#network-security-1).
|
||||
|
||||
Now, click `Set up source` in the Airbyte UI. Airbyte will now test connecting to your database. Once this succeeds, you've configured an Airbyte Postgres source!
|
||||
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
# Airbyte Security
|
||||
# Security
|
||||
|
||||
Airbyte is committed to keeping your data safe by following industry-standard practices for securing physical deployments, setting access policies, and leveraging the security features of leading Cloud providers.
|
||||
|
||||
@@ -142,7 +142,7 @@ Airbyte Cloud allows you to log in to the platform using your email and password
|
||||
|
||||
### Access Control
|
||||
|
||||
Airbyte Cloud supports [user management](https://docs.airbyte.com/cloud/managing-airbyte-cloud/manage-airbyte-cloud-workspace#add-users-to-your-workspace) but doesn’t support role-based access control (RBAC) yet.
|
||||
Airbyte Cloud supports [user management](/using-airbyte/workspaces.md#add-users-to-your-workspace) but doesn’t support role-based access control (RBAC) yet.
|
||||
|
||||
### Compliance
|
||||
|
||||
@@ -1,29 +1,49 @@
|
||||
# Browsing Output Logs
|
||||
# Browsing Logs
|
||||
|
||||
## Overview
|
||||
|
||||
This tutorial will describe how to explore Airbyte Workspace folders.
|
||||
Airbyte records the full logs as a part of each sync. These logs can be used to understand the underlying operations Airbyte performs to read data from the source and write to the destination as a part of the [Airbyte Protocol](/understanding-airbyte/airbyte-protocol.md). The logs include many details, including any errors, which can be helpful when troubleshooting sync errors.
|
||||
|
||||
This is useful if you need to browse the docker volumes where extra output files of Airbyte server and workers are stored since they may not be accessible through the UI.
|
||||
:::info
|
||||
When using Airbyte Open Source, you can also access additional logs outside of the UI. This is useful if you need to browse the Docker volumes where extra output files of Airbyte server and workers are stored.
|
||||
:::
|
||||
|
||||
## Exploring the Logs folders
|
||||
To find the logs for a connection, navigate to a connection's `Job History` tab to see the latest syncs.
|
||||
|
||||
When running a Sync in Airbyte, you have the option to look at the logs in the UI as shown next.
|
||||
## View the logs in the UI
|
||||
To open the logs in the UI, select the three grey dots next to a sync and select `View logs`. This will open our full screen in-app log viewer.
|
||||
|
||||
### Identifying Workspace IDs
|
||||
:::tip
|
||||
If you are troubleshooting a sync error, you can search for `Error`, `Exception`, or `Fail` to find common errors.
|
||||
:::
|
||||
|
||||
In the screenshot below, you can notice the highlighted blue boxes are showing the id numbers that were used for the selected "Attempt" for this sync job.
|
||||
The in-app log viewer will only search for instances of the search term within that attempt. To search across all attempts, download the logs locally.
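If you prefer the command line, a minimal sketch of the same search against a downloaded log file (the file name is illustrative) is:

```bash
# Search a downloaded log file for common error markers (case-insensitive, with line numbers)
grep -inE "error|exception|fail" logs.log
```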
|
||||
|
||||
In this case, the job was running in `/tmp/workspace/9/2/` folder since the tab of the third attempt is being selected in the UI \(first attempt would be `/tmp/workspace/9/0/`\).
|
||||
## Link to a sync job
|
||||
To help others quickly find your job, select the three grey dots next to a sync and select `Copy link to job` to copy a link to the logs to your clipboard.
|
||||
|
||||

|
||||
You can also access the link to a sync job from the in-app log viewer.
|
||||
|
||||
The highlighted button in the red circle on the right would allow you to download the logs.log file.
|
||||
However, there are actually more files being recorded in the same workspace folder... Thus, we might want to dive deeper to explore these folders and gain a better understanding of what is being run by Airbyte.
|
||||
## Download the logs
|
||||
To download a copy of the logs locally, select the three grey dots next to a sync and select `Download logs`.
|
||||
|
||||
You can also access the download log button from the in-app log viewer.
|
||||
|
||||
:::note
|
||||
If a sync was completed across multiple attempts, downloading the logs will union all the logs for all attempts for that job.
|
||||
:::
|
||||
|
||||
## Exploring Local Logs
|
||||
|
||||
<AppliesTo oss />
|
||||
|
||||
### Establish the folder directory
|
||||
|
||||
In the UI, you can discover the Attempt ID within the sync job. Most jobs will complete in the first attempt, so your folder directory will look like `/tmp/workspace/9/0`. If your sync job completes in multiple attempts, you'll need to determine which attempt you're interested in and note it. For example, for the third attempt, it will look like `/tmp/workspace/9/2/`.
|
||||
|
||||
### Understanding the Docker run commands
|
||||
|
||||
Scrolling down a bit more, we can also read the different docker commands being used internally are starting with:
|
||||
We can also see that the different Docker commands used internally start with:
|
||||
|
||||
```text
|
||||
docker run --rm -i -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -w /data/9/2 --network host ...
|
||||
@@ -35,7 +55,7 @@ Following [Docker Volume documentation](https://docs.docker.com/storage/volumes/
|
||||
|
||||
### Opening a Unix shell prompt to browse the Docker volume
|
||||
|
||||
For example, we can run any docker container/image to browse the content of this named volume by mounting it similarly, let's use the [busybox](https://hub.docker.com/_/busybox) image.
|
||||
For example, we can run any docker container/image to browse the content of this named volume by mounting it similarly. In the example below, the [busybox](https://hub.docker.com/_/busybox) image is used.
|
||||
|
||||
```text
|
||||
docker run -it --rm --volume airbyte_workspace:/data busybox
|
||||
@@ -50,13 +70,15 @@ ls /data/9/2/
|
||||
Example Output:
|
||||
|
||||
```text
|
||||
catalog.json normalize tap_config.json
|
||||
logs.log singer_rendered_catalog.json target_config.json
|
||||
catalog.json
|
||||
tap_config.json
|
||||
logs.log
|
||||
target_config.json
|
||||
```
|
||||
|
||||
### Browsing from the host shell
|
||||
|
||||
Or, if you don't want to transfer to a shell prompt inside the docker image, you can simply run Shell commands using docker commands as a proxy like this:
|
||||
Or, if you don't want to transfer to a shell prompt inside the docker image, you can run Shell commands using docker commands as a proxy:
|
||||
|
||||
```bash
|
||||
docker run -it --rm --volume airbyte_workspace:/data busybox ls /data/9/2
|
||||
@@ -81,7 +103,7 @@ docker run -it --rm --volume airbyte_workspace:/data busybox cat /data/9/2/catal
|
||||
Example Output:
|
||||
|
||||
```text
|
||||
{"streams":[{"stream":{"name":"exchange_rate","json_schema":{"type":"object","properties":{"CHF":{"type":"number"},"HRK":{"type":"number"},"date":{"type":"string"},"MXN":{"type":"number"},"ZAR":{"type":"number"},"INR":{"type":"number"},"CNY":{"type":"number"},"THB":{"type":"number"},"AUD":{"type":"number"},"ILS":{"type":"number"},"KRW":{"type":"number"},"JPY":{"type":"number"},"PLN":{"type":"number"},"GBP":{"type":"number"},"IDR":{"type":"number"},"HUF":{"type":"number"},"PHP":{"type":"number"},"TRY":{"type":"number"},"RUB":{"type":"number"},"HKD":{"type":"number"},"ISK":{"type":"number"},"EUR":{"type":"number"},"DKK":{"type":"number"},"CAD":{"type":"number"},"MYR":{"type":"number"},"USD":{"type":"number"},"BGN":{"type":"number"},"NOK":{"type":"number"},"RON":{"type":"number"},"SGD":{"type":"number"},"CZK":{"type":"number"},"SEK":{"type":"number"},"NZD":{"type":"number"},"BRL":{"type":"number"}}},"supported_sync_modes":["full_refresh"],"default_cursor_field":[]},"sync_mode":"full_refresh","cursor_field":[]}]}
|
||||
{"streams":[{"stream":{"name":"exchange_rate","json_schema":{"type":"object","properties":{"CHF":{"type":"number"},"HRK":{"type":"number"},"date":{"type":"string"},"MXN":{"type":"number"},"ZAR":{"type":"number"},"INR":{"type":"number"},"CNY":{"type":"number"},"THB":{"type":"number"},"NZD":{"type":"number"},"BRL":{"type":"number"}}},"supported_sync_modes":["full_refresh"],"default_cursor_field":[]},"sync_mode":"full_refresh","cursor_field":[]}]}
|
||||
```
|
||||
|
||||
### Extract catalog.json file from docker volume
|
||||
|
||||
@@ -1,55 +0,0 @@
|
||||
# Configuring Sync Notifications
|
||||
|
||||
## Overview
|
||||
|
||||
You can set up Airbyte to notify you when syncs have **failed** or **succeeded**. This is achieved through a webhook, a URL that you can input into other applications to get real time data from Airbyte.
|
||||
|
||||
## Set up Slack Notifications on Sync Status
|
||||
|
||||
If you're more of a visual learner, just head over to [this video](https://www.youtube.com/watch?v=NjYm8F-KiFc&ab_channel=Airbyte) to learn how to do this. Otherwise, keep reading!
|
||||
|
||||
**Set up the bot.**
|
||||
|
||||
Navigate to https://api.slack.com/apps/. Hit `Create an App`.
|
||||
|
||||

|
||||
|
||||
Then click `From scratch`. Enter your App Name (e.g. Airbyte Sync Notifications) and pick your desired Slack workspace.
|
||||
|
||||
**Set up the webhook URL.**
|
||||
|
||||
Now on the left sidebar, click on `Incoming Webhooks`.
|
||||
|
||||

|
||||
|
||||
Click the slider button in the top right to turn the feature on. Then click `Add New Webhook to Workspace`.
|
||||
|
||||

|
||||
|
||||
Pick the channel that you want to receive Airbyte notifications in (ideally a dedicated one), and click `Allow` to give it permissions to access the channel. You should see the bot show up in the selected channel now.
|
||||
|
||||
Now you should see an active webhook right above the `Add New Webhook to Workspace` button.
|
||||
|
||||

|
||||
|
||||
Click `Copy.`
|
||||
|
||||
**Add the webhook to Airbyte.**
|
||||
|
||||
Assuming you have a [running instance of Airbyte](../deploying-airbyte/README.md), we can navigate to the UI. Click on Settings and then click on `Notifications`.
|
||||
|
||||

|
||||
|
||||
Simply paste the copied webhook URL in `Connection status Webhook URL` and you're ready to go! On this page, you can click one or both of the sliders to decide whether you want notifications on sync successes, failures, or both. Make sure to click `Save changes` before you leave.
|
||||
|
||||
Your Webhook URL should look something like this:
|
||||
|
||||

|
||||
|
||||
**Test it out.**
|
||||
|
||||
From the settings page, you can click `Test` to send a test message to the channel. Or, just run a sync now and try it out! If all goes well, you should receive a notification in your selected channel that looks like this:
|
||||
|
||||

|
||||
|
||||
You're done!
|
||||
@@ -1,20 +1,25 @@
|
||||
# Resetting Your Data
|
||||
|
||||
The reset button gives you a blank slate, of sorts, to perform a fresh new sync. This can be useful if you are just testing Airbyte or don't necessarily require the data replicated to your destination to be saved permanently.
|
||||
Resetting your data allows you to drop all previously synced data so that any ensuing sync can start syncing fresh. This is useful if you don't require the data replicated to your destination to be saved permanently or are just testing Airbyte.
|
||||
|
||||

|
||||
Airbyte allows you to reset all streams in the connection, some of them, or only a single stream (when the connector supports per-stream operations).
|
||||
|
||||
As outlined above, you can click on the `Reset your data` button to give you that clean slate. Just as a heads up, here is what it does and doesn't do:
|
||||
A sync will automatically start after a completed reset, which commonly backfills all historical data.
|
||||
|
||||
The reset button **DOES**:
|
||||
## Performing a Reset
|
||||
To perform a reset, select `Reset your data` in the UI on a connection's status or job history tabs. You will also be prompted to reset affected streams if you edit any stream settings to ensure data continues to sync accurately.
|
||||
|
||||
* Delete all records in your destination tables
|
||||
* Delete all records in your destination file
|
||||
Similarly to a sync job, a reset can be completed as successful, failed, or cancelled. To resolve a failed reset, you should manually drop the tables in the destination so that Airbyte can continue syncing accurately into the destination.
|
||||
|
||||
The reset button **DOES NOT**:
|
||||
## Reset behavior
|
||||
When a reset is successfully completed, all the records are deleted from your destination tables (and files, if using local JSON or local CSV as the destination).
|
||||
|
||||
* Delete the destination tables
|
||||
* Delete a destination file if using the LocalCSV or LocalJSON Destinations
|
||||
:::info
|
||||
If you are using destinations that are on the [Destinations v2](/release_notes/upgrading_to_destinations_v2.md) framework, only raw tables will be cleared of their data. Final tables will retain all records from the last sync.
|
||||
:::
|
||||
|
||||
Because of this, if you have any orphaned tables or files that are no longer being synced to, they will have to be cleaned up later, as Airbyte will not clean them up for you.
|
||||
A reset **DOES NOT** delete any destination tables when using a data warehouse, data lake, or database. The schema is retained but will not contain any rows.
|
||||
|
||||
:::tip
|
||||
If you have any orphaned tables or files that are no longer being synced to, they should be cleaned up separately, as Airbyte will not clean them up for you. This can occur when the `Destination Namespace` or `Stream Prefix` connection configuration is changed for an existing connection.
|
||||
:::
|
||||
|
||||
@@ -18,7 +18,7 @@ After replication of data from a source connector \(Extract\) to a destination c
|
||||
|
||||
## Public Git repository
|
||||
|
||||
In the connection settings page, I can add new Transformations steps to apply after [normalization](../../understanding-airbyte/basic-normalization.md). For example, I want to run my custom dbt project jaffle_shop, whenever my sync is done replicating and normalizing my data.
|
||||
In the connection settings page, I can add new Transformations steps to apply after [normalization](../../using-airbyte/core-concepts/basic-normalization.md). For example, I want to run my custom dbt project jaffle_shop, whenever my sync is done replicating and normalizing my data.
|
||||
|
||||
You can find the jaffle shop test repository by clicking [here](https://github.com/dbt-labs/jaffle_shop).
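If you want to try it yourself, you can clone that example project locally:

```bash
# Clone the example dbt project referenced above
git clone https://github.com/dbt-labs/jaffle_shop.git
```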
|
||||
|
||||
|
||||
@@ -16,7 +16,7 @@ At its core, Airbyte is geared to handle the EL \(Extract Load\) steps of an ELT
|
||||
|
||||
However, this is actually producing a table in the destination with a JSON blob column... For the typical analytics use case, you probably want this json blob normalized so that each field is its own column.
|
||||
|
||||
So, after EL, comes the T \(transformation\) and the first T step that Airbyte actually applies on top of the extracted data is called "Normalization". You can find more information about it [here](../../understanding-airbyte/basic-normalization.md).
|
||||
So, after EL, comes the T \(transformation\) and the first T step that Airbyte actually applies on top of the extracted data is called "Normalization". You can find more information about it [here](../../using-airbyte/core-concepts/basic-normalization.md).
|
||||
|
||||
Airbyte runs this step before handing the final data over to other tools that will manage further transformation down the line.
|
||||
|
||||
|
||||
@@ -1,5 +1,12 @@
|
||||
# Upgrading Airbyte
|
||||
|
||||
:::info
|
||||
|
||||
If you run on [Airbyte Cloud](https://cloud.airbyte.com/signup) you'll always run on the newest
|
||||
Airbyte version automatically. This documentation only applies to users deploying our self-managed
|
||||
version.
|
||||
:::
|
||||
|
||||
## Overview
|
||||
|
||||
This tutorial will describe how to determine if you need to run this upgrade process, and if you do, how to do so. This process does require temporarily turning off Airbyte.
|
||||
|
||||
@@ -1,15 +1,17 @@
|
||||
# Using custom connectors
|
||||
If our connector catalog does not fulfill your needs, you can build your own Airbyte connectors.
|
||||
There are two approaches you can take while jumping on connector development project:
|
||||
1. You want to build a connector for an **external** source or destination (public API, off-the-shelf DBMS, data warehouses, etc.). In this scenario, your connector development will probably benefit the community. The right way is to open a PR on our repo to add your connector to our catalog. You will then benefit from an Airbyte team review and potential future improvements and maintenance from the community.
|
||||
2. You want to build a connector for an **internal** source or destination (private API) specific to your organization. This connector has no good reason to be exposed to the community.
|
||||
|
||||
This guide focuses on the second approach and assumes the following:
|
||||
* You followed our other guides and tutorials about connector developments.
|
||||
* You finished your connector development, running it locally on an Airbyte development instance.
|
||||
:::info
|
||||
This guide walks through the setup of a Docker-based custom connector. To understand how to use our low-code connector builder, read our guide [here](/connector-development/connector-builder-ui/overview.md).
|
||||
:::
|
||||
|
||||
If our connector catalog does not fulfill your needs, you can build your own Airbyte connectors! You can either use our [low-code connector builder](/connector-development/connector-builder-ui/overview.md) or upload a Docker-based custom connector.
|
||||
|
||||
This page walks through the process to upload a **Docker-based custom connector**. This is an ideal route for connectors that have an **internal** use case like a private API with a specific fit for your organization. This guide for using Docker-based custom connectors assumes the following:
|
||||
* You followed our other guides and tutorials about [connector development](/connector-development/connector-builder-ui/overview.md)
|
||||
* You finished your connector development and have it running locally on an Airbyte development instance.
|
||||
* You want to deploy this connector to a production Airbyte instance running on a VM with docker-compose or on a Kubernetes cluster.
|
||||
|
||||
If you prefer video tutorials, [we recorded a demo about uploading connectors images to a GCP Artifact Registry](https://www.youtube.com/watch?v=4YF20PODv30&ab_channel=Airbyte).
|
||||
If you prefer video tutorials, we recorded a demo on how to upload [connector images to a GCP Artifact Registry](https://www.youtube.com/watch?v=4YF20PODv30&ab_channel=Airbyte).
|
||||
|
||||
## 1. Create a private Docker registry
|
||||
Airbyte needs to pull its Docker images from a remote Docker registry to consume a connector.
|
||||
@@ -70,42 +72,21 @@ If you want Airbyte to pull images from another private Docker registry, you wil
|
||||
|
||||
You should run all the above commands from your local/CI environment, where your connector source code is available.
|
||||
|
||||
## 4. Use your custom connector in Airbyte
|
||||
## 4. Use your custom Docker connector in Airbyte
|
||||
At this step, you should have:
|
||||
* A private Docker registry hosting your custom connector image.
|
||||
* Authenticated your Airbyte instance to your private Docker registry.
|
||||
|
||||
You can pull your connector image from your private registry to validate the previous steps. On your Airbyte instance: run `docker pull <image-name>:<tag>` if you are using our `docker-compose` deployment, or start a pod that is using the connector image.
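As a sketch of that validation step (the registry path, image name, and tag below are placeholders for your own image):

```bash
# Validate that the Airbyte host can pull the custom connector image
# (registry, repository, and tag are placeholders)
docker pull us-central1-docker.pkg.dev/my-project/airbyte-connectors/source-internal-api:0.1.0
```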
|
||||
|
||||
### 1. Click on Settings
|
||||

|
||||
1. Click on `Settings` in the left-hand sidebar. Navigate to `Sources` or `Destinations` depending on your connector. Click on `Add a new Docker connector`.
|
||||
|
||||
2. Name your custom connector in `Connector display name`. This is just the display name used for your workspace.
|
||||
|
||||
### 2. Click on Sources (or Destinations)
|
||||

|
||||
3. Fill in the `Docker full image name` and `Docker image tag`.
|
||||
|
||||
4. (Optional) Add a link to the connector's documentation in `Connector documentation URL`.
|
||||
You can fill this with any value if you do not have online documentation for your connector.
|
||||
This documentation will be linked on your connector's settings page.
|
||||
|
||||
### 3. Click on + New connector
|
||||

|
||||
|
||||
|
||||
### 4. Fill the name of your custom connector
|
||||

|
||||
|
||||
|
||||
### 5. Fill the Docker image name of your custom connector
|
||||

|
||||
|
||||
|
||||
### 6. Fill the Docker Tag of your custom connector image
|
||||

|
||||
|
||||
|
||||
### 7. Fill the URL to your connector documentation
|
||||
This is a required field at the moment, but you can fill with any value if you do not have online documentation for your connector.
|
||||
This documentation will be linked in the connector setting page.
|
||||

|
||||
|
||||
|
||||
### 8. Click on Add
|
||||

|
||||
5. `Add` the connector to save the configuration. You can now select your new connector when setting up a new connection!

@@ -1,2 +0,0 @@

# Project Overview

@@ -1,48 +0,0 @@

---
description: Our Community Code of Conduct
---

# Code of Conduct

## Our Pledge

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to make participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.

## Our Standards

Examples of behavior that contributes to creating a positive environment include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others’ private information, such as a physical or electronic address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a professional setting

## Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.

## Scope

This Code of Conduct applies within all project spaces, and it also applies when an individual is representing the project or its community in public spaces. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at [conduct@airbyte.io](mailto:conduct@airbyte.io). All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project’s leadership.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org/), version 1.4, available at [https://www.contributor-covenant.org/version/1/4/code-of-conduct.html](https://www.contributor-covenant.org/version/1/4/code-of-conduct.html)

@@ -1,39 +0,0 @@

# Connector Support Levels

The following table describes the support levels of Airbyte connectors.

|                                       | Certified                  | Custom                     | Community              |
| ------------------------------------- | -------------------------- | -------------------------- | ---------------------- |
| **Availability**                      | Available to all users     | Available to all users     | Available to all users |
| **Support: Cloud**                    | Supported*                 | Supported**                | No Support             |
| **Support: Powered by Airbyte**       | Supported*                 | Supported**                | No Support             |
| **Support: Self-Managed Enterprise**  | Supported*                 | Supported**                | No Support             |
| **Support: Community (OSS)**          | Slack Support only         | Slack Support only         | No Support             |
| **Who builds them?**                  | Either the community or the Airbyte team. | Anyone can build custom connectors. We recommend using our [Connector Builder](https://docs.airbyte.com/connector-development/connector-builder-ui/overview) or [Low-code CDK](https://docs.airbyte.com/connector-development/config-based/low-code-cdk-overview). | Typically they are built by the community. The Airbyte team may upgrade them to Certified at any time. |
| **Who maintains them?**               | The Airbyte team           | Users                      | Users                  |
| **Production Readiness**              | Guaranteed by Airbyte      | Not guaranteed             | Not guaranteed         |

\*For Certified connectors, Official Support SLAs are only available to customers with Premium Support included in their contract. Otherwise, please use our support portal and we will address your issues as soon as possible.

\*\*For Custom connectors, Official Support SLAs are only available to customers with Premium Support included in their contract. This support is provided with best efforts, and maintenance/upgrades are owned by the customer.

## Certified

A **Certified** connector is actively maintained and supported by the Airbyte team and maintains a high quality bar. It is production ready.

### What you should know about Certified connectors:

- Certified connectors are available to all users.
- These connectors have been tested and vetted in order to be certified and are production ready.
- Certified connectors should rarely introduce breaking changes, but in the event an upgrade is needed, users will be given an adequate upgrade window.

## Community

A **Community** connector is maintained by the Airbyte community until it becomes Certified. Airbyte has over 800 code contributors and 15,000 people in the Slack community to help. The Airbyte team is continually certifying Community connectors as usage grows. As these connectors are not maintained by Airbyte, we do not offer support SLAs around them, and we encourage caution when using them in production.

### What you should know about Community connectors:

- Community connectors are available to all users.
- Community connectors may be upgraded to Certified at any time, and we will notify users of these upgrades via our Slack Community and in our Connector Catalog.
- Community connectors might not be feature-complete (features planned for release are under development or not prioritized) and may include backward-incompatible/breaking API changes with no or short notice.
- Community connectors have no Support SLAs.