From cbec4fa798da0b455ecc7ff64d7d5704f7eae8e2 Mon Sep 17 00:00:00 2001 From: John Lafleur Date: Thu, 21 Jan 2021 14:29:18 +1100 Subject: [PATCH] Product FAQ in Docs (#1734) * GitBook: [docs_faq] 107 pages modified * Update docs/faq/data-loading.md Co-authored-by: Charles * Update docs/faq/differences-with.../singer-vs-airbyte.md Co-authored-by: Charles * Update docs/faq/security-and-data-audits.md Co-authored-by: Charles * Update docs/faq/transformation-and-schemas.md Co-authored-by: Charles * Update docs/faq/transformation-and-schemas.md Co-authored-by: Charles * Update docs/faq/differences-with.../pipelinewise-vs-airbyte.md Co-authored-by: Jared Rhizor * GitBook: [docs_faq] 106 pages modified * GitBook: [docs_faq] one page modified * Update docs/faq/differences-with.../meltano-vs-airbyte.md Co-authored-by: Michel Tricot * GitBook: [docs_faq] 6 pages modified Co-authored-by: John Lafleur Co-authored-by: Charles Co-authored-by: Jared Rhizor Co-authored-by: Michel Tricot --- docs/SUMMARY.md | 13 +++- docs/faq/README.md | 2 + docs/faq/data-loading.md | 64 +++++++++++++++++++ docs/faq/differences-with.../README.md | 2 + .../fivetran-vs-airbyte.md | 27 ++++++++ .../differences-with.../meltano-vs-airbyte.md | 23 +++++++ .../pipelinewise-vs-airbyte.md | 25 ++++++++ .../differences-with.../singer-vs-airbyte.md | 23 +++++++ .../stitchdata-vs-airbyte.md | 26 ++++++++ docs/faq/getting-started.md | 23 +++++++ docs/faq/security-and-data-audits.md | 14 ++++ docs/faq/technical-support.md | 24 +++++++ docs/faq/transformation-and-schemas.md | 16 +++++ 13 files changed, 281 insertions(+), 1 deletion(-) create mode 100644 docs/faq/README.md create mode 100644 docs/faq/data-loading.md create mode 100644 docs/faq/differences-with.../README.md create mode 100644 docs/faq/differences-with.../fivetran-vs-airbyte.md create mode 100644 docs/faq/differences-with.../meltano-vs-airbyte.md create mode 100644 docs/faq/differences-with.../pipelinewise-vs-airbyte.md create mode 100644 
docs/faq/differences-with.../singer-vs-airbyte.md create mode 100644 docs/faq/differences-with.../stitchdata-vs-airbyte.md create mode 100644 docs/faq/getting-started.md create mode 100644 docs/faq/security-and-data-audits.md create mode 100644 docs/faq/technical-support.md create mode 100644 docs/faq/transformation-and-schemas.md diff --git a/docs/SUMMARY.md b/docs/SUMMARY.md index dc0c5f733ec..b06c93da26c 100644 --- a/docs/SUMMARY.md +++ b/docs/SUMMARY.md @@ -85,7 +85,18 @@ * [Updating Documentation](contributing-to-airbyte/updating-documentation.md) * [Templates](contributing-to-airbyte/templates/README.md) * [Connector Doc Template](contributing-to-airbyte/templates/integration-documentation-template.md) -* [Technical Support](technical-support.md) +* [FAQ](faq/README.md) + * [Getting Started](faq/getting-started.md) + * [Data Loading](faq/data-loading.md) + * [Transformation and Schemas](faq/transformation-and-schemas.md) + * [Security & Data Audits](faq/security-and-data-audits.md) + * [Technical Support](faq/technical-support.md) + * [Differences with...](faq/differences-with.../README.md) + * [Fivetran vs Airbyte](faq/differences-with.../fivetran-vs-airbyte.md) + * [StitchData vs Airbyte](faq/differences-with.../stitchdata-vs-airbyte.md) + * [Singer vs Airbyte](faq/differences-with.../singer-vs-airbyte.md) + * [Pipelinewise vs Airbyte](faq/differences-with.../pipelinewise-vs-airbyte.md) + * [Meltano vs Airbyte](faq/differences-with.../meltano-vs-airbyte.md) * [Company Handbook](company-handbook/README.md) * [Story](company-handbook/future-milestones.md) * [Culture and Values](company-handbook/culture-and-values.md) diff --git a/docs/faq/README.md b/docs/faq/README.md new file mode 100644 index 00000000000..a39c1bc0d65 --- /dev/null +++ b/docs/faq/README.md @@ -0,0 +1,2 @@ +# FAQ + diff --git a/docs/faq/data-loading.md b/docs/faq/data-loading.md new file mode 100644 index 00000000000..5adfad421c2 --- /dev/null +++ b/docs/faq/data-loading.md @@ -0,0 +1,64 @@ 
+# Data Loading + +## **Why don’t I see any data in my destination yet?** + +It can take a while for Airbyte to load data into your destination. Some sources have restrictive API limits which constrain how much data we can sync in a given time. Large amounts of data in your source can also make the initial sync take longer. You can check your sync status on the connection detail page, which you can access through either the source or the destination detail page. + +## **What happens if a sync fails?** + +You won't lose data when a sync fails; however, no data will be added or updated in your destination. + +Airbyte will automatically attempt to replicate data 3 times. You can see and export the logs for those attempts on the connection detail page, which you can access through the Source or Destination detail page. + +In the future, you will be able to configure a notification \(email, Slack...\) when a sync fails, with an option to create a GitHub issue with the logs. We’re still working on it; the purpose is to help the community and the Airbyte team fix the issue as soon as possible, especially if it is a connector issue. + +Until we have this system in place, here is what you can do: + +* File a GitHub issue: go [here](https://github.com/airbytehq/airbyte/issues/new?assignees=&labels=type%2Fbug&template=bug-report.md&title=) and file an issue with the detailed logs copied into the issue’s description. The team will be notified about your issue and will keep it updated as progress is made. +* Fix the issue yourself: Airbyte is open source, so you don’t need to wait for anybody to fix your issue if it is important to you. To do so, just fork the [GitHub project](http://github.com/airbytehq/airbyte) and fix the piece of code that needs fixing. If you’re okay with contributing your fix to the community, you can submit a pull request. We will review it ASAP. +* Ask on Slack: don’t hesitate to ping the team on [Slack](https://slack.airbyte.io). 
+ +Once the issue is resolved, Airbyte resumes your sync from where it left off. + +We truly appreciate any contribution you make to help the community. Airbyte will become the open-source standard only if everybody participates. + +## **What happens to data in the pipeline if the destination gets disconnected? Could I lose data, or wind up with duplicate data when the pipeline is reconnected?** + +Airbyte is architected to prevent data loss or duplication. We will display a failure for the sync and re-attempt it at the next scheduled sync, according to the frequency you set. + +## **How frequently can Airbyte sync data?** + +You can schedule syncs to run as frequently as every five minutes and as infrequently as every 24 hours. + +## **Why wouldn’t I choose to load all of my data every five minutes?** + +While frequent data loads will give you more up-to-date data, there are a few reasons you wouldn’t want to load your data every five minutes, including: + +* Higher API usage may cause you to hit a limit that could impact other systems that rely on that API. +* Higher cost of loading data into your warehouse. +* More frequent delays, resulting in increased delay notification emails. For instance, if the data source generally takes several hours to update but you choose five-minute increments, you may receive a delay notification every sync. + +We generally recommend setting the incremental loads to every hour to help limit API calls. + +## **Is there a way to know the estimated time to completion for the first historic sync?** + +Unfortunately not yet. + +## **I see you support a lot of connectors – what about connectors Airbyte doesn’t support yet?** + +You can either: + +* Submit a [connector request](https://github.com/airbytehq/airbyte/issues/new?assignees=&labels=area%2Fintegration%2C+new-integration&template=new-integration-request.md&title=) on our GitHub project, and be notified once we or the community build a connector for it. 
+* Build a connector yourself by forking our [GitHub project](https://github.com/airbytehq/airbyte) and submitting a pull request. Here are the [instructions on how to build a connector](../contributing-to-airbyte/building-new-connector/). +* Ask on Slack: don’t hesitate to ping the team on [Slack](https://slack.airbyte.io). + +## **What kind of notifications do I get?** + +For the moment, the UI will only display one kind of notification: when a sync fails, we will display the failure at the source/destination level in the list of sources/destinations, and in the connection detail page along with the logs. + +However, there are other types of notifications we’re thinking about: + +* When a connector that you use is no longer up to date +* When one of your connections fails +* When core isn't up to date + diff --git a/docs/faq/differences-with.../README.md b/docs/faq/differences-with.../README.md new file mode 100644 index 00000000000..1ad1eb1ad51 --- /dev/null +++ b/docs/faq/differences-with.../README.md @@ -0,0 +1,2 @@ +# Differences with... + diff --git a/docs/faq/differences-with.../fivetran-vs-airbyte.md b/docs/faq/differences-with.../fivetran-vs-airbyte.md new file mode 100644 index 00000000000..4def4abf9ab --- /dev/null +++ b/docs/faq/differences-with.../fivetran-vs-airbyte.md @@ -0,0 +1,27 @@ +# Fivetran vs Airbyte + +We wrote an article, “[Open-source vs. Commercial Software: How to Solve the Data Integration Problem](https://airbyte.io/articles/data-engineering-thoughts/open-source-vs-commercial-software-how-to-better-solve-data-integration/),” in which we describe the pros and cons of Fivetran’s commercial approach and Airbyte’s open-source approach. Don’t hesitate to check it out for more detailed arguments. As a summary, here are the differences: + +![](https://airbyte.io/wp-content/uploads/2021/01/Airbyte-vs-Fivetran.png) + +### **Fivetran:** + +* **Limited high-quality connectors:** after 8 years in business, Fivetran supports 150 connectors. 
The more connectors, the more difficult it is for Fivetran to keep the same level of maintenance across all connectors. They will always have an ROI consideration when maintaining long-tail connectors. +* **Pricing indexed on usage:** Fivetran’s pricing is indexed on the number of active rows \(rows added or edited\) per month. Teams always need to keep that in mind and are not free to move data without thinking about cost, as the costs can grow fast. +* **Security and privacy compliance:** all companies are subject to privacy compliance laws, such as GDPR, CCPA, HIPAA, etc. Above a certain company size \(about 100 employees\), all external products need to go through a security compliance process that can take several months. +* **No moving data between internal databases:** Fivetran sits in the cloud, so if you have to replicate data from one internal database to another, it makes no sense to have the data move through them \(Fivetran\) for privacy and cost reasons. + +### **Airbyte:** + +* **Free, as open source, so no more pricing based on usage**: learn more about our [future business model](../../company-handbook/business-model.md) \(connectors will always remain open source\). + +* **Supporting 50+ connectors by the end of 2020** \(so in only 5 months of existence\). Our goal is to reach 300+ connectors by the end of 2021. +* **Building new connectors made trivial, in the language of your choice:** Airbyte makes it a lot easier to create your own connector, vs. building them yourself in-house \(with Airflow or other tools\). Scheduling, orchestration, and monitoring come out of the box with Airbyte. +* **Addressing the long tail of connectors:** with the help of the community, Airbyte aims to support thousands of connectors. +* **Adapt existing connectors to your needs:** you can adapt any existing connector to address your own unique edge case. 
+* **Using data integration in a workflow:** Airbyte’s API lets engineering teams add data integration jobs into their workflow seamlessly. +* **Integrates with your data stack and your needs:** Airflow, Kubernetes, DBT, etc. Its normalization is optional: you get a basic version that works out of the box, but you can also use DBT for more complex transformations. +* **Debugging autonomy:** if you experience any connector issue, you won’t need to wait for Fivetran’s customer support team to get back to you; you can often fix the issue quickly yourself. +* **No more security and privacy compliance hurdles, as Airbyte is self-hosted and open-sourced \(MIT\)**. Any team can directly address their integration needs. + +Your data stays in your cloud, and you keep full control over your data and the costs of your data transfers. + diff --git a/docs/faq/differences-with.../meltano-vs-airbyte.md b/docs/faq/differences-with.../meltano-vs-airbyte.md new file mode 100644 index 00000000000..2c7bd69044f --- /dev/null +++ b/docs/faq/differences-with.../meltano-vs-airbyte.md @@ -0,0 +1,23 @@ +# Meltano vs Airbyte + +## **Meltano:** + +Meltano is a GitLab side project. Since 2019, the team has iterated on several approaches. The latest positioning is an orchestrator dedicated to data integration, built by GitLab on top of Singer’s taps and targets. The project currently has only one maintainer. + +* **Only 19 connectors built on top of Singer, after more than a year**. This means that Meltano has the same limitations as Singer with regard to its data protocol. +* **CLI-first approach:** Meltano was primarily built with a command line interface in mind. In that sense, they seem to target engineers with a preference for that interface. Unfortunately, this makes it harder to embed in some workflows. +* **A new UI**: Meltano has recently built a new UI to try to appeal to a larger audience. 
+* **Integration with DBT for transformation:** Meltano offers some deep integration with [DBT](http://getdbt.com), and therefore lets data engineering teams handle transformation any way they want. +* **Integration with Airflow for orchestration:** You can either use Meltano alone for orchestration or with Airflow; Meltano works both ways. + +## **Airbyte:** + +In contrast, Airbyte is a company fully committed to the open-source MIT project and has a [business model](../../company-handbook/business-model.md) in mind around this project. Our [team](../../company-handbook/team.md) is made up of data integration experts who have collectively built more than 1,000 integrations at large scale. + +* Our ambition is to support **300+ connectors by the end of 2021.** We already supported about 50 connectors at the end of 2020, just 5 months after the project’s inception. +* Airbyte’s connectors are **usable out of the box through a UI and API,** with monitoring, scheduling and orchestration. Airbyte was built on the premise that a user, whatever their background, should be able to move data in 2 minutes. Data engineers might want to use raw data and their own transformation processes, or to use Airbyte’s API to include data integration in their workflows. On the other hand, analysts and data scientists might want to use normalized consolidated data in their database or data warehouses. Airbyte supports all these use cases. +* **One platform, one project with standards:** this helps consolidate development behind a single project, with a standardized data protocol that can benefit all teams and use cases. +* **Not limited by Singer’s data protocol:** In contrast to Meltano, Airbyte was not built on top of Singer, but its data protocol is compatible with Singer’s. This means Airbyte can go beyond Singer, but Meltano will remain limited. +* **Connectors can be built in the language of your choice,** as Airbyte runs them as Docker containers. 
+* **Airbyte integrates with your data stack and your needs:** Airflow, Kubernetes, DBT, etc. Its normalization is optional: you get a basic version that works out of the box, but you can also use DBT for more complex transformations. + diff --git a/docs/faq/differences-with.../pipelinewise-vs-airbyte.md b/docs/faq/differences-with.../pipelinewise-vs-airbyte.md new file mode 100644 index 00000000000..fbd62afa055 --- /dev/null +++ b/docs/faq/differences-with.../pipelinewise-vs-airbyte.md @@ -0,0 +1,25 @@ +# Pipelinewise vs Airbyte + +## **PipelineWise:** + +PipelineWise is an open-source project by Transferwise that was built with the primary goal of serving the company’s own needs. + +* **Supports 21 connectors,** and only adds new ones based on the needs of the parent company, Transferwise. +* **No business model attached to the project,** and no apparent interest from the company in growing the community. +* **As close to the original format as possible:** PipelineWise aims to reproduce the data from the source to an Analytics-Data-Store in as close to the original format as possible. Some minor load-time transformations are supported, but complex mapping and joins have to be done in the Analytics-Data-Store to extract meaning. +* **Managed Schema Changes:** When source data changes, PipelineWise detects the change and alters the schema in your Analytics-Data-Store automatically. +* **YAML-based configuration:** Data pipelines are defined as YAML files, ensuring that the entire configuration is kept under version control. +* **Lightweight:** No daemons or database setup are required. + +## **Airbyte:** + +In contrast, Airbyte is a company fully committed to the open-source MIT project and has a [business model in mind](https://docs.airbyte.io/company-handbook/company-handbook/business-model) around this project. 
+ +* Our ambition is to support **300+ connectors by the end of 2021.** We already supported about 50 connectors at the end of 2020, just 5 months after the project’s inception. +* Airbyte’s connectors are **usable out of the box through a UI and API,** with monitoring, scheduling and orchestration. Airbyte was built on the premise that a user, whatever their background, should be able to move data in 2 minutes. Data engineers might want to use raw data and their own transformation processes, or to use Airbyte’s API to include data integration in their workflows. On the other hand, analysts and data scientists might want to use normalized consolidated data in their database or data warehouses. Airbyte supports all these use cases. +* **One platform, one project with standards:** this helps consolidate development behind a single project, with a standardized data protocol that can benefit all teams and use cases. +* **Connectors can be built in the language of your choice,** as Airbyte runs them as Docker containers. +* **Airbyte integrates with your data stack and your needs:** Airflow, Kubernetes, DBT, etc. Its normalization is optional: you get a basic version that works out of the box, but you can also use DBT for more complex transformations. + +The data protocols for both projects are compatible with Singer’s. So it is easy to migrate a Singer tap or target onto Airbyte or PipelineWise. + diff --git a/docs/faq/differences-with.../singer-vs-airbyte.md b/docs/faq/differences-with.../singer-vs-airbyte.md new file mode 100644 index 00000000000..ab573817f5f --- /dev/null +++ b/docs/faq/differences-with.../singer-vs-airbyte.md @@ -0,0 +1,23 @@ +# Singer vs Airbyte + +We wrote an article about this topic: “[Airbyte vs. Singer: Why Airbyte is not built on top of Singer](https://airbyte.io/articles/data-engineering-thoughts/airbyte-vs-singer-why-airbyte-is-not-built-on-top-of-singer/).” As a summary, here are the differences. 
+ +## **Singer:** + +* **Supports 96 connectors after 4 years.** +* **Increasingly outdated connectors:** Talend \(acquirer of StitchData\) seems to have stopped investing in maintaining Singer’s community and connectors. As most connectors see schema changes several times a year, more and more of Singer’s taps and targets are not actively maintained and are becoming outdated. +* **Absence of standardization:** each connector is its own open-source project. So you never know the quality of a tap or target until you have actually used it. There is no guarantee whatsoever about what you’ll get. +* **Singer’s connectors are standalone binaries:** you still need to build everything around them to make them work \(e.g. UI, configuration validation, state management, normalization, schema migration, monitoring, etc\). +* **No full commitment to open sourcing all connectors,** as some connectors are only offered by StitchData under a paid plan. + +## **Airbyte:** + +* Our ambition is to support **300+ connectors by the end of 2021.** We already supported about 50 connectors at the end of 2020, just 5 months after the project’s inception. +* Airbyte’s connectors are **usable out of the box through a UI and API**, with monitoring, scheduling and orchestration. Airbyte was built on the premise that a user, whatever their background, should be able to move data in 2 minutes. Data engineers might want to use raw data and their own transformation processes, or to use Airbyte’s API to include data integration in their workflows. On the other hand, analysts and data scientists might want to use normalized consolidated data in their database or data warehouses. Airbyte supports all these use cases. +* **One platform, one project with standards:** this helps consolidate development behind a single project, with a standardized data protocol that can benefit all teams and use cases. 
+* **Connectors can be built in the language of your choice,** as Airbyte runs them as Docker containers. +* **Airbyte integrates with your data stack and your needs:** Airflow, Kubernetes, DBT, etc. Its normalization is optional: you get a basic version that works out of the box, but you can also use DBT for more complex transformations. +* **A full commitment to the open-source MIT project** with the promise not to hide some connectors behind paid walls. + +Note that Airbyte’s data protocol is compatible with Singer’s. So it is easy to migrate a Singer tap onto Airbyte. + diff --git a/docs/faq/differences-with.../stitchdata-vs-airbyte.md b/docs/faq/differences-with.../stitchdata-vs-airbyte.md new file mode 100644 index 00000000000..a30a032e724 --- /dev/null +++ b/docs/faq/differences-with.../stitchdata-vs-airbyte.md @@ -0,0 +1,26 @@ +# StitchData vs Airbyte + +We wrote an article, “[Open-source vs. Commercial Software: How to Solve the Data Integration Problem](https://airbyte.io/articles/data-engineering-thoughts/open-source-vs-commercial-software-how-to-better-solve-data-integration/),” in which we describe the pros and cons of StitchData’s commercial approach and Airbyte’s open-source approach. Don’t hesitate to check it out for more detailed arguments. As a summary, here are the differences: + +### StitchData: + +* **Limited, deprecating connectors:** Stitch only supports 150 connectors. Talend has stopped investing in StitchData and its connectors. And since they are built on Singer, each connector is its own open-source project. So you never know the quality of a tap or target until you have actually used it. There is no guarantee whatsoever about what you’ll get. +* **Pricing indexed on usage:** StitchData’s pricing is indexed on the connectors used and the volume of data transferred. Teams always need to keep that in mind and are not free to move data without thinking about cost. 
+* **Security and privacy compliance:** all companies are subject to privacy compliance laws, such as GDPR, CCPA, HIPAA, etc. Above a certain company size \(about 100 employees\), all external products need to go through a security compliance process that can take several months. +* **No moving data between internal databases:** StitchData sits in the cloud, so if you have to replicate data from one internal database to another, it makes no sense to have the data move through their cloud for privacy and cost reasons. +* **StitchData’s Singer connectors are standalone binaries:** you still need to build everything around them to make them work. And it’s hard to update some pre-built connectors, as they are of poor quality. + +### Airbyte: + +* **Free, as open source, so no more pricing based on usage:** learn more about our [future business model](../../company-handbook/business-model.md) \(connectors will always remain open-source\). +* **Supporting 50+ connectors by the end of 2020** \(so in only 5 months of existence\). Our goal is to reach 300+ connectors by the end of 2021. +* **Building new connectors made trivial, in the language of your choice:** Airbyte makes it a lot easier to create your own connector, vs. building them yourself in-house \(with Airflow or other tools\). Scheduling, orchestration, and monitoring come out of the box with Airbyte. +* **Maintenance-free connectors you can use in minutes.** Just authenticate your sources and warehouse, and get connectors that adapt to schema and API changes for you. +* **Addressing the long tail of connectors:** with the help of the community, Airbyte aims to support thousands of connectors. +* **Adapt existing connectors to your needs:** you can adapt any existing connector to address your own unique edge case. +* **Using data integration in a workflow:** Airbyte’s API lets engineering teams add data integration jobs into their workflow seamlessly. 
+* **Integrates with your data stack and your needs:** Airflow, Kubernetes, DBT, etc. Its normalization is optional: you get a basic version that works out of the box, but you can also use DBT for more complex transformations. +* **Debugging autonomy:** if you experience any connector issue, you won’t need to wait for StitchData’s customer support team to get back to you; you can often fix the issue quickly yourself. +* **Your data stays in your cloud.** Have full control over your data, and the costs of your data transfers. +* **No more security and privacy compliance hurdles, as Airbyte is self-hosted and open-sourced \(MIT\).** Any team can directly address their integration needs. + diff --git a/docs/faq/getting-started.md b/docs/faq/getting-started.md new file mode 100644 index 00000000000..dd2f0917e9c --- /dev/null +++ b/docs/faq/getting-started.md @@ -0,0 +1,23 @@ +# Getting Started + +### **What do I need to get started using Airbyte?** + +You can deploy Airbyte in several ways, as [documented here](../deploying-airbyte/). Airbyte will then help you replicate data between a source and a destination. Airbyte offers pre-built connectors for both; you can see their list [here](../changelog/connectors.md). If you don’t see the connector you need, you can [build your connector yourself](../contributing-to-airbyte/building-new-connector/) and benefit from Airbyte’s optional scheduling, orchestration and monitoring modules. + +### **How long does it take to set up Airbyte?** + +It depends on your source and destination. Check our setup guides to see the tasks for your source and destination. Each source and destination also has a list of prerequisites for setup. To make setup faster, get your prerequisites ready before you start to set up your connector. During the setup process, you may need to contact others \(like a database administrator or AWS account owner\) for help, which might slow you down. 
But if you have access to the connection information, it can take 2 minutes: see this [demo video](https://www.youtube.com/watch?v=jWVYpUV9vEg). + +### **What data sources does Airbyte offer connectors for?** + +We already offer 50+ connectors, and will focus all our effort on ramping up the number of connectors and strengthening them. View the [full list here](../changelog/connectors.md). If you don’t see a source you need, you can file a [connector request here](https://github.com/airbytehq/airbyte/issues/new?assignees=&labels=area%2Fintegration%2C+new-integration&template=new-integration-request.md&title=). + +### **Where can I see my data in Airbyte?** + +You can’t see your data in Airbyte, because we don’t store it. The sync loads your data into your destination \(data warehouse, data lake, etc.\). While you can’t see your data directly in Airbyte, you can check your schema and sync status on the source detail page in Airbyte. + +### **Can I add multiple destinations?** + +Sure, you can. Just go to the "Destinations" section and click on the "+ new destination" button at the top right. You can have multiple destinations for the same source, and multiple sources for the same destination. + + diff --git a/docs/faq/security-and-data-audits.md b/docs/faq/security-and-data-audits.md new file mode 100644 index 00000000000..01c806287c8 --- /dev/null +++ b/docs/faq/security-and-data-audits.md @@ -0,0 +1,14 @@ +# Security & Data Audits + +## **How secure is Airbyte?** + +Airbyte is an open-source, self-hosted solution, so let’s say it is as safe as your own data infrastructure. + +## **Is Airbyte GDPR compliant?** + +Airbyte is a self-hosted solution, so it doesn’t bring any security or privacy risk to your infrastructure. We do intend to add data quality and privacy compliance features in the future, in order to give you more visibility on that topic. + +## **How does Airbyte charge?** + +We don’t. All connectors are under the MIT license. 
+If you are curious about the business model we have in mind, please check our [company handbook](https://docs.airbyte.io/company-handbook/company-handbook/business-model). + diff --git a/docs/faq/technical-support.md b/docs/faq/technical-support.md new file mode 100644 index 00000000000..91191030105 --- /dev/null +++ b/docs/faq/technical-support.md @@ -0,0 +1,24 @@ +--- +description: Common issues and their workarounds +--- + +# Technical Support + +### Airbyte is stuck while loading required configuration parameters for my connector + +Example of the issue: + +![](../.gitbook/assets/faq_stuck_onboarding.png) + +To load configuration parameters, Airbyte must first `docker pull` the connector's image, which may be many hundreds of megabytes. Under poor connectivity conditions, the request to pull the image may take a very long time or time out. More context on this issue can be found [here](https://github.com/airbytehq/airbyte/issues/1462). If your internet speed is less than 30 Mbps down, or if you are running bandwidth-consuming workloads concurrently with Airbyte, you may encounter this issue. Run a [speed test](https://fast.com/) to verify your internet speed. + +One workaround is to manually pull the latest version of every connector you'll use and then reset Airbyte. Note that this will remove any configured connections, sources, or destinations you currently have in Airbyte. To do this: + +1. Decide which connectors you'd like to use. For this example let's say you want the Postgres source and the Snowflake destination. +2. Find the Docker image name of those connectors. Look [here](https://github.com/airbytehq/airbyte/blob/master/airbyte-config/init/src/main/resources/seed/source_definitions.yaml) for sources and [here](https://github.com/airbytehq/airbyte/blob/master/airbyte-config/init/src/main/resources/seed/destination_definitions.yaml) for destinations. 
For each of the connectors you'd like to use, copy the values of the `dockerRepository` and `dockerImageTag` fields. For example, for the Postgres source these would be `airbyte/source-postgres` and, e.g., `0.1.6`. +3. For **each of the connectors** you'd like to use, from your shell run `docker pull <dockerRepository>:<dockerImageTag>`, replacing `<dockerRepository>` and `<dockerImageTag>` with the values copied in the step above, e.g.: `docker pull airbyte/source-postgres:0.1.6`. +4. Once you've finished downloading all the images, from the Airbyte repository root run `docker-compose down -v` followed by `docker-compose up`. +5. The issue should be resolved. + +If the above workaround does not fix your problem, please report it [here](https://github.com/airbytehq/airbyte/issues/1462) or in our [Slack](https://slack.airbyte.io). + diff --git a/docs/faq/transformation-and-schemas.md b/docs/faq/transformation-and-schemas.md new file mode 100644 index 00000000000..d7b3ac1e6d3 --- /dev/null +++ b/docs/faq/transformation-and-schemas.md @@ -0,0 +1,16 @@ +# Transformation and Schemas + +## **Where's the T in Airbyte’s ETL tool?** + +Airbyte is actually an ELT tool, and you have the freedom to use it as an EL-only tool. The transformation part is done by default, but it is optional. You can choose to receive the data raw \(as JSON files, for instance\) in your destination. + +We do provide normalization \(if the option is turned on\) so that data analysts / scientists / any users of the data can use it without much effort. + +We also intend to integrate deeply with DBT to make it easier for your team to continue relying on it, if that is what you were already doing. + +## **How does Airbyte handle replication when a data source changes its schema?** + +Airbyte continues to sync data using the configured schema until that schema is updated. Because Airbyte treats all fields as optional, if a field is renamed or deleted in the source, that field simply will no longer be replicated, but all remaining fields will. The same is true for streams as well. 
+ +For now, the schema can only be updated manually in the UI \(by clicking "Update Schema" on the settings page for the connection\). When the schema is updated, Airbyte will re-sync all data for that source using the new schema. +
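The manual-pull workaround added in `docs/faq/technical-support.md` above can be condensed into a small shell sketch. Assumptions: the connector list below only contains the Postgres source example and tag cited on that page (check the seed YAML files for the `dockerRepository:dockerImageTag` pairs that match your deployment), and the `docker` / `docker-compose` commands are echoed rather than executed so the sketch stays side-effect free.

```shell
#!/bin/sh
# Sketch of the "manually pull every connector image, then reset Airbyte"
# workaround. Extend IMAGES with the dockerRepository:dockerImageTag pairs
# copied from source_definitions.yaml / destination_definitions.yaml.
set -eu

IMAGES="airbyte/source-postgres:0.1.6"

# Step 3: pull each connector image ('echo' keeps this side-effect free;
# remove it to actually pull).
for image in $IMAGES; do
  echo docker pull "$image"
done

# Step 4: from the Airbyte repository root, tear everything down (this wipes
# configured connections, sources, and destinations) and start fresh.
echo docker-compose down -v
echo docker-compose up
```

Removing the `echo`s turns the sketch into the actual workaround; remember that `docker-compose down -v` deletes the volumes holding your Airbyte configuration.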