mirror of synced 2025-12-21 19:11:14 -05:00
Commit Graph

234 Commits

Author SHA1 Message Date
Przemysław Dąbek
6bf8c21499 Update documentation about source_catalog_id and its reference to actor_catalog table (#27080) 2023-06-07 06:50:26 -05:00
Edward Gao
fb152a9a0a Normalization: Better handling for CDC transactional updates (#25993)
* try this?

* fix tests

* assert cdc values

* handle case where we have lsn but no updated_at

* readability improvements

* tweaks to test

* version bumps + changelogs

* Automated Change

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-05-12 12:53:23 +00:00
Cynthia Yin
8400d20352 Destination Redshift: deprecate old migration normalization code (#25771)
* first pass normalization

* add pr link

* remove python test & resources

* linting
2023-05-05 14:18:27 -07:00
Emma Forman Ling
3b60f23194 Use correct CDC field names led by underscores (#25761) 2023-05-02 20:46:53 -03:00
Jeff Cowan (Airbyte)
bdca4cdf06 publish normalization (#25591)
* publish normalization
* bump normalization container version in all the destinations that use it

Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-04-28 13:17:50 -07:00
Rodi Reich Zilberman
0bab1756b8 Rename airbyte-config module (#24885)
* rename airbyte-config module

* Automated Commit - Formatting Changes

* sanity

* update import

* update import

* update script

* update script

* update script

* update script

* Automated Change

* Automated Change

* Automated Change

* Automated Change

* update awsdatalake icon

* point slash commands to new path

* sanity

* Automated Commit - Formatting Changes

* sanity

* Automated Change

* Automated Change

* sanity

---------

Co-authored-by: rodireich <rodireich@users.noreply.github.com>
2023-04-06 10:47:30 -07:00
Edward Gao
9b7b30f92b Normalization: Use strict > comparison in incremental mode (#22381)
* copy tests from other branch

* switch to >

* [wip] wire up tests

* make tests work

* fixes

* nicer test structure

* maybe add feature flag?

* pattern matching

* also add version check

* formatting

* refactor test also

* extract test + fix method call

* minor tweaks

* add context to log message

* put workspace id in normalization input

* use non-semver tag

* add flag for version of normalization

* also flag old version

* add test

* missed part of the commit

* format

* add test for null workspace ID

* Revert "also flag old version"

This reverts commit 3be601d16c.

* Revert "missed part of the commit"

This reverts commit 47a67b4631.

* always apply flag, even if we're behind a version

* derp

* Add more logging to the normalization activity

* Update charts and kustomize for the feature flag

* fix clickhouse integration test

* remove replace_identifiers

* Revert "remove replace_identifiers"

This reverts commit 0e7ded5a7b.

* fix replace_identifiers

* garbage debug logs

* stop trying to setup duckdb test

* wake up and choose violence

* fix mssql

* exclude duckdb from tests

* make snowflake happy

* uncomment tests

* derp

* derpderp

* format

* format

* also fix redshift???

* maybe now everything works???

* remove debug logs

* use special docker tag

* bump to new tag

* use random test schema in publish also

* properly cleanup

* remove feature flag stuff

* version bump + changelog

* Automated Commit - Formatting Changes

* bump definitions

---------

Co-authored-by: Jimmy Ma <gosusnp@users.noreply.github.com>
Co-authored-by: Jimmy Ma <jimmy@airbyte.io>
Co-authored-by: octavia-squidington-iii <octavia-bot@airbyte.io>
Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-03-23 09:37:15 -07:00
Edward Gao
2e9e2bb202 Data types documentation: revert to v0 + update text (#24096) 2023-03-16 10:20:28 -07:00
Edward Gao
294cdbcf4a Normalization Bigquery: Add more reserved words (#24077)
* add current times for bigquery

* bump version + changelog
2023-03-15 18:33:59 +00:00
Evan Tahler
4ee62d9dfb Update supported-data-types.md - fix link to well-known-type (#23961) 2023-03-13 08:49:07 -07:00
Mikhail Shustov
2ce3c17048 🎉 Destination ClickHouse: bump dbt-clickhouse to v1.4.0 (#23023)
* bump dbt-clickhouse to 1.4.0

* fix clickhouse integration test

* exclude duckdb from tests

* add to changelog

* bump normalization version in definitions

---------

Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2023-02-16 20:15:09 -08:00
Ryan Fu
d21068c989 Tentatively disallowed normalization for DuckDB (#22528) 2023-02-07 20:22:49 -08:00
Edward Gao
517fc6ac10 Normalization: Revert to protocol v0 (#22283)
* Revert "Normalization: handle non-object top-level schemas; treat binary data as string (#22165)"

This reverts commit 8276d03359.

* Revert "Normalization: check for ref type existence (#22161)"

This reverts commit dbe56d6fc2.

* Revert "🎉Updated normalization to handle new datatypes (#19721)"

This reverts commit c1d7736639.

* revert dest definitions

* also dockerfile

* re-add to changelog

* add comment in dockerfile
2023-02-06 10:14:36 -08:00
Sophia Wiley
281bb5a090 Added info about normalization costs to docs (#22182)
* added info about normalization costs

* edited wording
2023-02-01 09:15:05 -08:00
Edward Gao
8276d03359 Normalization: handle non-object top-level schemas; treat binary data as string (#22165)
* handle dumb top-level schemas

* version bump

* also definitions

* treat binary as string

* fallback case

* format

* new variable
2023-01-31 15:59:04 -06:00
Edward Gao
dbe56d6fc2 Normalization: check for ref type existence (#22161)
* check for ref type existence

* version bump

* bump normalization version

* format
2023-01-31 11:33:34 -08:00
Evan Tahler
0b45ad84e1 Update database-data-catalog.md (#22105) 2023-01-30 14:13:06 -08:00
Jimmy Ma
6660b13ad2 Add Airbyte Protocol V1 support. (#20036)
* Add Airbyte Protocol V1 support.

* Fix VersionedAirbyteStreamFactoryTest

* Remove AirbyteMessageMigrationV0 example

* Add Protocol Version constants

* 🎉Updated normalization to handle new datatypes (#19721)

* Updated normalization simple stream processing to handle new datatypes

* Updated normalization nested stream processing to handle new datatypes

* Updated normalization nested stream processing to handle new datatypes

* Updated normalization drop_scd_catalog processing to handle new datatypes

* Updated normalization ephemeral test processing to handle new datatypes

* fixed more tests for normalization

* fixed more tests for normalization

* fixed more tests for normalization

* fixed more tests for normalization

* fixed more issues

* fixed more issues (clickhouse)

* fixed more issues

* fixed more issues

* fixed more issues

* added binary type processing for some DBs

* cleared commented code and moved some hardcodes to processing as macro

* fixed codestyle and cleared commented code

* minor refactor

* minor refactor

* minor refactor

* fixed bool cast error

* fixed dict->str cast error

* fixed is_combining_node cast py check

* removed commented code

* removed commented code

* committed autogenerated normalization_test_output files

* committed autogenerated normalization_test_output files (new files)

* refactored utils.py

* Updated utils.py to use Callable functions and get rid of property_type in is_number and is_bool functions

* committed autogenerated normalization_test_output files (new files)

* fixed typo in TIMESTAMP_WITH_TIMEZONE_TYPE

* updated stream_processor to handle string type first as a wider type

* fixed arrays normalization by updating is_simple_property method as per new approaches

* format

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* Update airbyte protocol migration (#20745)

* Extract MigrationContainer from AirbyteMessageMigrator

* Add ConfiguredAirbyteCatalogMigrations

* Add ConfiguredAirbyteCatalog to AirbyteMessageMigrations

* Enable ConfiguredAirbyteCatalog migration

* Fix tests

* Remove extra this.

* Add missing docs

* Typo

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* Data types update: Implement protocol message migrations (#19240)

* Extract MigrationContainer from AirbyteMessageMigrator

* Add ConfiguredAirbyteCatalogMigrations

* Add ConfiguredAirbyteCatalog to AirbyteMessageMigrations

* Enable ConfiguredAirbyteCatalog migration

* set up scaffolding

* [wip] more scaffolding, basic unit test

* minimal green code

* [wip] add failing test for other primitive types

* correct version number

* handle basic primitive type decls

* add implicit cases

* add recursive schema

* formatting

* comment

* support not

* fix indentation

* handle all nested schema cases

* handle boolean schemas

* verify empty schema handling

* cleanup

* extract map

* code organization

* extract method

* reformat

* [wip] more tests, minor fix type array handling

* corrected test

* cleanup

* reformat

* switch to v1

* add support for multityped fields

* missed test case

* nested test class

* basic record upgrade

* implement record upgrades

* slight refactor

* comments+clarifications

* extract constants

* (partly) correct model classes

* add de/ser

* formatting

* extract constants

* fix json reference

* update docs

* switch to v1 models

* fix compile+test

* add base64 handling

* use vnull

* Data types update: Implement protocol message downgrade path (#19909)

* rough skeleton for passing catalog into migration

* basic test

* more scaffolding

* basic implementation

* add primitives test

* add in other tests (nested fields currently failing)

* add formats

* implement oneOf handling

* formatting

* oneOf handling

* better tests

* comments + organization

* progress

* basic test case

* downgrade objects, ish

* basic array implementation

* handle numeric failure

* test for new type

* handle array items

* empty schema handling

* first pass at oneof handling

* add more tests+handling

* more tests

* comments

* add empty oneof test case

* format + reorganize

* more reorganize

* fix name

* also downgrade binary data

* only import vnull

* move migrations into v1 package

* extract schema mutation code

* comment

* extract schema migration to new class

* extract record downgrade logic for future use

* format

* fix build after rebase

* rename private method for consistency

* also implement configuredcatalog migrations >.>

* quick and dirty tests

* slight cleanup

* fix tests

* pmd

* pmd test

* null check on message objects

* maybe fix acceptance tests?

* fix name

* extract constants

* more fixes

* tmp

* meh

* fix cdc acc tests

* revert to master source-postgres

* remove log messages

* revert other misc hacks

* integers are valid cursors

* remove unrelated change

* fix build

* fix build more?

* [MUST REVERT] use dev normalization

* capture kube logs

* also here?

* no debug logs?

* delete dup from merging

* add final everywhere

* revert test changes

Co-authored-by: Jimmy Ma <jimmy@airbyte.io>

* On-the-fly migrations of persisted catalogs (#21757)

* On the fly catalog migration for normalization activity

* On the fly catalog migration for job persistence

* On the fly migration for standard sync persistence

* On the fly migration for airbyte catalogs

* Refactor code to share JsonSchema traversal

* Add V0 Data type search function

* PMD and Format

* Fix getOrInsertActorCatalog and ConfigRepositoryE2E tests

* Null-proofing CatalogMigrationV1Helper

* More null checks

* Fix test

* Format

* Add data type v1 support to the FE

* Changes AC test check to check exited ps (#21672)

some docker compose changes no longer show exited
processes. this broke our test

this change should fix master

tested in a runner that failed

* Move wellknown types mapping to the utility function

* use protocolv1 normalization

---------

Co-authored-by: Topher Lubaway <asimplechris@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* Update protocol support range (#21996)

* bump normalization version to 0.3.0

* Add version check on normalization (#22048)

* Add normalization min version check

* Add visible for testing

---------

Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: Eugene <etsybaev@gmail.com>
Co-authored-by: Topher Lubaway <asimplechris@gmail.com>
2023-01-30 10:17:49 -08:00
Evan Tahler
a55eb7df2d Fix CI Dependency Check Failures (#20666)
* pardot

* plaid

* quickbooks

* appfollow

* appstore

* cloudtrail

* clickup

* clockify

* coda

* Coinmarketcap

* cooper

* dixa

* dv-360

* exchange-rates

* file

* gridly

* Hellobaton

* kustomer

* mailersend

* microsoft dataverse

* n8n

* PersistIq

* Survey Sparrow

* Twilio Taskrouter

* YouTube Analytics Business

* Younium

* Yahoo Finance Price

* Yandex Metrica

* Xero

* WooCommerce

* XKCD

* Webflow

* US Census API

* Qonto

* Pivotal Tracker

* KVDB

* Firestore

* Ignore even more connectors

* test run

* SFTP JSON

* cleanup

* move pardot changelog

* update links

* remove testing HACK

* Update docs/integrations/sources/dixa.md

Co-authored-by: Augustin <augustin@airbyte.io>

* Update docs/integrations/sources/kustomer-singer.md

Co-authored-by: Augustin <augustin@airbyte.io>

* Update docs/integrations/sources/pardot.md

Co-authored-by: Augustin <augustin@airbyte.io>

* Update docs/integrations/sources/kustomer-singer.md

Co-authored-by: Augustin <augustin@airbyte.io>

* Update docs/integrations/sources/pardot.md

Co-authored-by: Augustin <augustin@airbyte.io>

Co-authored-by: Augustin <augustin@airbyte.io>
2022-12-20 16:57:25 -06:00
Geoff Genz
b7816f4f58 🐛 Destination ClickHouse: Update Normalization Docker File (#19573)
* Update ClickHouse normalization docker file

* bump destination and norm version

* auto-bump connector version

* update doc

Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-12-06 14:49:18 -03:00
Jimmy Ma
5a2642af69 Airbyte Protocol v1 (#19846)
* Fix versioning of protocol

* Update protocol changelog

* Update protocol $id

* Fix PR reference in the changelog
2022-11-29 16:54:13 -08:00
Ben Church
61bf812105 Fix markdown typo in airbyte-protocol-docker.md (#19842)
## Overview
I noticed the `I/O` heading was not displaying properly due to a missing space. 

This fixes that.
2022-11-28 11:42:45 -08:00
Jimmy Ma
43e6a52413 Add versioning documentation (#19738)
* On Deploying versioning error from the bootloader

* Add ProtocolVersion description to airbyte protocol

* Add general protocol versioning doc
2022-11-28 11:32:09 -08:00
Evan Tahler
e6b06a88ac AirbyteEstimateTraceMessage (#18875)
* `AirbyteEstimateTraceMessage`

* Add PR number

* fix method name

* Lint

* Lint

* fix merge

* Update docs/understanding-airbyte/airbyte-protocol.md

Co-authored-by: Davin Chia <davinchia@gmail.com>

* `EstimateType` sub type in python

* lint

Co-authored-by: Davin Chia <davinchia@gmail.com>
2022-11-07 12:45:39 -08:00
Ella Rohm-Ensing
e6bfe10278 [docs] Use correct header hierarchy in airbyte-protocol docs (#18917)
* Use correct header hierarchy

* Use h2 for Actor Specification

* Revert indentation of key concepts
2022-11-07 10:39:58 -07:00
Greg Solovyev
7b9a097081 Add normalization changelog and bump normalization version in platform (#18813) 2022-11-01 15:36:31 -07:00
Evan Tahler
02459e8354 Protocol Change: AirbyteControlMessage.ConnectorConfig (#17907)
* Protocol Change: AirbyteConfigMessage

* update PR link in docs

* Lint

* Update python files

* Update docs/understanding-airbyte/airbyte-protocol.md

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Update docs/understanding-airbyte/airbyte-protocol.md

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* `AirbyteConfigMessage` -> `AirbyteConnectorConfigMessage`

* AirbyteOrchestratorMessage

* Update docs

* `AirbyteControlConnectorConfigMessage`

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
2022-10-28 11:11:31 -07:00
Taras Korenko
d840a8a80d Simplify the OSS documentation deploy system (#2670) (#18598)
+ Adds Slack notification about failed Docs builds
  (slack notifications fall into "#oss-master-build-failure" for now)
+ (while here) Unbreaks docs build
2022-10-28 17:08:06 +03:00
Edward Gao
aea4c5392c Protocol change: Define a set of well-known data types (#17486)
* add WellKnownTypes.yaml

* rename to snakecase + put in airbyte-protocol

* add examples

* more descriptions

* descriptions, more restrictions, better regex

* update documentation

* explicitly call out BC support
2022-10-27 14:04:15 -07:00
Edward Gao
bd0c2388df Destination Clickhouse: Publish normalization after removing native port (#17896)
* bump version for publish

* add link for publish pr
2022-10-12 13:51:55 -07:00
Topher Lubaway
9c46e5a560 Fixes relative links above docs to be URIs (#17850)
also adds a missing exit 1 to deploy docs
when the deploy SHOULD fail
2022-10-11 11:36:33 -05:00
Alexander Marquardt
9cc86d857f Update full-refresh-append.md (#17784)
Added a link to the new article comparing sync modes
2022-10-10 15:55:42 +02:00
Alexander Marquardt
caeb8017d3 Update full-refresh-overwrite.md (#17783)
Added a link to the new article that compares replication modes
2022-10-10 15:55:27 +02:00
Alexander Marquardt
5e30aaff14 Update incremental-append.md (#17785)
Added a link to the new article on choosing a replication mode
2022-10-10 15:55:15 +02:00
Alexander Marquardt
3ea0b5a82d Update incremental-deduped-history.md (#17786)
Add a link to new article on choosing a replication mode
2022-10-10 15:55:05 +02:00
Alexander Marquardt
f5f3e87f21 Update cdc.md (#17787)
* Update cdc.md

Added a link to the article that explains Airbyte replication modes

* Update cdc.md

Added a link to the CDC "exploration" tutorial
2022-10-10 15:54:48 +02:00
Charles
675e153cb6 jobs db descriptions (#16543) 2022-10-08 23:16:28 -05:00
Charles
2237654392 config db data catalog (#16427) 2022-10-08 23:11:11 -05:00
Simon Späti
8a74d39ac3 📖 Removing existing Glossary of Terms and link to new Data Glossary (#17313) 2022-10-03 12:37:34 +02:00
Damilare Oyediran
33bad16267 Update supported-data-types.md (#17179)
Boolean data type was added to the official docs.
2022-09-29 15:35:13 -03:00
Liren Tu
9356d6adf5 📝 Update json avro conversion doc about logical type union (#17010)
* Update json avro doc to include logical type union

* Add a note about the potential problem

* Add issue link
2022-09-21 15:43:44 -07:00
Alexander Marquardt
2fbb85c3a0 Update incremental-append.md (#16579)
* Update incremental-append.md

Added link to the incremental sync tutorial

* Update incremental-append.md
2022-09-16 17:55:44 +02:00
Alexander Marquardt
005378626c Update incremental-deduped-history.md (#16578)
* Update incremental-deduped-history.md

Added a link to the incremental sync tutorial

* Update incremental-deduped-history.md

* Update incremental-deduped-history.md
2022-09-16 17:55:33 +02:00
Alexander Marquardt
ae85cacba5 Update full-refresh-overwrite.md (#16580) 2022-09-16 17:55:01 +02:00
Alexander Marquardt
7f2b27274b Update full-refresh-append.md (#16581) 2022-09-16 17:54:46 +02:00
Filipp Balakin
17cb363fe1 Normalization: update dbt-clickhouse version from 1.1.7 to 1.1.8 (#16339)
* Bump dbt-clickhouse version from 1.1.7 to 1.1.8

* pin dbt<1.2

* update doc

* pin dbt core lt 1.2

* bump normalization version

* loosen requirements

* fix md

* remove empty line

* bump normalization version in worker

* bump normalization version

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
2022-09-14 08:27:31 -03:00
Davin Chia
08090bca6b Update the worker/job documentation. Add documentation for container orchestrator. (#16575)
- Add a container orchestrator section.
- Update existing section to include details on Kubernetes vs Docker.
- Move some things around so things are smoother.
2022-09-13 13:29:53 -07:00
Evan Tahler
dcfcb75d0f AirbyteLogMessage.stack_trace for logging messages with related (non-fatal) errors (#16479)
* Test log message from faker

* AirbyteLogMessage gains stack_trace

* Fixup spacing

* bump python protocol

* fixup additionalProperties in faker spec

* bump faker version

* update docs

* use lineSeparator vs \r\n

* auto-bump connector version [ci skip]

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
2022-09-12 16:01:57 -07:00
Evan Tahler
6f2fcff85c fix changelog formatting (#16618)
* fix changelog formatting

* fixup docs
2022-09-12 15:59:50 -07:00
Evan Tahler
3a1df7ea58 Update airbyte-protocol.md 2022-09-12 10:55:54 -07:00