mirror of synced 2025-12-23 03:47:05 -05:00
Commit Graph

321 Commits

Author SHA1 Message Date
Joe Bell
05c7269a50 Destination BigQuery - Try unsafe operations with exception handling (#30696) 2023-09-27 20:57:44 +00:00
Edward Gao
7a8150eb25 Destination bigquery: soft reset correctly handles previous sync unclean exit (#30697) 2023-09-22 19:05:14 +00:00
Edward Gao
15c8ce9a67 Destination bigquery: handle case where stream name and namespace are identical (#30640)
Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-09-22 10:51:28 -05:00
Ryan Br
317f7acc22 Bigquery uses async framework (#30069)
Co-authored-by: tryangul <tryangul@users.noreply.github.com>
Co-authored-by: benmoriceau <benoit@airbyte.io>
Co-authored-by: benmoriceau <benmoriceau@users.noreply.github.com>
2023-09-20 16:34:07 -05:00
Cynthia Yin
bb73ece673 Destination BigQuery + Snowflake: delete unused constants for reserved keywords (#30592) 2023-09-19 15:18:28 -07:00
Cynthia Yin
839fb26e09 🐛 Destination Snowflake - support reserved column names (#30319)
Co-authored-by: cynthiaxyin <cynthiaxyin@users.noreply.github.com>
2023-09-19 02:27:20 -07:00
Evan Tahler
e366ebdbd7 destination-bigquery GCS staging comes first (#30551) 2023-09-18 17:14:36 -07:00
Edward Gao
8805edcea6 DV2: Better errors in the UI (#30491)
Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-09-18 15:58:56 +00:00
Edward Gao
68c6b01937 Destinations v2: threadsafe setup (#30439)
Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-09-14 11:56:12 -05:00
Edward Gao
0910f3a626 DV2: add log messages for name collisions (#30364) 2023-09-12 22:47:08 +00:00
Edward Gao
e0ce2ac4d0 destinations v2: snowflake: threadsafe typing and deduping (#29878)
Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: Evan Tahler <evan@airbyte.io>
Co-authored-by: evantahler <evantahler@users.noreply.github.com>
2023-09-12 17:28:50 +00:00
Cynthia Yin
1db33d9c55 🐛 Destination BigQuery - change typing and deduping error column creation (#30070)
Co-authored-by: cynthiaxyin <cynthiaxyin@users.noreply.github.com>
2023-09-06 16:44:48 -07:00
Joe Bell
11068c1c4e Destinations V2 - T&D Streams in parallel (#30020)
Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2023-09-06 10:04:34 -07:00
Evan Tahler
7fc7db7ea4 T&D frequency: stream start, 6h(*) (#30117) 2023-09-06 01:22:20 +00:00
Edward Gao
9231c044d9 🐛 Destination snowflake: Create final tables with uppercase naming (#30056)
Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-09-05 11:54:53 -05:00
Edward Gao
5baaa714d8 Destination bigquery: improve performance by skipping safe_cast for string columns (#30120) 2023-09-01 23:48:56 +00:00
Evan Tahler
54ba99046b destination-bigquery v2.0.1 (#29972) 2023-08-29 14:31:40 -05:00
Edward Gao
44f840ddb5 Destinations bigquery+snowflake: Release DV2 (#29783)
Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: Evan Tahler <evan@airbyte.io>
Co-authored-by: evantahler <evantahler@users.noreply.github.com>
2023-08-29 14:06:50 -05:00
Joe Bell
2215d0f8e5 Destinations V2 Remove Soft Reset (#29805) 2023-08-24 14:19:50 -05:00
Joe Bell
ea3a07f03d 🐛 Destination Bigquery - don't soft reset overwrite syncs (#29774) 2023-08-23 22:31:32 +00:00
Joe Bell
cd5671ca2e 🐛 Destination v2 general availability (#29636)
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: btkcodedev <btk.codedev@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Cole Snodgrass <cole@airbyte.io>
Co-authored-by: Jonathan Pearlin <jonathan@airbyte.io>
Co-authored-by: Jose Pefaur <jose.pefaur@gmail.com>
Co-authored-by: Sujay Khandekar <sskhandek@gmail.com>
Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>
2023-08-23 14:34:23 -05:00
Aaron ("AJ") Steers
d094b152c9 Java CDK 'no-op': v0.0.1 (#28687)
Co-authored-by: aaronsteers <aaronsteers@users.noreply.github.com>
Co-authored-by: Conor <cpdeethree@users.noreply.github.com>
Co-authored-by: cpdeethree <conor@airbyte.io>
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: Joe Bell <joseph.bell@airbyte.io>
2023-08-21 14:01:32 -05:00
Edward Gao
b2c0f389f2 Destination bigquery v2: throw error on reserved column name prefixes (#29560) 2023-08-17 23:29:55 +00:00
Joe Bell
900645d53f 🐛 Destination Bigquery - ensure raw dataset created with migration (#29522) 2023-08-17 10:40:47 -06:00
Edward Gao
f003a062ba 🐛 Destination bigquery: Properly fix per-stream state handling (#29498)
Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-08-17 15:30:37 +00:00
Joe Bell
085b1215ca 🐛 Destination Bigquery - fix migration logic (#29461) 2023-08-16 22:51:23 -06:00
Edward Gao
f2dc37e907 🐛 Destinations v2: handle streams with no columns (#29381)
* handle streams with no columns

* logistics

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-08-16 18:24:55 -06:00
Joe Bell
6e9cdc8ebe Destination BigQuery - Add v1v2 Migration (#28962)
* Add everything for BQ but migrate, refactor interface after practical work

* Make new default methods, refactor to single implemented method

* MigrationInterface and BQ impl created

* Trying to integrate with standard inserts

* remove unnecessary NameAndNamespacePair class

* Shimmed in

* Java Docs

* Initial Testing Setup

* Tests!

* Move Migrator into TyperDeduper

* Functional Migration

* Add Integration Test

* Pr updates

* bump version

* bump version

* version bump

* Update to airbyte-ci-internal (#29026)

* 🐛 Source Github, Instagram, Zendesk-support, Zendesk-talk: fix CAT tests fail on `spec` (#28910)

* connectors-ci: better modified connectors detection logic (#28855)

* connectors-ci: report path should always start with `airbyte-ci/` (#29030)

* make report path always start with airbyte-ci

* revert report path in orchestrator

* add more test cases

* bump version

* Updated docs (#29019)

* CDK: Embedded reader utils (#28873)

* relax pydantic dep

* Automated Commit - Format and Process Resources Changes

* wip

* wrap up base integration

* add init file

* introduce CDK runner and improve error message

* make state param optional

* update protocol models

* review comments

* always run incremental if possible

* fix

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>

* 🤖 Bump minor version of Airbyte CDK

* 🚨🚨 Low code CDK: Decouple SimpleRetriever and HttpStream (#28657)

* fix tests

* format

* review comments

* Automated Commit - Formatting Changes

* review comments

* review comments

* review comments

* log all messages

* log all message

* review comments

* review comments

* Automated Commit - Formatting Changes

* add comment

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>

* 🤖 Bump minor version of Airbyte CDK

* 🐛 Source Github, Instagram, Zendesk Support / Talk - revert `spec` changes and improve (#29031)

* Source oauth0: new streams and fix incremental (#29001)

* Add new streams Organizations,OrganizationMembers,OrganizationMemberRoles

* relax schema definition to allow additional fields

* Bump image tag version

* revert some changes to the old schemas

* Format python so gradle can pass

* update incremental

* remove unused print

* fix unit test

---------

Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com>

* 🐛 Source Mongo: Fix failing acceptance tests (#28816)

* Fix failing acceptance tests

* Fix failing strict acceptance tests

* Source-Greenhouse: Fix unit tests for new CDK version (#28969)

Fix unit tests

* Add CSV options to the CSV parser (#28491)

* remove invalid legacy option

* remove unused option

* the tests pass but this is quite messy

* very slight clean up

* Add skip options to csv format

* fix some of the typing issues

* fixme comment

* remove extra log message

* fix typing issues

* skip before header

* skip after header

* format

* add another test

* Automated Commit - Formatting Changes

* auto generate column names

* delete dead code

* update title and description

* true and false values

* Update the tests

* Add comment

* missing test

* rename

* update expected spec

* move to method

* Update comment

* fix typo

* remove unused import

* Add a comment

* None records do not pass the WaitForDiscoverPolicy

* format

* remove second branch to ensure we always go through the same processing

* Raise an exception if the record is None

* reset

* Update tests

* handle unquoted newlines

* Automated Commit - Formatting Changes

* Update test case so the quoting is explicit

* Update comment

* Automated Commit - Formatting Changes

* Fail validation if skipping rows before header and header is autogenerated

* always fail if a record cannot be parsed

* format

* set write line_no in error message

* remove none check

* Automated Commit - Formatting Changes

* enable autogenerate test

* remove duplicate test

* missing unit tests

* Update

* remove branching

* remove unused none check

* Update tests

* remove branching

* format

* extract to function

* comment

* missing type

* type annotation

* use set

* Document that the strings are case-sensitive

* public -> private

* add unit test

* newline

---------

Co-authored-by: girarda <girarda@users.noreply.github.com>

* Dagster: Add sentry logging (#28822)

* Add sentry

* add sentry decorator

* Add traces

* Use sentry trace

* Improve duplicate logging

* Add comments

* DNC

* Fix up issues

* Move to scopes

* Remove breadcrumb

* Update lock

* Source Shortio: Migrate Python CDK to Low-code CDK (#28950)

* Migrate Shortio to Low-Code

* Update abnormal state

* Format

* Update Docs

* Fix metadata.yaml

* Add pagination

* Add incremental sync

* add incremental parameters

* update metadata

* rollback update version

* release date

---------

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* Update to new verbiage (#29051)

* [skip ci] Metadata: Remove leading underscore (#29024)

* DNC

* Add test models

* Add model test

* Remove underscore from metadata files

* Regenerate models

* Add test to check for key transformation

* Allow additional fields on metadata

* Delete transform

* Proof of concept parallel source stream reading implementation for MySQL (#26580)

* Proof of concept parallel source stream reading implementation for MySQL

* Automated Change

* Add read method that supports concurrent execution to Source interface

* Remove parallel iterator

* Ensure that executor service is stopped

* Automated Commit - Format and Process Resources Changes

* Expose method to fix compilation issue

* Use concurrent map to avoid access issues

* Automated Commit - Format and Process Resources Changes

* Ensure concurrent streams finish before closing source

* Fix compile issue

* Formatting

* Exclude concurrent stream threads from orphan thread watcher

* Automated Commit - Format and Process Resources Changes

* Refactor orphaned thread logic to account for concurrent execution

* PR feedback

* Implement readStreams in wrapper source

* Automated Commit - Format and Process Resources Changes

* Add readStream override

* Automated Commit - Format and Process Resources Changes

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* Debug logging

* Reduce logging level

* Replace synchronized calls to System.out.println when concurrent

* Close consumer

* Flush before close

* Automated Commit - Format and Process Resources Changes

* Remove charset

* Use ASCII and flush periodically for parallel streams

* Test performance harness patch

* Automated Commit - Format and Process Resources Changes

* Cleanup

* Logging to identify concurrent read enabled

* Mark parameter as final

---------

Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>

* connectors-ci: disable dependency scanning (#29033)

* updates (#29059)

* Metadata: skip breaking change validation on prerelease (#29017)

* skip breaking change validation

* Move ValidatorOpts higher in call

* Add prerelease test

* Fix test

*  Source MongoDB Internal POC: Generate Test Data (#29049)

* Add script to generate test data

* Fix prose

* Update credentials example

* PR feedback

* Bump Airbyte version from 0.50.12 to 0.50.13

* Bump versions for mssql strict-encrypt (#28964)

* Bump versions for mssql strict-encrypt

* Fix failing test

* Fix failing test

* 🎨 Improve replication method selection UX (#28882)

* update replication method in MySQL source

* bump version

* update expected specs

* update registries

* bump strict encrypt version

* make password always_show

* change url

* update registries

* 🐛 Avoid writing records to log (#29047)

* Avoid writing records to log

* Update version

* Rollout ctid cdc (#28708)

* source-postgres: enable ctid+cdc implementation

* 100% ctid rollout for cdc

* remove CtidFeatureFlags

* fix CdcPostgresSourceAcceptanceTest

* Bump versions and release notes

* Fix compilation error due to previous merge

---------

Co-authored-by: subodh <subodh1810@gmail.com>

* connectors-ci: fix `unhashable type 'set'` (#29064)

* Add Slack Alert lifecycle to Dagster for Metadata publish (#28759)

* DNC

* Add slack lifecycle logging

* Update to use slack

* Update slack to use resource and bot

* Improve markdown

* Improve log

* Add sensor logging

* Extend sensor time

* merge conflict

* PR Refactoring

* Make the tests work

* remove unnecessary classes, pr feedback

* more merging

* Update airbyte-integrations/bases/base-typing-deduping-test/src/main/java/io/airbyte/integrations/base/destination/typing_deduping/BaseSqlGeneratorIntegrationTest.java

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* snowflake updates

---------

Co-authored-by: Ben Church <ben@airbyte.io>
Co-authored-by: Baz <oleksandr.bazarnov@globallogic.com>
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com>
Co-authored-by: Joe Reuter <joe@airbyte.io>
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com>
Co-authored-by: Jonathan Pearlin <jonathan@airbyte.io>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: btkcodedev <btk.codedev@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Natalie Kwong <38087517+nataliekwong@users.noreply.github.com>
Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>
Co-authored-by: Alexandre Cuoci <Hesperide@users.noreply.github.com>
Co-authored-by: terencecho <terencecho@users.noreply.github.com>
Co-authored-by: Lake Mossman <lake@airbyte.io>
Co-authored-by: Benoit Moriceau <benoit@airbyte.io>
Co-authored-by: subodh <subodh1810@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2023-08-09 14:12:16 -06:00
Evan Tahler
9210547af5 [Docs] No Deduped + History, Append + Deduped is the future! (#29114)
* [Docs] No `Deduped + History`, `Append + Deduped` is the future!

* fix links
2023-08-08 15:07:49 -07:00
Edward Gao
460295427b 🐛 Destination bigquery 1s1t: handle erroneous cdc deletion record (#29106)
* fix bigquery cdc edge case

* fix condition

* logistics
2023-08-04 15:05:28 -06:00
Edward Gao
b74ddffdbe 🐛 Destination bigquery 1s1t: wrap jsonpath fieldname in quotes (#29089)
* wrap jsonpath fieldname in quotes

* logistics
2023-08-04 11:28:38 -06:00
Edward Gao
ac67bfd9fd 1s1t: Refactor sqlgenerator integration test (#28890)
* random wip stuff

* fix insertRaw

* theoretically implement stuff?

* stuff

* put suffix at the end

* different uuids

* fix expected records

* move tdtest resources into dat folder

* use resource files

* stuff

* move code around

* more stuff

* rename final table

* stuff

* cdc immediate deletion

* cdcComplexUpdate

* cleanup

* botched rebase

* more tests

* move back to old file

* Automated Commit - Format and Process Resources Changes

* add comments

* Automated Commit - Format and Process Resources Changes

* Automated Commit - Format and Process Resources Changes

* raw name update

* logistics

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-08-03 17:27:17 -06:00
Edward Gao
3af7f3b6fb 🐛 Destinations snowflake + bigquery: only parse catalog in 1s1t mode (#28976)
* only parse catalog in 1s1t mode

* one more thing?

* logistics
2023-08-02 15:19:52 -07:00
Edward Gao
694731f4e4 Destination bigquery v2: Fix _ab_cdc_deleted_at handling in non-dedup modes (#28959)
* fix bug in deleted_at handling

* add test

* comments

* more comments

* logistics
2023-08-02 09:48:25 -06:00
Cynthia Yin
ae30ce09de Destinations V2: open up early access for BigQuery via spec toggle (#28894)
* update spec + version

* update PR link
2023-08-01 18:12:21 -05:00
Edward Gao
83fb3caeea 🚨 Destination bigquery 1s1t: change raw dataset + table name (#28723)
* add test for raw dataset override

* tests hardcode raw dataset name

* rename raw tables

* minimum 1

* logistics

* different option per destination
2023-07-27 12:37:17 -05:00
Edward Gao
9f6963ccfc Destination Bigquery 1s1t: handle cursor change (#28721)
* handle new cursor column

* sync2 is actually weird, apparently

* logistics

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-07-26 18:36:15 -06:00
Joe Bell
6159452f8f 🐛 Destination BigQuery - Limit Clustering Column Amount (#28625)
* Add limit to the number of clustering columns

* update bq version
2023-07-24 13:48:56 -06:00
Edward Gao
adf8870de0 🐛 Destination bigquery 1s1t: respect dataset location (#28580)
* set dataset location during creation

* logistics
2023-07-21 16:54:45 -06:00
Edward Gao
53da5baa7d Destination bigquery 1s1t: fix 1s1t schema change logic; extract TyperDeduper (#28490)
* rename for clarity

* fix cleanup method

* giant commit because I'm irresponsible

* rename constant

* better raw table creation

* fix build?

* move code around

* tweaks

* more code shuffling

* Automated Commit - Format and Process Resources Changes

* add tests

* minor tweak

* remove unimportant methods

* cleanup

* Automated Commit - Format and Process Resources Changes

* derp

* clean up tests

* some more fixes post-merge

* botched merge

* create NoopTyperDeduper

* try and update everything to work?

* tweak comment

* move suffix args to end of list

* fix exception message

* Automated Commit - Format and Process Resources Changes

* add sqlgenerator test for softReset

* only prepare once

* update log message

* do what intellij says

* implement one more test

* less indirection

* Automated Commit - Format and Process Resources Changes

* rename test

* use noop in test

* version bump + changelog

* use stringutils

* fix typo

* flip if-statement

* typo

* simplify logic

* fix schema change logic

* typo

* use spy for clarity

* Automated Commit - Format and Process Resources Changes

* better test teardown

* slightly better logs

* fix exception message

* softReset returns single string

* Automated Commit - Format and Process Resources Changes

* simplify if chain

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-07-21 13:45:19 -06:00
Joe Bell
a16cbea2ae Destination BigQuery - Handle Schema Changes (#28382)
* Add ability to detect differences in expected Schemas and perform soft resets

* Remove alter table for overwrite syncs since it's unnecessary

* Updates after testing

* pr reorganize

* comments

* add collection util test

* Add Tests

* bump version

* Automated Commit - Format and Process Resources Changes

* Destination BigQuery - Reduce amount of typing and deduping for GCS staging (#28489)

* undo comment out

* centralize t&d logic for staging and standard, add valve to staging

* Share more logic for typing and deduping

* Remove record checking logic and use only time for staging inserts

* Add Javadoc

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>

* Change TableNotMigratedException to extend runtime exception, remove SqlGenerator interface method

* Make Lambda slightly more readable

* add test for validating v2 schemas

* change soft reset to single string

* convert back to list, update dockerfile

* remove needless default

---------

Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>
2023-07-20 09:25:44 -06:00
Ben Church
6fa755f81d Java Pipeline Bump patch bump all java connectors in july (#28345)
* patch bump all java connectors in july

* Bump changelog
2023-07-14 21:01:39 -05:00
Edward Gao
934acaa137 Destination bigquery: rerelease 1s1t behind gate (#27936)
* Revert "Revert "Destination Bigquery: Scaffolding for destinations v2 (#27268)""

This reverts commit 348c577dbb.

* version bumps+changelog

* Speed up BQ by having 2 queries, and not an OR (#27981)

* 🐛 Destination Bigquery: fix bug in standard inserts for syncs >10K records (#27856)

* only run t+d code if it's enabled

* dockerfile+changelog

* remove changelog entry

* Destinations V2: handle optional fields for `object` and `array` types (#27898)

* catch null schema

* fix null properties

* clean up

* consolidate + add more tests

* try catch

* empty json test

* Automated Commit - Formatting Changes

* remove todo

* destination bigquery: misc updates to 1s1t code (#28057)

* switch to checkedconsumer

* add unit test for buildColumnId

* use flag

* restructure prefix check

* fix build

* more type-parsing fixes (#28100)

* more type-parsing fixes

* handle duplicates

* Automated Commit - Format and Process Resources Changes

* add tests for asColumns

* Automated Commit - Format and Process Resources Changes

* log warnings instead of throwing exception

* better log message

* error level

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>

* Automated Commit - Formatting Changes

* Improve protocol type parsing (#28126)

* Automated Commit - Formatting Changes

* Change from T&D every 10k records to an increasing time based interval (#28130)

* fifteen minute t&d

* add typing and deduping operation valve for increased intervals of typing and deduping

* Automated Commit - Format and Process Resources Changes

* resolve bizarre merge conflict

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>

* Simplify and speed up CDC delete support [DestinationsV2] (#28029)

* Simplify and speed up CDC delete support [DestinationsV2]

* better QUOTE

* spotbugs?

* recompile dbt image for local arch and use that when building images

* things compile, but tests fail

* tests working-ish

* comment

* fix logic to re-insert deleted records for cursor comparison.

tests pass!

* remove comment

* Skip CDC re-include logic if there are no CDC columns

* stop hardcoding pk (#28092)

* wip

* remove TODOs

---------

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* update method name

* Automated Commit - Formatting Changes

* depend on pinned normalization version

* implement 1s1t DATs for destination-bigquery (#27852)

* initial implementation

* Automated Commit - Formatting Changes

* add second sync to test

* do concurrent things

* Automated Commit - Formatting Changes

* clarify comment

* minor tweaks

* more stuff

* Automated Commit - Formatting Changes

* minor cleanup

* lots of fixes

* handle sql vs json null better
* verify extra columns
* only check deleted_at if in DEDUP mode and the column exists
* add full refresh append test case

* Automated Commit - Formatting Changes

* add tests for the remaining sync modes

* Automated Commit - Formatting Changes

* readability stuff

* Automated Commit - Formatting Changes

* add test for gcs mode

* remove static fields

* Automated Commit - Formatting Changes

* add more test cases, tweak test scaffold

* cleanup

* Automated Commit - Formatting Changes

* extract recorddiffer

* and use it in the sql generator test

* fix

* comment

* naming+comment

* one more comment

* better assert

* remove unnecessary thing

* one last thing

* Automated Commit - Formatting Changes

* enable concurrent execution on all java integration tests

* add test for default namespace

* Automated Commit - Formatting Changes

* implement a 2-stream test

* Automated Commit - Formatting Changes

* extract methods

* invert jsonNodesNotEquivalent

* Automated Commit - Formatting Changes

* fix conditional

* pull out diffSingleRecord

* Automated Commit - Formatting Changes

* handle nulls correctly

* remove raw-specific handling; break up methods

* Automated Commit - Formatting Changes

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: octavia-approvington <octavia-approvington@users.noreply.github.com>

* Destinations V2: move create raw tables earlier (#28255)

* move create raw tables

* better log message

* stop building normalization (#28256)

* fix ability to run tests

* disable incremental t+d for now

* Automated Commit - Formatting Changes

---------

Co-authored-by: Evan Tahler <evan@airbyte.io>
Co-authored-by: Cynthia Yin <cynthia@airbyte.io>
Co-authored-by: cynthiaxyin <cynthiaxyin@users.noreply.github.com>
Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: Joe Bell <joseph.bell@airbyte.io>
Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>
Co-authored-by: octavia-approvington <octavia-approvington@users.noreply.github.com>
2023-07-14 09:34:56 -05:00
Edward Gao
52b8cbe39d Revert "Destination Bigquery: Scaffolding for destinations v2 (#27268)" (#27891)
* Revert "Destination Bigquery: Scaffolding for destinations v2 (#27268)"

This reverts commit ba3e39bb0c.

* bump versions to 1.5.1 everywhere
2023-06-30 20:26:48 -04:00
Evan Tahler
f455c1288d Java DB Destination connector licenses to Elv2 (#27781)
* Java DB Destination connector licenses to Elv2

* PR id for docs

* fix redshift tagging
2023-06-29 12:26:24 -05:00
Edward Gao
ba3e39bb0c Destination Bigquery: Scaffolding for destinations v2 (#27268)
* copy files from edgao branch

* start writing create table statement

* add basic unit test setup

* create a table, probably

* remove outdated todo

* derp, one more column

* ugh

* add partitioning+clustering

* use StringSubstitutor

* substitutions in updateTable

* wip generate update/insert statement

* split up into smaller methods

* handle json types correctly

* rename stuff

* more json_query vs _value stuff

* minor tweak

* super basic test setup

* laying foundation for type parsing

* more stuff

* tweaks

* more progress on type parsing

* fix json_value stuff?

* misc fixes in insert

* fix dedupFinalTable

* add testDedupRaw

* full e2e test

* type parsing: gave up and mirrored the dbt code structure to avoid bugs

* type parsing - more cleanup

* handle column name collisions

* handle tablename collisions...?

* comments

* remove original ns/name from quotedstream

* also javadoc

* remove redundant method

* fix table rename

* add incremental append test

* add full refresh append test

* comment

* call T+D sql in a reasonable location for standard inserts

* add config option

* use config option here

* type parsing - fix fromJsonSchema

* gate everything

* log query + runtime

* add spec option temporarily

* Raw Table Updates

* fix more stuff

* first big pass at toDialectType

* no quotes

* wrap everything in quotes

* resolve some TODOs

* log sql statement in tests

* overwriteFinalTable returns optional

* minor clean up

* add raw dataset override

* try to preserve the original namespace for t+d?

* write to the raw table correctly

* update todos

* write directly to raw table

this is kind of dumb because we're still trying to do tmp table operations,
and we still don't ack state until the end of the entire sync.

* standard inserts write to raw table correctly

* imports + log statements

* move logs + add comment

* explicitly create raw table

* move comment to better place

* Typing issues

* bash attempt

* formatting updates

* formatting updates

* write to the airbyte schema by default unless overridden by config options

* standard inserts truncate raw table at start of sync

* full refresh overwrite will overwrite correctly!

* fix avro record schema parsing

* better raw table recreate

* rename raw table to match standard inserts

* full refresh overwrite does tmp table things

* small clean up

* small clean up

* remove errors entry if no errors

* pull out destination config into singleton

* clean up singleton stuff

* make sure dest config exists when trying to do lookups

* avoid stringifying null

* quick thoughts on alter table

* add basic cdc testcase

* tweak cdc test setup

* rename raw table to match standard inserts

* minor tweak

* delete exact sql string assertions

* switch to JSON type

* minor cleanup

* sql whitespace changes

* explain cdc deletions

* GCS Staging Full Refresh create temp table

* assert schema

* first out of order cdc test

* add another cdc test case (currently failing)

* better test structure

* make this work

* oops, fix test

* stop trying to delete deletion records

* minor improvements to code+test

* enable concurrent test runs on integration test

* move stuff to static initializer

* extract utility method

* formatting

* Move conditional to the base java package, replace conditionals which did not use the typing and deduping flag but should have been.

* 🤖 Auto format destination-bigquery code [skip ci]

* 🤖 Auto format destination-gcs code [skip ci]

* switch back to empty list; write big assert

* minor wording tweaks

* 🤖 Auto format destination-bigquery code [skip ci]

* 🤖 Auto format destination-gcs code [skip ci]

* DestinationConfigTest

* 🤖 Auto format destination-bigquery code [skip ci]

* 🤖 Auto format destination-gcs code [skip ci]

* formatting

* remove ParsedType

* 🤖 Auto format destination-gcs code [skip ci]

* 🤖 Auto format destination-bigquery code [skip ci]

* tests verify every data type

* 🤖 Auto format destination-bigquery code [skip ci]

* 🤖 Auto format destination-gcs code [skip ci]

* full update with all data types

* 🤖 Auto format destination-bigquery code [skip ci]

* 🤖 Auto format destination-gcs code [skip ci]

* move stuff to new base lib

* 🤖 Auto format destination-gcs code [skip ci]

* Automated Commit - Formatting Changes

* 🤖 Auto format destination-bigquery code [skip ci]

* fix test

* 🤖 Auto format destination-bigquery code [skip ci]

* 🤖 Auto format destination-bigquery code [skip ci]

* 🤖 Auto format destination-gcs code [skip ci]

* asserts in dedupFinalTable

* better asserts in dedupRawTable

* [wip] test case for all data types

* 🤖 Auto format destination-gcs code [skip ci]

* 🤖 Auto format destination-bigquery code [skip ci]

* AirbyteTypeTest

* Automated Commit - Formatting Changes

* remove comments

* test chooseOneOf

* slightly better test output

* Automated Commit - Formatting Changes

* add some awful pretty print code

* more comment

* minor tweaks

* verify array/object type

* fix test

* handle deletions more correctly

* test toDialectType

* Destinations v2: better namespace handling (#27682)

* [wip] better namespace handling

* 🤖 Auto format destination-bigquery code [skip ci]

* wip also implement in gcs

* get gcs working (?)

* 🤖 Auto format destination-bigquery code [skip ci]

* remove duplicate method

* 🤖 Auto format destination-bigquery code [skip ci]

* fixed my code style settings

* make ci happy?

* 🤖 Auto format destination-bigquery code [skip ci]

* make ci happy?

* remove incorrect test

* blank line change

* initialize singleton

---------

Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>

* reset args correctly

* Automated Commit - Formatting Changes

* more bash stuff

* parse implicit structs

* initialize singleton in more tests

* Automated Commit - Formatting Changes

* I missed this namespace handling thing

* test more schemas

* fix singular types specified in arrays

* Automated Commit - Formatting Changes

* disable test for unimplemented feature

* initialize singleton

* remove spec options; changelogs+metadata

* randomize namespace

* also bump dockerfile

* unremove namespace sanitizing in legacy mode

* ... disable the correct test

* even more unit test fixes!

* move integration test to integration tests

---------

Co-authored-by: Cynthia Yin <cynthia@airbyte.io>
Co-authored-by: Joe Bell <joseph.bell@airbyte.io>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: cynthiaxyin <cynthiaxyin@users.noreply.github.com>
2023-06-29 08:44:37 -07:00
Evan Tahler
1f6aef98df yum clean all after every yum install to save space (#27555)
* `yum clean all` after every yum install to save space

* docs and versions

* update env clean

* fix python install confusion

---------

Co-authored-by: Augustin <augustin@airbyte.io>
2023-06-23 13:53:22 -07:00
Edward Gao
cf2ded2bbb Destination Bigquery: small tweak to clarify logs (#26585)
* make logs less misleading

* version bumps + changelog

* tweak wording
2023-05-25 18:20:40 +00:00
Edward Gao
c25afc4adb 🐛 Destination BigQuery (+denormalized): correctly parse buffer count from config (#26213)
* fix logic in parsing config

* simplify logic

* ugh

* holy moly that took way too many iterations

* version bumps / changelog

* Automated Change

* Destination Bigquery: stop running normalization container for DAT (#25925)

* readme update

* allow passing additional flags to test containers

* remove build dependency

* Automated Change

* versioning updates

* restore denormalized change from master

* formatting changes

* formatting

* Automated Change

* update metadata file

---------

Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>

* fix version (#26218)

* Source Airtable: skip missing streams (#25946)

* Source Airtable: skip missing streams

* Move stream removal to a separate method, cover with tests

* Update changelog

* Fix flake warnings

* Update docs/integrations/sources/airtable.md

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Update docs/integrations/sources/airtable.md

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Automated Change

* Update link to docs in warning

* Automated Change

* Automated Change

* Automated Change

* “Empty-Commit”

---------

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: arsenlosenko <arsenlosenko@users.noreply.github.com>

* 🎉 New Source: Ringcentral [Low code CDK] (#25701)

* Initial commit - All test passed

* add stream fax cover

* refactor docs

* fix schema, Added pagination

* Add several streams, fix schema

* fix schema, add streams, refactor docs

* EOF

* Resolve conflicts

* Resolve conflicts

* add metadata file

---------

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* rebump version

* Automated Change

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: Joe Bell <joseph.bell@airbyte.io>
Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>
Co-authored-by: Joe Reuter <joe@airbyte.io>
Co-authored-by: Arsen Losenko <20901439+arsenlosenko@users.noreply.github.com>
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: arsenlosenko <arsenlosenko@users.noreply.github.com>
Co-authored-by: btkcodedev <btk.codedev@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
2023-05-18 16:45:24 +00:00
Joe Bell
9a4be977c1 Destination Bigquery: stop running normalization container for DAT (#25925)
* readme update

* allow passing additional flags to test containers

* remove build dependency

* Automated Change

* versioning updates

* restore denormalized change from master

* formatting changes

* formatting

* Automated Change

* update metadata file

---------

Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>
2023-05-18 00:46:32 +00:00