mirror of synced 2025-12-26 14:02:10 -05:00
Commit Graph

660 Commits

Author SHA1 Message Date
Evan Tahler
27f3e4cacc Document append + file destinations might have repeat records (#29819) 2023-08-24 16:41:53 -07:00
Joe Bell
2215d0f8e5 Destinations V2 Remove Soft Reset (#29805) 2023-08-24 14:19:50 -05:00
Joe Bell
ea3a07f03d 🐛 Destination Bigquery - don't soft reset overwrite syncs (#29774) 2023-08-23 22:31:32 +00:00
Joe Bell
cd5671ca2e 🐛 Destination v2 general availability (#29636)
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: btkcodedev <btk.codedev@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Cole Snodgrass <cole@airbyte.io>
Co-authored-by: Jonathan Pearlin <jonathan@airbyte.io>
Co-authored-by: Jose Pefaur <jose.pefaur@gmail.com>
Co-authored-by: Sujay Khandekar <sskhandek@gmail.com>
Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>
2023-08-23 14:34:23 -05:00
Cirdes
abf7935fcd 🐛 Destination Typesense - Increase connection timeout time to avoid errors (#29555)
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
2023-08-23 10:20:05 -03:00
Aaron ("AJ") Steers
d094b152c9 Java CDK 'no-op': v0.0.1 (#28687)
Co-authored-by: aaronsteers <aaronsteers@users.noreply.github.com>
Co-authored-by: Conor <cpdeethree@users.noreply.github.com>
Co-authored-by: cpdeethree <conor@airbyte.io>
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: Joe Bell <joseph.bell@airbyte.io>
2023-08-21 14:01:32 -05:00
Joe Reuter
26bc2a8762 Langchain destination: Clean up spec schema (#29514) 2023-08-21 12:58:09 +02:00
Joe Reuter
dd170e2e9d Langchain destination: Make starter pods work (#29513) 2023-08-21 12:25:30 +02:00
Ryan Br
9aad31caf2 Fix duplicate object names for concurrent uploads: Always use UUID fo… (#29640)
Co-authored-by: Evan Tahler <evan@airbyte.io>
2023-08-19 16:52:12 -05:00
Davin Chia
ec0f83b03f 🚨 Destination Snowflake: Remove GCS/S3 Staging. (#29236)
As title, remove the GCS/S3 staging methods.

There isn't much usage so we can remove this. Internal Staging is also recommended by Snowflake, so using that is both cheaper and faster.

Co-authored-by: davinchia <davinchia@users.noreply.github.com>
Co-authored-by: Evan Tahler <evan@airbyte.io>
Co-authored-by: Pedro S. Lopez <pedroslopez@me.com>
2023-08-18 13:15:25 -07:00
Edward Gao
b2c0f389f2 Destination bigquery v2: throw error on reserved column name prefixes (#29560) 2023-08-17 23:29:55 +00:00
Joe Bell
900645d53f 🐛 Destination Bigquery - ensure raw dataset created with migration (#29522) 2023-08-17 10:40:47 -06:00
Edward Gao
f003a062ba 🐛 Destination bigquery: Properly fix per-stream state handling (#29498)
Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-08-17 15:30:37 +00:00
Joe Bell
085b1215ca 🐛 Destination Bigquery - fix migration logic (#29461) 2023-08-16 22:51:23 -06:00
Edward Gao
f2dc37e907 🐛 Destinations v2: handle streams with no columns (#29381)
* handle streams with no columns

* logistics

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-08-16 18:24:55 -06:00
Benoit Moriceau
8d19017f48 Switch redshift staging to async mode (#28619)
* Async snowflake

* Use async in destination implementation

* Format

* Switch redshift to async mode

* Remove old unused consumer creation

* Add new version

* Fix non staging mode

* Change switching to use the get serialized consumer

* Automated Commit - Format and Process Resources Changes

* Test

* Automated Commit - Format and Process Resources Changes

* Use method

* Test smaller buffer

* Test smaller buffer for redshift

* Automated Commit - Format and Process Resources Changes

* Bigger ratio

* Remove snowflake changes

* Implement the new interface

* Automated Commit - Format and Process Resources Changes

* push ratio to 0.8

* Smaller Optimal buffer size

* Automated Commit - Format and Process Resources Changes

* Bigger buffer

* Use a buffer of 10 Mb

* Use a buffer of 75 Mb

* Test reduce lib thread

* Add flags for remote profiler.

* Part size to match the async part size

* Part size to 100 Mb

* restore default

* Try with 1 thread

* Go back to default

* Clean up

* Bump version

* Restore gradle

* Re-add vm capture

* Test reduce allowed buffer size

* Use all the memory available

* only 3 threads for the lib

* Automated Commit - Format and Process Resources Changes

* test with 1

* Automated Commit - Format and Process Resources Changes

* Add local logging.

* Do not use all RAM for heap.

* Fix build

* Clean up

* Clean up

* Update airbyte-integrations/bases/bases-destination-jdbc/src/main/java/io/airbyte/integrations/destination/staging/AsyncFlush.java

Co-authored-by: Davin Chia <davinchia@gmail.com>

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: Davin Chia <davinchia@gmail.com>
Co-authored-by: benmoriceau <benmoriceau@users.noreply.github.com>
2023-08-14 15:53:05 -05:00
Joe Bell
6e9cdc8ebe Destination BigQuery - Add v1v2 Migration (#28962)
* Add everything for BQ but migrate, refactor interface after practical work

* Make new default methods, refactor to single implemented method

* MigrationInterface and BQ impl created

* Trying to integrate with standard inserts

* remove unnecessary NameAndNamespacePair class

* Shimmed in

* Java Docs

* Initial Testing Setup

* Tests!

* Move Migrator into TyperDeduper

* Functional Migration

* Add Integration Test

* Pr updates

* bump version

* bump version

* version bump

* Update to airbyte-ci-internal (#29026)

* 🐛 Source Github, Instagram, Zendesk-support, Zendesk-talk: fix CAT tests fail on `spec` (#28910)

* connectors-ci: better modified connectors detection logic (#28855)

* connectors-ci: report path should always start with `airbyte-ci/` (#29030)

* make report path always start with airbyte-ci

* revert report path in orchestrator

* add more test cases

* bump version

* Updated docs (#29019)

* CDK: Embedded reader utils (#28873)

* relax pydantic dep

* Automated Commit - Format and Process Resources Changes

* wip

* wrap up base integration

* add init file

* introduce CDK runner and improve error message

* make state param optional

* update protocol models

* review comments

* always run incremental if possible

* fix

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>

* 🤖 Bump minor version of Airbyte CDK

* 🚨🚨 Low code CDK: Decouple SimpleRetriever and HttpStream (#28657)

* fix tests

* format

* review comments

* Automated Commit - Formatting Changes

* review comments

* review comments

* review comments

* log all messages

* log all message

* review comments

* review comments

* Automated Commit - Formatting Changes

* add comment

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>

* 🤖 Bump minor version of Airbyte CDK

* 🐛 Source Github, Instagram, Zendesk Support / Talk - revert `spec` changes and improve (#29031)

* Source oauth0: new streams and fix incremental (#29001)

* Add new streams Organizations,OrganizationMembers,OrganizationMemberRoles

* relax schema definition to allow additional fields

* Bump image tag version

* revert some changes to the old schemas

* Format python so gradle can pass

* update incremental

* remove unused print

* fix unit test

---------

Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com>

* 🐛 Source Mongo: Fix failing acceptance tests (#28816)

* Fix failing acceptance tests

* Fix failing strict acceptance tests

* Source-Greenhouse: Fix unit tests for new CDK version (#28969)

Fix unit tests

* Add CSV options to the CSV parser (#28491)

* remove invalid legacy option

* remove unused option

* the tests pass but this is quite messy

* very slight clean up

* Add skip options to csv format

* fix some of the typing issues

* fixme comment

* remove extra log message

* fix typing issues

* skip before header

* skip after header

* format

* add another test

* Automated Commit - Formatting Changes

* auto generate column names

* delete dead code

* update title and description

* true and false values

* Update the tests

* Add comment

* missing test

* rename

* update expected spec

* move to method

* Update comment

* fix typo

* remove unused import

* Add a comment

* None records do not pass the WaitForDiscoverPolicy

* format

* remove second branch to ensure we always go through the same processing

* Raise an exception if the record is None

* reset

* Update tests

* handle unquoted newlines

* Automated Commit - Formatting Changes

* Update test case so the quoting is explicit

* Update comment

* Automated Commit - Formatting Changes

* Fail validation if skipping rows before header and header is autogenerated

* always fail if a record cannot be parsed

* format

* set write line_no in error message

* remove none check

* Automated Commit - Formatting Changes

* enable autogenerate test

* remove duplicate test

* missing unit tests

* Update

* remove branching

* remove unused none check

* Update tests

* remove branching

* format

* extract to function

* comment

* missing type

* type annotation

* use set

* Document that the strings are case-sensitive

* public -> private

* add unit test

* newline

---------

Co-authored-by: girarda <girarda@users.noreply.github.com>

* Dagster: Add sentry logging (#28822)

* Add sentry

* add sentry decorator

* Add traces

* Use sentry trace

* Improve duplicate logging

* Add comments

* DNC

* Fix up issues

* Move to scopes

* Remove breadcrumb

* Update lock

* Source Shortio: Migrate Python CDK to Low-code CDK (#28950)

* Migrate Shortio to Low-Code

* Update abnormal state

* Format

* Update Docs

* Fix metadata.yaml

* Add pagination

* Add incremental sync

* add incremental parameters

* update metadata

* rollback update version

* release date

---------

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* Update to new verbiage (#29051)

* [skip ci] Metadata: Remove leading underscore (#29024)

* DNC

* Add test models

* Add model test

* Remove underscore from metadata files

* Regenerate models

* Add test to check for key transformation

* Allow additional fields on metadata

* Delete transform

* Proof of concept parallel source stream reading implementation for MySQL (#26580)

* Proof of concept parallel source stream reading implementation for MySQL

* Automated Change

* Add read method that supports concurrent execution to Source interface

* Remove parallel iterator

* Ensure that executor service is stopped

* Automated Commit - Format and Process Resources Changes

* Expose method to fix compilation issue

* Use concurrent map to avoid access issues

* Automated Commit - Format and Process Resources Changes

* Ensure concurrent streams finish before closing source

* Fix compile issue

* Formatting

* Exclude concurrent stream threads from orphan thread watcher

* Automated Commit - Format and Process Resources Changes

* Refactor orphaned thread logic to account for concurrent execution

* PR feedback

* Implement readStreams in wrapper source

* Automated Commit - Format and Process Resources Changes

* Add readStream override

* Automated Commit - Format and Process Resources Changes

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* Debug logging

* Reduce logging level

* Replace synchronized calls to System.out.println when concurrent

* Close consumer

* Flush before close

* Automated Commit - Format and Process Resources Changes

* Remove charset

* Use ASCII and flush periodically for parallel streams

* Test performance harness patch

* Automated Commit - Format and Process Resources Changes

* Cleanup

* Logging to identify concurrent read enabled

* Mark parameter as final

---------

Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>

* connectors-ci: disable dependency scanning (#29033)

* updates (#29059)

* Metadata: skip breaking change validation on prerelease (#29017)

* skip breaking change validation

* Move ValidatorOpts higher in call

* Add prerelease test

* Fix test

* Source MongoDB Internal POC: Generate Test Data (#29049)

* Add script to generate test data

* Fix prose

* Update credentials example

* PR feedback

* Bump Airbyte version from 0.50.12 to 0.50.13

* Bump versions for mssql strict-encrypt (#28964)

* Bump versions for mssql strict-encrypt

* Fix failing test

* Fix failing test

* 🎨 Improve replication method selection UX (#28882)

* update replication method in MySQL source

* bump version

* update expected specs

* update registries

* bump strict encrypt version

* make password always_show

* change url

* update registries

* 🐛 Avoid writing records to log (#29047)

* Avoid writing records to log

* Update version

* Rollout ctid cdc (#28708)

* source-postgres: enable ctid+cdc implementation

* 100% ctid rollout for cdc

* remove CtidFeatureFlags

* fix CdcPostgresSourceAcceptanceTest

* Bump versions and release notes

* Fix compilation error due to previous merge

---------

Co-authored-by: subodh <subodh1810@gmail.com>

* connectors-ci: fix `unhashable type 'set'` (#29064)

* Add Slack Alert lifecycle to Dagster for Metadata publish (#28759)

* DNC

* Add slack lifecycle logging

* Update to use slack

* Update slack to use resource and bot

* Improve markdown

* Improve log

* Add sensor logging

* Extend sensor time

* merge conflict

* PR Refactoring

* Make the tests work

* remove unnecessary classes, pr feedback

* more merging

* Update airbyte-integrations/bases/base-typing-deduping-test/src/main/java/io/airbyte/integrations/base/destination/typing_deduping/BaseSqlGeneratorIntegrationTest.java

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* snowflake updates

---------

Co-authored-by: Ben Church <ben@airbyte.io>
Co-authored-by: Baz <oleksandr.bazarnov@globallogic.com>
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com>
Co-authored-by: Joe Reuter <joe@airbyte.io>
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com>
Co-authored-by: Jonathan Pearlin <jonathan@airbyte.io>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: btkcodedev <btk.codedev@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Natalie Kwong <38087517+nataliekwong@users.noreply.github.com>
Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>
Co-authored-by: Alexandre Cuoci <Hesperide@users.noreply.github.com>
Co-authored-by: terencecho <terencecho@users.noreply.github.com>
Co-authored-by: Lake Mossman <lake@airbyte.io>
Co-authored-by: Benoit Moriceau <benoit@airbyte.io>
Co-authored-by: subodh <subodh1810@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
2023-08-09 14:12:16 -06:00
Edward Gao
e9f1a7ad01 Destination snowflake 1s1t: release v2 early access (#29174)
* disable 1s1t in gcs/s3 mode

* derp

* quote things in many places

* fix timestamp format...?

* delete unused tests

* more expectedrecord timestamp fixes

* implement dumpFinalTable

* fix bug

* bugfix in schema change detection

* add schema change detection tests

* fix schema change detection

* add spec options

* logistics

* add timestamp format test

* and snowflake

* accept raw schema override

* fix test handling

* fix unit test

* Automated Commit - Format and Process Resources Changes

* typo

* forgot to fix this

* ... I had uncommitted changes

* resolve TODOs

* add regex examples

* correctly drop check table in check connection

* also bump teradata >.>

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-08-09 10:21:05 -06:00
Edward Gao
38530ee0b3 Destinations snowflake, redshift: Simplify default namespace handling (#29188)
* pass default namespace to more convenient location

* add final modifier

* logistics
2023-08-08 18:34:41 -06:00
Evan Tahler
9210547af5 [Docs] No Deduped + History, Append + Deduped is the future! (#29114)
* [Docs] No `Deduped + History`, `Append + Deduped` is the future!

* fix links
2023-08-08 15:07:49 -07:00
Edward Gao
2866ed6f2f Destination snowflake: mostly done implementations for sqlgenerator+destinationhandler (#28677)
* csv sheet generator supports 1s1t

* create+insert raw tables 1s1t

* add skeletons

* start writing tests

* progress in creating raw tables

* fix tests

* add s3 test; better csv generation

* handle case-sensitive column names

* also add gcs test

* hook T+D into the destination

* fix redshift; simplify

* Delete unused files?

* disable test; enable cleanup

* initialize config singleton in tests

* logistics

* header

* simplify

* fix unit tests

* correctly disable tests

* use default null for loaded_at

* fix test

* autoformat

* cython >.>

* more singleton init

* literally how?

* basic destinationhandler impl

* use raw string for type >.>

* add toDialectType

* basic createTable impl

* better sql query

* comment

* unused variables

* recorddiffer can be case-sensitive

* misc fixes

* add expected_records

* move constants to base-java

* use ternary

* fix tests

* resolve todo

* T+D can trigger on first commit

* fix test teardown

* implement softReset

* implement overwriteFinalTable

* better type stuff; check table schema

* fix

* derp

* implement updateTable?

* derp

* random wip stuff

* fix insertRaw

* theoretically implement stuff?

* stuff

* put suffix at the end

* different uuids

* fix expected records

* move tdtest resources into dat folder

* use resource files

* stuff

* move code around

* more stuff

* rename final table

* stuff

* cdc immediate deletion

* cdcComplexUpdate

* cleanup

* botched rebase

* more tests

* move back to old file

* Automated Commit - Format and Process Resources Changes

* add comments

* Automated Commit - Format and Process Resources Changes

* fix merge

* move expected_records into dat folder

* wip implement sqlgenerator test

* basic implementation

* tons of fixes, still tons more to go

* more stuff

* fix more things

* hacky convert temporal types to varchar

* test data fix

* fix variant parsing

* fix number

* fix time parsing; fix test data

* typo

* fix input data

* progress

* switch back to float

* add more test files

* swap int -> number

* fix PK null check

* fix overwriteTable

* better test

* Automated Commit - Format and Process Resources Changes

* type aliases, one more test

* also verify numeric precision/scale

* logistics

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-08-07 09:14:59 -07:00
Edward Gao
460295427b 🐛 Destination bigquery 1s1t: handle erroneous cdc deletion record (#29106)
* fix bigquery cdc edge case

* fix condition

* logistics
2023-08-04 15:05:28 -06:00
Edward Gao
b74ddffdbe 🐛 Destination bigquery 1s1t: wrap jsonpath fieldname in quotes (#29089)
* wrap jsonpath fieldname in quotes

* logistics
2023-08-04 11:28:38 -06:00
Edward Gao
ac67bfd9fd 1s1t: Refactor sqlgenerator integration test (#28890)
* random wip stuff

* fix insertRaw

* theoretically implement stuff?

* stuff

* put suffix at the end

* different uuids

* fix expected records

* move tdtest resources into dat folder

* use resource files

* stuff

* move code around

* more stuff

* rename final table

* stuff

* cdc immediate deletion

* cdcComplexUpdate

* cleanup

* botched rebase

* more tests

* move back to old file

* Automated Commit - Format and Process Resources Changes

* add comments

* Automated Commit - Format and Process Resources Changes

* Automated Commit - Format and Process Resources Changes

* raw name update

* logistics

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-08-03 17:27:17 -06:00
Benoit Moriceau
d7f6bcbefe 🐛 Avoid writing records to log (#29047)
* Avoid writing records to log

* Update version
2023-08-03 15:20:12 -05:00
Edward Gao
3af7f3b6fb 🐛 Destinations snowflake + bigquery: only parse catalog in 1s1t mode (#28976)
* only parse catalog in 1s1t mode

* one more thing?

* logistics
2023-08-02 15:19:52 -07:00
Joe Reuter
516e89ce8d update docs (#28973) 2023-08-02 18:42:26 +02:00
Joe Reuter
4d229f2974 Langchain destination: Check Pinecone index dimensions as part of check (#28977)
* check dimensions

* improve error message

* adjust changelog
2023-08-02 18:38:52 +02:00
Edward Gao
694731f4e4 Destination bigquery v2: Fix _ab_cdc_deleted_at handling in non-dedup modes (#28959)
* fix bug in deleted_at handling

* add test

* comments

* more comments

* logistics
2023-08-02 09:48:25 -06:00
Cynthia Yin
ae30ce09de Destinations V2: open up early access for BigQuery via spec toggle (#28894)
* update spec + version

* update PR link
2023-08-01 18:12:21 -05:00
Benoit Moriceau
a68ea60f63 Reduce log noise (#28917)
* Reduce log noise

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: benmoriceau <benmoriceau@users.noreply.github.com>
2023-08-01 13:21:41 -05:00
Edward Gao
360f0e8f74 Destination snowflake: Add 1s1t skeletons (#28618)
* csv sheet generator supports 1s1t

* create+insert raw tables 1s1t

* add skeletons

* start writing tests

* progress in creating raw tables

* fix tests

* add s3 test; better csv generation

* handle case-sensitive column names

* also add gcs test

* hook T+D into the destination

* fix redshift; simplify

* Delete unused files?

* disable test; enable cleanup

* initialize config singleton in tests

* logistics

* header

* simplify

* fix unit tests

* correctly disable tests

* use default null for loaded_at

* fix test

* autoformat

* cython >.>

* more singleton init

* literally how?

* unused variables

* recorddiffer can be case-sensitive

* move constants to base-java

* use ternary
2023-07-31 11:14:25 -05:00
Edward Gao
83fb3caeea 🚨 Destination bigquery 1s1t: change raw dataset + table name (#28723)
* add test for raw dataset override

* tests hardcode raw dataset name

* rename raw tables

* minimum 1

* logistics

* different option per destination
2023-07-27 12:37:17 -05:00
Eduard Tudenhoefner
aab90a0e48 Destination Iceberg: Support server-managed storage config (#28506) 2023-07-27 08:41:36 -05:00
Edward Gao
9f6963ccfc Destination Bigquery 1s1t: handle cursor change (#28721)
* handle new cursor column

* sync2 is actually weird, apparently

* logistics

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-07-26 18:36:15 -06:00
Joe Reuter
94e8ef8b1e Langchain destination: Add Chroma support (#28605)
* add chroma support

* prepare release

* prepare release

* adjust docs

* adjust docs

* normalize metadata for Chroma
2023-07-26 19:29:15 +02:00
Edward Gao
df274b7f40 Destination snowflake: destinations v2 scaffolding (#28584)
* deps

* scaffolding

* logistics

* base-jdbc depends on base-td
2023-07-24 14:58:50 -05:00
Joe Bell
6159452f8f 🐛 Destination BigQuery - Limit Clustering Column Amount (#28625)
* Add limit to the number of clustering columns

* update bq version
2023-07-24 13:48:56 -06:00
Edward Gao
adf8870de0 🐛 Destination bigquery 1s1t: respect dataset location (#28580)
* set dataset location during creation

* logistics
2023-07-21 16:54:45 -06:00
Davin Chia
b9a3c0817e 🐛 Release Snowflake Async State Bug. (#28581)
Release #28342 for Snowflake.
2023-07-21 15:38:54 -07:00
Edward Gao
53da5baa7d Destination bigquery 1s1t: fix 1s1t schema change logic; extract TyperDeduper (#28490)
* rename for clarity

* fix cleanup method

* giant commit because I'm irresponsible

* rename constant

* better raw table creation

* fix build?

* move code around

* tweaks

* more code shuffling

* Automated Commit - Format and Process Resources Changes

* add tests

* minor tweak

* remove unimportant methods

* cleanup

* Automated Commit - Format and Process Resources Changes

* derp

* clean up tests

* some more fixes post-merge

* botched merge

* create NoopTyperDeduper

* try and update everything to work?

* tweak comment

* move suffix args to end of list

* fix exception message

* Automated Commit - Format and Process Resources Changes

* add sqlgenerator test for softReset

* only prepare once

* update log message

* do what intellij says

* implement one more test

* less indirection

* Automated Commit - Format and Process Resources Changes

* rename test

* use noop in test

* version bump + changelog

* use stringutils

* fix typo

* flip if-statement

* typo

* simplify logic

* fix schema change logic

* typo

* use spy for clarity

* Automated Commit - Format and Process Resources Changes

* better test teardown

* slightly better logs

* fix exception message

* softReset returns single string

* Automated Commit - Format and Process Resources Changes

* simplify if chain

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
2023-07-21 13:45:19 -06:00
Joe Reuter
6540fa7c91 Langchain destination: Support composite and nested primary keys for deduping (#28556)
* support composite primary keys for deduping

* prepare release

* format

* fix

---------

Co-authored-by: Augustin <augustin@airbyte.io>
2023-07-21 17:17:29 +02:00
Joe Bell
a16cbea2ae Destination BigQuery - Handle Schema Changes (#28382)
* Add ability to detect differences in expected Schemas and perform soft resets

* Remove alter table for overwrite syncs since it's unnecessary

* Updates after testing

* pr reorganize

* comments

* add collection util test

* Add Tests

* bump version

* Automated Commit - Format and Process Resources Changes

* Destination BigQuery - Reduce amount of typing and deduping for GCS staging (#28489)

* undo comment out

* centralize t&d logic for staging and standard, add valve to staging

* Share more logic for typing and deduping

* Remove record checking logic and use only time for staging inserts

* Add Javadoc

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>

* Change TableNotMigratedException to extend runtime exception, remove SqlGenerator interface method

* Make Lambda slightly more readable

* add test for validating v2 schemas

* change soft reset to single string

* convert back to list, update dockerfile

* remove needless default

---------

Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>
2023-07-20 09:25:44 -06:00
Augustin
9815e080cd destination-langchain: fix build (#28509) 2023-07-20 13:22:52 +02:00
Eduard Tudenhoefner
588c7d6f43 Destination Iceberg: Bump Iceberg from 1.1.0 to 1.3.0 and add REST catalog support (#28158)
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
2023-07-19 14:45:01 -05:00
Joe Reuter
d5d7e757f8 Vector databases destination: Make available on cloud (#28398)
* make connector available on cloud

* fix build

* prepare release

* Update langchain.md
2023-07-18 18:23:29 +02:00
Ben Church
6fa755f81d Java Pipeline Bump patch bump all java connectors in july (#28345)
* patch bump all java connectors in july

* Bump changelog
2023-07-14 21:01:39 -05:00
Evan Tahler
b81cc031e0 destination-redshift should fail syncs if records or properties are too large, rather than silently skipping records and succeeding (#27993)
* `destination-redshift` will fail syncs if records or properties are too large, rather than silently skipping records and succeeding

* Bump version

* remove tests that don't matter any more

* more test removal

* more test removal

---------

Co-authored-by: Augustin <augustin@airbyte.io>
2023-07-14 14:27:12 -05:00
Davin Chia
fa4a278a2d 🐛 Destination Snowflake: Pull in async minor bug fix. (#28315)
* Pull in async minor bug fix.

* Update readme.
2023-07-14 08:53:01 -07:00
Edward Gao
934acaa137 Destination bigquery: rerelease 1s1t behind gate (#27936)
* Revert "Revert "Destination Bigquery: Scaffolding for destinations v2 (#27268)""

This reverts commit 348c577dbb.

* version bumps+changelog

* Speed up BQ by having 2 queries, and not an OR (#27981)

* 🐛 Destination Bigquery: fix bug in standard inserts for syncs >10K records (#27856)

* only run t+d code if it's enabled

* dockerfile+changelog

* remove changelog entry

* Destinations V2: handle optional fields for `object` and `array` types (#27898)

* catch null schema

* fix null properties

* clean up

* consolidate + add more tests

* try catch

* empty json test

* Automated Commit - Formatting Changes

* remove todo

* destination bigquery: misc updates to 1s1t code (#28057)

* switch to checkedconsumer

* add unit test for buildColumnId

* use flag

* restructure prefix check

* fix build

* more type-parsing fixes (#28100)

* more type-parsing fixes

* handle duplicates

* Automated Commit - Format and Process Resources Changes

* add tests for asColumns

* Automated Commit - Format and Process Resources Changes

* log warnings instead of throwing exception

* better log message

* error level

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>

* Automated Commit - Formatting Changes

* Improve protocol type parsing (#28126)

* Automated Commit - Formatting Changes

* Change from T&D every 10k records to an increasing time based interval (#28130)

* fifteen minute t&d

* add typing and deduping operation valve for increased intervals of typing and deduping

* Automated Commit - Format and Process Resources Changes

* resolve bizarre merge conflict

* Automated Commit - Format and Process Resources Changes

---------

Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>

* Simplify and speed up CDC delete support [DestinationsV2] (#28029)

* Simplify and speed up CDC delete support [DestinationsV2]

* better QUOTE

* spotbugs?

* recompile dbt image for local arch and use that when building images

* things compile, but tests fail

* tests working-ish

* comment

* fix logic to re-insert deleted records for cursor comparison.

tests pass!

* remove comment

* Skip CDC re-include logic if there are no CDC columns

* stop hardcoding pk (#28092)

* wip

* remove TODOs

---------

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* update method name

* Automated Commit - Formatting Changes

* depend on pinned normalization version

* implement 1s1t DATs for destination-bigquery (#27852)

* initial implementation

* Automated Commit - Formatting Changes

* add second sync to test

* do concurrent things

* Automated Commit - Formatting Changes

* clarify comment

* minor tweaks

* more stuff

* Automated Commit - Formatting Changes

* minor cleanup

* lots of fixes

* handle sql vs json null better
* verify extra columns
* only check deleted_at if in DEDUP mode and the column exists
* add full refresh append test case

* Automated Commit - Formatting Changes

* add tests for the remaining sync modes

* Automated Commit - Formatting Changes

* readability stuff

* Automated Commit - Formatting Changes

* add test for gcs mode

* remove static fields

* Automated Commit - Formatting Changes

* add more test cases, tweak test scaffold

* cleanup

* Automated Commit - Formatting Changes

* extract recorddiffer

* and use it in the sql generator test

* fix

* comment

* naming+comment

* one more comment

* better assert

* remove unnecessary thing

* one last thing

* Automated Commit - Formatting Changes

* enable concurrent execution on all java integration tests

* add test for default namespace

* Automated Commit - Formatting Changes

* implement a 2-stream test

* Automated Commit - Formatting Changes

* extract methods

* invert jsonNodesNotEquivalent

* Automated Commit - Formatting Changes

* fix conditional

* pull out diffSingleRecord

* Automated Commit - Formatting Changes

* handle nulls correctly

* remove raw-specific handling; break up methods

* Automated Commit - Formatting Changes

---------

Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: octavia-approvington <octavia-approvington@users.noreply.github.com>

* Destinations V2: move create raw tables earlier (#28255)

* move create raw tables

* better log message

* stop building normalization (#28256)

* fix ability to run tests

* disable incremental t+d for now

* Automated Commit - Formatting Changes

---------

Co-authored-by: Evan Tahler <evan@airbyte.io>
Co-authored-by: Cynthia Yin <cynthia@airbyte.io>
Co-authored-by: cynthiaxyin <cynthiaxyin@users.noreply.github.com>
Co-authored-by: edgao <edgao@users.noreply.github.com>
Co-authored-by: Joe Bell <joseph.bell@airbyte.io>
Co-authored-by: jbfbell <jbfbell@users.noreply.github.com>
Co-authored-by: octavia-approvington <octavia-approvington@users.noreply.github.com>
2023-07-14 09:34:56 -05:00