mirror of synced 2025-12-20 10:32:35 -05:00
Commit Graph

1266 Commits

Author SHA1 Message Date
Brian Lai
037e8ed1a9 fix cdk bug to send legacy format if connector overrides read() (#16566)
* fix cdk bug to send legacy format if connector overrides read()

* fix comment

* update changelog and setup.py
2022-09-09 21:09:50 -04:00
Alexandre Girard
0cb44ca071 release cdk with frozen dataclasses-jsonschema lib (#16525) 2022-09-09 07:03:18 -07:00
Brian Lai
1d9608cbbe [per-stream cdk] Support deserialization of legacy and per-stream state (#16205)
* interpret legacy and new per-stream format into AirbyteStateMessages

* add ConnectorStateManager stubs for future work

* remove frozen for the time being until we need to hash descriptors

* add validation that AirbyteStateMessage has at least one of stream, global, or data fields

* pr feedback and clean up of the code

* remove changes to airbyte_protocol and perform validation in read_state()

* fix import formatting
2022-09-07 13:20:14 -04:00
Brian Lai
dceeef4683 [cdk] pin dataclasses-jsonschema to 2.15.1 (#16253) 2022-09-01 21:08:32 -04:00
Serhii Chvaliuk
3cfa489234 CDK: Fix regression in _checkpoint_state arg (#16141)
Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
2022-08-31 22:39:30 +03:00
Jimmy Ma
4fbe03bebe Add explicit tracking of Airbyte Protocol Version in ConnectorSpecification (#15340)
* Add Version to AirbyteMessage

* Move protocol version to ConnectorSpecification

* Add cdk generated protocol model

* Add protocol_version to the sample ConnectorSpec in the docs

* Update airbyte-protocol/protocol-models/src/main/resources/airbyte_protocol/airbyte_protocol.yaml

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* update doc

* Update CDK changelog

* Update CDK protocol model

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
2022-08-29 14:46:56 -07:00
Brian Lai
10a2bd1d3c [low code connectors] add NoAuth to class types registry (#15939)
* add no auth to class type registry

* NoAuth should receive options and fail normally

* forgot to pass options in

* update changelog
2022-08-25 18:06:02 -04:00
Brian Lai
09cddffd36 [low code connectors] replace file retrieval with pkgutil to fix getting schema files (#15814)
* replace file retrieval with pkgutil to fix getting schema files

* slightly better error handling on missing files

* filter out schema gen warnings for some classes that cannot generate schemas

* add comment for todo

* add changelog and setup before publish
2022-08-23 03:06:20 -04:00
Brian Lai
ca6513625d [low code connectors] read configs from package_data (#15810)
* read configs from package_data

* update changelog and setup

* commenting out failing tests in the short term
2022-08-19 21:16:20 -04:00
Brian Lai
7e158ef9af [low code connectors] generate complete json schema from classes (#15647)
* draft: first pass at complete schema language generation and factory validator

* actually a working validator and fixes to the schema that went uncaught

* remove extra spike file

* fix formatting file

* Add method to generate the complete JSON schema of the low code declarative language

* add testing of a few components during schema gen

* pr feedback and a little bit of refactoring

* test for schema version

* fix some types that were erroneously marked as invalid schema

* some comments

* add jsonschemamixin to interfaces

* update tests now that interfaces are jsonschemamixin

* accidentally removed a mixin

* remove unneeded test

* make comment a little more clear

* update changelog

* bump version

* generic enum not enum class

* Add method to generate the complete JSON schema of the low code declarative language

* add testing of a few components during schema gen

* test for schema version

* update tests now that interfaces are jsonschemamixin

* accidentally removed a mixin

* remove unneeded test

* make comment a little more clear

* generic enum not enum class

* add generated json file and update docs to reference it

* verbiage
2022-08-18 18:53:42 -04:00
Brian Lai
ca80d3782a [low code connectors] perform schema validation of the input config against the declarative language schema (#15543)
* draft: first pass at complete schema language generation and factory validator

* actually a working validator and fixes to the schema that went uncaught

* remove extra spike file

* fix formatting file

* pr feedback and a little bit of refactoring

* fix some types that were erroneously marked as invalid schema

* some comments

* add jsonschemamixin to interfaces

* update changelog

* bump version
2022-08-18 15:29:26 -04:00
Alexandre Girard
313ac11e6d [low-code connectors] Get parent stream's full slice (#15631)
* always access parent stream using full_refresh mode

* Update test

* fix substream slicer

* bump
2022-08-18 10:24:10 -07:00
Serhii Chvaliuk
4e6cb05759 CDK: Improve filter_secrets skip empty string (#15684)
* Improve `filter_secrets` skip empty string

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>
2022-08-16 19:28:57 +03:00
Alexandre Girard
288c3cabad Tutorial and documentation for config-based connectors (#15027)
* 5-step tutorial

* move

* tiny bit of editing

* Update tutorial

* update docs

* reset

* move files

* record selector, request options, and more links

* update

* update

* connector definition

* link

* links

* update example

* footnote

* typo

* document string interpolation

* note on string interpolation

* update

* fix code sample

* fix

* update sample

* fix

* use the actual config

* Update as per comments

* write as yaml

* typo

* Clarify options overloading

* clarify that docker must be running

* remove extra footnote

* use venv directly

* Apply suggestions from code review

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* signup instructions

* update

* clarify that both dot and bracket notations are interchangeable

* Clarify how check works

* create spec and config before updating connector definition

* clarify what now_local() is

* rename to yaml structure

* Go through tutorial and update end of section code samples

* fix link

* update

* update code samples

* Update code samples

* Update to bracket notation

* remove superfluous comments

* Update docs/connector-development/config-based/tutorial/2-install-dependencies.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* Update docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* Update docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* Update docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* Update docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* Update docs/connector-development/config-based/tutorial/3-connecting-to-the-API-source.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* Update docs/connector-development/config-based/tutorial/4-reading-data.md

Co-authored-by: Augustin <augustin.lafanechere@gmail.com>

* fix path

* update

* motivation blurb

* warning

* warning

* fix code block

* update code samples

* update code sample

* update code samples

* small updates

* update yaml structure

* custom class example

* language annotations

* update warning

* Update tutorial to use dpath extractor

* Update record selector docs

* unit test

* link to contributing

* tiny update

* $ in front of commands

* $ in front of commands

* More readings

* link to existing config-based connectors

* index

* update

* delete broken link

* supported features

* update

* Add some links

* Update docs/connector-development/config-based/overview.md

Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>

* Update docs/connector-development/config-based/record-selector.md

Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>

* Update docs/connector-development/config-based/overview.md

Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>

* Update docs/connector-development/config-based/overview.md

Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>

* Update docs/connector-development/config-based/overview.md

Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>

* mention the unit

* headers

* remove mentions of interpolating on stream slice, etc.

* update

* exclude config-based docs

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: Augustin <augustin.lafanechere@gmail.com>
Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>
2022-08-12 15:50:54 -07:00
Alexandre Girard
6332fd6527 [low-code-connectors] Replace JelloExtractor with DpathExtractor (#15514)
* Handle extracting no records from root

* handle missing keys

* record extractor interface

* dpath extractor

* docstring

* handle extract root array

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/extractors/jello.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/extractors/record_selector.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* update docstring

* respect extractor interface

* edge case handling

* document

* use dpath by default

* delete jello extractor

* bump cdk version

* delete jello dependency

* Update reference docs templates

* update template

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
2022-08-11 12:34:54 -07:00
Alexandre Girard
29fafe26eb [low-code connectors] Extract datetime parser and handle %s format directive (#15429)
* fix parse

* Revert "fix parse"

This reverts commit 3c76c5a782.

* fix parse timestamp

* extract datetime parser

* remove print

* use parser

* top level docstring

* rename variable

* do not use timestamp()

* Revert "do not use timestamp()"

This reverts commit 016cb69193.

* update comment

* bump cdk version

* Update template
2022-08-10 16:35:29 -07:00
Alexandre Girard
f540499f43 [low-code connectors]: Assert there are no custom top-level fields (#15489)
* move components to definitions field

* Also update the references

* validate the top level fields and add version

* raise exception on unknown fields

* newline

* unit tests

* set version to 0.1.0

* newline
2022-08-10 11:37:07 -07:00
Alexandre Girard
bbf3584cb7 Remove unused field from JsonSchema (#15425)
* few fixes from working with sendgrid

* reset to master

* only update the docstring

* reset
2022-08-10 10:58:22 -07:00
Alexandre Girard
9507d56be9 low-code connectors: fix parse and format methods (#15326)
* fix parse and format methods

* define constant

* remove timestamp magic keyword

* comment

* test for ci

* uncomment test

* use timestamp()

* Bump cdk version

* bump to 0.1.72
2022-08-08 19:02:02 -07:00
Brian Lai
054cbbe94d [low code connectors] fix bug where headers were not passed to cursor interpolation (#15347)
* fix bug where headers were not passed to cursor interpolation

* add headers to error handler predicate
2022-08-08 15:23:20 -04:00
Brian Lai
ef712f18aa [low-code connectors] fix so we don't display yaml when debug flag is turned off (#15383)
* fix so we don't display yaml when debug is turned off

* forgot to remove old debug level
2022-08-08 01:11:31 -04:00
Alexandre Girard
6f1715eabe low-code connectors: convert request headers to string before submitting them (#15336)
* convert values to strings

* introduce variable

* kwargs

* bump version
2022-08-06 10:10:23 -07:00
Alexandre Girard
5242ff8e95 low-code connectors: reset pagination between stream slices (#15330)
* reset pagination between stream slices

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/retrievers/simple_retriever.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/retrievers/simple_retriever.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* patch

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
2022-08-05 16:44:56 -07:00
Alexandre Girard
c5c13f05b4 low-code-connectors: handle single records (#15346)
* handle single records

* comment

* comment
2022-08-05 15:52:45 -07:00
Brian Lai
bd31100774 initial first pass converting every component to dataclasses (#15189)
* initial first pass converting every component to dataclasses

* replace the hackier options pass through

* get rid of the hackier way

* fix issues w/ type hints by making options required and lots of fixes to the language to fix compatibility for dataclasses

* add dataclasses-jsonschema to setup

* fix oauth authenticator to avoid dataclass name collisions

* fix spacing for CI tests

* remove property from oauth and fix an interpolation bug

* pr feedback and cleaning up the code a bit, attempt at avoiding renaming

* fix templates and bugs surfaced during greenhouse spec testing

* fix tests

* fix missing options in some declarative components

* fix tests related to pulling latest master

* fix issue w/ passing state, slice, and token to subcomponents

* switch name back to get_access_token() since no name collision anymore
2022-08-05 17:39:27 -04:00
Alexandre Girard
6e59cfd7be low-code connectors: Set slicer's request options (#15283)
* requester is a request options provider

* get request options from slicer

* remove prints

* share interface

* actual fix with test

* small fix

* missing tests

* missing *

* simplify intersection logic

* bump cdk version
2022-08-04 16:18:28 -07:00
Alexandre Girard
017a092194 cast to string before passing to strptime (#15323) 2022-08-04 15:47:13 -07:00
Alexandre Girard
79a54a81fd Emit a state message even if no records were read (#15067)
* Emit a state message even if no records were read

* newline

* merge

* comment

* implement logic in the abstract source

* remove logic from declarative source

* comment

* bump cdk version
2022-08-04 12:48:21 -07:00
Alexandre Girard
f81d86ee1f Generate reference docs source (#15183) 2022-08-02 09:51:23 -07:00
Alexandre Girard
a3ff80c179 [low-code-connectors] Disable parse-time interpolation in favor of runtime-only (#14923)
* abstract auth token

* basichttp

* remove prints

* docstrings

* get rid of parse-time interpolation

* always pass options through

* delete print

* delete misleading comment

* delete note

* reset

* pass down options

* delete duplicate file

* missing test

* refactor test

* rename to '$options'

* rename to ''

* interpolatedauth

* fix tests

* fix

* docstrings

* update docstring

* docstring

* update docstring

* remove extra field

* undo

* rename to runtime_parameters

* docstring

* update

* / -> *

* update template

* rename to options

* Add examples

* update docstring

* Update test

* newlines

* rename kwargs to options

* options init param

* delete duplicate line

* type hints

* update docstring

* Revert "delete duplicate line"

This reverts commit 4255d5b346.

* delete duplicate code from bad merge

* rename file

* bump cdk version
2022-07-28 08:57:17 -07:00
Evan Tahler
d449fb6067 All objects in the Airbyte Protocol have additionalProperties: true (#15081)
* All objects in the Airbyte Protocol have `additionalProperties: true`

* order of keys

* rebuild airbyte protocol for python CDK
2022-07-27 18:03:13 -07:00
Alexandre Girard
44ec661b5a [low-code connectors] Add request options and state to stream slicers (#14552)
* comment

* comment

* comments

* fix

* test for instantiating chain retrier

* fix parsing

* cleanup

* fix

* reset

* never raise on http error

* remove print

* comment

* comment

* comment

* comment

* remove prints

* add declarative stream to registry

* start working on limit paginator

* support for offset pagination

* tests

* move limit value

* extract request option

* boilerplate

* page increment

* delete offset paginator

* update conditional paginator

* refactor and fix test

* fix test

* small fix

* Delete dead code

* Add docstrings

* quick fix

* exponential backoff

* fix test

* fix

* delete unused properties

* fix

* missing unit tests

* uppercase

* docstrings

* rename to success

* compare full request instead of just url

* rename module

* rename test file

* rename interface

* rename default retrier

* rename to compositeerrorhandler

* fix missing renames

* move action to filter

* str -> minmaxdatetime

* small fixes

* plural

* add example

* handle header variations

* also fix wait time from

* allow using a regex to extract the value

* group()

* docstring

* add docs

* update comment

* docstrings

* fix tests

* rename param

* cleanup stop_condition

* cleanup

* Add examples

* interpolated pagination strategy

* dont need duplicate class

* docstrings

* more docstrings

* docstrings

* fix tests

* first pass at substream

* seems to work for a single stream

* can also be defined in requester with stream_state

* tmp update

* update comment

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/http_requester.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* version: Update Parquet library to latest release (#14502)

The upstream Parquet library that is currently pinned for use in the S3 destination plugin is over a year old. The current version is generating invalid schemas for date-time with time-zone fields which appears to be addressed in the `1.12.3` release of the library in commit c72862b613

* merge

* 🎉 Source Github: improve schema for stream `pull_request_commits` added "null" (#14613)

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* Docs: Fixed broken links (#14622)

* fixing broken links

* more broken links

* source-hubspot: change mentioning of Mailchimp into HubSpot  doc (#14620)

* Helm Chart: Add external temporal option (#14597)

* conflict env configmap and chart lock

* reverting lock

* add eof lines and documentation on values yaml

* conflict json file

* rollback json

* solve conflict

* correct minio with new version

Co-authored-by: Guy Feldman <gfeldman@86labs.com>

* 🎉 Add YAML format to source-file reader (#14588)

* Add yaml reader

* Update docs

* Bumpversion of connector

* bump docs

* Update pyarrow dependency

* Upgrade pandas dependency

* auto-bump connector version

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* 🎉 Source Okta: add GroupMembers stream (#14380)

* add Group_Members stream to okta source

- Group_Members return a list of users, the same schema of Users stream.
- Create a shared schema users, and both group_members and users schema use it as a reference.
- Add Group_Members stream to source connector

* add tests and fix logs schema

- fix the test error: None is not one of enums though the enum type includes both string and null, it comes from json schema validator
ddb87afad8/jsonschema/_validators.py (L279-L285)
- change group_members to use id as the cursor field since `filter` is not supported in the query string
- fix the abnormal state test on logs stream: when since is abnormally large, until has to be defined as an equal or larger value
- remove logs stream from full sync test, because 2 full sync always has a gap -- at least a new log about users or groups api.

* last polish before submit the PR

- bump docker version
- update changelog
- add the right abnormal value for logs stream
- correct the sample catalog

* address comments::

- improve comments for until parameter under the logs stream
- add use_cache on groupMembers

* add use_cache to Group_Members

* change configured_catalog to test

* auto-bump connector version

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* split test files

* renames

* missing unit test

* add missing unit tests

* rename

* assert isinstance

* start extracting to their own files

* use final instead of classmethod

* assert we retry 429 errors

* Add log

* replace asserts with valueexceptions

* delete superfluous print statement

* only accept minmaxdatetime

* fix factory so we don't need to union everything with strings

* get class_name from type

* remove from class types registry

* process error handlers one at a time

* sort

* delete print statement

* comment

* comment

* format

* delete unused file

* comment

* interpolatedboolean

* comment

* not optional

* not optional

* unit tests

* fix request body data

* add test

* move file to right module

* update

* reset to master

* format

* rename to pass_by

* rename to page size

* fix

* fix some tests

* reset

* fix

* fix some of the tests

* fix test

* fix more tests

* all tests pass

* path is not optional

* reset

* reset

* reset

* delete print

* remove prints

* delete duplicate method

* add test

* fix body data

* delete extra newlines

* move to subpackage

* fix imports

* handle str body data

* simplify

* Update tests

* filter dates before stream state

* Revert "Update tests"

This reverts commit c0808c8009.

* update

* fix test

* state management

* add test

* delete dead code

* update cursor

* update cursor cartesian

* delete unused state class

* fix

* missing test

* update cursor substreams

* missing test

* fix typing

* fix typing

* delete unused field

* delete unused method

* update datetime stream slice

* cleanup

* assert

* request options

* request option cartesian

* assert when passing by path

* request options for substreams

* always return a map

* pass stream_state

* refactor and almost done fixing tests

* fix tests

* rename to inject_into

* only accept enum

* delete conditional paginator

* only return body data

* missing test

* update docstrings

* update docstrings

* update comment

* rename

* tests

* class_name -> type

* improve interface

* fix some of the tests

* fix more of the tests

* fix tests

* reset

* reset

* Revert "reset"

This reverts commit eb9a918a09.

* remove extra argument

* docstring

* update

* delete unused file

* reset

* reset

* rename

* fix timewindow

* create InterpolatedString

* helper method

* assert on request option

* better asserts

* format

* docstrings

* docstrings

* remove optional from type hint

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/stream_slicers/cartesian_product_stream_slicer.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* inherit from request options provider

* inherit from request options provider

* remove optional from type hint

* remove extra parameter

* none check

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: Tobias Macey <tmacey@boundlessnotions.com>
Co-authored-by: Serhii Chvaliuk <grubberr@gmail.com>
Co-authored-by: Amruta Ranade <11484018+Amruta-Ranade@users.noreply.github.com>
Co-authored-by: Bas Beelen <bjgbeelen@gmail.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Guy Feldman <gfeldman@86labs.com>
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Yiyang Li <yiyangli2010@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
2022-07-27 15:30:49 -07:00
Alexandre Girard
2695c0c9db fix build (#15068) 2022-07-27 08:17:40 -07:00
Alexandre Girard
78f43cfb1d [low-code connectors] Handle 200 responses with error (#15055)
* Handle 200 responses with error

* missing file
2022-07-27 06:53:52 -07:00
Alexandre Girard
0f25ba612d Log stream_instance's metadata (#15025)
* Log stream_instance's metadata

* syncmode is only set on ConfiguredStream

* log both configured stream and stream instance
2022-07-26 12:18:48 -07:00
Alexandre Girard
783923db76 [low-code CDK] Enable runtime string interpolation in authenticators (#14914)
* interpolatedauth

* fix tests

* fix import

* no need for default

* Bump version

* Missing docstrings

* example

* missing example

* more docstrings

* interpolated types
2022-07-25 19:04:05 -07:00
Alexandre Girard
08239abafd Alex/lowcode referencedocs (#14973)
* Add docstrings for auth package

* docstrings for the check package

* docstrings for the datetime package

* docstrings for the decoder package

* docstrings for extractors package and fix tests

* interpolation docstrings

* ref ->  and parser docstrings

* docstrings for parsers package

* error handler docstrings

* requester docstrings

* more docstrings

* docstrings

* docstrings

* docstrings

* Use defined type annotations

* update

* update docstrings

* Update docstrings

* update docstrings

* update docstrings

* update template

* Revert "update template"

This reverts commit eb4a11858b.

* update template

* update

* move to interpolated_string

* update docstring

* update

* fix tests

* format

* return type can also be an array

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/interpolation/interpolated_boolean.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/interpolation/interpolation.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/interpolation/jinja.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/interpolation/interpolated_boolean.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/error_handlers/backoff_strategy.py

* Update as per comments

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
2022-07-25 18:10:32 -07:00
Augustin
4df0a48427 cdk and docs: remove "additionalProperties" (#14881) 2022-07-22 15:34:27 +02:00
Augustin
b76b73bbfb cdk: do not call init_uncaught_exception_handler from modules' root (#14892) 2022-07-21 16:34:20 +02:00
Brian Lai
2a8d2516c5 [#14361] Adding new generator for configuration based source template (#14887)
* [#14361] Adding new generator for configuration based source template

* remove unit tests and update a few doc files generated by the templates that aren't relevant to config based connectors

* use 0.1.65 as the latest available CDK version we have
2022-07-21 07:35:22 -04:00
Alexandre Girard
c98f196d64 [low-code connectors] Rename decode_response reference to response (#14877)
* checkout files from test branch

* read_incremental works

* reset to master

* remove dead code

* comment

* fix

* Add test

* comments

* utc

* format

* small fix

* Add test with rfc3339

* remove unused param

* fix test

* configurable state checkpointing

* update test

* start working on retrier

* retry predicate

* return response status

* look in error message

* cleanup test

* constant backoff strategy

* chain backoff strategy

* chain retrier

* Add to class types registry

* extract backoff time from header

* wait until

* update

* split file

* parse_records

* classmethod

* delete dead code

* comment

* comment

* comments

* fix

* test for instantiating chain retrier

* fix parsing

* cleanup

* fix

* reset

* never raise on http error

* remove print

* comment

* comment

* comment

* comment

* remove prints

* add declarative stream to registry

* start working on limit paginator

* support for offset pagination

* tests

* move limit value

* extract request option

* boilerplate

* page increment

* delete offset paginator

* update conditional paginator

* refactor and fix test

* fix test

* small fix

* Delete dead code

* Add docstrings

* quick fix

* exponential backoff

* fix test

* fix

* delete unused properties

* fix

* missing unit tests

* uppercase

* docstrings

* rename to success

* compare full request instead of just url

* rename module

* rename test file

* rename interface

* rename default retrier

* rename to compositeerrorhandler

* fix missing renames

* move action to filter

* str -> minmaxdatetime

* small fixes

* plural

* add example

* handle header variations

* also fix wait time from

* allow using a regex to extract the value

* group()

* docstring

* add docs

* update comment

* docstrings

* fix tests

* rename param

* cleanup stop_condition

* cleanup

* Add examples

* interpolated pagination strategy

* dont need duplicate class

* docstrings

* more docstrings

* docstrings

* update comment

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/http_requester.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* version: Update Parquet library to latest release (#14502)

The upstream Parquet library that is currently pinned for use in the S3 destination plugin is over a year old. The current version is generating invalid schemas for date-time with time-zone fields which appears to be addressed in the `1.12.3` release of the library in commit c72862b613

* merge

* 🎉 Source Github: improve schema for stream `pull_request_commits` added "null" (#14613)

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* Docs: Fixed broken links (#14622)

* fixing broken links

* more broken links

* source-hubspot: change mentioning of Mailchimp into HubSpot  doc (#14620)

* Helm Chart: Add external temporal option (#14597)

* conflict env configmap and chart lock

* reverting lock

* add eof lines and documentation on values yaml

* conflict json file

* rollback json

* solve conflict

* correct minio with new version

Co-authored-by: Guy Feldman <gfeldman@86labs.com>

* 🎉 Add YAML format to source-file reader (#14588)

* Add yaml reader

* Update docs

* Bumpversion of connector

* bump docs

* Update pyarrow dependency

* Upgrade pandas dependency

* auto-bump connector version

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* 🎉 Source Okta: add GroupMembers stream (#14380)

* add Group_Members stream to okta source

- Group_Members return a list of users, the same schema of Users stream.
- Create a shared schema users, and both group_members and users schema use it as a reference.
- Add Group_Members stream to source connector

* add tests and fix logs schema

- fix the test error: None is not one of enums though the enum type includes both string and null, it comes from json schema validator
ddb87afad8/jsonschema/_validators.py (L279-L285)
- change group_members to use id as the cursor field since `filter` is not supported in the query string
- fix the abnormal state test on logs stream: when since is abnormally large, until has to be defined as an equal or larger value
- remove logs stream from full sync test, because 2 full sync always has a gap -- at least a new log about users or groups api.

* last polish before submit the PR

- bump docker version
- update changelog
- add the right abnormal value for logs stream
- correct the sample catalog

* address comments::

- improve comments for until parameter under the logs stream
- add use_cache on groupMembers

* add use_cache to Group_Members

* change configured_catalog to test

* auto-bump connector version

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* split test files

* renames

* missing unit test

* add missing unit tests

* rename

* assert isinstance

* start extracting to their own files

* use final instead of classmethod

* assert we retry 429 errors

* Add log

* replace asserts with valueexceptions

* delete superfluous print statement

* fix factory so we don't need to union everything with strings

* get class_name from type

* remove from class types registry

* process error handlers one at a time

* sort

* delete print statement

* comment

* comment

* format

* delete unused file

* comment

* interpolatedboolean

* comment

* not optional

* not optional

* unit tests

* fix request body data

* add test

* move file to right module

* update

* reset to master

* format

* rename to pass_by

* rename to page size

* fix

* add test

* fix body data

* delete extra newlines

* move to subpackage

* fix imports

* handle str body data

* simplify

* fix typing

* always return a map

* rename to inject_into

* only accept enum

* delete conditional paginator

* only return body data

* rename decoded response to response

* decoded_response -> response

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: Tobias Macey <tmacey@boundlessnotions.com>
Co-authored-by: Serhii Chvaliuk <grubberr@gmail.com>
Co-authored-by: Amruta Ranade <11484018+Amruta-Ranade@users.noreply.github.com>
Co-authored-by: Bas Beelen <bjgbeelen@gmail.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Guy Feldman <gfeldman@86labs.com>
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Yiyang Li <yiyangli2010@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
2022-07-21 02:58:22 -07:00
Alexandre Girard
04a44b3d8d [low-code connectors] Refactor paginator component so it owns the request options to set (#14433)
* checkout files from test branch

* read_incremental works

* reset to master

* remove dead code

* comment

* fix

* Add test

* comments

* utc

* format

* small fix

* Add test with rfc3339

* remove unused param

* fix test

* configurable state checkpointing

* update test

* start working on retrier

* retry predicate

* return response status

* look in error message

* cleanup test

* constant backoff strategy

* chain backoff strategy

* chain retrier

* Add to class types registry

* extract backoff time from header

* wait until

* update

* split file

* parse_records

* classmethod

* delete dead code

* comment

* comment

* comments

* fix

* test for instantiating chain retrier

* fix parsing

* cleanup

* fix

* reset

* never raise on http error

* remove print

* comment

* comment

* comment

* comment

* remove prints

* add declarative stream to registry

* start working on limit paginator

* support for offset pagination

* tests

* move limit value

* extract request option

* boilerplate

* page increment

* delete offset paginator

* update conditional paginator

* refactor and fix test

* fix test

* small fix

* Delete dead code

* Add docstrings

* quick fix

* exponential backoff

* fix test

* fix

* delete unused properties

* fix

* missing unit tests

* uppercase

* docstrings

* rename to success

* compare full request instead of just url

* rename module

* rename test file

* rename interface

* rename default retrier

* rename to compositeerrorhandler

* fix missing renames

* move action to filter

* str -> minmaxdatetime

* small fixes

* plural

* add example

* handle header variations

* also fix wait time from

* allow using a regex to extract the value

* group()

* docstring

* add docs

* update comment

* docstrings

* fix tests

* rename param

* cleanup stop_condition

* cleanup

* Add examples

* interpolated pagination strategy

* dont need duplicate class

* docstrings

* more docstrings

* docstrings

* update comment

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/http_requester.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* version: Update Parquet library to latest release (#14502)

The upstream Parquet library that is currently pinned for use in the S3 destination plugin is over a year old. The current version is generating invalid schemas for date-time with time-zone fields which appears to be addressed in the `1.12.3` release of the library in commit c72862b613

* merge

* 🎉 Source Github: improve schema for stream `pull_request_commits` added "null" (#14613)

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* Docs: Fixed broken links (#14622)

* fixing broken links

* more broken links

* source-hubspot: replace mention of Mailchimp with HubSpot in doc (#14620)

* Helm Chart: Add external temporal option (#14597)

* conflict env configmap and chart lock

* reverting lock

* add eof lines and documentation on values yaml

* conflict json file

* rollback json

* solve conflict

* correct minio with new version

Co-authored-by: Guy Feldman <gfeldman@86labs.com>

* 🎉 Add YAML format to source-file reader (#14588)

* Add yaml reader

* Update docs

* Bump version of connector

* bump docs

* Update pyarrow dependency

* Upgrade pandas dependency

* auto-bump connector version

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* 🎉 Source Okta: add GroupMembers stream (#14380)

* add Group_Members stream to okta source

- Group_Members returns a list of users, with the same schema as the Users stream.
- Create a shared users schema, and have both the group_members and users schemas use it as a reference.
- Add Group_Members stream to source connector

* add tests and fix logs schema

- fix the test error: None is not one of enums, even though the enum type includes both string and null; this comes from the JSON schema validator
ddb87afad8/jsonschema/_validators.py (L279-L285)
- change group_members to use id as the cursor field since `filter` is not supported in the query string
- fix the abnormal state test on logs stream: when since is abnormally large, until has to be defined with an equal or larger value
- remove logs stream from full sync test, because two full syncs always have a gap: at least one new log about the users or groups API.

* last polish before submitting the PR

- bump docker version
- update changelog
- add the right abnormal value for logs stream
- correct the sample catalog

* address comments:

- improve comments for until parameter under the logs stream
- add use_cache on groupMembers

* add use_cache to Group_Members

* change configured_catalog to test

* auto-bump connector version

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* split test files

* renames

* missing unit test

* add missing unit tests

* rename

* assert isinstance

* start extracting to their own files

* use final instead of classmethod

* assert we retry 429 errors

* Add log

* replace asserts with valueexceptions

* delete superfluous print statement

* fix factory so we don't need to union everything with strings

* get class_name from type

* remove from class types registry

* process error handlers one at a time

* sort

* delete print statement

* comment

* comment

* format

* delete unused file

* comment

* interpolatedboolean

* comment

* not optional

* not optional

* unit tests

* fix request body data

* add test

* move file to right module

* update

* reset to master

* format

* rename to pass_by

* rename to page size

* fix

* add test

* fix body data

* delete extra newlines

* move to subpackage

* fix imports

* handle str body data

* simplify

* fix typing

* always return a map

* rename to inject_into

* only accept enum

* delete conditional paginator

* only return body data

* missing test

* update docstrings

* update docstrings

* update comment

* rename

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: Tobias Macey <tmacey@boundlessnotions.com>
Co-authored-by: Serhii Chvaliuk <grubberr@gmail.com>
Co-authored-by: Amruta Ranade <11484018+Amruta-Ranade@users.noreply.github.com>
Co-authored-by: Bas Beelen <bjgbeelen@gmail.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Guy Feldman <gfeldman@86labs.com>
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Yiyang Li <yiyangli2010@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
2022-07-20 22:46:51 -07:00
Sherif A. Nada
52e3755417 [low-code connectors] Bugfix transformations (#14810) 2022-07-18 16:33:23 -07:00
Ryan Fu
7281d637eb Fixed linter issue with add_fields.py comments (#14742) 2022-07-15 07:34:31 -07:00
Sherif A. Nada
a97216f96b [low code cdk] add a transformation for adding fields into an outgoing record (#14638)
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
2022-07-14 20:06:02 -07:00
Brian Lai
ff74998057 forgot to publish a new version of airbyte-cdk to PyPi (#14696) 2022-07-14 13:22:12 -04:00
Alexandre Girard
09aa685aad Alex/configurable retrier (#14330)
* checkout files from test branch

* read_incremental works

* reset to master

* remove dead code

* comment

* fix

* Add test

* comments

* utc

* format

* small fix

* Add test with rfc3339

* remove unused param

* fix test

* configurable state checkpointing

* update test

* start working on retrier

* retry predicate

* return response status

* look in error message

* cleanup test

* constant backoff strategy

* chain backoff strategy

* chain retrier

* Add to class types registry

* extract backoff time from header

* wait until

* update

* split file

* parse_records

* classmethod

* delete dead code

* comment

* comment

* comments

* fix

* test for instantiating chain retrier

* fix parsing

* cleanup

* fix

* reset

* never raise on http error

* remove print

* comment

* comment

* comment

* comment

* remove prints

* add declarative stream to registry

* Delete dead code

* Add docstrings

* quick fix

* exponential backoff

* fix test

* fix

* delete unused properties

* fix

* missing unit tests

* uppercase

* docstrings

* rename to success

* compare full request instead of just url

* rename module

* rename test file

* rename interface

* rename default retrier

* rename to compositeerrorhandler

* fix missing renames

* move action to filter

* str -> minmaxdatetime

* small fixes

* plural

* add example

* handle header variations

* also fix wait time from

* allow using a regex to extract the value

* group()

* docstring

* add docs

* update comment

* docstrings

* update comment

* Update airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/http_requester.py

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* version: Update Parquet library to latest release (#14502)

The upstream Parquet library that is currently pinned for use in the S3 destination plugin is over a year old. The current version is generating invalid schemas for date-time with time-zone fields which appears to be addressed in the `1.12.3` release of the library in commit c72862b613

* merge

* 🎉 Source Github: improve schema for stream `pull_request_commits` added "null" (#14613)

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* Docs: Fixed broken links (#14622)

* fixing broken links

* more broken links

* source-hubspot: replace mention of Mailchimp with HubSpot in doc (#14620)

* Helm Chart: Add external temporal option (#14597)

* conflict env configmap and chart lock

* reverting lock

* add eof lines and documentation on values yaml

* conflict json file

* rollback json

* solve conflict

* correct minio with new version

Co-authored-by: Guy Feldman <gfeldman@86labs.com>

* 🎉 Add YAML format to source-file reader (#14588)

* Add yaml reader

* Update docs

* Bump version of connector

* bump docs

* Update pyarrow dependency

* Upgrade pandas dependency

* auto-bump connector version

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* 🎉 Source Okta: add GroupMembers stream (#14380)

* add Group_Members stream to okta source

- Group_Members returns a list of users, with the same schema as the Users stream.
- Create a shared users schema, and have both the group_members and users schemas use it as a reference.
- Add Group_Members stream to source connector

* add tests and fix logs schema

- fix the test error: None is not one of enums, even though the enum type includes both string and null; this comes from the JSON schema validator
ddb87afad8/jsonschema/_validators.py (L279-L285)
- change group_members to use id as the cursor field since `filter` is not supported in the query string
- fix the abnormal state test on logs stream: when since is abnormally large, until has to be defined with an equal or larger value
- remove logs stream from full sync test, because two full syncs always have a gap: at least one new log about the users or groups API.

* last polish before submitting the PR

- bump docker version
- update changelog
- add the right abnormal value for logs stream
- correct the sample catalog

* address comments:

- improve comments for until parameter under the logs stream
- add use_cache on groupMembers

* add use_cache to Group_Members

* change configured_catalog to test

* auto-bump connector version

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* split test files

* renames

* missing unit test

* add missing unit tests

* rename

* assert isinstance

* start extracting to their own files

* use final instead of classmethod

* assert we retry 429 errors

* Add log

* replace asserts with valueexceptions

* delete superfluous print statement

* fix factory so we don't need to union everything with strings

* get class_name from type

* remove from class types registry

* process error handlers one at a time

* sort

* delete print statement

* comment

* comment

* format

* delete unused file

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: Tobias Macey <tmacey@boundlessnotions.com>
Co-authored-by: Serhii Chvaliuk <grubberr@gmail.com>
Co-authored-by: Amruta Ranade <11484018+Amruta-Ranade@users.noreply.github.com>
Co-authored-by: Bas Beelen <bjgbeelen@gmail.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Guy Feldman <gfeldman@86labs.com>
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Yiyang Li <yiyangli2010@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
2022-07-14 08:24:37 -07:00
Brian Lai
7bff12aea5 [#3078] [CDK] Add support for enabling debug from command line and some basic general debug logs (#14521)
* allow for command line debug option and basic debug statements + declarative

* feedback from pr comments

* fix some tests w/ req/res mixed up and fixing logging tests

* formatting

* pr feedback: cleaning up traces in logger.py and update docs with debug configuration

* remove unneeded trace logger test

* remove extra print statement
2022-07-13 18:01:07 -04:00
Alexandre Girard
a4c51cdc54 set default paginator (#14678) 2022-07-13 08:54:42 -07:00
Brian Lai
cf71ccb460 simple update to add a lookback_window to the datetime stream slicer (#14609) 2022-07-12 19:49:15 -04:00