* [ISSUE #19410] remove request_options_provider from the … (#21403) * [ISSUE #19410] (incomplete) remove request_options_provider from the manifest * [ISSUE #19410] (incomplete) incomplete cleanup config_component_schema.json as well * [ISSUE #19410] update source-monday * [ISSUE #19410] code review * [ISSUE #19410] formatting files * [Low-Code CDK] Replace the $options keyword with $parameters (#21632) * refactor flows and tests to use parameters instead of options * update documentation to reflect the change from options to parameters * create migration script to replace options with parameters in existing manifests * update template to use parameters instead of options * fix tests after rebasing from the branch * address pr feedback and extra uses of options that I missed * additional changes needed after rebasing from master * migrate low-code connectors to use parameters instead of options * 🚨🚨 [Low Code CDK] Update `*ref` format to `#/` (#21434) * [Low-Code CDK] Remove JsonSchema type in favor of JsonSchemaFileLoader (#21832) * fully deprecate JsonSchema in favor of JsonFileSchemaLoader * remove usage in the legacy registry * Update migration scripts according to manifest file rename (#21920) * Issue 21866 remove legacy factory and validation flow (#21878) * [ISSUE #21866] clean ManifestDeclarativeSource validation * [ISSUE #21866] remove dataclasses-jsonschema * [ISSUE #21866] code review * [ISSUE-21866] flake8 * [ISSUE #21559] remove DefaultPaginator.url_base (#21823) * [ISSUE #21559] remove DefaultPaginator.url_base * [ISSUE #21559] code review * [ISSUE #21559] update migration script * [ISSUE #21559] code review * [ISSUE #21559] update documentation * [ISSUE #21559] run migration (#21824) * [ISSUE #21559] remove DefaultPaginator.url_base (#21823) * [ISSUE #21559] remove DefaultPaginator.url_base * [ISSUE #21559] code review * [ISSUE #21559] update migration script * [ISSUE #21559] code review * [ISSUE #21559] update documentation * [ISSUE #21559] run 
migration (#21824) * [ISSUE #21559] fix manifests * [ISSUE #21926] setup server to allow for local tests (#21974) * [Low Code CDK] remove checkpoint_interval from DeclarativeStream component (#22120) * Issue #21576 rename dpathextractor fieldpointer (#21990) * [ISSUE #21926] setup server to allow for local tests * [ISSUE #21576] Rename DpathExtractor.field_pointer to field_path * [ISSUE #21576] migration script * [ISSUE #21576] update source-monday and source-pocket as well * [ISSUE #21576] migration (#21997) * [ISSUE #21576] code review * Remove checkpoint_interval from source-prestashop manifest (#22141) * replacing options with parameters for a few connectors I missed or were newly added * [Low-Code CDK] Rremove stream_cursor_field from stream and derive it from stream_slicer (#22294) * update schema to derive cursor_field from a stream slicer if it exists * remove usage of stream_cursor_field on simple connector use cases * fixing some of the more complex usage of stream_cursor_field that rely on cartesian product stream slicers * fix documentation to replace references to stream_cursor_field * Low Code CDK: Remove `name` and `primary_key` from non-DeclarativeStream components (#21891) * fix eslint issues for webapp (#22462) * 🪟 🔧 Connector Builder frontend fixes for low_code_cdk_to_beta (#22375) * bump connector builder server to latest CDK version * fix breaking CDK changes in connector builder FE * [Low-Code CDK] Separate request path from RequestOption component (#22398) * split apart path from RequestOption and fix usages and cleanup the code * replace usage of path with RequestPath and get rid of default to RequestOption * fix bug where stream_slice_field was used in outbound request instead of request_option field_name * organize yaml schema names and update documentation for RequestOption and RequestPath * clean up tests * regenerate models * [ISSUE #19961] refactor stream slices (#22225) * [ISSUE #19961] add 'incremental' and partially remove 
CartesianProductStreamSlicer - Google PageSpeed Insights not working yet * [ISSUE #19961] fixing Google PageSpeed Insights * move incremental_sync field to the stream level and perform merging into one stream slicer at that level * add tests to merging incremental and iterable into cartesian * rewrite documentation to separate incremental sync and iterator concepts * update documentation to use partition router and revise the tutorial to reflect the new changes to the components * [ISSUE #19961] update code to newest CDK version and clean autogenerated files (#22670) * [ISSUE #19961] rename stream_slicer to partition_router and update ma… (#22590) * [ISSUE #19961] rename stream_slicer to partition_router and update manifests (for incremental_sync as well) * [ISSUE 19961] rename CustomStreamSlicer (#22598) * [ISSUE 19961] rename CustomStreamSlicer * [ISSUE #19961] code review CustomStreamSlicer * [ISSUE #19961] fix source_square incremental sync * [ISSUE #19961] rename SingleSlice to SinglePartitionRouter (#22591) * [ISSUE #19961] rename SingleSlice to SinglePartitionRouter * remove SinglePartitionRouter from the schema --------- Co-authored-by: brianjlai <brian.lai@airbyte.io> * [ISSUE #19961] rename SubstreamSlicer to SubstreamPartitionRouter (#22596) * [ISSUE #19961] TMP rename SubstreamSlicer to SubstreamPartitionRouter * [ISSUE #19961] revert DatetimeStreamSlicer.stream_state_field_start and DatetimeStreamSlicer.stream_state_field_end * [ISSUE #19961] rename ListStreamSlicer to ListPartitionRouter (#22593) --------- Co-authored-by: brianjlai <brian.lai@airbyte.io> * [ISSUE #19961] clean faulty merge * [ISSUE #19961] rename DatetimeStreamSlicer (#22617) * [ISSUE #19961] rename stream_slicer to partition_router and update manifests (for incremental_sync as well) * [ISSUE 19961] rename CustomStreamSlicer (#22598) * [ISSUE 19961] rename CustomStreamSlicer * [ISSUE #19961] code review CustomStreamSlicer * [ISSUE #19961] fix source_square incremental sync * [ISSUE 
#19961] rename SingleSlice to SinglePartitionRouter (#22591) * [ISSUE #19961] rename SingleSlice to SinglePartitionRouter * remove SinglePartitionRouter from the schema --------- Co-authored-by: brianjlai <brian.lai@airbyte.io> * [ISSUE #19961] rename DatetimeStreamSlicer * [ISSUE #19961] rename SubstreamSlicer to SubstreamPartitionRouter (#22596) * [ISSUE #19961] TMP rename SubstreamSlicer to SubstreamPartitionRouter * [ISSUE #19961] revert DatetimeStreamSlicer.stream_state_field_start and DatetimeStreamSlicer.stream_state_field_end * [ISSUE #19961] rename ListStreamSlicer to ListPartitionRouter (#22593) --------- Co-authored-by: brianjlai <brian.lai@airbyte.io> * Update docs/connector-development/config-based/understanding-the-yaml-file/partition-router.md Co-authored-by: Maxime Carbonneau-Leclerc <maxi297@users.noreply.github.com> * Update docs/connector-development/config-based/understanding-the-yaml-file/partition-router.md Co-authored-by: Maxime Carbonneau-Leclerc <maxi297@users.noreply.github.com> * Update docs/connector-development/config-based/understanding-the-yaml-file/yaml-overview.md Co-authored-by: Maxime Carbonneau-Leclerc <maxi297@users.noreply.github.com> * Update docs/connector-development/config-based/understanding-the-yaml-file/partition-router.md Co-authored-by: Maxime Carbonneau-Leclerc <maxi297@users.noreply.github.com> * Update docs/connector-development/config-based/understanding-the-yaml-file/partition-router.md Co-authored-by: Maxime Carbonneau-Leclerc <maxi297@users.noreply.github.com> * Update docs/connector-development/config-based/understanding-the-yaml-file/partition-router.md Co-authored-by: Maxime Carbonneau-Leclerc <maxi297@users.noreply.github.com> * Update docs/connector-development/config-based/understanding-the-yaml-file/incremental-syncs.md Co-authored-by: Maxime Carbonneau-Leclerc <maxi297@users.noreply.github.com> * update docs * [ISSUE #19961] clean unit tests files * [ISSUE #19961] code review --------- Co-authored-by: 
brianjlai <brian.lai@airbyte.io> Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com> * [Low-Code CDK] Allow for children of custom components to specify parameters that are normally derived (#22379) * Fix a bug where child components of a custom component cannot receive fields from other components * add tests, documentation and commenting * fix test from merge * add better error message for nested initialization failures * 🪟 🔧 Connector Builder frontend fixes for low_code_cdk_to_beta (#22880) * restrict name to stream level * remove checkpoint interval * adjust logic for new request options * refactor slicers * wording * review comments * make oldest supported version explicit * separate the frontend and connector builder changes from the low-code to beta release * [Low-Code CDK] Add script to run low code unit tests and address issues with a few connectors (#23123) * consolidate all the changes into a new PR after I messed up the merge on the side branch * add set to allow this to be called externally if necessary later * remove last few extra fields i found and fix docs links * fix docs one more time --------- Co-authored-by: Maxime Carbonneau-Leclerc <maxi297@users.noreply.github.com> Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com> Co-authored-by: maxi297 <maxime@airbyte.io> Co-authored-by: Lake Mossman <lake@airbyte.io> Co-authored-by: Joe Reuter <joe@airbyte.io>
242 lines
10 KiB
Python
#
# Copyright (c) 2023 Airbyte, Inc., all rights reserved.
#

import base64
import logging
from dataclasses import InitVar, dataclass
from typing import Any, Mapping, Union

import requests
from airbyte_cdk.sources.declarative.auth.declarative_authenticator import DeclarativeAuthenticator
from airbyte_cdk.sources.declarative.interpolation.interpolated_string import InterpolatedString
from airbyte_cdk.sources.declarative.types import Config
from airbyte_cdk.sources.streams.http.requests_native_auth.abstract_token import AbstractHeaderAuthenticator
from cachetools import TTLCache, cached


@dataclass
class ApiKeyAuthenticator(AbstractHeaderAuthenticator, DeclarativeAuthenticator):
    """
    ApiKeyAuth sets a request header on the HTTP requests sent.

    The header is of the form:
    `"<header>": "<token>"`

    For example,
    `ApiKeyAuthenticator("Authorization", "Bearer hello")`
    will result in the following header set on the HTTP request
    `"Authorization": "Bearer hello"`

    Attributes:
        header (Union[InterpolatedString, str]): Header key to set on the HTTP requests
        api_token (Union[InterpolatedString, str]): Header value to set on the HTTP requests
        config (Config): The user-provided configuration as specified by the source's spec
        parameters (Mapping[str, Any]): Additional runtime parameters to be used for string interpolation
    """

    header: Union[InterpolatedString, str]
    api_token: Union[InterpolatedString, str]
    config: Config
    parameters: InitVar[Mapping[str, Any]]

    def __post_init__(self, parameters: Mapping[str, Any]):
        self._header = InterpolatedString.create(self.header, parameters=parameters)
        self._token = InterpolatedString.create(self.api_token, parameters=parameters)

    @property
    def auth_header(self) -> str:
        return self._header.eval(self.config)

    @property
    def token(self) -> str:
        return self._token.eval(self.config)


@dataclass
class BearerAuthenticator(AbstractHeaderAuthenticator, DeclarativeAuthenticator):
    """
    Authenticator that sets the Authorization header on the HTTP requests sent.

    The header is of the form:
    `"Authorization": "Bearer <token>"`

    Attributes:
        api_token (Union[InterpolatedString, str]): The bearer token
        config (Config): The user-provided configuration as specified by the source's spec
        parameters (Mapping[str, Any]): Additional runtime parameters to be used for string interpolation
    """

    api_token: Union[InterpolatedString, str]
    config: Config
    parameters: InitVar[Mapping[str, Any]]

    def __post_init__(self, parameters: Mapping[str, Any]):
        self._token = InterpolatedString.create(self.api_token, parameters=parameters)

    @property
    def auth_header(self) -> str:
        return "Authorization"

    @property
    def token(self) -> str:
        return f"Bearer {self._token.eval(self.config)}"


@dataclass
class BasicHttpAuthenticator(AbstractHeaderAuthenticator, DeclarativeAuthenticator):
    """
    Builds auth based off the basic authentication scheme as defined by RFC 7617, which transmits credentials as USER ID/password pairs, encoded using base64
    https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication#basic_authentication_scheme

    The header is of the form
    `"Authorization": "Basic <encoded_credentials>"`

    Attributes:
        username (Union[InterpolatedString, str]): The username
        config (Config): The user-provided configuration as specified by the source's spec
        password (Union[InterpolatedString, str]): The password
        parameters (Mapping[str, Any]): Additional runtime parameters to be used for string interpolation
    """

    username: Union[InterpolatedString, str]
    config: Config
    parameters: InitVar[Mapping[str, Any]]
    password: Union[InterpolatedString, str] = ""

    def __post_init__(self, parameters):
        self._username = InterpolatedString.create(self.username, parameters=parameters)
        self._password = InterpolatedString.create(self.password, parameters=parameters)

    @property
    def auth_header(self) -> str:
        return "Authorization"

    @property
    def token(self) -> str:
        auth_string = f"{self._username.eval(self.config)}:{self._password.eval(self.config)}".encode("utf8")
        b64_encoded = base64.b64encode(auth_string).decode("utf8")
        return f"Basic {b64_encoded}"


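The token construction in `BasicHttpAuthenticator.token` can be tried in isolation. The following is a minimal standalone sketch using only the standard library; the helper name `basic_auth_header_value` and the sample credentials are hypothetical and not part of the CDK.

```python
import base64


def basic_auth_header_value(username: str, password: str = "") -> str:
    """Return the Authorization header value for HTTP Basic auth (RFC 7617)."""
    # Join the credentials with a colon, then base64-encode the UTF-8 bytes.
    credentials = f"{username}:{password}".encode("utf8")
    return f"Basic {base64.b64encode(credentials).decode('utf8')}"


# Hypothetical credentials, for illustration only.
header_value = basic_auth_header_value("user", "pass")
```

Base64 is an encoding, not encryption, so a header built this way is only as private as the transport (for example HTTPS) that carries it.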
"""
maxsize - the maximum number of items the cache can hold
ttl - the time-to-live of a cached item, in seconds
docs: https://cachetools.readthedocs.io/en/latest/
maxsize=1000 means that once the cache holds 1000 items, adding another item
would exceed its maximum size, so the cache must choose which item(s) to discard
ttl=86400 means that a cached token lives for 86400 seconds (one day)
"""
cacheSessionTokenAuthenticator = TTLCache(maxsize=1000, ttl=86400)


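To make the `maxsize` and `ttl` behaviour concrete without depending on `cachetools`, here is a simplified stand-in written with the standard library. `TinyTTLCache` is a hypothetical class for illustration only: it discards the oldest inserted entry when full, which is a simplification and not a description of how `TTLCache` chooses what to evict.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable


class TinyTTLCache:
    """Simplified, hypothetical stand-in for a size- and time-bounded cache."""

    def __init__(self, maxsize: int, ttl: float, timer: Callable[[], float] = time.monotonic):
        self.maxsize = maxsize
        self.ttl = ttl
        self.timer = timer
        # Maps key -> (expiry timestamp, value), kept in insertion order.
        self._items = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.timer() >= expires_at:
            # The entry outlived its ttl, so drop it and report a miss.
            del self._items[key]
            return default
        return value

    def set(self, key: Hashable, value: Any) -> None:
        self._items.pop(key, None)
        if len(self._items) >= self.maxsize:
            # Cache is full: discard the oldest inserted entry to make room.
            self._items.popitem(last=False)
        self._items[key] = (self.timer() + self.ttl, value)


# A fake clock makes the expiry behaviour deterministic.
now = [0.0]
cache = TinyTTLCache(maxsize=2, ttl=86400, timer=lambda: now[0])
cache.set("a", "token-a")
cache.set("b", "token-b")
cache.set("c", "token-c")  # the cache is full, so "a" is discarded

evicted = cache.get("a")  # miss: "a" was discarded when "c" was added
fresh = cache.get("c")  # hit: "c" is still within its ttl
now[0] = 86400.0  # advance the fake clock by one day
expired = cache.get("c")  # miss: the entry has now expired
```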
@cached(cacheSessionTokenAuthenticator)
def get_new_session_token(api_url: str, username: str, password: str, response_key: str) -> str:
    """
    Retrieves a session token from the API by username and password for SessionTokenAuthenticator.
    The result is cached to avoid calling the API multiple times per sync and refreshing the session token on every stream sync.
    Args:
        api_url: api url for getting new session token
        username: username for auth
        password: password for auth
        response_key: field name in response to retrieve a session token

    Returns:
        session token
    """
    response = requests.post(
        f"{api_url}",
        headers={"Content-Type": "application/json"},
        json={"username": username, "password": password},
    )
    response.raise_for_status()
    if not response.ok:
        raise ConnectionError(f"Failed to retrieve new session token, response code {response.status_code} because {response.reason}")
    return response.json()[response_key]


@dataclass
class SessionTokenAuthenticator(AbstractHeaderAuthenticator, DeclarativeAuthenticator):
    """
    Builds auth based on session tokens.
    A session token is a random value generated by a server to identify
    a specific user for the duration of one interaction session.

    The header is of the form
    `"Specific Header": "Session Token Value"`

    Attributes:
        api_url (Union[InterpolatedString, str]): Base api url of source
        username (Union[InterpolatedString, str]): The username
        config (Config): The user-provided configuration as specified by the source's spec
        password (Union[InterpolatedString, str]): The password
        header (Union[InterpolatedString, str]): Specific header of source for providing session token
        parameters (Mapping[str, Any]): Additional runtime parameters to be used for string interpolation
        session_token (Union[InterpolatedString, str]): Session token generated by user
        session_token_response_key (Union[InterpolatedString, str]): Key for retrieving session token from api response
        login_url (Union[InterpolatedString, str]): URL for getting a specific session token
        validate_session_url (Union[InterpolatedString, str]): URL to validate the passed session token
    """

    api_url: Union[InterpolatedString, str]
    header: Union[InterpolatedString, str]
    session_token: Union[InterpolatedString, str]
    session_token_response_key: Union[InterpolatedString, str]
    username: Union[InterpolatedString, str]
    config: Config
    parameters: InitVar[Mapping[str, Any]]
    login_url: Union[InterpolatedString, str]
    validate_session_url: Union[InterpolatedString, str]
    password: Union[InterpolatedString, str] = ""

    def __post_init__(self, parameters):
        self._username = InterpolatedString.create(self.username, parameters=parameters)
        self._password = InterpolatedString.create(self.password, parameters=parameters)
        self._api_url = InterpolatedString.create(self.api_url, parameters=parameters)
        self._header = InterpolatedString.create(self.header, parameters=parameters)
        self._session_token = InterpolatedString.create(self.session_token, parameters=parameters)
        self._session_token_response_key = InterpolatedString.create(self.session_token_response_key, parameters=parameters)
        self._login_url = InterpolatedString.create(self.login_url, parameters=parameters)
        self._validate_session_url = InterpolatedString.create(self.validate_session_url, parameters=parameters)

        self.logger = logging.getLogger("airbyte")

    @property
    def auth_header(self) -> str:
        return self._header.eval(self.config)

    @property
    def token(self) -> str:
        if self._session_token.eval(self.config):
            if self.is_valid_session_token():
                return self._session_token.eval(self.config)
        if self._password.eval(self.config) and self._username.eval(self.config):
            username = self._username.eval(self.config)
            password = self._password.eval(self.config)
            session_token_response_key = self._session_token_response_key.eval(self.config)
            api_url = f"{self._api_url.eval(self.config)}{self._login_url.eval(self.config)}"

            self.logger.info("Using generated session token by username and password")
            return get_new_session_token(api_url, username, password, session_token_response_key)

        raise ConnectionError("Invalid credentials: session token is not valid or provide username and password")

    def is_valid_session_token(self) -> bool:
        try:
            response = requests.get(
                f"{self._api_url.eval(self.config)}{self._validate_session_url.eval(self.config)}",
                headers={self.auth_header: self._session_token.eval(self.config)},
            )
            response.raise_for_status()
        except requests.exceptions.HTTPError as e:
            if e.response.status_code == requests.codes["unauthorized"]:
                self.logger.info(f"Unable to connect by session token from config due to {str(e)}")
                return False
            else:
                raise ConnectionError(f"Error while validating session token: {e}")
        if response.ok:
            self.logger.info("Connection check for source is successful.")
            return True
        else:
            raise ConnectionError(f"Failed to retrieve new session token, response code {response.status_code} because {response.reason}")
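The decision order in `SessionTokenAuthenticator.token` (use the configured session token if the API accepts it, otherwise log in with username and password, otherwise fail) can be sketched without any HTTP calls. In this sketch `resolve_session_token`, `is_valid` and `login` are hypothetical names; the two callables stand in for `is_valid_session_token` and `get_new_session_token`.

```python
from typing import Callable, Optional


def resolve_session_token(
    session_token: Optional[str],
    username: Optional[str],
    password: Optional[str],
    is_valid: Callable[[str], bool],
    login: Callable[[str, str], str],
) -> str:
    # Prefer a user-supplied session token, but only if the API accepts it.
    if session_token and is_valid(session_token):
        return session_token
    # Otherwise fall back to logging in with username and password.
    if username and password:
        return login(username, password)
    raise ConnectionError("No valid session token and no username/password to log in with")


# A valid configured token is returned unchanged.
token_from_config = resolve_session_token("abc", None, None, lambda t: True, lambda u, p: "new")
# A rejected token triggers the username/password login instead.
token_from_login = resolve_session_token("stale", "user", "pass", lambda t: False, lambda u, p: f"{u}-session")
```

Passing the validation and login steps in as callables keeps the branching logic testable on its own, which is a design choice of this sketch rather than of the class above.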