115 lines
6.3 KiB
Markdown
115 lines
6.3 KiB
Markdown
# CDK Migration Guide
|
|
|
|
## Upgrading to 3.0.0
|
|
Version 3.0.0 of the CDK updates the `HTTPStream` class by reusing the `HTTPClient` under the hood.
|
|
|
|
- `backoff_time` and `should_retry` methods are removed from HttpStream
|
|
- `HttpStreamAdapterHttpStatusErrorHandler` and `HttpStreamAdapterBackoffStrategy` adapters are marked as `deprecated`
|
|
- `raise_on_http_errors`, `max_retries`, `max_time`, `retry_factor` are marked as `deprecated`
|
|
|
|
Exceptions from the `requests` library should no longer be raised when calling `read_records`.
|
|
Therefore, catching exceptions should be updated, and error messages might change.
|
|
See [Migration of Source Zendesk Support](https://github.com/airbytehq/airbyte/pull/41032/commits/4d3a247f36b9826dcea4b98d30fc19802b03d014) as an example.
|
|
|
|
### Migration of `should_retry` method
|
|
In case the connector uses custom logic for backoff based on the response from the server, a new method `get_error_handler` should be implemented.
|
|
This method should return instance of [`ErrorHandler`](https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/airbyte_cdk/sources/streams/http/error_handlers/error_handler.py).
|
|
|
|
### Migration of `backoff_time` method
|
|
In case the connector uses custom logic for backoff time calculation, a new method `get_backoff_strategy` should be implemented.
|
|
This method should return instance(s) of [`BackoffStrategy`](https://github.com/airbytehq/airbyte/blob/master/airbyte-cdk/python/airbyte_cdk/sources/streams/http/error_handlers/backoff_strategy.py).
|
|
|
|
## Upgrading to 2.0.0
|
|
Version 2.0.0 of the CDK updates the `pydantic` dependency to from Pydantic v1 to Pydantic v2. It also
|
|
updates the `airbyte-protocol-models` dependency to a version that uses Pydantic V2 models.
|
|
|
|
The changes to Airbyte CDK itself are backwards-compatible, but some changes are required if the connector:
|
|
- uses Pydantic directly, e.g. for its own custom models, or
|
|
- uses the `airbyte_protocol` models directly, or `airbyte_cdk.models`, which points to `airbyte_protocol` models, or
|
|
- customizes HashableStreamDescriptor, which inherits from a protocol model and has therefore been updated to use Pydantic V2 models.
|
|
|
|
Some test assertions may also need updating due to changes to default serialization of the protocol models.
|
|
|
|
### Updating direct usage of Pydantic
|
|
|
|
If the connector uses pydantic, the code will need to be updated to reflect the change `pydantic` dependency version.
|
|
The Pydantic [migration guide](https://docs.pydantic.dev/latest/migration/) is a great resource for any questions that
|
|
might arise around upgrade behavior.
|
|
|
|
#### Using Pydantic V1 models with Pydantic V2
|
|
The easiest way to update the code to be compatible without major changes is to update the import statements from
|
|
`from pydantic` to `from pydantic.v1`, as Pydantic has kept the v1 module for backwards compatibility.
|
|
|
|
Some potential gotchas:
|
|
- `ValidationError` must be imported from `pydantic.v1.error_wrappers` instead of `pydantic.v1`
|
|
- `ModelMetaclass` must be imported from `pydantic.v1.main` instead of `pydantic.v1`
|
|
- `resolve_annotations` must be imported from `pydantic.v1.typing` instead of `pydantic.v1`
|
|
|
|
#### Upgrading to Pydantic V2
|
|
To upgrade all the way to V2 proper, Pydantic also offers a [migration tool](https://docs.pydantic.dev/latest/migration/#code-transformation-tool)
|
|
to automatically update the code to be compatible with Pydantic V2.
|
|
|
|
#### Updating assertions
|
|
It's possible that a connector might make assertions against protocol models without actually
|
|
importing them - for example when testing methods which return `AirbyteStateBlob` or `AnyUrl`.
|
|
|
|
To resolve this, either compare directly to a model, or `dict()` or `str()` your model accordingly, depending
|
|
on if you care most about the serialized output or the model (for a method which returns a model, option 1 is
|
|
preferred). For example:
|
|
|
|
```python
|
|
# Before
|
|
assert stream_read.slices[1].state[0].stream.stream_state == {"a_timestamp": 123}
|
|
|
|
# After - Option 1
|
|
from airbyte_cdk.models import AirbyteStateBlob
|
|
assert stream_read.slices[1].state[0].stream.stream_state == AirbyteStateBlob(a_timestamp=123)
|
|
|
|
# After - Option 2
|
|
assert stream_read.slices[1].state[0].stream.stream_state.dict() == {"a_timestamp": 123}
|
|
```
|
|
|
|
|
|
## Upgrading to 1.0.0
|
|
Starting from 1.0.0, CDK classes and functions should be imported directly from `airbyte_cdk` (example: `from airbyte_cdk import HttpStream`). Lower-level `__init__` files are not considered stable, and will be modified without introducing a major release.
|
|
|
|
Introducing breaking changes to a class or function exported from the top level `__init__.py` will require a major version bump and a migration note to help developer upgrade.
|
|
|
|
Note that the following packages are not part of the top level init because they require extras dependencies, but are still considered stable:
|
|
- `destination.vector_db_based`
|
|
- `source.file_based`
|
|
|
|
The `test` package is not included in the top level init either. The `test` package is still evolving and isn't considered stable.
|
|
|
|
|
|
A few classes were deleted from the Airbyte CDK in version 1.0.0:
|
|
- AirbyteLogger
|
|
- AirbyteSpec
|
|
- Authenticators in the `sources.streams.http.auth` module
|
|
|
|
|
|
|
|
### Migrating off AirbyteLogger
|
|
No connectors should still be using `AirbyteLogger` directly, but the class is still used in some interfaces. The only required change is to update the type annotation from `AirbyteLogger` to `logging.Logger`. For example:
|
|
|
|
```
|
|
def check_connection(self, logger: AirbyteLogger, config: Mapping[str, Any]) -> Tuple[bool, any]:
|
|
```
|
|
|
|
to
|
|
|
|
```
|
|
def check_connection(self, logger: logging.Logger, config: Mapping[str, Any]) -> Tuple[bool, any]:
|
|
```
|
|
Don't forget to also update the imports. You can delete `from airbyte_cdk import AirbyteLogger` and replace it with `import logging`.
|
|
|
|
### Migrating off AirbyteSpec
|
|
AirbyteSpec isn't used by any connectors in the repository, and I don't expect any custom connectors to use the class either. This should be a no-op.
|
|
|
|
### Migrating off Authenticators
|
|
Replace usage of authenticators in the `airbyte_cdk.sources.streams.http.auth` module with their sister classes in the `airbyte_cdk.sources.streams.http.requests_native_auth` module.
|
|
|
|
If any of your streams reference `self.authenticator`, you'll also need to update these references to `self._session.auth` as the authenticator is embedded in the session object.
|
|
|
|
Here is a [pull request that can serve as an example](https://github.com/airbytehq/airbyte/pull/38065/files).
|