## What
<!--
* Describe what the change is solving. Link all GitHub issues related to this change.
-->
## How
<!--
* Describe how code changes achieve the solution.
-->
## Review guide
<!--
1. `x.py`
2. `y.py`
-->
## User Impact
<!--
* What is the end result perceived by the user?
* If there are negative side effects, please list them.
-->
## Can this PR be safely reverted and rolled back?
<!--
* If unsure, leave it blank.
-->
- [ ] YES 💚
- [ ] NO ❌
## What
This change implements a large record truncation mechanism for the Snowflake destination connector to handle records exceeding Snowflake's 16MB row size limit.
## How
- The truncator preserves primary key fields and truncates other fields to fit within the 16MB limit.
- Added metadata to indicate which fields were truncated due to size limitations.
## User Impact
Users can now sync large records to Snowflake without encountering errors due to row size limitations. Fields may be truncated to fit within the 16MB limit, but primary keys are always preserved. Metadata is added to indicate which fields were affected.
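The truncation flow described above can be sketched roughly as follows. This is a minimal Python illustration, not the connector's actual implementation; the function name, the strategy of dropping whole non-primary-key fields largest-first, and the `_airbyte_meta` metadata key are assumptions:

```python
import json

SNOWFLAKE_ROW_LIMIT = 16 * 1024 * 1024  # Snowflake's 16 MB row size limit

def truncate_record(record, primary_keys, limit=SNOWFLAKE_ROW_LIMIT):
    """Drop non-primary-key fields (largest first) until the serialized
    record fits within `limit`, recording which fields were removed.
    Mutates `record` in place and returns it."""
    truncated = []
    # Consider only non-primary-key fields, largest serialized size first.
    candidates = sorted(
        (k for k in record if k not in primary_keys),
        key=lambda k: len(json.dumps(record[k])),
        reverse=True,
    )
    for key in candidates:
        if len(json.dumps(record).encode("utf-8")) <= limit:
            break
        del record[key]
        truncated.append(key)
    # Metadata indicating which fields were truncated due to size limits.
    record["_airbyte_meta"] = {"truncated_fields": truncated}
    return record
```

Note that a production version would also have to account for the size of the metadata field itself and for Snowflake's per-column limits, which this sketch ignores.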
Added support for handling empty data fields in Avro records and introduced a new test case for streams with no columns.
We've agreed there's a bug in a source (uscensus) that returns a single columnless stream, and a bug in the platform that allows a customer to create a connection using that empty, columnless stream.
While it would be ideal for the destination connector to send a (yet nonexistent) `UpstreamError` back to the platform, some S3 configurations (CSV, JSON) already allow columnless records to be persisted. This change brings the Parquet/Avro formats to the same level of permissiveness.
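The permissive behavior can be sketched like this. The field names below are modeled on Airbyte's raw-table metadata columns, but the helper itself and its exact shape are hypothetical:

```python
import json

def to_avro_row(data, ab_id, emitted_at):
    """Hypothetical sketch: an empty or columnless data payload no longer
    raises; we persist a row carrying only the Airbyte metadata columns,
    matching the permissiveness of the CSV/JSON S3 formats."""
    return {
        "_airbyte_ab_id": ab_id,
        "_airbyte_emitted_at": emitted_at,
        # Columnless streams produce an empty payload; store null data.
        "_airbyte_data": json.dumps(data) if data else None,
    }
```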
When creating a table in Snowflake, we currently inherit the DEFAULT_COLLATION set at the database or schema level. We should always use the UTF-8 collation (which is the Snowflake default) so that our queries are simpler and faster, and accept more than 50 constants in an IN clause.
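As an illustration, the DDL generator can pin the collation explicitly instead of inheriting DEFAULT_COLLATION. This is a hypothetical helper, the real connector's column typing is more involved:

```python
def create_table_ddl(schema, table, columns):
    """Build CREATE TABLE DDL that pins text columns to Snowflake's
    default 'utf8' collation, so a DEFAULT_COLLATION set at the
    database or schema level cannot leak into new tables.
    `columns` is a list of (name, type) pairs. Illustrative only."""
    rendered = []
    for name, col_type in columns:
        if col_type.upper().startswith("VARCHAR"):
            # Explicit collation overrides any inherited default.
            rendered.append(f'"{name}" {col_type} COLLATE \'utf8\'')
        else:
            rendered.append(f'"{name}" {col_type}')
    return f'CREATE TABLE "{schema}"."{table}" ({", ".join(rendered)})'
```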
### TL;DR
Make destination-snowflake pass all tests
### What changed?
- Updated CDK version to 0.45.0
- Reduced JUnit method execution timeout to 20 minutes
- Improved error handling in SnowflakeDestination's main function
- Enhanced error message for invalid permissions in integration test
- Implemented a more robust cleanup process for Airbyte internal tables and schemas
- Removed unused Batch and LocalFileBatch classes
- Not in the PR: I also deleted about 5k tables and 2k schemas, which were making our tests run slower than necessary. The cleanup logic will automate those cleanups.
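The cleanup heuristic for stale test schemas might look roughly like this. The schema-name prefix, the retention window, and the function shape are all assumptions for illustration, not the PR's actual policy:

```python
from datetime import datetime, timedelta, timezone

def schemas_to_drop(schemas, retention_days=7, now=None):
    """Pick stale test schemas for cleanup.
    `schemas` is a list of (name, created_at) pairs; the prefix filter
    and retention window below are illustrative assumptions."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [
        name
        for name, created_at in schemas
        # Only touch schemas that look like test artifacts, and only
        # once they are older than the retention window.
        if name.upper().startswith("SQL_GENERATOR") and created_at < cutoff
    ]
```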
### How to test?
1. Run integration tests for the Snowflake destination connector
2. Verify that the new error message is displayed when testing with invalid permissions
3. Check that the cleanup process removes old tables and schemas as expected
4. Ensure that all existing functionality remains intact
### Why make this change?
These changes aim to improve the reliability and maintainability of the Snowflake destination connector. The updated CDK version and reduced test timeout should lead to faster and more efficient testing. The enhanced error handling and cleanup processes will help in identifying issues more quickly and keeping the test environment clean. Removing unused classes reduces code clutter and improves overall code quality.