airbyte

mirror of synced 2026-01-25 01:01:56 -05:00

Author	SHA1	Message	Date
Thibaud Chardonnens	921f4a13c6	add /tmp emptyDir volume to connector pods (#10761 ) Some connectors (such as destination-s3) need to write temporary data (generally to /tmp). It is good security practice to enforce a read-only root filesystem on Kubernetes pods, and some production Kubernetes clusters require all pods to run with a read-only root filesystem. Therefore, to still allow connectors to write temporary data to /tmp with a read-only root fs, we must mount an emptyDir volume at /tmp. The original PR was #9874; we decided to split it into 3 different PRs. The limit for this will be done in https://github.com/airbytehq/airbyte/issues/11025.	2022-03-10 21:01:09 +08:00
LiRen Tu	f5748998c8	Fix unit test failures from env configs (#10998 ) * Fix unit test failures from env configs * Default null to empty string * Format code * Fix one more unit test * Remove unused import	2022-03-09 14:24:54 -08:00
LiRen Tu	81417e6728	Add connector metadata as sentry tags (#10475 ) * Pass worker metadata to connector * Fix compilation * Pass in job id and image from worker * Remove application version * Add default job environment variables * Add back removed comment * Rename env map to job metadata * Fix env configs * Read connector from application * Use empty string * Remove println * Fix unit test * Fix compilation error * Introduce constants for worker env * Add worker env to ENV_VARS_TO_TRANSFER * Pass into getWorkerMetadata map to all constructions * Format code * Format octavia cli * Fix test compilation * Fix typos	2022-03-09 07:36:03 -08:00
Benoit Moriceau	780c98c476	Add test that check that we continue a reset as a reset if it failed (#10806 ) This adds tests to make sure that a reset is continued as a reset after an attempt, or as a job when the maximum number of attempts is reached. It also fixes the workflow to continue as a reset in a new job if it fails more than the maximum number of attempts. Open questions: - Is this what we want for the job (continue as a reset if the job failed)? - Do we need to respect the schedule if the reset failed more than the maximum number of attempts?	2022-03-08 17:36:39 -08:00
Benoit Moriceau	fd1f9339ed	Disable flaky test (#10927 )	2022-03-07 15:26:10 -08:00
Davin Chia	7bbcb369aa	Add failure origin metric. (#10884 ) Part of https://docs.google.com/document/d/11pEUsHyKUhh4CtV3aReau3SUG-ncEvy6ROJRVln6YB4/edit#. Introduce a metric to track failure origins.	2022-03-07 11:52:29 +08:00
Artemiy Kzr	e34c3578fd	Destination Clickhouse: enable normalization for Secure connections (#10754 ) * Clickhouse Destination: enable normalization for Secure connections * bump normalization version * run mypy check * add lib * install stubs running mypy * rollback gradlew command Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>	2022-03-04 22:12:49 -03:00
Benoit Moriceau	fbe084f17b	Rm remaining merge conflict in comments (#10876 )	2022-03-04 17:01:36 -08:00
Lake Mossman	754452c220	raise timeout on flaky test (#10868 )	2022-03-04 13:22:50 -08:00
Benoit Moriceau	f18d304a86	Add autoformat (#10808 )	2022-03-02 15:34:45 -08:00
Jared Rhizor	5aecbc30f8	default to no resource limits for OSS (#10800 )	2022-03-02 15:18:24 -08:00
Benoit Moriceau	b610393279	Extract event from the temporal worker run factory (#10739 ) Extract the different events that can happen to a sync into a non-Temporal-related interface.	2022-03-01 15:09:49 -08:00
Thibaud Chardonnens	ce9b967597	Adds default sidecar cpu request and limit and add resources to the init container (#10759 )	2022-03-01 18:46:09 -03:00
Benoit Moriceau	a60aa5f1a0	Update temporal retention TTL from 7 to 30 days (#10635 ) Increase the temporal retention to 30 days instead of 7. It will help with on call investigation.	2022-03-01 13:10:43 -08:00
Jared Rhizor	a46b885356	remove --cpu-shares flag (#10738 )	2022-02-28 15:14:18 -08:00
Davin Chia	ed9c3dca02	Fix error NPE in metrics emission. (#10675 )	2022-02-28 18:04:14 +08:00
lmossman	dd58eb3004	add mocks to tests	2022-02-25 11:19:40 -08:00
Charles	1242aa8b4a	Set resource limits for connector definitions: expose in worker (#10483 ) * pipe through to worker * wip * pass source and dest def resource reqs to job client * fix test * use resource requirements utils to get resource reqs for legacy and new impls * undo changes to pass sync input to container launcher worker factory * remove import * fix hierarchy order of resource requirements * add nullable annotations * undo change to test * format * use destination resource reqs for normalization and make resource req utils more flexible * format * refactor resource requirements utils and add tests * switch to storing source/dest resource requirements directly on job sync config * fix tests and javadocs * use sync input resource requirements for container orchestrator pod * do not set connection resource reqs to worker reqs * add overrident requirement utils method + test + comment Co-authored-by: lmossman <lake@airbyte.io>	2022-02-25 10:00:30 -08:00
Davin Chia	e21ec5a15b	Add attempt status by release stage metrics. (#10659 ) Add, - attempt_created_by_release_stage - attempt_failed_by_release_stage - attempt_succeeded_by_release_stage	2022-02-25 19:33:10 +08:00
Davin Chia	29095c659e	Correct cancelled job metric name. (#10658 )	2022-02-25 19:18:55 +08:00
Davin Chia	5bc6d814ba	Cloud Dashboard 1 (#10628 ) Publish metrics for: - created jobs tagged by release stage - failed jobs tagged by release stage - cancelled jobs tagged by release stage - succeed jobs tagged by release stage	2022-02-25 16:21:47 +08:00
Parker Mossman	2157b47b60	Log pod state if init pod wait condition times out (for debugging transient test issue) (#10639 ) * log pod state if init pod search times out * increase test timeout from 5 to 6 minutes to give kube pod process timeout time to trigger * format	2022-02-24 14:29:09 -08:00
Jared Rhizor	def938a3c9	stabilize connection manager tests (#10606 ) * stabilize connection manager tests * just call shutdown once * another run just so we can see if it's passing * another run just so we can see if it's passing * re-disable test * run another test * run another test * run another test * run another test	2022-02-24 14:00:43 -08:00
Benoit Moriceau	b4e14d7a44	Add missing continue as new (#10636 )	2022-02-24 11:02:47 -08:00
Jared Rhizor	9bf67dd91d	fix orchestrator restart problem for cloud (#10565 ) * test time ranges for cancellations * try with wait * fix cancellation on worker restart * revert for CI testing that the test fails without the retry policy * revert testing change * matrix test the different possible cases * re-enable new retry policy * switch to no_retry * switch back to new retry * paramaterize correctly * revert to no-retry * re-enable new retry policy * speed up test + fixees * significantly speed up test * fix ordering * use multiple task queues in connection manager test * use versioning for task queue change * remove sync workflow registration for the connection manager queue * use more specific example * respond to parker's comments	2022-02-23 22:11:39 -08:00
Parker Mossman	34be57c4c1	Add timeout to connector pod init container command (#10592 ) * add timeout to init container command * add disk usage check into init command * fix up disk usage checking and logs from init entrypoint * run format	2022-02-23 16:28:48 -08:00
Benoit Moriceau	22e4f6cd54	Change the block logic and block after the job creation (#10597 ) This changes the check for whether a connection exists in order to make it more performant and more accurate. It makes sure that the workflow is reachable by trying to query it.	2022-02-23 15:24:24 -08:00
Benoit Moriceau	2c09037597	Bmoric/move flag check to handler (#10469 ) Move the feature flag checks to the handler instead of the configuration API. This could have avoided some bugs related to the missing flag check in the cloud project.	2022-02-23 13:36:09 -08:00
Charles	2e4d91eb0a	refactor ConnectionHelper so that conversion logic can be shared (#10480 )	2022-02-22 18:08:42 -08:00
Subodh Kant Chaturvedi	54b134c255	convert enum to use the same thing as API (#10562 ) * convert enum to use the same thing as API * Fix flaky test Co-authored-by: Benoit Moriceau <benoit@airbyte.io>	2022-02-22 16:27:12 -08:00
Benoit Moriceau	dfce970430	Fix input (#10557 )	2022-02-22 12:29:35 -08:00
Maksym Pavlenok	bbd13802d8	🐛 Fix Python checker configs and Connector Base workflow (#10505 )	2022-02-22 19:58:55 +02:00
Davin Chia	db4dcdda75	Cloud Health Dashboard Step 0: Set up Metrics Registry. (#10478 ) Set up a Metrics Registry. The purpose of this registry is to better enforce metrics -> application relationship, metric -> description relationship, provide a central location where folks can understand what metrics OSS AB emits, and enforce some standards. Past experience has shown me that metrics emission can quickly get out of hand: 1) unclear what is emitted 2) similar metrics emitted in multiple places 3) not clear what metrics corresponds to what application. This is my attempt to provide a framework for us to operate in. Let me know if folks think this provides more complexity than is useful. I've added the KubePodProcess metric in here to demonstrate/test how everything will work in practice.	2022-02-22 23:29:43 +08:00
Charles	e7d7c773be	add keepalive for TCP socket in KubePodProcess (#10528 ) * add keepalive * afsdlk	2022-02-22 20:33:17 +05:30
Vadym Hevlich	5464b1c830	🐛 Normalization: Fix sync from HubSpot to MySQL fails with "Row size too large" on create table (#10485 ) * Update mysql normalization to cast string as text. Bump docker version. Update basic-normalization.md docs. * Update docs PR reference * Update mysql normalization to cast string as for is_timestamp_with_time_zone type	2022-02-22 14:22:26 +02:00
Malik Diarra	b26bdf7cc3	Add database object to ConfigRepository (#10473 )	2022-02-18 17:56:37 -08:00
Benoit Moriceau	76e969f2e5	Bmoric/add worflow internal state (#10439 ) Refactor the connection workflow in order to have a single final object as a workflow state	2022-02-18 13:14:49 -08:00
Benoit Moriceau	eb728c8dc4	Bmoric/refacto connection manager (#10370 ) This re-organizes the connectionManagerWorkflow in order to make it easier to understand. It is: - Removing some unneeded variables - Extracting the activity calls into their own methods - Reporting the status when possible, without waiting to be out of the cancellation scope to report success and failure - Avoiding else conditions in order to be more explicit about when we exit.	2022-02-18 11:29:22 -08:00
Davin Chia	ef673c5695	Inject check connection resource (#10410 ) Make it possible to set resource limits specifically for Check Connection. This helps speed up the Check Connection operation for Java based connectors. After this PR is merged, I will do an OSS release and make the required Helm changes in Cloud.	2022-02-18 16:06:11 +08:00
Benoit Moriceau	9d546d3ebb	Add a maximum page size and use the count instead of the list (#10443 ) * Add a maximum page size and use the count instead of the list * Fix typo	2022-02-17 15:28:14 -08:00
Jared Rhizor	a66d8be03a	continue workflows on restarts (#10294 ) * fix normalization output processing in container orchestrator * add full scheduler v2 acceptance tests * speed up tests * fixes * clean up * wip handle worker restarts * only downtime during sync test not passing * commit temp * mostly cleaned up * add attempt count check * remove todo * switch all pending checks to running checks * use ++ * Update airbyte-container-orchestrator/src/main/java/io/airbyte/container_orchestrator/ContainerOrchestratorApp.java Co-authored-by: Charles <giardina.charles@gmail.com> * Update airbyte-workers/src/main/java/io/airbyte/workers/temporal/sync/LauncherWorker.java Co-authored-by: Charles <giardina.charles@gmail.com> * add more context * remove unused arg * test on CI that no_retry is insufficient * revert back to orchestrator retry * test for retry logic * remove fialing test and switch back activity config to just no retry Co-authored-by: Charles <giardina.charles@gmail.com>	2022-02-17 15:14:51 -08:00
LiRen Tu	049a11b2bc	🎉 Snowflake destination: reduce memory footprint (#10394 ) * Add detailed logging for flushing * Log sentry transaction event id * Adjust logging * Log memory usage * Add jvm monitoring * Remove log * Remove port 9010 * Remove host network mode * Sample record size * Remove profiling code * Add unit tests * Use average estimation * Rename variable * Format code * Bump version * Revert unnecessary change * Update doc * Fix format * Bump version in seed	2022-02-17 12:55:35 -08:00
Jared Rhizor	3da09aa152	make status checks configurable from env vars + use shorter replication interval for testing (#10368 ) * make status check interval env-configurable * apply to test files to get the speed improvements * evert "apply to test files to get the speed improvements" This reverts commit `97159e3a8b`. * Revert "evert "apply to test files to get the speed improvements"" This reverts commit `bf3c6a5612`.	2022-02-16 11:14:05 -08:00
Nikolai Korolev	1f908436fb	Normalization Clickhouse: Fix exception in case password is not provided (#10219 ) * Normalization Clickhouse: Fix exception in case password is not provided * Do not provide password in dbt config in case there is no one * bump connector version * bump normalization version Co-authored-by: Marcos Marx <marcosmarxm@gmail.com>	2022-02-16 15:56:08 -03:00
Benoit Moriceau	ab10996f89	If an activity is failing, stuck the workflow and make it queriable (#10121 ) After an activity failure, we block the workflow. A new query method is available to get the list of stuck workflows. The activity can then be retried with a signal.	2022-02-15 14:32:03 -08:00
Parker Mossman	b742a451a0	Configure kube pod process per job type (#10200 ) * split workerConfigs and processFactory by job type, env var for check job node selectors * move status check interval to WorkerConfigs and customize for check worker * add scaffolding for spec, discover, and sync configs * optional orElse instead of orElseGet * add replicationWorkerConfigs with custom resource requirements	2022-02-15 09:59:41 -08:00
Jared Rhizor	af5e133a89	fix race conditions in ConnectionManagerWorkflowTest (#10296 )	2022-02-14 14:11:17 -08:00
Benoit Moriceau	c5e199f260	Disable flaky test (#10321 )	2022-02-14 09:44:00 -08:00
VitaliiMaltsev	e30d8348b2	Change JsonSchemaPrimitive to a class (#9913 ) * fix for jdk 17 * add JsonSchemaType class * fix tests * fix tests * fix tests * fix tests * fix tests * fix tests * fix Oracle tests * fix Redshift tests * fix Redshift tests * fix checkstyle * fix MSSQL tests * fix cockroachdb tests * fix checkstyle * fix checkstyle * replace star imports * replace star imports * replace star imports * update JsonSchemaType \| fixed checkstyle * Remove unused variables in test * Fix imports * Expand imports * Fix more imports Co-authored-by: vmaltsev <vitalii.maltsev@globallogic.com> Co-authored-by: Liren Tu <tuliren.git@outlook.com>	2022-02-14 02:12:37 -08:00
Lake Mossman	820a9ff840	do not wipe out existing connection resource requirements on update (#10291 ) * do not wipe out existing connection resource requirements on update * format * do not pull any values from worker configs here * remove logger	2022-02-11 15:57:13 -08:00

367 Commits