199 Commits

Author SHA1 Message Date
Florian Hussonnois
352d4eb194 chore(test): use MicronautTest when possible 2025-12-19 14:45:27 +01:00
Loïc Mathieu
8eae8aba72 feat(executions): add a protection mecanism to avoid any potential concurrency overflow
Concurrency limit is based on a counter that increment and decrement the limit each time a flow is started and terminated.

This count should always be accurate.

But if some unexpected event occurs (bug or user manually do something wrong), the count may not be accurate anymore.

To avoid any potential issue, when we decrement the counter, we chech that concurrency count is bellow the limit before unqueing an execution.

Fixes #12031
Closes  #13301
2025-12-11 14:33:30 +01:00
Shankar
56bb3ca29c Fix week format in filter 2025-12-09 10:00:37 +01:00
Loïc Mathieu
14029e8c14 chore(tests): isolate concurrency related tests in their own class 2025-12-09 09:57:49 +01:00
brian.mulier
682d258e7b feat(ns-files): add a metadata layer on top for better performance & versioned ns files
part of https://github.com/kestra-io/kestra/issues/5617
2025-12-01 18:13:49 +01:00
Roman Acevedo
fbe6df34ca feat(executions): set duration to null for non-terminated execs and fix frontend duration 2025-12-01 16:06:35 +01:00
brian.mulier
c3d94dc8ff fix(tests): remove JdbcTestUtils.drop usages as it defeats concurrent test runs 2025-11-27 19:23:17 +01:00
Loïc Mathieu
1d1a065833 fix(tests): lower termination grace period
If a test mis-behave, for ex starting an execution but not terminating it, as the default termination grace period is 5mn it can take very long time to wait for post-test terminaison.
Switching to a termination grace period of 5s may help.

I also detect that the ExecutionControllerRunner test when launching the test suite, would not properly kill the `sleep-long` flow so waiting for it to complete, or the termination grace period. When a test that use this flow is launched separatly it works properly. As a safety net I reduce the sleep from 5mn to 30s.
2025-11-19 11:58:06 +01:00
Loïc Mathieu
735697ac71 chore(system): share JDBC repository code in an abstract CRUD repository 2025-11-18 11:13:16 +01:00
Loïc Mathieu
8346874c43 chore(system): reduce repository code duplication between OSS and EE
Part-of: https://github.com/kestra-io/kestra-ee/issues/1684
2025-11-17 10:03:45 +01:00
Roman Acevedo
71e49f9eb5 feat(executions): add IN, NOT_IN, CONTAINS LABELS #11916
- advance on https://github.com/kestra-io/kestra/issues/11587
- companion PR: https://github.com/kestra-io/kestra-ee/pull/5617
2025-10-31 10:20:05 +01:00
brian.mulier
07e90de835 fix(core): CrudEvent should not be done on the repository side for KV 2025-10-30 16:57:57 +01:00
Loïc Mathieu
c06ffb3063 feat(system): set taskrun attempt to resubmitted when a taskrun is resubmitted to a worker
Closes https://github.com/kestra-io/kestra/issues/12481
2025-10-30 15:46:05 +01:00
brian.mulier
d9144c8c4f feat(core): introduce KV Metadata in-repository storing (#12342)
part of https://github.com/kestra-io/kestra/issues/12341
2025-10-29 17:18:43 +01:00
Loïc Mathieu
948a5beffa fix(executions): set tasks to submitted after sending to the Worker
When computing the next tasks to run, all task runs are created in the CREATED state.
Then when computed tasks to send to the worker, CREATED task runs are listed and converted into worker task.
The issue is that on the next execution message, if tasks sent to the worker are still in CREATED (for ex because the Worker didn't start them yet), they would still be evaluted as to send to the worker.
Setting them to a new SUBMITTED state would prevent them to be taken into account again until they are really terminated.

This should avoid the deduplicateWorkerTask state but this is kept for now with a warning and would be removed later if it proves to work in all cases.
2025-10-10 11:06:31 +02:00
Loïc Mathieu
4a9564be3c fix(system): refactor concurrency limit to use a counter
A counter allow to lock by flow which solves the race when two executions are created at the same time and the executoion_runnings table is empty.

Evaluating concurrency limit on the main executionQueue method also avoid an unexpected behavior where the CREATED execution is processed twice as its status didn't change immediatly when QUEUED.

Closes https://github.com/kestra-io/kestra-ee/issues/4877
2025-10-09 15:39:59 +02:00
Loïc Mathieu
5c24308e71 fix(executions): evaluate multiple conditions in a separate queue
By evaluating multiple condition in a separate queue, we serialize their evaluation which avoir races when we compute the outputs for flow triggers.
This is because evaluation is a multi step process: first you get the existing condtion, then you evaluate, then you store the result. As this is not guarded by a lock you must not do it concurrently.

The race can still occurs if muiltiple executors run but this is less probable. A re-implementation would be needed probably in 2.0 for that.

Fixes https://github.com/kestra-io/kestra-ee/issues/4602
2025-10-03 10:35:49 +02:00
YannC
296fb2fb7a feat: implement Flows as a DataSource for dashboards (#11439)
* feat: implement Flows as a DataSource for dashboards

* chore: review changes

* fix: method signature changes from another commit apply in new flow fetchData method
2025-10-01 12:57:25 +02:00
Nicolas K.
b294457953 feat(tests): rework runner utils to not use the queue during testing (#11380)
* feat(tests): rework runner utils to not use the queue during testing

* feat(tests): rework runner utils to not use the queue during testing

* test: rework RetryCaseTest to not rely on executionQueue

* fix(tests): don't catch the Queue exception

* fix(tests): don't catch the Queue exception

* fix compile

* fix(test): concurrency error and made runner test parallel ready

* fix(tests): remove test instance

* feat(tests): use Test Runner Utils

* fix(tests): flaky tests

* fix(test): flaky tests

* feat(tests): rework runner utils to not use the queue during testing

* feat(tests): rework runner utils to not use the queue during testing

* test: rework RetryCaseTest to not rely on executionQueue

* fix(tests): don't catch the Queue exception

* fix(tests): don't catch the Queue exception

* fix compile

* fix(test): concurrency error and made runner test parallel ready

* fix(tests): remove test instance

* feat(tests): use Test Runner Utils

* fix(tests): flaky tests

* fix(test): flaky tests

* fix(tests): flaky set test

* fix(tests): remove RunnerUtils

* fix(tests): fix flaky

* feat(test): rework runner tests to remove the queue usage

* feat(test): fix a flaky and remove parallelism from mysql test suit

* fix(tests): flaky tests

* clean(tests): unwanted test

* add debug exec when fail

* feat(tests): add thread to mysql thread pool

* fix(test): flaky and disable a test

---------

Co-authored-by: nKwiatkowski <nkwiatkowski@kestra.io>
Co-authored-by: Roman Acevedo <roman.acevedo62@gmail.com>
2025-09-24 08:18:02 +02:00
Loïc Mathieu
02d9c589fb chore(system): remove the task run page
Part-of: https://github.com/kestra-io/kestra-ee/issues/5174
2025-09-23 14:48:30 +02:00
Roman Acevedo
504f925085 test: make AbstractExecutionRepositoryTest parallelizable (#11295)
* test: make AbstractExecutionRepositoryTest parallelizable

* feat(tests): play jdbc h2 tests in parallel

* fix(tests): failing unit tests

* tests: add await until timeout on some tests

* fix(tests): failing unit tests

* fix(tests): failing unit tests

---------

Co-authored-by: nKwiatkowski <nkwiatkowski@kestra.io>
Co-authored-by: Nicolas K. <nk_mikmak@hotmail.com>
2025-09-17 17:41:10 +02:00
Loïc Mathieu
617daa79db fix(executions): truncate the execution_running table as in 0.24 there was an issue in the purge
This table contains executions for flows that have a concurrency that are currently running.
It has been added in 0.24 but in that release there was a bug that may prevent some records to being correctly removed from this table.
To fix that, we truncate it once.
2025-09-15 17:29:28 +02:00
Loïc Mathieu
be5e24217b chore(system): extract the scheduler to its own module 2025-09-04 11:04:13 +02:00
Loïc Mathieu
a5724bcb18 chore(system): extract the executor to its own module 2025-09-04 11:04:13 +02:00
Florian Hussonnois
3929bf6172 feat(system): add distinct server-events for reporting
Refactor the services used to generate periodic reports on server usage.

Related-to: kestra-io/kestra-ee#3014
2025-08-20 12:20:31 +02:00
Florian Hussonnois
194ae826e5 chore(system): add WorkerJobQueueInterface to properly pass workerId on subscribe 2025-08-12 19:26:31 +02:00
Loïc Mathieu
2acf37e0e6 fix(executions): properly fail the task if it contains unsupported unicode sequence
This occurs in Postgres using the `\u0000` unicode sequence. Postgres refuse to store any JSONB with this sequence as it has no textual representation.
We now properly detect that and fail the task.

Fixes #10326
2025-08-11 11:53:39 +02:00
YannC
c68c1b16d9 fix: set postgres and mysql queue offset as a bigint (#10344) 2025-07-28 16:28:09 +02:00
Loïc Mathieu
ab464fff6e fix(executions): flow concurrency limit not honors when executions are created at a high rate
This is due to the fact that we now process the execution queue concurrently so there is a race when counting currently running executions. This can be seen easily using a ForEachItem as it could create tens or hundreds of executions almost instantly leading to almost all those executions started as they would all see 0 executions running...

Using a dedicated execution running queue, as done in EE, would serialize the messages and fix the issue.

However, if using multiple executor instances and concurrency limit = 1, there is a theoretical race as no locks will be done if no execution is running. A max surge of executions could be as high as the number of executor but this race is less probable to happen in real world scenario.

Fixes #10167
2025-07-25 11:35:00 +02:00
YannC.
b1d41f6f47 fix: handle label filter with and instead or for flow
close #4390
2025-07-22 09:42:45 +02:00
Loïc Mathieu
05b50c22e3 feat(executions): allow suspending an execution at a breakpoint
- When creating an execution, you can pass a breakpoint of the form `taskId.value` and an execution kind.
- An execution with a breakpoint will be suspended in the `BREAKPOINT` state when arriving at the point where the breakpoint task should be executed
- You can resume an execution from a breakpoint, this would resume the execution and remove the existing breakpoint. At this time a new breakpoint can be passed.
- You can pass a breakpoint when replaying an execution.

Part-of: https://github.com/kestra-io/kestra-ee/issues/1547
2025-07-09 10:56:00 +02:00
Loïc Mathieu
6a4397fdfd fix(system): avoid creating multiple worker job queue
We created miltiple worker job queue because the bean was in the prototype scope.
This was needed only for tests as they are closing it.
Switching to singleton and rebuilding the context of the test that needs it fixes the issue.
2025-07-08 10:35:50 +02:00
Nicolas K.
c6e5cdfd93 feat(cicd): #4006 migrate sonatype to maven central (#9803)
Co-authored-by: nKwiatkowski <nkwiatkowski@kestra.io>
2025-07-01 15:00:34 +02:00
Loïc Mathieu
5304630b30 fix(system): avoid starting two queues for WorkerJob and WorkerTriggerResult
We previously starts 2 queues: one for emit, and a specific one for receive wich handle WorkerJobRunning under the cover.

We can replace by using a single queue that will transparently handle WorkerJobRunning. As queues starts thread pools their cost is not negligeable.

Fixes #9007
2025-06-10 09:36:54 +02:00
Loïc Mathieu
3bc75efbf2 fix(dashboard): Properly cast log level column in Dashboards
Fixes #8128
2025-05-27 16:18:49 +02:00
YannC
ea402261d5 Replace the default dashboard with custom dashboard (#8769)
* feat:
- Implement width property
- Replace custom dashboard
- Started to integrate the KPI chart

* feat(ui): introduce dashboard chart layout system

* feat(ui): introduce dashboard chart kpi card

* chore(ui): amend layout widths for sm screen size

* chore(ui): prevent editing of default dashboard

* chore(ui): centering the KPI text inside the box

* chore(ui): dashboard edit preview to respect set layouts

* chore(ui): initial work on setting the default flow & namespace dashboards

* fix(ui): make sure there is no naming clashes

* feat: KPI chart backend implementation

* feat: validation annotations

* chore(ui): make chart legend align to right side

* chore(ui): properly show chart labels

* chore(ui): improve state and ID components inside custom tabels

* chore(ui): add proper link to execution in tables

* feat: implemented Triggers as Datasource for custom dashboards

close kestra-io/kestra-ee#3740

* feat: modified the Markdown chart so now it accept different sources

* feat: rename KPI property to numerator & where

close #3739

* chore(ui): improve markdown component

* chore(ui): markdown charts

* chore(ui): markdown charts

* chore(ui): markdown charts remove padding

* chore(ui): markdown charts

* feat: fixes + define custom dashboard equivalent to current default dashboard with some modification

* fix: round double value

* chore(ui): improve  flows and ns charts

* chore(ui): make sure that table shows execution links only if namespace and flowId exist

* chore(ui): make sure markdown is properly shown on dashboard edititng

* fix: correctly do preview instead of load on homepage

* fix: correctly preview markdown chart and add description in default flow dashboard

* fix: apply review changes

* fix: modify test following classes modifications on charts

* tests: restore package-lock

* remove chromatic tools and a warning

---------

Co-authored-by: MilosPaunovic <paun992@hotmail.com>
Co-authored-by: Bart Ledoux <bledoux@kestra.io>
2025-05-27 08:02:11 +02:00
Nicolas K.
29c3bd7dec fix(core): tenant migration scripts now update keys
* chore(core): add keyboard shortcuts icon to flow editor tab (#8925)

fix(core): don't send empty operations when migrating roles (#3807)

Co-authored-by: nKwiatkowski <nkwiatkowski@kestra.io>

* fix(core): migrate key that required tenant id to avoid duplication

---------

Co-authored-by: Miloš Paunović <paun992@hotmail.com>
Co-authored-by: nKwiatkowski <nkwiatkowski@kestra.io>
2025-05-23 16:44:53 +02:00
Loïc Mathieu
1a4bb04258 fix(system): rename flyway migrations
They have been renamed in develop due to clashes but one was already backported on 0.22.

Fixes https://github.com/kestra-io/kestra-ee/issues/3819
2025-05-23 11:42:21 +02:00
YannC
88fa884e26 fix(filters): change label filtering to 'and' instead of 'or' (#8661)
* fix(filters): change label filtering to 'and' instead of 'or'

link to #8489

* test(execution): add test to validate and behavior

* test(execution): fix test
2025-05-23 11:26:46 +02:00
Nicolas K.
734fcbc45b feat(core): #3427 add OSS tenant migration scripts (#8798)
* feat(core): #3427 add OSS tenant migration scripts

* clean(core): fixes after review

* clean(core): make only one command for oss and EE migration

* fix(core): user synchronisation command and clean PR

---------

Co-authored-by: nKwiatkowski <nkwiatkowski@kestra.io>
2025-05-19 16:42:59 +02:00
Loïc Mathieu
30b39b2d30 feat(execution): add an execution kind
This allow to differentiate between normal executions and test executions
2025-05-16 17:00:27 +02:00
Loïc Mathieu
7117ae60f5 feat(system): purge empty service instances
Purge service instance in EMPTY state after a certain duration, 30 days by default, to avoid never ending groth on the service_instances table.

Fixes #8514
2025-05-06 17:17:46 +02:00
Loïc Mathieu
02723aa3d9 fix(dashboards)*: SQL errors on logs data chart
- PostgreSQL needs a cast from enum to text.
- Missing quotes on the WHERE clause.

Fixes #8128
2025-04-30 18:09:45 +02:00
杨利伟
03f9aa8e85 fix(jdbc): add service_id index on service_instance table 2025-04-24 18:14:15 +02:00
Loïc Mathieu
724058a06f Feat/storage outputs (#8361)
* feat(executions): Store outputs inside the internal storage (1/2)

* feat(executions): Store outputs inside the internal storage (2/2)

* feat(test): allow passing tenantId to tests

---------

Co-authored-by: Ludovic DEHON <tchiot.ludo@gmail.com>
2025-04-22 15:46:57 +02:00
weibo1
e1c4ae22f2 feat(system): add index to commonly queried fields in the WHERE conditions of the triggers table 2025-04-16 09:05:00 +02:00
Florian Hussonnois
504ff282ef fix: add missing indices for service instance table 2025-04-09 19:01:42 +02:00
Nicolas K.
5285bea930 feat(Unit Tests) #8171 convert hamcrest to assertj (#8276)
* feat(Unit Tests) #8171 convert hamcrest to assertj

* fix(Unit Tests) #8171 failing unit test after assertj migration

---------

Co-authored-by: nKwiatkowski <nkwiatkowski@kestra.io>
2025-04-08 15:48:54 +02:00
brian.mulier
1a121951d6 fix(jdbc): conflict on migration numbers 2025-04-07 21:59:14 +02:00
Florian Hussonnois
8f29a72df7 refactor: add GenericFlow to support un-typed flow deserialization
Add new FlowId, FlowInterface and GenericFlow classes to support
deserialization of flow with un-typed plugins (i.e., tasks, triggers)
in order to inject defaults prior to strongly-typed deserialization.
2025-04-07 17:32:06 +02:00