Commit Graph

Author SHA1 Message Date
Daniel Becker
6c8a3dfc33 IMPALA-5444: Asynchronous code generation
This commit introduces optional asynchronous code generation.

Asynchronous code generation means that instead of waiting for codegen
to finish, the query starts in interpreted mode while codegen is done on
another thread.

All the function pointers that point to codegen'd functions are changed
to be atomic, wrapped in a CodegenFnPtr. These are initialised to
nullptr and as long as they are nullptr, the corresponding interpreted
functions are used (as before). When code generation is ready, the
function pointers are set by the codegen thread. No synchronisation is
needed as the function pointers are atomic and it is not a problem if,
at a given moment, only a subset of the codegen'd function pointers are
set and the rest are interpreted.
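
The fallback scheme described above can be sketched as follows. This is a
minimal illustration, not the code from this commit: AtomicFnPtr is a
hypothetical stand-in for CodegenFnPtr, and Aggregator, InterpretedSum and
CodegendSum are invented names. The acquire/release memory orderings are
also an assumption; the commit message only says the pointers are atomic.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical wrapper around an atomic function pointer (stand-in for
// CodegenFnPtr; the real class may look different).
template <typename FnType>
class AtomicFnPtr {
 public:
  // Returns nullptr until a codegen'd function has been published.
  FnType load() const { return ptr_.load(std::memory_order_acquire); }
  // Called by the codegen thread once the compiled function is ready.
  void store(FnType fn) { ptr_.store(fn, std::memory_order_release); }

 private:
  std::atomic<FnType> ptr_{nullptr};
};

using SumFn = int64_t (*)(const int64_t* values, int n);

// Interpreted path: always available.
int64_t InterpretedSum(const int64_t* values, int n) {
  int64_t sum = 0;
  for (int i = 0; i < n; ++i) sum += values[i];
  return sum;
}

// Stands in for a JIT-compiled function; it must compute the same result.
int64_t CodegendSum(const int64_t* values, int n) {
  int64_t sum = 0;
  for (int i = 0; i < n; ++i) sum += values[i];
  return sum;
}

// Hypothetical operator that picks its implementation per batch.
struct Aggregator {
  AtomicFnPtr<SumFn> codegend_sum_fn;

  int64_t ProcessBatch(const int64_t* values, int n, bool* used_codegen) const {
    // Load once per batch so the whole batch uses a single implementation.
    SumFn fn = codegend_sum_fn.load();
    *used_codegen = (fn != nullptr);
    return fn != nullptr ? fn(values, n) : InterpretedSum(values, n);
  }
};
```

Because each pointer is checked independently, one operator may already run
its codegen'd function while another still runs the interpreted one; both
paths must therefore produce identical results.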

Asynchronous code generation can be turned on using the ASYNC_CODEGEN
boolean query option.

Testing:
 - In exhaustive mode, a limited number of end-to-end tests are run in
   async mode and with debug actions randomly delaying the codegen
   thread and the main thread after starting codegen to test various
   scenarios of relative timing. The number of such tests is kept
   small to avoid increasing the running time of the tests by too much.
 - Added a new end-to-end test, tests/query_test/test_async_codegen.py,
   which tests three relative timings:

    1. Async codegen finishes before query execution starts (only
       codegen'd code runs).
    2. Query execution finishes before async codegen finishes (only
       interpreted code runs).
    3. Async codegen finishes during query execution (both interpreted
       and codegen'd code run, with execution switching from interpreted
       mode to codegen'd code).
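
The three relative timings can be modelled with a small single-threaded
simulation. This is only a sketch under stated assumptions: it does not
reproduce tests/query_test/test_async_codegen.py or the debug actions, and
RunQuery, PathCounts and publish_at_batch are hypothetical names. The point
at which the codegen'd pointer becomes visible is modelled as a batch index
instead of a real second thread, so the outcome is deterministic.

```cpp
#include <atomic>
#include <cassert>

using RowFn = int (*)(int);

int InterpretedRow(int x) { return x + 1; }
int CodegendRow(int x) { return x + 1; }  // stands in for JIT-compiled code

struct PathCounts {
  int interpreted = 0;  // batches processed by the interpreted function
  int codegend = 0;     // batches processed by the codegen'd function
};

// publish_at_batch is the index of the batch before which the codegen'd
// pointer becomes visible. A value >= num_batches means codegen does not
// finish while the query is executing.
PathCounts RunQuery(int num_batches, int publish_at_batch) {
  std::atomic<RowFn> fn_ptr{nullptr};
  PathCounts counts;
  for (int batch = 0; batch < num_batches; ++batch) {
    if (batch == publish_at_batch) {
      // In the real system this store would happen on the codegen thread.
      fn_ptr.store(&CodegendRow, std::memory_order_release);
    }
    RowFn fn = fn_ptr.load(std::memory_order_acquire);
    if (fn != nullptr) {
      fn(batch);
      ++counts.codegend;
    } else {
      InterpretedRow(batch);
      ++counts.interpreted;
    }
  }
  return counts;
}
```

With four batches, RunQuery(4, 0) models timing 1, RunQuery(4, 4) models
timing 2, and RunQuery(4, 2) models timing 3, where the first two batches
are interpreted and the last two use the codegen'd function.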

Change-Id: Ia7cbfa7c6734dcf03641629429057d6a4194aa6b
Reviewed-on: http://gerrit.cloudera.org:8080/15105
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-07-01 17:31:52 +00:00