Commit Graph

11 Commits

Author SHA1 Message Date
Riza Suminto
95f353ac4a IMPALA-13507: Allow disabling glog buffering via with_args fixture
We have plenty of custom_cluster tests that assert against the content
of Impala daemon log files while the process is still running, using
assert_log_contains() and its wrappers. That method specifically
mentions disabling glog buffering ('-logbuflevel=-1'), but not all
custom_cluster tests do that. This often results in flaky tests that
are hard to triage and are often neglected if they do not run
frequently in the core exploration.

This patch adds a boolean param 'disable_log_buffering' to
CustomClusterTestSuite.with_args so that a test can declare its
intention to inspect log files in the live minicluster. If it is True,
the minicluster is started with '-logbuflevel=-1' for all daemons. If
it is False, a WARNING is logged on any call to assert_log_contains().
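
The effect of the new param can be sketched as follows. This is an
illustrative stand-in, not Impala's actual helper; the function name
and the '-v=1' placeholder arg are assumptions, but '-logbuflevel=-1'
is the real glog flag that disables log buffering:

```python
def build_daemon_args(disable_log_buffering):
    """Sketch: when a test declares that it will inspect live log
    files, every daemon gets '-logbuflevel=-1' added to its args."""
    args = ['-v=1']  # placeholder for the usual daemon startup args
    if disable_log_buffering:
        # At this level glog flushes every message immediately, so
        # assert_log_contains() sees up-to-date log files.
        args.append('-logbuflevel=-1')
    return args
```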

Several complex custom_cluster tests are left unchanged and print such
WARNING logs, including:
- TestQueryLive
- TestQueryLogTableBeeswax
- TestQueryLogOtherTable
- TestQueryLogTableHS2
- TestQueryLogTableAll
- TestQueryLogTableBufferPool
- TestStatestoreRpcErrors
- TestWorkloadManagementInitWait
- TestWorkloadManagementSQLDetails

This patch also fixes some small flake8 issues in the modified tests.

There is a sign of flakiness in test_query_live.py, where a test query
submitted to the coordinator fails because the sys.impala_query_live
table does not yet exist from the coordinator's perspective. This patch
modifies test_query_live.py to wait for a few seconds until
sys.impala_query_live is queryable.
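
The wait can be sketched as a generic poll-with-timeout loop; the
function name and the probe are illustrative (the actual change lives
in test_query_live.py):

```python
import time


def wait_until_queryable(run_query, timeout_s=30, interval_s=1):
    """Poll a caller-supplied probe (e.g. a 'select ... limit 0'
    against sys.impala_query_live) until it succeeds or the timeout
    expires. Returns True on success, False on timeout."""
    deadline = time.time() + timeout_s
    while True:
        try:
            run_query()
            return True
        except Exception:
            # Table not yet visible to the coordinator; retry.
            if time.time() >= deadline:
                return False
        time.sleep(interval_s)
```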

Testing:
- Pass custom_cluster tests in exhaustive exploration.

Change-Id: I56fb1746b8f3cea9f3db3514a86a526dffb44a61
Reviewed-on: http://gerrit.cloudera.org:8080/22015
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-11-05 04:49:05 +00:00
Yida Wu
f93bd98621 IMPALA-11805: Use llvm ObjectCache for codegen caching
Currently, we employ llvm::ExecutionEngine for codegen caching,
providing access to compiled functions within the cached engine.
However, the ExecutionEngine uses a large amount of memory, which far
exceeds our memory estimates and is very hard to predict.

This patch addresses this issue by using llvm::ObjectCache for codegen
caching. In our case, each execution engine has only one module; after
the module is compiled, its codegened functions are set on the
execution engine so that Impala can use them. During compilation of the
module, if an ObjectCache is set on the execution engine, the compiled
codegened functions are also written into the cache. This way, if we
keep the cache, then when the same module (fragment) is revisited we
can efficiently reuse the specific ObjectCache, loading pre-compiled
codegened functions and saving time.
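
The contract being relied on mirrors llvm::ObjectCache's two hooks,
notifyObjectCompiled() and getObject(). A language-neutral sketch (the
class and key names are illustrative, and real object code is a
compiled-machine-code buffer, not a string):

```python
class ObjectCacheSketch:
    """Sketch of the llvm::ObjectCache contract: the JIT calls
    notify_object_compiled() after compiling a module and get_object()
    before compiling, so a cache hit skips compilation entirely."""

    def __init__(self):
        self._store = {}

    def notify_object_compiled(self, module_key, object_code):
        # Called by the engine after compilation; persist the result.
        self._store[module_key] = object_code

    def get_object(self, module_key):
        # Called by the engine before compilation; a non-None return
        # means the engine loads this object instead of recompiling.
        return self._store.get(module_key)
```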

The TPC-H performance test indicates no significant regression
compared to the previous use of ExecutionEngine. Post-change, the
actual memory usage of each codegen cache entry is notably reduced.

+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(1)  | parquet / none / none | 0.22    | -0.65%     | 0.20       | -0.75%         |
+----------+-----------------------+---------+------------+------------+----------------+
+----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+-------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval  |
+----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+-------+
| TPCH(1)  | TPCH-Q13 | parquet / none / none | 0.49   | 0.47        |   +2.80%   |   5.32%    |   5.07%        | 10    |   +1.22%       | 1.63    | 1.19  |
| TPCH(1)  | TPCH-Q4  | parquet / none / none | 0.16   | 0.16        |   +3.51%   |   1.32%    | * 10.38% *     | 10    |   +0.06%       | 0.49    | 1.06  |
| TPCH(1)  | TPCH-Q11 | parquet / none / none | 0.12   | 0.12        |   +1.39%   |   2.27%    |   2.24%        | 10    |   +1.50%       | 1.90    | 1.37  |
| TPCH(1)  | TPCH-Q19 | parquet / none / none | 0.21   | 0.21        |   +1.56%   | * 10.02% * | * 11.42% *     | 10    |   +1.18%       | 0.57    | 0.32  |
| TPCH(1)  | TPCH-Q18 | parquet / none / none | 0.27   | 0.27        |   +1.71%   |   6.46%    |   1.29%        | 10    |   -0.19%       | -1.19   | 0.81  |
| TPCH(1)  | TPCH-Q6  | parquet / none / none | 0.11   | 0.11        |   +0.79%   |   2.76%    |   2.15%        | 10    |   +0.10%       | 1.46    | 0.71  |
| TPCH(1)  | TPCH-Q3  | parquet / none / none | 0.26   | 0.26        |   +0.71%   |   6.63%    |   6.18%        | 10    |   +0.04%       | 0.49    | 0.25  |
| TPCH(1)  | TPCH-Q17 | parquet / none / none | 0.17   | 0.17        |   +0.41%   | * 14.66% * | * 13.01% *     | 10    |   +0.05%       | 0.40    | 0.07  |
| TPCH(1)  | TPCH-Q14 | parquet / none / none | 0.16   | 0.16        |   +0.19%   |   1.41%    |   1.39%        | 10    |   +0.25%       | 1.46    | 0.31  |
| TPCH(1)  | TPCH-Q20 | parquet / none / none | 0.17   | 0.17        |   +0.22%   |   1.70%    |   1.77%        | 10    |   -0.05%       | -0.40   | 0.28  |
| TPCH(1)  | TPCH-Q12 | parquet / none / none | 0.16   | 0.16        |   -0.27%   |   0.54%    |   1.46%        | 10    |   +0.14%       | 0.93    | -0.54 |
| TPCH(1)  | TPCH-Q22 | parquet / none / none | 0.11   | 0.11        |   -0.38%   |   0.81%    |   2.06%        | 10    |   +0.03%       | 0.22    | -0.54 |
| TPCH(1)  | TPCH-Q16 | parquet / none / none | 0.17   | 0.17        |   -0.38%   |   0.67%    |   1.58%        | 10    |   -0.01%       | -0.13   | -0.70 |
| TPCH(1)  | TPCH-Q8  | parquet / none / none | 0.27   | 0.27        |   -0.08%   |   1.24%    |   1.15%        | 10    |   -0.33%       | -1.37   | -0.15 |
| TPCH(1)  | TPCH-Q15 | parquet / none / none | 0.16   | 0.16        |   -1.18%   | * 16.61% * | * 10.25% *     | 10    |   +0.33%       | 0.40    | -0.19 |
| TPCH(1)  | TPCH-Q1  | parquet / none / none | 0.22   | 0.22        |   -1.67%   |   1.62%    |   7.45%        | 10    |   +0.43%       | 1.02    | -0.70 |
| TPCH(1)  | TPCH-Q5  | parquet / none / none | 0.22   | 0.22        |   -0.98%   |   0.22%    |   1.55%        | 10    |   -0.26%       | -2.16   | -1.97 |
| TPCH(1)  | TPCH-Q21 | parquet / none / none | 0.48   | 0.49        |   -1.18%   |   3.58%    |   4.40%        | 10    |   -0.25%       | -1.19   | -0.66 |
| TPCH(1)  | TPCH-Q10 | parquet / none / none | 0.26   | 0.26        |   -1.93%   |   7.84%    |   6.24%        | 10    |   -0.14%       | -0.13   | -0.62 |
| TPCH(1)  | TPCH-Q7  | parquet / none / none | 0.18   | 0.19        |   -3.31%   | * 11.47% * | * 12.47% *     | 10    |   -0.25%       | -1.72   | -0.63 |
| TPCH(1)  | TPCH-Q9  | parquet / none / none | 0.34   | 0.35        |   -5.22%   |   6.87%    | * 10.03% *     | 10    |   -2.15%       | -1.28   | -1.38 |
| TPCH(1)  | TPCH-Q2  | parquet / none / none | 0.16   | 0.18        |   -11.00%  | * 16.07% * |   3.84%        | 10    |   -0.90%       | -1.81   | -2.35 |
+----------+----------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+-------+

Since we no longer use the ExecutionEngine for caching, the
LlvmExecutionEngineWrapper class is removed. In its place, a new class,
CodeGenObjectCache, implements llvm::ObjectCache.

Testing:
Passed LlvmCodeGenCacheTest and custom_cluster/test_codegen_cache.py.

Change-Id: Ic3c1b46bb9018ed0320817141785a3bdc41fa677
Reviewed-on: http://gerrit.cloudera.org:8080/20733
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-12-19 20:32:10 +00:00
Daniel Becker
db5f3b18e4 IMPALA-12306: (Part 2) Make codegen cache tests with symbol emitter more robust
The codegen cache tests that include having a symbol emitter (previously
TestCodegenCache.{test_codegen_cache_with_asm_module_dir,test_codegen_cache_with_perf_map})
introduced by IMPALA-12260 were added to ensure we don't produce a
use-after-free.

There are two problems with these tests:
  1. Setting the codegen cache size correctly in the tests has proved to
     be difficult because new commits and different build types (debug
     vs. release) have a huge effect on what sizes are appropriate. We
     have had many build failures because of this.

  2. Use-after-free is undefined behaviour and does not guarantee a
     crash but the tests rely on the crash to catch the bug described in
     IMPALA-12260.

This change solves the second problem. The tests added by IMPALA-12260
relied on a crash in the situation described there:
'LlvmCodeGen::symbol_emitter_' is registered as an event listener with
the current 'llvm::ExecutionEngine', then the engine is cached but the
'LlvmCodeGen' object, which owns the symbol emitter, is destroyed at the
end of the query. When the cached execution engine is destroyed later,
it frees any remaining object files and notifies the symbol emitter
about this, but the symbol emitter has already been destroyed so its
pointer is invalid (use-after-free).

However, we can't rely on the crash to detect the use-after-free because
1) the crash is not guaranteed to happen, use-after-free is undefined
   behaviour
2) the crash may happen well after the query has finished returning
   results.

This change solves the problem in the following way: in
'CodegenSymbolEmitter' we introduce a counter that is incremented in
NotifyObjectEmitted() and decremented in NotifyFreeingObject(). At the
time of the destruction of the 'CodegenSymbolEmitter', this counter
should be zero; if it is greater than zero, the LLVM execution engine
to which the 'CodegenSymbolEmitter' is subscribed is still alive and
will try to notify the symbol emitter when the object file is freed
(most likely when the execution engine itself is destroyed), leading to
a use-after-free.
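
The counter invariant can be sketched like this (class and method
names are illustrative analogues of the C++ ones above):

```python
class SymbolEmitterSketch:
    """Sketch of the destructor-time check: NotifyObjectEmitted and
    NotifyFreeingObject pair up, so a nonzero count at destruction
    means some execution engine still holds a reference to this
    emitter and a use-after-free would follow."""

    def __init__(self):
        self.live_objects = 0

    def notify_object_emitted(self):
        self.live_objects += 1

    def notify_freeing_object(self):
        self.live_objects -= 1

    def destroyed_safely(self):
        # True only when every emitted object has been freed, i.e.
        # no engine will call back into this emitter later.
        return self.live_objects == 0
```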

We also add a hidden startup flag,
'--codegen_symbol_emitter_log_successful_destruction_test_only'. When it
is set to true, 'CodegenSymbolEmitter' will log a message when it is
being destroyed correctly (i.e. when the counter is zero and
use-after-free will not happen). We use it in the tests - if we don't
have the expected message in the logs (after some timeout), the test
fails.

Testing:
 - modified the tests
   TestCodegenCache.{test_codegen_cache_with_asm_module_dir,test_codegen_cache_with_perf_map}
   so they reliably detect use-after-free.

Change-Id: I61b9b0de9c896f3de7eb1be7de33d822b1ab70d0
Reviewed-on: http://gerrit.cloudera.org:8080/20318
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-11-15 16:20:37 +00:00
Daniel Becker
3435baba67 IMPALA-12306: (Part 1) Make codegen cache tests with symbol emitter more robust
The codegen cache tests that include having a symbol emitter (previously
TestCodegenCache.{test_codegen_cache_with_asm_module_dir,test_codegen_cache_with_perf_map})
introduced by IMPALA-12260 were added to ensure we don't produce a
use-after-free.

There are two problems with these tests:
  1. Setting the codegen cache size correctly in the tests has proved to
     be difficult because new commits and different build types (debug
     vs. release) have a huge effect on what sizes are appropriate. We
     have had many build failures because of this.

  2. Use-after-free is undefined behaviour and does not guarantee a
     crash but the tests rely on the crash to catch the bug described in
     IMPALA-12260.

This commit solves the first problem. We use the
'--codegen_cache_entry_bytes_charge_overhead' startup flag to
artificially assign a higher size (memory charge) to the cache entries,
so that the real size, and therefore also changes in the real size, are
insignificant in comparison.

Change-Id: If801ae6d3d9f5286ed886b1d06c37a32bc1d2c54
Reviewed-on: http://gerrit.cloudera.org:8080/20304
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-08-09 12:28:52 +00:00
Daniel Becker
645abfc353 IMPALA-12269: Codegen cache false negative because of function names hash
Codegen cache entries (execution engines holding an LLVM code module)
are stored by keys derived from the unoptimised llvm modules: the key is
either the whole unoptimised module (normal mode) or its hash (optimal
mode). Because hash collisions are possible (in optimal mode), as an
extra precaution we also compare the hashes of the function names in the
current and the cached module. However, when assembling the function
name list we do not filter out duplicate function names, which may
result in cases where the unoptimised llvm modules are identical but the
function name hashes do not match.

Example:
First query:
  select int_col, tinyint_col
  from alltypessmall
  order by int_col desc
  limit 20;

Second query:
  select tinyint_col
  from alltypessmall
  order by int_col desc
  limit 20;

In the first query, there are two 'SlotRef' objects referencing
'tinyint_col' which want to codegen a 'GetSlotRef()' function. The
second invocation of 'SlotRef::GetCodegendComputeFnImpl()' checks the
already codegen'd functions, finds the function created by the first
invocation and returns that. The two 'SlotRef' objects will use the
same 'llvm::Function' and there will be only one copy of it in the
module, but both 'SlotRef's will call 'LlvmCodeGen::AddFunctionToJit()'
with this function in order for their respective function pointers to
be set after JIT-compilation.

'LlvmCodeGen::GetAllFunctionNames()' will return the names of all
functions with which 'LlvmCodeGen::AddFunctionToJit()' has been called,
including duplicates.

The second query generates the same unoptimised module as the first
query (for the corresponding fragment), but does not have a duplicated
'GetSlotRef()' function in its function name list, so the cached module
is rejected.

Note that this also results in the cached module being evicted when the
new module from the second query is inserted into the cache because the
new module will have the same key as the cached one (the modules are
identical).

This change fixes this problem by using a de-duplicated and sorted
function name list.
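
The fix can be sketched as follows: hashing a de-duplicated, sorted
name list makes the key insensitive to registration order and to
repeated AddFunctionToJit() calls (the function name and the use of
SHA-256 here are illustrative, not Impala's actual hash):

```python
import hashlib


def function_names_hash(names):
    """Hash a canonicalized function name list: duplicates removed,
    order fixed, so identical modules always produce identical keys."""
    canonical = '\0'.join(sorted(set(names)))
    return hashlib.sha256(canonical.encode()).hexdigest()
```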

Testing:
  - Added a test in test_codegen_cache.py that asserts that there is a
    cache hit and no eviction in the above example.

Change-Id: Ibf1d2b424c969fbba181ab90bf9c7bf22355f139
Reviewed-on: http://gerrit.cloudera.org:8080/20168
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-08-04 09:47:50 +00:00
Daniel Becker
66b701f806 IMPALA-12292: TestCodegenCache.{test_codegen_cache_with_asm_module_dir,test_codegen_cache_with_perf_map} fail in builds
The above codegen cache tests were introduced by IMPALA-12260. They run
two queries and the first query produces two codegen cache entries. The
tests aim to bring about the following scenario:

1. both codegen cache entries from the first query fit in the cache
AND
2. both entries from the first query are evicted during the second
   query.

The parameters that can be tuned are the following:
1. the size of the codegen cache entries of the first query
2. the size of the codegen cache entries of the second query
3. the size of the codegen cache.

If the parameters are chosen badly, or the sizes of the codegen cache
entries change because of other Impala changes (e.g. codegen
optimisations), the conditions may not be satisfied and the tests may
fail, as they did here.

This change makes the tests more robust by
 - increasing the cache footprint of the second query (from 487.40 KB to
   663.68 KB)
 - choosing the size of the codegen cache so as to leave as much margin
   on each side as possible. At present
     - the minimal codegen cache size so that both entries from the
       first query fit the cache is around 2.4 MB
     - the maximal cache size so that both entries from the first query
       are evicted during the second query is around 4.1 MB
   Therefore we choose a cache size of 3.25 MB, which lies in the middle.

Experience has shown that this setup is fragile and breaks easily when
new commits are added to Impala. Therefore this change relaxes some of
the assertions in the tests as a temporary measure to prevent build
failures. For this and other reasons IMPALA-12306 was opened to make
these tests more robust.

Change-Id: I15320b8c0d06f4d93927b19731c11bd4e15b3690
Reviewed-on: http://gerrit.cloudera.org:8080/20224
Reviewed-by: Yida Wu <wydbaggio000@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-07-28 08:52:57 +00:00
Daniel Becker
c0feea2c9f IMPALA-12260: Crash if '--asm_module_dir' is set
If Impala is started with the --asm_module_dir flag set and the
codegen cache is used, Impala crashes.

The problem is with the lifetime of 'LlvmCodeGen::symbol_emitter_'. It
is registered as an event listener with the current
'llvm::ExecutionEngine'. Then the engine is cached but the
'LlvmCodeGen' object, which owns the symbol emitter, is destroyed at the
end of the query. When the cached execution engine is destroyed later,
it tries to notify the symbol emitter, but it has already been destroyed
so its pointer is invalid.

This change solves the problem by wrapping the execution engine and the
symbol emitter together in a wrapper class, LlvmExecutionEngineWrapper,
that is responsible for managing their lifetimes. The LlvmCodeGen and
the CodeGenCache classes now hold shared pointers to this wrapper class.
If we add other objects in the future whose lifetimes are tied to the
execution engine (but are not owned by it), they should be put into the
wrapper class.

Testing:
 - added regression tests in tests/custom_cluster/test_codegen_cache.py
   that fail without this change.

Change-Id: I23f871abb962ad317f9c0075ca303c09dd56bcd9
Reviewed-on: http://gerrit.cloudera.org:8080/20155
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-07-13 16:50:44 +00:00
Joe McDonnell
eb66d00f9f IMPALA-11974: Fix lazy list operators for Python 3 compatibility
Python 3 changes list operators such as range, map, and filter
to be lazy. Some code that expects the list operators to happen
immediately will fail. e.g.

Python 2:
range(0,5) == [0,1,2,3,4]
True

Python 3:
range(0,5) == [0,1,2,3,4]
False

The fix is to wrap locations with list(). i.e.

Python 3:
list(range(0,5)) == [0,1,2,3,4]
True

Since the base operators are now lazy, Python 3 also removes the
old lazy versions (e.g. xrange, ifilter, izip, etc). This uses
future's builtins package to convert the code to the Python 3
behavior (i.e. xrange -> future's builtins.range).
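
The before/after behavior can be checked directly on Python 3; each
line below is self-checking:

```python
# A lazy range object never compares equal to a list...
assert range(0, 5) != [0, 1, 2, 3, 4]
# ...so the fix is to wrap it in list().
assert list(range(0, 5)) == [0, 1, 2, 3, 4]

# map and filter are lazy in the same way and need the same wrapping.
assert list(map(abs, [-1, 2])) == [1, 2]
assert list(filter(None, [0, 1, 2])) == [1, 2]
```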

Most of the changes were done via these futurize fixes:
 - libfuturize.fixes.fix_xrange_with_import
 - lib2to3.fixes.fix_map
 - lib2to3.fixes.fix_filter

This eliminates the pylint warnings:
 - xrange-builtin
 - range-builtin-not-iterating
 - map-builtin-not-iterating
 - zip-builtin-not-iterating
 - filter-builtin-not-iterating
 - reduce-builtin
 - deprecated-itertools-function

Testing:
 - Ran core job

Change-Id: Ic7c082711f8eff451a1b5c085e97461c327edb5f
Reviewed-on: http://gerrit.cloudera.org:8080/19589
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
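
The division change can be demonstrated in a few self-checking lines
(the __future__ import is a no-op on Python 3 but switches Python 2 to
the same behavior):

```python
from __future__ import division

# True division always yields a float, even for two ints
# (on Python 2 without the import, 7 / 2 would round down to 3)...
assert 7 / 2 == 3.5
# ...so code that needs an integer result (indices, record counts)
# must use the explicit floor-division operator.
assert 7 // 2 == 3
```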

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Yida Wu
e15610633e IMPALA-11965: Fix TestCodegenCache failure when codegen cache disabled by default
The patch fixes the TestCodegenCache failure that occurs when the
codegen cache is changed to be disabled by default, because the test
assumes the codegen cache is enabled with the default settings.

The solution is to specify a default value for codegen_cache_capacity
in the test's start options, which manually ensures the codegen cache
is on during the test.

Tests:
Passed TestCodegenCache in the exhaustive run with codegen cache
disabled by default.

Change-Id: I749a6ba68553834bdea908741aa7449ed32cd569
Reviewed-on: http://gerrit.cloudera.org:8080/19574
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-03-05 16:59:37 +00:00
Yida Wu
4bdb99938a IMPALA-11470: Add Cache For Codegen Functions
The patch adds supports of the cache for CodeGen functions
to improve the performance of sub-second queries.

The main idea is to store the codegen functions to a cache,
and reuse them when it is appropriate to avoid repeated llvm
optimization time which could take over hundreds of milliseconds.

In this patch, we implement a cache to store codegen functions. The
cache is a singleton instance per daemon and contains multiple cache
entries. Each cache entry is at the fragment level, i.e. it stores all
the codegen functions of one fragment; if the exact same fragment comes
again, it should find all the codegen functions it needs in that cache
entry, thereby saving time.

The module bitcode is used as the cache key; it is generated before
module optimization and final compilation. If codegen_cache_mode is
NORMAL, which is the default, we store the full bitcode string as the
key. If codegen_cache_mode is set to OPTIMAL, we store a key containing
only the hash code and the total length of the full key, to reduce
memory consumption.
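
The two key modes can be sketched as follows (the function name and
the choice of SHA-256 are illustrative, not Impala's actual hash):

```python
import hashlib


def make_cache_key(bitcode, mode):
    """NORMAL keeps the full module bitcode as the key; OPTIMAL keeps
    only (hash, length) of the full key, trading a small collision
    risk for much lower memory consumption per entry."""
    if mode == 'NORMAL':
        return bitcode
    # OPTIMAL: hash code plus the total length of the full key.
    return (hashlib.sha256(bitcode).digest(), len(bitcode))
```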

Also, KrpcDataStreamSenderConfig::CodegenHashRow() is changed to
pass the hash seed as an argument because it can't hit the cache
for the fragment if using a dynamic hash seed within the codegen
function.

The codegen cache is disabled automatically for a fragment that uses a
native UDF, because caching can lead to a crash in this case. The
reason is that the UDF is loaded into the llvm execution engine's
global mapping rather than into the llvm module, while the current
cache key is the llvm module bitcode, which cannot reflect a change of
the UDF address if the UDF is reloaded at runtime (for example, on
database recreation). An old UDF address could then be served from the
cache, leading to a crash. Caching is disabled here until there is a
better solution; IMPALA-11771 is filed to follow up.

The patch also introduces following new flags for start and query
options for feature configuration and operation purpose.
Start option for configuration:
  - codegen_cache_capacity: The capacity of the cache, if set to 0,
    codegen cache is disabled.

Query option for operations:
  - disable_codegen_cache: Codegen cache will be disabled when it
    is set to true.

  - codegen_cache_mode: Defined by a new enum type TCodeGenCacheMode.
    There are four values: NORMAL and OPTIMAL, plus NORMAL_DEBUG and
    OPTIMAL_DEBUG, the debug modes of the first two.
    With NORMAL, the full key is stored in the cache. This costs more
    memory per entry because the key is the bitcode of the llvm
    module, which can be large.
    With OPTIMAL, the cache stores only the hash code and length of
    the key. This greatly reduces memory consumption, but hash
    collisions become possible.
    The debug modes behave the same as the non-debug modes but allow
    more logging and statistics, which can be slower.
    Only valid when disable_codegen_cache is set to false.

New impalad metrics:
  - impala.codegen-cache.misses
  - impala.codegen-cache.entries-in-use
  - impala.codegen-cache.entries-in-use-bytes
  - impala.codegen-cache.entries-evicted
  - impala.codegen-cache.hits
  - impala.codegen-cache.entry-sizes

New profile Metrics:
  - CodegenCacheLookupTime
  - CodegenCacheSaveTime
  - ModuleBitcodeGenTime
  - NumCachedFunctions

TPCH-1 performance evaluation (8 iteration) on AWS m5a.4xlarge,
the result removes the first iteration to show the benefit of the
cache:
Query     Cached(s) NoCache(s) Delta(Avg) NoCodegen(s)  Delta(Avg)
TPCH-Q1    0.39      1.02       -61.76%     5.59         -93.02%
TPCH-Q2    0.56      1.21       -53.72%     0.47         19.15%
TPCH-Q3    0.37      0.77       -51.95%     0.43         -13.95%
TPCH-Q4    0.36      0.51       -29.41%     0.33         9.09%
TPCH-Q5    0.39      1.1        -64.55%     0.39         0%
TPCH-Q6    0.24      0.27       -11.11%     0.77         -68.83%
TPCH-Q7    0.39      1.2        -67.5%      0.39         0%
TPCH-Q8    0.58      1.46       -60.27%     0.45         28.89%
TPCH-Q9    0.8       1.38       -42.03%     1            -20%
TPCH-Q10   0.6       1.03       -41.75%     0.85         -29.41%
TPCH-Q11   0.3       0.93       -67.74%     0.2          50%
TPCH-Q12   0.28      0.48       -41.67%     0.38         -26.32%
TPCH-Q13   1.11      1.22       -9.02%      1.16         -4.31%
TPCH-Q14   0.55      0.78       -29.49%     0.45         22.22%
TPCH-Q15   0.33      0.73       -54.79%     0.44         -25%
TPCH-Q16   0.32      0.78       -58.97%     0.41         -21.95%
TPCH-Q17   0.56      0.84       -33.33%     0.89         -37.08%
TPCH-Q18   0.54      0.92       -41.3%      0.89         -39.33%
TPCH-Q19   0.35      2.34       -85.04%     0.35         0%
TPCH-Q20   0.34      0.98       -65.31%     0.31         9.68%
TPCH-Q21   0.83      1.14       -27.19%     0.86         -3.49%
TPCH-Q22   0.26      0.52       -50%        0.25         4%

The results show pretty good performance compared to codegen without
the cache (the default setting). However, compared to codegen
disabled, the cache is, as expected, not always faster for short
queries: it still needs some time to prepare the codegen functions and
to generate the module bitcode used as the key. If that preparation
takes longer than the benefit from the codegened functions, especially
for extremely short queries, the result can be slower than not using
codegen at all. There may be room to improve this in the future.

We also tested the total cache entry size for the TPC-H queries. The
data below shows the total codegen cache used by each query. The
optimal mode is very helpful in reducing the cache size; the reason is
the much smaller key mentioned above, which is the only difference
between the two modes.

Query     Normal(KB)  Optimal(KB)
TPCH-Q1     604.1       50.9
TPCH-Q2     973.4       135.5
TPCH-Q3     561.1       36.5
TPCH-Q4     423.3       41.1
TPCH-Q5     866.9       93.3
TPCH-Q6     295.9       4.9
TPCH-Q7     1105.4      124.5
TPCH-Q8     1382.6      211
TPCH-Q9     1041.4      119.5
TPCH-Q10    738.4       65.4
TPCH-Q11    1201.6      136.3
TPCH-Q12    452.8       46.7
TPCH-Q13    541.3       48.1
TPCH-Q14    696.8       102.8
TPCH-Q15    1148.1      95.2
TPCH-Q16    740.6       77.4
TPCH-Q17    990.1       133.4
TPCH-Q18    376         70.8
TPCH-Q19    1280.1      179.5
TPCH-Q20    1260.9      180.7
TPCH-Q21    722.5       66.8
TPCH-Q22    713.1       49.8

Tests:
Ran exhaustive tests.
Added E2e testcase TestCodegenCache.
Added unit testcase LlvmCodeGenCacheTest.

Change-Id: If42c78a7f51fd582e5fe331fead494dadf544eb1
Reviewed-on: http://gerrit.cloudera.org:8080/19181
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-12-07 21:57:46 +00:00