4 Commits

Author SHA1 Message Date
stiga-huang
ff8bb33b91 IMPALA-12870: Tag query id for Java pool threads
Logs from Java threads running in ExecutorService are missing the query
id which is stored in the C++ thread-local ThreadDebugInfo variable.
This patch adds JNI calls for Java threads to manage the ThreadDebugInfo
variable. Currently two thread pools are changed:
 - MissingTable loading pool in StmtMetadataLoader.parallelTableLoad().
 - Table loading pool in TableLoadingMgr.

MissingTable loading pool only lives within the parallelTableLoad()
method. So we initialize ThreadDebugInfo with the queryId at the
beginning of the thread and delete it at the end of the thread. Note
that a thread might be reused to load different tables, but they all
belong to the same query.

Table loading pool is a long-running pool in catalogd that never
shuts down. Threads in it are used to load tables triggered by different
queries. We initialize ThreadDebugInfo as above, but update it when
the thread starts loading a table for a different query id, and reset it
when the loading is done. The query id is passed down from the catalogd
RPC request headers.
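
The set/reset pattern for the long-running pool can be sketched in Python.
This is only an analogy under stated assumptions: the real patch uses JNI
calls from Java threads into the C++ ThreadDebugInfo, while here a
threading.local object stands in for it. The names _debug_info,
run_tagged, load_table and demo are hypothetical and do not appear in
the patch.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the C++ thread-local ThreadDebugInfo.
_debug_info = threading.local()


def current_query_id():
    """Return the query id tagged on the calling thread, or None."""
    return getattr(_debug_info, "query_id", None)


def run_tagged(query_id, fn, *args):
    """Tag the pool thread with query_id while fn runs, then reset it."""
    _debug_info.query_id = query_id
    try:
        return fn(*args)
    finally:
        # Reset so a reused thread does not log a stale query id.
        _debug_info.query_id = None


def load_table(name):
    """Pretend to load a table and return the log line it would emit."""
    return "[{}] loading {}".format(current_query_id(), name)


def demo():
    # One worker, so the same thread serves loads from two queries.
    with ThreadPoolExecutor(max_workers=1) as pool:
        first = pool.submit(run_tagged, "q1", load_table, "t1").result()
        second = pool.submit(run_tagged, "q2", load_table, "t2").result()
        idle = pool.submit(current_query_id).result()
    return first, second, idle
```

Resetting in a finally block mirrors the "reset it when the loading is
done" step, so an idle thread in the pool carries no query id.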

Tests:
 - Added e2e test to verify the logs.
 - Ran existing CORE tests.

Change-Id: I83cca55edc72de35f5e8c5422efc104e6aa894c1
Reviewed-on: http://gerrit.cloudera.org:8080/23558
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-10-23 03:35:29 +00:00
Csaba Ringhofer
f98b697c7b IMPALA-13929: Make 'functional-query' the default workload in tests
This change adds get_workload() to ImpalaTestSuite and removes it
from all test suites that already returned 'functional-query'.
get_workload() is also removed from CustomClusterTestSuite which
used to return 'tpch'.

All other changes besides impala_test_suite.py and
custom_cluster_test_suite.py are just mass removals of
get_workload() functions.
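
A minimal sketch of the resulting inheritance, assuming get_workload()
is a classmethod. The classes below are simplified stand-ins rather than
the actual contents of impala_test_suite.py or
custom_cluster_test_suite.py, and TestTpchLike is a hypothetical suite
that keeps its own override.

```python
class ImpalaTestSuite(object):
    @classmethod
    def get_workload(cls):
        # Default workload inherited by every suite that does not override it.
        return 'functional-query'


class CustomClusterTestSuite(ImpalaTestSuite):
    # No get_workload() here any more, so the base class default applies.
    pass


class TestTpchLike(ImpalaTestSuite):
    # Hypothetical suite that still needs a non-default workload.
    @classmethod
    def get_workload(cls):
        return 'tpch'
```

With this shape, CustomClusterTestSuite.get_workload() yields
'functional-query' while TestTpchLike.get_workload() still yields 'tpch'.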

The behavior is only changed in custom cluster tests that didn't
override get_workload(). By returning 'functional-query' instead
of 'tpch', exploration_strategy() will no longer return 'core' in
'exhaustive' test runs. See IMPALA-3947 for why the workload affected
exploration_strategy(). An example of an affected test is
TestCatalogHMSFailures, which was skipped in both core and exhaustive
runs before this change.

get_workload() functions that return a different workload than
'functional-query' are not changed - it is possible that some of
these also don't handle exploration_strategy() as expected, but
individually checking these tests is out of scope in this patch.

Change-Id: I9ec6c41ffb3a30e1ea2de773626d1485c69fe115
Reviewed-on: http://gerrit.cloudera.org:8080/22726
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-08 07:12:55 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted some of them
to use the integer division '//' operator where the code needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
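
As a small illustration of the division change, assuming a module that
starts with the new import line: the helper names midpoint_index and
average are hypothetical and not taken from the patch.

```python
from __future__ import absolute_import, division, print_function


def midpoint_index(items):
    # An index must be an integer, so use floor division explicitly.
    return len(items) // 2


def average(values):
    # With "division" imported, '/' is true division on Python 2 as well.
    return sum(values) / len(values)
```

On both Python 2 and Python 3, midpoint_index([1, 2, 3, 4, 5]) gives 2
and average([1, 2]) gives 1.5. Without the import, Python 2 would
truncate the average to 1.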

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Sahil Takiar
155ffd64ee IMPALA-9046: Profile counter that indicates if a JVM pause occurred
Adds a new section to the host profiles that includes JVM GC related
metrics. These metrics are taken from JMX and the JvmPauseMonitor.

The host profiles will now include a section like below:

        JVM:
           - GcCount: 19
           - GcNumInfoThresholdExceeded: 0
           - GcNumWarnThresholdExceeded: 0
           - GcTimeMillis: 17s476ms
           - GcTotalExtraSleepTimeMillis: 380

GcNumInfoThresholdExceeded, GcNumWarnThresholdExceeded, and
GcTotalExtraSleepTimeMillis are all taken from JvmPauseMonitor.
GcCount and GcTimeMillis are taken from JMX (specifically,
GarbageCollectorMXBean).
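
For anyone checking these counters in a test or a script, a minimal
sketch of pulling them out of the profile text follows. It assumes the
"- Name: value" layout shown in the sample above. parse_counters is a
hypothetical helper, not part of Impala, and it leaves values as strings
because GcTimeMillis is pretty-printed.

```python
def parse_counters(section_text):
    """Map counter names to their raw string values."""
    counters = {}
    for line in section_text.splitlines():
        line = line.strip()
        # Counter lines look like "- GcCount: 19"; skip headers such as "JVM:".
        if line.startswith("- ") and ": " in line:
            name, value = line[2:].split(": ", 1)
            counters[name] = value
    return counters


SAMPLE = """
        JVM:
           - GcCount: 19
           - GcNumInfoThresholdExceeded: 0
           - GcNumWarnThresholdExceeded: 0
           - GcTimeMillis: 17s476ms
           - GcTotalExtraSleepTimeMillis: 380
"""
```

Calling parse_counters(SAMPLE) returns a dict with the five counter
names as keys.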

The counters themselves are derived from the impalad host-level metrics.

Changed the 'lock_' in JvmMetricCache (in memory-metrics.h) from a mutex
to a shared_mutex. Most accessors of the JvmMetricCache member variables
are read-only. A write only occurs lazily at most every second. This
should help reduce lock contention on JvmMetricCache now that all
queries will start accessing info stored by the JvmMetricCache.

Testing:
* Ran core tests
* Added a test that runs Java UDF, which triggers JVM GC

Change-Id: Idbaae2f9142b8be94532a0a147668a3d96091b0b
Reviewed-on: http://gerrit.cloudera.org:8080/16414
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Reviewed-by: Sahil Takiar <stakiar@cloudera.com>
Tested-by: Sahil Takiar <stakiar@cloudera.com>
2020-09-22 23:45:57 +00:00