2 Commits

Author SHA1 Message Date
stiga-huang
2f9abc4e80 IMPALA-14000: Dump jstacks first in dump-stacktraces.sh
bin/dump-stacktraces.sh collects pstacks and jstacks of the cluster. It's
used when some tests time out. Collecting pstacks can take a long time and
fail partway through, in which case the jstacks are never collected. This
change makes the script collect jstacks first.

Also adds -c to the "thread apply all bt" gdb command to continue past
an error.
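
The new ordering can be sketched as below. This is an illustration in
Python, not the contents of dump-stacktraces.sh: the helper names
(jstack_cmd, gdb_backtrace_cmd, plan_dumps, run_dumps) are hypothetical,
and the exact gdb invocation used by the script is assumed.

```python
import subprocess


def jstack_cmd(pid):
    """Command that prints the Java thread dump of process `pid`."""
    return ["jstack", str(pid)]


def gdb_backtrace_cmd(pid):
    """Command that prints native backtraces of all threads of `pid`.

    The -c flag makes "thread apply" report an error for one thread and
    continue with the remaining threads instead of aborting.
    """
    # Assumed invocation; the real script may pass different gdb options.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all -c bt"]


def plan_dumps(java_pids, native_pids):
    """Order the work so that every jstack runs before any gdb backtrace."""
    return ([jstack_cmd(p) for p in java_pids] +
            [gdb_backtrace_cmd(p) for p in native_pids])


def run_dumps(commands, runner=subprocess.call):
    """Run the commands in order; one failure does not stop the rest."""
    failed = []
    for cmd in commands:
        try:
            if runner(cmd) != 0:
                failed.append(cmd)
        except OSError:  # e.g. the tool is not installed
            failed.append(cmd)
    return failed
```

Because the cheap jstack commands come first in the plan, a slow or
failing gdb run can no longer prevent them from being collected.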

Change-Id: I8f610ee4d4934fe950a9f56cf74a7e76e5d63651
Reviewed-on: http://gerrit.cloudera.org:8080/22826
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-06 18:24:51 +00:00
stiga-huang
90944d7340 IMPALA-10369: Dump server stacktraces when test_concurrent_ddls.py times out
Recently, we have seen many timeout failures of test_concurrent_ddls.py in
S3 builds, e.g. IMPALA-10280, IMPALA-10301, IMPALA-10363. It'd be helpful
to dump the server stacktraces so we can understand why some RPCs are
slow/stuck.

This patch extracts the logic of dumping stacktraces in
script-timeout-check.sh to a separate script, dump-stacktraces.sh.
The script also dumps jstacks of HMS and NameNode. Dumping all these
stacktraces is time-consuming so we do them in parallel, which also
helps to get consistent snapshots of all servers.
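
A minimal sketch of the parallel collection idea, written in Python for
illustration only (the patch itself is a shell script). The names
dump_one and dump_all, the per-command timeout, and the demo commands
are assumptions, not part of the patch.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def dump_one(name, cmd, timeout_s=60):
    """Run one dump command and return (name, captured output)."""
    try:
        result = subprocess.run(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT,
                                universal_newlines=True, timeout=timeout_s)
        return name, result.stdout
    except (OSError, subprocess.TimeoutExpired) as e:
        # A missing tool or a hung dump must not block the other servers.
        return name, "failed to dump: %s" % e


def dump_all(commands):
    """Start every dump at (almost) the same time and wait for all of them.

    `commands` maps a server name to the argv list that dumps its stacks.
    """
    if not commands:
        return {}
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        futures = [pool.submit(dump_one, name, cmd)
                   for name, cmd in commands.items()]
        return dict(f.result() for f in futures)


if __name__ == "__main__":
    # Stand-in commands; a real run would invoke jstack or gdb per server.
    demo = {
        "hms": [sys.executable, "-c", "print('hms stacks')"],
        "namenode": [sys.executable, "-c", "print('namenode stacks')"],
    }
    for server, output in dump_all(demo).items():
        print(server, "->", output.strip())
```

Starting all dumps together shortens the total wall time to roughly that
of the slowest dump and keeps the snapshots close to each other in time.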

When any test in test_concurrent_ddls.py times out, we use
dump-stacktraces.sh to dump the stacktraces before exiting. Previously,
some tests depended on pytest.mark.timeout to detect timeouts, which
makes it hard to add a custom callback for dumping server stacktraces.
So this patch refactors test_concurrent_ddls.py to use only the timeout
of multiprocessing.
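
The pattern can be sketched as follows. This is not the code in
test_concurrent_ddls.py: run_with_timeout and the on_timeout callback
are hypothetical names, and the "fork" start method assumes a POSIX
platform such as the Linux hosts the tests run on.

```python
import multiprocessing
import time


def run_with_timeout(worker, args_list, timeout_s, on_timeout):
    """Run worker(*args) in a process pool under one shared deadline.

    If any result is not ready by the deadline, call on_timeout() (for
    example, to dump server stacktraces) and re-raise the TimeoutError.
    """
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX only
    pool = ctx.Pool(processes=len(args_list))
    try:
        pending = [pool.apply_async(worker, args) for args in args_list]
        deadline = time.monotonic() + timeout_s
        results = []
        for res in pending:
            remaining = max(0.0, deadline - time.monotonic())
            results.append(res.get(timeout=remaining))
        return results
    except multiprocessing.TimeoutError:
        # The hook runs while the stuck workers are still alive.
        on_timeout()
        raise
    finally:
        pool.terminate()
        pool.join()
```

Since the timeout is detected in the test's own code rather than by a
pytest plugin, the test controls what happens before the failure is
reported, which is where the stacktrace dump fits in.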

Tests:
 - Tested the scripts locally.
 - Verified the error handling of the timeout logic in Jenkins jobs.

Change-Id: I514cf2d0ff842805c0abf7211f2a395151174173
Reviewed-on: http://gerrit.cloudera.org:8080/16800
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-03 08:05:23 +00:00