7 Commits

Author SHA1 Message Date
Joe McDonnell
ce8078204e IMPALA-13111: Fix the calculation of fragment ids for impala-gdb.py
The gdb helpers in impala-gdb.py provide functions to look on
the stack for the information added in IMPALA-6416 and get the
fragment/query ids. Right now, it is incorrectly using a signed
integer, which leads to incorrect ids like this:
-3cbda1606b3ade7c:f170c4bd00000000

This changes the logic to AND the integer with an 0xFF* sequence
of the right length. This forces the integer to be unsigned,
producing the right query id.

Testing:
 - Ran this on a minidump and verified the the listed query ids
   were valid (and existed in the profile log)

Change-Id: I59798407e99ee0e9100cac6b4b082cdb85ed43d1
Reviewed-on: http://gerrit.cloudera.org:8080/21472
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-05-31 22:57:14 +00:00
Joe McDonnell
aa4050b4d9 IMPALA-11976: Fix use of deprecated functions/fields removed in Python 3
Python 3 moved several things around or removed deprecated
functions / fields:
 - sys.maxint was removed, but sys.maxsize provides similar functionality
 - long was removed, but int provides the same range
 - file() was removed, but open() already provided the same functionality
 - Exception.message was removed, but str(exception) is equivalent
 - Some encodings (like hex) were moved to codecs.encode()
 - string.letters -> string.ascii_letters
 - string.lowercase -> string.ascii_lowercase
 - string.strip was removed

This fixes all of those locations. Python 3 also has slightly different
rounding behavior from round(), so this changes round() to use future's
builtins.round() to get the Python 3 behavior.

This fixes the following pylint warnings:
 - file-builtin
- long-builtin
- invalid-str-codec
- round-builtin
- deprecated-string-function
- sys-max-int
- exception-message-attribute

Testing:
 - Ran cores tests

Change-Id: I094cd7fd06b0d417fc875add401d18c90d7a792f
Reviewed-on: http://gerrit.cloudera.org:8080/19591
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Joe McDonnell
2b550634d2 IMPALA-11952 (part 2): Fix print function syntax
Python 3 now treats print as a function and requires
the parenthesis in invocation.

print "Hello World!"
is now:
print("Hello World!")

This fixes all locations to use the function
invocation. This is more complicated when the output
is being redirected to a file or when avoiding the
usual newline.

print >> sys.stderr , "Hello World!"
is now:
print("Hello World!", file=sys.stderr)

To support this properly and guarantee equivalent behavior
between python 2 and python 3, all files that use print
now add this import:
from __future__ import print_function

This also fixes random flake8 issues that intersect with
the changes.

Testing:
 - check-python-syntax.sh shows no errors related to print

Change-Id: Ib634958369ad777a41e72d80c8053b74384ac351
Reviewed-on: http://gerrit.cloudera.org:8080/19552
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2023-02-28 17:11:50 +00:00
Philip Zeyliger
13dfdc64db IMPALA-6664: Tag log statements with fragment or query ids.
This implements much of the desire in IMPALA-6664 to tag all log
statements with their query ids. It re-uses the existing ThreadDebugInfo
infrastructure as well as the existing
InstallLogMessageListenerFunction() patch to glog (currently used for
log redaction) to prefix log messages with fragment ids or query ids,
when available. The fragment id is the query id with the last bits
incremented, so it's possible to correlate a given query's log messages.
For example:

  $ grep 85420d575b9ff4b9:402b8868 logs/cluster/impalad.INFO
  I0108 10:39:16.453958 14752 impala-server.cc:1052] 85420d575b9ff4b9:402b886800000000] Registered query query_id=85420d575b9ff4b9:402b886800000000 session_id=aa45e480434f0516:101ae5ac12679d94
  I0108 10:39:16.454738 14752 Frontend.java:1242] 85420d575b9ff4b9:402b886800000000] Analyzing query: select count(*) from tpcds.web_sales
  I0108 10:39:16.456627 14752 Frontend.java:1282] 85420d575b9ff4b9:402b886800000000] Analysis finished.
  I0108 10:39:16.463538 14818 admission-controller.cc:598] 85420d575b9ff4b9:402b886800000000] Schedule for id=85420d575b9ff4b9:402b886800000000 in pool_name=default-pool per_host_mem_estimate=180.02 MB PoolConfig: max_requests=-1 max_queued=200 max_mem=-1.00 B
  I0108 10:39:16.463603 14818 admission-controller.cc:603] 85420d575b9ff4b9:402b886800000000] Stats: agg_num_running=0, agg_num_queued=0, agg_mem_reserved=0,  local_host(local_mem_admitted=0, num_admitted_running=0, num_queued=0, backend_mem_reserved=0)
  I0108 10:39:16.463780 14818 admission-controller.cc:635] 85420d575b9ff4b9:402b886800000000] Admitted query id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:16.463896 14818 coordinator.cc:93] 85420d575b9ff4b9:402b886800000000] Exec() query_id=85420d575b9ff4b9:402b886800000000 stmt=select count(*) from tpcds.web_sales
  I0108 10:39:16.464795 14818 coordinator.cc:356] 85420d575b9ff4b9:402b886800000000] starting execution on 2 backends for query_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:16.466384 24891 impala-internal-service.cc:49] ExecQueryFInstances(): query_id=85420d575b9ff4b9:402b886800000000 coord=pannier.sf.cloudera.com:22000 #instances=2
  I0108 10:39:16.467339 14818 coordinator.cc:370] 85420d575b9ff4b9:402b886800000000] started execution on 2 backends for query_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:16.467536 14823 query-state.cc:579] 85420d575b9ff4b9:402b886800000000] Executing instance. instance_id=85420d575b9ff4b9:402b886800000000 fragment_idx=0 per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=1
  I0108 10:39:16.467627 14824 query-state.cc:579] 85420d575b9ff4b9:402b886800000001] Executing instance. instance_id=85420d575b9ff4b9:402b886800000001 fragment_idx=1 per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=2
  I0108 10:39:16.820933 14824 query-state.cc:587] 85420d575b9ff4b9:402b886800000001] Instance completed. instance_id=85420d575b9ff4b9:402b886800000001 #in-flight=1 status=OK
  I0108 10:39:17.122299 14823 krpc-data-stream-mgr.cc:294] 85420d575b9ff4b9:402b886800000000] DeregisterRecvr(): fragment_instance_id=85420d575b9ff4b9:402b886800000000, node=2
  I0108 10:39:17.123500 24038 coordinator.cc:709] Backend completed: host=pannier.sf.cloudera.com:22001 remaining=2 query_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:17.123509 24038 coordinator-backend-state.cc:265] query_id=85420d575b9ff4b9:402b886800000000: first in-progress backend: pannier.sf.cloudera.com:22000
  I0108 10:39:17.167752 14752 impala-beeswax-server.cc:197] 85420d575b9ff4b9:402b886800000000] get_results_metadata(): query_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:17.168762 14752 coordinator.cc:483] 85420d575b9ff4b9:402b886800000000] ExecState: query id=85420d575b9ff4b9:402b886800000000 execution completed
  I0108 10:39:17.168808 14752 coordinator.cc:608] 85420d575b9ff4b9:402b886800000000] Coordinator waiting for backends to finish, 1 remaining. query_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:17.168880 14823 query-state.cc:587] 85420d575b9ff4b9:402b886800000000] Instance completed. instance_id=85420d575b9ff4b9:402b886800000000 #in-flight=0 status=OK
  I0108 10:39:17.168977 14821 query-state.cc:252] UpdateBackendExecState(): last report for 85420d575b9ff4b9:402b886800000000
  I0108 10:39:17.174401 24038 coordinator.cc:709] Backend completed: host=pannier.sf.cloudera.com:22000 remaining=1 query_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:17.174513 14752 coordinator.cc:814] 85420d575b9ff4b9:402b886800000000] Release admission control resources for query_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:17.174815 14821 query-state.cc:604] Cancel: query_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:17.174837 14821 krpc-data-stream-mgr.cc:325] cancelling all streams for fragment_instance_id=85420d575b9ff4b9:402b886800000001
  I0108 10:39:17.174856 14821 krpc-data-stream-mgr.cc:325] cancelling all streams for fragment_instance_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:17.179621 14752 impala-beeswax-server.cc:239] 85420d575b9ff4b9:402b886800000000] close(): query_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:17.179651 14752 impala-server.cc:1131] 85420d575b9ff4b9:402b886800000000] UnregisterQuery(): query_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:17.179666 14752 impala-server.cc:1238] 85420d575b9ff4b9:402b886800000000] Cancel(): query_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:17.179814 14752 coordinator.cc:684] 85420d575b9ff4b9:402b886800000000] CancelBackends() query_id=85420d575b9ff4b9:402b886800000000, tried to cancel 0 backends
  I0108 10:39:17.203898 14752 query-exec-mgr.cc:184] 85420d575b9ff4b9:402b886800000000] ReleaseQueryState(): deleted query_id=85420d575b9ff4b9:402b886800000000
  I0108 10:39:18.108947 14752 impala-server.cc:1993] 85420d575b9ff4b9:402b886800000000] Connection from client ::ffff:172.16.35.186:52096 closed, closing 1 associated session(s)
  I0108 10:39:18.108996 14752 impala-server.cc:1249] 85420d575b9ff4b9:402b886800000000] Closing session: aa45e480434f0516:101ae5ac12679d94
  I0108 10:39:18.109035 14752 impala-server.cc:1291] 85420d575b9ff4b9:402b886800000000] Closed session: aa45e480434f0516:101ae5ac12679d94

There are a few caveats here: the thread state isn't "scoped", so the "Closing
session" log statement is technically not part of the query. When that thread
is re-used for another query, it corrects itself. Some threads, like 14821,
aren't using the thread locals. In some case, we should go back and
add GetThreadDebugInfo()->SetQueryId(...) statements.

I've used this to debug some crashes (of my own doing) while running
parallel tests, and it's been quite helpful.

An alternative would be to use Kudu's be/src/kudu/util/async_logger.h,
and add the "Listener" functionality to it directly. Another alternative
would be to re-write all the *LOG macros, but this is quite painful (and
presumably was rejected when log redaction was introduced).

I changed thread-debug-info to capture TUniqueId (a thrift struct with
two int64s) rather than the string representation. This made it easier
to compare with the "0:0" id, which we treat as "unset".  If a developer
needs to analyze it from a debugger, gdb can print out hex just fine.

I added some context to request-context to be able to pipe ids through
to disk IO threads as well.

To test this, I moved "assert_log_contains" up to impala_test_suite, and
had it handle the default log location case. The test needs a sleep for
log buffering, but, it seems like a test with a sleep running in
parallel is better than a custom cluster test, which reboots the cluster
(and loads metadata).

Change-Id: I6634ef9d1a7346339f24f2d40a7a3aa36a535da8
Reviewed-on: http://gerrit.cloudera.org:8080/12129
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-01-25 00:47:09 +00:00
Zoltan Borok-Nagy
d19f757641 IMPALA-7542: fix find-fragment-instances to find all "root threads"
find-fragment-instances didn't show all the threads
that worked on some fragment instance. It missed the
top-level "root threads" that started working on the
fragment instances.

I modified the get_fragment_instances() function to
check the local ThreadDebugInfo object of the threads
instead of checking the parent thread's ThreadDebugInfo.

I tested it locally on a core file.

Change-Id: I35ae1a6b384b002b343689469f02ceabd84af1b6
Reviewed-on: http://gerrit.cloudera.org:8080/11396
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-08 00:01:14 +00:00
Zoram Thanga
fe613e3247 IMPALA-7508: Add Impala Python GDB module
This patch adds a new Impala Python GDB module, and implements
a couple of covenience commands to make core dump analysis
easier.

The initial commands let us find queries and fragment instances
currently executing in an impalad at the time the daemon crashed:

(gdb) source impala-gdb.py
(gdb) find-query-ids
f74c863dff66a34d:1d983cc300000000
364525e12495932b:73f5dd0200000000
bc4a3eec25481981:edda04b800000000

(gdb) find-fragment-instances
Fragment Instance Id    Thread IDs

364525e12495932b:73f5dd02000000a2       [69]
364525e12495932b:73f5dd0200000171       [196, 136]
bc4a3eec25481981:edda04b8000001a8       [252, 237, 206]
f74c863dff66a34d:1d983cc30000009b       [200, 14, 13, 12, 6, 5, 3, 2]
f74c863dff66a34d:1d983cc30000013a       [4]

The commands have only been tested with Impala 2.12, and are not
expected to work with older versions since it uses ThreadDebugInfo
stuff from IMPALA-6416.

It is hoped that people contribute more commands to the module.

Change-Id: I24e7026f2265954ed592d6f62110cf8cb2c2202a
Reviewed-on: http://gerrit.cloudera.org:8080/11352
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-01 03:54:29 +00:00