IMPALA-12224: Improve error handling for shell interactive tests

Interactive shell tests can hang waiting for input if the
shell process hits errors or exits. For example, the problems
in the sasl package seen in IMPALA-12220 cause test_shell_interactive.py
to hang.

This improves the error detection/handling to avoid hangs for
the most common shell errors. Specifically, it adds a check for
the impala-shell process exiting, and it adds a check for
a failure to connect to Impala. Both would previously result
in hangs.
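
The two checks can be illustrated outside the test framework. The following is a minimal sketch, not the code from this change: the names wait_until_connected, spawn, CONNECTED and CONNECT_ERRORS are hypothetical, the marker strings are the ones quoted in the code comments of this change, and a small Python child process stands in for impala-shell.

```python
import subprocess
import sys
import time

# Marker strings quoted in this change's code comments. The tuple and
# constant names are hypothetical, chosen for this sketch only.
CONNECTED = "Connected to"
CONNECT_ERRORS = ("Error connecting", "Socket error")


def wait_until_connected(proc, timeout=10.0):
    """Return (status, detail); status is 'connected', 'exited', 'error' or 'timeout'."""
    start = time.time()
    while time.time() - start < timeout:
        # Check 1: poll() returns None while the process is still running.
        returncode = proc.poll()
        if returncode is not None:
            return "exited", returncode
        # readline() can still block if the process stays alive and silent,
        # so the timeout is only checked between lines.
        line = proc.stderr.readline()
        # Check 2: the process reported a connection failure.
        if any(msg in line for msg in CONNECT_ERRORS):
            return "error", line.strip()
        # Success: the process reported a connection.
        if CONNECTED in line:
            return "connected", line.strip()
    return "timeout", None


def spawn(script):
    """Start a stand-in for the shell: a Python child that runs `script`."""
    return subprocess.Popen([sys.executable, "-c", script],
                            stderr=subprocess.PIPE, universal_newlines=True)
```

Returning a status instead of asserting keeps the sketch easy to exercise; a caller would assert on the status, much as the change asserts on process_status, connection_err and connected.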

Testing:
 - Verified by hand that test_shell_interactive.py doesn't
   hang
 - Removed a vital import from impala-shell so it exits instantly
 - Simulated a connection problem by overwriting the port
   with a non-functional port
 - Tested on Red Hat 9 with the IMPALA-12220 issue

Change-Id: I7556fb687e06b41caa538d8c3231ec9f2ad98162
Reviewed-on: http://gerrit.cloudera.org:8080/20087
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
Joe McDonnell
2023-06-16 16:10:29 -07:00
parent a82830896b
commit bad064dbea
2 changed files with 58 additions and 9 deletions

@@ -32,6 +32,12 @@ import sys
 import time
 from subprocess import Popen, PIPE
+# This import is the actual ImpalaShell class from impala_shell.py.
+# We rename it to ImpalaShellClass here because we later import another
+# class called ImpalaShell from tests/shell/util.py, and we don't want
+# to mask it.
+from shell.impala_shell import ImpalaShell as ImpalaShellClass
 from tests.common.environ import (IMPALA_LOCAL_BUILD_VERSION,
                                   ImpalaTestClusterProperties)
 from tests.common.impala_service import ImpaladService
@@ -229,10 +235,41 @@ class ImpalaShell(object):
     # if stderr is redirected.
     if wait_until_connected and (args is None or "--quiet" not in args) and \
        stderr_file is None:
+      # We don't want to hang waiting for input. So, here are the scenarios
+      # we need to handle:
+      # 1. Shell process exits
+      # 2. Shell fails to connect. This can lead to an interactive prompt
+      #    that blocks for input forever. The two messages to look for are
+      #    "Error connecting" and "Socket error"
+      # 3. Process successfully connecting: "Connected to"
+      # Cases 1 and 2 should lead to an assert.
       start_time = time.time()
+      process_status = None
+      connection_err = None
       connected = False
-      while time.time() - start_time < timeout and not connected:
-        connected = "Connected to" in self.shell_process.stderr.readline()
+      while time.time() - start_time < timeout:
+        # Condition 1: check if the shell process has exited
+        # poll() returns None until the process exits
+        process_status = self.shell_process.poll()
+        if process_status is not None:
+          break
+        # readline() can block forever, so the timeout logic may not be effective
+        # if something gets stuck here.
+        line = self.shell_process.stderr.readline()
+        # Condition 2: check for errors connecting
+        if ImpalaShellClass.ERROR_CONNECTING_MESSAGE in line or \
+           ImpalaShellClass.SOCKET_ERROR_MESSAGE in line:
+          connection_err = line
+          break
+        # Condition 3: check if the connection is successful
+        connected = ImpalaShellClass.CONNECTED_TO_MESSAGE in line
+        if connected:
+          break
+      assert process_status is None, \
+          "Impala shell exited with return code {0}".format(process_status)
+      assert connection_err is None, connection_err
       assert connected, "Impala shell is not connected"
   def pid(self):