IMPALA-7926: Fix flakiness in test_reconnect

test_reconnect launches a shell that connects to one impalad in the
minicluster then reconnects to a different impalad while checking that
the impalad's open session metric changes accordingly.

To do this, the test gets the number of open sessions at the start of
the test and then expects that the number of sessions will have
increased by 1 on the impalad that the shell is currently connected
to.

This fails if a session left over from another test is still open
when test_reconnect starts but closes while it is running: the initial
count then includes a session that disappears mid-test, so the count
no longer matches the expected value.
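
The arithmetic of the race can be sketched as follows. This is an
illustration only: `old_check` is a hypothetical stand-in for the
baseline-plus-one assertion described above, not code from the test.

```python
def old_check(baseline, current):
    """Old-style assertion: expect exactly one more session than the baseline."""
    return current == baseline + 1


# A leftover session is still open when the baseline is read.
baseline = 1
# The shell connects (+1) and the leftover session exits (-1) before the check.
current = baseline + 1 - 1
assert not old_check(baseline, current)  # the check fails spuriously

# With no leftover session, the same check passes.
assert old_check(0, 0 + 1)
```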

test_reconnect is already marked to run serially, so no other sessions
should be open while it runs anyway. The fix is to wait at the start of
the test until any sessions left over from other tests have closed.
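
A minimal sketch of the wait-then-assert pattern, assuming Python 3 and
a caller-supplied function that reads the metric. `wait_for_value` and
its parameters are hypothetical names for illustration; the test itself
relies on the service object's existing `wait_for_metric_value` helper.

```python
import time


def wait_for_value(read_value, expected, timeout_s=10.0, interval_s=0.1):
    """Poll read_value() until it returns expected or timeout_s elapses.

    Returns the last value read, so the caller can assert on it.
    """
    deadline = time.monotonic() + timeout_s
    value = read_value()
    while value != expected and time.monotonic() < deadline:
        time.sleep(interval_s)
        value = read_value()
    return value


# Simulated metric: one leftover session that closes after a few polls.
readings = iter([1, 1, 0])
last = wait_for_value(lambda: next(readings, 0), 0, timeout_s=2.0, interval_s=0.01)
assert last == 0, "should not have any remaining open sessions"
```

Returning the last value instead of raising keeps the timeout policy in
the helper and the failure message at the call site.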

Testing:
- Ran the test in an environment where the timing was previously
  causing it to fail almost deterministically and it now passes.

Change-Id: I3017ca3bf7b4e33440cffb80e9a48a63bec14434
Reviewed-on: http://gerrit.cloudera.org:8080/12045
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Thomas Tauber-Marshall
2018-12-05 16:31:57 -08:00
committed by Impala Public Jenkins
parent 9c44853998
commit 1cbcd0c37d


@@ -225,35 +225,37 @@ class TestImpalaShellInteractive(object):
Verifies that a connect command by the user is honoured.
"""
def get_num_open_sessions(impala_service):
"""Helper method to retrieve the number of open sessions"""
return impala_service.get_metric_value('impala-server.num-open-beeswax-sessions')
def wait_for_num_open_sessions(impala_service, num, err):
"""Helper method to wait for the number of open sessions to reach 'num'."""
assert impala_service.wait_for_metric_value(
'impala-server.num-open-beeswax-sessions', num) == num, err
hostname = socket.getfqdn()
initial_impala_service = ImpaladService(hostname)
target_impala_service = ImpaladService(hostname, webserver_port=25001,
beeswax_port=21001, be_port=22001)
# Get the initial state for the number of sessions.
num_sessions_initial = get_num_open_sessions(initial_impala_service)
num_sessions_target = get_num_open_sessions(target_impala_service)
# This test is running serially, so there shouldn't be any open sessions, but wait
# here in case a session from a previous test hasn't been fully closed yet.
wait_for_num_open_sessions(
initial_impala_service, 0, "21000 should not have any remaining open sessions.")
wait_for_num_open_sessions(
target_impala_service, 0, "21001 should not have any remaining open sessions.")
# Connect to localhost:21000 (default)
p = ImpalaShell()
sleep(5)
# Make sure we're connected <hostname>:21000
assert get_num_open_sessions(initial_impala_service) == num_sessions_initial + 1, \
"Not connected to %s:21000" % hostname
wait_for_num_open_sessions(
initial_impala_service, 1, "Not connected to %s:21000" % hostname)
p.send_cmd("connect %s:21001" % hostname)
# Wait for a little while
sleep(5)
# The number of sessions on the target impalad should have been incremented.
assert get_num_open_sessions(target_impala_service) == num_sessions_target + 1, \
"Not connected to %s:21001" % hostname
wait_for_num_open_sessions(
target_impala_service, 1, "Not connected to %s:21001" % hostname)
assert "[%s:21001] default>" % hostname in p.get_result().stdout
# The number of sessions on the initial impalad should have been decremented.
assert get_num_open_sessions(initial_impala_service) == num_sessions_initial, \
"Connection to %s:21000 should have been closed" % hostname
wait_for_num_open_sessions(initial_impala_service, 0,
"Connection to %s:21000 should have been closed" % hostname)
@pytest.mark.execute_serially
def test_ddl_queries_are_closed(self):