IMPALA-6866: Rework timeouts for test_exchange_delays.py

Isilon has been failing on the exchange-delays-zero-rows
test case due to slow scans. Running this part of the
test with a larger value for stress_datastream_recvr_delay_ms
solved the issue.

Since this part of the test is sensitive to slow scans,
I pulled this out into its own test. The new test can
apply an extra delay for platforms with slow scans.
This subsumes the fix for IMPALA-6811 and treats S3,
ADLS, and ISILON as platforms with slow scans.

test_exchange_small_delay and test_exchange_large_delay
(minus exchange-delays-zero-rows) go back to the timeouts
they were using before IMPALA-6811. These tests did not
see issues with those timeouts.

With test_exchange_large_delay_zero_rows split out, no
timeouts change except on Isilon, where the slow-scan extra
delay is now added on top of the delay Isilon already gets
for being treated as a slow build.

Change-Id: I2e919a4e502b1e6a4156aafbbe4b5ddfe679ed89
Reviewed-on: http://gerrit.cloudera.org:8080/10208
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Joe McDonnell
2018-04-25 12:44:37 -07:00
committed by Impala Public Jenkins
parent d6dad9cdf8
commit 5592ecfe1a


@@ -25,8 +25,7 @@ from tests.util.filesystem_utils import IS_S3, IS_ADLS, IS_ISILON
 SLOW_BUILD_TIMEOUT=20000
 DELAY_MS = specific_build_type_timeout(10000, slow_build_timeout=SLOW_BUILD_TIMEOUT)
 # IMPALA-6381: Isilon can behave as a slow build.
-# IMPALA-6811: S3/ADLS can also have a slow scan that requires a longer delay.
-if IS_S3 or IS_ADLS or IS_ISILON:
+if IS_ISILON:
   DELAY_MS = SLOW_BUILD_TIMEOUT
 @SkipIfBuildType.not_dev_build
@@ -60,9 +59,20 @@ class TestExchangeDelays(CustomClusterTestSuite):
     """
     self.run_test_case('QueryTest/exchange-delays', vector)
-    # Test the special case when no batches are sent and the EOS message times out.
-    # IMPALA-6811: For future reference, the SQL used for this test case requires
-    # that the scan complete before the fragment sends the EOS message. A slow scan can
-    # cause this test to fail, because the receivers could be set up before the
-    # fragment starts sending (and thus can't time out).
+  # The SQL used for test_exchange_large_delay_zero_rows requires that the scan complete
+  # before the fragment sends the EOS message. A slow scan can cause this test to fail,
+  # because the receivers could be set up before the fragment starts sending (and thus
+  # can't time out). Use a longer delay for platforms that have slow scans:
+  # IMPALA-6811: S3/ADLS have slow scans.
+  # IMPALA-6866: Isilon has slow scans (and is counted as a slow build above).
+  SLOW_SCAN_EXTRA_DELAY_MS = 10000
+  if IS_S3 or IS_ADLS or IS_ISILON:
+    DELAY_MS += SLOW_SCAN_EXTRA_DELAY_MS
+  @pytest.mark.execute_serially
+  @CustomClusterTestSuite.with_args(
+      "--stress_datastream_recvr_delay_ms={0}".format(DELAY_MS)
+      + " --datastream_sender_timeout_ms=1")
+  def test_exchange_large_delay_zero_rows(self, vector):
+    """Test the special case when no batches are sent and the EOS message times out."""
     self.run_test_case('QueryTest/exchange-delays-zero-rows', vector)
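
For reference, the delay that each test ends up with can be worked through in a small standalone sketch. This is not code from the change: exchange_delay_ms and its parameters are hypothetical names, and the slow_build branch is a simplified stand-in for specific_build_type_timeout, which is assumed here to return slow_build_timeout on slow builds and the default otherwise. It also assumes the slow-scan extra delay reaches only the zero-rows test, since the `DELAY_MS +=` statement appears after the earlier tests in the diff.

```python
SLOW_BUILD_TIMEOUT = 20000
SLOW_SCAN_EXTRA_DELAY_MS = 10000


def exchange_delay_ms(slow_build=False, is_s3=False, is_adls=False,
                      is_isilon=False, zero_rows_test=False):
    """Hypothetical helper mirroring the delay logic shown in the diff."""
    # Simplified stand-in for
    # specific_build_type_timeout(10000, slow_build_timeout=SLOW_BUILD_TIMEOUT).
    delay_ms = SLOW_BUILD_TIMEOUT if slow_build else 10000
    # IMPALA-6381: Isilon is treated as a slow build.
    if is_isilon:
        delay_ms = SLOW_BUILD_TIMEOUT
    # IMPALA-6811 / IMPALA-6866: only the zero-rows test is sensitive to slow
    # scans, so only it gets the extra delay on S3, ADLS and Isilon.
    if zero_rows_test and (is_s3 or is_adls or is_isilon):
        delay_ms += SLOW_SCAN_EXTRA_DELAY_MS
    return delay_ms


# Flag string as built in the diff, here for the zero-rows test on Isilon.
flags = ("--stress_datastream_recvr_delay_ms={0}".format(
    exchange_delay_ms(is_isilon=True, zero_rows_test=True))
    + " --datastream_sender_timeout_ms=1")
```

Under these assumptions, the zero-rows test on S3 or ADLS with a regular build gets 10000 + 10000 = 20000 ms, the same value IMPALA-6811 gave it, while on Isilon it gets 20000 + 10000 = 30000 ms.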