impala/tests/custom_cluster/test_exchange_deferred_batches.py
Tim Armstrong cf224f8461 IMPALA-9128: part 2: dump traces for slow RPCs
This adds trace events for data stream RPCs and
dumps them when they take longer than
--impala_slow_rpc_threshold_ms.

I needed to modify the KRPC code to do this because it
currently only dumps traces for RPCs with deadlines.
I plan to add some version of this upstream in Kudu
so that our KRPC implementation does not diverge from it.

Example output from test_exchange_small_buffer:

I1111 08:38:53.732910 26509 rpcz_store.cc:265] Call impala.DataStreamService.TransmitData from 127.0.0.1:42434 (request call id 43) took 7799ms. Request Metrics: {}
I1111 08:38:53.732928 26509 rpcz_store.cc:269] Trace:
1111 08:38:45.933412 (+     0us) impala-service-pool.cc:167] Inserting onto call queue
1111 08:38:45.933449 (+    37us) impala-service-pool.cc:254] Handling call
1111 08:38:45.933470 (+    21us) krpc-data-stream-mgr.cc:227] Added early sender
1111 08:38:47.906542 (+1973072us) krpc-data-stream-recvr.cc:327] Enqueuing deferred RPC
1111 08:38:53.732858 (+5826316us) krpc-data-stream-recvr.cc:506] Processing deferred RPC
1111 08:38:53.732860 (+     2us) krpc-data-stream-recvr.cc:399] Deserializing batch
1111 08:38:53.732888 (+    28us) krpc-data-stream-recvr.cc:426] Enqueuing deserialized batch
1111 08:38:53.732895 (+     7us) inbound_call.cc:162] Queueing success response
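
To find where the time went in a trace like the one above, the per-event deltas can be summed and ranked. The following is a minimal sketch, not part of this change: the regular expression is inferred only from the sample lines shown here, and the names `parse_trace`, `slowest_step`, and `SAMPLE` are hypothetical helpers introduced for illustration.

```python
import re

# Pattern inferred from the sample trace lines above (an assumption, not a
# documented format): "MMDD HH:MM:SS.uuuuuu (+<delta>us) <file>:<line>] <message>"
TRACE_LINE = re.compile(
    r"^\d{4} \d{2}:\d{2}:\d{2}\.\d{6} "
    r"\(\+\s*(?P<delta_us>\d+)us\) "
    r"(?P<location>\S+)\] (?P<message>.*)$")


def parse_trace(text):
  """Return a list of (delta_us, location, message) tuples, one per trace event."""
  events = []
  for line in text.splitlines():
    match = TRACE_LINE.match(line.strip())
    if match:
      events.append((int(match.group("delta_us")), match.group("location"),
                     match.group("message")))
  return events


def slowest_step(events):
  """Return the event that was preceded by the longest gap."""
  return max(events, key=lambda event: event[0])


# The trace events copied from the example output above.
SAMPLE = """\
1111 08:38:45.933412 (+     0us) impala-service-pool.cc:167] Inserting onto call queue
1111 08:38:45.933449 (+    37us) impala-service-pool.cc:254] Handling call
1111 08:38:45.933470 (+    21us) krpc-data-stream-mgr.cc:227] Added early sender
1111 08:38:47.906542 (+1973072us) krpc-data-stream-recvr.cc:327] Enqueuing deferred RPC
1111 08:38:53.732858 (+5826316us) krpc-data-stream-recvr.cc:506] Processing deferred RPC
1111 08:38:53.732860 (+     2us) krpc-data-stream-recvr.cc:399] Deserializing batch
1111 08:38:53.732888 (+    28us) krpc-data-stream-recvr.cc:426] Enqueuing deserialized batch
1111 08:38:53.732895 (+     7us) inbound_call.cc:162] Queueing success response
"""

if __name__ == "__main__":
  events = parse_trace(SAMPLE)
  total_us = sum(event[0] for event in events)
  delta_us, location, message = slowest_step(events)
  print("total: %d us" % total_us)
  print("longest gap: %d us before '%s' (%s)" % (delta_us, message, location))
```

For this sample the deltas add up to 7799483us, which agrees with the 7799ms reported on the first log line, and the longest gap (5826316us) is the wait between enqueuing the deferred RPC and processing it.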

Disabled +-clang-diagnostic-gnu-zero-variadic-macro-arguments because it
had false positives on the TRACE_TO invocations.

Testing:
* Ran exhaustive and ASAN tests
* Ran stress test

Change-Id: Ic7af4b45c43ec731d742d3696112c5f800849947
Reviewed-on: http://gerrit.cloudera.org:8080/14668
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-11-14 20:24:58 +00:00


# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import pytest

from tests.common.custom_cluster_test_suite import CustomClusterTestSuite
from tests.common.skip import SkipIfBuildType


@SkipIfBuildType.not_dev_build
class TestExchangeDeferredBatches(CustomClusterTestSuite):

  @classmethod
  def get_workload(cls):
    return 'functional-query'

  @classmethod
  def setup_class(cls):
    if cls.exploration_strategy() != 'exhaustive':
      pytest.skip('runs only in exhaustive')
    super(TestExchangeDeferredBatches, cls).setup_class()

  @pytest.mark.execute_serially
  @CustomClusterTestSuite.with_args(
      "--stress_datastream_recvr_delay_ms=3000"
      + " --exchg_node_buffer_size_bytes=1024"
      + " --datastream_service_num_deserialization_threads=1"
      + " --impala_slow_rpc_threshold_ms=500")
  def test_exchange_small_buffer(self, vector):
    """Exercise the code which handles deferred row batches. In particular,
    the exchange buffer is set to a small value to cause incoming row batches
    to be deferred at the receiver. Also, use a single deserialization thread
    to limit the rate at which the deferred row batches are dequeued. These
    settings help expose the race in IMPALA-8239 when there is any error
    deserializing deferred row batches."""

    TEST_QUERY = "select count(*) from tpch.lineitem t1, tpch.lineitem t2 " +\
        "where t1.l_orderkey = t2.l_orderkey"
    EXPECTED_RESULT = ['30012985']

    for _ in range(10):
      # Simulate row batch insertion failure. This triggers IMPALA-8239.
      debug_action = 'RECVR_ADD_BATCH:FAIL@0.8'
      self.execute_query_expect_failure(self.client, TEST_QUERY,
          query_options={'debug_action': debug_action})

    for _ in range(10):
      # Simulate a failure while unpacking the row batch payload. This also
      # triggers IMPALA-8239.
      debug_action = 'RECVR_UNPACK_PAYLOAD:FAIL@0.8'
      self.execute_query_expect_failure(self.client, TEST_QUERY,
          query_options={'debug_action': debug_action})

    # Do a run with no debug action to make sure things are sane.
    result = self.execute_query(TEST_QUERY, vector.get_value('exec_option'))
    assert result.data == EXPECTED_RESULT
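
The test above expects every run with `FAIL@0.8` to fail, even though the action looks probabilistic. A rough way to see why that is safe, assuming the `@0.8` suffix is a per-invocation failure probability and that each pass through the injection point fails independently (both are assumptions here, not statements about Impala's implementation), is to compute the chance that at least one invocation fails. The function name and the `num_hits` parameter are hypothetical.

```python
def prob_query_fails(p_fail, num_hits):
  """Probability that at least one of num_hits independent injection points fires.

  Assumes each hit fails independently with probability p_fail; this is a
  simplifying model, not a description of how Impala evaluates debug actions.
  """
  return 1.0 - (1.0 - p_fail) ** num_hits


if __name__ == "__main__":
  # num_hits stands in for how many row batches reach the receiver; the
  # values below are illustrative, not measured.
  for num_hits in (1, 3, 10):
    print(num_hits, prob_query_fails(0.8, num_hits))
```

Under this model, a query that passes through the injection point only a handful of times already fails with probability very close to 1, and a self-join over `tpch.lineitem` is likely to send far more row batches than that.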