12 Commits

Author SHA1 Message Date
Riza Suminto
c0c6cc9df4 IMPALA-12201: Stabilize TestFetch
This patch attempts to stabilize TestFetch by using HS2 as the test protocol.
test_rows_sent_counters is modified to use the default hs2_client.
test_client_fetch_time_stats and test_client_fetch_time_stats_incomplete
are modified to use MinimalHS2Connection, which has a simpler fetch
mechanism (ImpylaHS2Connection always fetches 10240 rows at a
time).

Implemented the minimal functions needed to wait for the finished state and
pull the runtime profile in MinimalHS2Connection.

Testing:
Looped the test 50 times; all runs passed.

Change-Id: I52651df37a318357711d26d2414e025cce4185c3
Reviewed-on: http://gerrit.cloudera.org:8080/22847
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-07 00:45:08 +00:00
Csaba Ringhofer
f98b697c7b IMPALA-13929: Make 'functional-query' the default workload in tests
This change adds get_workload() to ImpalaTestSuite and removes it
from all test suites that already returned 'functional-query'.
get_workload() is also removed from CustomClusterTestSuite which
used to return 'tpch'.

All other changes besides impala_test_suite.py and
custom_cluster_test_suite.py are just mass removals of
get_workload() functions.

The behavior is only changed in custom cluster tests that didn't
override get_workload(). By returning 'functional-query' instead
of 'tpch', exploration_strategy() will no longer return 'core' in
'exhaustive' test runs. See IMPALA-3947 on why workload affected
exploration_strategy. An example of an affected test is
TestCatalogHMSFailures, which was skipped in both core and exhaustive
runs before this change.

get_workload() functions that return a different workload than
'functional-query' are not changed - it is possible that some of
these also don't handle exploration_strategy() as expected, but
individually checking these tests is out of scope in this patch.

Change-Id: I9ec6c41ffb3a30e1ea2de773626d1485c69fe115
Reviewed-on: http://gerrit.cloudera.org:8080/22726
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-08 07:12:55 +00:00
Riza Suminto
71feb617e4 IMPALA-13835: Remove reference to protocol-specific states
With IMPALA-13682 merged, checking for query state can be done via
wait_for_impala_state(), wait_for_any_impala_state() and other helper
methods of ImpalaConnection. This patch removes all references to
protocol-specific states such as BeeswaxService.QueryState.

Also fixes flake8 errors and an unused variable in the modified test files.

Testing:
- Ran all affected tests; all passed.

Change-Id: Id6b56024fbfcea1ff005c34cd146d16e67cb6fa1
Reviewed-on: http://gerrit.cloudera.org:8080/22586
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-03-09 00:04:05 +00:00
Kurt Deschler
06bbbea257 IMPALA-12679: Improve test_rows_sent_counters assert
This patch changes the assert for the failing test test_rows_sent_counters so
that the actual RPC count is displayed in the assert output. The root
cause of the failure will be addressed once sufficient data is collected
with the new output.

Testing:
  Ran test_rows_sent_counters with modified expected RPC count range to
simulate failure.

Change-Id: Ic6b48cf4039028e749c914ee60b88f04833a0069
Reviewed-on: http://gerrit.cloudera.org:8080/21310
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-17 05:00:01 +00:00
Kurt Deschler
aa164e3cbc IMPALA-12176: Improve client fetch metrics
This patch makes multiple improvements to query profile and RPC metrics
to improve observability and allow more detailed analysis of where time
is being spent by client RPCs.

- A new CreateResultSetTime metric has been added to PLAN_ROOT_SINK node
  in the query profile. This timer isolates the cost to convert fetched
  rows to the client protocol.
- Read/Write time is now tracked during client RPC execution and added to
  the rpcz JSON output. A checkbox in the /rpcz Web UI page enables
  display of the Read/Write stats.
- Read and Write time are measured via Thrift callbacks defined in
  apache::thrift::TProcessorEventHandler. Read time includes reading and
  deserializing Thrift RPC args from the transport. Write time includes
  serializing, writing, and flushing Thrift RPC args to the transport.
- Client RPC cost is tracked on a per-query basis and displayed in the
  server profile as RPCCount, RPCReadTimer, and RPCWriteTimer
- Precision of RPC histograms is changed from milliseconds to microseconds

Testing:
tests added to test_fetch.py and test_web_pages.py

Change-Id: I986f3f2afac1775274895393969b270cf956b262
Reviewed-on: http://gerrit.cloudera.org:8080/19966
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-06-09 16:59:03 +00:00
Joe McDonnell
21f5a7a6e5 IMPALA-10180: Add summary stats for client fetch wait time
This adds ClientFetchWaitTimeStats to the runtime profile
to track the min/max/# of samples for ClientFetchWaitTimer.
Here is some sample output:
- ClientFetchWaitTimeStats: (Avg: 161.554ms ; Min: 101.411ms ; Max: 461.728ms ; Number of samples: 6)
- ClientFetchWaitTimer: 969.326ms

This also fixes the definition of ClientFetchWaitTimer to avoid
including time after end of fetch. When the client is closing
the query, Finalize() gets called. The Finalize() call should
only add extra client wait time if fetch has not completed.

Testing:
 - Added test cases in query_test/test_fetch.py with specific
   numbers of fetches and verification of the statistics.
 - The test cases make use of a new function for parsing
   summary stats for timers, and this also gets its own test
   case.
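
The sample output above has a regular shape, so a test helper can extract its fields with a regular expression. The following is a minimal sketch, assuming the exact line format shown in this commit message. parse_summary_stats is a hypothetical name, not the helper added by the patch, and durations are kept as strings rather than converted to numbers.

```python
import re

# Assumed line format, copied from the sample output in this commit message.
STATS_RE = re.compile(
    r"(?P<name>\w+): \(Avg: (?P<avg>\S+) ; Min: (?P<min>\S+) ; "
    r"Max: (?P<max>\S+) ; Number of samples: (?P<samples>\d+)\)")


def parse_summary_stats(line):
    """Hypothetical helper: return the stats fields as a dict, or None."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    stats = match.groupdict()
    stats["samples"] = int(stats["samples"])  # only the count is converted
    return stats


sample = ("- ClientFetchWaitTimeStats: (Avg: 161.554ms ; Min: 101.411ms ; "
          "Max: 461.728ms ; Number of samples: 6)")
stats = parse_summary_stats(sample)
print(stats["name"], stats["samples"])
```

A test can then assert on the sample count, which is deterministic for a fixed number of fetches, while treating the timer values as loosely bounded.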

Change-Id: I9ca525285e03c7b51b04ac292f7b3531e6178218
Reviewed-on: http://gerrit.cloudera.org:8080/19897
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
2023-05-18 16:09:30 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
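
The difference between the two division operators can be illustrated with a short example. This is only a sketch with made-up variable names (total_rows, batch_size); it is not taken from the patch.

```python
from __future__ import absolute_import, division, print_function

total_rows = 7   # illustrative values, not from the patch
batch_size = 2

# With "division" imported, "/" is true division on Python 2 and Python 3.
avg_batches = total_rows / batch_size

# "//" is floor division; use it where an integer is required
# (indices, counts of records, etc.).
full_batches = total_rows // batch_size

print(avg_batches, full_batches)
```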

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
wzhou-code
df0a014e36 IMPALA-10927: Deflaky TestFetchAndSpooling.test_rows_sent_counters
IMPALA-8957 fixed the test's flakiness by adding a delay via
DEBUG_ACTION BPRS_BEFORE_ADD_ROWS in BlockingPlanRootSink::Send().
test_rows_sent_counters uses DEBUG_ACTION BPRS_BEFORE_ADD_BATCH when
spool_query_results is on, and uses BPRS_BEFORE_ADD_ROWS when
spool_query_results is off, on the assumption that result spooling is
disabled by default.

IMPALA-9856 enabled result spooling by default, which introduced the
following two issues for the test:
1) spool_query_results as false is not covered by the test, since the
extended dimension is added with spool_query_results as true.
2) Since the test uses BPRS_BEFORE_ADD_ROWS unless spool_query_results is
specified as true, DEBUG_ACTION BPRS_BEFORE_ADD_ROWS ends up being
used even when result spooling is on. This makes the test flaky because
no delay is added in BufferedPlanRootSink::Send().

There is another bug in the test. It uses bool() to convert a string to a
boolean value, but bool() returns True for any non-empty string.
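
The bool() pitfall is easy to reproduce. The sketch below shows the behavior and one possible alternative; parse_bool is a hypothetical helper, and the set of accepted spellings is an assumption rather than what the patch actually uses.

```python
def parse_bool(value):
    """Hypothetical helper: interpret an option string as a boolean."""
    return str(value).strip().lower() in ("true", "1")


# bool() only tests for emptiness, so the string "false" is truthy.
naive = bool("false")
parsed = parse_bool("false")
print(naive, parsed)
```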

This patch changes the extended dimension setting for
spool_query_results to false, and makes the test use the right
DEBUG_ACTION for spool_query_results as true and as false.
It also reverts the previous fix, which disabled the test in the S3
testing environment.

Testing:
  - Ran the test more than 10000 times without failure on Jenkins.

Change-Id: I790bbe1072357caf8ee11bb37644cf29dc8bea0f
Reviewed-on: http://gerrit.cloudera.org:8080/18671
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-06-27 10:54:38 +00:00
Qifan Chen
444fedbfda IMPALA-10927 TestFetchAndSpooling.test_rows_sent_counters is flaky in core-s3 based test
This fix removes the flakiness for test_rows_sent_counters test by
disabling it in S3 testing environment.

Testing:
1. Unit test in a non-S3 environment;
2. Ran core test successfully.

Change-Id: Ie6f1a8bc80c24c1368282be097aa8f943dd95d1e
Reviewed-on: http://gerrit.cloudera.org:8080/17862
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Qifan Chen <qchen@cloudera.com>
2021-09-28 18:34:33 +00:00
stakiar
c47fca5960 IMPALA-8962: FETCH_ROWS_TIMEOUT_MS should apply before rows are available
IMPALA-7312 added the query option FETCH_ROWS_TIMEOUT_MS, but it only
applies to fetch requests against a query that has already transitioned
to the 'FINISHED' state. This patch changes the timeout so that it
applies to queries in the 'RUNNING' state as well. Before this patch,
fetch requests issued while a query was 'RUNNING' blocked until the query
transitioned to the 'FINISHED' state, and then it fetched results and
returned them. After this patch, fetch requests against queries in the
'RUNNING' state will block for 'FETCH_ROWS_TIMEOUT_MS' and then return.

For HS2 clients, fetch requests that return while a query is 'RUNNING'
set their TStatusCode to STILL_EXECUTING_STATUS. For Beeswax clients,
fetch requests that return while a query is 'RUNNING' set the 'ready'
flag to false. For both clients, hasMoreRows is set to true.

If the following sequence of events occurs:
* A fetch request is issued and blocks on a 'RUNNING' query
* The query transitions to the 'FINISHED' state
* The fetch request attempts to read multiple batches
Then the time spent waiting for the query to finish is deducted from
the timeout used when waiting for rows to be produced by the Coordinator
fragment.
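
The deduction described above amounts to simple budget arithmetic. The following is a minimal sketch, not the server implementation: remaining_timeout_ms is a hypothetical name, and clamping the result at zero is an assumption made for illustration.

```python
def remaining_timeout_ms(fetch_rows_timeout_ms, waited_for_finish_ms):
    """Hypothetical helper: time budget left for waiting on rows.

    The time already spent waiting for the query to reach 'FINISHED'
    is subtracted from the configured timeout. Clamping at zero is an
    assumption, not something stated in the commit message.
    """
    return max(0, fetch_rows_timeout_ms - waited_for_finish_ms)


# A 10 s timeout with 3 s spent waiting leaves 7 s for reading batches.
left = remaining_timeout_ms(10000, 3000)
print(left)
```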

Fixed a bug in the current usage of FETCH_ROWS_TIMEOUT_MS where the
time units for FETCH_ROWS_TIMEOUT_MS and MonotonicStopWatch were not
being converted properly.

Tests:
* Moved existing fetch timeout tests from hs2/test_fetch.py into a new
test file hs2/test_fetch_timeout.py.
* Added several new tests to hs2/test_fetch_timeout.py to validate that
the timeout is applied to 'RUNNING' queries and that the timeout applies
across a 'RUNNING' and 'FINISHED' query.
* Added new tests to query_test/test_fetch.py to validate the timeout
while using the Beeswax protocol.

Change-Id: I2cba6bf062dcc1af19471d21857caa797c1ea4a4
Reviewed-on: http://gerrit.cloudera.org:8080/14332
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-10-08 19:13:33 +00:00
Sahil Takiar
750f659120 IMPALA-8926, IMPALA-8957: Fix result spooling flaky tests
TestResultSpooling::_test_full_queue was flaky because there was a race
condition in the test where the result spooling queue would not fill up
quickly enough. The original way around this was to sleep for a fixed
amount of time in hope that the queue would fill up by the time the
thread woke up. The new approach periodically searches the runtime
profile for specific patterns that indicate the queue is full.
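
Replacing a fixed sleep with periodic polling can be sketched as follows. This is an illustration only: wait_for_profile_match, get_profile, and the QUEUE_FULL_MARKER pattern are hypothetical names, and the real test searches for patterns that are not listed in this commit message.

```python
import re
import time


def wait_for_profile_match(get_profile, pattern, timeout_s=10.0, interval_s=0.1):
    """Poll get_profile() until pattern is found or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if re.search(pattern, get_profile()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in for successive runtime profile snapshots (made-up content).
snapshots = iter(["warming up", "still filling", "QUEUE_FULL_MARKER seen"])
found = wait_for_profile_match(lambda: next(snapshots), r"QUEUE_FULL_MARKER",
                               timeout_s=2.0, interval_s=0.01)
print(found)
```

Polling returns as soon as the condition holds and fails with a clear timeout otherwise, which avoids both unnecessary waiting and the race that a fixed sleep leaves open.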

TestFetchAndSpooling.test_rows_sent_counters was flaky because the
RowsSentRate can be 0 if the results are spooled fast enough (because
the time spent spooling results is 0). The fix is to use the DEBUG_ACTION
BPRS_BEFORE_ADD_BATCH to introduce a delay when spooling results, so that
the RowsSentRate is guaranteed to be non-zero.

TestFetch.test_rows_sent_counters was flaky because ClientFetchWaitTimer
can be 0 if the Coordinator does not end up waiting any time for results
to be fetched. The fix is to wait until the query has 'FINISHED'
(results are available to fetch) and then sleep so that the
ClientFetchWaitTimer is a non-zero value.

Cleaned up a few other tests as well.

Testing:
* Looped both tests for a few hours without failure

Change-Id: I3042f592bc79785e43ebc7b09ac1270eae8ed66f
Reviewed-on: http://gerrit.cloudera.org:8080/14275
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-09-24 19:14:06 +00:00
Sahil Takiar
34d132c513 IMPALA-8825: Add additional counters to PlanRootSink
Adds the counters RowsSent and RowsSentRate to the PLAN_ROOT_SINK
section of the profile:

  PLAN_ROOT_SINK:
     - PeakMemoryUsage: 4.01 MB (4202496)
     - RowBatchGetWaitTime: 0.000ns
     - RowBatchSendWaitTime: 0.000ns
     - RowsSent: 10 (10)
     - RowsSentRate: 416.00 /sec

RowsSent tracks the number of rows sent to the PlanRootSink via
PlanRootSink::Send. RowsSentRate tracks the rate that rows are sent to
the PlanRootSink.

Adds the counters NumRowsFetched, NumRowsFetchedFromCache, and
RowMaterializationRate to the ImpalaServer section of the profile.

  ImpalaServer:
     - ClientFetchWaitTimer: 11.999ms
     - NumRowsFetched: 10 (10)
     - NumRowsFetchedFromCache: 10 (10)
     - RowMaterializationRate: 9.00 /sec
     - RowMaterializationTimer: 1s007ms

NumRowsFetched tracks the total number of rows fetched by the query,
but does not include rows fetched from the cache. NumRowsFetchedFromCache
tracks the total number of rows fetched from the query results cache.
RowMaterializationRate tracks the rate at which rows are materialized.
RowMaterializationTimer already existed and tracks how much time is
spent materializing rows.

Testing:
* Added tests to test_fetch_first.py and query_test/test_fetch.py
* Enabled some tests in test_fetch_first.py that were pending
the completion of IMPALA-8819
* Ran core tests

Change-Id: Id9e101e2f3e2bf8324e149c780d35825ceecc036
Reviewed-on: http://gerrit.cloudera.org:8080/14180
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Sahil Takiar <stakiar@cloudera.com>
2019-09-16 19:36:11 +00:00