This will run concurrent TPC-DS/H queries against a CM managed cluster.
Stress test outline (and notes):
1) Get a set of queries. TPC-H and/or TPC-DS queries will be used.
TODO: Add randomly generated queries.
2) For each query, run it individually to find:
a) Minimum mem limit to avoid spilling
b) Minimum mem limit to successfully run the query (spilling allowed)
c) Runtime when no mem was spilled
d) Runtime when mem was spilled
e) A row-order-independent hash of the result set.
This is a slow process, so the results will be written to disk for reuse.
3) Find the memory available to Impalad. This will be done by finding
the minimum memory available across all impalads (-mem_limit startup
option). Ideally, for maximum stress, all impalads will have the same
memory configuration, but this is not required.
4) Optionally, set an amount of memory that can be overcommitted.
5) Start submitting queries. There are two modes for throttling the
number of concurrent queries:
a) Submit queries until all available memory (as determined by items 3
and 4) is used. Before each query runs, a query mem limit is set
between 2a and 2b. (There is a runtime option to increase the
likelihood that a query will be given the full 2a limit to avoid
spilling.)
b) TODO: Use admission control.
6) Randomly cancel queries to test cancellation. There is a runtime
option to control the likelihood that a query will be randomly canceled.
7) Cancel long-running queries. Queries that run longer than some
expected time, determined by the number of queries currently running,
will be canceled.
TODO: Collect stacks of timed out queries and add reporting.
8) If a query errored, verify that memory was overcommitted during
execution and the error is a mem limit exceeded error. There is no
other reason a query should error, and any other error will cause the
stress test to stop.
TODO: Handle crashes -- collect core dumps and restart Impala
TODO: Handle client connectivity timeouts -- retry a few times
9) Verify the result set hash of successful queries.
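
The row-order-independent hash from 2e and 9 could be sketched roughly
as below. This is an illustration only, not the code in this change:
the function name result_hash, the use of SHA-256, and the choice to
sum per-row digests are all assumptions. Values whose text form can
vary between runs (for example floating point sums) would need to be
normalized before hashing.

```python
import hashlib


def result_hash(rows):
    """Return a hash of a result set that does not depend on row order.

    Hypothetical helper: each row is hashed on its own and the per-row
    values are summed modulo 2**64.
    """
    total = 0
    for row in rows:
        # Join column values with a separator so ("ab", "c") and
        # ("a", "bc") do not produce the same row text.
        row_text = "\x1f".join(repr(value) for value in row)
        digest = hashlib.sha256(row_text.encode("utf-8")).digest()
        # Addition is commutative, so arrival order does not matter.
        # Unlike XOR, duplicate rows do not cancel each other out.
        total = (total + int.from_bytes(digest[:8], "big")) % (1 << 64)
    return total
```

Summing rather than sorting avoids holding the whole result set in
memory, which matters for large result sets.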
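
The memory accounting in items 3, 4 and 5a might look something like
the following. All names here (available_mem_mb, pick_mem_limit_mb,
can_submit, no_spill_probability) are hypothetical and chosen for the
sketch; the real option names and units may differ.

```python
import random


def available_mem_mb(impalad_mem_limits_mb):
    """Item 3: the usable memory is the smallest impalad mem limit."""
    return min(impalad_mem_limits_mb)


def pick_mem_limit_mb(min_no_spill_mb, min_spill_mb, no_spill_probability,
                      rng=random):
    """Item 5a: choose a query mem limit between the 2b and 2a values.

    With probability no_spill_probability the full 2a limit is used so
    the query should not spill.
    """
    if rng.random() < no_spill_probability:
        return min_no_spill_mb
    return rng.randint(min_spill_mb, min_no_spill_mb)


def can_submit(mem_limit_mb, in_use_mb, available_mb, overcommit_mb=0):
    """Item 5a: admit a query only if it fits in the memory budget.

    The budget is the available memory (item 3) plus the optional
    overcommit amount (item 4).
    """
    return in_use_mb + mem_limit_mb <= available_mb + overcommit_mb
```

For example, with impalads limited to 1000 MB and 900 MB already
reserved by running queries, a query with a 200 MB limit would wait
unless at least 100 MB of overcommit is allowed.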
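
The error handling in item 8 can be sketched as a simple check. The
names below are hypothetical, and matching on the text "memory limit
exceeded" is an assumption about how the error would be recognized.

```python
def error_is_expected(error_message, mem_was_overcommitted):
    """Item 8: an error is acceptable only if memory was overcommitted
    and the error is a mem limit exceeded error.

    Assumption: the error text contains "memory limit exceeded".
    """
    return (mem_was_overcommitted
            and "memory limit exceeded" in error_message.lower())


def check_query_error(error_message, mem_was_overcommitted):
    """Raise to stop the stress test on any unexpected query error."""
    if not error_is_expected(error_message, mem_was_overcommitted):
        raise RuntimeError("Unexpected query error: " + error_message)
```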
Change-Id: I4bd7f8a7cc65d5ae910a33afba59135040a99061
Reviewed-on: http://gerrit.cloudera.org:8080/474
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Casey Ching <casey@cloudera.com>