impala/tests/stress
Casey Ching facedb2aa5 Add stress test for TPC queries running against a cluster
This will run concurrent TPC-DS/H queries against a CM managed cluster.

Stress test outline (and notes):
 1) Get a set of queries. TPCH and/or TPCDS queries will be used.
    TODO: Add randomly generated queries.
 2) For each query, run it individually to find:
     a) Minimum mem limit to avoid spilling
     b) Minimum mem limit to successfully run the query (spilling allowed)
     c) Runtime when no mem was spilled
     d) Runtime when mem was spilled
     e) A row order independent hash of the result set.
    This is a slow process so the results will be written to disk for reuse.
 3) Find the memory available to Impalad. This will be done by finding the minimum
    memory available across all impalads (-mem_limit startup option). Ideally, for
    maximum stress, all impalads will have the same memory configuration but this is
    not required.
 4) Optionally, set an amount of memory that can be overcommitted.
 5) Start submitting queries. There are two modes for throttling the number of
    concurrent queries:
     a) Submit queries until all available memory (as determined by items 3 and 4) is
        used. Before running the query a query mem limit is set between 2a and 2b.
        (There is a runtime option to increase the likelihood that a query will be
        given the full 2a limit to avoid spilling.)
     b) TODO: Use admission control.
 6) Randomly cancel queries to test cancellation. There is a runtime option to control
    the likelihood that a query will be randomly canceled.
 7) Cancel long running queries. Queries that run longer than some expected time,
    determined by the number of queries currently running, will be canceled.
    TODO: Collect stacks of timed out queries and add reporting.
 8) If a query errored, verify that memory was overcommitted during execution and the
    error is a mem limit exceeded error. There is no other reason a query should error
    and any such error will cause the stress test to stop.
    TODO: Handle crashes -- collect core dumps and restart Impala
    TODO: Handle client connectivity timeouts -- retry a few times
 9) Verify the result set hash of successful queries.
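
One way to get the row order independent hash from items 2e and 9 is sketched
below. This is an illustration only, not necessarily how the stress test computes
it: the function name result_set_hash and the choice of SHA-256 with a modular sum
are assumptions. Floating point columns would likely need rounding before hashing,
which this sketch does not attempt.

```python
import hashlib


def result_set_hash(rows):
    """Return a hash of the rows that does not depend on row order.

    Hypothetical helper for illustration; not taken from the stress test.
    """
    total = 0
    for row in rows:
        # Hash each row on its own; repr() of a tuple keeps column boundaries
        # distinct, so ("ab", "c") and ("a", "bc") hash differently.
        digest = hashlib.sha256(repr(tuple(row)).encode("utf-8")).digest()
        # Addition modulo 2**64 is commutative, so the order in which rows
        # arrive cannot change the total. Unlike XOR, duplicate rows do not
        # cancel each other out.
        total = (total + int.from_bytes(digest[:8], "big")) % (1 << 64)
    return total
```

The hash recorded during the individual run (item 2e) can then be compared with
the hash of the result set returned under concurrent load (item 9).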
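
The memory based throttling in items 3 through 5a could look roughly like the
sketch below. All names here (pick_mem_limit, can_submit, no_spill_probability
and the MB units) are hypothetical and chosen for illustration; the actual runtime
options and bookkeeping in the stress test may differ.

```python
import random


def pick_mem_limit(min_no_spill_mb, min_spill_mb, no_spill_probability, rng=random):
    """Choose a query mem limit between the 2b and 2a values.

    With probability no_spill_probability the full 2a limit is used so the
    query should not spill; otherwise a limit is drawn uniformly from the
    range [2b, 2a]. All parameter names are assumptions.
    """
    if rng.random() < no_spill_probability:
        return min_no_spill_mb
    return rng.randint(min_spill_mb, min_no_spill_mb)


def can_submit(mem_limit_mb, reserved_mb, impalad_mem_mb, overcommit_mb=0):
    """Return True if the query fits in the budget from items 3 and 4.

    reserved_mb is the sum of the mem limits of the queries already running.
    """
    return reserved_mb + mem_limit_mb <= impalad_mem_mb + overcommit_mb


# Item 3: the budget is the smallest -mem_limit across all impalads
# (the values here are made up for the example).
impalad_mem_mb = min([8192, 8192, 6144])
```

Under these assumptions, a submission loop would call pick_mem_limit for the next
query, wait until can_submit returns True, and then add the chosen limit to
reserved_mb until the query finishes.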

Change-Id: I4bd7f8a7cc65d5ae910a33afba59135040a99061
Reviewed-on: http://gerrit.cloudera.org:8080/474
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Casey Ching <casey@cloudera.com>
2015-08-15 23:10:25 +00:00