4 Commits

Author SHA1 Message Date
ishaan
565d15579c Add the ability to use a workload as the unit of execution in the Impala benchmark runner.
At the moment, a query is the default unit of execution and parallelism in the Impala
performance suite. With this change, we now have the ability to treat a workload as the
unit of execution. A workload is defined as a unique combination of the dataset, scale
factor, a subset (or all) of the queries in the dataset, and a table format (file format,
compression codec and compression scheme).

This change introduces two new command line options in bin/run-workload.py:
  * --execution_scope
    The default scope is 'query', and it maintains previous semantics. The
    new scope is 'workload', which toggles the unit of execution to a workload.
  * --shuffle_query_exec_order
    Shuffles the order in which queries are executed (only applicable when the
    execution_scope is 'workload'). Defaults to False.
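
The two options described above can be sketched as follows. This is a minimal illustration, not the code in bin/run-workload.py: the use of argparse, the helper names build_parser and plan_execution, and the representation of an execution unit as a list of queries are all assumptions made for the example. Only the option names, the 'query' and 'workload' scopes, and the defaults come from the commit message.

```python
import argparse
import random


def build_parser():
    """Hypothetical parser; only the option names and defaults come from the commit message."""
    parser = argparse.ArgumentParser(description="Sketch of the run-workload options")
    parser.add_argument("--execution_scope", choices=["query", "workload"],
                        default="query",
                        help="Unit of execution; 'query' keeps the previous semantics.")
    parser.add_argument("--shuffle_query_exec_order", action="store_true",
                        default=False,
                        help="Shuffle query order (only used with the 'workload' scope).")
    return parser


def plan_execution(queries, execution_scope, shuffle=False, rng=None):
    """Group queries into execution units (assumed model: one list per unit).

    With scope 'query', every query is its own unit. With scope 'workload',
    all queries form a single unit, optionally shuffled.
    """
    if execution_scope == "query":
        # Shuffling is ignored here, matching "only applicable when the
        # execution_scope is 'workload'".
        return [[query] for query in queries]
    ordered = list(queries)  # copy so the caller's list is not mutated
    if shuffle:
        (rng or random).shuffle(ordered)
    return [ordered]


if __name__ == "__main__":
    args = build_parser().parse_args()
    units = plan_execution(["q1", "q2", "q3"], args.execution_scope,
                           args.shuffle_query_exec_order)
    print(units)
```

Passing a seeded random.Random instance as rng makes a shuffled run reproducible, which may be useful when comparing benchmark results across runs.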

Change-Id: I790d75f0896210cda8eb999015b0be04246e4c45
Reviewed-on: http://gerrit.ent.cloudera.com:8080/503
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-01-08 10:53:07 -08:00
Lenni Kuff
dd9798c9f3 IMP-785: calculation_util.calculate_mean does not calculate mean (instead median) 2014-01-08 10:48:35 -08:00
Henry Robinson
e15a39143a Fix definition of calculate_mean 2014-01-08 10:48:34 -08:00
Lenni Kuff
4cf7d2634e Update benchmark runner to use mean of all results if num_clients > 1 2014-01-08 10:48:30 -08:00
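
The IMP-785 and "Fix definition of calculate_mean" entries above concern a function that returned the median where the mean was expected. The difference can be sketched as follows. The function bodies are assumptions written for illustration and are not the contents of calculation_util; only the name calculate_mean appears in the log, and calculate_median is a hypothetical helper added for comparison.

```python
def calculate_mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("calculate_mean requires at least one value")
    return sum(values) / float(len(values))


def calculate_median(values):
    """Median: the middle value of the sorted data (hypothetical helper)."""
    if not values:
        raise ValueError("calculate_median requires at least one value")
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2.0


if __name__ == "__main__":
    timings = [1.0, 2.0, 9.0]  # made-up sample values, not benchmark data
    print(calculate_mean(timings))    # 4.0
    print(calculate_median(timings))  # 2.0
```

The two statistics diverge whenever the data is skewed, so a single slow run shifts the mean but leaves the median unchanged.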