Added execution summary, modified benchmark to handle JSON

- Added execution summary to the beeswax client and QueryResult
- Modified report-benchmark-results to handle JSON and perform
  execution summary comparison between runs
- Added comments to the new workload runner

Change-Id: I9c3c5f2fdc5d8d1e70022c4077334bc44e3a2d1d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3598
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: jenkins
(cherry picked from commit fd0b1406be2511c202e02fa63af94fbbe5e18eee)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3618
Taras Bobrovytsky
2014-06-23 19:20:11 -07:00
committed by jenkins
parent 3bed0be1df
commit e94de02469
9 changed files with 1105 additions and 527 deletions


@@ -27,7 +27,18 @@ class Workload(object):
A workload is the internal representation for the set of queries on a dataset. It
consists of the dataset name, and a mapping of query names to query strings.
Args:
name (str): workload name (e.g. tpch)
query_name_filters (list of str): List of regular expressions used for matching query
names
Attributes:
name (str): workload name (e.g. tpch)
__query_map (dict): maps each query name to its query string
(e.g. "TPCH-Q10" -> "select * from...")
"""
WORKLOAD_DIR = os.environ['IMPALA_WORKLOAD_DIR']
def __init__(self, name, query_name_filters=None):
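
The docstring above describes `query_name_filters` as a list of regular expressions matched against query names, and `__query_map` as a name-to-query-string mapping. How the two interact is not shown in this hunk, so the following is only a minimal sketch of one plausible reading. The helper `filter_query_map` is hypothetical and not part of this commit, the use of `re.match` (anchored at the start of the name) is an assumption, and the query strings are placeholders.

```python
import re


def filter_query_map(query_map, query_name_filters=None):
    """Hypothetical helper: keep queries whose name matches any filter.

    Assumes filters are applied with re.match, i.e. anchored at the start
    of the query name. With no filters, every query is kept.
    """
    if not query_name_filters:
        return dict(query_map)
    patterns = [re.compile(name_filter) for name_filter in query_name_filters]
    return {
        name: query_str
        for name, query_str in query_map.items()
        if any(pattern.match(name) for pattern in patterns)
    }


# Placeholder query strings; real workloads would hold full SQL text.
query_map = {
    "TPCH-Q1": "select 1",
    "TPCH-Q2": "select 2",
    "TPCH-Q10": "select 10",
}

# An unanchored filter also selects names that merely start with it.
prefix_selected = filter_query_map(query_map, [r"TPCH-Q1"])

# Adding "$" restricts the match to the exact query name.
exact_selected = filter_query_map(query_map, [r"TPCH-Q1$"])
```

Under these assumptions, a filter such as `TPCH-Q1` would pick up both `TPCH-Q1` and `TPCH-Q10`, which is one reason a docstring for this argument could usefully say whether filters are anchored.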
@@ -82,7 +93,15 @@ class Workload(object):
Transform all the queries in the workload's query map to query objects based on the
input test vector and scale factor.
Args:
test_vector (?): query vector
scale_factor (str): e.g. "300gb"
Returns:
(list of Query): these will be consumed by ?
"""
queries = list()
for query_name, query_str in self.__query_map.iteritems():
queries.append(Query(name=query_name,