impala

mirror of https://github.com/apache/impala.git synced 2026-01-06 15:01:43 -05:00

Author	SHA1	Message	Date
Casey Ching	074e5b4349	Remove hashbang from non-script python files Many python files had a hashbang and the executable bit set though they were not intended to be run as a standalone script. That makes determining which python files are actually scripts very difficult. A future patch will update the hashbang in real python scripts so they use $IMPALA_HOME/bin/impala-python. Change-Id: I04eafdc73201feefe65b85817a00474e182ec2ba Reviewed-on: http://gerrit.cloudera.org:8080/599 Reviewed-by: Casey Ching <casey@cloudera.com> Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: Internal Jenkins	2015-08-04 05:26:07 +00:00
Taras Bobrovytsky	29a7368940	Modified perf_result_datastore to use Impala instead of MySQL Change-Id: I441a51bc7e03d1bfe2283e77c16cba9394034258 Reviewed-on: http://gerrit.cloudera.org:8080/325 Reviewed-by: Martin Grund <mgrund@cloudera.com> Tested-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>	2015-04-09 20:25:28 +00:00
Taras Bobrovytsky	fd1a469878	Significant improvements to benchmark report - Added % change to performance regressions/improvements table - Automatic extraction of Impala version from runtime profiles - Execution summary row will not be printed if max time is < 100ms or < 2% of the overall runtime - Failed queries are ignored - First result is discarded for each query - Geometric mean was added to summary - Improved handling of multiple workloads in a single JSON file - Improved handling of the case when queries are different in results and reference results - Works well for single client runs. Additional work is needed to handle multiple client runs well. Change-Id: Ice7b9cc4fd7502a448d35ace10fbcef183df1769 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4210 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins (cherry picked from commit c722f6b0a104df54b550978cd222a9af4d39b929) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5250 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>	2014-11-13 18:54:08 -08:00
ishaan	565d15579c	Add the ability to use a workload as the unit of execution in the Impala benchmark runner. At the moment, a query is the default unit of execution and parallelism in the Impala performance suite. With this change, we now have the ability to treat a workload as the unit of execution. A workload is defined as a unique combination of the dataset, scale factor, a subset (or all) of the queries in the dataset, and a table format (file format, compression codec and compression scheme). It introduces two new command line options in bin/run-workload.py: * --execution_scope The default scope is 'query', and it maintains previous semantics. The new scope is 'workload', which toggles the unit of execution to a workload. * --shuffle_query_exec_order. Shuffles the order in which queries are executed (only applicable when the execution_scope is workload), defaults to False. Change-Id: I790d75f0896210cda8eb999015b0be04246e4c45 Reviewed-on: http://gerrit.ent.cloudera.com:8080/503 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-01-08 10:53:07 -08:00
Lenni Kuff	dd9798c9f3	IMP-785: calculation_util.calculate_mean does not calculate mean (instead median)	2014-01-08 10:48:35 -08:00
Henry Robinson	e15a39143a	Fix definition of calculate_mean	2014-01-08 10:48:34 -08:00
Lenni Kuff	4cf7d2634e	Update benchmark runner to use mean of all results if num_clients > 1	2014-01-08 10:48:30 -08:00
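
Commits dd9798c9f3 and e15a39143a concern calculate_mean returning the median instead of the mean, and fd1a469878 adds a geometric mean to the report summary. The Python sketch below shows how the three statistics differ on the same timings. It is not the code from calculation_util: the function bodies, the names calculate_median and calculate_geomean, and the sample timings are assumptions made for illustration only.

```python
import math
import statistics


def calculate_mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    return sum(values) / len(values)


def calculate_median(values):
    """Middle value of the sorted data (hypothetical helper name)."""
    return statistics.median(values)


def calculate_geomean(values):
    """Geometric mean via logs; defined only for positive values
    (hypothetical helper name)."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, all positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Made-up query timings in seconds, with one slow outlier.
timings = [1.0, 2.0, 9.0]
print(calculate_mean(timings))    # 4.0
print(calculate_median(timings))  # 2.0
```

With the 9.0 s outlier, the mean is pulled up to 4.0 while the median stays at 2.0, so a function that reports one under the name of the other can noticeably change a benchmark comparison.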

7 Commits
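
Commit 565d15579c describes two execution scopes for the benchmark runner. A minimal Python sketch of that idea follows. Only the option names execution_scope and shuffle_query_exec_order come from the commit message; the function build_execution_units, its signature, the rng parameter, and the sample workload names are hypothetical and do not reflect the actual bin/run-workload.py implementation.

```python
import random


def build_execution_units(workloads, execution_scope="query",
                          shuffle_query_exec_order=False, rng=None):
    """Group queries into units of execution (hypothetical helper).

    workloads maps a workload name to its list of queries.
    Returns a list of (workload_name, [queries]) tuples.
    """
    rng = rng or random.Random()
    units = []
    for name, queries in workloads.items():
        queries = list(queries)  # copy so the caller's list is untouched
        if execution_scope == "query":
            # Previous semantics: every query is its own unit.
            units.extend((name, [q]) for q in queries)
        elif execution_scope == "workload":
            # The whole workload is one unit; shuffling applies only here.
            if shuffle_query_exec_order:
                rng.shuffle(queries)
            units.append((name, queries))
        else:
            raise ValueError("unknown execution_scope: %r" % execution_scope)
    return units


# Made-up sample data for illustration.
sample = {"workload_a": ["q1", "q2", "q3"], "workload_b": ["q4", "q5"]}
print(len(build_execution_units(sample)))              # 5 units, one per query
print(len(build_execution_units(sample, "workload")))  # 2 units, one per workload
```

In this sketch the shuffle flag is silently ignored under the 'query' scope, mirroring the commit's note that it is only applicable when the scope is 'workload'.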