impala

mirror of https://github.com/apache/impala.git synced 2026-01-06 06:01:03 -05:00

Author	SHA1	Message	Date
Dan Hecht	ffa7829b70	IMPALA-3918: Remove Cloudera copyrights and add ASF license header For files that have a Cloudera copyright (and no other copyright notice), make changes to follow the ASF source file header policy here: http://www.apache.org/legal/src-headers.html#headers Specifically: 1) Remove the Cloudera copyright. 2) Modify NOTICE.txt according to http://www.apache.org/legal/src-headers.html#notice to follow that format and add a line for Cloudera. 3) Replace or add the existing ASF license text with the one given on the website. Much of this change was automatically generated via: git grep -li 'Copyright.Cloudera' > modified_files.txt cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.Cloudera#i;' cat modified_files_txt | xargs fix_apache_license.py [1] Some manual fixups were performed following those steps, especially when license text was completely missing from the file. [1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor modification to ORIG_LICENSE to match Impala's license text. Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86 Reviewed-on: http://gerrit.cloudera.org:8080/3779 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-08-09 08:19:41 +00:00
Casey Ching	074e5b4349	Remove hashbang from non-script python files Many python files had a hashbang and the executable bit set though they were not intended to be run as a standalone script. That makes determining which python files are actually scripts very difficult. A future patch will update the hashbang in real python scripts so they use $IMPALA_HOME/bin/impala-python. Change-Id: I04eafdc73201feefe65b85817a00474e182ec2ba Reviewed-on: http://gerrit.cloudera.org:8080/599 Reviewed-by: Casey Ching <casey@cloudera.com> Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: Internal Jenkins	2015-08-04 05:26:07 +00:00
Taras Bobrovytsky	29a7368940	Modified perf_result_datastore to use Impala instead of MySQL Change-Id: I441a51bc7e03d1bfe2283e77c16cba9394034258 Reviewed-on: http://gerrit.cloudera.org:8080/325 Reviewed-by: Martin Grund <mgrund@cloudera.com> Tested-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>	2015-04-09 20:25:28 +00:00
Taras Bobrovytsky	fd1a469878	Significant improvements to benchmark report - Added % change to performance regressions/improvements table - Automatic extraction of Impala version from runtime profiles - Execution summary row will not be printed if max time is < 100ms or < 2% of the overall runtime - Failed queries are ignored - First result is discarded for each query - Geometric mean was added to summary - Improved handling of multiple workloads in a single JSON file - Improved handling of the case when queries are different in results and reference results - Works well for single client runs. Additional work is needed to handle multiple client runs well. Change-Id: Ice7b9cc4fd7502a448d35ace10fbcef183df1769 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4210 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins (cherry picked from commit c722f6b0a104df54b550978cd222a9af4d39b929) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5250 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>	2014-11-13 18:54:08 -08:00
ishaan	565d15579c	Add the ability to use a workload as the unit of execution in the Impala benchmark runner. At the moment, a query is the default unit of execution and parallelism in the Impala performance suite. With this change, we now have the ability to treat a workload as the unit of execution. A workload is defined as a unique combination of the dataset, scale factor, a subset (or all) of the queries in the dataset, and a table format (file format, compression codec and compression scheme). It introduces two new command line options in bin/run-workload.py: * --execution_scope The default scope is 'query', and it maintains previous semantics. The new scope is 'workload', which toggles the unit of execution to a workload. * --shuffle_query_exec_order Shuffles the order in which queries are executed (only applicable when the execution_scope is 'workload'), defaults to False. Change-Id: I790d75f0896210cda8eb999015b0be04246e4c45 Reviewed-on: http://gerrit.ent.cloudera.com:8080/503 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-01-08 10:53:07 -08:00
Lenni Kuff	dd9798c9f3	IMP-785: calculation_util.calculate_mean does not calculate mean (instead median)	2014-01-08 10:48:35 -08:00
Henry Robinson	e15a39143a	Fix definition of calculate_mean	2014-01-08 10:48:34 -08:00
Lenni Kuff	4cf7d2634e	Update benchmark runner to use mean of all results if num_clients > 1	2014-01-08 10:48:30 -08:00

8 Commits
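
Several commits in this history concern how benchmark results are aggregated: IMP-785 reports that calculate_mean returned the median rather than the mean, and a later commit adds a geometric mean to the report summary. The sketch below only illustrates the difference between those three statistics. It is not the code from the repository: the function bodies, the name calculate_geometric_mean, and the sample runtimes are assumptions made for illustration.

```python
import math
from statistics import median


def calculate_mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("values must be non-empty")
    return sum(values) / len(values)


def calculate_geometric_mean(values):
    """Geometric mean, computed in log space to avoid overflow.

    Hypothetical helper; only defined for strictly positive values.
    """
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    # Made-up query runtimes in seconds, with one slow outlier.
    runtimes = [1.0, 2.0, 9.0]
    print("mean:          ", calculate_mean(runtimes))
    print("median:        ", median(runtimes))
    print("geometric mean:", calculate_geometric_mean(runtimes))
```

With a slow outlier in the sample, the mean and the median diverge noticeably, which is why a function named calculate_mean that returns the median would skew a benchmark comparison. The geometric mean is less sensitive to a single large value than the arithmetic mean, which makes it a common choice for summarizing runtimes across queries.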