single_node_perf_run.py uses "git_hash_A vs. git_hash_B", distinguished
by their position in the command-line arguments.
single_node_perf_run.py calls report_benchmark_results.py, which uses
"reference vs. input", distinguished by their command-line flags. The
output of report_benchmark_results.py uses "{empty string} vs. Base".
In the long run, I think it would be better to fix all three to use
the same terminology, but this comment hopefully adds clarity.
Change-Id: Ib236ce7e83dc193ef1382f6304444ce58759a639
Reviewed-on: http://gerrit.cloudera.org:8080/8470
Tested-by: Impala Public Jenkins
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
In single_node_perf_run.py, restore_workloads() can make the tree
"dirty", and when a tree is dirty, git won't let you switch branches
in a way that clobbers the dirty file contents:
    $ cd $(mktemp -d)
    $ git init .
    Initialized empty Git repository in /tmp/tmp.H0NxzTXLUj/.git/
    $ touch foo && git add foo && git commit -a -m "foo"
    [master (root-commit) 3776149] foo
     1 file changed, 0 insertions(+), 0 deletions(-)
     create mode 100644 foo
    $ git checkout -b ok_foo && echo "ok" >> foo && git commit -a -m "foo is ok"
    Switched to a new branch 'ok_foo'
    [ok_foo 9fd5bde] foo is ok
     1 file changed, 1 insertion(+)
    $ git checkout master && echo "not ok" >> foo
    Switched to branch 'master'
    $ git checkout ok_foo
    error: Your local changes to the following files would be overwritten by checkout:
        foo
    Please, commit your changes or stash them before you can switch branches.
    Aborting
Discovered when testing single_node_perf_run.py with
https://gerrit.cloudera.org/#/c/7153/; after this commit, that patch
works with single_node_perf_run.py.
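As an illustration of the failure mode above, here is a minimal Python
sketch of detecting a dirty tree before switching branches. It is not
taken from single_node_perf_run.py; the helper names dirty_tracked_paths
and tree_is_dirty are hypothetical, and the sketch assumes the
"git status --porcelain" (v1) output format, in which each entry is a
two-character status code, a space, and a path.

```python
import subprocess


def dirty_tracked_paths(porcelain_output):
    """Return paths of tracked files with uncommitted changes.

    Expects the text printed by `git status --porcelain` (v1 format).
    Untracked entries ("??") are skipped here as a simplification.
    """
    paths = []
    for line in porcelain_output.splitlines():
        # Each entry is "XY <path>": two status characters, then a space.
        if len(line) < 4 or line.startswith("??"):
            continue
        paths.append(line[3:])
    return paths


def tree_is_dirty(repo_dir):
    """Hypothetical helper: True if repo_dir has modified tracked files."""
    out = subprocess.check_output(["git", "status", "--porcelain"], cwd=repo_dir)
    return bool(dirty_tracked_paths(out.decode("utf-8")))
```

A caller could check tree_is_dirty() before each checkout and clean up
or abort early, rather than letting git fail partway through a run.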
Change-Id: Id0220f3cd7a26d2627e40cd432c23815a6d65ea4
Reviewed-on: http://gerrit.cloudera.org:8080/7291
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
When git checkout would overwrite changes, it fails and alerts the
user to do something with the changes. This patch removes any changes
to files induced by the workload copy-and-paste.
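One way to discard such changes, sketched in Python under stated
assumptions: this is not necessarily what the patch does. The helper
names discard_command and discard_changes are hypothetical. The sketch
relies on "git checkout -- <path>", which restores tracked files under
the given path from the index and leaves untracked files alone.

```python
import subprocess


def discard_command(path):
    """Build the git command that restores tracked files under `path`."""
    # The "--" separates options from paths, so a path is never
    # mistaken for a branch name.
    return ["git", "checkout", "--", path]


def discard_changes(repo_dir, path):
    """Hypothetical helper: drop unstaged edits to tracked files under path."""
    subprocess.check_call(discard_command(path), cwd=repo_dir)
```

For example, discard_changes(repo_dir, "testdata/workloads") would undo
unstaged edits under that directory before the next branch switch.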
Testing: using a patch provided by Lars Volker that touched
testdata/workloads/ (https://gerrit.cloudera.org/#/c/7073/), I was
able to reproduce the problem he saw and see that this patch fixed it.
Change-Id: I9a0d004c353eb4b547aeaf3c56289594326653d7
Reviewed-on: http://gerrit.cloudera.org:8080/7145
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Impala Public Jenkins
This is a migration from an old and broken script from another
repository. Example use:
    bin/single_node_perf_run.py --ninja --workloads targeted-perf \
      --load --scale 4 --iterations 20 --num_impalads 3 \
      --start_minicluster --query_names PERF_AGG-Q3 \
      $(git rev-parse HEAD~1) $(git rev-parse HEAD)
The script can load data, run benchmarks, and compare the statistics
of those runs for significant differences in performance. It glues
together buildall.sh, bin/load-data.py, bin/run-workload.py, and
tests/benchmark/report_benchmark_results.py.
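The overall flow described above can be sketched roughly as follows.
This is an assumption-laden outline, not the script's actual code:
compare_revisions and its callable parameters (checkout, build,
run_benchmark, report) are hypothetical stand-ins for the real steps,
which shell out to buildall.sh, bin/run-workload.py, and
tests/benchmark/report_benchmark_results.py. Data loading is omitted
for brevity.

```python
def compare_revisions(hash_a, hash_b, checkout, build, run_benchmark, report):
    """Benchmark two git revisions in turn and compare their results.

    All four callables are hypothetical hooks supplied by the caller:
      checkout(git_hash)      -- switch the working tree to git_hash
      build()                 -- build the currently checked-out tree
      run_benchmark(git_hash) -- run the workload, return its results
      report(res_a, res_b)    -- compare the two result sets
    """
    results = []
    for git_hash in (hash_a, hash_b):
        checkout(git_hash)
        build()
        results.append(run_benchmark(git_hash))
    return report(results[0], results[1])
```

Passing the steps in as callables keeps the ordering logic separate
from the commands themselves, so it can be exercised with stubs.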
Change-Id: I70ba7f3c28f612a370915615600bf8dcebcedbc9
Reviewed-on: http://gerrit.cloudera.org:8080/6818
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins