impala/tests/performance
Thomas Tauber-Marshall b8a8edddcb IMPALA-8207: Fix query loading for perf and stress tests
Problems with perf queries (run-workload.py):
- TPCH picks up stress test specific queries (TPCH-AGG1/2/3)
- TPCDS picks up queries that exist only to validate that data was
  loaded properly and that aren't interesting from a perf perspective
  (TPCDS-COUNT-<table>)
- TPCDS picks up both decimal_v1 and decimal_v2 queries. This is
  mostly harmless because, for queries with matching names, only one
  gets run, but it causes some queries with mismatched names to be run
  twice (TPCDS-Q39-1/2 vs. TPCDS-Q39.1/2)
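
The filtering that the three points above call for could be sketched
roughly as follows. This is an illustration only, not the code in this
patch: the Query record, the EXCLUDED patterns, select_perf_queries and
the sample data are all hypothetical, and which Q39 spelling belongs to
which decimal variant is assumed for the example.

```python
import re
from collections import namedtuple

# Hypothetical record; the real scripts load queries from workload files.
Query = namedtuple("Query", ["name", "decimal_variant"])

# Name patterns that should not run as perf queries (assumed regexes).
EXCLUDED = [
    re.compile(r"^TPCH-AGG\d+$"),     # stress-test-specific queries
    re.compile(r"^TPCDS-COUNT-.+$"),  # data-load validation queries
]


def select_perf_queries(queries, decimal_variant="decimal_v2"):
    """Keep queries of a single decimal variant and drop excluded names."""
    selected = []
    for q in queries:
        # Queries with no variant (None) apply to every configuration.
        if q.decimal_variant not in (None, decimal_variant):
            continue
        if any(p.match(q.name) for p in EXCLUDED):
            continue
        selected.append(q)
    return selected


# Sample data; the variant assignment for Q39 is an assumption.
queries = [
    Query("TPCH-Q1", None),
    Query("TPCH-AGG1", None),
    Query("TPCDS-COUNT-store_sales", None),
    Query("TPCDS-Q39-1", "decimal_v1"),
    Query("TPCDS-Q39.1", "decimal_v2"),
]
selected_names = [q.name for q in select_perf_queries(queries)]
print(selected_names)
```

Selecting one decimal variant before comparing names avoids relying on
the v1 and v2 files using identical query names.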

Problems with stress queries (concurrent_select.py):
- TPCDS fails to pick up Q22A as it does not use the decimal_v2
  queries, even though decimal_v2 is the default now.

These problems are exacerbated by the fact that the two scripts have
different code paths for selecting the queries, so in the past, changes
that were made to one path were not always made to the other.

This patch merges the two paths to reduce code duplication and prevent
these sorts of issues in the future, and fixes the above issues.

One complication is that historically the stress test has used query
names in the form 'q1' whereas the perf test has used query names in
the form 'TPCH-Q1'. This patch standardizes on using 'TPCH-Q1'.
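
As a rough illustration of the naming change, a short stress-test name
could be mapped to the standardized form like this. The helper
normalize_query_name and its workload parameter are hypothetical and
are not taken from the patch.

```python
import re


def normalize_query_name(name, workload):
    """Return the name in 'WORKLOAD-QN' form, e.g. 'q1' -> 'TPCH-Q1'."""
    prefix = workload.upper() + "-"
    upper = name.upper()
    if upper.startswith(prefix):
        # Already in the standardized form.
        return upper
    if re.match(r"^Q\d+", upper):
        # Short stress-test form such as 'q1' or 'q22a'.
        return prefix + upper
    raise ValueError("unrecognized query name: %r" % name)


print(normalize_query_name("q1", "tpch"))
print(normalize_query_name("TPCH-Q1", "tpch"))
```

A helper like this would only matter for callers that still pass the
old short names; once both scripts share one loading path, the
standardized names can be used directly.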

Testing:
- Added a test that checks that the perf tests pick up the expected
  number of queries.
- Manually ran the scripts and verified that the correct queries are
  selected.

Change-Id: Id1966d6ca8babdda07d47e089b75ba06d0318c0d
Reviewed-on: http://gerrit.cloudera.org:8080/12503
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-02-19 22:31:17 +00:00