14 Commits

Author SHA1 Message Date
Thomas Tauber-Marshall
11ea79c525 Renamed conjunct_ordering.test to primitive_conjunct_ordering.test in targeted-perf
This is needed because the workload runner required a prefix of query names to run.

Change-Id: Ica8db68141ef653b0b01a7cfa7773302717a35a2
Reviewed-on: http://gerrit.cloudera.org:8080/3021
Tested-by: Internal Jenkins
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
2016-05-14 01:30:00 -07:00
Thomas Tauber-Marshall
8c2bf9769a IMPALA-2805: Order conjuncts based on selectivity and cost
Added costs to all Exprs, which estimate the relative cost of evaluating
an expression and all of its children. Costs are calculated during
analysis. For now, these costs are intended as a simple way to order
expressions from cheap to expensive, not necessarily to be a precise
reflection of running times.

In general, expressions that deal with variable length types like strings
will have higher cost than those dealing with fixed length types
like numbers and booleans. Additionally, expressions with complicated
subexpressions will have higher cost than simpler expressions.

Also added PlanNode.orderConjunctsByCost, which takes a list of Exprs and
returns a new list sorted according to an estimate of the cheapest order to
evaluate the conjuncts in, based on their cost and selectivity.

The conjuncts are sorted by repeatedly iterating over them and choosing the
conjunct that would result in the least total estimated work were it to be
applied before the remaining conjuncts. Selectivities are exponentially
backed off, and Exprs without selectivity estimates are given a reasonable
default.

Change-Id: I02279a26fbc6308ac5eb819d78345fc010469034
Reviewed-on: http://gerrit.cloudera.org:8080/2598
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Internal Jenkins
2016-05-12 14:17:53 -07:00
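
The greedy ordering described in the message for 8c2bf9769a can be sketched roughly as follows. This is an illustrative Python sketch, not the Java code from the patch: the `Conjunct` class, the function name `order_conjuncts_by_cost`, the default selectivity of 0.1, and the backoff exponent of 1/(k+1) for the k-th pick are all assumptions made for the example. Each round picks the conjunct that minimizes its own cost plus the cost of the remaining conjuncts scaled by its backed-off selectivity.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed default for conjuncts with no selectivity estimate (hypothetical value).
DEFAULT_SELECTIVITY = 0.1


@dataclass(frozen=True)
class Conjunct:
    """Hypothetical stand-in for an Expr with a cost and optional selectivity."""
    name: str
    cost: float
    selectivity: Optional[float] = None


def order_conjuncts_by_cost(conjuncts: List[Conjunct]) -> List[Conjunct]:
    """Return a new list ordered by a greedy estimate of least total work."""
    remaining = list(conjuncts)
    ordered: List[Conjunct] = []
    total_cost = sum(c.cost for c in remaining)

    while remaining:
        # Assumed backoff: later picks have their selectivity pulled toward 1.
        backoff = 1.0 / (len(ordered) + 1)

        def work_if_first(c: Conjunct) -> float:
            sel = c.selectivity if c.selectivity is not None else DEFAULT_SELECTIVITY
            # Cost of evaluating c on every row, plus the cost of the other
            # remaining conjuncts on the fraction of rows expected to survive c.
            return c.cost + (total_cost - c.cost) * (sel ** backoff)

        best = min(remaining, key=work_if_first)
        ordered.append(best)
        remaining.remove(best)
        total_cost -= best.cost

    return ordered


if __name__ == "__main__":
    preds = [
        Conjunct("string_like", cost=10.0, selectivity=0.5),
        Conjunct("int_compare", cost=1.0, selectivity=0.5),
    ]
    print([c.name for c in order_conjuncts_by_cost(preds)])
    # prints ['int_compare', 'string_like']
```

With equal selectivities the cheaper conjunct is chosen first, while a sufficiently selective conjunct can be placed ahead of a cheaper one because it reduces the rows seen by everything after it.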
Martin Grund
89113544cd Fix targeted perf queries to deal with run-workload.py limitations.
This patch fixes the comment style in the queries to work properly with
the limitations of the run-workload.py script. This includes removing
quotes and + from comments that otherwise get interpreted.

Change-Id: I791e7bd4145717aa0628c56b93582cd207195039
Reviewed-on: http://gerrit.cloudera.org:8080/1689
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
2016-01-05 00:52:38 +00:00
Mostafa Mokhtar
f79a021cce Add targeted perf queries for nightly performance runs
Tests were tuned to run on a 9x node cluster with 64GB RAM against TPC-H 300GB database.

Change-Id: Ib421bcd463d370f795a235b755aeb24a6a70f705
Reviewed-on: http://gerrit.cloudera.org:8080/1394
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2015-12-29 05:04:10 +00:00
Skye Wanderman-Milne
c79cd3aa23 Add targeted-perf query that makes local expr allocations
Change-Id: Ida40481cb429227058d78c619820de23f5c4a15e
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4772
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-10-07 15:48:32 -07:00
Lenni Kuff
c3619d9581 Add targeted perf queries for columns materialized from inline-view
The planner should see only c1 and c2 are being materialized from
the inline-view in these queries. This will provide a significant
performance improvement on Parquet format tables.

Change-Id: If9a366000531a8383dc20ad6f40456ace2281b7d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1017
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
2014-01-08 10:54:17 -08:00
Lenni Kuff
11556a1ad2 Add targeted perf regression test for IMPALA-288 2014-01-08 10:50:13 -08:00
ishaan
15658f384b Include targeted performance tests in experiments and add a new query 2014-01-08 10:49:02 -08:00
Skye Wanderman-Milne
461a48df2b Refactor testing framework to generate Avro tables. 2014-01-08 10:48:45 -08:00
Lenni Kuff
328ceed4e7 Add support for generating lzo compressed text files and running tests against lzo 2014-01-08 10:48:38 -08:00
Lenni Kuff
90d7e085fa Update tests to use num_nodes=0, use external impala cluster, add sanity check run mode 2014-01-08 10:48:38 -08:00
Skye Wanderman-Milne
6c08716439 IMPALA-92: Significant performance difference between LIKE = 'x' AND = 'x' 2014-01-08 10:48:21 -08:00
ishaan
09d6d931f4 Change the way data is loaded 2014-01-08 10:48:09 -08:00
Lenni Kuff
3fb375cdc4 Add initial set of queries for targeted perf workload
Includes a query that runs a simple "limit 0" as well as queries that perform aggregation
on columns with different numbers of GROUP BY groups.
2014-01-08 10:47:23 -08:00