This is needed because the workload runner requires a prefix of query names to run.
Change-Id: Ica8db68141ef653b0b01a7cfa7773302717a35a2
Reviewed-on: http://gerrit.cloudera.org:8080/3021
Tested-by: Internal Jenkins
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Added costs to all Exprs, which estimate the relative cost of evaluating
an expression and all of its children. Costs are calculated during
analysis. For now, these costs are intended as a simple way to order
expressions from cheap to expensive, not necessarily to be a precise
reflection of running times.
In general, expressions that deal with variable length types like strings
will have higher cost than those dealing with fixed length types
like numbers and booleans. Additionally, expressions with complicated
subexpressions will have higher cost than simpler expressions.
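
As a rough illustration of the two rules above (not the code in this
patch), a cost can be modelled as a per-node charge plus the cost of all
children. The Expr class, the is_var_len flag, and both constants below
are hypothetical names and values chosen only for this sketch.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical relative charges; only their ordering matters here.
FIXED_LEN_COST = 1.0  # numbers, booleans
VAR_LEN_COST = 5.0    # strings and other variable length types


@dataclass
class Expr:
    """Toy expression node used only for this illustration."""
    op: str
    is_var_len: bool = False
    children: List["Expr"] = field(default_factory=list)

    def cost(self) -> float:
        # Cost of this node plus the cost of every subexpression, so
        # deeper trees are never cheaper than their parts.
        own = VAR_LEN_COST if self.is_var_len else FIXED_LEN_COST
        return own + sum(child.cost() for child in self.children)


int_cmp = Expr("=", children=[Expr("int_col"), Expr("int_literal")])
str_cmp = Expr("=", is_var_len=True,
               children=[Expr("str_col", True), Expr("str_literal", True)])
both = Expr("AND", children=[int_cmp, str_cmp])
```

In this toy model the string comparison costs more than the integer
comparison, and the AND over both costs more than either child.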
Also added PlanNode.orderConjunctsByCost, which takes a list of Exprs and
returns a new list sorted according to an estimate of the cheapest order to
evaluate the conjuncts in, based on their cost and selectivity.
The conjuncts are sorted by repeatedly iterating over them and choosing the
conjunct that would result in the least total estimated work were it to be
applied before the remaining conjuncts. Selectivities are exponentially
backed off, and Exprs without selectivity estimates are given a reasonable
default.
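
The greedy selection described above can be sketched as follows. This is
an illustration under stated assumptions, not the code in this patch:
the Conjunct class, the DEFAULT_SELECTIVITY value of 0.1, and the
1/(k+1) backoff exponent are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed fallback for conjuncts without a selectivity estimate.
DEFAULT_SELECTIVITY = 0.1


@dataclass(frozen=True)
class Conjunct:
    """Toy conjunct: a relative cost and an optional selectivity in [0, 1]."""
    name: str
    cost: float
    selectivity: Optional[float] = None


def order_conjuncts_by_cost(conjuncts: List[Conjunct]) -> List[Conjunct]:
    """Return a new list ordered by greedy least-estimated-work selection."""
    remaining = list(conjuncts)
    total_cost = sum(c.cost for c in remaining)
    ordered: List[Conjunct] = []
    while remaining:
        # Assumed backoff: the k-th pick (0-based) raises selectivities
        # to the power 1/(k+1), so later picks are trusted less.
        backoff_exp = 1.0 / (len(ordered) + 1)

        def work_if_first(c: Conjunct) -> float:
            sel = DEFAULT_SELECTIVITY if c.selectivity is None else c.selectivity
            sel = sel ** backoff_exp
            # Evaluate c on every row, then the rest on the surviving rows.
            return c.cost + (total_cost - c.cost) * sel

        best = min(remaining, key=work_if_first)
        ordered.append(best)
        remaining.remove(best)
        total_cost -= best.cost
    return ordered


# A slightly more expensive but far more selective conjunct goes first.
p = Conjunct("p", cost=1.0, selectivity=1.0)
q = Conjunct("q", cost=2.0, selectivity=0.01)
order = [c.name for c in order_conjuncts_by_cost([p, q])]  # ["q", "p"]
```

Each round is linear in the number of remaining conjuncts, so the whole
sort is quadratic, which is acceptable for the short conjunct lists a
plan node typically carries.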
Change-Id: I02279a26fbc6308ac5eb819d78345fc010469034
Reviewed-on: http://gerrit.cloudera.org:8080/2598
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Internal Jenkins
This patch fixes the comment style in the queries so that they work within
the limitations of the run-workload.py script. This includes removing
quotes and '+' characters from comments, which the script would otherwise
interpret.
Change-Id: I791e7bd4145717aa0628c56b93582cd207195039
Reviewed-on: http://gerrit.cloudera.org:8080/1689
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
Tests were tuned to run on a 9-node cluster with 64GB RAM against a
TPC-H 300GB database.
Change-Id: Ib421bcd463d370f795a235b755aeb24a6a70f705
Reviewed-on: http://gerrit.cloudera.org:8080/1394
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
The planner should see that only c1 and c2 are materialized from
the inline view in these queries. This will provide a significant
performance improvement on Parquet format tables.
Change-Id: If9a366000531a8383dc20ad6f40456ace2281b7d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1017
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins