mirror of
https://github.com/apache/impala.git
synced 2026-01-08 03:02:48 -05:00
This patch contains 2 parts:
1. Push the limit down to the pre-aggregation when both conditions
   below are true:
   a) the aggregation node has no aggregate function
   b) the aggregation node has no predicate
2. Finish the aggregation once the number of unique keys in the hash
   table has exceeded the limit.
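
The two parts above can be sketched roughly as follows. This is an illustrative model only, not Impala's planner or backend code: `AggNode`, `can_push_limit_to_preagg`, and `pre_aggregate` are hypothetical names, and the sketch assumes the aggregation stops as soon as the number of distinct keys reaches the limit (the exact threshold used by the patch may differ).

```python
from dataclasses import dataclass, field
from typing import Hashable, Iterable, List, Optional


@dataclass
class AggNode:
    """Hypothetical stand-in for a planner aggregation node."""
    grouping_exprs: List[str]
    aggregate_functions: List[str] = field(default_factory=list)
    predicates: List[str] = field(default_factory=list)
    limit: Optional[int] = None


def can_push_limit_to_preagg(node: AggNode) -> bool:
    # Part 1: the limit is only pushed down when the node has
    # a) no aggregate function and b) no predicate.
    return (node.limit is not None
            and not node.aggregate_functions
            and not node.predicates)


def pre_aggregate(rows: Iterable[Hashable],
                  limit: Optional[int] = None) -> List[Hashable]:
    # Part 2: a set plays the role of the hash table of grouping keys.
    seen = set()
    out = []
    for key in rows:
        if key not in seen:
            seen.add(key)
            out.append(key)
            # Assumed threshold: stop once `limit` distinct keys are held.
            if limit is not None and len(seen) >= limit:
                break
    return out


node = AggNode(grouping_exprs=["f"], limit=2)
if can_push_limit_to_preagg(node):
    print(pre_aggregate([1, 1, 2, 3, 3, 4], limit=node.limit))
```

Both conditions matter for correctness: an aggregate function such as COUNT needs every input row of a group, and a predicate on the aggregation output may discard groups, so the first n keys seen would not necessarily be enough.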
Sample query:
  SELECT DISTINCT f FROM t LIMIT n
The LIMIT can be passed all the way down to the pre-aggregation, which
can yield a very large speedup for such queries on large tables when n
is small.
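
To give a feel for why the benefit grows with table size, here is a small sketch that counts how many input rows a DISTINCT ... LIMIT n needs to read when it can stop early. The function `distinct_limit` and the synthetic column are assumptions made for illustration, n is assumed to be at least 1, and real savings in Impala depend on data distribution and on how the scan is parallelized.

```python
def distinct_limit(rows, n):
    """Return up to n distinct values and the number of input rows read."""
    seen = set()
    rows_read = 0
    for value in rows:
        rows_read += 1
        seen.add(value)
        # Stop reading input as soon as n distinct values are available.
        if len(seen) >= n:
            break
    return sorted(seen), rows_read


# Hypothetical column f: 1,000,000 rows cycling through 1000 distinct values.
# The generator is lazy, so rows that are never read are never produced.
table = (i % 1000 for i in range(1_000_000))
values, rows_read = distinct_limit(table, 5)
print(values, rows_read)
```

In this synthetic case only the first 5 rows are consumed, whereas without the early exit all 1,000,000 rows would pass through the aggregation before the limit is applied.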
Testing:
Add test targeted-perf/queries/aggregation.test
Passed core tests
Change-Id: I930a6cb203615acfc03f23118d1bc1f0ea360995
Reviewed-on: http://gerrit.cloudera.org:8080/17821
Reviewed-by: Qifan Chen <qchen@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>