Mirror of https://github.com/apache/impala.git (synced 2026-02-02 15:00:38 -05:00)
Grouping aggregations previously always repartitioned their input, even if preceding joins or aggs had already partitioned the data on the required key (or an equivalent key).

This patch checks to see if data is already partitioned on the required exprs (or equivalent ones), and if so skips the preaggregation and only does a merge aggregation.

The patch also does some refactoring of the aggregation planning in DistributedPlanner to make it easier to implement the change.

Includes planner tests for the three cases that are affected: grouping aggregations, non-grouping distinct aggregations and grouping distinct aggregations.

Change-Id: Iffdcfd3629b8a69bd23915e1adba3b8323cbbaef
Reviewed-on: http://gerrit.cloudera.org:8080/2414
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
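
For illustration only, the partitioning check described in this message could be sketched as below. This is not the DistributedPlanner code: `Partitioning`, `canonicalize`, `is_partitioned_on` and `plan_grouping_agg` are hypothetical names, expressions are modeled as plain strings, and equivalence is modeled as explicit sets of interchangeable expressions. The step names simply follow the wording of the message.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence

# Step names follow the wording of the commit message (assumed labels).
FULL_PLAN = ["pre-aggregate", "exchange", "merge-aggregate"]
SHORT_PLAN = ["merge-aggregate"]


@dataclass(frozen=True)
class Partitioning:
    """Hypothetical stand-in for the output partitioning of a child fragment."""
    exprs: FrozenSet[str]  # an empty set models unpartitioned input


def canonicalize(expr: str, equiv_classes: Sequence[FrozenSet[str]]) -> str:
    """Map an expression to one representative of its equivalence class."""
    for cls in equiv_classes:
        if expr in cls:
            return min(cls)  # any deterministic representative works
    return expr


def is_partitioned_on(child: Partitioning,
                      grouping_exprs: Sequence[str],
                      equiv_classes: Sequence[FrozenSet[str]]) -> bool:
    """True if the child is partitioned on the grouping exprs or equivalents."""
    if not child.exprs:
        return False
    child_keys = {canonicalize(e, equiv_classes) for e in child.exprs}
    group_keys = {canonicalize(e, equiv_classes) for e in grouping_exprs}
    return child_keys == group_keys


def plan_grouping_agg(child: Partitioning,
                      grouping_exprs: Sequence[str],
                      equiv_classes: Sequence[FrozenSet[str]]) -> List[str]:
    """Skip the pre-aggregation and exchange when the input is already partitioned."""
    if is_partitioned_on(child, grouping_exprs, equiv_classes):
        return list(SHORT_PLAN)
    return list(FULL_PLAN)
```

With an equivalence class `{"t1.id", "t2.id"}` (for example, from a join predicate `t1.id = t2.id`), input partitioned on `t1.id` would satisfy a grouping on `t2.id` in this sketch, so only the merge step is planned.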