mirror of
https://github.com/apache/impala.git
synced 2025-12-19 18:12:08 -05:00
Impala has an optimization for analytic expressions that have a rank
filter on top of the analytic expression: it can add a top-n plan node
to reduce the number of rows examined. This is tested in TPC-DS query 67.

The optimization logic relies on an unassigned rank conjunct within the
analyzer while creating the analytic plan node.

A slight reorganization of the code was needed to implement this
optimization. The SlotRefs for the AnalyticInfo needed to be created
earlier than they were in the previous commit.

A small fix was made to normalize binary predicates. A non-normalized
binary predicate prevents the optimization from being used.

A call to checkAndApplyLimitPushdown is needed for some of the
optimizations to kick in.

A new AllProjectInfo internal class was created to hold the
relationships between the Calcite RexNode objects and the Impala
analytic expressions.

This commit also fixes IMPALA-14158: the nullsFirst value was incorrect
when the syntax was explicit in the query.

A new Calcite planner test was added to the junit tests to ensure the
optimization kicks in. The new test file is
PlannerTest/calcite/limit-pushdown-analytic-calcite.test. It is a copy
of the limit-pushdown-analytic.test file in its parent directory, with
some modified results. Most of the differences are trivial, but
IMPALA-14469 has been filed for one optimization that is not yet fixed:
the case where the ORDER BY clause has a constant expression.

Change-Id: Ie6fa6781db56771b13b0cf49bd236f776016bf8d
Reviewed-on: http://gerrit.cloudera.org:8080/23317
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Aman Sinha <amsinha@cloudera.com>