impala/tests/query_test/test_limit_pushdown_analytic.py
Aman Sinha ecfc1af0db IMPALA-9983 : Pushdown limit to analytic sort operator
This patch pushes the LIMIT from a top level Sort down to
the Sort below an Analytic operator when it is safe to do
so. There are several qualifying checks that are done. The
optimization is done at the time of creating the top level
Sort in the single node planner. When the pushdown is
applicable, the analytic sort is converted to a TopN sort.
Further, this is split into a bottom TopN and an upper
TopN separated by a hash partition exchange. This
ensures that the limit is applied as early as possible
before hash partitioning.
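
The bottom/upper TopN split described above can be illustrated with a minimal sketch. It is not Impala's planner code: `top_n`, `two_phase_top_n`, the row layout, and the four-way fragment split are all hypothetical, and the hash partition exchange is reduced to a plain concatenation. It only shows that trimming each fragment to N rows before the exchange yields the same N rows as one global sort with a limit.

```python
import heapq
import random


def top_n(rows, n, key):
    """Return the n smallest rows under `key`, in sorted order."""
    return heapq.nsmallest(n, rows, key=key)


def two_phase_top_n(fragments, n, key):
    """Hypothetical two-phase TopN: trim per fragment, exchange, trim again."""
    # Bottom TopN: each fragment keeps at most n rows before the exchange.
    trimmed = [top_n(fragment, n, key) for fragment in fragments]
    # Stand-in for the exchange: gather the surviving rows in one place.
    exchanged = [row for fragment in trimmed for row in fragment]
    # Upper TopN: apply the same limit to the much smaller exchanged input.
    return top_n(exchanged, n, key)


random.seed(0)
# Rows are (partition_key, order_key) pairs; the layout is an assumption.
rows = [(random.randint(0, 5), random.randint(0, 1000)) for _ in range(200)]
fragments = [rows[i::4] for i in range(4)]


def sort_key(row):
    return (row[0], row[1])


expected = sorted(rows, key=sort_key)[:10]
result = two_phase_top_n(fragments, 10, sort_key)
# At most n rows per fragment cross the exchange, instead of every row.
rows_exchanged = sum(min(10, len(fragment)) for fragment in fragments)
```

In this sketch only 40 of the 200 rows cross the exchange, which is the benefit the commit message attributes to applying the limit before hash partitioning.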

Fixed a couple of additional related issues uncovered as a
result of limit pushdown:
 - Changed the analytic sort's partition-by expr sort
   semantic from NULLS FIRST to NULLS LAST to ensure
   correctness in the presence of limit.
 - The LIMIT on the analytic sort node was causing it to
   be treated as a merging point in the distributed planner.
   Fixed it by introducing an API, allowPartitioned(), in
   PlanNode.

Testing:
 - Ran PlannerTest and updated several EXPLAIN plans.
 - Added Planner tests for both positive and negative cases of
   limit pushdown.
 - Ran end-to-end TPC-DS queries. Specifically tested
   TPC-DS q67 for limit pushdown and result correctness.
 - Added targeted end-to-end tests using TPC-H dataset.

Change-Id: Ib39f46a7bb75a34466eef7f91ddc25b6e6c99284
Reviewed-on: http://gerrit.cloudera.org:8080/16219
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-08-02 02:42:40 +00:00


# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# Test the limit pushdown to analytic sort in the presence
# of ranking functions

from tests.common.impala_test_suite import ImpalaTestSuite


class TestLimitPushdownAnalytic(ImpalaTestSuite):
  @classmethod
  def get_workload(cls):
    return 'tpch'

  @classmethod
  def add_test_dimensions(cls):
    super(TestLimitPushdownAnalytic, cls).add_test_dimensions()
    cls.ImpalaTestMatrix.add_constraint(lambda v:
        v.get_value('table_format').file_format in ['parquet'])

  def test_limit_pushdown_analytic(self, vector):
    self.run_test_case('limit-pushdown-analytic', vector)
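
The constraint passed to `add_constraint` above is a predicate over test vectors. The following standalone sketch shows how such a predicate narrows a set of vectors to Parquet only. `Vector`, `TableFormat`, and `apply_constraint` are hypothetical stand-ins written for this illustration; they are not Impala's test framework classes, and the real matrix may evaluate constraints differently.

```python
from collections import namedtuple

# Hypothetical stand-in for the object returned by get_value('table_format').
TableFormat = namedtuple('TableFormat', ['file_format'])


class Vector(object):
    """Hypothetical test vector exposing get_value(), as the lambda expects."""

    def __init__(self, values):
        self._values = values

    def get_value(self, name):
        return self._values[name]


def apply_constraint(vectors, constraint):
    """Keep only the vectors for which the constraint predicate holds."""
    return [v for v in vectors if constraint(v)]


vectors = [Vector({'table_format': TableFormat(fmt)})
           for fmt in ['text', 'parquet', 'orc']]

# Same predicate shape as the lambda in add_test_dimensions().
parquet_only = lambda v: v.get_value('table_format').file_format in ['parquet']

kept = apply_constraint(vectors, parquet_only)
kept_formats = [v.get_value('table_format').file_format for v in kept]
```

Under these assumptions only the Parquet vector survives, so the test would run once per remaining combination of the other dimensions.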