A new test case from IMPALA-13445 reveals a pre-existing bug where
cost-based planning may raise expectedNumInputInstance above
inputFragment.getNumInstances(), which leads to a precondition violation.
All of the following held when the precondition was hit:

1. The environment is either Erasure Coded HDFS or Ozone.
2. The source table has neither stats nor the numRows table property.
3. Before the DML fragment is added, the plan tree has only one fragment,
   consisting of a ScanNode.
4. Byte-based cardinality estimation logic kicks in.
5. Byte-based cardinality causes high scan cost, which leads to
   maxScanThread exceeding inputFragment.getPlanRoot().
6. expectedNumInputInstance is assigned equal to maxScanThread.
7. Precondition expectedNumInputInstance < inputFragment.getPlanRoot()
   is violated.

This scenario triggers a special condition that attempts to lower
expectedNumInputInstance. Instead of lowering it, the special logic
increases it because the byte-based cardinality estimate is higher.

There is also a new bug where DistributedPlanner.java mistakenly passes
root.getInputCardinality() instead of root.getCardinality().

This patch fixes both issues and does minor refactoring to rename
variables to camel case. It also relaxes validation of the last test case
of test_query_cpu_count_on_insert so that it passes in Erasure Coded HDFS
and Ozone setups.

Testing:
- Made several assertions in test_executor_groups.py more verbose.
- Passed test_executor_groups.py in Erasure Coded HDFS and Ozone setups.
- Added new Planner tests with unknown cardinality estimation.
- Passed core tests in regular setup.

Change-Id: I834eb6bf896752521e733cd6b77a03f746e6a447
Reviewed-on: http://gerrit.cloudera.org:8080/21966
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
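
Illustration only, not the code in this patch: the invariant described in the message is that the expected number of input instances may be lowered but must never exceed the number of instances the input fragment actually has. The sketch below shows one way such a guard could look. The class InstanceClampSketch, the method clampExpectedInstances, and its plain int parameters are hypothetical names introduced for this example; they are not Impala APIs.

```java
public class InstanceClampSketch {
    /**
     * Hypothetical helper (not an Impala method): picks the expected number
     * of input instances so that it never exceeds numInstances, even when a
     * byte-based estimate inflates maxScanThread.
     */
    static int clampExpectedInstances(int maxScanThread, int numInstances) {
        if (numInstances < 1) {
            throw new IllegalArgumentException("numInstances must be >= 1");
        }
        // Only ever lower the value: take the smaller of the two,
        // but keep at least one instance.
        return Math.max(1, Math.min(maxScanThread, numInstances));
    }

    public static void main(String[] args) {
        // Inflated estimate (12 scan threads) against 3 real instances.
        System.out.println(clampExpectedInstances(12, 3));
        // Estimate already below the instance count is left unchanged.
        System.out.println(clampExpectedInstances(2, 3));
    }
}
```

With a guard of this shape, an inflated scan-cost estimate can no longer push the expected instance count past what the fragment provides, so a check that compares the two values would hold by construction.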