Revert "IMPALA-11123: Optimize count(star) for ORC scans"

This reverts commit f932d78ad0.

The commit is reverted because it causes a significant regression for non-optimized count(star) queries in Parquet format.

Several conflicts had to be resolved manually:
- Removed the assertion against the 'NumFileMetadataRead' counter, which is lost with the revert.
- Adjusted the assertions in test_plain_count_star_optimization, test_in_predicate_push_down, and test_partitioned_insert of test_iceberg.py because the improvement to the Parquet optimized count(star) code path is missing.
- Kept the "override" specifier in hdfs-parquet-scanner.h to pass clang-tidy.
- Kept the python3 style of RuntimeError instantiation in test_file_parser.py to pass check-python-syntax.sh.

Change-Id: Iefd8fd0838638f9db146f7b706e541fe2aaf01c1
Reviewed-on: http://gerrit.cloudera.org:8080/19843
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
2025-12-19 18:12:08 -05:00 · 2023-05-04 14:45:43 -07:00
parent 4d9f50eb74
commit 7ca20b3c94
39 changed files with 269 additions and 1104 deletions
--- a/tests/custom_cluster/test_executor_groups.py
+++ b/tests/custom_cluster/test_executor_groups.py
@@ -771,7 +771,7 @@ class TestExecutorGroups(CustomClusterTestSuite):
    different number of executors and memory limit in each."""
    # A small query with estimated memory per host of 10MB that can run on the small
    # executor group
-    SMALL_QUERY = "select count(*) from tpcds_parquet.date_dim where d_year=2022;"
+    SMALL_QUERY = "select count(*) from tpcds_parquet.date_dim;"
    # A large query with estimated memory per host of 132MB that can only run on
    # the large executor group.
    LARGE_QUERY = "select * from tpcds_parquet.store_sales where ss_item_sk = 1 limit 50;"