Revert "IMPALA-11123: Optimize count(star) for ORC scans"

This reverts commit f932d78ad0.

The commit is reverted because it causes a significant regression for
non-optimized count star queries in Parquet format.

There are several conflicts that need to be resolved manually:
- Removed the assertion against the 'NumFileMetadataRead' counter, which
  is lost with the revert.
- Adjusted the assertions in test_plain_count_star_optimization,
  test_in_predicate_push_down, and test_partitioned_insert of
  test_iceberg.py because the improvement to the Parquet optimized count
  star code path is no longer present.
- Kept the "override" specifier in hdfs-parquet-scanner.h to pass
  clang-tidy.
- Kept the python3 style of RuntimeError instantiation in
  test_file_parser.py to pass check-python-syntax.sh.

Change-Id: Iefd8fd0838638f9db146f7b706e541fe2aaf01c1
Reviewed-on: http://gerrit.cloudera.org:8080/19843
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Riza Suminto
2023-05-04 14:45:43 -07:00
committed by Wenzhe Zhou
parent 4d9f50eb74
commit 7ca20b3c94
39 changed files with 269 additions and 1104 deletions

@@ -771,7 +771,7 @@ class TestExecutorGroups(CustomClusterTestSuite):
different number of executors and memory limit in each."""
# A small query with estimated memory per host of 10MB that can run on the small
# executor group
-SMALL_QUERY = "select count(*) from tpcds_parquet.date_dim where d_year=2022;"
+SMALL_QUERY = "select count(*) from tpcds_parquet.date_dim;"
# A large query with estimated memory per host of 132MB that can only run on
# the large executor group.
LARGE_QUERY = "select * from tpcds_parquet.store_sales where ss_item_sk = 1 limit 50;"