mirror of
https://github.com/apache/impala.git
synced 2026-02-03 00:00:40 -05:00
For Iceberg tables, table-level statistics such as numRows can be computed
from Iceberg partition stats, which are more accurate and more up to date.
Obtaining these statistics does not depend on StatsSetupConst.ROW_COUNT and
StatsSetupConst.TOTAL_SIZE in HMS. This improves cardinality estimation for
Iceberg tables.

However, the calculation for V2 Iceberg tables is not yet accurate. Once
IMPALA-11516 (Return better partition stats for V2 tables) is ready, the
Iceberg-derived statistics can be considered as a replacement for the HMS
statistics for those tables as well.

Testing:
- Existing tests
- Test on 'On-demand Metadata' mode
- For 'select * from iceberg_v2_positional_not_all_data_files_have_delete_files
  where i = (select max(i) from iceberg_v2_positional_update_all_rows)',
  the 'Join Order' and 'Distribution Mode' are the same as when table stats
  are present

Change-Id: I3e92d3f25e2a57a64556249410d0af3522598c00
Reviewed-on: http://gerrit.cloudera.org:8080/19168
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
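
The idea of deriving table-level stats from partition stats can be illustrated with a minimal sketch. This is not the code from this change: the class `IcebergTableStatsSketch`, the `PartitionStats` holder, and both method names are hypothetical stand-ins, and the sketch assumes each partition exposes a row count and a file size. It sums those values and never reads the HMS row-count or total-size properties.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical names, not Impala's actual classes. */
public class IcebergTableStatsSketch {

  /** Hypothetical holder for the stats of one Iceberg partition. */
  static final class PartitionStats {
    final long numRows;
    final long fileSizeBytes;

    PartitionStats(long numRows, long fileSizeBytes) {
      this.numRows = numRows;
      this.fileSizeBytes = fileSizeBytes;
    }
  }

  /** Table-level row count as the sum of the per-partition row counts. */
  static long computeNumRows(List<PartitionStats> partitions) {
    long total = 0;
    for (PartitionStats p : partitions) {
      total += p.numRows;
    }
    return total;
  }

  /** Table-level size in bytes as the sum of the per-partition file sizes. */
  static long computeTotalSize(List<PartitionStats> partitions) {
    long total = 0;
    for (PartitionStats p : partitions) {
      total += p.fileSizeBytes;
    }
    return total;
  }

  public static void main(String[] args) {
    List<PartitionStats> partitions = Arrays.asList(
        new PartitionStats(100, 4096),
        new PartitionStats(250, 8192));
    System.out.println("numRows=" + computeNumRows(partitions));
    System.out.println("totalSize=" + computeTotalSize(partitions));
  }
}
```

A plain sum like this does not subtract rows removed by delete files, which is presumably one reason the V2 numbers are described above as not yet accurate.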