mirror of
https://github.com/apache/impala.git
synced 2026-02-01 21:00:29 -05:00
CollectionItemsRead in the runtime profile counts the total number of
nested collection items read by the scan node. It is only created for
scans that support nested types, e.g. Parquet or ORC. Each scanner
thread maintains a local counter and merges it into the HdfsScanNode
counter for each row batch. However, the local counter in orc-scanner
is uninitialized, so the merged counter can report arbitrary values.
This patch initializes it to 0 and adds test coverage.

Tests:
Added profile verification for this counter to some existing query
tests. Note that there are some implementation differences between the
Parquet and ORC scanners (e.g. in predicate pushdown), so the counter
results differ for some queries. Only queries whose counters are
consistent across both formats were picked.

Change-Id: Id7783d1460ac9b98e94d3a31028b43f5a9884f99
Reviewed-on: http://gerrit.cloudera.org:8080/18528
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
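
For illustration only, a minimal standalone sketch of the pattern described
in this message: a per-scanner local counter that is merged into a shared
scan-node counter once per row batch. All names here (ScanNodeCounters,
SketchScanner, local_items_read_, RunSketch) are hypothetical and are not
taken from the Impala sources; the point is only that the local counter needs
an explicit initial value of 0 before the first merge.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the scan node's shared profile counter.
struct ScanNodeCounters {
  std::atomic<int64_t> collection_items_read{0};
};

// Hypothetical stand-in for a scanner; not Impala's actual class.
class SketchScanner {
 public:
  explicit SketchScanner(ScanNodeCounters* node) : node_(node) {}

  // Each element of collection_sizes is the item count of one nested
  // collection in the batch. Accumulate locally, then merge once per batch.
  void ProcessRowBatch(const std::vector<int64_t>& collection_sizes) {
    for (int64_t n : collection_sizes) local_items_read_ += n;
    node_->collection_items_read.fetch_add(local_items_read_);
    local_items_read_ = 0;  // reset so the next batch starts from zero
  }

 private:
  ScanNodeCounters* node_;
  // Without the "= 0" initializer this member would hold an indeterminate
  // value, and the first merge would add garbage to the shared counter.
  int64_t local_items_read_ = 0;
};

// Two scanners feed one shared counter; returns the merged total.
int64_t RunSketch() {
  ScanNodeCounters node;
  SketchScanner a(&node);
  SketchScanner b(&node);
  a.ProcessRowBatch({3, 0, 2});
  b.ProcessRowBatch({4});
  a.ProcessRowBatch({1});
  return node.collection_items_read.load();
}
```

In this sketch the shared total is the sum of all collection sizes across
both scanners. Dropping the default member initializer would reproduce the
class of bug fixed here.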