IMPALA-14619: Reset levels_readahead_ for late materialization

Previously, `BaseScalarColumnReader::levels_readahead_` was not reset
when the reader did not do page filtering. If a query selected the last
row containing a collection value in a row group, `levels_readahead_`
would be set and would not be reset when advancing to the next row
group without page filtering. As a result, trying to skip collection
values at the start of the next row group would cause a check failure.
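
As an illustration of the failure mode, here is a minimal hypothetical
sketch (simplified names and structure, not the actual reader code) of a
per-row-group flag that survives a reset:

    #include <cassert>

    // Hypothetical, simplified illustration of the stale-flag bug;
    // not Impala's actual reader code.
    struct ColumnReaderSketch {
      bool levels_readahead_ = false;  // true once def/rep levels were read ahead

      // Called when advancing to the next row group.
      void Reset() {
        levels_readahead_ = false;  // the fix: clear any readahead state
      }

      // Late materialization skips collection values at the start of a row
      // group; the skip path assumes no levels have been read ahead yet.
      void SkipCollectionValues() {
        assert(!levels_readahead_);  // fails if Reset() forgot to clear the flag
      }
    };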

This patch fixes the failure by resetting `levels_readahead_` in
`BaseScalarColumnReader::Reset()`, which is always called when advancing
to the next row group.

`levels_readahead_` is also moved out of the "Members used for page
filtering" section, since the variable is used in late materialization as
well.

Testing:
- Added an E2E test for the fix.

Change-Id: Idac138ffe4e1a9260f9080a97a1090b467781d00
Reviewed-on: http://gerrit.cloudera.org:8080/23779
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Author: Xuebin Su
Date: 2025-12-11 17:18:59 +08:00
Committer: Impala Public Jenkins
Commit: d54b75ccf1 (parent: 2ebdc05c1d)
4 changed files with 29 additions and 13 deletions

be/src/exec/parquet/parquet-column-readers.cc

@@ -1067,6 +1067,7 @@ Status BaseScalarColumnReader::Reset(const HdfsFileDesc& file_desc,
   pos_current_value_ = ParquetLevel::INVALID_POS;
   row_group_first_row_ = row_group_first_row;
   current_row_ = -1;
+  levels_readahead_ = false;
   vector<ScanRange::SubRange> sub_ranges;
   CreateSubRanges(&sub_ranges);

be/src/exec/parquet/parquet-column-readers.h

@@ -452,6 +452,19 @@ class BaseScalarColumnReader : public ParquetColumnReader {
   /// processed the first (zeroeth) row.
   int64_t current_row_ = -1;
+  /// This flag is needed for the proper tracking of the last processed row.
+  /// The batched and non-batched interfaces behave differently. E.g. when using the
+  /// batched interface you don't need to invoke NextLevels() in advance, while you need
+  /// to do that for the non-batched interface. In fact, the batched interface doesn't
+  /// call NextLevels() at all. It directly reads the levels then the corresponding value
+  /// in a loop. On the other hand, the non-batched interface (ReadValue()) expects that
+  /// the levels for the next value are already read via NextLevels(). And after reading
+  /// the value it calls NextLevels() to read the levels of the next value. Hence, the
+  /// levels are always read ahead in this case.
+  /// Returns true, if we read ahead def and rep levels. In this case 'current_row_'
+  /// points to the row we'll process next, not to the row we already processed.
+  bool levels_readahead_ = false;
   /////////////////////////////////////////
   /// BEGIN: Members used for page filtering
   /// They are not set when we don't filter out pages at all.
@@ -475,19 +488,6 @@ class BaseScalarColumnReader : public ParquetColumnReader {
   /// rows and increment this field.
   int current_row_range_ = 0;
-  /// This flag is needed for the proper tracking of the last processed row.
-  /// The batched and non-batched interfaces behave differently. E.g. when using the
-  /// batched interface you don't need to invoke NextLevels() in advance, while you need
-  /// to do that for the non-batched interface. In fact, the batched interface doesn't
-  /// call NextLevels() at all. It directly reads the levels then the corresponding value
-  /// in a loop. On the other hand, the non-batched interface (ReadValue()) expects that
-  /// the levels for the next value are already read via NextLevels(). And after reading
-  /// the value it calls NextLevels() to read the levels of the next value. Hence, the
-  /// levels are always read ahead in this case.
-  /// Returns true, if we read ahead def and rep levels. In this case 'current_row_'
-  /// points to the row we'll process next, not to the row we already processed.
-  bool levels_readahead_ = false;
   /// END: Members used for page filtering
   /////////////////////////////////////////
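
For context, the comment block moved above contrasts the batched and
non-batched reader interfaces. A rough hypothetical sketch of the two call
patterns (invented stand-in type, not Impala's actual column reader API):

    // Hypothetical stand-in; not Impala's actual column reader API.
    struct ReaderSketch {
      bool levels_readahead_ = false;

      // Reads the def/rep levels of the next value ahead of consuming it.
      void NextLevels() { levels_readahead_ = true; }

      // Non-batched: consumes the value whose levels were already read, then
      // immediately reads the levels of the next value, so the levels stay
      // read ahead between calls.
      void ReadValue() { NextLevels(); }

      // Batched: reads levels and the corresponding values together in a
      // loop, never leaving levels read ahead between calls.
      void ReadValueBatch(int /*num_values*/) { levels_readahead_ = false; }
    };

    // Non-batched usage: NextLevels() is invoked once up front.
    //   ReaderSketch r;
    //   r.NextLevels();
    //   r.ReadValue();           // levels_readahead_ stays true
    //
    // Batched usage: no up-front NextLevels() call at all.
    //   ReaderSketch b;
    //   b.ReadValueBatch(1024);  // levels_readahead_ stays false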