mirror of
https://github.com/apache/impala.git
synced 2026-01-05 21:00:54 -05:00
Reading dictionary-encoded Parquet data pages whose index bit width is larger than the encoded type's size (e.g. an 8-bit TINYINT coded with 16-bit dictionary indices) led to a DCHECK failure in debug builds.

Impala does not create such Parquet files (an N-bit type has at most 2^N distinct values, so N-bit dictionary indices are enough for a dictionary that contains every possible value), but the Parquet standard does not forbid them.

These DCHECKs were probably introduced by a copy-paste error (similar checks exist in the non-dictionary-encoded bit reader functions, where they are valid).

Testing:
- Added a new test that checks that these data pages can be decoded correctly.

Change-Id: I9ff3b00cbcab09dec11b3607d7d9a9c2c0025e1a
Reviewed-on: http://gerrit.cloudera.org:8080/10683
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
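
For illustration only, here is a minimal C++ sketch of why a dictionary index may legitimately be wider than the value type it points to. It is not Impala's bit reader: `UnpackIndices` and `DecodeTinyInts` are hypothetical helpers, the buffer is assumed to hold plain LSB-first bit-packed indices, and the run headers of Parquet's RLE/bit-packed hybrid encoding are left out. The point is that the index is checked against the dictionary size, not against `sizeof(int8_t)`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Impala's BitReader): unpack `count` unsigned
// values of `bit_width` bits each, packed LSB-first into `buf`.
std::vector<uint32_t> UnpackIndices(const std::vector<uint8_t>& buf,
                                    int bit_width, size_t count) {
  assert(bit_width >= 0 && bit_width <= 32);
  std::vector<uint32_t> out;
  out.reserve(count);
  size_t bit_pos = 0;
  for (size_t i = 0; i < count; ++i) {
    uint32_t value = 0;
    for (int b = 0; b < bit_width; ++b, ++bit_pos) {
      const size_t byte = bit_pos / 8;
      assert(byte < buf.size());
      const uint32_t bit = (buf[byte] >> (bit_pos % 8)) & 1u;
      value |= bit << b;
    }
    out.push_back(value);
  }
  return out;
}

// Hypothetical helper: decode dictionary-encoded TINYINT values.
// The index width is independent of the 8-bit value type; the only
// requirement is that each index falls inside the dictionary.
std::vector<int8_t> DecodeTinyInts(const std::vector<int8_t>& dict,
                                   const std::vector<uint8_t>& buf,
                                   int bit_width, size_t count) {
  std::vector<int8_t> out;
  out.reserve(count);
  for (uint32_t idx : UnpackIndices(buf, bit_width, count)) {
    assert(idx < dict.size());
    out.push_back(dict[idx]);
  }
  return out;
}
```

With a three-entry dictionary and 16-bit indices, each index occupies two bytes even though 2 bits would have been enough; the decoded values are still ordinary 8-bit TINYINTs.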
390 B