impala/testdata/data/dict_encoding_with_large_bit_width.parquet
Csaba Ringhofer 06fe321050 IMPALA-7417: Remove DCHECKs with unnecessary constraint on dictionary encoding bit width
Reading dictionary-encoded Parquet data pages where the bit width is
larger than the encoded type's size (e.g. coding 8-bit TINYINT with
16-bit dictionary indices) led to a DCHECK failure in debug builds.
Impala does not create such Parquet files (an N-bit type can have at
most 2^N distinct values, so N-bit dictionary indices are enough for
a dictionary that contains every possible value), but the Parquet
standard does not forbid doing so.
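
As an illustration only (this is not Impala's C++ reader), the sketch
below shows why an index bit width wider than the value type is
harmless. It assumes plain LSB-first bit packing rather than the full
RLE/bit-packed hybrid encoding that Parquet uses, and the helper names
pack_indices, unpack_indices and decode_page are hypothetical.

```python
# Simplified sketch: plain LSB-first bit packing, not Parquet's full
# RLE/bit-packed hybrid. All function names here are hypothetical.

def pack_indices(indices, bit_width):
    """Pack dictionary indices into bytes, bit_width bits per index."""
    bits = 0
    for i, idx in enumerate(indices):
        if not 0 <= idx < (1 << bit_width):
            raise ValueError("index does not fit in bit_width")
        bits |= idx << (i * bit_width)
    nbytes = (len(indices) * bit_width + 7) // 8
    return bits.to_bytes(nbytes, "little")


def unpack_indices(buf, bit_width, count):
    """Unpack count indices of bit_width bits each from buf."""
    bits = int.from_bytes(buf, "little")
    mask = (1 << bit_width) - 1
    return [(bits >> (i * bit_width)) & mask for i in range(count)]


def decode_page(dictionary, buf, bit_width, count):
    """Decode a page by looking up each unpacked index in the dictionary.

    The only check that matters is that every index is inside the
    dictionary. No relationship between bit_width and the size of the
    value type is required.
    """
    indices = unpack_indices(buf, bit_width, count)
    for idx in indices:
        if idx >= len(dictionary):
            raise ValueError("dictionary index out of range")
    return [dictionary[idx] for idx in indices]


# TINYINT-like values (8-bit), decoded with 2-, 8- and 16-bit indices.
dictionary = [-5, 0, 7, 127]
indices = [3, 0, 2, 2, 1]
decoded = {
    width: decode_page(dictionary, pack_indices(indices, width), width, len(indices))
    for width in (2, 8, 16)
}
```

All three bit widths decode to the same values. The wider encodings
only waste space, which is why a writer has no reason to produce them
but a reader should still accept them.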

These DCHECKs were probably introduced by a copy-paste error (similar
checks exist in the bit reader functions for non-dictionary-encoded
data, where they are valid).

Testing:
- added a new test that checks that these data pages are decoded
  correctly

Change-Id: I9ff3b00cbcab09dec11b3607d7d9a9c2c0025e1a
Reviewed-on: http://gerrit.cloudera.org:8080/10683
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-06-11 23:25:46 +00:00

390 B