mirror of
https://github.com/apache/impala.git
synced 2026-01-06 06:01:03 -05:00
Allow values larger than 64KB to be written to Parquet files.

Value size was previously limited by a fixed data page size. This
commit removes that limitation by allowing the page to grow when
necessary. Growth is needed when a column has enough unique values to
switch from dictionary encoding to plain encoding and a later value is
larger than the default 64KB page size. In this case it may be possible
to write a file larger than one HDFS block, but this is an edge case
and not worth the additional complexity needed to handle it.

Change-Id: I165ef44ba48ff0c3c3203860157a61c45f77df8b
Reviewed-on: http://gerrit.cloudera.org:8080/120
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Internal Jenkins
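
The page-growth behaviour described above can be illustrated with a minimal sketch. This is not the code from this commit: `PlainPageSketch`, `kDefaultPageSize`, `Append` and `Flush` are hypothetical names, and the policy shown (grow only when a single value does not fit in an empty page) is an assumption about one reasonable way to do it. The only detail taken from the Parquet format is that a plain-encoded byte array is stored as a 4-byte length followed by its bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical default page size, matching the 64KB mentioned in the commit.
constexpr std::size_t kDefaultPageSize = 64 * 1024;

// Illustrative plain-encoded data page; not Impala's actual writer class.
class PlainPageSketch {
 public:
  PlainPageSketch() : buffer_(kDefaultPageSize), used_(0), num_values_(0) {}

  // Appends one value as a 4-byte length prefix plus the raw bytes.
  // Returns false when the page already holds values and has no room left;
  // the caller is expected to Flush() and retry.
  bool Append(const std::string& value) {
    const std::size_t needed = sizeof(std::uint32_t) + value.size();
    if (used_ + needed > buffer_.size()) {
      if (num_values_ > 0) return false;  // Flush first, then retry.
      // Assumed policy: an empty page that is still too small grows to fit.
      buffer_.resize(needed);
    }
    // Host byte order is used here; Parquet itself requires little-endian.
    const std::uint32_t len = static_cast<std::uint32_t>(value.size());
    std::memcpy(buffer_.data() + used_, &len, sizeof(len));
    std::memcpy(buffer_.data() + used_ + sizeof(len), value.data(), value.size());
    used_ += needed;
    ++num_values_;
    return true;
  }

  // Stands in for writing the page out: returns the number of bytes the
  // page held and resets the buffer to the default capacity.
  std::size_t Flush() {
    const std::size_t written = used_;
    buffer_.assign(kDefaultPageSize, 0);
    used_ = 0;
    num_values_ = 0;
    return written;
  }

  std::size_t capacity() const { return buffer_.size(); }
  std::size_t num_values() const { return num_values_; }

 private:
  std::vector<std::uint8_t> buffer_;
  std::size_t used_;
  std::size_t num_values_;
};
```

In this sketch a 200KB string appended to a non-empty page is rejected once, and after `Flush()` it is accepted into a page that has grown to exactly the size of that one value. Because a grown page is never split, the resulting file can exceed its target size, which is the edge case the commit message accepts.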