impala/testdata/tzdb_tiny
Csaba Ringhofer 60095a4c6b IMPALA-5050: Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS from Parquet
Changes:
- parquet.thrift is updated to a newer version which contains the
  timestamp logical type.
- INT64 columns with converted types TIMESTAMP_MILLIS and
  TIMESTAMP_MICROS can be read as TIMESTAMP.
- If the logical type is timestamp, the type itself records whether
  UTC->local conversion is necessary. This feature is only supported
  for the new timestamp types, so INT96 timestamps must still use the
  flag convert_legacy_hive_parquet_utc_timestamps.
- Min/max stat filtering is enabled again for columns that need
  UTC->local conversion. This was disabled in IMPALA-7559 because
  it could incorrectly drop column chunks.
- CREATE TABLE LIKE PARQUET converts these columns to
  TIMESTAMP; before the change, an error was returned instead.
- The bulk of the Parquet column stat logic was moved to a new class
  called "ColumnStatsReader".
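
For illustration only, the INT64 decoding described in the list above can be sketched in Python. This is not Impala's C++ code. The function name, the MICROS_PER_UNIT table and the utc_to_local_offset parameter are hypothetical, and a fixed offset stands in for a real time-zone database lookup. The sketch assumes the stored value counts milliseconds or microseconds since the Unix epoch, as the Parquet format defines for these converted types.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# Microseconds per stored unit for the two converted types named above.
MICROS_PER_UNIT = {"TIMESTAMP_MILLIS": 1000, "TIMESTAMP_MICROS": 1}


def decode_int64_timestamp(value, converted_type, utc_to_local_offset=None):
    """Turn a raw INT64 column value into a naive datetime.

    utc_to_local_offset is a hypothetical stand-in for a time-zone lookup:
    pass a timedelta when the column is UTC-normalized and must be shifted
    to local time, or None to keep the value exactly as stored.
    """
    micros = value * MICROS_PER_UNIT[converted_type]
    ts = EPOCH + timedelta(microseconds=micros)
    if utc_to_local_offset is not None:
        ts += utc_to_local_offset
    return ts
```

A real reader would take the offset from the time-zone database for each value, because the offset changes across daylight-saving transitions. That dependence on the value is also why min/max statistics need care when UTC->local conversion applies.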

Testing:
- Added unit tests for timezone conversion (this required adding a new
  public function to timezone_db.h and CET to tzdb_tiny).
- Added parquet files (created with parquet-mr) with int64 timestamp
  columns.

Change-Id: I4c7c01fffa31b3d2ca3480adf6ff851137dadac3
Reviewed-on: http://gerrit.cloudera.org:8080/11057
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-11-14 20:16:14 +00:00