mirror of
https://github.com/apache/impala.git
synced 2026-02-02 06:00:36 -05:00
When a regular Parquet/ORC table is converted to Iceberg via Hive, only
the Iceberg metadata files need to be created. The data files can stay
in place. This causes problems when the data files don't have field ids
for the schema elements.

Currently Impala resolves columns in data files based on Iceberg field
ids, but since they are missing, Impala raises an error or returns
NULLs.

With this patch Impala falls back to the default column resolution
strategy when the data files lack field ids.

Testing:
 * added e2e tests both for Parquet and ORC

Change-Id: I85881b09891c7bd101e7a96e92561b70bbe5af41
Reviewed-on: http://gerrit.cloudera.org:8080/17953
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
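
The fallback described in this commit can be illustrated with a minimal Python sketch. It is not Impala's implementation: the names `FileColumn`, `TableColumn`, and `resolve_columns` are hypothetical, and position-based matching is only assumed here as a stand-in for "the default column resolution strategy", which the commit message does not spell out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FileColumn:
    """A column as found in a data file (hypothetical model)."""
    name: str
    field_id: Optional[int] = None  # None: the writer recorded no field id


@dataclass(frozen=True)
class TableColumn:
    """A column of the Iceberg table schema (hypothetical model)."""
    name: str
    field_id: int


def resolve_columns(
    table_cols: List[TableColumn], file_cols: List[FileColumn]
) -> Dict[str, Optional[int]]:
    """Map each table column name to an index into file_cols, or None if absent."""
    has_field_ids = any(c.field_id is not None for c in file_cols)
    if has_field_ids:
        # Field-id based resolution: match on the Iceberg field id.
        by_id = {
            c.field_id: i for i, c in enumerate(file_cols) if c.field_id is not None
        }
        return {t.name: by_id.get(t.field_id) for t in table_cols}
    # Fallback (assumption: position-based) for files that lack field ids,
    # e.g. data files left in place after a Hive migration.
    return {
        t.name: (i if i < len(file_cols) else None)
        for i, t in enumerate(table_cols)
    }


table = [TableColumn("id", 1), TableColumn("name", 2)]

# Migrated file: no field ids, so columns are matched by position.
migrated = [FileColumn("id"), FileColumn("name")]
print(resolve_columns(table, migrated))

# File with field ids: columns are matched by id, regardless of order.
native = [FileColumn("name", 2), FileColumn("id", 1)]
print(resolve_columns(table, native))
```

In this sketch the decision is made per file, so a table that mixes migrated and field-id-bearing data files would resolve each file with the strategy that fits it.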