mirror of
https://github.com/apache/impala.git
synced 2025-12-26 14:02:53 -05:00
The query option PARQUET_FALLBACK_SCHEMA_RESOLUTION allows matching of Parquet fields by name instead of by index (the default). Parquet column names are case-sensitive, but Impala treats db/table/column/field names as case-insensitive. Today, there is no way to select Parquet columns with mixed casing via SQL using the name-based field resolution policy.

This patch changes the matching of Parquet fields to be case-insensitive.

Testing:
- Modified the data files backing complextypestbl to contain fields with mixed casing.
- Several existing tests run against this table, including the test for name-based resolution.
- I confirmed that without this fix, the existing name-based resolution tests fail on the modified data files.
- I locally ran test_scanners.py and test_nested_types.py on exhaustive with this fix.

Change-Id: I87395f84ba29b4c3d8e41be1ea4e89e500b8a9f4
Reviewed-on: http://gerrit.cloudera.org:8080/5891
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
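The case-insensitive name matching described in the commit message can be illustrated with a minimal Python sketch. This is not Impala's implementation (the scanner itself is written in C++); the helper name resolve_field_by_name and the first-match-wins behavior are assumptions made only for illustration. The field names are the top-level keys of the test-data file shown on this page.

```python
from typing import Optional, Sequence


def resolve_field_by_name(file_fields: Sequence[str],
                          table_column: str) -> Optional[int]:
    """Hypothetical helper: return the index of the first file field whose
    name equals table_column when case is ignored, or None if none match."""
    wanted = table_column.lower()
    for idx, name in enumerate(file_fields):
        if name.lower() == wanted:
            return idx
    return None


# Top-level field names as they appear, with mixed casing, in the data file.
file_fields = ["ID", "Int_Array", "int_array_array",
               "Int_Map", "int_map_array", "nested_Struct"]

# A lowercase table column name still resolves to the mixed-case file field.
print(resolve_field_by_name(file_fields, "int_array"))    # 1
print(resolve_field_by_name(file_fields, "missing_col"))  # None
```

With an exact, case-sensitive comparison, the lookup for "int_array" would find nothing, which is the situation the patch addresses for name-based resolution.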
15 lines
258 B
JSON
[
  {"ID": 8,
   "Int_Array": [-1],
   "int_array_array": [[-1,-2],[]],
   "Int_Map": {"k1": -1},
   "int_map_array": [{}, {"k1": 1}, {}, {}],
   "nested_Struct": {
     "a": -1,
     "B": [-1],
     "c": {
       "D": [
         [{"e": -1, "f": "nonnullable"}]]},
     "G": {}}}
]