Initial implementation of KUDU-1261 (array column type) recently merged
in upstream Apache Kudu repository. This patch add initial Impala
support for working with Kudu tables having array type columns.
Unlike rows, the elements of a Kudu array are stored in a different
format than Impala. Instead of per-row bit flag for NULL info, values
and NULL bits are stored in separate arrays.
The following types of queries are not supported in this patch:
- (IMPALA-14538) Queries that reference an array column as a table, e.g.
```sql
SELECT item FROM kudu_array.array_int;
```
- (IMPALA-14539) Queries that create duplicate collection slots, e.g.
```sql
SELECT array_int FROM kudu_array AS t, t.array_int AS unnested;
```
Testing:
- Add some FE tests in AnalyzeDDLTest and AnalyzeKuduDDLTest.
- Add EE test test_kudu.py::TestKuduArray.
Since Impala does not support inserting complex types, including
array, the data insertion part of the test is achieved through
custom C++ code kudu-array-inserter.cc that insert into Kudu via
Kudu C++ client. It would be great if we could migrate it to Python so
that it can be moved to the same file as the test (IMPALA-14537).
- Pass core tests.
Co-authored-by: Riza Suminto
Change-Id: I9282aac821bd30668189f84b2ed8fff7047e7310
Reviewed-on: http://gerrit.cloudera.org:8080/23493
Reviewed-by: Alexey Serbin <alexey@apache.org>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>