impala/testdata/ComplexTypesTbl
Daniel Becker ff3d0c7984 IMPALA-12019: Support ORDER BY for arrays of fixed length types in select list
As a first stage of IMPALA-10939, this change implements support for
including in the sorting tuple top-level collections that only contain
fixed length types (including fixed length structs). For these types the
implementation is almost the same as the existing handling of strings.

Another limitation is that structs that contain any type of collection
are not yet allowed in the sorting tuple.

Also refactored the RawValue::Write*() functions to have a clearer
interface.

Testing:
 - Added a new test table that contains many rows with arrays. This is
   queried in a new test added in test_sort.py, to ensure that we handle
   spilling correctly.
 - Added tests that have arrays and/or maps in the sorting tuple in
   test_queries.py::TestQueries::{test_sort,
       test_top_n,test_partitioned_top_n}.

Change-Id: Ic7974ef392c1412e8c60231e3420367bd189677a
Reviewed-on: http://gerrit.cloudera.org:8080/19660
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-05-18 09:56:55 +00:00

The two Parquet files (nullable.parq and nonnullable.parq) were generated
as described in testdata/data/schemas/nested/README.

The two ORC files (nullable.orc and nonnullable.orc) were generated with orc-tools,
which can convert JSON files into ORC format. However, nullable.json and
nonnullable.json must first be modified to match the input format it requires: the
file must not be a single JSON array. Instead, it should contain one JSON object per
row, joined by '\n'. The commands below assume the converted files are named
nullable_orc.json and nonnullable_orc.json.
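
The JSON reformatting described above can be sketched in Python. This is only an
illustration, not the procedure originally used: the function name
json_array_to_lines is hypothetical, and it assumes each source file holds a single
top-level JSON array with one element per row.

```python
import json


def json_array_to_lines(src_path, dst_path):
    """Rewrite a file holding one top-level JSON array as one JSON object per line.

    Hypothetical helper: assumes src_path contains a JSON array of row objects.
    Returns the number of rows written.
    """
    with open(src_path) as src:
        rows = json.load(src)
    if not isinstance(rows, list):
        raise ValueError("expected a top-level JSON array in " + src_path)

    with open(dst_path, "w") as dst:
        for row in rows:
            # json.dumps emits no raw newlines, so each row stays on one line.
            dst.write(json.dumps(row) + "\n")
    return len(rows)
```

Calling json_array_to_lines("nullable.json", "nullable_orc.json"), and likewise for
nonnullable.json, would produce the inputs named in the commands below. Note that a
round trip through a JSON parser may change the textual form of some values (for
example, number formatting), so the result should be checked before regenerating the
ORC files.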

The ORC files can be regenerated by running the following commands in the current directory:

wget "https://search.maven.org/remotecontent?filepath=org/apache/orc/orc-tools/1.5.4/orc-tools-1.5.4-uber.jar" \
  -O orc-tools-1.5.4-uber.jar

java -jar orc-tools-1.5.4-uber.jar convert \
  -s "struct<id:bigint,int_array:array<int>,int_array_Array:array<array<int>>,int_map:map<string,int>,int_Map_Array:array<map<string,int>>,nested_struct:struct<A:int,b:array<int>,C:struct<d:array<array<struct<E:int,F:string>>>>,g:map<string,struct<H:struct<i:array<double>>>>>>" \
  -o nullable.orc \
  nullable_orc.json

java -jar orc-tools-1.5.4-uber.jar convert \
  -s "struct<ID:bigint,Int_Array:array<int>,int_array_array:array<array<int>>,Int_Map:map<string,int>,int_map_array:array<map<string,int>>,nested_Struct:struct<a:int,B:array<int>,c:struct<D:array<array<struct<e:int,f:string>>>>,G:map<string,struct<h:struct<i:array<double>>>>>>" \
  -o nonnullable.orc \
  nonnullable_orc.json