NULL values are printed as "NULL" if they are top level or in
collections, but as "null" in structs. We should print collections and
structs in JSON form, so it should be "null" in collections, too. Hive
also follows the latter (correct) approach.
This commit changes the printing of NULL values to "null" in
collections.
Testing:
- Modified the tests to expect "null" instead of "NULL" in collections.
Change-Id: Ie5e7f98df4014ea417ddf73ac0fb8ec01ef655ba
Reviewed-on: http://gerrit.cloudera.org:8080/19236
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Non-matching rows from the left side will null out all slots from the
right side in left outer joins. If the right side is a subquery, it
is possible that some returned expressions will be non-NULL even if all
slots are NULL (e.g. constants) - these expressions are wrapped as
IF(TupleIsNull(tids), NULL, expr) to null them in the non-matching
case.
The logic above used to hit a precondition for complex types. We can
safely ignore complex types for now, as currently the only possible
expression that returns a complex type is SlotRef, which doesn't
need to be wrapped. We will have to revisit this once functions are
added that return complex types.
Testing:
- added a regression test and ran it
Change-Id: Iaa8991cd4448d5c7ef7f44f73ee07e2a2b6f37ce
Reviewed-on: http://gerrit.cloudera.org:8080/18954
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Adding support for MAP types in the select list.
An example of how maps are printed:
{"k1":2,"k2":null}
Nested collection types (maps and arrays) are supported in any
combination. However, structs in collections and collections in structs
are not supported.
Limitations (other than map support) as described in the commit for
IMPALA-9498 still apply, the following are to be implemented later:
- Unify HS2 / Beeswax logic with the way STRUCTs are handled.
This could be done in a "final" logic that can handle
STRUCTS/ARRAYS nested to each other
- Implement "deep copy" and "deep serialize" for collections in BE.
This would enable all operators, e.g. ORDER BY and UNION.
Testing:
- modified the FE tests that checked that maps were not allowed in the
select list - now the test expect maps are allowed there
- added FE and EE tests involving maps based on the array tests
Change-Id: I921c647f1779add36e7f5df4ce6ca237dcfaf001
Reviewed-on: http://gerrit.cloudera.org:8080/18736
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
More than 1d arrays in select list tried to register a
CollectionTableRef with name "item" for the inner arrays,
leading to name collision if there was more than one such array.
The logic is changed to always use the full path as implicit alias
in CollectionTableRefs backing arrays in select list.
As a side effect this leads to using the fully qualified names
in expressions in the explain plans of queries that use arrays
from views. This is not an intended change, but I don't consider
it to be critical. Created IMPALA-11452 to deal with more
sophisticated alias handling in collections.
Testing:
- added a new table to testdata and a regression test
Change-Id: I6f2b6cad51fa25a6f6932420eccf1b0a964d5e4e
Reviewed-on: http://gerrit.cloudera.org:8080/18734
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The expectation for predicates on unnested arrays is that they are
either picked up by the SCAN node or the UNNEST node for evaluation. If
there is only one array being unnested then the SCAN node, otherwise
the UNNEST node will be responsible for the evaluation. However, if
there is a JOIN node involved where the JOIN construction happens
before creating the UNNEST node then the JOIN node incorrectly picks
up the predicates for the unnested arrays as well. This patch is to fix
this behaviour.
Tests:
- Added E2E tests to cover result correctness.
- Added planner tests to verify that the desired node picks up the
predicates for unnested arrays.
Change-Id: I89fed4eef220ca513b259f0e2649cdfbe43c797a
Reviewed-on: http://gerrit.cloudera.org:8080/18614
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Until now ARRAYs had to be unnested in queries. This patch adds
support to return ARRAYs as STRINGs (JSON arrays) in select list,
for example:
select id, int_array from functional_parquet.complextypestbl where id = 1;
returns: 1, [1,2,3]
Returning ARRAYs from inline or HMS views is also supported -
these arrays can be used both in the select list or as relative
table references. Using them as non-relative table reference is
not supported (IMPALA-11052).
Though STRUCTs are already supported, ARRAYs and STRUCTs nested in
each other are not supported yet.
Things intentionally postponed for later commits:
- Add MAP suppport too - this shouldn't be too tricky after
ARRAY support, but I don't want to make this patch even more
complex.
- Unify HS2 / Beeswax logic with the way STRUCTs are handled.
This could be done in a "final" logic that can handle
STRUCTS/ARRAYS nested to each other
- Implement "deep copy" and "deep serialize" for ARRAYs in BE.
This would enable all operators, e.g. ORDER BY and UNION.
Testing:
- FE tests were added for analyses and authorization
- EE tests were added
- core tests were ran
Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Reviewed-on: http://gerrit.cloudera.org:8080/17811
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>