IMPALA-11280: Join node incorrectly picks up unnest(array) predicates

Predicates on unnested arrays are expected to be evaluated by either
the SCAN node or the UNNEST node: the SCAN node if only one array is
unnested, the UNNEST node otherwise. However, if a JOIN node is
involved and it is constructed before the UNNEST node is created, the
JOIN node incorrectly picks up the predicates on the unnested arrays
as well. This patch fixes this behaviour.
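
The intended evaluation order can be illustrated with a toy model in
Python. This is only a sketch, not Impala planner code: the rows,
`join_side_ids`, and the helper names `zipping_unnest`, `less_than` and
`run_query` are all made up for illustration, and it assumes that
unnesting several arrays zips them item by item and pads the shorter
one with NULL. The point is that a predicate such as
`unnested_arr1 < 5` refers to a value that only exists after the unnest
step, so it has to be applied there rather than at the join.

```python
from itertools import zip_longest


def zipping_unnest(row):
    """Zip arr1 and arr2 item by item; pad the shorter array with None (SQL NULL)."""
    arr1 = row["arr1"] or []
    arr2 = row["arr2"] or []
    return [(row["id"], a, b) for a, b in zip_longest(arr1, arr2)]


def less_than(value, bound):
    """SQL-style comparison: NULL < x is not true, so such a row is dropped."""
    return value is not None and value < bound


# Hypothetical sample rows, NOT the contents of the real complextypes_arrays table.
rows = [
    {"id": 1, "arr1": [1, 2, 3, 4, 5],
     "arr2": ["one", "two", "three", "four", "five"]},
    {"id": 2, "arr1": [1, 2], "arr2": ["a"]},
    {"id": 7, "arr1": [1, 2], "arr2": [None, None]},
]

# Hypothetical stand-in for: select id from alltypestiny
join_side_ids = {1, 7}


def run_query(rows, join_side_ids):
    # SCAN + JOIN step: only predicates on plain columns can be evaluated here,
    # because the unnested items have not been produced yet.
    joined = [r for r in rows if r["id"] % 2 == 1 and r["id"] in join_side_ids]
    # UNNEST step: one output tuple per zipped array position.
    unnested = [t for r in joined for t in zipping_unnest(r)]
    # The predicate on the unnested item is applied only after unnesting.
    return [t for t in unnested if less_than(t[1], 5)]


if __name__ == "__main__":
    for result_row in run_query(rows, join_side_ids):
        print(result_row)
```

With the sample rows above, `run_query(rows, join_side_ids)` keeps the
four items of id 1 that are below 5 and both items of id 7.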

Tests:
  - Added E2E tests to cover result correctness.
  - Added planner tests to verify that the desired node picks up the
    predicates for unnested arrays.

Change-Id: I89fed4eef220ca513b259f0e2649cdfbe43c797a
Reviewed-on: http://gerrit.cloudera.org:8080/18614
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Gabor Kaszab
2022-06-13 14:04:36 +02:00
committed by Impala Public Jenkins
parent 1285fc95ad
commit 2744f46fbd
7 changed files with 131 additions and 6 deletions

@@ -334,11 +334,52 @@ BIGINT,INT
select item from complextypes_arrays_only_view.int_array
---- CATCH
AnalysisException: Non-relative collections are currently not supported on collections from views.
=====
====
---- QUERY
# IMPALA-11052: allow using collections returned from views as non-relative table refs
with s as (select int_array a from complextypestbl t)
select item from s.a
---- CATCH
AnalysisException: Could not resolve table reference: 's.a
=====
====
---- QUERY
# IMPALA-11280. There is a join involved here by using the IN operator, and multiple
# arrays are unnested. Checks that the predicate on an unnested array is evaluated
# correctly.
select id, unnested_arr1, unnested_arr2
from (
select id, unnest(arr1) as unnested_arr1, unnest(arr2) as unnested_arr2
from complextypes_arrays
where id % 2 = 1 and id in (select id from alltypestiny)
) a
where a.unnested_arr1 < 5;
---- RESULTS
1,1,'one'
1,2,'two'
1,3,'three'
1,4,'four'
7,1,'NULL'
7,2,'NULL'
---- TYPES
INT,INT,STRING
====
---- QUERY
# Similar as above but here the join is explicitly included in the query string and is not
# a result of a query rewrite.
select a.id, unnested_arr1, unnested_arr2
from (
select cta.id, unnest(arr1) as unnested_arr1, unnest(arr2) as unnested_arr2
from functional_parquet.complextypes_arrays cta
left join functional_parquet.alltypestiny ti on cta.id = ti.id
where cta.id % 2 = 1) a
where a.unnested_arr1 < 5;
---- RESULTS
1,1,'one'
1,2,'two'
1,3,'three'
1,4,'four'
7,1,'NULL'
7,2,'NULL'
---- TYPES
INT,INT,STRING
====