mirror of
https://github.com/apache/impala.git
synced 2025-12-25 02:03:09 -05:00
Currently, a materialized view (MV) is treated as a view and is therefore expanded like a view when it is queried directly (SELECT * FROM materialized_view). This is incorrect: an MV is a regular table with physical properties such as partitioning and clustering, and it should be treated as one even though it has a view definition associated with it.

This patch focuses on the use case where MVs are created as HDFS tables. It makes the MV a derived class of HdfsTable, and therefore a Table object. It adds support for collecting and displaying statistics on materialized views. These statistics could be leveraged by an external frontend that supports MV-based query rewrites (note that such a rewrite is not supported by Impala with or without this patch).

Note that we are not introducing new syntax for MVs, since DDL and DML operations on MVs are only supported through Hive. Directly querying an MV is permitted, but inserting into an MV is not, since MVs are supposed to be modified only through an external refresh when the source tables have modifications.

If the source tables associated with a materialized view have column-masking or row-filtering Ranger policies, querying the MV will throw an error. This behavior is consistent with that of Hive.

Testing:
- Added transactional tables for alltypes and jointbl and used them as source tables to create a materialized view.
- Added tests for compute stats, drop stats, show stats and a simple select query on a materialized view.
- Added a test for select on a materialized view when the source table has a column mask.
- Modified analyzer tests related to alter, insert and drop of a materialized view.

Change-Id: If3108996124c6544a97fb0c34b6aff5e324a6cff
Reviewed-on: http://gerrit.cloudera.org:8080/17595
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
80 lines
2.0 KiB
Plaintext
====
---- QUERY
# Basic test on querying a view.
select count(int_col), count(bigint_col) from functional.alltypes_view
---- RESULTS
7300,7300
---- TYPES
BIGINT, BIGINT
====
---- QUERY
# Using views in union.
select bigint_col, string_col from functional.alltypes_view order by id limit 2
union all (select * from functional.complex_view) order by 1, 2 limit 10
---- RESULTS
0,'0'
2,'0'
2,'1'
10,'1'
---- TYPES
BIGINT, STRING
====
---- QUERY
# Using a view in subquery.
select t.* from (select * from functional.complex_view) t
order by t.abc, t.xyz desc limit 10;
---- RESULTS
2,'1'
2,'0'
---- TYPES
BIGINT, STRING
====
---- QUERY
# Using multiple views in a join.
select count(*) from functional.alltypes_view t1, functional.alltypes_view_sub t2
where t1.id < 10 and t2.x < 5 and t1.id = t2.x
---- RESULTS
3650
---- TYPES
BIGINT
====
---- QUERY
# Self-join of a view to make sure the join op is properly set
# in the cloned view instances.
select count(*) from functional.alltypes_view t1
left outer join functional.alltypes_view t2 on t1.id+10 = t2.id
full outer join functional.alltypes_view t3 on t2.id+20 = t3.id
---- RESULTS
7330
---- TYPES
BIGINT
====
---- QUERY
# Test that Impala can handle incorrect column metadata created by Hive (IMPALA-994).
select * from functional.alltypes_hive_view where id = 0
---- RESULTS
0,true,0,0,0,0,0,0,'01/01/09','0',2009-01-01 00:00:00,2009,1
---- TYPES
INT, BOOLEAN, TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, STRING, STRING, TIMESTAMP, INT, INT
====
---- QUERY
# Regression test for IMPALA-1010. This currently produces a bushy plan.
select STRAIGHT_JOIN c.id, d.date_string_col from
alltypessmall d join [SHUFFLE] (select a.id as id, b.date_string_col from
alltypessmall a join [SHUFFLE] alltypessmall b on (a.id = b.id)) c on c.id = d.id
order by c.id limit 2
---- RESULTS
0,'01/01/09'
1,'01/01/09'
---- TYPES
int, STRING
====
---- QUERY
# Simple select on a materialized view
select * from functional_orc_def.mv1_alltypes_jointbl where c3 = 1106;
---- RESULTS
0,true,1106,0,94612
---- TYPES
SMALLINT, BOOLEAN, BIGINT, BIGINT, INT
====