impala/testdata/datasets/functional/schema_constraints.csv
Matthew Jacobs 922ee70317 IMPALA-5336: Fix partition pruning when column is cast
Partition pruning has two mechanisms:
1) Simple predicates (e.g. binary predicates of the form
   <SlotRef> <op> <LiteralExpr>) can be used to derive lists
   of matching partition ids directly from the
   partition key values. This is handled directly in the FE
   and is very efficient for supported simple predicates.
2) General expr evaluation of predicates using the BE (via
   FeSupport). This works for all predicates, so is the
   mechanism used for predicates not supported by (1).

The issue was that (1) was being used when a binary
predicate contained an implicit cast on the SlotRef. While
this is OK when being evaluated by the BE, the simple
mechanism in (1) would not be able to match the partition
key values with the predicate literal because the partition
key values cannot be cast in the FE.

The fix is to force binary predicates involving a cast to be
evaluated in the BE.
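
The two mechanisms and the fix can be illustrated with a toy model. This is only a sketch in Python, not Impala's actual FE code: the classes `SlotRef`, `Cast`, and `BinaryPredicate`, the functions `prune_simple`, `prune_general`, and `prune`, and the sample partition map are all hypothetical names made up for illustration, and plain Python evaluation stands in for the BE call made via FeSupport.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class SlotRef:
    """Hypothetical stand-in for a reference to a partition key column."""
    name: str


@dataclass(frozen=True)
class Cast:
    """Hypothetical stand-in for an implicit cast wrapped around a SlotRef."""
    child: SlotRef
    target_type: type


@dataclass(frozen=True)
class BinaryPredicate:
    lhs: Union[SlotRef, Cast]
    op: str          # only "=" is modelled in this sketch
    literal: object


def is_simple(pred: BinaryPredicate) -> bool:
    # The fix in miniature: a predicate whose column is wrapped in a cast
    # is NOT treated as simple, so it falls through to general evaluation.
    return isinstance(pred.lhs, SlotRef) and pred.op == "="


def prune_simple(pred: BinaryPredicate, partitions: Dict[int, object]) -> List[int]:
    # Mechanism (1): compare the stored key values with the literal directly.
    # No cast is applied to the key values here.
    return [pid for pid, key in partitions.items() if key == pred.literal]


def prune_general(pred: BinaryPredicate, partitions: Dict[int, object]) -> List[int]:
    # Mechanism (2): evaluate the whole expression per partition. In this toy
    # model, Python evaluation replaces the BE round trip.
    matches = []
    for pid, key in partitions.items():
        value = pred.lhs.target_type(key) if isinstance(pred.lhs, Cast) else key
        if value == pred.literal:
            matches.append(pid)
    return matches


def prune(pred: BinaryPredicate, partitions: Dict[int, object]) -> List[int]:
    if is_simple(pred):
        return prune_simple(pred, partitions)
    return prune_general(pred, partitions)
```

With string partition keys such as `{1: "1", 2: "2", 3: "10"}` and a predicate equivalent to `CAST(key_col AS INT) = 2`, calling `prune_simple` directly finds no match because the uncast string `"2"` never equals the integer `2`, whereas `prune` routes the predicate to the general path and selects partition 2.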

Testing: A planner test was added to demonstrate that the
expected partition pruning occurs.

Some modifications were made to the functional schema table
stringpartitionkey, so it will be necessary to reload that
table:

load-data.py -w functional-query --table_names=stringpartitionkey

Change-Id: I94f597a6589f5e34d2b74abcd29be77c4161cd99
Reviewed-on: http://gerrit.cloudera.org:8080/7521
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Impala Public Jenkins
2017-07-31 21:49:17 +00:00

13 KiB

# Table level constraints:
# Allows for defining constraints on which file formats to generate for an individual
# table. The table name should match the base table name defined in the schema template
# file.
table_name:stringids, constraint:restrict_to, table_format:hbase/none/none
table_name:hbasecolumnfamilies, constraint:restrict_to, table_format:hbase/none/none
table_name:insertalltypesagg, constraint:restrict_to, table_format:hbase/none/none
table_name:alltypessmallbinary, constraint:restrict_to, table_format:hbase/none/none
table_name:insertalltypesaggbinary, constraint:restrict_to, table_format:hbase/none/none
table_name:hbasealltypeserror, constraint:restrict_to, table_format:hbase/none/none
table_name:hbasealltypeserrornonulls, constraint:restrict_to, table_format:hbase/none/none
table_name:alltypesinsert, constraint:restrict_to, table_format:text/none/none
table_name:stringpartitionkey, constraint:restrict_to, table_format:text/none/none
table_name:alltypesnopart_insert, constraint:restrict_to, table_format:text/none/none
table_name:insert_overwrite_nopart, constraint:restrict_to, table_format:text/none/none
table_name:insert_overwrite_partitioned, constraint:restrict_to, table_format:text/none/none
table_name:insert_string_partitioned, constraint:restrict_to, table_format:text/none/none
table_name:alltypesinsert, constraint:restrict_to, table_format:parquet/none/none
table_name:alltypesnopart_insert, constraint:restrict_to, table_format:parquet/none/none
table_name:alltypesinsert, constraint:restrict_to, table_format:text/none/none
table_name:alltypesnopart_insert, constraint:restrict_to, table_format:text/none/none
table_name:insert_overwrite_nopart, constraint:restrict_to, table_format:text/none/none
table_name:insert_overwrite_partitioned, constraint:restrict_to, table_format:text/none/none
table_name:insert_string_partitioned, constraint:restrict_to, table_format:text/none/none
table_name:alltypesinsert, constraint:restrict_to, table_format:parquet/none/none
table_name:alltypesnopart_insert, constraint:restrict_to, table_format:parquet/none/none
table_name:insert_overwrite_nopart, constraint:restrict_to, table_format:parquet/none/none
table_name:insert_overwrite_partitioned, constraint:restrict_to, table_format:parquet/none/none
table_name:insert_string_partitioned, constraint:restrict_to, table_format:parquet/none/none
table_name:old_rcfile_table, constraint:restrict_to, table_format:rc/none/none
table_name:bad_text_lzo, constraint:restrict_to, table_format:text/lzo/block
table_name:bad_text_gzip, constraint:restrict_to, table_format:text/gzip/block
table_name:bad_seq_snap, constraint:restrict_to, table_format:seq/snap/block
table_name:bad_avro_snap_strings, constraint:restrict_to, table_format:avro/snap/block
table_name:bad_avro_snap_floats, constraint:restrict_to, table_format:avro/snap/block
table_name:bad_avro_decimal_schema, constraint:restrict_to, table_format:avro/snap/block
table_name:bad_parquet, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_parquet_strings_negative_len, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_parquet_strings_out_of_bounds, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_magic_number, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_metadata_len, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_dict_page_offset, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_compressed_size, constraint:restrict_to, table_format:parquet/none/none
table_name:alltypesagg_hive_13_1, constraint:restrict_to, table_format:parquet/none/none
table_name:kite_required_fields, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_column_metadata, constraint:restrict_to, table_format:parquet/none/none
table_name:lineitem_multiblock, constraint:restrict_to, table_format:parquet/none/none
table_name:lineitem_sixblocks, constraint:restrict_to, table_format:parquet/none/none
table_name:lineitem_multiblock_one_row_group, constraint:restrict_to, table_format:parquet/none/none
# TODO: Support Avro. Data loading currently fails for Avro because complex types
# cannot be converted to the corresponding Avro types yet.
table_name:allcomplextypes, constraint:restrict_to, table_format:text/none/none
table_name:allcomplextypes, constraint:restrict_to, table_format:parquet/none/none
table_name:allcomplextypes, constraint:restrict_to, table_format:hbase/none/none
table_name:functional, constraint:restrict_to, table_format:text/none/none
table_name:complextypes_fileformat, constraint:restrict_to, table_format:text/none/none
table_name:complextypes_fileformat, constraint:restrict_to, table_format:parquet/none/none
table_name:complextypes_fileformat, constraint:restrict_to, table_format:avro/snap/block
table_name:complextypes_fileformat, constraint:restrict_to, table_format:rc/snap/block
table_name:complextypes_fileformat, constraint:restrict_to, table_format:seq/snap/block
table_name:complextypes_multifileformat, constraint:restrict_to, table_format:text/none/none
# TODO: Avro
table_name:complextypestbl, constraint:restrict_to, table_format:parquet/none/none
table_name:alltypeserror, constraint:exclude, table_format:parquet/none/none
table_name:alltypeserrornonulls, constraint:exclude, table_format:parquet/none/none
table_name:unsupported_types, constraint:exclude, table_format:parquet/none/none
table_name:escapechartesttable, constraint:exclude, table_format:parquet/none/none
table_name:TblWithRaggedColumns, constraint:exclude, table_format:parquet/none/none
# the text_ tables are for testing text delimiters and escape chars in text files
table_name:text_comma_backslash_newline, constraint:restrict_to, table_format:text/none/none
table_name:text_dollar_hash_pipe, constraint:restrict_to, table_format:text/none/none
table_name:text_thorn_ecirc_newline, constraint:restrict_to, table_format:text/none/none
table_name:bad_serde, constraint:restrict_to, table_format:text/none/none
table_name:rcfile_lazy_binary_serde, constraint:restrict_to, table_format:rc/none/none
table_name:unsupported_partition_types, constraint:restrict_to, table_format:text/none/none
table_name:nullformat_custom, constraint:exclude, table_format:parquet/none/none
table_name:alltypes_view, constraint:restrict_to, table_format:text/none/none
table_name:allcomplextypes_view, constraint:restrict_to, table_format:text/none/none
table_name:alltypes_view, constraint:restrict_to, table_format:seq/snap/block
table_name:alltypes_hive_view, constraint:restrict_to, table_format:text/none/none
table_name:alltypes_view_sub, constraint:restrict_to, table_format:text/none/none
table_name:alltypes_view_sub, constraint:restrict_to, table_format:seq/snap/block
table_name:alltypes_parens, constraint:restrict_to, table_format:text/none/none
table_name:complex_view, constraint:restrict_to, table_format:text/none/none
table_name:complex_view, constraint:restrict_to, table_format:seq/snap/block
table_name:view_view, constraint:restrict_to, table_format:text/none/none
table_name:view_view, constraint:restrict_to, table_format:seq/snap/block
table_name:subquery_view, constraint:restrict_to, table_format:seq/snap/block
table_name:subquery_view, constraint:restrict_to, table_format:rc/none/none
# liketbl and tblwithraggedcolumns all have
# NULLs in primary key columns. hbase does not support
# writing NULLs to primary key columns.
table_name:liketbl, constraint:exclude, table_format:hbase/none/none
table_name:tblwithraggedcolumns, constraint:exclude, table_format:hbase/none/none
# Tables with only one column are not supported in hbase.
table_name:greptiny, constraint:exclude, table_format:hbase/none/none
table_name:tinyinttable, constraint:exclude, table_format:hbase/none/none
# overflow uses a manually constructed text file which doesn't make sense to write to
# other table formats since the values that would be written are different (e.g. already
# truncated.)
table_name:overflow, constraint:restrict_to, table_format:text/none/none
# widerow has a single column with a single row containing a 10MB string. hbase doesn't
# seem to like this.
table_name:widerow, constraint:exclude, table_format:hbase/none/none
# nullformat_custom is used in null-insert tests, which use insert overwrite,
# which is not supported in hbase. The schema is also specified in HIVE_CREATE
# with no corresponding LOAD statement.
table_name:nullformat_custom, constraint:exclude, table_format:hbase/none/none
table_name:unsupported_types, constraint:exclude, table_format:hbase/none/none
# Decimal can only be tested on formats Impala can write to (text and parquet).
# TODO: add Avro once Hive or Impala can write Avro decimals
table_name:decimal_tbl, constraint:restrict_to, table_format:text/none/none
table_name:decimal_tiny, constraint:restrict_to, table_format:text/none/none
table_name:decimal_tbl, constraint:restrict_to, table_format:parquet/none/none
table_name:decimal_tiny, constraint:restrict_to, table_format:parquet/none/none
table_name:avro_decimal_tbl, constraint:restrict_to, table_format:avro/snap/block
# TODO first set of tests are for text/none/none
table_name:chars_tiny, constraint:restrict_to, table_format:text/none/none
# invalid_decimal_part_tbl[1,2,3] tables are used for testing invalid decimal
# partition key values (see IMPALA-1040)
table_name:invalid_decimal_part_tbl1, constraint:restrict_to, table_format:text/none/none
table_name:invalid_decimal_part_tbl2, constraint:restrict_to, table_format:text/none/none
table_name:invalid_decimal_part_tbl3, constraint:restrict_to, table_format:text/none/none
table_name:avro_decimal_tbl, constraint:restrict_to, table_format:avro/snap/block
# testescape tables are used for testing text scanner delimiter handling
table_name:table_no_newline, constraint:restrict_to, table_format:text/none/none
table_name:table_no_newline_part, constraint:restrict_to, table_format:text/none/none
table_name:testescape_16_lf, constraint:restrict_to, table_format:text/none/none
table_name:testescape_16_crlf, constraint:restrict_to, table_format:text/none/none
table_name:testescape_17_lf, constraint:restrict_to, table_format:text/none/none
table_name:testescape_17_crlf, constraint:restrict_to, table_format:text/none/none
table_name:testescape_32_lf, constraint:restrict_to, table_format:text/none/none
table_name:testescape_32_crlf, constraint:restrict_to, table_format:text/none/none
# alltimezones is used to verify that impala properly deals with timezones
table_name:alltimezones, constraint:restrict_to, table_format:text/none/none
# Avro schema is inferred from the column definitions (IMPALA-1136)
table_name:no_avro_schema, constraint:restrict_to, table_format:avro/snap/block
table_name:avro_unicode_nulls, constraint:restrict_to, table_format:avro/snap/block
# test single and multi stream bz2 files
table_name:bzip2_tbl, constraint:restrict_to, table_format:text/bzip/block
table_name:large_bzip2_tbl, constraint:restrict_to, table_format:text/bzip/block
table_name:multistream_bzip2_tbl, constraint:restrict_to, table_format:text/bzip/block
table_name:large_multistream_bzip2_tbl, constraint:restrict_to, table_format:text/bzip/block
# Kudu can't handle certain types such as timestamp so we pick and choose the tables
# we actually use for Kudu related tests.
table_name:alltypes, constraint:only, table_format:kudu/none/none
table_name:alltypessmall, constraint:only, table_format:kudu/none/none
table_name:alltypestiny, constraint:only, table_format:kudu/none/none
table_name:alltypesagg, constraint:only, table_format:kudu/none/none
table_name:alltypesaggnonulls, constraint:only, table_format:kudu/none/none
table_name:testtbl, constraint:only, table_format:kudu/none/none
table_name:jointbl, constraint:only, table_format:kudu/none/none
table_name:emptytable, constraint:only, table_format:kudu/none/none
table_name:dimtbl, constraint:only, table_format:kudu/none/none
table_name:tinytable, constraint:only, table_format:kudu/none/none
table_name:tinyinttable, constraint:only, table_format:kudu/none/none
table_name:zipcode_incomes, constraint:only, table_format:kudu/none/none
table_name:nulltable, constraint:only, table_format:kudu/none/none
table_name:nullescapedtable, constraint:only, table_format:kudu/none/none
# Skipping header lines is only effective with text tables
table_name:table_with_header, constraint:restrict_to, table_format:text/none/none
table_name:table_with_header_2, constraint:restrict_to, table_format:text/none/none
table_name:table_with_header_insert, constraint:restrict_to, table_format:text/none/none
# We also test that skipping header lines works on compressed tables (IMPALA-5287)
table_name:table_with_header, constraint:restrict_to, table_format:text/gzip/block
table_name:table_with_header_2, constraint:restrict_to, table_format:text/gzip/block
table_name:table_with_header_insert, constraint:restrict_to, table_format:text/gzip/block
# Inserting into parquet tables should not be affected by the 'skip.header.line.count'
# property, so we test parquet format as well.
table_name:table_with_header_insert, constraint:restrict_to, table_format:parquet/none/none
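
The entries above follow a regular `key:value, key:value, key:value` layout, so they can be read with a few lines of Python. The sketch below is not the data loading scripts' own parser: `parse_constraints` and `should_generate` are hypothetical helpers, and the meaning given to `restrict_to`, `exclude`, and `only` is an assumption inferred from the constraint names and the comments in this file. Lowercasing table names is likewise an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

Constraints = Dict[str, Dict[str, List[str]]]


def parse_constraints(lines: Iterable[str]) -> Constraints:
    """Return {constraint: {table_name: [table_format, ...]}}."""
    result = defaultdict(lambda: defaultdict(list))
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Each field looks like "key:value"; split on the first colon only.
        fields = dict(part.strip().split(":", 1) for part in line.split(","))
        # Assumption: table names are matched case-insensitively.
        table = fields["table_name"].lower()
        result[fields["constraint"]][table].append(fields["table_format"])
    return {name: dict(tables) for name, tables in result.items()}


def should_generate(table: str, table_format: str, constraints: Constraints) -> bool:
    """Decide whether a table is generated for a format (assumed semantics)."""
    table = table.lower()
    # restrict_to: the table is generated only for the formats listed for it.
    restrict = constraints.get("restrict_to", {})
    if table in restrict and table_format not in restrict[table]:
        return False
    # exclude: the table is never generated for the listed formats.
    if table_format in constraints.get("exclude", {}).get(table, []):
        return False
    # only: a format named in any "only" entry accepts just the listed tables.
    only = constraints.get("only", {})
    formats_with_only = {fmt for fmts in only.values() for fmt in fmts}
    if table_format in formats_with_only and table_format not in only.get(table, []):
        return False
    return True
```

Under these assumptions, reading the file with `parse_constraints(open(path))` and calling `should_generate("stringpartitionkey", "parquet/none/none", constraints)` would return `False`, because that table is restricted to `text/none/none`.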