impala/testdata/datasets/functional/schema_constraints.csv
Zoltan Borok-Nagy e91c7810f0 IMPALA-10850: Interpret timestamp predicates in local timezone in IcebergScanNode
IcebergScanNode interprets the timestamp literals as UTC timestamps
during predicate pushdown to Iceberg. It causes problems when the
Iceberg table uses TIMESTAMPTZ (which corresponds to TIMESTAMP WITH
LOCAL TIME ZONE in SQL) because in the scanners we assume that the
timestamp literals in a query are in local timezone.

Hence, if the Iceberg table is partitioned by HOUR(ts), and Impala is
running in a different timezone than UTC, then the following query
doesn't return any rows:

 SELECT * from t
 WHERE ts = <some ts>;

The query returns no rows because predicate pushdown interprets the
timestamp as a UTC timestamp (no conversion from local to UTC), whereas
during query execution the timestamp data in the files is converted to
the local timezone and only then compared to <some ts>. In other words,
the scanner assumes that <some ts> is in the local timezone.

On the other hand, when Iceberg type TIMESTAMP (which corresponds
to TIMESTAMP WITHOUT TIME ZONE in SQL) is used, then we should just
push down the timestamp values without any conversion. In this case
there is no conversion in the scanners either.
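
The mismatch described above can be reproduced with a small sketch.
This is not the IcebergScanNode code: the function names, the fixed
UTC+2 offset standing in for the local timezone, and the sample literal
are all assumptions made for illustration. The hour ordinal (hours since
1970-01-01 00:00:00 UTC) is used as a stand-in for the HOUR(ts)
partition value.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def hours_since_epoch(ts_utc):
    """Hour ordinal of a UTC instant, used here as the HOUR(ts) partition value."""
    return int((ts_utc - EPOCH).total_seconds() // 3600)


def pushdown_value(literal, local_tz, with_local_time_zone):
    """Return the value to push down for a timestamp literal (hypothetical helper).

    with_local_time_zone=True  -> TIMESTAMPTZ column: the literal is local
                                  time, so convert it to UTC first.
    with_local_time_zone=False -> TIMESTAMP column: push the value down
                                  unchanged (tagged UTC only for arithmetic).
    """
    if with_local_time_zone:
        return literal.replace(tzinfo=local_tz).astimezone(timezone.utc)
    return literal.replace(tzinfo=timezone.utc)


# Assumption: the session runs at a fixed UTC+2 offset.
local_tz = timezone(timedelta(hours=2))
literal = datetime(2022, 4, 21, 12, 30, 0)  # <some ts> as typed in the query

# The row was written at local 12:30, which is stored as 10:30 UTC.
stored_utc = literal.replace(tzinfo=local_tz).astimezone(timezone.utc)

buggy = literal.replace(tzinfo=timezone.utc)   # literal treated as UTC
fixed = pushdown_value(literal, local_tz, True)

print(hours_since_epoch(buggy) == hours_since_epoch(stored_utc))  # False
print(hours_since_epoch(fixed) == hours_since_epoch(stored_utc))  # True
```

With the literal treated as UTC, the pushed-down hour is two hours
away from the hour of the stored row, so the matching partition is
pruned. Converting from local time first selects the right partition.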

Testing:
 * added e2e test with TIMESTAMPTZ
 * added e2e test with TIMESTAMP

Change-Id: I181be5d2fa004f69b457f69ff82dc2f9877f46fa
Reviewed-on: http://gerrit.cloudera.org:8080/18399
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
2022-04-21 12:49:31 +00:00

335 lines
23 KiB
CSV

# Table level constraints:
# Allows for defining constraints on which file formats to generate for an individual
# table. The table name should match the base table name defined in the schema template
# file.
table_name:stringids, constraint:restrict_to, table_format:hbase/none/none
table_name:hbasecolumnfamilies, constraint:restrict_to, table_format:hbase/none/none
table_name:insertalltypesagg, constraint:restrict_to, table_format:hbase/none/none
table_name:alltypessmallbinary, constraint:restrict_to, table_format:hbase/none/none
table_name:insertalltypesaggbinary, constraint:restrict_to, table_format:hbase/none/none
table_name:hbasealltypeserror, constraint:restrict_to, table_format:hbase/none/none
table_name:hbasealltypeserrornonulls, constraint:restrict_to, table_format:hbase/none/none
table_name:alltypes_date_partition, constraint:restrict_to, table_format:text/none/none
table_name:alltypesinsert, constraint:restrict_to, table_format:text/none/none
table_name:alltypes_promoted, constraint:restrict_to, table_format:orc/def/block
table_name:alltypes_deleted_rows, constraint:restrict_to, table_format:orc/def/block
table_name:stringpartitionkey, constraint:restrict_to, table_format:text/none/none
table_name:alltypesnopart_insert, constraint:restrict_to, table_format:text/none/none
table_name:insert_overwrite_nopart, constraint:restrict_to, table_format:text/none/none
table_name:insert_overwrite_partitioned, constraint:restrict_to, table_format:text/none/none
table_name:insert_string_partitioned, constraint:restrict_to, table_format:text/none/none
table_name:alltypesinsert, constraint:restrict_to, table_format:parquet/none/none
table_name:alltypesnopart_insert, constraint:restrict_to, table_format:parquet/none/none
table_name:alltypesinsert, constraint:restrict_to, table_format:text/none/none
table_name:alltypesnopart_insert, constraint:restrict_to, table_format:text/none/none
table_name:insert_overwrite_nopart, constraint:restrict_to, table_format:text/none/none
table_name:insert_overwrite_partitioned, constraint:restrict_to, table_format:text/none/none
table_name:insert_string_partitioned, constraint:restrict_to, table_format:text/none/none
table_name:alltypesinsert, constraint:restrict_to, table_format:parquet/none/none
table_name:alltypesnopart_insert, constraint:restrict_to, table_format:parquet/none/none
table_name:insert_overwrite_nopart, constraint:restrict_to, table_format:parquet/none/none
table_name:insert_overwrite_partitioned, constraint:restrict_to, table_format:parquet/none/none
table_name:insert_string_partitioned, constraint:restrict_to, table_format:parquet/none/none
table_name:old_rcfile_table, constraint:restrict_to, table_format:rc/none/none
table_name:bad_text_gzip, constraint:restrict_to, table_format:text/gzip/block
table_name:bad_seq_snap, constraint:restrict_to, table_format:seq/snap/block
table_name:bad_avro_snap_strings, constraint:restrict_to, table_format:avro/snap/block
table_name:bad_avro_snap_floats, constraint:restrict_to, table_format:avro/snap/block
table_name:bad_avro_decimal_schema, constraint:restrict_to, table_format:avro/snap/block
table_name:bad_avro_date_out_of_range, constraint:restrict_to, table_format:avro/snap/block
table_name:hive2_bad_avro_date_pre_gregorian, constraint:restrict_to, table_format:avro/snap/block
table_name:hive3_avro_date_pre_gregorian, constraint:restrict_to, table_format:avro/snap/block
table_name:bad_parquet, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_parquet_strings_negative_len, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_parquet_strings_out_of_bounds, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_parquet_decimals, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_magic_number, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_metadata_len, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_dict_page_offset, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_compressed_size, constraint:restrict_to, table_format:parquet/none/none
table_name:alltypesagg_hive_13_1, constraint:restrict_to, table_format:parquet/none/none
table_name:kite_required_fields, constraint:restrict_to, table_format:parquet/none/none
table_name:bad_column_metadata, constraint:restrict_to, table_format:parquet/none/none
table_name:lineitem_multiblock, constraint:restrict_to, table_format:parquet/none/none
table_name:lineitem_sixblocks, constraint:restrict_to, table_format:parquet/none/none
table_name:lineitem_multiblock_one_row_group, constraint:restrict_to, table_format:parquet/none/none
table_name:customer_multiblock, constraint:restrict_to, table_format:parquet/none/none
table_name:hudi_partitioned, constraint:restrict_to, table_format:parquet/none/none
table_name:hudi_non_partitioned, constraint:restrict_to, table_format:parquet/none/none
table_name:hudi_as_parquet, constraint:restrict_to, table_format:parquet/none/none
# Iceberg tests are executed in the PARQUET file format dimension
table_name:airports_orc, constraint:restrict_to, table_format:parquet/none/none
table_name:airports_parquet, constraint:restrict_to, table_format:parquet/none/none
table_name:complextypestbl_iceberg_orc, constraint:restrict_to, table_format:parquet/none/none
table_name:hadoop_catalog_test_external, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_int_partitioned, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_non_partitioned, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_partitioned, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_partitioned_orc_external, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_partition_transforms_zorder, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_resolution_test_external, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_alltypes_part, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_alltypes_part_orc, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_legacy_partition_schema_evolution, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_legacy_partition_schema_evolution_orc, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_timestamp_part, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_timestamptz_part, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_uppercase_col, constraint:restrict_to, table_format:parquet/none/none
table_name:iceberg_v2_delete_positional, constraint:restrict_to, table_format:parquet/none/none
# TODO: Support Avro. Data loading currently fails for Avro because complex types
# cannot be converted to the corresponding Avro types yet.
table_name:allcomplextypes, constraint:restrict_to, table_format:text/none/none
table_name:allcomplextypes, constraint:restrict_to, table_format:parquet/none/none
table_name:allcomplextypes, constraint:restrict_to, table_format:hbase/none/none
table_name:functional, constraint:restrict_to, table_format:text/none/none
table_name:complextypes_fileformat, constraint:restrict_to, table_format:text/none/none
table_name:complextypes_fileformat, constraint:restrict_to, table_format:parquet/none/none
table_name:complextypes_fileformat, constraint:restrict_to, table_format:avro/snap/block
table_name:complextypes_fileformat, constraint:restrict_to, table_format:rc/snap/block
table_name:complextypes_fileformat, constraint:restrict_to, table_format:seq/snap/block
table_name:complextypes_fileformat, constraint:restrict_to, table_format:orc/def/block
table_name:complextypes_multifileformat, constraint:restrict_to, table_format:text/none/none
# TODO: Avro
table_name:complextypestbl, constraint:restrict_to, table_format:parquet/none/none
table_name:complextypestbl, constraint:restrict_to, table_format:orc/def/block
table_name:complextypestbl_minor_compacted, constraint:restrict_to, table_format:orc/def/block
table_name:complextypestbl_deleted_rows, constraint:restrict_to, table_format:orc/def/block
table_name:complextypestbl_medium, constraint:restrict_to, table_format:parquet/none/none
table_name:complextypestbl_medium, constraint:restrict_to, table_format:orc/def/block
table_name:complextypestbl_non_transactional, constraint:restrict_to, table_format:orc/def/block
table_name:pos_item_key_value_complextypestbl, constraint:restrict_to, table_format:orc/def/block
table_name:pos_item_key_value_complextypestbl, constraint:restrict_to, table_format:parquet/none/none
table_name:alltypes_structs, constraint:restrict_to, table_format:parquet/none/none
table_name:alltypes_structs, constraint:restrict_to, table_format:orc/def/block
table_name:complextypes_structs, constraint:restrict_to, table_format:parquet/none/none
table_name:complextypes_structs, constraint:restrict_to, table_format:orc/def/block
table_name:complextypes_nested_structs, constraint:restrict_to, table_format:parquet/none/none
table_name:complextypes_nested_structs, constraint:restrict_to, table_format:orc/def/block
table_name:complextypes_arrays, constraint:restrict_to, table_format:parquet/none/none
table_name:complextypes_arrays, constraint:restrict_to, table_format:orc/def/block
table_name:alltypeserror, constraint:exclude, table_format:parquet/none/none
table_name:alltypeserrornonulls, constraint:exclude, table_format:parquet/none/none
table_name:unsupported_types, constraint:exclude, table_format:parquet/none/none
table_name:escapechartesttable, constraint:exclude, table_format:parquet/none/none
table_name:TblWithRaggedColumns, constraint:exclude, table_format:parquet/none/none
# the text_ tables are for testing delimiters and escape chars in text files
table_name:text_comma_backslash_newline, constraint:restrict_to, table_format:text/none/none
table_name:text_dollar_hash_pipe, constraint:restrict_to, table_format:text/none/none
table_name:text_thorn_ecirc_newline, constraint:restrict_to, table_format:text/none/none
table_name:bad_serde, constraint:restrict_to, table_format:text/none/none
table_name:rcfile_lazy_binary_serde, constraint:restrict_to, table_format:rc/none/none
table_name:unsupported_partition_types, constraint:restrict_to, table_format:text/none/none
table_name:nullformat_custom, constraint:exclude, table_format:parquet/none/none
table_name:alltypes_view, constraint:restrict_to, table_format:text/none/none
table_name:allcomplextypes_view, constraint:restrict_to, table_format:text/none/none
table_name:alltypes_view, constraint:restrict_to, table_format:seq/snap/block
table_name:alltypes_hive_view, constraint:restrict_to, table_format:text/none/none
table_name:alltypes_view_sub, constraint:restrict_to, table_format:text/none/none
table_name:alltypes_view_sub, constraint:restrict_to, table_format:seq/snap/block
table_name:alltypes_parens, constraint:restrict_to, table_format:text/none/none
table_name:complex_view, constraint:restrict_to, table_format:text/none/none
table_name:complex_view, constraint:restrict_to, table_format:seq/snap/block
table_name:view_view, constraint:restrict_to, table_format:text/none/none
table_name:view_view, constraint:restrict_to, table_format:seq/snap/block
table_name:subquery_view, constraint:restrict_to, table_format:seq/snap/block
table_name:subquery_view, constraint:restrict_to, table_format:rc/none/none
# liketbl, tblwithraggedcolumns and manynulls all have
# NULLs in primary key columns. hbase does not support
# writing NULLs to primary key columns.
table_name:liketbl, constraint:exclude, table_format:hbase/none/none
table_name:manynulls, constraint:exclude, table_format:hbase/none/none
table_name:tblwithraggedcolumns, constraint:exclude, table_format:hbase/none/none
# Tables with only one column are not supported in hbase.
table_name:greptiny, constraint:exclude, table_format:hbase/none/none
table_name:tinyinttable, constraint:exclude, table_format:hbase/none/none
# overflow uses a manually constructed text file, which doesn't make sense to write to
# other table formats since the values that would be written are different (e.g. already
# truncated).
table_name:overflow, constraint:restrict_to, table_format:text/none/none
# widerow has a single column with a single row containing a 10MB string. hbase doesn't
# seem to like this.
table_name:widerow, constraint:exclude, table_format:hbase/none/none
# nullformat_custom is used in null-insert tests, which use insert overwrite,
# which is not supported in hbase. The schema is also specified in HIVE_CREATE
# with no corresponding LOAD statement.
table_name:nullformat_custom, constraint:exclude, table_format:hbase/none/none
table_name:unsupported_types, constraint:exclude, table_format:hbase/none/none
# Decimal can only be tested on formats Impala can write to (text and parquet).
# TODO: add Avro once Hive or Impala can write Avro decimals
table_name:decimal_tbl, constraint:restrict_to, table_format:text/none/none
table_name:decimal_tiny, constraint:restrict_to, table_format:text/none/none
table_name:decimal_tbl, constraint:restrict_to, table_format:parquet/none/none
table_name:decimal_tiny, constraint:restrict_to, table_format:parquet/none/none
table_name:decimal_tbl, constraint:restrict_to, table_format:kudu/none/none
table_name:decimal_tiny, constraint:restrict_to, table_format:kudu/none/none
table_name:decimal_tbl, constraint:restrict_to, table_format:orc/def/block
table_name:decimal_tiny, constraint:restrict_to, table_format:orc/def/block
table_name:decimal_rtf_tbl, constraint:restrict_to, table_format:text/none/none
table_name:decimal_rtf_tbl, constraint:restrict_to, table_format:parquet/none/none
table_name:decimal_rtf_tbl, constraint:restrict_to, table_format:kudu/none/none
table_name:decimal_rtf_tbl, constraint:restrict_to, table_format:orc/def/block
table_name:decimal_rtf_tiny_tbl, constraint:restrict_to, table_format:text/none/none
table_name:decimal_rtf_tiny_tbl, constraint:restrict_to, table_format:parquet/none/none
table_name:decimal_rtf_tiny_tbl, constraint:restrict_to, table_format:kudu/none/none
table_name:decimal_rtf_tiny_tbl, constraint:restrict_to, table_format:orc/def/block
table_name:avro_decimal_tbl, constraint:restrict_to, table_format:avro/snap/block
# CHAR is not supported by HBase.
table_name:chars_tiny, constraint:exclude, table_format:hbase/none/none
table_name:chars_medium, constraint:exclude, table_format:hbase/none/none
# invalid_decimal_part_tbl[1,2,3] tables are used for testing invalid decimal
# partition key values (see IMPALA-1040)
table_name:invalid_decimal_part_tbl1, constraint:restrict_to, table_format:text/none/none
table_name:invalid_decimal_part_tbl2, constraint:restrict_to, table_format:text/none/none
table_name:invalid_decimal_part_tbl3, constraint:restrict_to, table_format:text/none/none
table_name:avro_decimal_tbl, constraint:restrict_to, table_format:avro/snap/block
# testescape tables are used for testing text scanner delimiter handling
table_name:table_no_newline, constraint:restrict_to, table_format:text/none/none
table_name:table_no_newline_part, constraint:restrict_to, table_format:text/none/none
table_name:testescape_16_lf, constraint:restrict_to, table_format:text/none/none
table_name:testescape_16_crlf, constraint:restrict_to, table_format:text/none/none
table_name:testescape_17_lf, constraint:restrict_to, table_format:text/none/none
table_name:testescape_17_crlf, constraint:restrict_to, table_format:text/none/none
table_name:testescape_32_lf, constraint:restrict_to, table_format:text/none/none
table_name:testescape_32_crlf, constraint:restrict_to, table_format:text/none/none
# alltimezones is used to verify that Impala properly deals with timezones
table_name:alltimezones, constraint:restrict_to, table_format:text/none/none
# Avro schema is inferred from the column definitions (IMPALA-1136)
table_name:no_avro_schema, constraint:restrict_to, table_format:avro/snap/block
table_name:avro_unicode_nulls, constraint:restrict_to, table_format:avro/snap/block
# test single and multi stream bz2 files
table_name:bzip2_tbl, constraint:restrict_to, table_format:text/bzip/block
table_name:large_bzip2_tbl, constraint:restrict_to, table_format:text/bzip/block
table_name:multistream_bzip2_tbl, constraint:restrict_to, table_format:text/bzip/block
table_name:large_multistream_bzip2_tbl, constraint:restrict_to, table_format:text/bzip/block
# Kudu can't handle certain types such as timestamp so we pick and choose the tables
# we actually use for Kudu related tests.
table_name:alltypes, constraint:only, table_format:kudu/none/none
table_name:alltypessmall, constraint:only, table_format:kudu/none/none
table_name:alltypestiny, constraint:only, table_format:kudu/none/none
table_name:alltypesagg, constraint:only, table_format:kudu/none/none
table_name:alltypesaggnonulls, constraint:only, table_format:kudu/none/none
table_name:testtbl, constraint:only, table_format:kudu/none/none
table_name:jointbl, constraint:only, table_format:kudu/none/none
table_name:emptytable, constraint:only, table_format:kudu/none/none
table_name:dimtbl, constraint:only, table_format:kudu/none/none
table_name:tinytable, constraint:only, table_format:kudu/none/none
table_name:tinyinttable, constraint:only, table_format:kudu/none/none
table_name:zipcode_incomes, constraint:only, table_format:kudu/none/none
table_name:nulltable, constraint:only, table_format:kudu/none/none
table_name:nullrows, constraint:only, table_format:kudu/none/none
table_name:nullescapedtable, constraint:only, table_format:kudu/none/none
table_name:decimal_tbl, constraint:only, table_format:kudu/none/none
table_name:decimal_rtf_tbl, constraint:only, table_format:kudu/none/none
table_name:decimal_rtf_tiny_tbl, constraint:only, table_format:kudu/none/none
table_name:decimal_tiny, constraint:only, table_format:kudu/none/none
table_name:strings_with_quotes, constraint:only, table_format:kudu/none/none
table_name:manynulls, constraint:only, table_format:kudu/none/none
table_name:date_tbl, constraint:only, table_format:kudu/none/none
# Skipping header lines is only effective with text tables
table_name:table_with_header, constraint:restrict_to, table_format:text/none/none
table_name:table_with_header_2, constraint:restrict_to, table_format:text/none/none
table_name:table_with_header_insert, constraint:restrict_to, table_format:text/none/none
# We also test that skipping header lines works on compressed tables (IMPALA-5287)
table_name:table_with_header, constraint:restrict_to, table_format:text/gzip/block
table_name:table_with_header_2, constraint:restrict_to, table_format:text/gzip/block
table_name:table_with_header_insert, constraint:restrict_to, table_format:text/gzip/block
# Inserting into parquet tables should not be affected by the 'skip.header.line.count'
# property, so we test parquet format as well.
table_name:table_with_header_insert, constraint:restrict_to, table_format:parquet/none/none
# IMPALA-7368/IMPALA-7370/IMPALA-8198 adds DATE support for text, hbase, parquet and avro.
# IMPALA-8801 adds DATE support for ORC.
# IMPALA-8800 adds DATE support for Kudu.
# Other file-formats will be introduced later.
table_name:date_tbl, constraint:restrict_to, table_format:parquet/none/none
table_name:date_tbl, constraint:restrict_to, table_format:avro/snap/block
table_name:date_tbl, constraint:restrict_to, table_format:orc/def/block
table_name:date_tbl, constraint:restrict_to, table_format:hbase/none/none
table_name:date_tbl, constraint:restrict_to, table_format:kudu/none/none
table_name:date_tbl, constraint:restrict_to, table_format:text/none/none
table_name:date_tbl, constraint:restrict_to, table_format:text/bzip/block
table_name:date_tbl, constraint:restrict_to, table_format:text/gzip/block
table_name:date_tbl, constraint:restrict_to, table_format:text/snap/block
table_name:date_tbl, constraint:restrict_to, table_format:text/def/block
table_name:date_tbl_error, constraint:restrict_to, table_format:text/none/none
table_name:date_tbl_error, constraint:restrict_to, table_format:text/bzip/block
table_name:date_tbl_error, constraint:restrict_to, table_format:text/gzip/block
table_name:date_tbl_error, constraint:restrict_to, table_format:text/snap/block
table_name:date_tbl_error, constraint:restrict_to, table_format:text/def/block
table_name:insert_date_tbl, constraint:restrict_to, table_format:hbase/none/none
# Full transactional table is only supported for ORC
table_name:full_transactional_table, constraint:restrict_to, table_format:orc/def/block
# Insert-only transactional tables only work for file-format based tables
table_name:insert_only_transactional_table, constraint:exclude, table_format:hbase/none/none
table_name:insert_only_transactional_table, constraint:exclude, table_format:kudu/none/none
table_name:insertonly_nopart_insert, constraint:restrict_to, table_format:text/none/none
table_name:insertonly_nopart_insert, constraint:restrict_to, table_format:parquet/none/none
table_name:insertonly_part_insert, constraint:restrict_to, table_format:text/none/none
table_name:insertonly_part_insert, constraint:restrict_to, table_format:parquet/none/none
# A materialized view is based on one or more transactional (in this case insert-only)
# base tables, so the MVs need to be excluded for the table formats where the base
# tables are excluded
table_name:materialized_view, constraint:exclude, table_format:hbase/none/none
table_name:materialized_view, constraint:exclude, table_format:kudu/none/none
table_name:mv1_alltypes_jointbl, constraint:restrict_to, table_format:orc/def/block
table_name:mv2_alltypes_jointbl, constraint:restrict_to, table_format:orc/def/block
table_name:insert_only_transactional_bucketed_table, constraint:exclude, table_format:hbase/none/none
table_name:insert_only_transactional_bucketed_table, constraint:exclude, table_format:kudu/none/none
# Bucketed tables only work for file-format based tables
table_name:bucketed_ext_table, constraint:exclude, table_format:hbase/none/none
table_name:bucketed_ext_table, constraint:exclude, table_format:kudu/none/none
table_name:bucketed_table, constraint:exclude, table_format:hbase/none/none
table_name:bucketed_table, constraint:exclude, table_format:kudu/none/none
# The uncompressed ORC tables are mainly used in test_scanners_fuzz.py to avoid creating
# them each time when running the test. Developers may run this test many times locally.
table_name:uncomp_src_alltypes, constraint:restrict_to, table_format:orc/def/block
table_name:uncomp_src_decimal_tbl, constraint:restrict_to, table_format:orc/def/block
table_name:part_strings_with_quotes, constraint:restrict_to, table_format:text/none/none
# 'alltypessmall_bool_sorted' only used in ORC tests.
table_name:alltypessmall_bool_sorted, constraint:restrict_to, table_format:orc/def/block
table_name:complextypes_arrays_only_view, constraint:restrict_to, table_format:parquet/none/none
table_name:complextypes_arrays_only_view, constraint:restrict_to, table_format:orc/def/block
# 'alltypestiny_negative' only used in ORC tests.
table_name:alltypestiny_negative, constraint:restrict_to, table_format:orc/def/block
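
Every constraint line above has the same three `key:value` fields, so the file can be read with a few lines of Python. The sketch below is not Impala's own loader. The function names are hypothetical, and the semantics of the three constraint kinds are assumptions inferred from the comments in this file: `restrict_to` lists the only formats a table is generated for, `exclude` lists formats a table is skipped for, and `only` lists the only tables generated for a given format.

```python
from collections import defaultdict


def parse_constraints(lines):
    """Parse 'table_name:X, constraint:Y, table_format:Z' lines.

    Returns (restrict_to, exclude, only), where restrict_to and exclude map
    a table name to a set of formats, and only maps a format to a set of
    table names.
    """
    restrict_to = defaultdict(set)
    exclude = defaultdict(set)
    only = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = dict(part.strip().split(":", 1) for part in line.split(","))
        table = fields["table_name"].lower()  # names appear in mixed case
        kind = fields["constraint"]
        fmt = fields["table_format"]
        if kind == "restrict_to":
            restrict_to[table].add(fmt)
        elif kind == "exclude":
            exclude[table].add(fmt)
        elif kind == "only":
            only[fmt].add(table)
        else:
            raise ValueError("unknown constraint: " + kind)
    return restrict_to, exclude, only


def is_generated(table, fmt, restrict_to, exclude, only):
    """Decide whether a table is generated for a format (assumed semantics)."""
    table = table.lower()
    if fmt in only and table not in only[fmt]:
        return False
    if fmt in exclude.get(table, ()):
        return False
    if table in restrict_to and fmt not in restrict_to[table]:
        return False
    return True


sample = [
    "# comment lines are ignored",
    "table_name:overflow, constraint:restrict_to, table_format:text/none/none",
    "table_name:widerow, constraint:exclude, table_format:hbase/none/none",
    "table_name:alltypes, constraint:only, table_format:kudu/none/none",
]
r, e, o = parse_constraints(sample)
print(is_generated("overflow", "parquet/none/none", r, e, o))  # False
print(is_generated("widerow", "hbase/none/none", r, e, o))     # False
print(is_generated("widerow", "kudu/none/none", r, e, o))      # False
print(is_generated("alltypes", "kudu/none/none", r, e, o))     # True
```

Table names are lowercased on both sides because the file itself mixes `TblWithRaggedColumns` and `tblwithraggedcolumns`.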