IMPALA-9691: Support Kudu Timestamp and Date bloom filter

Impala stores a timestamp as a 12-byte TimestampValue structure with
nanosecond precision. Kudu stores a timestamp as an 8-byte Unix time
in microseconds. To avoid data truncation issues in the bloom filter,
wrap the source expression of the bloom filter in a FunctionCallExpr
for 'utc_to_unix_micros', so that timestamp values are converted to
microseconds when building a timestamp bloom filter for Kudu.
Generated the functional date_tbl table in Kudu format for unit tests.
Added new test cases for Kudu Timestamp and Date bloom filters.
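
The precision mismatch described above can be illustrated with a small
sketch. This is not Impala or Kudu code: the Python function
utc_to_unix_micros below is a hypothetical stand-in named after the SQL
function, kudu_key is an invented helper, and the little-endian 8-byte
layout is an assumption made only for illustration.

```python
import struct

NANOS_PER_MICRO = 1_000


def utc_to_unix_micros(unix_nanos: int) -> int:
    """Hypothetical stand-in: reduce a nanosecond Unix time to microseconds."""
    return unix_nanos // NANOS_PER_MICRO


def kudu_key(unix_micros: int) -> bytes:
    """Pack microseconds as an 8-byte signed integer (assumed layout)."""
    return struct.pack("<q", unix_micros)


# Two nanosecond values that differ only below microsecond precision.
a = 1_592_456_140_123_456_789
b = 1_592_456_140_123_456_000

# As nanosecond values they are distinct, so hashing them directly would
# give entries that need not match anything hashed from 8-byte microseconds.
assert a != b

# After conversion both map to the same 8-byte key, which is the form the
# filter has to be built on to agree with the microsecond representation.
assert kudu_key(utc_to_unix_micros(a)) == kudu_key(utc_to_unix_micros(b))
```

The point of the sketch is only that the conversion has to happen before
values are added to the filter, so that both sides hash the same bytes.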

Testing:
Passed all core tests.

Change-Id: I3c1e9bcc9fd6d79a39f25eaa3396188fc0a52a48
Reviewed-on: http://gerrit.cloudera.org:8080/16094
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
wzhou-code
2020-06-17 21:55:40 -07:00
committed by Impala Public Jenkins
parent 62729980d9
commit c7ce4fa109
7 changed files with 311 additions and 30 deletions

@@ -2726,6 +2726,19 @@ LOAD DATA LOCAL INPATH '{impala_home}/testdata/data/date_tbl/0003.txt' OVERWRITE
---- DEPENDENT_LOAD
insert overwrite table {db_name}{db_suffix}.{table_name} partition(date_part)
select id_col, date_col, date_part from functional.{table_name};
---- CREATE_KUDU
-- Can't create partitions with date_part since Kudu doesn't support "partition by"
-- on non-key columns.
DROP TABLE IF EXISTS {db_name}{db_suffix}.{table_name};
CREATE TABLE {db_name}{db_suffix}.{table_name} (
id_col INT PRIMARY KEY,
date_col DATE NULL,
date_part DATE NOT NULL
)
PARTITION BY HASH (id_col) PARTITIONS 3 STORED AS KUDU;
---- DEPENDENT_LOAD_KUDU
INSERT INTO TABLE {db_name}{db_suffix}.{table_name}
SELECT id_col, date_col, date_part FROM {db_name}.{table_name};
====
---- DATASET
functional