IMPALA-9691: Support Kudu Timestamp and Date bloom filter

Impala stores a timestamp as a 12-byte TimestampValue structure with the time in nanoseconds. Kudu stores a timestamp as 8 bytes of Unix time in microseconds. To avoid data truncation in the bloom filter, add a FunctionCallExpr with 'utc_to_unix_micros' as the root of the bloom filter's source expression, so that timestamp values are converted to microseconds when building a timestamp bloom filter for Kudu.

Generated the functional date_tbl table in Kudu format for unit tests.
Added new test cases for Kudu Timestamp and Date bloom filters.

Testing:
Passed all core tests.

Change-Id: I3c1e9bcc9fd6d79a39f25eaa3396188fc0a52a48
Reviewed-on: http://gerrit.cloudera.org:8080/16094
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
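
The precision mismatch described in the commit message can be illustrated with a minimal Python sketch. This is not Impala or Kudu code: the helper names, the floor-division rounding, and the little-endian packing are assumptions made only for illustration. It shows that two nanosecond values falling within the same microsecond yield the same 8-byte key only after they are converted to microseconds, which is the role 'utc_to_unix_micros' plays on the bloom filter's source expression.

```python
import struct

NANOS_PER_MICRO = 1_000


def unix_nanos_to_micros(unix_nanos: int) -> int:
    """Hypothetical helper: drop sub-microsecond digits from a Unix time in ns.

    Floor division is an assumption; the rounding used by Impala's
    utc_to_unix_micros may differ for edge cases such as negative values.
    """
    return unix_nanos // NANOS_PER_MICRO


def micros_key(unix_micros: int) -> bytes:
    """Hypothetical helper: pack microseconds as a signed 8-byte integer.

    The little-endian byte order is chosen for illustration only.
    """
    return struct.pack("<q", unix_micros)


# Two timestamps that differ only below microsecond precision.
ts_a_nanos = 1_000_002_000
ts_b_nanos = 1_000_002_999

# Hashing the raw nanosecond values would treat them as different keys.
raw_keys_differ = micros_key(ts_a_nanos) != micros_key(ts_b_nanos)

# After conversion, both map to the same 8-byte microsecond key.
key_a = micros_key(unix_nanos_to_micros(ts_a_nanos))
key_b = micros_key(unix_nanos_to_micros(ts_b_nanos))
keys_match = key_a == key_b
```

If the filter were built from the unconverted values, a row stored at microsecond precision would not match a probe value that still carried nanosecond digits, so rows could be filtered out incorrectly.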
2026-01-06 06:01:03 -05:00 · 2020-06-17 21:55:40 -07:00
parent 62729980d9
commit c7ce4fa109
7 changed files with 311 additions and 30 deletions
--- a/testdata/datasets/functional/functional_schema_template.sql
+++ b/testdata/datasets/functional/functional_schema_template.sql
@@ -2726,6 +2726,19 @@ LOAD DATA LOCAL INPATH '{impala_home}/testdata/data/date_tbl/0003.txt' OVERWRITE
 ---- DEPENDENT_LOAD
 insert overwrite table {db_name}{db_suffix}.{table_name} partition(date_part)
 select id_col, date_col, date_part from functional.{table_name};
+---- CREATE_KUDU
+-- Can't create partitions with date_part since Kudu don't support "partition by"
+-- with non key column.
+DROP TABLE IF EXISTS {db_name}{db_suffix}.{table_name};
+CREATE TABLE {db_name}{db_suffix}.{table_name} (
+  id_col INT PRIMARY KEY,
+  date_col DATE NULL,
+  date_part DATE NOT NULL
+)
+PARTITION BY HASH (id_col) PARTITIONS 3 STORED AS KUDU;
+---- DEPENDENT_LOAD_KUDU
+INSERT INTO TABLE {db_name}{db_suffix}.{table_name}
+SELECT id_col, date_col, date_part FROM {db_name}.{table_name};
 ====
 ---- DATASET
 functional