mirror of https://github.com/apache/impala.git
synced 2026-02-03 00:00:40 -05:00
This patch adds support for constant propagation of range predicates
involving date and timestamp constants. Previously, only equality
predicates were considered for propagation. The new type of propagation
is shown by the following example:

Before constant propagation:
  WHERE date_col = CAST(timestamp_col as DATE)
  AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
After constant propagation:
  WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01'
  AND timestamp_col >= '2019-01-01'
  AND timestamp_col <= '2020-01-01'
  AND date_col = CAST(timestamp_col as DATE)

As a consequence, since Impala supports table partitioning by date
columns but not timestamp columns, the above propagation enables
partition pruning based on timestamp ranges.

Existing code for equality based constant propagation was refactored
and consolidated into a new class which handles both equality and range
based constant propagation. Range based propagation is only applied to
date and timestamp columns.

Testing:
- Added new range constant propagation tests to PlannerTest.
- Added e2e test for range constant propagation based on a newly
  added date partitioned table.
- Ran precommit tests.

Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Reviewed-on: http://gerrit.cloudera.org:8080/16346
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
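The range propagation described in the commit message can be illustrated with a small sketch. This is not Impala's planner code (which is written in Java); it is a hypothetical Python model that assumes the query already contains `date_col = CAST(timestamp_col AS DATE)`. The function name `propagate_range` and the tuple representation of predicates are made up for illustration. Because casting a timestamp to a date truncates the time of day, the sketch conservatively relaxes strict bounds (`<`, `>`) to inclusive ones on the date side.

```python
from datetime import date, datetime

def propagate_range(ts_bounds):
    """Derive bounds on date_col implied by bounds on timestamp_col.

    Assumes the predicate date_col = CAST(timestamp_col AS DATE) holds.
    ts_bounds is a list of (operator, datetime) pairs, a hypothetical
    representation chosen for this sketch only.
    """
    derived = []
    for op, ts in ts_bounds:
        d = ts.date()  # models CAST(<timestamp constant> AS DATE)
        if op in (">=", ">"):
            # Truncation is monotonic, so a lower bound carries over,
            # but a strict bound must be relaxed to an inclusive one.
            derived.append((">=", d))
        elif op in ("<=", "<"):
            derived.append(("<=", d))
        else:
            raise ValueError(f"unsupported operator: {op}")
    return derived

# timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
bounds = [(">=", datetime(2019, 1, 1)), ("<=", datetime(2020, 1, 1))]
date_bounds = propagate_range(bounds)
```

Here `date_bounds` holds the two inclusive date bounds from the "after" form of the example, which are the predicates a planner could use to prune date partitions. The original timestamp predicates still have to be kept, since the derived date bounds are weaker than they are.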
26 lines
680 B
Plaintext
====
---- QUERY
# Constant propagation for range predicates on timestamp.
select count(*), sum(int_col) from alltypes_date_partition
where date_col = cast(timestamp_col as date)
and timestamp_col between '2009-01-01' and '2009-02-01';
---- RESULTS
156,620
---- TYPES
BIGINT, BIGINT
====
---- QUERY
# Mix of various predicates some of which are eligible for propagation
with dp_view as
(select * from alltypes_date_partition
where date_col = cast(timestamp_col as date))
select count(*), sum(int_col) from dp_view
where int_col < 100 and timestamp_col >= '2009-01-01'
and bigint_col in (20, 40)
and timestamp_col <= '2009-02-01';
---- RESULTS
62,186
---- TYPES
BIGINT, BIGINT
====