Files
impala/testdata/workloads/functional-planner
Aman Sinha 1411ca6a00 IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Added an expression rewrite rule to convert a disjunctive predicate to
conjunctive normal form (CNF). Converting to CNF enables multi-table
predicates that were only evaluated by a Join operator to be converted
into either single-table conjuncts that are eligible for predicate pushdown
to the scan operator or other multi-table conjuncts that are eligible to
be pushed to a Join below. This helps improve performance for such queries.

Since converting to CNF expands the number of expressions, we place a
limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr)
that are considered. Once the MAX_CNF_EXPRS limit (default is unlimited) is
exceeded, whatever expression was supplied to the rule is returned without
further transformation. A setting of -1 or 0 allows unlimited number of
CNF exprs to be created upto int32 max. Another option ENABLE_CNF_REWRITES
enables or disables the entire rewrite. This is False by default until we
have done more thorough functional testing (tracking JIRA IMPALA-9539).

Examples of rewrites:
 original: (a AND b) OR c
 rewritten: (a OR c) AND (b OR c)

 original: (a AND b) OR (c AND d)
 rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d)

 original: NOT(a OR b)
 rewritten: NOT(a) AND NOT(b)

Testing:
 - Added new unit tests with variations of disjunctive predicates
   and verified their Explain plans
 - Manually tested the result correctness on impala shell by running
   these queries with ENABLE_CNF_REWRITES enabled and disabled
 - Added TPC-H q7, q19 and TPC-DS q13 with the CNF rewrite enabled
 - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor
   shows almost 5x improvement:
      Original baseline: 47.5 sec
      With this patch and CNF rewrite enabled: 9.4 sec

Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Reviewed-on: http://gerrit.cloudera.org:8080/15462
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-03-24 09:18:32 +00:00
..