impala/testdata/datasets
Matthew Jacobs 922ee70317 IMPALA-5336: Fix partition pruning when column is cast
Partition pruning has two mechanisms:
1) Simple predicates (e.g. binary predicates of the form
   <SlotRef> <op> <LiteralExpr>) can be used to derive lists
   of matching partition ids directly from the
   partition key values. This is handled directly in the FE
   and is very efficient for supported simple predicates.
2) General expr evaluation of predicates using the BE (via
   FeSupport). This works for all predicates, so is the
   mechanism used for predicates not supported by (1).

The issue was that (1) was used even when a binary
predicate contained an implicit cast on the SlotRef. Such a
predicate evaluates correctly in the BE, but the simple
mechanism in (1) cannot match the partition key values
against the predicate literal because the partition key
values cannot be cast in the FE.

The fix is to force binary predicates involving a cast to be
evaluated in the BE.
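
The routing described above can be sketched in a few lines of Python. This is a toy model only, not Impala's FE code: `BinaryPredicate`, `prune_simple`, `prune_general`, and `prune` are hypothetical names, the partition map is made-up data, and a plain Python callable stands in for BE expression evaluation. It shows why comparing uncast string partition keys to an integer literal matches nothing, and how sending cast predicates down the general path restores the expected result.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Set

# Only equality-style operators are modeled in this sketch.
OPS = {"=": operator.eq, "!=": operator.ne}


@dataclass
class BinaryPredicate:
    """Hypothetical stand-in for <SlotRef> <op> <LiteralExpr>."""
    op: str
    literal: Any
    # Implicit cast applied to the partition column, if any.
    cast: Optional[Callable[[Any], Any]] = None


def prune_simple(partitions: Dict[int, Any], pred: BinaryPredicate) -> Set[int]:
    """Mechanism (1): compare raw partition key values to the literal."""
    return {pid for pid, key in partitions.items()
            if OPS[pred.op](key, pred.literal)}


def prune_general(partitions: Dict[int, Any], pred: BinaryPredicate) -> Set[int]:
    """Mechanism (2): stand-in for full expression evaluation, cast included."""
    cast = pred.cast if pred.cast is not None else (lambda v: v)
    return {pid for pid, key in partitions.items()
            if OPS[pred.op](cast(key), pred.literal)}


def prune(partitions: Dict[int, Any], pred: BinaryPredicate) -> Set[int]:
    """Send predicates that involve a cast down the general path."""
    if pred.cast is not None:
        return prune_general(partitions, pred)
    return prune_simple(partitions, pred)


# Made-up string partition keys, keyed by partition id.
partitions = {1: "1", 2: "2", 3: "10"}
# Models a predicate like: CAST(string_col AS INT) = 2
cast_pred = BinaryPredicate(op="=", literal=2, cast=int)
```

In this model `prune_simple(partitions, cast_pred)` returns an empty set because `"2" == 2` is false, while `prune(partitions, cast_pred)` returns `{2}`.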

Testing: A planner test was added to demonstrate that the
expected partition pruning occurs.

Some modifications were made to the functional schema table
stringpartitionkey, so it will be necessary to reload that
table:

load-data.py -w functional-query --table_names=stringpartitionkey

Change-Id: I94f597a6589f5e34d2b74abcd29be77c4161cd99
Reviewed-on: http://gerrit.cloudera.org:8080/7521
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Impala Public Jenkins
2017-07-31 21:49:17 +00:00

This directory contains Impala test data sets. The directory layout is as follows:

datasets/
   <data set>/<data set>_schema_template.sql
   <data set>/<data files SF1>/data files
   <data set>/<data files SF2>/data files

Here SF is the scale factor controlling data size. This allows the same schema to be scaled to
different sizes based on the target test environment.
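
As an illustration of the layout above, the following Python sketch locates the schema template for a data set and lists the data sets that have one. It assumes only the `<data set>/<data set>_schema_template.sql` convention shown above. The helper names `schema_template_path` and `data_sets_with_templates` are hypothetical and are not part of Impala's tooling. The naming of the per-scale-factor data directories is not assumed.

```python
from pathlib import Path
from typing import List


def schema_template_path(datasets_root: Path, data_set: str) -> Path:
    """Build <root>/<data set>/<data set>_schema_template.sql."""
    return datasets_root / data_set / f"{data_set}_schema_template.sql"


def data_sets_with_templates(datasets_root: Path) -> List[str]:
    """Return the names of subdirectories that contain a schema template."""
    return sorted(
        entry.name
        for entry in datasets_root.iterdir()
        if entry.is_dir()
        and schema_template_path(datasets_root, entry.name).is_file()
    )
```

For example, `data_sets_with_templates(Path("datasets"))` would return the sorted names of all data set directories under `datasets/` that follow the template naming convention.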