impala/testdata/datasets
Tim Armstrong 95b56d0e2d IMPALA-7586: fix predicate pushdown of escaped strings
This fixes a class of bugs where the planner incorrectly uses the raw
string from the parser instead of the unescaped string. This occurs in
several places that push predicates down to the storage layer:
* Kudu scans
* HBase scans
* Data source scans

There are some more complex issues with escapes and the LIKE predicate
that are tracked separately by IMPALA-2422.

This also uncovered a different issue with RCFiles that is tracked by
IMPALA-7778 and is worked around by the tests added.

In order to make bugs like this more obvious in future, I renamed
getValue() to getValueWithOriginalEscapes().

Testing:
Added regression test that tests handling of backslash escapes on all
file formats. I did not add a regression test for the data source bug
since it seems to require some major modification of the data source
test infrastructure.

Change-Id: I53d6e20dd48ab6837ddd325db8a9d49ee04fed28
Reviewed-on: http://gerrit.cloudera.org:8080/11814
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-11-01 21:27:13 +00:00

This directory contains Impala test data sets. The directory layout is structured as follows:

datasets/
   <data set>/<data set>_schema_template.sql
   <data set>/<data files SF1>/data files
   <data set>/<data files SF2>/data files

Here SF is the scale factor, which controls the data size. The same schema can therefore be
scaled to different sizes depending on the target test environment.
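
The layout above can be navigated with a few lines of Python. This is only a minimal sketch under stated assumptions: the helper names (schema_template_path, scale_factor_dirs) are hypothetical and not part of Impala's test tooling, and it assumes that every subdirectory of a data set directory holds the data files for one scale factor.

```python
from pathlib import Path
from typing import List


def schema_template_path(root: Path, data_set: str) -> Path:
    """Return <root>/<data set>/<data set>_schema_template.sql."""
    return root / data_set / f"{data_set}_schema_template.sql"


def scale_factor_dirs(root: Path, data_set: str) -> List[Path]:
    """List the subdirectories of a data set, sorted by name.

    Assumption: each subdirectory holds the data files for one scale factor.
    """
    base = root / data_set
    return sorted(p for p in base.iterdir() if p.is_dir())
```

For a hypothetical data set named "example", schema_template_path(Path("datasets"), "example") yields datasets/example/example_schema_template.sql, and scale_factor_dirs returns one entry per scale-factor directory found on disk.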