Mirror of https://github.com/apache/impala.git, synced 2026-01-29 03:00:27 -05:00
This fixes a class of bugs where the planner incorrectly uses the raw
string from the parser instead of the unescaped string. This occurs in
several places that push predicates down to the storage layer:
* Kudu scans
* HBase scans
* Data source scans

There are some more complex issues with escapes and the LIKE predicate
that are tracked separately by IMPALA-2422.

This also uncovered a different issue with RCFiles that is tracked by
IMPALA-7778 and is worked around by the tests added.

In order to make bugs like this more obvious in future, I renamed
getValue() to getValueWithOriginalEscapes().

Testing:
Added regression test that tests handling of backslash escapes on all
file formats. I did not add a regression test for the data source bug
since it seems to require some major modification of the data source
test infrastructure.

Change-Id: I53d6e20dd48ab6837ddd325db8a9d49ee04fed28
Reviewed-on: http://gerrit.cloudera.org:8080/11814
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
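The difference between the raw and the unescaped string described in the commit message above can be illustrated with a small sketch. This is not Impala's code: `unescape_backslashes` is a hypothetical helper, and the set of escapes it handles is an assumption chosen only for illustration. It shows why a predicate built from the raw parser text may fail to match stored data that the unescaped value would match.

```python
def unescape_backslashes(raw: str) -> str:
    """Resolve backslash escapes in the body of a quoted string literal.

    Hypothetical helper for illustration: it handles a small, assumed set of
    escapes and keeps the character after any other backslash unchanged.
    """
    simple = {"n": "\n", "t": "\t", "r": "\r", "0": "\0"}
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            # Known escapes map to their control character; anything else
            # (including a second backslash) stands for itself.
            out.append(simple.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Text between the quotes as the parser sees it: a, two backslashes, b.
raw_literal = "a\\\\b"
# Value actually held in the row: a, one backslash, b.
stored_value = "a\\b"

# A predicate built from the raw text does not match the stored value ...
raw_matches = raw_literal == stored_value
# ... while one built from the unescaped text does.
unescaped_matches = unescape_backslashes(raw_literal) == stored_value
```

In this sketch `raw_matches` is false and `unescaped_matches` is true, which is the kind of mismatch the commit attributes to pushing the raw string down to the storage layer.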
This directory contains Impala test data sets. The directory layout is structured as follows:

datasets/
   <data set>/<data set>_schema_template.sql
   <data set>/<data files SF1>/data files
   <data set>/<data files SF2>/data files

Where SF is the scale factor controlling data size. This allows for scaling the same schema to
different sizes based on the target test environment.
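As a rough sketch of how the layout above could be inspected, the following snippet locates the schema template for one data set and lists its subdirectories. The function `describe_data_set`, the data set name `example`, and the directory names `example_sf1` and `example_sf2` are all hypothetical; the README does not say how the per-scale-factor directories are actually named, so the sketch simply treats every subdirectory as a data-file directory.

```python
import tempfile
from pathlib import Path


def describe_data_set(datasets_root, name):
    """Return the schema template name and data directories for one data set.

    Assumes the layout <root>/<name>/<name>_schema_template.sql and treats
    every subdirectory of <root>/<name> as a data-file directory.
    """
    data_set_dir = Path(datasets_root) / name
    schema_template = data_set_dir / f"{name}_schema_template.sql"
    data_dirs = sorted(p.name for p in data_set_dir.iterdir() if p.is_dir())
    return {
        "schema_template": schema_template.name,
        "has_schema_template": schema_template.is_file(),
        "data_dirs": data_dirs,
    }


# Build a throwaway layout with hypothetical names and inspect it.
with tempfile.TemporaryDirectory() as root:
    base = Path(root) / "example"
    (base / "example_sf1").mkdir(parents=True)
    (base / "example_sf2").mkdir()
    (base / "example_schema_template.sql").write_text("-- schema goes here\n")
    summary = describe_data_set(root, "example")
```

Keeping the schema template next to the per-scale-factor directories means the same schema file can be paired with whichever data size the target test environment needs.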