impala

mirror of https://github.com/apache/impala.git synced 2026-01-29 03:00:27 -05:00

Files

Tim Armstrong 95b56d0e2d IMPALA-7586: fix predicate pushdown of escaped strings

This fixes a class of bugs where the planner incorrectly uses the raw
string from the parser instead of the unescaped string. This occurs in
several places that push predicates down to the storage layer:
* Kudu scans
* HBase scans
* Data source scans

There are some more complex issues with escapes and the LIKE predicate
that are tracked separately by IMPALA-2422.

This also uncovered a different issue with RCFiles that is tracked by
IMPALA-7778 and is worked around by the tests added.

In order to make bugs like this more obvious in future, I renamed
getValue() to getValueWithOriginalEscapes().

Testing:
Added regression test that tests handling of backslash escapes on all
file formats. I did not add a regression test for the data source bug
since it seems to require some major modification of the data source
test infrastructure.

Change-Id: I53d6e20dd48ab6837ddd325db8a9d49ee04fed28
Reviewed-on: http://gerrit.cloudera.org:8080/11814
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

2018-11-01 21:27:13 +00:00

functional

IMPALA-7586: fix predicate pushdown of escaped strings

2018-11-01 21:27:13 +00:00

hive-benchmark

IMPALA-3918: Remove Cloudera copyrights and add ASF license header

2016-08-09 08:19:41 +00:00

tpcds

IMPALA-7399: Emit a junit xml report when trapping errors

2018-08-23 18:33:58 +00:00

tpch

IMPALA-7399: Emit a junit xml report when trapping errors

2018-08-23 18:33:58 +00:00

README

Move functional data loading to new framework + initial changes for workload directory structure

2014-01-08 10:44:18 -08:00

README

This directory contains Impala test data sets. The directory layout is structured as follows:

datasets/
   <data set>/<data set>_schema_template.sql
   <data set>/<data files SF1>/data files
   <data set>/<data files SF2>/data files

Where SF is the scale factor controlling data size. This allows for scaling the same schema to
different sizes based on the target test environment.