mirror of
https://github.com/apache/impala.git
synced 2026-01-05 12:01:11 -05:00
This fixes how we validate delimiters to be in line with Hive. A delimiter must fit in a single byte and can be specified in the following formats, as far as I can tell (there isn't documentation): - A single ASCII or unicode character (ex. '|') - An escape character in octal format (ex. \001. Stored in the metastore as a unicode character: \u0001). - A signed decimal integer in the range [-128:127]. Used to support delimiters for ASCII character values between 128-255 (-2 maps to ASCII 254). Previously, we were not handling the "signed integer" case so there was no way to specify a delimiter in the "extended" ASCII range of 128-255. To support result validation, the test infrastructure had to be updated to support reading/writing different character encodings. Change-Id: Ie3c4d444dc9c6e60192093ed0c0f6f151eab16bc Reviewed-on: http://gerrit.ent.cloudera.com:8080/1848 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1888
This directory contains Impala test data sets. The directory layout is structured as follows: datasets/ <data set>/<data set>_schema_template.sql <data set>/<data files SF1>/data files <data set>/<data files SF2>/data files Where SF is the scale factor controlling data size. This allows for scaling the same schema to different sizes based on the target test environment.