mirror of
https://github.com/apache/impala.git
synced 2026-01-06 06:01:03 -05:00
There was an incorrect DCHECK in the parquet scanner. If abort_on_error is
false, the intended behaviour is to skip to the next row group, but the DCHECK
assumed that execution should have aborted if a parse error was encountered.

This also:
- Fixes a DCHECK after an empty row group. InitColumns() would try to create
  empty scan ranges for the column readers.
- Uses metadata_range_->file() instead of stream_->filename() in the scanner.
  InitColumns() was using stream_->filename() in error messages, which used to
  work but now stream_ is set to NULL before calling InitColumns().

Change-Id: I8e29e4c0c268c119e1583f16bd6cf7cd59591701
Reviewed-on: http://gerrit.cloudera.org:8080/1257
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
This directory contains Impala test data sets. The directory layout is
structured as follows:

datasets/
   <data set>/<data set>_schema_template.sql
   <data set>/<data files SF1>/data files
   <data set>/<data files SF2>/data files

Where SF is the scale factor controlling data size. This allows for scaling
the same schema to different sizes based on the target test environment.
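
As a rough illustration of the layout above, the following sketch walks a datasets/ directory and reports, for each data set, whether its schema template is present and which data file directories it contains. The function name describe_datasets is hypothetical and not part of the Impala tooling. The sketch assumes only what this README states: a <data set>_schema_template.sql file next to one subdirectory per scale factor. It makes no assumption about how those subdirectories are named.

```python
from pathlib import Path


def describe_datasets(root):
    """Map each data set under `root` to its schema template and data dirs.

    Hypothetical helper: it assumes the layout described in this README,
    i.e. <data set>/<data set>_schema_template.sql plus one subdirectory
    per scale factor holding the data files.
    """
    layout = {}
    for dataset_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        # The template file name is derived from the data set directory name.
        template = dataset_dir / f"{dataset_dir.name}_schema_template.sql"
        layout[dataset_dir.name] = {
            # None signals that the expected template file is missing.
            "schema_template": template if template.is_file() else None,
            # Every subdirectory is treated as a data file directory.
            "data_dirs": sorted(
                p.name for p in dataset_dir.iterdir() if p.is_dir()
            ),
        }
    return layout
```

Calling describe_datasets("datasets") from the parent directory returns a dictionary keyed by data set name. A data set whose "schema_template" entry is None does not follow the layout described here.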