impala/testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test
Alex Behm 931bf49cd9 IMPALA-3905: HdfsScanner::GetNext() for Avro, RC, and Seq scans.
Implements HdfsScanner::GetNext() for the Avro, RC File, and
Sequence File scanners. Changes ProcessSplit() to repeatedly call
GetNext() to share the core scanning code between the legacy
ProcessSplit() interface and the new GetNext() interface.

Summary of changes:
- Slightly change code flow for initial scan range that
  only parses the file header. The new code sets
  'only_parsing_header_' in Open() and then honors
  that flag in GetNextInternal(). Before, all the logic
  was inside ProcessSplit().
- Replace 'finished_' with 'eos_'.
- Add a RowBatch parameter to various functions.
- Change Close() to free all resources when a nullptr
  RowBatch is passed.
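
The control flow summarized above can be illustrated with a minimal, self-contained sketch. It is not the Impala implementation: SketchScanner, its RowBatch struct, the integer "rows", and the batch_size parameter are all hypothetical stand-ins, and only the names ProcessSplit(), GetNext(), 'only_parsing_header_' and 'eos_' are taken from the commit message. The sketch shows a legacy-style ProcessSplit() that simply drives GetNext() until 'eos_' is set, and a header-only scan that returns no rows.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a row batch; not Impala's RowBatch class.
struct RowBatch {
  std::vector<int> rows;
};

// Hypothetical scanner used only to illustrate the control flow.
class SketchScanner {
 public:
  SketchScanner(std::vector<int> data, std::size_t batch_size,
                bool only_parsing_header)
      : data_(std::move(data)),
        batch_size_(batch_size == 0 ? 1 : batch_size),  // avoid a zero-size batch
        only_parsing_header_(only_parsing_header) {}

  // New-style interface: fills at most one batch per call and sets eos_
  // once there is nothing left to return.
  bool GetNext(RowBatch* batch) {
    if (only_parsing_header_) {
      // A header-only scan range produces no rows.
      eos_ = true;
      return true;
    }
    while (pos_ < data_.size() && batch->rows.size() < batch_size_) {
      batch->rows.push_back(data_[pos_++]);
    }
    if (pos_ == data_.size()) eos_ = true;
    return true;
  }

  // Legacy-style interface: repeatedly calls GetNext() so both entry
  // points share the same scanning code.
  bool ProcessSplit(std::vector<RowBatch>* out) {
    while (!eos_) {
      RowBatch batch;
      if (!GetNext(&batch)) return false;
      if (!batch.rows.empty()) out->push_back(std::move(batch));
    }
    return true;
  }

  bool eos() const { return eos_; }

 private:
  std::vector<int> data_;
  std::size_t batch_size_;
  bool only_parsing_header_;
  std::size_t pos_ = 0;
  bool eos_ = false;
};
```

In this sketch, a caller using the new interface would call GetNext() and check eos() itself, while a legacy caller would call ProcessSplit() once and receive all batches.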

Testing:
- Exhaustive tests passed on debug
- Core tests passed on asan
- TODO: Perf testing on cluster

Change-Id: Ie18f57b0d3fe0052a8ccd361b6a5fcdf979d0669
Reviewed-on: http://gerrit.cloudera.org:8080/6527
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
2017-07-01 21:59:34 +00:00


====
---- QUERY
# Read from the corrupt files. We may get partial results.
select * from bad_avro_snap_strings
---- RESULTS: VERIFY_IS_SUPERSET
'valid'
---- TYPES
string
---- ERRORS
row_regex: .*Problem parsing file $NAMENODE/.*
File '$NAMENODE/test-warehouse/bad_avro_snap_strings_avro_snap/truncated_string.avro' is corrupt: truncated data block at offset 155
File '$NAMENODE/test-warehouse/bad_avro_snap_strings_avro_snap/negative_string_len.avro' is corrupt: invalid length -7 at offset 164
File '$NAMENODE/test-warehouse/bad_avro_snap_strings_avro_snap/invalid_union.avro' is corrupt: invalid union value 4 at offset 174 (1 of 2 similar)
====
---- QUERY
# Read from the corrupt files. We may get partial results.
select * from bad_avro_snap_floats
---- RESULTS: VERIFY_IS_SUPERSET
1
---- TYPES
float
---- ERRORS
Problem parsing file $NAMENODE/test-warehouse/bad_avro_snap_floats_avro_snap/truncated_float.avro at 159
File '$NAMENODE/test-warehouse/bad_avro_snap_floats_avro_snap/truncated_float.avro' is corrupt: truncated data block at offset 159
====
---- QUERY
select * from bad_avro_decimal_schema
---- TYPES
string,decimal
---- RESULTS
---- ERRORS
Column 'value': invalid Avro decimal type with precision = '5' scale = '7'
====