Mirror of https://github.com/apache/impala.git (synced 2026-01-04 09:00:56 -05:00)
There was an incorrect DCHECK in the parquet scanner. If abort_on_error is
false, the intended behaviour is to skip to the next row group, but the
DCHECK assumed that execution should have aborted if a parse error was
encountered.

This also:
- Fixes a DCHECK after an empty row group. InitColumns() would try to
  create empty scan ranges for the column readers.
- Uses metadata_range_->file() instead of stream_->filename() in the
  scanner. InitColumns() was using stream_->filename() in error messages,
  which used to work but now stream_ is set to NULL before calling
  InitColumns().

Change-Id: I8e29e4c0c268c119e1583f16bd6cf7cd59591701
Reviewed-on: http://gerrit.cloudera.org:8080/1257
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
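The skip-versus-abort behaviour described in the commit message can be illustrated with a minimal sketch. This is not the Impala scanner code: `RowGroup`, `ScanResult`, and `ScanRowGroups` are hypothetical names invented for illustration, and a simple `corrupt` flag stands in for a bad parse status. It only models the intended control flow: with `abort_on_error` false, a row group that fails to parse is dropped and scanning continues, and an empty row group is passed over without any read being set up.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a Parquet row group: the ids it holds, plus a
// flag marking whether decoding it would hit a parse error.
struct RowGroup {
  std::vector<long> ids;
  bool corrupt;
};

// Hypothetical summary of a scan over all row groups.
struct ScanResult {
  std::vector<long> rows;
  bool aborted;
  int skipped_row_groups;
};

// Scans row groups in order. On a parse error, either aborts the whole scan
// (abort_on_error == true) or drops that row group and moves to the next one.
ScanResult ScanRowGroups(const std::vector<RowGroup>& groups,
                         bool abort_on_error) {
  ScanResult result{{}, false, 0};
  for (const RowGroup& group : groups) {
    // An empty row group has nothing to read, so no reads are set up for it.
    if (group.ids.empty()) continue;
    if (group.corrupt) {
      if (abort_on_error) {
        result.aborted = true;
        return result;
      }
      // A bad parse status is expected on this path, so nothing here may
      // assert that execution has already aborted.
      ++result.skipped_row_groups;
      continue;
    }
    result.rows.insert(result.rows.end(), group.ids.begin(), group.ids.end());
  }
  return result;
}
```

In this model, a scan with `abort_on_error` false returns the rows of every readable row group and counts the ones it skipped, which matches the shape of the expected results in the test file: some ids are simply absent rather than the query failing.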
73 lines
555 B
Plaintext
====
---- QUERY
# IMPALA-2558: trigger bad parse_status_ in HdfsParquetScanner::AssembleRows()
select id, cnt from bad_column_metadata t, (select count(*) cnt from t.int_array) v
---- TYPES
bigint,bigint
---- RESULTS
1,10
2,10
3,10
4,10
5,10
6,10
7,10
8,10
9,10
11,10
12,10
13,10
14,10
15,10
16,10
17,10
18,10
19,10
21,10
22,10
23,10
24,10
25,10
26,10
27,10
28,10
29,10
30,10
====
---- QUERY
# IMPALA-2558
select id from bad_column_metadata
---- TYPES
bigint
---- RESULTS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
21
22
23
24
25
26
27
28
29
30
====