impala/testdata/cluster
Bharath Vissapragada d19751669a IMPALA-3680: Cleanup the scan range state after failed hdfs cache reads
Currently we don't reset the file read offset if a zero-copy read (ZCR)
fails. As a result, when we switch to the normal read path, we hit the
end of the scan range (eosr) before reading the expected data length.
If both the ReadFromCache() and ReadRange() calls fail without reading
any data, we end up creating a whole list of scan ranges, each of size
1KB (DEFAULT_READ_PAST_SIZE), on the assumption that we are reading
past the scan range. This causes a severe performance hit. This patch
calls ScanRange::Close() after a failed cache read to clean up the
file system state so that re-reads start from the beginning of the
scan range.
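
The failure mode can be sketched with a toy model. This is not
Impala's code: ToyScanRange, MissingBytes(), FollowUpRanges() and the
4096-byte range length are hypothetical names and values chosen for
illustration, and the only figure taken from the description above is
the 1KB read-past size. The sketch assumes a failed cache read leaves
the file position advanced without returning any data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Modeled on the 1KB DEFAULT_READ_PAST_SIZE mentioned in the commit message.
constexpr int64_t kReadPastSize = 1024;

// Toy stand-in for a scan range; NOT Impala's actual ScanRange class.
struct ToyScanRange {
  int64_t len;              // bytes the range is expected to deliver
  int64_t file_offset = 0;  // file system read position within the range
  int64_t bytes_read = 0;   // bytes actually handed to the caller

  explicit ToyScanRange(int64_t l) : len(l) {}

  // Assumed failure behavior: the position advances, no data is returned.
  bool ReadFromCache(int64_t advanced) {
    assert(advanced >= 0);
    file_offset = std::min(len, file_offset + advanced);
    return false;
  }

  // Normal read path: reads from the current position to the end of the range.
  int64_t ReadRange() {
    int64_t n = len - file_offset;
    file_offset = len;
    bytes_read += n;
    return n;
  }

  // Cleanup step: reset the file state so a re-read starts at the beginning.
  void Close() { file_offset = 0; }
};

// Bytes still missing after falling back from a failed cache read.
int64_t MissingBytes(bool close_after_failure) {
  ToyScanRange r(4096);
  if (!r.ReadFromCache(4096)) {
    if (close_after_failure) r.Close();
    r.ReadRange();
  }
  return r.len - r.bytes_read;
}

// Number of 1KB follow-up ranges needed to fetch the missing bytes.
int64_t FollowUpRanges(int64_t missing) {
  return (missing + kReadPastSize - 1) / kReadPastSize;
}
```

In this model, MissingBytes(false) leaves the whole range unread, so
the data has to be fetched through FollowUpRanges() small ranges,
while MissingBytes(true) reads everything in the single fallback read.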

This was hit while debugging IMPALA-3679, where queries on 1 GB of
cached data were running ~20x slower than the non-cached runs.

Change-Id: I0a9ea19dd8571b01d2cd5b87da1c259219f6297a
Reviewed-on: http://gerrit.cloudera.org:8080/3313
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Bharath Vissapragada <bharathv@cloudera.com>
2016-07-05 13:37:26 -07:00