We weren't attaching resources to the row batch when starting a new
row group, so it was possible for string data to be overwritten. This
patch removes CloseStreams() and merges its functionality with
AttachCompletedResources() so it's not possible to destroy streams
without transferring the resources first. It also merges and removes
ScannerContext::Close().
Also adds test cases for IMPALA-720.
Change-Id: Ia8f40c7d39d8702716f1d337fe797e2696bd0fcb
parquet-mr had a bug where it didn't include the dictionary page's
header in the total column size. We now compensate for this by
detecting these files and padding the scan range length. This required
changing how the scanner detects when it's finished: it now counts the
number of rows rather than checking eosr (since the scan range may be
longer than the column).
Change-Id: Id9933808b965003c0c3b3aa78c32fe29a0c4bcbe
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1097
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins