Commit Graph

2 Commits

Author SHA1 Message Date
Skye Wanderman-Milne
7767d300a3 IMPALA-3311: fix string data coming out of aggs in subplans
The problem: varlen data (e.g. strings) produced by aggregations is
freed by FreeLocalAllocations() after passing up the output
batch. This works for streaming operators or blocking operators that
copy their input, but results in memory corruption when the output
reaches non-copying blocking operators, e.g. SubplanNode and
NestedLoopJoinNode.

The fix: this patch makes the PartitionedAggregationNode copy out
produced string data if the node is in a subplan. Otherwise it calls
MarkNeedsToReturn() on the output batch. Marking the batch would work
in the subplan case as well, but would likely be less efficient since
it would result in many small batches coming out of the subplan.
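
The two paths described above can be sketched roughly as follows. This is a simplified model, not Impala's actual code: AggSketch, Batch, StringRef, Produce() and the needs_to_return flag are hypothetical stand-ins for PartitionedAggregationNode, the output row batch, and MarkNeedsToReturn(). Node-local memory is modeled with a std::deque of strings.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical non-owning reference to string bytes.
struct StringRef {
  const char* ptr;
  std::size_t len;
};

// Hypothetical output batch. `owned` is storage that lives as long as the batch.
struct Batch {
  std::vector<StringRef> rows;
  std::deque<std::string> owned;  // deque: push_back keeps element addresses stable
  bool needs_to_return = false;   // stand-in for MarkNeedsToReturn()
};

// Hypothetical aggregation node that produces string values.
class AggSketch {
 public:
  explicit AggSketch(bool in_subplan) : in_subplan_(in_subplan) {}

  Batch Produce(const std::vector<std::string>& values) {
    Batch batch;
    for (const std::string& v : values) {
      local_.push_back(v);  // varlen data first lands in node-local memory
      const std::string& src = local_.back();
      if (in_subplan_) {
        // Subplan case: copy the string into batch-owned storage so it
        // survives FreeLocalAllocations().
        batch.owned.push_back(src);
        const std::string& copy = batch.owned.back();
        batch.rows.push_back({copy.data(), copy.size()});
      } else {
        // Otherwise: keep pointing at node-local memory.
        batch.rows.push_back({src.data(), src.size()});
      }
    }
    // Outside a subplan, flag the batch instead of copying.
    if (!in_subplan_) batch.needs_to_return = true;
    return batch;
  }

  // Releases node-local memory; references into it become invalid.
  void FreeLocalAllocations() { local_.clear(); }

 private:
  bool in_subplan_;
  std::deque<std::string> local_;
};
```

In this model, a batch produced with in_subplan set to true can still be read after FreeLocalAllocations(), because its rows point into batch.owned. A batch produced otherwise must be consumed before the node frees its local memory, which is what the flag is meant to signal.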

The patch includes a test case. However, this test only exposes the
problem with an ASAN build and the --disable_mem_pools flag, which we
don't currently have automated testing for.

Change-Id: Iada891504c261ba54f4eb8c9d7e4e5223668d7b9
Reviewed-on: http://gerrit.cloudera.org:8080/2929
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-05-12 23:06:36 -07:00
Jim Apple
7fc739f6d6 IMPALA-2897: Fix equality comparisons on null build-side rows.
If a hash table stores null build-side rows, then it must treat null
expressions in build-side rows as being equal. Otherwise, long collision
chains can accumulate, as rows with the same nulls will have the same
hash values but will not compare equal.

Equality between build rows and probe rows stays the same.
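
A minimal sketch of why the chains grow, assuming a hypothetical Row type and a single collision chain; none of these names come from Impala's hash table. CountDistinct() inserts each row into the chain unless an existing entry compares equal to it, once with SQL-style equality (NULL never equals NULL) and once with NULLs treated as equal.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical column value: either NULL or an int.
struct Value {
  bool is_null;
  int v;
};

using Row = std::vector<Value>;

// SQL-style equality: a NULL never equals anything, including another NULL.
// This models the comparison that stays in place between build and probe rows.
bool EqualsSqlSemantics(const Row& a, const Row& b) {
  if (a.size() != b.size()) return false;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i].is_null || b[i].is_null) return false;
    if (a[i].v != b[i].v) return false;
  }
  return true;
}

// Build-side equality: two NULLs in the same column compare equal.
bool EqualsNullsEqual(const Row& a, const Row& b) {
  if (a.size() != b.size()) return false;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i].is_null != b[i].is_null) return false;
    if (!a[i].is_null && a[i].v != b[i].v) return false;
  }
  return true;
}

// Models one collision chain: a row is appended unless an existing entry
// compares equal to it. Returns the resulting chain length.
template <typename Eq>
std::size_t CountDistinct(const std::vector<Row>& rows, Eq eq) {
  std::vector<Row> chain;
  for (const Row& r : rows) {
    bool found = false;
    for (const Row& existing : chain) {
      if (eq(existing, r)) {
        found = true;
        break;
      }
    }
    if (!found) chain.push_back(r);
  }
  return chain.size();
}
```

With three identical rows of the form (NULL, 1), the SQL-style comparison never matches, so the chain holds three entries. With NULLs treated as equal, the three rows collapse into one entry.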

Change-Id: I5f11addca7dc97408f6eb89de5082657333d17b9
Reviewed-on: http://gerrit.cloudera.org:8080/1956
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Tested-by: Internal Jenkins
2016-02-03 20:46:32 +00:00