impala/testdata/workloads/functional-query/queries/QueryTest/wide-row.test at e94de02469a3dfeda7d9358d7a78cb54c7a67159

jprdonnelly/impala

mirror of https://github.com/apache/impala.git synced 2025-12-31 15:00:10 -05:00

Skye Wanderman-Milne 9147cd7518 IMPALA-525: Adjust IO buffer size based on read length and other memory fixes

We were previously wasting memory by always reading into 8MB IO
buffers, even when the data read was much less than 8MB. With this
patch, the IO manager picks a buffer size closer to the actual amount
being read (we don't use the exact size so we can continue to recycle
buffers). The minimum IO buffer size is determined via the
--min_buffer_size flag, and the max IO buffer size via the --read_size
flag.
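
As a rough illustration of the sizing rule described above, here is a minimal sketch. It assumes buffer sizes are power-of-two multiples of the minimum, which is one way to stay close to the read length while keeping a small set of recyclable size classes; the commit message does not state the exact rounding rule. The function name `pick_buffer_size` and the default values are hypothetical stand-ins for `--min_buffer_size` and `--read_size`.

```python
def pick_buffer_size(read_len, min_buffer_size=1024,
                     max_buffer_size=8 * 1024 * 1024):
    """Return a buffer size for a read of read_len bytes.

    Assumption: sizes are min_buffer_size * 2**k, capped at
    max_buffer_size, so buffers fall into a few size classes
    that can be recycled. Defaults are illustrative only.
    """
    if read_len <= 0:
        raise ValueError("read_len must be positive")
    size = min_buffer_size
    # Double until the buffer covers the read or reaches the cap.
    while size < read_len and size < max_buffer_size:
        size *= 2
    return min(size, max_buffer_size)
```

Under these assumptions a 100-byte read gets a minimum-size buffer instead of a full 8MB one, and a read larger than the maximum is capped at the maximum and would need more than one buffer.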

This technique also helps with IMPALA-652, since short columns will
not use as much memory as before (we will not use considerably more
memory than the size of the table).

This patch also changes StringBuffer to use a doubling strategy so it
doesn't end up allocating many large unused buffers, and has the
scanner context use the requested length as the sync read size if it's
larger than the size produced by read_past_size_cb(). These changes
help prevent the boundary buffer in the scanner context from
allocating excess memory.
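
The doubling strategy mentioned for StringBuffer can be sketched as follows. This is not Impala's StringBuffer; `DoublingBuffer` is a hypothetical stand-in that only shows the growth rule: when an append does not fit, capacity grows to the larger of twice the current capacity and the size actually needed.

```python
class DoublingBuffer:
    """Hypothetical growable byte buffer that doubles its capacity."""

    def __init__(self, initial_capacity=16):
        self.capacity = initial_capacity
        self.size = 0
        self.reallocations = 0  # counts how often the storage was regrown
        self._data = bytearray(initial_capacity)

    def append(self, chunk):
        needed = self.size + len(chunk)
        if needed > self.capacity:
            # Grow geometrically, but never to less than what is needed.
            new_capacity = max(needed, self.capacity * 2)
            new_data = bytearray(new_capacity)
            new_data[:self.size] = self._data[:self.size]
            self._data = new_data
            self.capacity = new_capacity
            self.reallocations += 1
        # Copy the chunk into the unused tail of the buffer.
        self._data[self.size:needed] = chunk
        self.size = needed

    def value(self):
        return bytes(self._data[:self.size])
```

With geometric growth, the number of reallocations is logarithmic in the final size, and the unused tail is at most about half of the capacity. The sync read size change in the same paragraph amounts to taking the larger of the requested length and the value returned by read_past_size_cb().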

Change-Id: I0efb3b023ddfddb08bca22d5cb5f9511fb4d6c50
Reviewed-on: http://gerrit.ent.cloudera.com:8080/938
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins

2014-01-08 10:54:01 -08:00

10 lines

119 B

Plaintext

 ====
 ---- QUERY
 # string_col is 10MB
 select length(string_col) from widerow;
 ---- RESULTS
 10485760
 ---- TYPES
 INT
 ====