When OptimizeStmt created the table sink it didn't set
'inputIsClustered' to true. Therefore HdfsTableSink expected random
input and kept an output writer open for every partition, which
resulted in high memory consumption and potentially an OOM error when
the number of partitions is high.

Since we actually sort the rows before the sink, we can set
'inputIsClustered' to true. This lets HdfsTableSink write files one by
one: whenever it gets a row that belongs to a new partition, it knows
it can close the current output writer and open a new one.

Testing:
 * added e2e test

Change-Id: I8d451c50c4b6dff9433ab105493051bee106bc63
Reviewed-on: http://gerrit.cloudera.org:8080/22192
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
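The clustered-write pattern the sink relies on can be sketched as
follows. This is a minimal illustration with hypothetical names, not
Impala's actual HdfsTableSink code: because the input rows arrive
sorted by partition key, at most one writer needs to be open at a
time, instead of one per partition.

```python
def write_clustered(rows, open_writer):
    """Write (partition_key, value) rows that are pre-sorted by
    partition key, keeping at most one writer open at a time.
    Hypothetical sketch, not Impala's implementation."""
    current_key = None
    writer = None
    for key, value in rows:
        if key != current_key:
            # Input is clustered: once the key changes, the previous
            # partition is complete, so its writer can be closed
            # immediately instead of staying open until the end.
            if writer is not None:
                writer.close()
            writer = open_writer(key)
            current_key = key
        writer.write(value)
    if writer is not None:
        writer.close()
```

With random (unclustered) input, the same loop would have to keep a
writer open per partition seen so far, which is the memory blow-up
this change avoids.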