mirror of
https://github.com/apache/impala.git
synced 2026-01-09 06:05:09 -05:00
per host memory limit for a query
With this patch, the per-host memory limit of a query is set
automatically from the mem_limit query option and the mem_estimate
calculated by the planner, according to the following pseudocode:
if mem_limit is set in query options:
  use that and if 'clamp-mem-limit-query-option' is true:
    enforce the min/max query mem limits defined in the pool config.
else:
  mem_limit = max(mem_estimate,
    min_mem_limit_required_to_accommodate_largest_initial_reservation)
  finally, enforce min/max query mem limits defined in the pool
  config on this value.
This calculated mem limit will also be used for admission accounting
and consequently for admission control. Moreover, three new pool
configuration options have been added to enable this behaviour:
"min-query-mem-limit" & "max-query-mem-limit" => clamp the per-host
  memory limit for a query. If neither limit is configured, the
  estimates from planning are not used as a memory limit and are used
  only for making admission decisions. Moreover, the estimates will no
  longer have a lower bound based on the largest initial reservation.
"clamp-mem-limit-query-option" => if false, the mem_limit defined in
  the query options is used directly and the min/max query mem limits
  are not enforced on it.
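The selection logic described above can be sketched in Python as follows. This is an illustration only, not the code in the patch: the PoolConfig class, the function names, the use of None for "not configured", and the assumption that min <= max whenever both are set are all hypothetical choices made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PoolConfig:
    # Hypothetical container; fields mirror the pool options named above.
    min_query_mem_limit: Optional[int]  # bytes; None means not configured
    max_query_mem_limit: Optional[int]  # bytes; None means not configured
    clamp_mem_limit_query_option: bool


def clamp(value: int, pool: PoolConfig) -> int:
    """Enforce the pool's min/max query mem limits (assumes min <= max)."""
    if pool.min_query_mem_limit is not None:
        value = max(value, pool.min_query_mem_limit)
    if pool.max_query_mem_limit is not None:
        value = min(value, pool.max_query_mem_limit)
    return value


def per_host_mem_limit(
    mem_limit: Optional[int],
    mem_estimate: int,
    min_for_largest_initial_reservation: int,
    pool: PoolConfig,
) -> Optional[int]:
    """Return the per-host mem limit in bytes, or None if no limit applies."""
    if mem_limit is not None:
        # mem_limit from the query options wins; clamp it only if asked to.
        if pool.clamp_mem_limit_query_option:
            return clamp(mem_limit, pool)
        return mem_limit
    if pool.min_query_mem_limit is None and pool.max_query_mem_limit is None:
        # Neither bound configured: the estimate is not used as a limit.
        return None
    # Otherwise start from the planner estimate, raised to fit the largest
    # initial reservation, and always clamp it to the pool bounds.
    return clamp(max(mem_estimate, min_for_largest_initial_reservation), pool)
```

For example, with a pool bounded to 2 GiB and 8 GiB, a 1 GiB planner estimate is raised to 2 GiB, while a 16 GiB mem_limit query option is lowered to 8 GiB unless clamping of the query option is turned off.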
Testing:
Added e2e test cases.
Added frontend tests for changes to RequestPoolService.
Successfully passed exhaustive tests.
Change-Id: Ifec00141651982f5975803c2165b7d7a10ebeaa6
Reviewed-on: http://gerrit.cloudera.org:8080/11157
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This directory contains Impala test workloads. The directory layout for
the workloads should follow:

workloads/
   <data set name>/<data set name>_dimensions.csv  <- The test dimension file
   <data set name>/<data set name>_core.csv        <- A test vector file
   <data set name>/<data set name>_pairwise.csv
   <data set name>/<data set name>_exhaustive.csv
   <data set name>/queries/<query test>.test       <- The queries for this workload
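As a rough illustration of the layout above, the following Python sketch lists the relative paths a workload would be expected to contain. The function name, the constant, and the example data set and query test names are hypothetical and are not part of the Impala test framework.

```python
from pathlib import PurePosixPath

# File name suffixes taken from the layout described above.
VECTOR_FILE_SUFFIXES = ("dimensions", "core", "pairwise", "exhaustive")


def expected_workload_files(data_set, query_tests=()):
    """Return the relative paths expected for one workload.

    data_set is the data set name; query_tests are query test names
    without the .test extension.
    """
    base = PurePosixPath("workloads") / data_set
    paths = [base / f"{data_set}_{suffix}.csv" for suffix in VECTOR_FILE_SUFFIXES]
    paths += [base / "queries" / f"{name}.test" for name in query_tests]
    return [str(p) for p in paths]


if __name__ == "__main__":
    # "example_set" and "example-q1" are placeholder names for the demo.
    for path in expected_workload_files("example_set", ["example-q1"]):
        print(path)
```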