From 6f52ce10e302ed9d168731dc11db07aabbfa2e53 Mon Sep 17 00:00:00 2001
From: Alex Rodoni
Date: Tue, 26 Jun 2018 14:30:38 -0700
Subject: [PATCH] [DOCS] Clarification on admission control and DDL statements

Removed the confusing example and paragraphs.

Change-Id: I2e3e82bd34e88e7a13de1864aeb97f01023bc715
Reviewed-on: http://gerrit.cloudera.org:8080/10829
Reviewed-by: Tim Armstrong
Tested-by: Impala Public Jenkins
---
 docs/topics/impala_admission.xml | 146 +++++++++++++------------------
 1 file changed, 61 insertions(+), 85 deletions(-)

diff --git a/docs/topics/impala_admission.xml b/docs/topics/impala_admission.xml
index 5de246bf5..317fa80ef 100644
--- a/docs/topics/impala_admission.xml
+++ b/docs/topics/impala_admission.xml
@@ -50,6 +50,11 @@ under the License.
 before returning with an error. These queue settings let you ensure that queries do not wait indefinitely, so that you can detect and correct starvation scenarios.

+

+ Queries, DML statements, and some DDL statements, including
+ CREATE TABLE AS SELECT and COMPUTE
+ STATS are affected by admission control.
+

 Enable this feature if your cluster is underutilized at some times and overutilized at others. Overutilization is indicated by performance
@@ -765,38 +770,42 @@ impala.admission-control.pool-queue-timeout-ms.queue_name
-
- Guidelines for Using Admission Control
-
-
-
-
-
-
-
-
-
-
-

- To see how admission control works for particular queries, examine the profile output for the query. This
- information is available through the PROFILE statement in impala-shell
- immediately after running a query in the shell, on the queries page of the Impala
- debug web UI, or in the Impala log file (basic information at log level 1, more detailed information at log
- level 2). The profile output contains details about the admission decision, such as whether the query was
- queued or not and which resource pool it was assigned to. It also includes the estimated and actual memory
- usage for the query, so you can fine-tune the configuration for the memory limits of the resource pools.
-

- -

- Remember that the limits imposed by admission control are soft limits.
- The decentralized nature of this mechanism means that each Impala node makes its own decisions about whether
- to allow queries to run immediately or to queue them. These decisions rely on information passed back and forth
- between nodes by the statestore service. If a sudden surge in requests causes more queries than anticipated to run
- concurrently, then throughput could decrease due to queries spilling to disk or contending for resources;
- or queries could be cancelled if they exceed the MEM_LIMIT setting while running.
-

-
-
-
-

- In impala-shell, you can also specify which resource pool to direct queries to by
- setting the REQUEST_POOL query option.
-

- -

- The statements affected by the admission control feature are primarily queries, but also include statements
- that write data such as INSERT and CREATE TABLE AS SELECT. Most write
- operations in Impala are not resource-intensive, but inserting into a Parquet table can require substantial
- memory due to buffering intermediate data before writing out each Parquet data block. See
- for instructions about inserting data efficiently into
- Parquet tables.
-

- -

- Although admission control does not scrutinize memory usage for other kinds of DDL statements, if a query
- is queued due to a limit on concurrent queries or memory usage, subsequent statements in the same session
- are also queued so that they are processed in the correct order:
-

-
--- This query could be queued to avoid out-of-memory at times of heavy load.
-select * from huge_table join enormous_table using (id);
--- If so, this subsequent statement in the same session is also queued
--- until the previous statement completes.
-drop table huge_table;
-
-

- If you set up different resource pools for different users and groups, consider reusing any classifications
- you developed for use with Sentry security. See for details.
-

- -

- For details about all the Fair Scheduler configuration settings, see
- Fair Scheduler Configuration, in particular the tags such as <queue> and
- <aclSubmitApps> to map users and groups to particular resource pools (queues).
-

-
-
-
-
+

+ In impala-shell, you can also specify which
+ resource pool to direct queries to by setting the
+ REQUEST_POOL query option.
+

+

+ If you set up different resource pools for different users and
+ groups, consider reusing any classifications you developed for use
+ with Sentry security. See for details.
+

+

+ For details about all the Fair Scheduler configuration settings, see
+ Fair Scheduler Configuration, in
+ particular the tags such as <queue> and
+ <aclSubmitApps> to map users and groups to
+ particular resource pools (queues).
+

+ + -