mirror of
https://github.com/apache/impala.git
synced 2026-01-05 21:00:54 -05:00
This patch optimizes queries that process few rows by forcing them to
be executed on a single node with codegen disabled (and with just one
scanner thread if the threshold is below the batch size). If the query
has a limit clause that can be applied directly on a scan node, the
limit is used instead of the input cardinality. If the maximum number
of rows cannot be determined with confidence, e.g. because statistics
are missing, the optimization is not applied.
This behavior is controlled by the new query option
EXEC_SINGLE_NODE_ROWS_THRESHOLD, which has a default value of 100.
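The decision described above can be sketched roughly as follows. This
is an illustrative Python model, not the planner code from this patch:
the names ScanEstimate, ExecDecision and decide_execution are
hypothetical, the batch size default of 1024 is an assumption, and so
is the inclusive comparison against the threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanEstimate:
    """Hypothetical summary of what the planner knows about a scan."""
    cardinality: Optional[int]       # None when statistics are missing
    scan_limit: Optional[int] = None  # limit applicable directly on the scan


@dataclass
class ExecDecision:
    single_node: bool
    disable_codegen: bool
    single_scanner_thread: bool


def decide_execution(scan: ScanEstimate,
                     rows_threshold: int = 100,  # EXEC_SINGLE_NODE_ROWS_THRESHOLD
                     batch_size: int = 1024      # assumed batch size
                     ) -> ExecDecision:
    # A limit that applies directly on the scan is chosen over the
    # input cardinality.
    if scan.scan_limit is not None:
        max_rows = scan.scan_limit
    else:
        max_rows = scan.cardinality

    # Without a trustworthy maximum (e.g. missing statistics) the
    # optimization is not applied.
    if max_rows is None:
        return ExecDecision(False, False, False)

    # Assumption: "few rows" means at most the threshold.
    if max_rows > rows_threshold:
        return ExecDecision(False, False, False)

    # Small query: single node, no codegen, and one scanner thread
    # when the threshold is below the batch size.
    return ExecDecision(
        single_node=True,
        disable_codegen=True,
        single_scanner_thread=rows_threshold < batch_size,
    )
```

For example, under these assumptions a scan with unknown cardinality
and no limit keeps the regular distributed plan, while the same scan
with a directly applicable limit of 10 is run on a single node.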
Change-Id: I4bdacc708e7338914b52efdf7ff67d28c50539f3
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4822
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5344
Reviewed-by: Alex Behm <alex.behm@cloudera.com>