IMPALA-8807: fix OPTIMIZE_PARTITION_KEY_SCANS docs

The docs were inaccurate about the cases in which the optimisation
applied. Happily, it actually works in a much wider set of cases.
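
As a rough illustration of the wider set of cases, here is a sketch only.
The table `sales` and its partition key columns `year` and `month` are
hypothetical and not part of this change. Under the rule as stated in the
updated text, queries of the following shapes would be expected to qualify:

    -- Assumes a hypothetical table `sales` partitioned by (year, month).
    SET OPTIMIZE_PARTITION_KEY_SCANS=1;

    -- Partition key referenced only inside MIN()/MAX() aggregates.
    SELECT MIN(year), MAX(year) FROM sales;

    -- Partition key as the argument of a DISTINCT aggregate, combined
    -- with a WHERE clause on another partition key.
    SELECT COUNT(DISTINCT month) FROM sales WHERE year = 2019;

    -- Partition key referenced in GROUP BY, with MAX() over another key.
    SELECT year, MAX(month) FROM sales GROUP BY year;

By the same rule, a query that also references a column that is not a
partition key would not be expected to benefit from the optimisation.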

Change-Id: I8909b23bfe2b90470fc559fbc01f1e3aa3caa85d
Reviewed-on: http://gerrit.cloudera.org:8080/13949
Reviewed-by: Alex Rodoni <arodoni@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tim Armstrong
2019-07-29 17:29:35 -07:00
committed by Alex Rodoni
parent 8099911fd7
commit b6b45c0665


@@ -52,15 +52,31 @@ under the License.
<p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
<p>
-This optimization speeds up common <q>introspection</q> operations when using queries
-to calculate the cardinality and range for partition key columns.
+This optimization speeds up common <q>introspection</q> operations
+over partition key columns, for example determining the distinct values
+of partition keys.
</p>
<p>
-This optimization does not apply if the queries contain any <codeph>WHERE</codeph>,
-<codeph>GROUP BY</codeph>, or <codeph>HAVING</codeph> clause. The relevant queries
-should only compute the minimum, maximum, or number of distinct values for the
-partition key columns across the whole table.
+This optimization does not apply to <codeph>SELECT</codeph> statements
+that reference columns that are not partition keys. It also only applies
+when all the partition key columns in the <codeph>SELECT</codeph> statement
+are referenced in one of the following contexts:
+<ul>
+<li>
+<p>
+Within a <codeph>MIN()</codeph> or <codeph>MAX()</codeph>
+aggregate function, or as the argument of any aggregate function with
+the <codeph>DISTINCT</codeph> keyword applied.
+</p>
+</li>
+<li>
+<p>
+Within a <codeph>WHERE</codeph>, <codeph>GROUP BY</codeph>
+or <codeph>HAVING</codeph> clause.
+</p>
+</li>
+</ul>
</p>
<p>