mirror of
https://github.com/apache/impala.git
synced 2025-12-19 18:12:08 -05:00
IMPALA-14089: Support REFRESH on multiple partitions
Currently we just support REFRESH on the whole table or a specific partition:

  REFRESH [db_name.]table_name [PARTITION (key_col1=val1 [, key_col2=val2...])]

If users want to refresh multiple partitions, they have to submit multiple statements, each for a single partition. This has some drawbacks:
 - It requires holding the table write lock inside catalogd multiple times, which increases lock contention with other read/write operations on the same table, e.g. getPartialCatalogObject requests from coordinators.
 - The catalog version of the table will be increased multiple times. Coordinators in local catalog mode are more likely to see different versions between their getPartialCatalogObject requests, so they have to retry planning to resolve InconsistentMetadataFetchException.
 - Partitions are reloaded in sequence. They should be reloaded in parallel, as we do when refreshing the whole table.

This patch extends the syntax to refresh multiple partitions in one statement:

  REFRESH [db_name.]table_name [PARTITION (key_col1=val1 [, key_col2=val2...]) [PARTITION (key_col1=val3 [, key_col2=val4...])...]]

Example:

  REFRESH foo PARTITION(p=0) PARTITION(p=1) PARTITION(p=2);

TResetMetadataRequest is extended to have a list of partition specs for this. If the list has only one item, we still use the existing logic of reloading a specific partition. If the list has more than one item, partitions will be reloaded in parallel. This is implemented in CatalogServiceCatalog#reloadTable(). Previously it always invoked HdfsTable#load() with partitionsToUpdate=null. Now the parameter is set when TResetMetadataRequest has the partition list.

HMS notification events of the RELOAD type will be fired for each partition if enable_reload_events is turned on. Once HIVE-28967 is resolved, we can fire a single event for multiple partitions.

Updated docs in impala_refresh.xml.
Tests:
 - Added FE and e2e tests

Change-Id: Ie5b0deeaf23129ed6e1ba2817f54291d7f63d04e
Reviewed-on: http://gerrit.cloudera.org:8080/22938
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
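The dispatch rule described in the commit message (one partition spec keeps the single-partition path, several specs are reloaded in parallel) can be sketched as below. This is a minimal illustration in Python, not Impala's Java implementation: `reload_partitions`, `reload_one`, and `max_workers` are hypothetical names introduced here, and the thread pool merely stands in for whatever parallel loading catalogd performs.

```python
from concurrent.futures import ThreadPoolExecutor


def reload_partitions(partition_specs, reload_one, max_workers=4):
    """Hypothetical dispatcher mirroring the rule in the commit message.

    partition_specs: list of partition specs (any objects).
    reload_one: caller-supplied callable that reloads a single partition.
    Returns the per-partition results in the same order as the input.
    """
    if not partition_specs:
        raise ValueError("expected at least one partition spec")
    if len(partition_specs) == 1:
        # Single spec: stay on the existing one-partition path.
        return [reload_one(partition_specs[0])]
    # Several specs: reload them concurrently; map() preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(reload_one, partition_specs))
```

A caller would pass its own reload function, for example `reload_partitions([{"p": 0}, {"p": 1}], my_reload)`, where `my_reload` is likewise an assumption of this sketch.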
committed by Impala Public Jenkins
parent 063b90c433
commit b37f4509fa
@@ -10604,6 +10604,7 @@ under the License.
 <keydef keys="impala132"><topicmeta><keywords><keyword>Impala 1.3.2</keyword></keywords></topicmeta></keydef>
 <keydef keys="impala130"><topicmeta><keywords><keyword>Impala 1.3.0</keyword></keywords></topicmeta></keydef>
 
+<keydef keys="impala50_full"><topicmeta><keywords><keyword>Impala 5.0</keyword></keywords></topicmeta></keydef>
 <keydef keys="impala42_full"><topicmeta><keywords><keyword>Impala 4.2</keyword></keywords></topicmeta></keydef>
 <keydef keys="impala34_full"><topicmeta><keywords><keyword>Impala 3.4</keyword></keywords></topicmeta></keydef>
 <keydef keys="impala33_full"><topicmeta><keywords><keyword>Impala 3.3</keyword></keywords></topicmeta></keydef>
@@ -67,7 +67,9 @@ under the License.
 
 <p conref="../shared/impala_common.xml#common/syntax_blurb"/>
 
-<codeblock rev="IMPALA-1683">REFRESH [<varname>db_name</varname>.]<varname>table_name</varname> [PARTITION (<varname>key_col1</varname>=<varname>val1</varname> [, <varname>key_col2</varname>=<varname>val2</varname>...])]</codeblock>
+<codeblock rev="IMPALA-1683">REFRESH [<varname>db_name</varname>.]<varname>table_name</varname>
+[PARTITION (<varname>key_col1</varname>=<varname>val1</varname> [, <varname>key_col2</varname>=<varname>val2</varname>...])
+[PARTITION (<varname>key_col1</varname>=<varname>val3</varname> [, <varname>key_col2</varname>=<varname>val4</varname>...])...]</codeblock>
 
 <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
 
@@ -115,7 +117,7 @@ under the License.
 <p conref="../shared/impala_common.xml#common/refresh_vs_invalidate"/>
 
 <p rev="IMPALA-1683">
-<b>Refreshing a single partition:</b>
+<b>Refreshing specific partitions:</b>
 </p>
 
 <p rev="IMPALA-1683">
@@ -125,6 +127,13 @@ under the License.
 values for each of the partition key columns.
 </p>
 
+<p>
+In <keyword keyref="impala50_full"/> and higher, the <codeph>REFRESH</codeph> statement
+can apply to multiple partitions at a time, rather than a single partition. Use the
+optional <codeph>PARTITION (<varname>partition_spec</varname>)</codeph> clause for
+each of the partitions.
+</p>
+
 <p>
 The following rules apply:
 <ul>
@@ -164,6 +173,9 @@ refresh p2 partition (z=1, y=0)
 -- Incomplete partition spec causes an error.
 refresh p2 partition (y=0)
 ERROR: AnalysisException: Items in partition spec must exactly match the partition columns in the table definition: default.p2 (1 vs 2)
+
+-- Refresh multiple partitions.
+refresh p2 partition (y=0, z=3) partition (y=1, z=0) partition (y=1, z=2);
 ]]>
 </codeblock>
 
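For clients that previously issued one REFRESH per partition, the extended syntax can be assembled from a list of partition specs. The helper below is a minimal sketch under stated assumptions: `build_refresh_statement` is a hypothetical name, it is not part of Impala or of any client library, and it expects partition values that are already valid SQL literals (string values must arrive pre-quoted).

```python
def build_refresh_statement(table, partition_specs=()):
    """Hypothetical helper: build one REFRESH statement for many partitions.

    table: table name, optionally qualified as db_name.table_name.
    partition_specs: iterable of dicts mapping partition key column to a
    value that is already a valid SQL literal (assumption of this sketch).
    """
    clauses = [f"REFRESH {table}"]
    for spec in partition_specs:
        if not spec:
            raise ValueError("each partition spec needs at least one key=value")
        # Dicts keep insertion order, so keys appear as the caller listed them.
        pairs = ", ".join(f"{col}={val}" for col, val in spec.items())
        clauses.append(f"PARTITION ({pairs})")
    return " ".join(clauses)
```

For example, `build_refresh_statement("p2", [{"y": 0, "z": 3}, {"y": 1, "z": 0}])` yields a statement of the same shape as the multi-partition example in the diff. As the documentation notes, each spec must still name every partition key column of the table.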