10 Commits

Author SHA1 Message Date
stiga-huang
b37f4509fa IMPALA-14089: Support REFRESH on multiple partitions
Currently we just support REFRESH on the whole table or a specific
partition:
  REFRESH [db_name.]table_name [PARTITION (key_col1=val1 [, key_col2=val2...])]

If users want to refresh multiple partitions, they have to submit
multiple statements, one for each partition. This has some
drawbacks:
 - It requires holding the table write lock inside catalogd multiple
   times, which increases lock contention with other read/write
   operations on the same table, e.g. getPartialCatalogObject requests
   from coordinators.
 - The catalog version of the table will be increased multiple times.
   Coordinators in local catalog mode are more likely to see different
   versions between their getPartialCatalogObject requests, so they
   have to retry planning to resolve InconsistentMetadataFetchException.
 - Partitions are reloaded in sequence. They should be reloaded in
   parallel, as we do when refreshing the whole table.

This patch extends the syntax to refresh multiple partitions in one
statement:
  REFRESH [db_name.]table_name
  [PARTITION (key_col1=val1 [, key_col2=val2...])
   [PARTITION (key_col1=val3 [, key_col2=val4...])...]]
Example:
  REFRESH foo PARTITION(p=0) PARTITION(p=1) PARTITION(p=2);
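
For clients that generate such statements, a minimal sketch of composing the
extended syntax follows. The helper name build_refresh_statement and its
quoting rule are assumptions made for illustration; they are not part of
Impala or of this patch.

```python
def build_refresh_statement(table, partitions):
    """Compose one REFRESH statement covering zero or more partitions.

    Hypothetical helper: `partitions` is a list of dicts, each mapping
    partition key columns to values, e.g. [{"p": 0}, {"p": 1}].
    """
    def literal(value):
        # Assumption: string values are single-quoted. Values that contain a
        # quote are rejected rather than guessing at escaping rules.
        if isinstance(value, str):
            if "'" in value:
                raise ValueError("quote characters are not handled in this sketch")
            return "'" + value + "'"
        return str(value)

    clauses = [
        "PARTITION(" + ", ".join(f"{k}={literal(v)}" for k, v in spec.items()) + ")"
        for spec in partitions
    ]
    # With an empty partition list this degrades to a whole-table REFRESH.
    return " ".join(["REFRESH " + table] + clauses) + ";"


print(build_refresh_statement("foo", [{"p": 0}, {"p": 1}, {"p": 2}]))
```

Called with the three specs above, the helper produces the same statement as
the example in this message.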

TResetMetadataRequest is extended to have a list of partition specs for
this. If the list has only one item, we still use the existing logic of
reloading a specific partition. If the list has more than one item,
partitions will be reloaded in parallel. This is implemented in
CatalogServiceCatalog#reloadTable(). Previously it always invoked
HdfsTable#load() with partitionsToUpdate=null. Now the parameter is
set when TResetMetadataRequest has the partition list.
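
The dispatch described above can be modeled roughly as follows. This is an
illustrative Python sketch, not the catalogd code (which is Java); the
function names, the worker count, and the shape of a partition spec are all
assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def reload_partition(spec):
    # Hypothetical stand-in for reloading one partition's metadata.
    # `spec` is a list of (key_col, value) pairs, e.g. [("p", 0)].
    return "reloaded " + ",".join(f"{k}={v}" for k, v in spec)


def handle_refresh(partition_specs):
    """Pick a reload path from the number of partition specs in the request."""
    if not partition_specs:
        # No partition list: refresh the whole table.
        return ("whole_table", [])
    if len(partition_specs) == 1:
        # Exactly one spec: keep the single-partition path.
        return ("single_partition", [reload_partition(partition_specs[0])])
    # More than one spec: reload the listed partitions in parallel.
    # max_workers=4 is an arbitrary choice for this sketch.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(reload_partition, partition_specs))
    return ("parallel_partitions", results)


print(handle_refresh([[("p", 0)], [("p", 1)], [("p", 2)]]))
```

The point of the sketch is only the three-way branch: whole table, one
partition, or several partitions reloaded concurrently within one request.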

HMS notification events of type RELOAD will be fired for each partition
if enable_reload_events is turned on. Once HIVE-28967 is resolved, we
can fire a single event for multiple partitions.

Updated docs in impala_refresh.xml.

Tests:
 - Added FE and e2e tests

Change-Id: Ie5b0deeaf23129ed6e1ba2817f54291d7f63d04e
Reviewed-on: http://gerrit.cloudera.org:8080/22938
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-28 05:18:53 +00:00
Shajini Thayasingh
1a84a1420c IMPALA-9770: [DOCS] Remove Sentry references in documentation
Updated all the associated topics.

Change-Id: Id4c5e9aa4d060ceaa426908a444d280a5564749d
Reviewed-on: http://gerrit.cloudera.org:8080/17469
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
2021-06-02 19:34:15 +00:00
Alex Rodoni
a9b01963b6 [DOCS] Added another scenario when REFRESH is required
Change-Id: Ifff8f6b6f834402a158312d57076a84ad1eeab45
Reviewed-on: http://gerrit.cloudera.org:8080/10773
Reviewed-by: Vuk Ercegovac <vercegovac@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-06-21 01:36:18 +00:00
Alex Rodoni
8b71c67cad IMPALA-6987: [DOCS] Refactor the INVALIDATE METADATA and REFRESH docs
Change-Id: I2124e14900d0f82569c061cc46006447bb054b36
Reviewed-on: http://gerrit.cloudera.org:8080/10339
Reviewed-by: Vuk Ercegovac <vercegovac@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-06-08 18:18:05 +00:00
John Russell
0e65409926 IMPALA-5259: [DOCS] Doc REFRESH FUNCTIONS
The text from impala_common.xml is reused verbatim under
the REFRESH page and in the UDFs page by a #include-like
mechanism.

Change-Id: Ic41fec781396b69e6df06b8de0b29c42ad51ce8f
Reviewed-on: http://gerrit.cloudera.org:8080/7044
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Impala Public Jenkins
2017-06-08 18:15:08 +00:00
John Russell
f3820e6205 IMPALA-3402: [DOCS] Remove CDH- JIRA numbers from rev=
For history and tracking purposes, there are many
instances of rev="CDH-1234" for various CDH- JIRA
numbers. This produces no visible output; it's just
FYI for the person editing the source. Removing all
of these now from the upstream doc source, so as not
to have "CDH" all through the doc source files.

Change-Id: I29089e5a31cd72e876b2ccb8375d1c10693c6aba
Reviewed-on: http://gerrit.cloudera.org:8080/6349
Reviewed-by: Ambreen Kazi <ambreen.kazi@cloudera.com>
Reviewed-by: John Russell <jrussell@cloudera.com>
Tested-by: Impala Public Jenkins
2017-03-10 23:56:51 +00:00
John Russell
661921b205 [DOCS] Major update to Impala + Kudu page
Upgrade with details of latest syntax.

Fine-tune discussion of PK and other Kudu
notions.

The impala_kudu diff looks larger than actual changes
to the page, because subtopics got moved
around and promoted/demoted (which changes the
indentation). Best to review that page start-to-finish.

CREATE TABLE details for Impala + Kudu.

ALTER TABLE details for Impala + Kudu.

Unhide the Impala partitioning + Kudu topic.
Mainly a brief intro then a link to delegate
details to the main Kudu page, which already
has a partitioning subtopic.

Include changes to reserved words. Entirely
from Kudu integration work.

Add Kudu considerations for misc SQL statements.

Addressed Todd's and Dimitris's comments for certain files.
(Up to the beginning of the "Partitioning" section in
impala_kudu.xml.)

Added Kudu blurbs to data type topics:
- Some aren't supported.
- Others are supported but can't go in the primary key.

Added walkthrough of renaming internal/external tables.

Split out Kudu CREATE TABLE syntax from other file formats.

Correct info about CTAS for Kudu tables.

Add examples of basic Kudu, external Kudu, and Kudu CTAS.

Change-Id: I76dcb948dab08532fe41326b22ef78d73282db2c
Reviewed-on: http://gerrit.cloudera.org:8080/5649
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Impala Public Jenkins
2017-02-17 01:10:12 +00:00
John Russell
8377b9949c Global search/replace: audience="Cloudera" -> audience="hidden".
For this change to land in master, the audience="hidden" code review
needs to be completed first. Otherwise, the doc build would still work
but the audience="hidden" content would be visible rather than hidden as
desired.

Some work happening in parallel might introduce additional instances of
audience="Cloudera". I suggest addressing those in a followup CR so this
global change can land quickly.

Since the changes apply across so many different files, but are so
narrow in scope, I suggest that the way to validate (check that no
extraneous changes were introduced accidentally) is to diff just the
changed lines:

git diff -U0 HEAD^ HEAD

In patch set 2, I updated other topics marked audience="Cloudera"
by CRs that were pushed in the meantime.

Change-Id: Ic93d89da77e1f51bbf548a522d98d0c4e2fb31c8
Reviewed-on: http://gerrit.cloudera.org:8080/5613
Reviewed-by: John Russell <jrussell@cloudera.com>
Tested-by: Impala Public Jenkins
2017-01-18 19:31:57 +00:00
Jim Apple
d484d2f684 Add Apache license header to files in doc directory
This now gives a clean RAT check with bin/check-rat-report.py, which
is one way for the Impala community to check compliance with ASF rules
on intellectual property.

Change-Id: I2ad06435f84a65ba126759e42a18fdaf52cd7036
Reviewed-on: http://gerrit.cloudera.org:8080/5232
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
Reviewed-by: John Russell <jrussell@cloudera.com>
2016-12-02 23:54:32 +00:00
Jim Apple
3be0f122a5 IMPALA-3398: Add docs to main Impala branch.
These are refugees from doc_prototype. They can be rendered with the
DITA Open Toolkit version 2.3.3 by:

/tmp/dita-ot-2.3.3/bin/dita \
  -i impala.ditamap \
  -f html5 \
  -o $(mktemp -d) \
  -filter impala_html.ditaval

Change-Id: I8861e99adc446f659a04463ca78c79200669484f
Reviewed-on: http://gerrit.cloudera.org:8080/5014
Reviewed-by: John Russell <jrussell@cloudera.com>
Tested-by: John Russell <jrussell@cloudera.com>
2016-11-17 22:38:44 +00:00