Currently we just support REFRESH on the whole table or a specific
partition:
  REFRESH [db_name.]table_name [PARTITION (key_col1=val1 [, key_col2=val2...])]
If users want to refresh multiple partitions, they have to submit
multiple statements, one per partition. This has some drawbacks:
- It requires acquiring the table write lock inside catalogd multiple
times, which increases lock contention with other read/write
operations on the same table, e.g. getPartialCatalogObject requests
from coordinators.
- The catalog version of the table is increased multiple times.
Coordinators in local catalog mode are more likely to see different
versions between their getPartialCatalogObject requests, so they have
to retry planning to resolve InconsistentMetadataFetchException.
- Partitions are reloaded in sequence. They should be reloaded in
parallel, as is done when refreshing the whole table.
This patch extends the syntax to refresh multiple partitions in one
statement:
  REFRESH [db_name.]table_name
    [PARTITION (key_col1=val1 [, key_col2=val2...])
    [PARTITION (key_col1=val3 [, key_col2=val4...])...]]
Example:
  REFRESH foo PARTITION(p=0) PARTITION(p=1) PARTITION(p=2);
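For clients that generate statements, the extended syntax could be
assembled from a list of partition specs. The following is only a
minimal sketch: build_refresh_stmt is a hypothetical helper, not part of
Impala, and it assumes partition values are passed as ready-made SQL
literals.

```python
from typing import Mapping, Sequence


def build_refresh_stmt(table: str,
                       partitions: Sequence[Mapping[str, str]] = ()) -> str:
    """Build a REFRESH statement with zero or more PARTITION clauses.

    Hypothetical helper for illustration. Values must already be valid SQL
    literals (e.g. "0" or "'2024-01-01'"); no quoting or escaping is done.
    """
    clauses = []
    for spec in partitions:
        if not spec:
            raise ValueError("empty partition spec")
        kvs = ", ".join(f"{col}={val}" for col, val in spec.items())
        clauses.append(f"PARTITION ({kvs})")
    # With no partition specs this degrades to a whole-table REFRESH.
    return " ".join([f"REFRESH {table}"] + clauses)


stmt = build_refresh_stmt("foo", [{"p": "0"}, {"p": "1"}, {"p": "2"}])
print(stmt)
```

Issuing one statement built this way, instead of one statement per
partition, is what lets catalogd handle all listed partitions in a
single request.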
TResetMetadataRequest is extended with a list of partition specs for
this. If the list has only one item, we still use the existing logic for
reloading a specific partition. If the list has more than one item, the
partitions are reloaded in parallel. This is implemented in
CatalogServiceCatalog#reloadTable(). Previously it always invoked
HdfsTable#load() with partitionsToUpdate=null. Now the parameter is
set when TResetMetadataRequest has the partition list.
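The dispatch described above can be modeled roughly as follows. This is
an illustrative sketch only, not the catalogd code (which is Java): the
names reload_partitions, fake_reload and max_workers are made up for the
example, and a thread pool stands in for whatever parallelism the real
table load uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

PartitionSpec = Dict[str, str]


def reload_partitions(specs: Sequence[PartitionSpec],
                      reload_one: Callable[[PartitionSpec], str],
                      max_workers: int = 4) -> List[str]:
    """Reload the given partitions and return per-partition results in order.

    Illustrative only: models the one-item vs. many-items dispatch, not
    the actual catalogd implementation.
    """
    if not specs:
        raise ValueError("no partition specs given")
    if len(specs) == 1:
        # Single spec: keep the existing single-partition path.
        return [reload_one(specs[0])]
    # More than one spec: reload in parallel; map() preserves input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(reload_one, specs))


def fake_reload(spec: PartitionSpec) -> str:
    # Stand-in for the real metadata reload of one partition.
    return "reloaded " + ",".join(f"{k}={v}" for k, v in spec.items())


results = reload_partitions([{"p": "0"}, {"p": "1"}, {"p": "2"}], fake_reload)
print(results)
```

Because the whole list is handled inside one call, the caller only needs
to take the table lock and bump the catalog version once, which is the
point of the change.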
HMS notification events of RELOAD type are fired for each partition
if enable_reload_events is turned on. Once HIVE-28967 is resolved, we
can fire a single event for multiple partitions.
Updated docs in impala_refresh.xml.
Tests:
- Added FE and e2e tests
Change-Id: Ie5b0deeaf23129ed6e1ba2817f54291d7f63d04e
Reviewed-on: http://gerrit.cloudera.org:8080/22938
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The text from impala_common.xml is reused verbatim on
the REFRESH page and on the UDFs page through an
#include-like mechanism.
Change-Id: Ic41fec781396b69e6df06b8de0b29c42ad51ce8f
Reviewed-on: http://gerrit.cloudera.org:8080/7044
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Impala Public Jenkins
For history and tracking purposes, there are many
instances of rev="CDH-1234" for various CDH- JIRA
numbers. These produce no visible output; they are just
FYI for the person editing the source. This change removes
all of them from the upstream doc source, so as not
to have "CDH" all through the doc source files.
Change-Id: I29089e5a31cd72e876b2ccb8375d1c10693c6aba
Reviewed-on: http://gerrit.cloudera.org:8080/6349
Reviewed-by: Ambreen Kazi <ambreen.kazi@cloudera.com>
Reviewed-by: John Russell <jrussell@cloudera.com>
Tested-by: Impala Public Jenkins
Update with details of the latest syntax.
Fine-tune discussion of PK and other Kudu
notions.
The impala_kudu diff looks larger than actual changes
to the page, because subtopics got moved
around and promoted/demoted (which changes the
indentation). Best to review that page start-to-finish.
CREATE TABLE details for Impala + Kudu.
ALTER TABLE details for Impala + Kudu.
Unhide the Impala partitioning + Kudu topic.
Mainly a brief intro then a link to delegate
details to the main Kudu page, which already
has a partitioning subtopic.
Include changes to reserved words. Entirely
from Kudu integration work.
Add Kudu considerations for misc SQL statements.
Addressed Todd's and Dimitris's comments for certain files.
(Up to the beginning of the "Partitioning" section in
impala_kudu.xml.)
Added Kudu blurbs to data type topics:
- Some aren't supported.
- Others are supported but can't go in the primary key.
Added walkthrough of renaming internal/external tables.
Split out Kudu CREATE TABLE syntax from other file formats.
Correct info about CTAS for Kudu tables.
Add examples of basic Kudu, external Kudu, and Kudu CTAS.
Change-Id: I76dcb948dab08532fe41326b22ef78d73282db2c
Reviewed-on: http://gerrit.cloudera.org:8080/5649
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Impala Public Jenkins
For this change to land in master, the audience="hidden" code review
needs to be completed first. Otherwise, the doc build would still work
but the audience="hidden" content would be visible rather than hidden as
desired.
Some work happening in parallel might introduce additional instances of
audience="Cloudera". I suggest addressing those in a followup CR so this
global change can land quickly.
Since the changes apply across so many different files, but are so
narrow in scope, I suggest that the way to validate (check that no
extraneous changes were introduced accidentally) is to diff just the
changed lines:
  git diff -U0 HEAD^ HEAD
In patch set 2, I updated other topics marked audience="Cloudera"
by CRs that were pushed in the meantime.
Change-Id: Ic93d89da77e1f51bbf548a522d98d0c4e2fb31c8
Reviewed-on: http://gerrit.cloudera.org:8080/5613
Reviewed-by: John Russell <jrussell@cloudera.com>
Tested-by: Impala Public Jenkins
This now gives a clean RAT check with bin/check-rat-report.py, which
is one way for the Impala community to check compliance with ASF rules
on intellectual property.
Change-Id: I2ad06435f84a65ba126759e42a18fdaf52cd7036
Reviewed-on: http://gerrit.cloudera.org:8080/5232
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
Reviewed-by: John Russell <jrussell@cloudera.com>
These are refugees from doc_prototype. They can be rendered with the
DITA Open Toolkit version 2.3.3 by:
  /tmp/dita-ot-2.3.3/bin/dita \
    -i impala.ditamap \
    -f html5 \
    -o $(mktemp -d) \
    -filter impala_html.ditaval
Change-Id: I8861e99adc446f659a04463ca78c79200669484f
Reviewed-on: http://gerrit.cloudera.org:8080/5014
Reviewed-by: John Russell <jrussell@cloudera.com>
Tested-by: John Russell <jrussell@cloudera.com>