IMPALA-6553: [DOCS] load_catalog_in_background default change

Change-Id: I548b2d1532c12f8d3c795a940b7f980482ecf09b
Reviewed-on: http://gerrit.cloudera.org:8080/9389
Reviewed-by: John Russell <jrussell@cloudera.com>
Tested-by: Impala Public Jenkins
Alex Rodoni
2018-02-21 17:20:28 -08:00
committed by Impala Public Jenkins
parent 77d07f8067
commit 3a1d802ead
2 changed files with 38 additions and 10 deletions

@@ -3443,10 +3443,39 @@ select * from header_line limit 10;
</p>
<p id="load_catalog_in_background">
By default, the metadata loading and caching on startup happens asynchronously, so Impala can begin
accepting requests promptly. To enable the original behavior, where Impala waited until all metadata was
loaded before accepting any requests, set the <cmdname>catalogd</cmdname> configuration option
<codeph>--load_catalog_in_background=false</codeph>.
Use the <codeph>--load_catalog_in_background</codeph> option to control when
the metadata of a table is loaded.
<ul>
<li>
If set to <codeph>false</codeph>, the metadata of a table is
loaded when it is referenced for the first time. This means that the
first run of a particular query can be slower than subsequent runs.
Starting in Impala 2.2, the default for
<codeph>load_catalog_in_background</codeph> is
<codeph>false</codeph>.
</li>
<li>
If set to <codeph>true</codeph>, the catalog service attempts to
load metadata for a table even if no query has needed that metadata, so
the metadata might already be loaded when the first query that
needs it runs. However, for the following reasons, we
recommend not setting the option to <codeph>true</codeph>.
<ul>
<li>
Background loading can interfere with query-specific metadata
loading. This can happen on startup or after invalidating
metadata, with a duration depending on the amount of metadata,
and can lead to seemingly random long-running queries that are
difficult to diagnose.
</li>
<li>
Impala may load metadata for tables that are never
used, potentially increasing catalog size and consequently memory
usage for both the catalog service and the Impala daemon.
</li>
</ul>
</li>
</ul>
</p>
<ul id="catalogd_xrefs">
@@ -3458,7 +3487,6 @@ select * from header_line limit 10;
<cmdname>catalogd</cmdname> daemon.
</p>
</li>
<li>
<p>
The <codeph>REFRESH</codeph> and <codeph>INVALIDATE METADATA</codeph> statements are no longer needed

@@ -192,11 +192,11 @@ under the License.
By default, the <codeph>INVALIDATE METADATA</codeph> command checks HDFS permissions of the underlying data
files and directories, caching this information so that a statement can be cancelled immediately if, for
example, the <codeph>impala</codeph> user does not have permission to write to the data directory for the
table. (This checking does not apply if you have set the <cmdname>catalogd</cmdname> configuration option
<codeph>--load_catalog_in_background=false</codeph>.) Impala reports any lack of write permissions as an
<codeph>INFO</codeph> message in the log file, in case that represents an oversight. If you change HDFS
permissions to make data readable or writeable by the Impala user, issue another <codeph>INVALIDATE
METADATA</codeph> to make Impala aware of the change.
table. (This checking does not apply when the <cmdname>catalogd</cmdname> configuration option
<codeph>--load_catalog_in_background</codeph> is set to <codeph>false</codeph>, which it is by default.)
Impala reports any lack of write permissions as an <codeph>INFO</codeph> message in the log file, in case
that represents an oversight. If you change HDFS permissions to make data readable or writeable by the Impala
user, issue another <codeph>INVALIDATE METADATA</codeph> to make Impala aware of the change.
</p>
<p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>