impala/docs/topics/impala_metadata.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="impala_metadata">

  <title>Metadata Management</title>

  <prolog>
    <metadata>
      <data name="Category" value="Impala"/>
      <data name="Category" value="Configuring"/>
      <data name="Category" value="Administrators"/>
      <data name="Category" value="Developers"/>
    </metadata>
  </prolog>

  <conbody>

    <p>
      This topic describes various knobs you can use to control how Impala manages its metadata
      in order to improve performance and scalability.
    </p>

    <p outputclass="toc inpage"/>

  </conbody>

  <concept id="on_demand_metadata">

    <title>On-demand Metadata</title>

    <conbody>

      <p>
        In previous versions of Impala, every coordinator kept a replica of all the cache in
        <codeph>catalogd</codeph>, consuming large memory on each coordinator with no option to
        evict. Metadata always propagated through the <codeph>statestored</codeph> and suffers
        from head-of-line blocking, for example, one user loading a big table blocking another
        user loading a small table.
      </p>

      <p>
        With this new feature, the coordinators pull metadata as needed from
        <codeph>catalogd</codeph> and cache it locally. The cached metadata gets evicted
        automatically under memory pressure.
      </p>

      <p>
        The granularity of on-demand metadata fetches is now at the partition level between the
        coordinator and <codeph>catalogd</codeph>. Common use cases like add/drop partitions do
        not trigger unnecessary serialization/deserialization of large metadata.
      </p>

      <p>
        This feature is disabled by default.
      </p>

      <p>
        The feature can be used in either of the following modes.
        <dl>
          <dlentry>

            <dt>
              Metadata on-demand mode
            </dt>

            <dd>
              In this mode, all coordinators use the metadata on-demand.
            </dd>

            <dd>
              Set the following on <codeph>catalogd</codeph>:
<codeblock>--catalog_topic_mode=minimal</codeblock>
            </dd>

            <dd>
              Set the following on all <codeph>impalad</codeph> coordinators:
<codeblock>--use_local_catalog=true</codeblock>
            </dd>

          </dlentry>

          <dlentry>

            <dt>
              Mixed mode
            </dt>

            <dd>
              In this mode, only some coordinators are enabled to use the metadata on-demand.
            </dd>

            <dd>
              We recommend that you use the mixed mode only for testing local catalog’s impact
              on heap usage.
            </dd>

            <dd>
              Set the following on <codeph>catalogd</codeph>:
<codeblock>--catalog_topic_mode=mixed</codeblock>
            </dd>

            <dd>
              Set the following on <codeph>impalad</codeph> coordinators with metdadata
              on-demand:
<codeblock>--use_local_catalog=true </codeblock>
            </dd>

          </dlentry>
          <dlentry>
            <dt>Flags related to <codeph>use_local_catalog</codeph></dt>
            <dd>When <codeph>use_local_catalog</codeph> is enabled or set to <codeph>True</codeph> on the impalad
              coordinators the following list of flags configure various parameters as described below. It is not
              recommended to change the default values on these flags.
            </dd>
            <dd>
              <ul>
                <li>The flag <codeph>local_catalog_cache_mb</codeph> (defaults to -1) configures the
                  size of the catalog cache within each coordinator. With the default set to -1, the
                  cache is auto-configured to 60% of the configured Java heap size. Note that the
                  Java heap size is distinct from and typically smaller than the overall Impala
                  memory limit.</li>
                <li>The flag <codeph>local_catalog_cache_expiration_s</codeph> (defaults to 3600) configures the
                  expiration time of the catalog cache within each impalad. Even if the configured
                  cache capacity has not been reached, items are removed from the cache if they have not
                  been accessed in the defined amount of time.</li>
                <li>The flag <codeph>local_catalog_max_fetch_retries</codeph> (defaults to 40) configures
                  the maximum number of retries needed for queries to fetch a metadata object from the impalad
                  coordinator's local catalog cache.</li>
              </ul>
            </dd>
          </dlentry>
        </dl>
      </p>

    </conbody>

  </concept>

  <concept id="auto_invalidate_metadata">

    <title>Automatic Invalidation of Metadata Cache</title>

    <conbody>

      <p>
        To keep the size of metadata bounded, <codeph>catalogd</codeph> periodically scans all
        the tables and invalidates those not recently used. There are two types of
        configurations for <codeph>catalogd</codeph> and <codeph>impalad</codeph>.
      </p>

      <dl>
        <dlentry>

          <dt>
            Time-based cache invalidation
          </dt>

          <dd>
            <codeph>Catalogd</codeph> invalidates tables that are not recently used in the
            specified time period (in seconds).
          </dd>

          <dd>
            The <codeph>&#8209;&#8209;invalidate_tables_timeout_s</codeph> flag needs to be
            applied to both <codeph>impalad</codeph> and <codeph>catalogd</codeph>.
          </dd>

        </dlentry>

        <dlentry>

          <dt>
            Memory-based cache invalidation
          </dt>

          <dd>
            When the memory pressure reaches 60% of JVM heap size after a Java garbage
            collection in <codeph>catalogd</codeph>, Impala invalidates 10% of the least
            recently used tables.
          </dd>

          <dd>
            The <codeph>&#8209;&#8209;invalidate_tables_on_memory_pressure</codeph> flag needs
            to be applied to both <codeph>impalad</codeph> and <codeph>catalogd</codeph>.
          </dd>

        </dlentry>
      </dl>

      <p>
        Automatic invalidation of metadata provides more stability with lower chances of running
        out of memory, but the feature could potentially cause performance issues and may
        require tuning.
      </p>

    </conbody>

  </concept>

  <concept id="auto_poll_hms_notification">

    <title>Automatic Invalidation/Refresh of Metadata</title>

    <conbody>

      <p>
        When tools such as Hive and Spark are used to process the raw data ingested into Hive
        tables, new HMS metadata (database, tables, partitions) and filesystem metadata (new
        files in existing partitions/tables) is generated. In previous versions of Impala, in
        order to pick up this new information, Impala users needed to manually issue an
        <codeph>INVALIDATE</codeph> or <codeph>REFRESH</codeph> commands.
      </p>

      <p>
        When automatic invalidate/refresh of metadata is enabled, <codeph>catalogd</codeph>
        polls Hive Metastore (HMS) notification events at a configurable interval and processes
        the following changes:
      </p>

      <note>
	      This is a preview feature in <keyword keyref="impala33_full"/> and <keyword keyref="impala40"/>
	      It is generally available and enabled by default from <keyword keyref="impala41"/> onwards.
      </note>

      <ul>
        <li>
          Invalidates the tables when it receives the <codeph>ALTER TABLE</codeph> event.
        </li>

        <li>
          Refreshes the partition when it receives the <codeph>ALTER</codeph>,
          <codeph>ADD</codeph>, or <codeph>DROP</codeph> partitions.
        </li>

        <li>
          Adds the tables or databases when it receives the <codeph>CREATE TABLE</codeph> or
          <codeph>CREATE DATABASE</codeph> events.
        </li>

        <li>
          Removes the tables from <codeph>catalogd</codeph> when it receives the <codeph>DROP
          TABLE</codeph> or <codeph>DROP DATABASE</codeph> events.
        </li>

        <li>
          Refreshes the table and partitions when it receives the <codeph>INSERT</codeph>
          events.
          <p>
            If the table is not loaded at the time of processing the <codeph>INSERT</codeph>
            event, the event processor does not need to refresh the table and skips it.
          </p>
        </li>

        <li>
          Changes the database and updates <codeph>catalogd</codeph> when it receives the
          <codeph>ALTER DATABASE</codeph> events. The following changes are supported. This
          event does not invalidate the tables in the database.
          <ul>
            <li>
              Change the database properties
            </li>

            <li>
              Change the comment on the database
            </li>

            <li>
              Change the owner of the database
            </li>

            <li>
              Change the default location of the database
              <p>
                Changing the default location of the database does not move the tables of that
                database to the new location. Only the new tables which are created subsequently
                use the default location of the database in case it is not provided in the
                create table statement.
              </p>
            </li>
          </ul>
        </li>
      </ul>

      <p>
        This feature is controlled by the
        <codeph>&#8209;&#8209;hms_event_polling_interval_s</codeph> flag. Start the
        <codeph>catalogd</codeph> with the <codeph>‑‑hms_event_polling_interval_s</codeph>
        flag set to a positive double value to enable the feature and set the polling frequency in
        seconds. We recommend the value between 1.0 to 5.0 seconds.
      </p>

      <p>
        The following use cases are not supported:
      </p>

      <ul>
        <li>
          When you bypass HMS and add or remove data into table by adding files directly on the
          filesystem, HMS does not generate the <codeph>INSERT</codeph> event, and the event
          processor will not invalidate the corresponding table or refresh the corresponding
          partition.
          <p>
            It is recommended that you use the <codeph>LOAD DATA</codeph> command to do the data
            load in such cases, so that event processor can act on the events generated by the
            <codeph>LOAD</codeph> command.
          </p>
        </li>

        <li>
          The Spark API that saves data to a specified location does not generate events in HMS,
          thus is not supported. For example:
<codeblock>Seq((1, 2)).toDF("i", "j").write.save("/user/hive/warehouse/spark_etl.db/customers/date=01012019")</codeblock>
        </li>
      </ul>

      <p>
        This feature is turned off by default with the
        <codeph>&#8209;&#8209;hms_event_polling_interval_s</codeph> flag set to
        <codeph>0</codeph>.
      </p>

    </conbody>

    <concept id="configure_event_based_metadata_sync">

      <title>Configure HMS for Event Based Automatic Metadata Sync</title>

      <conbody>

        <p>
          To use the HMS event based metadata sync:
        </p>

        <ol>
          <li>
            Add the following entries to the <codeph>hive-site.xml</codeph> of the Hive
            Metastore service.
<codeblock> &lt;property>
    &lt;name>hive.metastore.transactional.event.listeners&lt;/name>
    &lt;value>org.apache.hive.hcatalog.listener.DbNotificationListener&lt;/value>
  &lt;/property>
  &lt;property>
    &lt;name>hive.metastore.dml.events&lt;/name>
    &lt;value>true&lt;/true>
  &lt;/property></codeblock>
          </li>

          <li>
            Save <codeph>hive-site.xml</codeph>.
          </li>

          <li>
            Set the <codeph>hive.metastore.dml.events</codeph> configuration key to
            <codeph>true</codeph> in HiveServer2 service's <codeph>hive-site.xml</codeph>. This
            configuration key needs to be set to <codeph>true</codeph> in both Hive services,
            HiveServer2 and Hive Metastore.
          </li>

          <li>
            If applicable, set the <codeph>hive.metastore.dml.events</codeph> configuration key
            to <codeph>true</codeph> in <codeph>hive-site.xml</codeph> used by the Spark
            applications (typically, <codeph>/etc/hive/conf/hive-site.xml</codeph>) so that the
            <codeph>INSERT</codeph> events are generated when the Spark application inserts data
            into existing tables and partitions.
          </li>

          <li>
            Restart the HiveServer2, Hive Metastore, and Spark (if applicable) services.
          </li>
        </ol>

      </conbody>

    </concept>

    <concept id="disable_event_based_metadata_sync">

      <title>Disable Event Based Automatic Metadata Sync</title>

      <conbody>

        <p>
          When the <codeph>‑‑hms_event_polling_interval_s</codeph> flag is set to a non-zero
          value for your <codeph>catalogd</codeph>, the event-based automatic invalidation is
          enabled for all databases and tables. If you wish to have the fine-grained control on
          which tables or databases need to be synced using events, you can use the
          <codeph>impala.disableHmsSync</codeph> property to disable the event processing at the
          table or database level.
        </p>

        <p>
          When you add the <codeph>DBPROPERTIES</codeph> or <codeph>TBLPROPERTIES</codeph> with
          the <codeph>impala.disableHmsSync</codeph> key, the HMS event based sync is turned on
          or off. The value of the <codeph>impala.disableHmsSync</codeph> property determines if
          the event processing needs to be disabled for a particular table or database.
        </p>

        <ul>
          <li>
            If <codeph>'impala.disableHmsSync'='true'</codeph>, the events for that table or
            database are ignored and not synced with HMS.
          </li>

          <li>
            If <codeph>'impala.disableHmsSync'='false'</codeph> or if
            <codeph>impala.disableHmsSync</codeph> is not set, the automatic sync with HMS is
            enabled if the <codeph>‑‑hms_event_polling_interval_s</codeph> global flag is
            set to non-zero.
          </li>
        </ul>

        <ul>
          <li>
            To disable the event based HMS sync for a new database, set the
            <codeph>impala.disableHmsSync</codeph> database properties in Hive as currently,
            Impala does not support setting database properties:
<codeblock>CREATE DATABASE &lt;name> WITH DBPROPERTIES ('impala.disableHmsSync'='true');</codeblock>
          </li>

          <li>
            To enable or disable the event based HMS sync for a table:
<codeblock>CREATE TABLE &lt;name> WITH TBLPROPERTIES ('impala.disableHmsSync'='true' | 'false');</codeblock>
          </li>

          <li>
            To change the event based HMS sync at the table level:
<codeblock>ALTER TABLE &lt;name> WITH TBLPROPERTIES ('impala.disableHmsSync'='true' | 'false');</codeblock>
          </li>
        </ul>

        <p>
          When both table and database level properties are set, the table level property takes
          precedence. If the table level property is not set, then the database level property
          is used to evaluate if the event needs to be processed or not.
        </p>

        <p>
          If the property is changed from <codeph>true</codeph> (meaning events are skipped) to
          <codeph>false</codeph> (meaning events are not skipped), you need to issue a manual
          <codeph>INVALIDATE METADATA</codeph> command to reset event processor because it
          doesn't know how many events have been skipped in the past and cannot know if the
          object in the event is the latest. In such a case, the status of the event processor
          changes to <codeph>NEEDS_INVALIDATE</codeph>.
        </p>

      </conbody>

    </concept>

    <concept id="event_processor_metrics">

      <title>Metrics for Event Based Automatic Metadata Sync</title>

      <conbody>

        <p>
          You can use the web UI of the <codeph>catalogd</codeph> to check the state of the
          automatic invalidate event processor.
        </p>

        <p>
          Under the web UI, there are two pages that presents the metrics for HMS event
          processor that is responsible for the event based automatic metadata sync.
          <ul>
            <li>
              <b>/metrics#events</b>
            </li>

            <li>
              <b>/events</b>
              <p>
                This provides a detailed view of the metrics of the event processor, including
                min, max, mean, median, of the durations and rate metrics for all the counters
                listed on the <b>/metrics#events</b> page.
              </p>
            </li>
          </ul>
        </p>

      </conbody>

      <concept id="concept_gch_xzm_1hb">

        <title>/metrics#events Page</title>

        <conbody>

          <p>
            The <b>/metrics#events</b> page provides the following metrics about the HMS event
            processor.
          </p>

          <table id="events-tbl">
            <tgroup cols="2">
              <colspec colnum="1" colname="col1" colwidth="1*"/>
              <colspec colnum="2" colname="col3" colwidth="2.58*"/>
              <thead>
                <row>
                  <entry>
                    Name
                  </entry>
                  <entry>
                    Description
                  </entry>
                </row>
              </thead>
              <tbody>
                <row>
                  <entry>
                    events-processor.avg-events-fetch-duration
                  </entry>
                  <entry>
                    Average duration to fetch a batch of events and process it.
                  </entry>
                </row>
                <row>
                  <entry>
                    events-processor.avg-events-process-duration
                  </entry>
                  <entry>
                    Average time taken to process a batch of events received from the Metastore.
                  </entry>
                </row>
                <row>
                  <entry>
                    events-processor.events-received
                  </entry>
                  <entry>
                    Total number of the Metastore events received.
                  </entry>
                </row>
                <row>
                  <entry>
                    events-processor.events-received-15min-rate
                  </entry>
                  <entry>
                    Exponentially weighted moving average (EWMA) of number of events received in
                    last 15 min.

                    <p>
                      This rate of events can be used to determine if there are spikes in event
                      processor activity during certain hours of the day.
                    </p>
                  </entry>
                </row>
                <row>
                  <entry>
                    events-processor.events-received-1min-rate
                  </entry>
                  <entry>
                    Exponentially weighted moving average (EWMA) of number of events received in
                    last 1 min.

                    <p>
                      This rate of events can be used to determine if there are spikes in event
                      processor activity during certain hours of the day.
                    </p>
                  </entry>
                </row>
                <row>
                  <entry>
                    events-processor.events-received-5min-rate
                  </entry>
                  <entry>
                    Exponentially weighted moving average (EWMA) of number of events received in
                    last 5 min.

                    <p>
                      This rate of events can be used to determine if there are spikes in event
                      processor activity during certain hours of the day.
                    </p>
                  </entry>
                </row>
                <row>
                  <entry>
                    events-processor.events-skipped
                  </entry>
                  <entry>
                    Total number of the Metastore events skipped.

                    <p>
                      Events can be skipped based on certain flags are table and database level.
                      You can use this metric to make decisions, such as:
                      <ul>
                        <li>
                          If most of the events are being skipped, see if you might just turn
                          off the event processing.
                        </li>

                        <li>
                          If most of the events are not skipped, see if you need to add flags on
                          certain databases.
                        </li>
                      </ul>
                    </p>
                  </entry>
                </row>
                <row>
                  <entry>
                    events-processor.outstanding-event-count
                  </entry>
                  <entry>
                    Total number of outstanding events to be processed on db event executors
                    and table event executors when
                    <codeph>--enable_hierarchical_event_processing</codeph> flag is
                    <codeph>true</codeph>.
                  </entry>
                </row>
                <row>
                  <entry>
                    events-processor.status
                  </entry>
                  <entry>
                    Metastore event processor status to see if there are events being received
                    or not. Possible states are:

                    <ul>
                      <li>
                        <codeph>PAUSED</codeph>
                        <p>
                          The event processor is paused because catalog is being reset
                          concurrently.
                        </p>
                      </li>

                      <li>
                        <codeph>ACTIVE</codeph>
                        <p>
                          The event processor is scheduled at a given frequency.
                        </p>
                      </li>

                      <li>
                        <codeph>ERROR</codeph>
                      </li>

                      <li>
                        The event processor is in error state and event processing has stopped.
                        Needs a manual <codeph>INVALIDATE</codeph> command to reset the state.
                      </li>

                      <li>
                        <codeph>NEEDS_INVALIDATE</codeph>
                        <p>
                          The event processor could not resolve certain events and needs a
                          manual <codeph>INVALIDATE</codeph> command to reset the state.
                        </p>
                      </li>

                      <li>
                        <codeph>STOPPED</codeph>
                        <p>
                          The event processing has been shutdown. No events will be processed.
                        </p>
                      </li>

                      <li>
                        <codeph>DISABLED</codeph>
                        <p>
                          The event processor is not configured to run.
                        </p>
                      </li>
                    </ul>
                  </entry>
                </row>
              </tbody>
            </tgroup>
          </table>

        </conbody>

      </concept>

    </concept>

  </concept>

</concept>