IMPALA-13142: [DOCS] Documentation for Impala StateStore & Catalogd HA

Change-Id: I8927c9cd61f0274ad91111d6ac4a079f7a563197
Reviewed-on: http://gerrit.cloudera.org:8080/21615
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Yida Wu <wydbaggio000@gmail.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
m-sanjana19
2024-07-25 15:43:18 +05:30
committed by Wenzhe Zhou
parent f5a15ad605
commit db6ead8136
4 changed files with 252 additions and 2 deletions


@@ -64,6 +64,10 @@ under the License.
<topicref href="topics/impala_tutorial.xml"/>
<topicref href="topics/impala_admin.xml">
<topicref href="topics/impala_timeouts.xml"/>
<topicref href="topics/impala_ha.xml">
<topicref href="topics/impala_ha_statestore.xml"/>
<topicref href="topics/impala_ha_catalog.xml"/>
</topicref>
<topicref href="topics/impala_proxy.xml"/>
<topicref href="topics/impala_disk_space.xml"/>
<topicref audience="integrated" href="topics/impala_auditing.xml"/>
@@ -361,8 +365,7 @@ under the License.
<!-- End of former contents of Installing-and-Using-Impala_xi42979.ditamap. -->
<topicref audience="standalone" href="topics/impala_faq.xml"/>
<topicref audience="standalone" href="topics/impala_release_notes.xml">
<mapref href="impala_release_notes.ditamap" format="ditamap"
audience="standalone"/>
<mapref href="impala_release_notes.ditamap" format="ditamap" audience="standalone"/>
</topicref>
<!-- Substitution variables and link destinations abstracted

docs/topics/impala_ha.xml Normal file

@@ -0,0 +1,50 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="impala_ha">
<title>Configuring Impala for High Availability</title>
<prolog>
<metadata>
<data name="Category" value="Impala"/>
<data name="Category" value="Administrators"/>
<data name="Category" value="High Availability"/>
<data name="Category" value="Configuring"/>
<data name="Category" value="Data Analysts"/>
</metadata>
</prolog>
<conbody>
    <p>The Impala StateStore checks on the health of all Impala daemons in a cluster and
      continuously relays its findings to each of them. The Catalog stores the metadata of
      databases, tables, partitions, resource usage information, configuration settings, and
      other objects managed by Impala. Running the StateStore and Catalog daemons as single
      instances creates a single point of failure in an Impala cluster. Although coordinators
      and executors continue to execute queries while the StateStore node is down, they no
      longer receive state updates, which degrades admission control and cluster membership
      updates. To mitigate this, you can deploy a pair of StateStore instances and a pair of
      Catalog instances so that the cluster can survive the failure of a StateStore or Catalog
      instance.</p>
<p><b>Prerequisite:</b></p>
    <p>To enable high availability for the Impala CatalogD or StateStore, you must configure at
      least two CatalogD instances or at least two StateStore instances, with each pair on two
      different nodes.<note id="note_dg1_qhh_fcc" type="note">CatalogD HA and StateStore HA are
      independent features. You can enable CatalogD HA, StateStore HA, or both.</note></p>
<p outputclass="toc"/>
</conbody>
</concept>

docs/topics/impala_ha_catalog.xml Normal file

@@ -0,0 +1,100 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="impala_ha_catalog">
<title>Configuring Catalog for High Availability</title>
<prolog>
<metadata>
<data name="Category" value="Impala"/>
<data name="Category" value="Administrators"/>
<data name="Category" value="High Availability"/>
<data name="Category" value="Configuring"/>
<data name="Category" value="Data Analysts"/>
<data name="Category" value="Catalog"/>
</metadata>
</prolog>
<conbody>
    <p>For each new query request, the Impala coordinator sends metadata requests to the
      Catalog service, and sends metadata updates to the Catalog, which in turn propagates them
      to the Hive Metastore. With a pair of Catalog instances in primary/standby mode, the
      active CatalogD acts as the source of metadata and provides the Catalog service for the
      Impala cluster. When the primary instance goes down, the standby instance is promoted to
      primary so that the Catalog service continues, reducing the outage duration of the Impala
      cluster. To support Catalog HA, you can now add two CatalogD instances to an Impala
      cluster as an active-passive high availability pair.</p>
</conbody>
<concept id="enabling_catalog_ha">
<title>Enabling Catalog High Availability</title>
<conbody>
      <p>To enable Catalog high availability in an Impala cluster:<ul
          id="ul_pj4_ycn_1cc">
          <li>Set the startup flag <codeph>enable_catalogd_ha</codeph> to <codeph>true</codeph>
            for both CatalogD instances and for the StateStore.</li>
        </ul></p>
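      <p>As an illustration only (hostnames, ordering of other flags, and the exact deployment
        mechanism depend on your environment), the flag can be passed on the command line when
        starting the daemons:</p>
      <codeblock># On each of the two CatalogD nodes:
catalogd --enable_catalogd_ha=true ...

# On the StateStore node:
statestored --enable_catalogd_ha=true ...</codeblock>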
<p>The active StateStore will assign roles to the CatalogD instances, designating one as the
active CatalogD and the other as the standby CatalogD. The active CatalogD acts as the
source of metadata and provides Catalog services for the Impala cluster.</p>
</conbody>
</concept>
<concept id="disabling_catalog_ha">
<title>Disabling Catalog High Availability</title>
<conbody>
<p>To disable Catalog high availability in an Impala cluster, follow these steps:<ol
id="ol_udj_b1n_1cc">
<li>Remove one CatalogD instance from the Impala cluster.</li>
          <li>Restart the remaining CatalogD instance without the startup flag
            <codeph>enable_catalogd_ha</codeph>.</li>
          <li>Restart the StateStore without the startup flag
            <codeph>enable_catalogd_ha</codeph>.</li>
</ol></p>
</conbody>
</concept>
<concept id="monitoring_status_ha">
<title>Monitoring Catalog HA Status in the StateStore Web Page</title>
<conbody>
<p>A new web page <codeph>/catalog_ha_info</codeph> has been added to the StateStore debug web
server. This page displays the Catalog HA status, including:</p>
<ul id="ha_status_web">
<li>Active Catalog Node</li>
<li>Standby Catalog Node</li>
<li>Notified Subscribers Table</li>
</ul>
<p>To access this information, navigate to the <codeph>/catalog_ha_info</codeph> page on the
StateStore debug web server.</p>
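      <p>For example, assuming the StateStore debug web server is running on its default port
        25010 (the hostname below is a placeholder), the page is reachable at a URL of the
        form:</p>
      <codeblock>http://statestore-host:25010/catalog_ha_info</codeblock>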
</conbody>
</concept>
<concept id="catalog_failure_detection">
<title>Catalog Failure Detection</title>
<conbody>
      <p>The StateStore instance continuously sends heartbeats to its registered clients,
        including the primary and standby Catalog instances, to determine whether each Impala
        daemon in the cluster is healthy. If the StateStore finds that the primary Catalog
        instance is unhealthy while the standby Catalog instance is healthy, it promotes the
        standby instance to primary and notifies all coordinators of the change. The
        coordinators then switch over to the new primary Catalog instance.</p>
<p>When the system detects that the active CatalogD is unhealthy, it initiates a failover to
the standby CatalogD. During this brief transition, some nodes might not immediately
recognize the new active CatalogD, causing currently running queries to fail due to lack of
access to metadata. These failed queries need to be rerun after the failover is complete and
the new active CatalogD is operational.</p>
</conbody>
</concept>
</concept>

docs/topics/impala_ha_statestore.xml Normal file

@@ -0,0 +1,97 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="impala_ha_statestore">
<title>Configuring StateStore for High Availability</title>
<prolog>
<metadata>
<data name="Category" value="Impala"/>
<data name="Category" value="Administrators"/>
<data name="Category" value="High Availability"/>
<data name="Category" value="Configuring"/>
<data name="Category" value="Data Analysts"/>
<data name="Category" value="StateStore"/>
</metadata>
</prolog>
<conbody>
    <p>With a pair of StateStore instances in primary/standby mode, the primary StateStore
      instance distributes the cluster's state and propagates metadata updates. It periodically
      sends heartbeats to the standby StateStore instance, the CatalogD, and the coordinators
      and executors. The standby StateStore instance also sends heartbeats to the CatalogD,
      coordinators, and executors. RPC connections between the daemons and the StateStore
      instances are kept alive, so transient broken connections usually do not result in false
      failure reports between nodes. When the primary instance goes down, the standby
      StateStore instance takes over the primary role so that the service continues to
      operate.</p>
</conbody>
<concept id="enabling_statestore_ha">
<title>Enabling StateStore High Availability</title>
<conbody>
<p>To enable StateStore High Availability (HA) in an Impala cluster, follow these steps:<ol
id="ol_k2p_zxm_1cc">
<li>Restart two StateStore instances with the following additional
flags:<codeblock id="codeblock_tmp_bym_1cc">enable_statestored_ha: true
state_store_ha_port: RPC port for StateStore HA service (default: 24020)
state_store_peer_host: Hostname of the peer StateStore instance
state_store_peer_ha_port: RPC port of high availability service on the peer StateStore instance (default: 24020)
</codeblock></li>
<li>Restart all subscribers (including CatalogD, coordinators, and executors) with the
following additional
flags:<codeblock id="codeblock_zyl_lmh_fcc">state_store_host: Hostname of the first StateStore instance
state_store_port: RPC port for StateStore registration on the first StateStore instance (default: 24000)
enable_statestored_ha: true
state_store_2_host: Hostname of the second StateStore instance
state_store_2_port: RPC port for StateStore registration on the second StateStore instance (default: 24000)</codeblock></li>
</ol></p>
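      <p>As an illustration, with the two StateStore instances on hosts <codeph>host1</codeph>
        and <codeph>host2</codeph> (placeholder names) and the default ports from the flag
        descriptions above, the startup command lines might look like:</p>
      <codeblock># On host1:
statestored --enable_statestored_ha=true --state_store_ha_port=24020 \
    --state_store_peer_host=host2 --state_store_peer_ha_port=24020

# On host2:
statestored --enable_statestored_ha=true --state_store_ha_port=24020 \
    --state_store_peer_host=host1 --state_store_peer_ha_port=24020

# On every subscriber (CatalogD, coordinators, executors), for example:
impalad --enable_statestored_ha=true \
    --state_store_host=host1 --state_store_port=24000 \
    --state_store_2_host=host2 --state_store_2_port=24000</codeblock>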
      <p>With these flags set, the Impala cluster uses two StateStore instances, providing
        fault tolerance if one of them fails.</p>
</conbody>
</concept>
<concept id="disabling_statestore_ha">
<title>Disabling StateStore High Availability</title>
<conbody>
<p>To disable StateStore high availability in an Impala cluster, follow these steps:<ol
id="ol_udj_b1n_1cc">
<li>Remove one StateStore instance from the Impala cluster.</li>
<li>Restart the remaining StateStore instance along with the coordinator, executor, and
CatalogD nodes, ensuring they are restarted without the
<codeph>enable_statestored_ha</codeph> flag.</li>
</ol></p>
</conbody>
</concept>
<concept id="statestore_failure_detection">
<title>StateStore Failure Detection</title>
<conbody>
      <p>The primary StateStore instance continuously sends heartbeats to its registered
        clients and to the standby StateStore instance. Each StateStore client registers with
        both the primary and standby StateStore instances and maintains the following
        information about each StateStore server: its IP address and port, its service role
        (primary or standby), the time the last heartbeat request was received, and the number
        of missed heartbeats. A missing heartbeat response from a client indicates an unhealthy
        daemon. The flag <codeph>MAX_MISSED_HEARTBEAT_REQUEST_NUM</codeph> defines the number
        of consecutive missed heartbeat requests after which a client considers communication
        with a StateStore server lost and marks that server as down. Through its heartbeat
        messages to the clients (CatalogD, coordinators, and executors), the standby StateStore
        instance collects the state of their connections to the primary StateStore instance. If
        the standby instance misses <codeph>MAX_MISSED_HEARTBEAT_REQUEST_NUM</codeph>
        consecutive heartbeat requests from the primary StateStore instance and a majority of
        clients have lost their connections to the primary, the standby takes over the primary
        role.</p>
</conbody>
</concept>
</concept>