mirror of
https://github.com/apache/impala.git
synced 2025-12-19 09:58:28 -05:00
Change-Id: I8927c9cd61f0274ad91111d6ac4a079f7a563197 Reviewed-on: http://gerrit.cloudera.org:8080/21615 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Yida Wu <wydbaggio000@gmail.com> Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
98 lines
5.5 KiB
XML
98 lines
5.5 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
||
<!--
|
||
Licensed to the Apache Software Foundation (ASF) under one
|
||
or more contributor license agreements. See the NOTICE file
|
||
distributed with this work for additional information
|
||
regarding copyright ownership. The ASF licenses this file
|
||
to you under the Apache License, Version 2.0 (the
|
||
"License"); you may not use this file except in compliance
|
||
with the License. You may obtain a copy of the License at
|
||
|
||
http://www.apache.org/licenses/LICENSE-2.0
|
||
|
||
Unless required by applicable law or agreed to in writing,
|
||
software distributed under the License is distributed on an
|
||
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||
KIND, either express or implied. See the License for the
|
||
specific language governing permissions and limitations
|
||
under the License.
|
||
-->
|
||
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
|
||
<concept id="impala_ha_statestore">
|
||
<title>Configuring StateStore for High Availability</title>
|
||
<prolog>
|
||
<metadata>
|
||
<data name="Category" value="Impala"/>
|
||
<data name="Category" value="Administrators"/>
|
||
<data name="Category" value="High Availability"/>
|
||
<data name="Category" value="Configuring"/>
|
||
<data name="Category" value="Data Analysts"/>
|
||
<data name="Category" value="StateStore"/>
|
||
</metadata>
|
||
</prolog>
|
||
<conbody>
|
||
<p>With a pair of StateStore instances in primary/standby mode, the primary StateStore instance
|
||
will send the cluster's state update and propagate metadata updates. It periodically sends
|
||
heartbeat to the standby StateStore instance, CatalogD, coordinators and executors. The standby
|
||
StateStore instance also sends heartbeats to the CatalogD, and coordinators and executors. RPC
|
||
connections between daemons and StateStore instances are kept alive so that broken connections
|
||
usually won't result in false failure reports between nodes. The standby StateStore instance
|
||
takes over the primary role when the service is needed in order to continue to operate when
|
||
the primary instance goes down.</p>
|
||
</conbody>
|
||
<concept id="enabling_statestore_ha">
|
||
<title>Enabling StateStore High Availability</title>
|
||
<conbody>
|
||
<p>To enable StateStore High Availability (HA) in an Impala cluster, follow these steps:<ol
|
||
id="ol_k2p_zxm_1cc">
|
||
<li>Restart two StateStore instances with the following additional
|
||
flags:<codeblock id="codeblock_tmp_bym_1cc">enable_statestored_ha: true
|
||
state_store_ha_port: RPC port for StateStore HA service (default: 24020)
|
||
state_store_peer_host: Hostname of the peer StateStore instance
|
||
state_store_peer_ha_port: RPC port of high availability service on the peer StateStore instance (default: 24020)
|
||
</codeblock></li>
|
||
<li>Restart all subscribers (including CatalogD, coordinators, and executors) with the
|
||
following additional
|
||
flags:<codeblock id="codeblock_zyl_lmh_fcc">state_store_host: Hostname of the first StateStore instance
|
||
state_store_port: RPC port for StateStore registration on the first StateStore instance (default: 24000)
|
||
enable_statestored_ha: true
|
||
state_store_2_host: Hostname of the second StateStore instance
|
||
state_store_2_port: RPC port for StateStore registration on the second StateStore instance (default: 24000)</codeblock></li>
|
||
</ol></p>
|
||
<p>By setting these flags, the Impala cluster is configured to use two StateStore instances for
|
||
high availability, ensuring high availability and fault tolerance.</p>
|
||
</conbody>
|
||
</concept>
|
||
<concept id="disabling_statestore_ha">
|
||
<title>Disabling StateStore High Availability</title>
|
||
<conbody>
|
||
<p>To disable StateStore high availability in an Impala cluster, follow these steps:<ol
|
||
id="ol_udj_b1n_1cc">
|
||
<li>Remove one StateStore instance from the Impala cluster.</li>
|
||
<li>Restart the remaining StateStore instance along with the coordinator, executor, and
|
||
CatalogD nodes, ensuring they are restarted without the
|
||
<codeph>enable_statestored_ha</codeph> flag.</li>
|
||
</ol></p>
|
||
</conbody>
|
||
</concept>
|
||
<concept id="statestore_failure_detection">
|
||
<title>StateStore Failure Detection</title>
|
||
<conbody>
|
||
<p>The primary StateStore instance continuously sends heartbeat to its registered clients, and
|
||
the standby StateStore instance. Each StateStore client registers to both active and standby
|
||
StateStore instances, and maintains the following information about the StateStore servers:
|
||
the server IP and port, service role - primary/standby, the last time the heartbeat request
|
||
was received, or number of missed heartbeats. A missing heartbeat response from the
|
||
StateStor’s client indicates an unhealthy daemon. There is a flag that defines
|
||
<codeph>MAX_MISSED_HEARTBEAT_REQUEST_NUM </codeph>as the consecutive number of missed
|
||
heartbeat requests to indicate losing communication with the StateStore server from the
|
||
client's point of view so that the client marks the StateStore server as down. Standby
|
||
StateStore instance collects the connection states between the clients (CatalogD,
|
||
coordinators and executors) and primary StateStore instance in its heartbeat messages to the
|
||
clients. If the standby StateStore instance misses <codeph>MAX_MISSED_HEARTBEAT_REQUEST_NUM
|
||
</codeph>of heartbeat requests from the primary StateStore instance and the majority of
|
||
clients lose connections with the primary StateStore, it takes over the primary role.</p>
|
||
</conbody>
|
||
</concept>
|
||
</concept>
|