<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="proxy">

  <title>Using Impala through a Proxy for High Availability</title>

  <titlealts audience="PDF">

    <navtitle>Load-Balancing Proxy for HA</navtitle>

  </titlealts>

  <prolog>
    <metadata>
      <data name="Category" value="High Availability"/>
      <data name="Category" value="Impala"/>
      <data name="Category" value="Network"/>
      <data name="Category" value="Proxy"/>
      <data name="Category" value="Administrators"/>
      <data name="Category" value="Developers"/>
      <data name="Category" value="Data Analysts"/>
    </metadata>
  </prolog>

  <conbody>

    <p>
      For most clusters that have multiple users and production availability requirements, you
      might want to set up a load-balancing proxy server to relay requests to and from Impala.
    </p>

    <p>
      Set up a software package of your choice to perform these functions.
    </p>

    <note>
      <p conref="../shared/impala_common.xml#common/statestored_catalogd_ha_blurb"/>
    </note>

    <p outputclass="toc inpage"/>

  </conbody>

  <concept id="proxy_overview">

    <title>Overview of Proxy Usage and Load Balancing for Impala</title>

    <prolog>
      <metadata>
        <data name="Category" value="Concepts"/>
      </metadata>
    </prolog>

    <conbody>

      <p>
        Using a load-balancing proxy server for Impala has the following advantages:
      </p>

      <ul>
        <li>
          Applications connect to a single well-known host and port, rather than keeping track
          of the hosts where the <cmdname>impalad</cmdname> daemon is running.
        </li>

        <li>
          If any host running the <cmdname>impalad</cmdname> daemon becomes unavailable,
          application connection requests still succeed because you always connect to the proxy
          server rather than a specific host running the <cmdname>impalad</cmdname> daemon.
        </li>

        <li>
          The coordinator node for each Impala query potentially requires more memory and CPU
          cycles than the other nodes that process the query. The proxy server can route each
          connection to a different coordinator node. This load-balancing technique lets the
          <cmdname>impalad</cmdname> nodes share the additional coordinator work, rather than
          concentrating it on a single machine.
        </li>
      </ul>

      <p>
        The following setup steps are a general outline that applies to any load-balancing
        proxy software:
      </p>

      <ol>
        <li>
          Select and download the load-balancing proxy software or other load-balancing
          hardware appliance. It only needs to be installed and configured on a single host,
          typically on an edge node.
        </li>

        <li>
          Configure the load balancer (typically by editing a configuration file). In
          particular:
          <ul>
            <li>
              To relay Impala requests back and forth, set up a port that the load balancer
              will listen on.
            </li>

            <li>
              Select a load-balancing algorithm. See
              <xref href="#proxy_balancing" format="dita"/> for load-balancing algorithm
              options.
            </li>

            <li>
              For Kerberized clusters, follow the instructions in
              <xref href="impala_proxy.xml#proxy_kerberos"/>.
            </li>
          </ul>
        </li>

        <li>
          If you are using Hue or JDBC-based applications, you typically set up load balancing
          for both ports 21000 and 21050 because these client applications connect through port
          21050 while the <cmdname>impala-shell</cmdname> command connects through port 21000.
          See <xref href="impala_ports.xml#ports"/> for when to use port 21000, 21050, or
          another value depending on what type of connections you are load balancing.
        </li>

        <li>
          Run the load-balancing proxy server, pointing it at the configuration file that you
          set up.
        </li>

        <li>
          For any scripts, jobs, or configuration settings for applications that formerly
          connected to a specific <cmdname>impalad</cmdname> to run Impala SQL statements,
          change the connection information (such as the <codeph>-i</codeph> option in
          <cmdname>impala-shell</cmdname>) to point to the load balancer instead, as shown in
          the example following these steps.
        </li>
      </ol>
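
      <p>
        For example, a job that previously ran <cmdname>impala-shell</cmdname> against a
        specific coordinator might be changed to connect through the load balancer instead. The
        host names and script file in this sketch are placeholders; port 25003 matches the
        listener port used in the HAProxy sample configuration later in this topic:
      </p>

<codeblock># Before: connects to one specific coordinator.
impala-shell -i impala-host-1.example.com:21000 -f nightly_report.sql

# After: connects through the load-balancing proxy.
impala-shell -i proxy-host.example.com:25003 -f nightly_report.sql</codeblock>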

      <note>
        The following sections use the HAProxy software as a representative example of a load
        balancer that you can use with Impala.
      </note>

    </conbody>

  </concept>

<concept id="proxy_balancing" rev="">

    <title>Choosing the Load-Balancing Algorithm</title>

    <conbody>

      <p>
        Load-balancing software offers a number of algorithms to distribute requests. Each
        algorithm has its own characteristics that make it suitable in some situations but not
        others.
      </p>

      <dl>
        <dlentry>

          <dt>
            Leastconn
          </dt>

          <dd>
            Connects sessions to the coordinator with the fewest connections, to balance the
            load evenly. Typically used for workloads consisting of many independent,
            short-running queries. In configurations with only a few client machines, this
            setting can avoid having all requests go to only a small set of coordinators.
          </dd>

          <dd>
            Recommended for Impala with F5.
          </dd>

        </dlentry>

        <dlentry>

          <dt>
            Source IP Persistence
          </dt>

          <dd>
            <p>
              Sessions from the same IP address always go to the same coordinator. A good
              choice for Impala workloads containing a mix of queries and DDL statements, such
              as <codeph>CREATE TABLE</codeph> and <codeph>ALTER TABLE</codeph>. Because the
              metadata changes from a DDL statement take time to propagate across the cluster,
              prefer Source IP Persistence in this case. If you are unable to choose Source IP
              Persistence, run the DDL and subsequent queries that depend on the results of the
              DDL through the same session, for example by running
              <codeph>impala-shell -f <varname>script_file</varname></codeph> to submit several
              statements through a single session.
            </p>
          </dd>

          <dd>
            <p>
              Required for setting up high availability with Hue.
            </p>
          </dd>

        </dlentry>

        <dlentry>

          <dt>
            Round-robin
          </dt>

          <dd>
            Distributes connections to all coordinator nodes. Typically not recommended for
            Impala.
          </dd>

        </dlentry>
      </dl>

      <p>
        You might need to perform benchmarks and load testing to determine which setting is
        optimal for your use case. Always set up two load-balancing algorithms: Source IP
        Persistence for Hue and Leastconn for other clients.
      </p>
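
      <p>
        With HAProxy, which the rest of this topic uses as a representative example, this
        typically means two <codeph>listen</codeph> sections with different
        <codeph>balance</codeph> directives. The ports and host names in this abbreviated
        sketch are placeholders; the full sample configuration appears in
        <xref href="#tut_proxy" format="dita"/>:
      </p>

<codeblock># Port for impala-shell and other non-Hue clients: spread sessions evenly.
listen impala :25003
    mode tcp
    balance leastconn
    server coord1 impala-host-1.example.com:21000 check
    server coord2 impala-host-2.example.com:21000 check

# Port for Hue and other JDBC clients: sticky sessions by client IP address.
listen impalajdbc :21051
    mode tcp
    balance source
    server coord1 impala-host-1.example.com:21050 check
    server coord2 impala-host-2.example.com:21050 check</codeblock>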

    </conbody>

  </concept>

<concept id="proxy_kerberos">

    <title>Special Proxy Considerations for Clusters Using Kerberos</title>

    <prolog>
      <metadata>
        <data name="Category" value="Security"/>
        <data name="Category" value="Kerberos"/>
        <data name="Category" value="Authentication"/>
        <data name="Category" value="Proxy"/>
      </metadata>
    </prolog>

    <conbody>

      <p>
        In a cluster using Kerberos, applications check host credentials to verify that the
        host they are connecting to is the same one that is actually processing the request.
      </p>

      <p>
        In <keyword keyref="impala211_full">Impala 2.11</keyword> and lower versions, once you
        enable a proxy server in a Kerberized cluster, users cannot connect to individual
        Impala daemons directly from <cmdname>impala-shell</cmdname>.
      </p>

      <p>
        In <keyword keyref="impala212_full">Impala 2.12</keyword> and higher versions, when you
        enable a proxy server in a Kerberized cluster, users have the option to connect to
        Impala daemons directly from <cmdname>impala-shell</cmdname> using the
        <codeph>-b</codeph> / <codeph>--kerberos_host_fqdn</codeph>
        <cmdname>impala-shell</cmdname> flag. This option can be used for testing or
        troubleshooting purposes, but is not recommended for live production environments
        because it defeats the purpose of a load balancer/proxy.
      </p>

      <p>
        Example:
<codeblock>impala-shell -i impalad-1.mydomain.com -k -b loadbalancer-1.mydomain.com</codeblock>
      </p>

      <p>
        Alternatively, with the fully spelled-out option names:
<codeblock>impala-shell --impalad=impalad-1.mydomain.com:21000 --kerberos --kerberos_host_fqdn=loadbalancer-1.mydomain.com</codeblock>
      </p>

      <p>
        See <xref href="impala_shell_options.xml#shell_options"/> for information about the
        option.
      </p>

      <p>
        To configure Kerberos authentication to work through the load-balancing proxy server,
        perform these extra Kerberos setup steps:
      </p>

      <ol>
        <li>
          This section assumes you are starting with a Kerberos-enabled cluster. See
          <xref href="impala_kerberos.xml#kerberos"/> for instructions for setting up Impala
          with Kerberos. See <xref keyref="cdh_sg_kerberos_prin_keytab_deploy"/> for general
          steps to set up Kerberos.
        </li>

        <li>
          Choose the host you will use for the proxy server. Based on the Kerberos setup
          procedure, it should already have an entry
          <codeph>impala/<varname>proxy_host</varname>@<varname>realm</varname></codeph> in its
          <filepath>keytab</filepath>. If not, go back over the initial Kerberos configuration
          steps for the <filepath>keytab</filepath> on each host running the
          <cmdname>impalad</cmdname> daemon.
        </li>

        <li>
          Copy the <filepath>keytab</filepath> file from the proxy host to all other hosts in
          the cluster that run the <cmdname>impalad</cmdname> daemon. Put the
          <filepath>keytab</filepath> file in a secure location on each of these other hosts.
        </li>

        <li>
          Add an entry
          <codeph>impala/<varname>actual_hostname</varname>@<varname>realm</varname></codeph>
          to the <filepath>keytab</filepath> on each host running the
          <cmdname>impalad</cmdname> daemon.
        </li>

        <li>
          For each <cmdname>impalad</cmdname> node, merge the existing
          <filepath>keytab</filepath> with the proxy's <filepath>keytab</filepath> using
          <cmdname>ktutil</cmdname>, producing a new <filepath>keytab</filepath> file. For
          example:
<codeblock>$ ktutil
ktutil: read_kt proxy.keytab
ktutil: read_kt impala.keytab
ktutil: write_kt proxy_impala.keytab
ktutil: quit</codeblock>
        </li>

        <li>
          To verify that the <filepath>keytabs</filepath> are merged, run the command:
<codeblock>klist -k <varname>keytabfile</varname></codeblock>
          The command lists the credentials for both <codeph>principal</codeph> and
          <codeph>be_principal</codeph> on all nodes.
        </li>

        <li>
          Make sure that the <codeph>impala</codeph> user has permission to read this merged
          <filepath>keytab</filepath> file.
        </li>

        <li>
          For each coordinator <codeph>impalad</codeph> host in the cluster that participates
          in the load balancing, add the following configuration options to receive client
          connections coming through the load balancer proxy server:
<codeblock>--principal=impala/<varname>proxy_host@realm</varname>
--be_principal=impala/<varname>actual_host@realm</varname>
--keytab_file=<varname>path_to_merged_keytab</varname></codeblock>
          <p>
            The <codeph>--principal</codeph> setting prevents a client from connecting to a
            coordinator <codeph>impalad</codeph> using a principal other than the one
            specified. A filled-in example of these settings follows this list.
          </p>

          <note>
            Every host has a different <codeph>--be_principal</codeph> value because the actual
            host name is different on each host. Specify the fully qualified domain name (FQDN)
            for the proxy host, not the IP address. Use the exact FQDN as returned by a reverse
            DNS lookup for the associated IP address.
          </note>
        </li>

        <li>
          Restart Impala to make the changes take effect. Restart the
          <cmdname>impalad</cmdname> daemons on all hosts in the cluster, as well as the
          <cmdname>statestored</cmdname> and <cmdname>catalogd</cmdname> daemons.
        </li>
      </ol>
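
      <p>
        As an illustration only, with a hypothetical realm <codeph>EXAMPLE.COM</codeph>, proxy
        host <codeph>proxy.example.com</codeph>, coordinator host
        <codeph>impala-host-1.example.com</codeph>, and keytab path, the settings on that
        coordinator might look like the following. <codeph>--principal</codeph> is the same on
        every coordinator, while <codeph>--be_principal</codeph> names the local host:
      </p>

<codeblock>--principal=impala/proxy.example.com@EXAMPLE.COM
--be_principal=impala/impala-host-1.example.com@EXAMPLE.COM
--keytab_file=/etc/impala/conf/proxy_impala.keytab</codeblock>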

      <section id="section_fjz_mfn_yjb">

        <title>Client Connection to Proxy Server in Kerberized Clusters</title>

        <p>
          When a client connects to Impala, the service principal specified by the client must
          match the <codeph>--principal</codeph> setting of the Impala proxy server. The client
          must also connect to the proxy server port.
        </p>

        <p>
          In <filepath>hue.ini</filepath>, set the following to configure Hue to automatically
          connect to the proxy server:
        </p>

<codeblock>[impala]
server_host=<varname>proxy_host</varname>
impala_principal=impala/<varname>proxy_host</varname></codeblock>

        <p>
          The following are the JDBC connection string formats when connecting through the
          load balancer with the load balancer's host name in the principal:
        </p>

<codeblock>jdbc:hive2://<varname>proxy_host</varname>:<varname>load_balancer_port</varname>/;principal=impala/_HOST@<varname>realm</varname>
jdbc:hive2://<varname>proxy_host</varname>:<varname>load_balancer_port</varname>/;principal=impala/<varname>proxy_host</varname>@<varname>realm</varname></codeblock>

        <p>
          When starting <cmdname>impala-shell</cmdname>, specify the service principal via the
          <codeph>-b</codeph> or <codeph>--kerberos_host_fqdn</codeph> flag.
        </p>

      </section>

    </conbody>

  </concept>

<concept id="proxy_tls">

    <title>Special Proxy Considerations for TLS/SSL Enabled Clusters</title>

    <prolog>
      <metadata>
        <data name="Category" value="Security"/>
        <data name="Category" value="TLS"/>
        <data name="Category" value="Authentication"/>
        <data name="Category" value="Proxy"/>
      </metadata>
    </prolog>

    <conbody>

      <p>
        When TLS/SSL is enabled for Impala, the client application, whether
        <cmdname>impala-shell</cmdname>, Hue, or something else, expects the certificate common
        name (CN) to match the hostname that it is connected to. With no load-balancing proxy
        server, the hostname and certificate CN are both that of the <codeph>impalad</codeph>
        instance. However, with a proxy server, the certificate presented by the
        <codeph>impalad</codeph> instance does not match the load-balancing proxy server
        hostname. If you try to load-balance a TLS/SSL-enabled Impala installation without
        additional configuration, you see a certificate mismatch error when a client attempts
        to connect to the load-balancing proxy host.
      </p>

      <p>
        You can configure a proxy server in several ways to load balance a TLS/SSL-enabled
        Impala installation:
      </p>

      <dl>
        <dlentry>

          <dt>
            TLS/SSL Bridging
          </dt>

          <dd>
            In this configuration, the proxy server presents a TLS/SSL certificate to the
            client, decrypts the client request, then re-encrypts the request before sending it
            to the backend <codeph>impalad</codeph>. The client and server certificates can be
            managed separately. The request or resulting payload is encrypted in transit at all
            times. An illustrative sketch of this option follows this list.
          </dd>

        </dlentry>

        <dlentry>

          <dt>
            TLS/SSL Passthrough
          </dt>

          <dd>
            In this configuration, traffic passes through to the backend
            <codeph>impalad</codeph> instance with no interaction from the load-balancing proxy
            server. Traffic is still encrypted end-to-end.
          </dd>

          <dd>
            The same server certificate, utilizing either wildcard or Subject Alternative Name
            (SAN), must be installed on each <codeph>impalad</codeph> instance.
          </dd>

        </dlentry>

        <dlentry>

          <dt>
            TLS/SSL Offload
          </dt>

          <dd>
            In this configuration, all traffic is decrypted on the load-balancing proxy server,
            and traffic between the proxy server and the backend <codeph>impalad</codeph>
            instances is unencrypted. This configuration presumes that cluster hosts reside on
            a trusted network and that only external client-facing communication needs to be
            encrypted in transit.
          </dd>

        </dlentry>
      </dl>
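
      <p>
        As a rough illustration of the bridging option with HAProxy, the frontend terminates
        TLS with the proxy's own certificate and the backend re-encrypts traffic to the
        coordinators. The port, certificate paths, and host names below are placeholders, and
        your load balancer's directives and verification options may differ:
      </p>

<codeblock>frontend impala_tls
    mode tcp
    # Certificate presented to clients; its CN/SAN should match the proxy host name.
    bind *:25003 ssl crt /etc/haproxy/certs/proxy.pem
    default_backend impala_coordinators

backend impala_coordinators
    mode tcp
    balance leastconn
    # Re-encrypt to each coordinator and verify its certificate against a CA bundle.
    server coord1 impala-host-1.example.com:21000 check ssl verify required ca-file /etc/haproxy/certs/ca.pem
    server coord2 impala-host-2.example.com:21000 check ssl verify required ca-file /etc/haproxy/certs/ca.pem</codeblock>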

      <p>
        Refer to your load balancer documentation for the steps to set up Impala and the load
        balancer using one of the options above.
      </p>

    </conbody>

  </concept>

<concept id="tut_proxy">

    <title>Example of Configuring HAProxy Load Balancer for Impala</title>

    <prolog>
      <metadata>
        <data name="Category" value="Configuring"/>
      </metadata>
    </prolog>

    <conbody>

      <p>
        If you are not already using a load-balancing proxy, you can experiment with
        <xref href="http://haproxy.1wt.eu/" scope="external" format="html">HAProxy</xref>, a
        free, open source load balancer. This example shows how you might install and
        configure that load balancer on a Red Hat Enterprise Linux system.
      </p>

      <ul>
        <li>
          <p>
            Install the load balancer:
          </p>
<codeblock>yum install haproxy</codeblock>
        </li>

        <li>
          <p>
            Set up the configuration file: <filepath>/etc/haproxy/haproxy.cfg</filepath>. See
            the following section for a sample configuration file.
          </p>
        </li>

        <li>
          <p>
            Run the load balancer (on a single host, preferably one not running
            <cmdname>impalad</cmdname>):
          </p>
<codeblock>/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg</codeblock>
        </li>

        <li>
          <p>
            In <cmdname>impala-shell</cmdname>, JDBC applications, or ODBC applications,
            connect to the listener port of the proxy host, rather than port 21000 or 21050 on
            a host actually running <cmdname>impalad</cmdname>. The sample configuration file
            sets haproxy to listen on port 25003, so you would send all requests to
            <codeph><varname>haproxy_host</varname>:25003</codeph>.
          </p>
        </li>
      </ul>

      <p>
        This is the sample <filepath>haproxy.cfg</filepath> used in this example:
      </p>

<codeblock>global
    # To have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events. This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.*    /var/log/haproxy.log
    #
    log         127.0.0.1 local0
    log         127.0.0.1 local1 notice
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    #stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#
# You might need to adjust timing values to prevent timeouts.
#
# The timeout values should be dependent on how you use the cluster
# and how long your queries run.
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option                  http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    maxconn                 3000
    timeout connect         5000
    timeout client          3600s
    timeout server          3600s

#
# This sets up the admin page for HA Proxy at port 25002.
#
listen stats :25002
    balance
    mode http
    stats enable
    stats auth <varname>username</varname>:<varname>password</varname>

# Setup for Impala.
# Impala clients connect to load_balancer_host:25003.
# HAProxy will balance connections among the list of servers listed below.
# The list of Impalad is listening at port 21000 for beeswax (impala-shell) or original ODBC driver.
# For JDBC or ODBC version 2.x driver, use port 21050 instead of 21000.
listen impala :25003
    mode tcp
    option tcplog
    balance leastconn

    server <varname>symbolic_name_1</varname> impala-host-1.example.com:21000 check
    server <varname>symbolic_name_2</varname> impala-host-2.example.com:21000 check
    server <varname>symbolic_name_3</varname> impala-host-3.example.com:21000 check
    server <varname>symbolic_name_4</varname> impala-host-4.example.com:21000 check

# Setup for Hue or other JDBC-enabled applications.
# In particular, Hue requires sticky sessions.
# The application connects to load_balancer_host:21051, and HAProxy balances
# connections to the associated hosts, where Impala listens for
# JDBC requests at port 21050.
listen impalajdbc :21051
    mode tcp
    option tcplog
    balance source

    server <varname>symbolic_name_5</varname> impala-host-1.example.com:21050 check
    server <varname>symbolic_name_6</varname> impala-host-2.example.com:21050 check
    server <varname>symbolic_name_7</varname> impala-host-3.example.com:21050 check
    server <varname>symbolic_name_8</varname> impala-host-4.example.com:21050 check</codeblock>

      <note type="important">
        Hue requires the <codeph>check</codeph> option at the end of each
        <codeph>server</codeph> line in the above file so that HAProxy can detect an
        unreachable <cmdname>impalad</cmdname> server and fail over successfully. Without the
        TCP check, you may hit an error when the <cmdname>impalad</cmdname> daemon to which Hue
        tries to connect is down.
      </note>

      <note conref="../shared/impala_common.xml#common/proxy_jdbc_caveat"/>

    </conbody>

  </concept>

</concept>