mirror of
https://github.com/apache/impala.git
synced 2026-01-08 21:03:01 -05:00
These are refugees from doc_prototype. They can be rendered with the DITA Open Toolkit version 2.3.3 by: /tmp/dita-ot-2.3.3/bin/dita \ -i impala.ditamap \ -f html5 \ -o $(mktemp -d) \ -filter impala_html.ditaval Change-Id: I8861e99adc446f659a04463ca78c79200669484f Reviewed-on: http://gerrit.cloudera.org:8080/5014 Reviewed-by: John Russell <jrussell@cloudera.com> Tested-by: John Russell <jrussell@cloudera.com>
107 lines
4.4 KiB
XML
107 lines
4.4 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
|
|
<concept id="install">
|
|
|
|
<title><ph audience="standalone">Installing Impala</ph><ph audience="integrated">Impala Installation</ph></title>
|
|
<prolog>
|
|
<metadata>
|
|
<data name="Category" value="Impala"/>
|
|
<data name="Category" value="Installing"/>
|
|
<data name="Category" value="Administrators"/>
|
|
</metadata>
|
|
</prolog>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
<indexterm audience="Cloudera">installation</indexterm>
|
|
<indexterm audience="Cloudera">pseudo-distributed cluster</indexterm>
|
|
<indexterm audience="Cloudera">cluster</indexterm>
|
|
<indexterm audience="Cloudera">DataNodes</indexterm>
|
|
<indexterm audience="Cloudera">NameNode</indexterm>
|
|
<indexterm audience="Cloudera">Cloudera Manager</indexterm>
|
|
<indexterm audience="Cloudera">impalad</indexterm>
|
|
<indexterm audience="Cloudera">impala-shell</indexterm>
|
|
<indexterm audience="Cloudera">statestored</indexterm>
|
|
Impala is an open-source add-on to the Cloudera Enterprise Core that returns rapid responses to
|
|
queries.
|
|
</p>
|
|
|
|
<note>
|
|
<p>
|
|
Under CDH 5, Impala is included as part of the CDH installation and no separate steps are needed.
|
|
<ph audience="standalone">Therefore, the instruction steps in this section apply to CDH 4 only.</ph>
|
|
</p>
|
|
</note>
|
|
|
|
<p outputclass="toc inpage"/>
|
|
</conbody>
|
|
|
|
<concept id="install_details">
|
|
|
|
<title>What is Included in an Impala Installation</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Impala is made up of a set of components that can be installed on multiple nodes throughout your cluster.
|
|
The key installation step for performance is to install the <cmdname>impalad</cmdname> daemon (which does
|
|
most of the query processing work) on <i>all</i> DataNodes in the cluster.
|
|
</p>
|
|
|
|
<p>
|
|
The Impala package installs these binaries:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>
|
|
<p>
|
|
<cmdname>impalad</cmdname> - The Impala daemon. Plans and executes queries against HDFS, HBase, <ph rev="2.2.0">and Amazon S3 data</ph>.
|
|
<xref href="impala_processes.xml#processes">Run one impalad process</xref> on each node in the cluster
|
|
that has a DataNode.
|
|
</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>
|
|
<cmdname>statestored</cmdname> - Name service that tracks location and status of all
|
|
<codeph>impalad</codeph> instances in the cluster. <xref href="impala_processes.xml#processes">Run one
|
|
instance of this daemon</xref> on a node in your cluster. Most production deployments run this daemon
|
|
on the namenode.
|
|
</p>
|
|
</li>
|
|
|
|
<li rev="1.2">
|
|
<p>
|
|
<cmdname>catalogd</cmdname> - Metadata coordination service that broadcasts changes from Impala DDL and
|
|
DML statements to all affected Impala nodes, so that new tables, newly loaded data, and so on are
|
|
immediately visible to queries submitted through any Impala node.
|
|
<!-- Consider removing this when 1.2 gets far in the past. -->
|
|
(Prior to Impala 1.2, you had to run the <codeph>REFRESH</codeph> or <codeph>INVALIDATE
|
|
METADATA</codeph> statement on each node to synchronize changed metadata. Now those statements are only
|
|
required if you perform the DDL or DML through an external mechanism such as Hive <ph rev="2.2.0">or by uploading
|
|
data to the Amazon S3 filesystem</ph>.)
|
|
<xref href="impala_processes.xml#processes">Run one instance of this daemon</xref> on a node in your cluster,
|
|
preferably on the same host as the <codeph>statestored</codeph> daemon.
|
|
</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>
|
|
<cmdname>impala-shell</cmdname> - <xref href="impala_impala_shell.xml#impala_shell">Command-line
|
|
interface</xref> for issuing queries to the Impala daemon. You install this on one or more hosts
|
|
anywhere on your network, not necessarily DataNodes or even within the same cluster as Impala. It can
|
|
connect remotely to any instance of the Impala daemon.
|
|
</p>
|
|
</li>
|
|
</ul>
|
|
|
|
<p>
|
|
Before doing the installation, ensure that you have all necessary prerequisites. See
|
|
<xref href="impala_prereqs.xml#prereqs"/> for details.
|
|
</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
</concept>
|