IMPALA-7214: [DOCS] Update Impala docs to decouple Impala and DataNodes

- Take 1: Let's review these docs before we go clean up many more.

Change-Id: I1c91f7975c09dae9908591eeeac0d55e5355b2d4
Reviewed-on: http://gerrit.cloudera.org:8080/12400
Reviewed-by: Alex Rodoni <arodoni@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Alex Rodoni committed 2019-02-07 17:06:47 -08:00
parent 29df1d55f6, commit 697a15b341
9 changed files with 164 additions and 468 deletions

View File

@@ -2079,7 +2079,7 @@ show functions in _impala_builtins like '*<varname>substring</varname>*';
<b>Sorting considerations:</b> Although you can specify an <codeph>ORDER BY</codeph> clause in an
<codeph>INSERT ... SELECT</codeph> statement, any <codeph>ORDER BY</codeph> clause is ignored and the
results are not necessarily sorted. An <codeph>INSERT ... SELECT</codeph> operation potentially creates
many different data files, prepared on different data nodes, and therefore the notion of the data being
many different data files, prepared by different executor Impala daemons, and therefore the notion of the data being
stored in sorted order is impractical.
</p>
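For example (hypothetical table and column names, not from the change above), the ORDER BY in this INSERT ... SELECT has no effect on how the rows are stored, so sort at read time instead:

INSERT INTO sorted_events
SELECT event_id, event_time FROM raw_events ORDER BY event_time;  -- ORDER BY is ignored here

SELECT event_id, event_time FROM sorted_events ORDER BY event_time;  -- sort when reading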
@@ -3185,8 +3185,9 @@ select max(height), avg(height) from census_data where age &gt; 20;
<codeph><xref href="../topics/impala_order_by.xml#order_by">ORDER BY</xref></codeph> clause to also use a
<codeph><xref href="../topics/impala_limit.xml#limit">LIMIT</xref></codeph> clause. In Impala 1.4.0 and
higher, the <codeph>LIMIT</codeph> clause is optional for <codeph>ORDER BY</codeph> queries. In cases where
sorting a huge result set requires enough memory to exceed the Impala memory limit for a particular node,
Impala automatically uses a temporary disk work area to perform the sort operation.
sorting a huge result set requires enough memory to exceed the Impala
memory limit for a particular executor Impala daemon, Impala automatically uses a
temporary disk work area to perform the sort operation.
</p>
<p rev="1.2" id="limit_and_offset">

View File

@@ -34,10 +34,9 @@ under the License.
<conbody>
<p>
The Impala server is a distributed, massively parallel processing (MPP) database engine. It consists of
different daemon processes that run on specific hosts within your <keyword keyref="distro"/> cluster.
</p>
<p> The Impala server is a distributed, massively parallel processing (MPP)
database engine. It consists of different daemon processes that run on
specific hosts within your cluster. </p>
<p outputclass="toc inpage"/>
</conbody>
@@ -48,35 +47,35 @@ under the License.
<conbody>
<p>
The core Impala component is a daemon process that runs on each DataNode of the cluster, physically represented
by the <codeph>impalad</codeph> process. It reads and writes to data files; accepts queries transmitted
from the <codeph>impala-shell</codeph> command, Hue, JDBC, or ODBC; parallelizes the queries and
distributes work across the cluster; and transmits intermediate query results back to the
central coordinator node.
</p>
<p> The core Impala component is the Impala daemon, physically represented
by the <codeph>impalad</codeph> process. A few of the key functions that
an Impala daemon performs are:<ul>
<li>Reads and writes to data files.</li>
<li>Accepts queries transmitted from the <codeph>impala-shell</codeph>
command, Hue, JDBC, or ODBC.</li>
<li>Parallelizes the queries and distributes work across the
cluster.</li>
<li>Transmits intermediate query results back to the central
coordinator. </li>
</ul></p>
<p>Impala daemons can be deployed in one of the following ways:<ul>
<li>HDFS and Impala are co-located, and each Impala daemon runs on the
same host as a DataNode.</li>
<li>Impala is deployed separately in a compute cluster and reads
remotely from HDFS, S3, ADLS, etc.</li>
</ul></p>
<p>
You can submit a query to the Impala daemon running on any DataNode, and that instance of the daemon serves as the
<term>coordinator node</term> for that query. The other nodes transmit partial results back to the
coordinator, which constructs the final result set for a query. When running experiments with functionality
through the <codeph>impala-shell</codeph> command, you might always connect to the same Impala daemon for
convenience. For clusters running production workloads, you might load-balance by
submitting each query to a different Impala daemon in round-robin style, using the JDBC or ODBC interfaces.
</p>
<p> The Impala daemons are in constant communication with StateStore, to
confirm which daemons are healthy and can accept new work. </p>
<p>
The Impala daemons are in constant communication with the <term>statestore</term>, to confirm which nodes
are healthy and can accept new work.
</p>
<p rev="1.2">
They also receive broadcast messages from the <cmdname>catalogd</cmdname> daemon (introduced in Impala 1.2)
whenever any Impala node in the cluster creates, alters, or drops any type of object, or when an
<codeph>INSERT</codeph> or <codeph>LOAD DATA</codeph> statement is processed through Impala. This
background communication minimizes the need for <codeph>REFRESH</codeph> or <codeph>INVALIDATE
METADATA</codeph> statements that were needed to coordinate metadata across nodes prior to Impala 1.2.
</p>
<p rev="1.2"> They also receive broadcast messages from the
<cmdname>catalogd</cmdname> daemon (introduced in Impala 1.2) whenever
any Impala daemon in the cluster creates, alters, or drops any type of
object, or when an <codeph>INSERT</codeph> or <codeph>LOAD DATA</codeph>
statement is processed through Impala. This background communication
minimizes the need for <codeph>REFRESH</codeph> or <codeph>INVALIDATE
METADATA</codeph> statements that were needed to coordinate metadata
across Impala daemons prior to Impala 1.2. </p>
<p rev="2.9.0 IMPALA-3807 IMPALA-5147 IMPALA-5503">
In <keyword keyref="impala29_full"/> and higher, you can control which hosts act as query coordinators
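As a sketch of that coordinator/executor split (startup flag names assumed from the IMPALA-3807 work; verify against your release):

impalad --is_coordinator=true --is_executor=false ...   # dedicated coordinator: plans queries, runs no fragments
impalad --is_coordinator=false --is_executor=true ...   # executor-only daemon: runs fragments, not a client endpoint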
@@ -98,32 +97,28 @@ under the License.
<conbody>
<p>
The Impala component known as the <term>statestore</term> checks on the health of Impala daemons on all the
DataNodes in a cluster, and continuously relays its findings to each of those daemons. It is physically
represented by a daemon process named <codeph>statestored</codeph>; you only need such a process on one
host in the cluster. If an Impala daemon goes offline due to hardware failure, network error, software issue,
or other reason, the statestore informs all the other Impala daemons so that future queries can avoid making
requests to the unreachable node.
</p>
<p> The Impala component known as the StateStore checks on the health of
all Impala daemons in a cluster, and continuously relays its findings to
each of those daemons. It is physically represented by a daemon process
named <codeph>statestored</codeph>. You only need such a process on one
host in a cluster. If an Impala daemon goes offline due to hardware
failure, network error, software issue, or other reason, the StateStore
informs all the other Impala daemons so that future queries can avoid
making requests to the unreachable Impala daemon. </p>
<p>
Because the statestore's purpose is to help when things go wrong and
<p> Because the StateStore's purpose is to help when things go wrong and
to broadcast metadata to coordinators, it is not always critical to the
normal operation of an Impala cluster. If the statestore is not running
normal operation of an Impala cluster. If the StateStore is not running
or becomes unreachable, the Impala daemons continue running and
distributing work among themselves as usual when working with the data
known to Impala. The cluster just becomes less robust if other Impala
daemons fail, and metadata becomes less consistent as it changes while
the statestore is offline. When the statestore comes back online, it
the StateStore is offline. When the StateStore comes back online, it
re-establishes communication with the Impala daemons and resumes its
monitoring and broadcasting functions.
</p>
monitoring and broadcasting functions. </p>
<p>
If you issue a DDL statement while the statestore is down, the queries
that access the new object the DDL created will fail.
</p>
<p> If you issue a DDL statement while the StateStore is down, the queries
that access the new object the DDL created will fail. </p>
<p conref="../shared/impala_common.xml#common/statestored_catalogd_ha_blurb"/>
@@ -145,35 +140,25 @@ under the License.
<conbody>
<p>
The Impala component known as the <term>catalog service</term> relays the metadata changes from Impala SQL
statements to all the Impala daemons in a cluster. It is physically represented by a daemon process named
<codeph>catalogd</codeph>; you only need such a process on one host in the cluster. Because the requests
are passed through the statestore daemon, it makes sense to run the <cmdname>statestored</cmdname> and
<cmdname>catalogd</cmdname> services on the same host.
</p>
<p> The Impala component known as the Catalog Service relays the metadata
changes from Impala SQL statements to all the Impala daemons in a
cluster. It is physically represented by a daemon process named
<codeph>catalogd</codeph>. You only need such a process on one host in
a cluster. Because the requests are passed through the StateStore
daemon, it makes sense to run the <cmdname>statestored</cmdname> and
<cmdname>catalogd</cmdname> services on the same host. </p>
<p>
The catalog service avoids the need to issue
<codeph>REFRESH</codeph> and <codeph>INVALIDATE METADATA</codeph> statements when the metadata changes are
performed by statements issued through Impala. When you create a table, load data, and so on through Hive,
you do need to issue <codeph>REFRESH</codeph> or <codeph>INVALIDATE METADATA</codeph> on an Impala node
before executing a query there.
</p>
<p> The Catalog Service avoids the need to issue <codeph>REFRESH</codeph>
and <codeph>INVALIDATE METADATA</codeph> statements when the metadata
changes are performed by statements issued through Impala. When you
create a table, load data, and so on through Hive, you do need to issue
<codeph>REFRESH</codeph> or <codeph>INVALIDATE METADATA</codeph> on an
Impala daemon before executing a query there. </p>
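For instance (hypothetical table name), after making changes through Hive:

-- After Hive creates a table Impala has not seen yet, run on any one Impala daemon:
INVALIDATE METADATA logs;

-- After Hive loads new data files into a table Impala already knows about:
REFRESH logs;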
<p>
This feature touches a number of aspects of Impala:
</p>
<!-- This was formerly a conref, but since the list of links also included a link
to this same topic, materializing the list here and removing that
circular link. (The conref is still used in Incompatible Changes.)
<ul conref="../shared/impala_common.xml#common/catalogd_xrefs">
<li/>
</ul>
-->
<ul id="catalogd_xrefs">
<li>
<p>
@@ -184,16 +169,17 @@ under the License.
</li>
<li>
<p>
The <codeph>REFRESH</codeph> and <codeph>INVALIDATE METADATA</codeph> statements are not needed
when the <codeph>CREATE TABLE</codeph>, <codeph>INSERT</codeph>, or other table-changing or
data-changing operation is performed through Impala. These statements are still needed if such
operations are done through Hive or by manipulating data files directly in HDFS, but in those cases the
statements only need to be issued on one Impala node rather than on all nodes. See
<xref href="impala_refresh.xml#refresh"/> and
<xref href="impala_invalidate_metadata.xml#invalidate_metadata"/> for the latest usage information for
those statements.
</p>
<p> The <codeph>REFRESH</codeph> and <codeph>INVALIDATE
METADATA</codeph> statements are not needed when the
<codeph>CREATE TABLE</codeph>, <codeph>INSERT</codeph>, or other
table-changing or data-changing operation is performed through
Impala. These statements are still needed if such operations are
done through Hive or by manipulating data files directly in HDFS,
but in those cases the statements only need to be issued on one
Impala daemon rather than on all daemons. See <xref
href="impala_refresh.xml#refresh"/> and <xref
href="impala_invalidate_metadata.xml#invalidate_metadata"/> for
the latest usage information for those statements. </p>
</li>
</ul>

View File

@@ -43,270 +43,4 @@ under the License.
<p outputclass="toc"/>
</conbody>
<!-- These other topics are waiting to be filled in. Could become subtopics or top-level topics depending on the depth of coverage in each case. -->
<concept id="intro_data_lifecycle" audience="hidden">
<title>Overview of the Data Lifecycle for Impala</title>
<conbody/>
</concept>
<concept id="intro_etl" audience="hidden">
<title>Overview of the Extract, Transform, Load (ETL) Process for Impala</title>
<prolog>
<metadata>
<data name="Category" value="ETL"/>
<data name="Category" value="Ingest"/>
<data name="Category" value="Concepts"/>
</metadata>
</prolog>
<conbody/>
</concept>
<concept id="intro_hadoop_data" audience="hidden">
<title>How Impala Works with Hadoop Data Files</title>
<conbody/>
</concept>
<concept id="intro_web_ui" audience="hidden">
<title>Overview of the Impala Web Interface</title>
<conbody/>
</concept>
<concept id="intro_bi" audience="hidden">
<title>Using Impala with Business Intelligence Tools</title>
<conbody/>
</concept>
<concept id="intro_ha" audience="hidden">
<title>Overview of Impala Availability and Fault Tolerance</title>
<conbody/>
</concept>
<!-- This is pretty much ready to go. Decide if it should go under "Concepts" or "Performance",
and if it should be split out into a separate file, and then take out the audience= attribute
to make it visible.
-->
<concept id="intro_llvm" audience="hidden">
<title>Overview of Impala Runtime Code Generation</title>
<conbody>
<!-- Adapted from the CIDR15 paper written by the Impala team. -->
<p>
Impala uses <term>LLVM</term> (a compiler library and collection of related tools) to perform just-in-time
(JIT) compilation within the running <cmdname>impalad</cmdname> process. This runtime code generation
technique improves query execution times by generating native code optimized for the architecture of each
host in your particular cluster. Performance gains of 5 times or more are typical for representative
workloads.
</p>
<p>
Impala uses runtime code generation to produce query-specific versions of functions that are critical to
performance. In particular, code generation is applied to <term>inner loop</term> functions, that is, those
that are executed many times (for every tuple) in a given query, and thus constitute a large portion of the
total time the query takes to execute. For example, when Impala scans a data file, it calls a function to
parse each record into Impala's in-memory tuple format. For queries scanning large tables, billions of
records could result in billions of function calls. This function must therefore be extremely efficient for
good query performance, and removing even a few instructions from each function call can result in large
query speedups.
</p>
<p>
Overall, JIT compilation has an effect similar to writing custom code to process a query. For example, it
eliminates branches, unrolls loops, propagates constants, offsets and pointers, and inlines functions.
Inlining is especially valuable for functions used internally to evaluate expressions, where the function
call itself is more expensive than the function body (for example, a function that adds two numbers).
Inlining functions also increases instruction-level parallelism, and allows the compiler to make further
optimizations such as subexpression elimination across expressions.
</p>
<p>
Impala generates runtime query code automatically, so you do not need to do anything special to get this
performance benefit. This technique is most effective for complex and long-running queries that process
large numbers of rows. If you need to issue a series of short, small queries, you might turn off this
feature to avoid the overhead of compilation time for each query. In this case, issue the statement
<codeph>SET DISABLE_CODEGEN=true</codeph> to turn off runtime code generation for the duration of the
current session.
</p>
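For example, within an impala-shell session (the table name is a placeholder):

[localhost:21000] > set disable_codegen=true;
[localhost:21000] > select count(*) from small_lookup_table;
[localhost:21000] > set disable_codegen=false;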
<!--
<p>
Without code generation,
functions tend to be suboptimal
to handle situations that cannot be predicted in advance.
For example,
a record-parsing function that
only handles integer types will be faster at parsing an integer-only file
than a function that handles other data types
such as strings and floating-point numbers.
However, the schemas of the files to
be scanned are unknown at compile time,
and so a general-purpose function must be used, even if at runtime
it is known that more limited functionality is sufficient.
</p>
<p>
A source of large runtime overheads are virtual functions. Virtual function calls incur a large performance
penalty, particularly when the called function is very simple, as the calls cannot be inlined.
If the type of the object instance is known at runtime, we can use code generation to replace the virtual
function call with a call directly to the correct function, which can then be inlined. This is especially
valuable when evaluating expression trees. In Impala (as in many systems), expressions are composed of a
tree of individual operators and functions.
</p>
<p>
Each type of expression that can appear in a query is implemented internally by overriding a virtual function.
Many of these expression functions are quite simple, for example, adding two numbers.
The virtual function call can be more expensive than the function body itself. By resolving the virtual
function calls with code generation and then inlining the resulting function calls, Impala can evaluate expressions
directly with no function call overhead. Inlining functions also increases
instruction-level parallelism, and allows the compiler to make further optimizations such as subexpression
elimination across expressions.
</p>
-->
</conbody>
</concept>
<!-- Same as the previous section: adapted from CIDR paper, ready to externalize after deciding where to go. -->
<concept audience="hidden" id="intro_io">
<title>Overview of Impala I/O</title>
<conbody>
<p>
Efficiently retrieving data from HDFS is a challenge for all SQL-on-Hadoop systems. To perform
data scans from both disk and memory at or near hardware speed, Impala uses an HDFS feature called
<term>short-circuit local reads</term> to bypass the DataNode protocol when reading from local disk. Impala
can read at almost disk bandwidth (approximately 100 MB/s per disk) and is typically able to saturate all
available disks. For example, with 12 disks, Impala is typically capable of sustaining I/O at 1.2 GB/s.
Furthermore, <term>HDFS caching</term> allows Impala to access memory-resident data at memory bus speed,
and saves CPU cycles as there is no need to copy or checksum data blocks within memory.
</p>
<p>
The I/O manager component interfaces with storage devices to read and write data. I/O manager assigns a
fixed number of worker threads per physical disk (currently one thread per rotational disk and eight per
SSD), providing an asynchronous interface to clients (<term>scanner threads</term>).
</p>
</conbody>
</concept>
<!-- Same as the previous section: adapted from CIDR paper, ready to externalize after deciding where to go. -->
<!-- Although good idea to get some answers from Henry first. -->
<concept audience="hidden" id="intro_state_distribution">
<title>State distribution</title>
<conbody>
<p>
As a massively parallel database that can run on hundreds of nodes, Impala must coordinate and synchronize
its metadata across the entire cluster. Impala's symmetric-node architecture means that any node can accept
and execute queries, and thus each node needs up-to-date versions of the system catalog and a knowledge of
which hosts the <cmdname>impalad</cmdname> daemons run on. To avoid the overhead of TCP connections and
remote procedure calls to retrieve metadata during query planning, Impala implements a simple
publish-subscribe service called the <term>statestore</term> to push metadata changes to a set of
subscribers (the <cmdname>impalad</cmdname> daemons running on all the DataNodes).
</p>
<p>
The statestore maintains a set of topics, which are arrays of <codeph>(<varname>key</varname>,
<varname>value</varname>, <varname>version</varname>)</codeph> triplets called <term>entries</term> where
<varname>key</varname> and <varname>value</varname> are byte arrays, and <varname>version</varname> is a
64-bit integer. A topic is defined by an application, and so the statestore has no understanding of the
contents of any topic entry. Topics are persistent through the lifetime of the statestore, but are not
persisted across service restarts. Processes that receive updates to any topic are called
<term>subscribers</term>, and express their interest by registering with the statestore at startup and
providing a list of topics. The statestore responds to registration by sending the subscriber an initial
topic update for each registered topic, which consists of all the entries currently in that topic.
</p>
<!-- Henry: OK, but in practice, what is in these topic messages for Impala? -->
<p>
After registration, the statestore periodically sends two kinds of messages to each subscriber. The first
kind of message is a topic update, and consists of all changes to a topic (new entries, modified entries
and deletions) since the last update was successfully sent to the subscriber. Each subscriber maintains a
per-topic most-recent-version identifier which allows the statestore to only send the delta between
updates. In response to a topic update, each subscriber sends a list of changes it intends to make to its
subscribed topics. Those changes are guaranteed to have been applied by the time the next update is
received.
</p>
<p>
The second kind of statestore message is a <term>heartbeat</term>, formerly sometimes called
<term>keepalive</term>. The statestore uses heartbeat messages to maintain the connection to each
subscriber, which would otherwise time out its subscription and attempt to re-register.
</p>
<p>
Prior to Impala 2.0, both kinds of communication were combined in a single kind of message. Because these
messages could be very large in instances with thousands of tables, partitions, data files, and so on,
Impala 2.0 and higher divides the types of messages so that the small heartbeat pings can be transmitted
and acknowledged quickly, increasing the reliability of the statestore mechanism that detects when Impala
nodes become unavailable.
</p>
<p>
If the statestore detects a failed subscriber (for example, by repeated failed heartbeat deliveries), it
stops sending updates to that node.
<!-- Henry: what are examples of these transient topic entries? -->
Some topic entries are marked as transient, meaning that if their owning subscriber fails, they are
removed.
</p>
<p>
Although the asynchronous nature of this mechanism means that metadata updates might take some time to
propagate across the entire cluster, that does not affect the consistency of query planning or results.
Each query is planned and coordinated by a particular node, so as long as the coordinator node is aware of
the existence of the relevant tables, data files, and so on, it can distribute the query work to other
nodes even if those other nodes have not received the latest metadata updates.
<!-- Henry: need another example here of what's in a topic, e.g. is it the list of available tables? -->
<!--
For example, query planning is performed on a single node based on the
catalog metadata topic, and once a full plan has been computed, all information required to execute that
plan is distributed directly to the executing nodes.
There is no requirement that an executing node should
know about the same version of the catalog metadata topic.
-->
</p>
<p>
We have found that the statestore process with default settings scales well to medium-sized clusters, and
can serve our largest deployments with some configuration changes.
<!-- Henry: elaborate on the configuration changes. -->
</p>
<p>
<!-- Henry: other examples like load information? How is load information used? -->
The statestore does not persist any metadata to disk: all current metadata is pushed to the statestore by
its subscribers (for example, load information). Therefore, should a statestore restart, its state can be
recovered during the initial subscriber registration phase. Or if the machine that the statestore is
running on fails, a new statestore process can be started elsewhere, and subscribers can fail over to it.
There is no built-in failover mechanism in Impala; instead, deployments commonly use a retargetable DNS
entry to force subscribers to automatically move to the new process instance.
<!-- Henry: translate that last sentence into instructions / guidelines. -->
</p>
</conbody>
</concept>
</concept>

View File

@@ -57,33 +57,30 @@ Lots of nuances to illustrate through sample code.
See <xref href="impala_shell_options.xml"/> for the command-line and configuration file options you can use.
</p>
<p>
You can connect to any DataNode where an instance of <cmdname>impalad</cmdname> is running,
and that host coordinates the execution of all queries sent to it.
</p>
<p> You can connect to any Impala daemon (<cmdname>impalad</cmdname>), and
that daemon coordinates the execution of all queries sent to it. </p>
<p>
For simplicity during development, you might always connect to the same host, perhaps running <cmdname>impala-shell</cmdname> on
the same host as <cmdname>impalad</cmdname> and specifying the hostname as <codeph>localhost</codeph>.
</p>
<p>
In a production environment, you might enable load balancing, in which you connect to specific host/port combination
but queries are forwarded to arbitrary hosts. This technique spreads the overhead of acting as the coordinator
node among all the DataNodes in the cluster. See <xref href="impala_proxy.xml"/> for details.
</p>
<p> In a production environment, you might enable load balancing, in which
you connect to a specific host/port combination, but queries are forwarded to
arbitrary hosts. This technique spreads the overhead of acting as the
coordinator node among all the Impala daemons in the cluster. See <xref
href="impala_proxy.xml"/> for details. </p>
<p>
<b>To connect the Impala shell during shell startup:</b>
</p>
<ol>
<li>
Locate the hostname of a DataNode within the cluster that is running an instance of the
<cmdname>impalad</cmdname> daemon. If that DataNode uses a non-default port (something
other than port 21000) for <cmdname>impala-shell</cmdname> connections, find out the
port number also.
</li>
<li> Locate the hostname of a host that is running an instance of the
<cmdname>impalad</cmdname> daemon. If that <cmdname>impalad</cmdname>
uses a non-default port (something other than port 21000) for
<cmdname>impala-shell</cmdname> connections, find out the port number
also. </li>
<li>
Use the <codeph>-i</codeph> option to the
@@ -126,21 +123,17 @@ $ impala-shell -i <varname>some.other.hostname</varname>:<varname>port_number</v
[Not connected] &gt; </codeblock>
</li>
<li>
Locate the hostname of a DataNode within the cluster that is running an instance of the
<cmdname>impalad</cmdname> daemon. If that DataNode uses a non-default port (something
other than port 21000) for <cmdname>impala-shell</cmdname> connections, find out the
port number also.
</li>
<li> Locate the hostname of a host that is running the <cmdname>impalad</cmdname>
daemon. If that <cmdname>impalad</cmdname> uses a non-default port
(something other than port 21000) for <cmdname>impala-shell</cmdname>
connections, find out the port number also. </li>
<li>
Use the <codeph>connect</codeph> command to connect to an Impala instance. Enter a command of the form:
<codeblock>[Not connected] &gt; connect <varname>impalad-host</varname>
<li> Use the <codeph>connect</codeph> command to connect to an Impala
instance. Enter a command of the form: <codeblock>[Not connected] &gt; connect <varname>impalad-host</varname>
[<varname>impalad-host</varname>:21000] &gt;</codeblock>
<note>
Replace <varname>impalad-host</varname> with the hostname you have configured for any DataNode running
Impala in your environment. The changed prompt indicates a successful connection.
</note>
<note> Replace <varname>impalad-host</varname> with the hostname you
have configured to run Impala in your environment. The changed prompt
indicates a successful connection. </note>
</li>
</ol>
@@ -148,11 +141,9 @@ $ impala-shell -i <varname>some.other.hostname</varname>:<varname>port_number</v
<b>To start <cmdname>impala-shell</cmdname> in a specific database:</b>
</p>
<p>
You can use all the same connection options as in previous examples.
For simplicity, these examples assume that you are logged into one of
the DataNodes that is running the <cmdname>impalad</cmdname> daemon.
</p>
<p> You can use all the same connection options as in previous examples. For
simplicity, these examples assume that you are logged into one of the
Impala daemons. </p>
<ol>
<li>
@@ -182,11 +173,9 @@ $ impala-shell -i localhost -d parquet_gzip_compression
<b>To run one or several statements in non-interactive mode:</b>
</p>
<p>
You can use all the same connection options as in previous examples.
For simplicity, these examples assume that you are logged into one of
the DataNodes that is running the <cmdname>impalad</cmdname> daemon.
</p>
<p> You can use all the same connection options as in previous examples. For
simplicity, these examples assume that you are logged into one of the
Impala daemons. </p>
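For instance (hostname, query, and file path are placeholders), a single statement or a script file can be run without entering the interactive shell:

$ impala-shell -i localhost -q 'select count(*) from web_logs'
$ impala-shell -i localhost -f /tmp/queries.sql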
<ol>
<li>

View File

@@ -98,10 +98,9 @@ under the License.
For information on installing the Impala shell, see <xref href="impala_install.xml#install"/>.
</p>
<p>
For information about establishing a connection to a DataNode running the <codeph>impalad</codeph> daemon
through the <codeph>impala-shell</codeph> command, see <xref href="impala_connecting.xml#connecting"/>.
</p>
<p> For information about establishing a connection to a coordinator Impala
daemon through the <codeph>impala-shell</codeph> command, see <xref
href="impala_connecting.xml#connecting"/>. </p>
<p>
For a list of the <codeph>impala-shell</codeph> command-line options, see

View File

@@ -143,15 +143,13 @@ under the License.
</li>
<li>
<p>
Data is physically divided based on units of storage called
<p> Data is physically divided based on units of storage called
<term>tablets</term>. Tablets are stored by <term>tablet
servers</term>. Each tablet server can store multiple tablets,
and each tablet is replicated across multiple tablet servers,
managed automatically by Kudu. Where practical, co-locate the
tablet servers on the same hosts as the DataNodes, although that
is not required.
</p>
tablet servers on the same hosts as the Impala daemons, although
that is not required. </p>
</li>
</ul>

View File

@@ -49,15 +49,6 @@ under the License.
<codeph>LIMIT</codeph> clause, which reduces network overhead and the
memory requirement on the coordinator node. </p>
<note>
<p rev="1.4.0 obwl">
In Impala 1.4.0 and higher, the <codeph>LIMIT</codeph> clause is now optional (rather than required) for
queries that use the <codeph>ORDER BY</codeph> clause. Impala automatically uses a temporary disk work area
to perform the sort if the sort operation would otherwise exceed the Impala memory limit for a particular
DataNode.
</p>
</note>
<p conref="../shared/impala_common.xml#common/syntax_blurb"/>
<p>
@@ -105,8 +96,6 @@ col_ref ::= <varname>column_name</varname> | <varname>integer_literal</varname>
</p>
<p rev="obwl" conref="../shared/impala_common.xml#common/order_by_limit"/>
<!-- Good to show an example of cases where ORDER BY does and doesn't work with complex types. -->
<p conref="../shared/impala_common.xml#common/complex_types_blurb"/>
<p rev="2.3.0">
@@ -131,13 +120,12 @@ ERROR: AnalysisException: ORDER BY expression 'score' with complex type 'ARRAY&l
<p conref="../shared/impala_common.xml#common/example_blurb"/>
<p>
The following query retrieves the user ID and score, only for scores greater than one million,
with the highest scores for each user listed first.
Because the individual array elements are now represented as separate rows in the result set,
they can be used in the <codeph>ORDER BY</codeph> clause, referenced using the <codeph>ITEM</codeph>
pseudocolumn that represents each array element.
</p>
<p> The following query retrieves the user ID and score, only for scores
greater than one million, with the highest scores for each user listed
first. Because the individual array elements are now represented as
separate rows in the result set, they can be used in the <codeph>ORDER
BY</codeph> clause, referenced using the <codeph>ITEM</codeph>
pseudocolumn that represents each array element. </p>
<codeblock>SELECT id, item FROM games, games.score
WHERE item &gt; 1000000
@@ -168,13 +156,14 @@ ORDER BY id, info.value desc;
<p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
<p>
Although the <codeph>LIMIT</codeph> clause is now optional on <codeph>ORDER BY</codeph> queries, if your
query only needs some number of rows that you can predict in advance, use the <codeph>LIMIT</codeph> clause
to reduce unnecessary processing. For example, if the query has a clause <codeph>LIMIT 10</codeph>, each data
node sorts its portion of the relevant result set and only returns 10 rows to the coordinator node. The
coordinator node picks the 10 highest or lowest row values out of this small intermediate result set.
</p>
<p> Although the <codeph>LIMIT</codeph> clause is now optional on
<codeph>ORDER BY</codeph> queries, if your query only needs some number
of rows that you can predict in advance, use the <codeph>LIMIT</codeph>
clause to reduce unnecessary processing. For example, if the query has a
clause <codeph>LIMIT 10</codeph>, each executor Impala daemon sorts its
portion of the relevant result set and only returns 10 rows to the
coordinator node. The coordinator node picks the 10 highest or lowest row
values out of this small intermediate result set. </p>
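A hypothetical top-N query of that shape:

-- Each executor sorts its own portion and sends only 10 rows;
-- the coordinator merges those small sets into the final 10.
SELECT user_id, score FROM game_scores
ORDER BY score DESC
LIMIT 10;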
<p>
If an <codeph>ORDER BY</codeph> clause is applied to an early phase of query processing, such as a subquery
@@ -212,45 +201,46 @@ SELECT page_title AS "Page 3 of search results", page_url FROM search_content
<p conref="../shared/impala_common.xml#common/internals_blurb"/>
<p>
Impala sorts the intermediate results of an <codeph>ORDER BY</codeph> clause in memory whenever practical. In
a cluster of N DataNodes, each node sorts roughly 1/Nth of the result set, the exact proportion varying
depending on how the data matching the query is distributed in HDFS.
</p>
<p> Impala sorts the intermediate results of an <codeph>ORDER BY</codeph>
clause in memory whenever practical. In a cluster of N executor Impala
daemons, each daemon sorts roughly 1/Nth of the result set, the exact
proportion varying depending on how the data matching the query is
distributed in HDFS. </p>
<p>
If the size of the sorted intermediate result set on any DataNode would cause the query to exceed the Impala
memory limit, Impala sorts as much as practical in memory, then writes partially sorted data to disk. (This
technique is known in industry terminology as <q>external sorting</q> and <q>spilling to disk</q>.) As each
8 MB batch of data is written to disk, Impala frees the corresponding memory to sort a new 8 MB batch of
data. When all the data has been processed, a final merge sort operation is performed to correctly order the
in-memory and on-disk results as the result set is transmitted back to the coordinator node. When external
sorting becomes necessary, Impala requires approximately 60 MB of RAM at a minimum for the buffers needed to
read, write, and sort the intermediate results. If more RAM is available on the DataNode, Impala will use
the additional RAM to minimize the amount of disk I/O for sorting.
</p>
<p> If the size of the sorted intermediate result set on any executor Impala
daemon would cause the query to exceed the Impala memory limit, Impala
sorts as much as practical in memory, then writes partially sorted data to
disk. (This technique is known in industry terminology as <q>external
sorting</q> and <q>spilling to disk</q>.) As each 8 MB batch of data is
written to disk, Impala frees the corresponding memory to sort a new 8 MB
batch of data. When all the data has been processed, a final merge sort
operation is performed to correctly order the in-memory and on-disk
results as the result set is transmitted back to the coordinator node.
When external sorting becomes necessary, Impala requires approximately 60
MB of RAM at a minimum for the buffers needed to read, write, and sort the
intermediate results. If more RAM is available on the Impala daemon,
Impala will use the additional RAM to minimize the amount of disk I/O for
sorting. </p>
<p>
This external sort technique is used as appropriate on each DataNode (possibly including the coordinator
node) to sort the portion of the result set that is processed on that node. When the sorted intermediate
results are sent back to the coordinator node to produce the final result set, the coordinator node uses a
merge sort technique to produce a final sorted result set without using any extra resources on the
coordinator node.
</p>
<p> This external sort technique is used as appropriate on each Impala
daemon (possibly including the coordinator node) to sort the portion of
the result set that is processed on that node. When the sorted
intermediate results are sent back to the coordinator node to produce the
final result set, the coordinator node uses a merge sort technique to
produce a final sorted result set without using any extra resources on the
coordinator node. </p>
<p rev="obwl">
<b>Configuration for disk usage:</b>
</p>
<p rev="obwl" conref="../shared/impala_common.xml#common/order_by_scratch_dir"/>
<!-- Here is actually the more logical place to collect all those examples, move them from SELECT and cross-reference to here. -->
<!-- <p rev="obwl" conref="../shared/impala_common.xml#common/restrictions_blurb"/> -->
<p rev="obwl"
conref="../shared/impala_common.xml#common/order_by_scratch_dir"/>
<p rev="obwl" conref="../shared/impala_common.xml#common/insert_sort_blurb"/>
<p rev="obwl" conref="../shared/impala_common.xml#common/order_by_view_restriction"/>
<p rev="obwl"
conref="../shared/impala_common.xml#common/order_by_view_restriction"/>
<p>
With the lifting of the requirement to include a <codeph>LIMIT</codeph> clause in every <codeph>ORDER
@@ -259,16 +249,17 @@ SELECT page_title AS "Page 3 of search results", page_url FROM search_content
<ul>
<li>
<p>
Now the use of scratch disk space raises the possibility of an <q>out of disk space</q> error on a
particular DataNode, as opposed to the previous possibility of an <q>out of memory</q> error. Make sure
to keep at least 1 GB free on the filesystem used for temporary sorting work.
</p>
<p> Now the use of scratch disk space raises the possibility of an
<q>out of disk space</q> error on a particular Impala daemon, as
opposed to the previous possibility of an <q>out of memory</q> error.
Make sure to keep at least 1 GB free on the filesystem used for
temporary sorting work. </p>
</li>
</ul>
<p rev="obwl" conref="../shared/impala_common.xml#common/null_sorting_change"/>
<p rev="obwl"
conref="../shared/impala_common.xml#common/null_sorting_change"/>
<codeblock>[localhost:21000] > create table numbers (x int);
[localhost:21000] > insert into numbers values (1), (null), (2), (null), (3);
[localhost:21000] > select x from numbers order by x nulls first;

View File

@@ -82,13 +82,13 @@ under the License.
</p>
<ul>
<li>
<p>
What a <term>plan fragment</term> is.
Impala decomposes each query into smaller units of work that are distributed across the cluster.
Wherever possible, a data block is read, filtered, and aggregated by plan fragments executing
on the same host. For some operations, such as joins and combining intermediate results into
a final result set, data is transmitted across the network from one DataNode to another.
</p>
<p> What a <term>plan fragment</term> is. Impala decomposes each query
into smaller units of work that are distributed across the cluster.
Wherever possible, a data block is read, filtered, and aggregated by
plan fragments executing on the same host. For some operations, such
as joins and combining intermediate results into a final result set,
data is transmitted across the network from one Impala daemon to
another. </p>
</li>
<li>
<p>

View File

@@ -601,9 +601,7 @@ Memory Usage: Additional Notes
available to Impala and reduce the amount of memory required on each node.
</li>
<li>
Increase the overall memory capacity of each DataNode at the hardware level.
</li>
<li> Add more memory to the hosts running Impala daemons. </li>
<li>
On a cluster with resources shared between Impala and other Hadoop components, use