mirror of
https://github.com/apache/impala.git
synced 2026-01-05 12:01:11 -05:00
The scope of this change is limited to removing the Cloudera copyright from the Impala shell banner and replacing it with a conref to a generic message with no reference to Cloudera or CDH version numbers. Change-Id: I1f6a3175cd34c434e3e6bccd99665b021287a768 Reviewed-on: http://gerrit.cloudera.org:8080/6138 Reviewed-by: John Russell <jrussell@cloudera.com> Tested-by: Impala Public Jenkins
219 lines
8.2 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="connecting">

<title>Connecting to impalad through impala-shell</title>
<titlealts audience="PDF"><navtitle>Connecting to impalad</navtitle></titlealts>
<prolog>
<metadata>
<data name="Category" value="Impala"/>
<data name="Category" value="impala-shell"/>
<data name="Category" value="Network"/>
<data name="Category" value="DataNode"/>
<data name="Category" value="Developers"/>
<data name="Category" value="Data Analysts"/>
</metadata>
</prolog>

<conbody>

<!--
TK: This would be a good theme for a tutorial topic.
Lots of nuances to illustrate through sample code.
-->

<p>
Within an <cmdname>impala-shell</cmdname> session, you can issue queries only while connected to an instance
of the <cmdname>impalad</cmdname> daemon. You can specify the connection information:
<ul>
<li>
Through command-line options when you run the <cmdname>impala-shell</cmdname> command.
</li>
<li>
Through a configuration file that is read when you run the <cmdname>impala-shell</cmdname> command.
</li>
<li>
During an <cmdname>impala-shell</cmdname> session, by issuing a <codeph>CONNECT</codeph> command.
</li>
</ul>
See <xref href="impala_shell_options.xml"/> for the command-line and configuration file options you can use.
</p>
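
<p>
As a rough illustration of the configuration-file route, the following sketch writes a minimal file.
It assumes, beyond what this topic states, that <cmdname>impala-shell</cmdname> reads
<codeph>$HOME/.impalarc</codeph>, that settings belong in an <codeph>[impala]</codeph> section, and that
<codeph>impalad</codeph> is the configuration-file counterpart of the <codeph>-i</codeph> option.
The function name <codeph>write_impalarc</codeph> and the host name are hypothetical.
</p>

```shell
# Sketch only: create a minimal impala-shell configuration file.
# Assumptions (not stated in this topic): settings live in an [impala]
# section, and "impalad" is the config-file form of the -i option.
# write_impalarc is a hypothetical helper; it overwrites the target file.
write_impalarc() {
  target="$1"    # path of the configuration file to create
  hostport="$2"  # impalad host, optionally followed by :port
  printf '%s\n' '[impala]' "impalad=${hostport}" > "$target"
}

# Example call with a placeholder host name:
# write_impalarc "$HOME/.impalarc" "impalad-host.example.com:21000"
```

<p>
With such a file in place, running <cmdname>impala-shell</cmdname> with no <codeph>-i</codeph> option
would be expected to pick up the configured host. Back up any existing file first, because the sketch
replaces it.
</p>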

<p>
You can connect to any DataNode where an instance of <cmdname>impalad</cmdname> is running,
and that host coordinates the execution of all queries sent to it.
</p>

<p>
For simplicity during development, you might always connect to the same host, perhaps running <cmdname>impala-shell</cmdname> on
the same host as <cmdname>impalad</cmdname> and specifying the hostname as <codeph>localhost</codeph>.
</p>

<p>
In a production environment, you might enable load balancing, in which you connect to a specific host/port combination
but queries are forwarded to arbitrary hosts. This technique spreads the overhead of acting as the coordinator
node among all the DataNodes in the cluster. See <xref href="impala_proxy.xml"/> for details.
</p>
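
<p>
From the client side, a load-balanced setup might look like the sketch below: the shell is pointed
at the proxy address rather than at an individual DataNode. The helper name
<codeph>impalad_address</codeph>, the host <codeph>proxy.example.com</codeph>, and port 25003 are
illustrative assumptions, not values defined by Impala.
</p>

```shell
# Sketch only: build the HOST:PORT argument for the -i option.
# impalad_address is a hypothetical helper; the port falls back to
# 21000, the default impala-shell port mentioned in this topic.
impalad_address() {
  printf '%s:%s\n' "$1" "${2:-21000}"
}

# Example: connect through a proxy that listens on a non-default port.
# The host name and port below are placeholders.
# impala-shell -i "$(impalad_address proxy.example.com 25003)"
```

<p>
Keeping the address in one place makes it easier to switch scripts between a development host and
the production proxy.
</p>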

<p>
<b>To connect the Impala shell during shell startup:</b>
</p>

<ol>
<li>
Locate the hostname of a DataNode within the cluster that is running an instance of the
<cmdname>impalad</cmdname> daemon. If that DataNode uses a non-default port (something
other than port 21000) for <cmdname>impala-shell</cmdname> connections, find out the
port number also.
</li>

<li>
Use the <codeph>-i</codeph> option to the
<cmdname>impala-shell</cmdname> interpreter to specify the connection information for
that instance of <cmdname>impalad</cmdname>:
<codeblock>
# When you are logged into the same machine running impalad.
# The prompt will reflect the current hostname.
$ impala-shell

# When you are logged into the same machine running impalad.
# The prompt will reflect the hostname 'localhost'.
$ impala-shell -i localhost

# When you are logged onto a different host, perhaps a client machine
# outside the Hadoop cluster.
$ impala-shell -i <varname>some.other.hostname</varname>

# When you are logged onto a different host, and impalad is listening
# on a non-default port. Perhaps a load balancer is forwarding requests
# to a different host/port combination behind the scenes.
$ impala-shell -i <varname>some.other.hostname</varname>:<varname>port_number</varname>
</codeblock>
</li>
</ol>

<p>
<b>To connect the Impala shell after shell startup:</b>
</p>

<ol>
<li>
Start the Impala shell with no connection, for example from a host where no
<cmdname>impalad</cmdname> instance is running:
<codeblock>$ impala-shell</codeblock>
<p>
You should see a prompt like the following:
</p>
<codeblock>Welcome to the Impala shell. Press TAB twice to see a list of available commands.
...
<ph conref="../shared/ImpalaVariables.xml#impala_vars/ShellBanner"/>
[Not connected] > </codeblock>
</li>

<li>
Locate the hostname of a DataNode within the cluster that is running an instance of the
<cmdname>impalad</cmdname> daemon. If that DataNode uses a non-default port (something
other than port 21000) for <cmdname>impala-shell</cmdname> connections, find out the
port number also.
</li>

<li>
Use the <codeph>connect</codeph> command to connect to an Impala instance. Enter a command of the form:
<codeblock>[Not connected] > connect <varname>impalad-host</varname>
[<varname>impalad-host</varname>:21000] ></codeblock>
<note>
Replace <varname>impalad-host</varname> with the hostname you have configured for any DataNode running
Impala in your environment. The changed prompt indicates a successful connection.
</note>
</li>
</ol>

<p>
<b>To start <cmdname>impala-shell</cmdname> in a specific database:</b>
</p>

<p>
You can use all the same connection options as in previous examples.
For simplicity, these examples assume that you are logged into one of
the DataNodes that is running the <cmdname>impalad</cmdname> daemon.
</p>

<ol>
<li>
Find the name of the database containing the relevant tables, views, and so
on that you want to operate on.
</li>

<li>
Use the <codeph>-d</codeph> option to the
<cmdname>impala-shell</cmdname> interpreter to connect and immediately
switch to the specified database, without the need for a <codeph>USE</codeph>
statement or fully qualified names:
<codeblock>
# Subsequent queries with unqualified names operate on
# tables, views, and so on inside the database named 'staging'.
$ impala-shell -i localhost -d staging

# It is common during development, ETL, benchmarking, and so on
# to have different databases containing the same table names
# but with different contents or layouts.
$ impala-shell -i localhost -d parquet_snappy_compression
$ impala-shell -i localhost -d parquet_gzip_compression
</codeblock>
</li>
</ol>

<p>
<b>To run one or several statements in non-interactive mode:</b>
</p>

<p>
You can use all the same connection options as in previous examples.
For simplicity, these examples assume that you are logged into one of
the DataNodes that is running the <cmdname>impalad</cmdname> daemon.
</p>

<ol>
<li>
Construct a statement, or a file containing a sequence of statements,
that you want to run in an automated way, without typing or copying
and pasting each time.
</li>

<li>
Invoke <cmdname>impala-shell</cmdname> with the <codeph>-q</codeph> option to run a single statement, or
the <codeph>-f</codeph> option to run a sequence of statements from a file.
The <cmdname>impala-shell</cmdname> command exits as soon as the statements finish, without entering
the interactive interpreter.
<codeblock>
# A utility command that you might run while developing shell scripts
# to manipulate HDFS files.
$ impala-shell -i localhost -d database_of_interest -q 'show tables'

# A sequence of CREATE TABLE, CREATE VIEW, and similar DDL statements
# can go into a file to make the setup process repeatable.
$ impala-shell -i localhost -d database_of_interest -f recreate_tables.sql
</codeblock>
</li>
</ol>
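
<p>
The non-interactive options above can be wrapped in small shell functions so that a setup script
reads clearly, as in this sketch. The function names <codeph>run_statement</codeph> and
<codeph>run_script</codeph> and the variables <codeph>IMPALA_SHELL</codeph> and
<codeph>IMPALAD</codeph> are hypothetical conveniences introduced here, not part of
<cmdname>impala-shell</cmdname>. The commented usage also assumes that
<cmdname>impala-shell</cmdname> exits with a non-zero status when a statement fails.
</p>

```shell
# Sketch only: thin wrappers around the -q and -f options.
# IMPALA_SHELL and IMPALAD are hypothetical variables that let a script
# override the command name and the host without editing each call.

# run_statement DB STATEMENT: run a single statement and exit.
run_statement() {
  "${IMPALA_SHELL:-impala-shell}" -i "${IMPALAD:-localhost}" -d "$1" -q "$2"
}

# run_script DB FILE: run every statement in FILE and exit.
run_script() {
  "${IMPALA_SHELL:-impala-shell}" -i "${IMPALAD:-localhost}" -d "$1" -f "$2"
}

# Example use in a setup script (assumes a non-zero exit status on failure):
# run_script database_of_interest recreate_tables.sql || exit 1
# run_statement database_of_interest 'show tables'
```

<p>
Because the wrappers pass their arguments straight through, any statement or file that works with
<codeph>-q</codeph> or <codeph>-f</codeph> on the command line should behave the same way here.
</p>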

</conbody>
</concept>