impala-shell does not set any socket timeout while connecting to the
impala server. This change sets a timeout on the socket before
connecting and clears it again after a successful connection. The
default timeout on this socket is 5 sec.
Usage: impala-shell --client_connect_timeout=<value in ms>
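The set-then-clear pattern described above can be sketched with the
standard socket module. This is a minimal illustration, not the shell's
actual code: the function name connect_with_timeout and its timeout_ms
parameter are hypothetical, and the real shell applies the timeout to a
Thrift socket rather than a raw one.

```python
import socket


def connect_with_timeout(host, port, timeout_ms=5000):
    """Open a TCP connection, bounding only the connect phase by timeout_ms.

    Hypothetical helper for illustration; not part of impala-shell.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the timeout before connecting so a stalled server cannot block forever.
    sock.settimeout(timeout_ms / 1000.0)
    try:
        sock.connect((host, port))
    except Exception:
        sock.close()
        raise
    # Clear the timeout again so later reads on long-running queries can block.
    sock.settimeout(None)
    return sock
```

Clearing the timeout after the connect means a slow query is not mistaken
for a dead server; only the initial connection attempt is bounded.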
Testing:
1. Added a test where I create a random listening socket.
impala-shell (with ssl enabled) connects to this socket and
times out after 2 sec.
2. Created a kerberized impala cluster with ssl enabled and
connected to the impalad using an openssl client (block the
beeswax server thread to accept new connection) -
E.g. - openssl s_client -connect <IP Addr>:21000
Used impala-shell to connect to the same impalad later.
impala-shell timed out after the default of 5 sec. I verified
it manually.
Change-Id: I130fc47f7a83f591918d6842634b4e5787d00813
Reviewed-on: http://gerrit.cloudera.org:8080/11540
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Prior to this patch, Impala shell --var could not accept values from
other variables unlike the one in Impala interactive shell with the SET
command. This patch refactors the logic of variable substitution to
use the same logic in both interactive and command line shells.
Example:
$ impala-shell.sh \
--var="msg1=1" \
--var="msg2=\${var:msg1}2" \
--var="msg3=\${var:msg1}\${var:msg2}"
[localhost:21000] default> select ${var:msg3};
Query: select 112
+-----+
| 112 |
+-----+
| 112 |
+-----+
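The chained substitution in the example above can be sketched as follows.
This is an illustration only: the names substitute_vars and
parse_var_options are hypothetical, and upper-casing variable names is an
assumption made here to keep lookups case insensitive.

```python
import re

# Matches references of the form ${var:NAME}.
VAR_PATTERN = re.compile(r"\$\{var:([^}]+)\}", re.IGNORECASE)


def substitute_vars(text, variables):
    """Replace every ${var:NAME} in text with the stored value of NAME."""
    def lookup(match):
        name = match.group(1).upper()
        if name not in variables:
            raise KeyError("unknown variable: " + name)
        return variables[name]
    return VAR_PATTERN.sub(lookup, text)


def parse_var_options(assignments):
    """Process 'name=value' strings in order, resolving references to
    variables defined by earlier assignments."""
    variables = {}
    for assignment in assignments:
        name, value = assignment.split("=", 1)
        variables[name.upper()] = substitute_vars(value, variables)
    return variables
```

Because each value is resolved when it is stored, msg3 in the example
sees the already expanded msg2 and yields 112.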
Testing:
- Added a new shell test
- Ran all shell tests
Change-Id: Ib5b9fda329c45f2e5682f3cbc76d29ceca2e226a
Reviewed-on: http://gerrit.cloudera.org:8080/11623
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
test_cancellation runs an impala-shell process with a query specified,
then sends a SIGINT and confirms that the shell cancels the query and
exits.
The hang was happening because the shell's signal handler was
incorrectly using the same Thrift connection when calling Close() as
the main shell thread, which is not thread safe.
Testing:
- Ran test_cancellation in a loop 500 times. Previously the hang would
repro about every 10 runs.
Change-Id: I9c4b570604f7706712eb8e19b1ce69bf35cf15e2
Reviewed-on: http://gerrit.cloudera.org:8080/11465
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
I recently helped debug an issue where impala-shell was being given
the hiveserver2 port rather than the beeswax port. I've updated
the error message a little bit to indicate that this may be the issue.
Here is the new message:
$impala-shell.sh -i ....:21050 -q 'select 1'
Starting Impala Shell without Kerberos authentication
Error: Unable to communicate with impalad service. This service may
not be an impalad instance. A common problem is that the port
specified does not match the -beeswax_port flag on the underlying
impalad. Check host:port and try again.
Traceback (most recent call last):
File "/home/philip/src/Impala/shell/impala_shell.py", line 1709, in <module>
execute_queries_non_interactive_mode(options, query_options)
File "/home/philip/src/Impala/shell/impala_shell.py", line 1565, in execute_queries_non_interactive_mode
shell = ImpalaShell(options, query_options)
File "/home/philip/src/Impala/shell/impala_shell.py", line 232, in __init__
self.do_connect(options.impalad)
File "/home/philip/src/Impala/shell/impala_shell.py", line 798, in do_connect
self._connect()
File "/home/philip/src/Impala/shell/impala_shell.py", line 842, in _connect
result = self.imp_client.connect()
File "/home/philip/src/Impala/shell/impala_client.py", line 257, in connect
result = self.ping_impala_service()
File "/home/philip/src/Impala/shell/impala_client.py", line 262, in ping_impala_service
return self.imp_service.PingImpalaService()
File "/home/philip/src/Impala/shell/gen-py/ImpalaService/ImpalaService.py", line 229, in PingImpalaService
return self.recv_PingImpalaService()
File "/home/philip/src/Impala/shell/gen-py/ImpalaService/ImpalaService.py", line 245, in recv_PingImpalaService
raise x
thrift.Thrift.TApplicationException: Invalid method name: 'PingImpalaService'
Change-Id: I14465e8f666c4a5f3968db8864dfdb1205641a33
Reviewed-on: http://gerrit.cloudera.org:8080/11368
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This adds an IMPALA_HISTFILE environment variable (and --history_file
argument) to the shell which overrides the default location of
~/.impalahistory for the shell history. The shell tests now override
this variable to /dev/null so they don't store history. The tests that
need history use a pytest fixture to use a temporary file for their
history. This allows them to run in parallel without stomping
on each other's history.
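One way to picture the lookup order is the sketch below. The function
resolve_history_file is hypothetical, and the precedence shown
(--history_file argument first, then IMPALA_HISTFILE, then the default)
is an assumption; the commit message does not state which of the two
overrides wins.

```python
import os

DEFAULT_HISTORY_FILE = "~/.impalahistory"


def resolve_history_file(cli_value=None, environ=None):
    """Pick the history file path (assumed order: CLI, env var, default)."""
    environ = os.environ if environ is None else environ
    path = cli_value or environ.get("IMPALA_HISTFILE") or DEFAULT_HISTORY_FILE
    return os.path.expanduser(path)
```

A test harness can then point every shell at /dev/null through the
environment, while a fixture passes a temporary file to the tests that
need real history.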
This also fixes a couple of flaky tests which were previously missing the
"execute_serially" annotation -- that annotation is no longer needed
after this fix.
A couple of the tests still need to be executed serially because they
look at metrics such as the number of executed or running queries, and
those metrics are unstable if other tests run in parallel.
I tested this by running:
./bin/impala-py.test tests/shell/test_shell_interactive.py \
-m 'not execute_serially' \
-n 80 \
--random
... several times in a row on an 88-core box. Prior to the change,
several would fail each time. Now they pass.
Change-Id: I1da5739276e63a50590dfcb2b050703f8e35fec7
Reviewed-on: http://gerrit.cloudera.org:8080/11045
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Todd Lipcon <todd@apache.org>
If the remote impalad died while the shell waited for a
command to complete, the shell disconnected. Previously
after restarting the remote impalad, we needed to run
"connect;" to reconnect, now the shell will automatically
reconnect.
Testing:
Added test_auto_connect_after_impalad_died in
test_shell_interactive_reconnect.py
Change-Id: Ia13365a9696886f01294e98054cf4e7cd66ab712
Reviewed-on: http://gerrit.cloudera.org:8080/10992
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This change provides a way to modify command-line options like -B,
--output_file and --delimiter inside impala-shell without quitting
the shell and then restarting again.
Also fixed IMPALA-7286: command "unset" does not work for shell options
Testing:
Added tests for all new options in test_shell_interactive.py
Tested on Python 2.6 and Python 2.7
Change-Id: Id8d4487c24f24806223bfd5c54336914e3afd763
Reviewed-on: http://gerrit.cloudera.org:8080/10900
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes the slow performance in Impala shell, especially for
large queries by replacing all calls to sqlparse.format(sql_string,
strip_comments=True) with the custom implementation of strip comments
that does not use grouping. The code to strip leading comments was also
refactored to not use grouping.
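To illustrate why comment stripping does not need a full parse, here is
a hand-written single-pass scanner. It is a swapped-in technique for
illustration, not the patch's implementation (which the message says is
a custom routine that avoids sqlparse grouping). The function name
strip_comments is hypothetical, and the sketch also removes /*+ hint */
blocks, which a real shell would likely want to keep.

```python
def strip_comments(sql):
    """Remove -- line comments and /* */ block comments outside quoted strings."""
    out = []
    i, n = 0, len(sql)
    quote = None  # quote character of the literal we are inside, if any
    while i < n:
        ch = sql[i]
        if quote:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Keep an escaped character verbatim so \' does not end the string.
                out.append(sql[i + 1])
                i += 1
            elif ch == quote:
                quote = None
            i += 1
        elif ch in ("'", '"', "`"):
            quote = ch
            out.append(ch)
            i += 1
        elif sql.startswith("--", i):
            # Skip to the end of the line but keep the newline itself.
            end = sql.find("\n", i)
            i = n if end == -1 else end
        elif sql.startswith("/*", i):
            end = sql.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

The scanner visits each character once, so its cost grows linearly with
the statement length, which is the property that matters for a query
with thousands of columns.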
* Benchmark running a query with 12K columns *
Before the patch:
$ time impala-shell.sh -f large.sql --quiet
real 2m4.154s
user 2m0.536s
sys 0m0.088s
After the patch:
$ time impala-shell.sh -f large.sql --quiet
real 0m3.885s
user 0m1.516s
sys 0m0.048s
Testing:
- Added a new test to test the Impala shell performance
- Ran all shell tests on Python 2.6 and Python 2.7
Change-Id: Idac9f3caed7c44846a8c922dbe5ca3bf3b095b81
Reviewed-on: http://gerrit.cloudera.org:8080/10939
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
After additional testing around IMPALA-2782, it was discovered
that impala-shell starts the session displaying the expected
hostname (as passed via the -i flag) on the prompt. This gives the
impression that the load balancer was bypassed, however the
actual TSSLSocket is still created with the hostname passed
in via the -b or --kerberos_host_fqdn flag.
This change ensures that the hostname used to create the
TSSLSocket will always be the one passed in via the -i flag
on impala-shell. This change is required by IMPALA-2782.
Testing:
Using netcat, we verified that the impala daemon host[:port]
value passed into the -i/--impalad option is indeed the one
impala-shell tries to connect to in both cases (with and
without -b)
Change-Id: Ibee05bd0dbe8c6ae108b890f0ae0f6900149773a
Reviewed-on: http://gerrit.cloudera.org:8080/10580
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes an issue where parseline is unable to deduce the
correct command when a statement has a leading comment.
Before:
> -- comment
> insert into table t values(100);
Fetched 1 row(s) in 0.01s
After:
> -- comment
> insert into table t values(100);
Modified 1 row(s) in 0.01s
Before (FE syntax error):
> /*comment*/ help;
After (show help correctly):
> /*comment*/ help;
Testing:
- Added shell tests
- Ran end-to-end shell tests on Python 2.6 and Python 2.7
Change-Id: I7ac7cb5a30e6dda73ebe761d9f0eb9ba038e14a7
Reviewed-on: http://gerrit.cloudera.org:8080/9933
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
This patch fixes the issue where non-matching quotes inside comments
will cause the shell to not terminate.
The fix is to strip any SQL comments before sending to shlex since shlex
does not understand SQL comments and will raise an exception when it
sees unmatched quotes regardless of whether the quotes are in the comments or
not.
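The failure mode is easy to reproduce with shlex alone. In the sketch
below, split_tokens is a hypothetical helper, and the regular expression
is a deliberate simplification: it removes only "--" line comments and
would also cut a "--" that appears inside a string literal.

```python
import re
import shlex

# Simplified: matches a "--" line comment up to the end of the line.
LINE_COMMENT = re.compile(r"--[^\n]*")


def split_tokens(statement):
    """Strip line comments first, then let shlex tokenize the remainder."""
    return shlex.split(LINE_COMMENT.sub("", statement))
```

Calling shlex.split("select 1 -- it's a comment") directly raises
ValueError because the apostrophe opens a quote that never closes, while
split_tokens on the same input returns the two tokens of the statement.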
Testing:
- Added new shell tests
- Ran all end-to-end shell tests on Python 2.6 and Python 2.7
Change-Id: I2feae34026a7e63f3d31489f757f093a73ca5d2c
Reviewed-on: http://gerrit.cloudera.org:8080/10541
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes the issue where non-matching quotes inside comments
will cause the shell to not terminate.
The fix is to strip any SQL comments before sending to shlex since shlex
does not understand SQL comments and will raise an exception when it sees
unmatched quotes regardless of whether the quotes are in the comments or
not.
Testing:
- Added new shell tests
- Ran all end-to-end shell tests
Change-Id: Ic899fdddc182947f73101ddbc2e3c8caf97d9085
Reviewed-on: http://gerrit.cloudera.org:8080/10474
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes a bug in sqlparse where sqlparse incorrectly splits a
statement that has a new line inside double quotes. The bug in sqlparse
causes Impala shell to go to infinite loop when a statement contains a
new line inside double quotes.
The patch in sqlparse is based on the upstream fix at
https://github.com/andialbrecht/sqlparse/pull/396
Testing:
- Added new end-to-end shell tests
- Ran end-to-end shell tests
Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Reviewed-on: http://gerrit.cloudera.org:8080/9195
Reviewed-by: David Knupp <dknupp@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The bug is that PrettyOutputFormatter.format() returned a unicode
object, and Python cannot automatically write unicode objects to
output streams where there is no default encoding.
The fix is to convert to UTF-8 encoded in a regular string, which
can be output to any output device. This makes the output type
consistent with DelimitedOutputFormatter.format().
Based on code by Marcell Szabo.
Testing:
Added a basic test.
Played around in an interactive shell to make sure that unicode
characters still work in interactive mode.
Change-Id: I9de641ecf767a2feef3b9f48b344ef2d55e17a7f
Reviewed-on: http://gerrit.cloudera.org:8080/9928
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Before this commit it was quite random which DDL operations
returned a result set and which didn't.
With this commit, every DDL operation returns a summary of
its execution. They declare their result set schema in
Frontend.java, and provide the summary in CatalogOpExecutor.java.
Updated the tests according to the new behavior.
Change-Id: Ic542fb8e49e850052416ac663ee329ee3974e3b9
Reviewed-on: http://gerrit.cloudera.org:8080/9090
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
When passing command line options to a new instance of the
ImpalaShell, we usually transfer the options to member
variables of that new instance. We weren't doing that with
all of the LDAP-related options, even though we wanted to
access them later. In some environments and under certain
conditions, this could then lead to a NameError exception
being thrown.
This patch takes away any reliance on the original options
object returned by parse_args() beyond the __init__()
method of the ImpalaShell class, by transferring all LDAP
options to member variables. Also, a test has been added to
exercise the code path where the exception had been occurring.
Change-Id: I810850f569ef3f4487f7eeba81ca520dc955ac2e
Reviewed-on: http://gerrit.cloudera.org:8080/9744
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
configured with load balancer and kerberos.
This change adds an impala-shell option -b / --kerberos_host_fqdn.
This allows the user to optionally specify the load-balancer's host so
that impala-shell will accept a direct connection to impala daemons
in a kerberized cluster.
Change-Id: I4726226a7a3817421b133f74dd4f4cf8c52135f9
Reviewed-on: http://gerrit.cloudera.org:8080/7241
Reviewed-by: <andy@phdata.io>
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins
If the LDAP password passed to Impala shell contains an extra line
break, authentication fails, but the user cannot detect the cause of
the failure.
I fixed the issue by adding inspection to the password for common
pitfalls and issuing a warning in the shell when authentication fails.
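A check of this kind could look like the sketch below. The function
password_warnings is hypothetical, and it covers only the line-break
case named above; which other pitfalls the patch inspects is not stated
here.

```python
def password_warnings(password):
    """Return warnings about a password that is likely to fail authentication.

    Hypothetical helper; only the line-break pitfall is modelled.
    """
    warnings = []
    if "\n" in password or "\r" in password:
        warnings.append("Warning: LDAP password contains a line break.")
    return warnings
```

Printing these warnings only after an authentication failure keeps the
normal path quiet while still pointing the user at the likely cause.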
Change-Id: Ie570166aea62af223905b7f0124e9efb15a88ac7
Reviewed-on: http://gerrit.cloudera.org:8080/9506
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Impala Public Jenkins
This patch is to show query submitted and query progress web links in
Impala shell for CTE queries.
Testing:
- Ran end-to-end shell tests
Change-Id: Ie3352406e3b048be395a20405c8e6b911e663164
Reviewed-on: http://gerrit.cloudera.org:8080/9537
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
The patch is to remove any comments in a statement when checking if a
statement ends with a semicolon delimiter.
For example:
Before (semicolon delimiter is needed at the end):
select 1 + 1; -- comment\n;
After (semicolon delimiter is no longer needed):
select 1 + 1; -- comment
Testing:
- Ran end-to-end tests in shell
Change-Id: I54f9a8f65214023520eaa010fc462a663d02d258
Reviewed-on: http://gerrit.cloudera.org:8080/9191
Reviewed-by: Fredy Wijaya <fwijaya@cloudera.com>
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Impala Public Jenkins
Adds a concept of a "removed" query option that has no effect but does
not return an error when a user attempts to set it. These options are
not returned by "set" or "set all" commands that are executed in
impala-shell or server-side.
These query options have been deprecated for several releases:
DEFAULT_ORDER_BY_LIMIT, ABORT_ON_DEFAULT_LIMIT_EXCEEDED,
V_CPU_CORES, RESERVATION_REQUEST_TIMEOUT, RM_INITIAL_MEM,
SCAN_NODE_CODEGEN_THRESHOLD, MAX_IO_BUFFERS
RM_INITIAL_MEM did still have an effect, but it was undocumented and
MEM_LIMIT should be used in preference.
DISABLE_CACHED_READS also had an effect but it was documented as
deprecated.
Otherwise the options had no effect at all.
Testing:
Ran exhaustive build.
Updated query option tests to reflect the new behaviour.
Cherry-picks: not for 2.x.
Change-Id: I9e742e9b0eca0e5c81fd71db3122fef31522fcad
Reviewed-on: http://gerrit.cloudera.org:8080/9118
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
This removes the deprecated option in time for 3.0.
Testing:
Ran core tests. Manually ran the shell with the argument to confirm that
it reported "no such option".
Cherry-picks: not for 2.x.
Change-Id: I8f430bad0578e150d5e80066b9e7572041af4a15
Reviewed-on: http://gerrit.cloudera.org:8080/9072
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Impala shell can accidentally convert certain
literal strings to lowercase. Impala shell splits
each command into tokens and then converts the
first token to lowercase to figure out how it
should execute the command. The splitting is done
by spaces only. Thus, if the user types a TAB
after the SELECT, the first token after the split
becomes the SELECT plus whatever comes after it.
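The effect of splitting on spaces only can be reproduced in a few lines.
Both function names are hypothetical, and splitting on any whitespace is
shown as one plausible remedy, not necessarily the one the patch uses.

```python
def first_token_naive(line):
    """Split on single spaces only, as described for the buggy behaviour."""
    return line.split(" ")[0].lower()


def first_token(line):
    """Split on any run of whitespace, so a TAB also ends the first token."""
    return line.split(None, 1)[0].lower()
```

For the input "SELECT<TAB>'AbC';" the naive version lowercases the whole
line, including the literal, while the second version returns just
"select" and leaves the rest of the statement untouched.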
Testing:
TestImpalaShellInteractive.test_case_sensitive_command
TestImpalaShellInteractive.test_unexpected_conversion_for_literal_string_to_lowercase
TestImpalaShell.test_var_substitution
Change-Id: Ifdce9781d1d97596c188691b62a141b9bd137610
Reviewed-on: http://gerrit.cloudera.org:8080/8762
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Issue 1: When query is cancelled via CTRL-C while being executed in Impala-shell
then an exception is thrown from Impala backend saying 'Invalid query handle'.
This is because one ImpalaClient was making RPC's while another ImpalaClient
cancelled the query on the backend. As a result RPC handlers in ImpalaServer
try to access a ClientRequestState that had been cleared from the backend. The
issue is reliably reproducible both in wait_to_finish and in fetch states of
the query.
As a solution the query cancellation is indicated to ImpalaClient via a bool
flag. Once a cancellation originated exception reaches Impala shell this flag
is checked to decide whether to suppress the error or not.
Issue 2: Every time a query was cancelled a 'use db' command was issued
automatically. This happened for historical reasons but is not needed anymore
(see Jira for more details).
Change-Id: I6cefaf1dae78baae238289816a7cb9d210fb38e2
Reviewed-on: http://gerrit.cloudera.org:8080/8549
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Four display levels are introduced for each query option: REGULAR, ADVANCED,
DEVELOPMENT and DEPRECATED. When the query options are displayed in Impala
shell using SET then only the REGULAR and ADVANCED options are shown. A new
command called SET ALL shows all the options grouped by their option levels.
When the query options are displayed through the SET SQL statement then the
result set would contain an extra column indicating the level of each option.
Similarly to Impala shell here the SET command only displays the REGULAR and
ADVANCED options while SET ALL shows them all.
If the Impala shell connects to an Impala daemon that predates this change
then all the options would be displayed in the REGULAR group.
Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Reviewed-on: http://gerrit.cloudera.org:8080/8447
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Impala Public Jenkins
The ImpalaShell didn't issue the 'USE <current-db>' command after
reconnecting to the Impala daemon. Therefore the client session
used the default DB after reconnection, not the previously selected DB.
Setting the current DB is done by the _validate_database method.
Before this commit it appended the "use <db>" command to the
command queue of the Cmd class. But, at this point we might already
have commands in the command queue that will run before the
"use <db>" command. In case of reconnection, we want to invoke
the USE command right away.
Also, the command processed by the precmd() method can entirely skip
the command queue, therefore it is not enough to insert the USE
command to the front of the command queue. We need to issue the
USE command with the onecmd() method to execute it immediately.
I extended the _validate_database method with an "immediately" flag.
If this flag is true, _validate_database will use the onecmd() method.
Otherwise, it will append the USE command to the command queue to
maintain the previous behaviour.
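The difference between queueing and immediate execution can be shown
with the standard cmd module. SketchShell is a hypothetical stand-in for
ImpalaShell that models only the USE handling; the executed list exists
purely so the behaviour can be observed.

```python
import cmd


class SketchShell(cmd.Cmd):
    """Hypothetical stand-in for ImpalaShell; only USE handling is modelled."""

    def __init__(self, current_db):
        cmd.Cmd.__init__(self)
        self.current_db = current_db
        self.executed = []  # records commands that actually ran

    def do_use(self, args):
        self.executed.append("use " + args)

    def _validate_database(self, immediately=False):
        if not self.current_db:
            return
        use_cmd = "use %s" % self.current_db
        if immediately:
            # Run now, bypassing anything already waiting in cmdqueue.
            self.onecmd(use_cmd)
        else:
            # Previous behaviour: run after the commands already queued.
            self.cmdqueue.append(use_cmd)
```

With immediately=True the USE statement runs before any queued command,
which is what a reconnect needs; the default keeps the older queued
behaviour for other callers.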
I added a new automated test suite named test_shell_interactive_reconnect.py
to the "custom cluster" tests. It sets the default database, and after
reconnection it checks if the shell set it again automatically.
One test case checks if the shell set the DB after manually reconnecting
to the impala daemon by issuing the CONNECT command.
The other test case checks if the shell set the DB after automatic
reconnection due to cluster restart.
I needed to back up the impala shell history file because I didn't
want to pollute it by the test cases (just like the way it is done in
tests/shell/test_shell_interactive.py). I created utility functions for
this in tests/shell/util.py and now test_shell_interactive.py and
the newly created test suite are using these utility functions.
Change-Id: I40dfa00ba0314d356fe8617446f516505c925e5e
Reviewed-on: http://gerrit.cloudera.org:8080/8368
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Query options can be set from command line and impala rc as
key=value pairs, where key is case insensitive.
Examples:
command line:
impala-shell.sh -Q MT_DOP=1 --query_option=MAX_ERRORS=200
.impalarc:
[impala.query_options]
EXPLAIN_LEVEL=2
MT_DOP=2
The options set in command line will update the ones
in impalarc one by one, so the result of the example
above will be:
EXPLAIN_LEVEL=2
MT_DOP=1
MAX_ERRORS=200
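The merge in the example above amounts to a case-insensitive dictionary
update. The function merge_query_options is hypothetical, and
normalising keys to upper case is an assumption made for the sketch.

```python
def merge_query_options(rc_options, cli_pairs):
    """Overlay command-line KEY=VALUE pairs on options read from .impalarc."""
    merged = {key.upper(): value for key, value in rc_options.items()}
    for pair in cli_pairs:
        key, value = pair.split("=", 1)
        # Later sources win; keys compare case insensitively.
        merged[key.upper()] = value
    return merged
```

Applied to the example, the .impalarc value MT_DOP=2 is replaced by the
command-line value 1, EXPLAIN_LEVEL is kept, and MAX_ERRORS is added.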
Additional changes:
- 0 and 1 are accepted as bools in section [impala] to
make it more consistent with [impala.query_options]
- options that are expected to be bool but are not
0/1/true/false lead to error instead of warning
Change-Id: I26a3b67230c80a99bd246b6af205d558fec9a986
Reviewed-on: http://gerrit.cloudera.org:8080/8038
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
This adds a warning to impala-shell if -r/--refresh_after_connect is
used, in anticipation of us removing the feature in a future version.
Change-Id: Id297f80c0f596a69ef8ecde948812b82d2a5c0fa
Reviewed-on: http://gerrit.cloudera.org:8080/8381
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins
Impala-shell crashes with 2 source commands on the same line and runs
a command multiple times if it shares the same line with a source
command.
The bug is caused by a misuse of cmdqueue. The cmdqueue member of
cmd.Cmd is used to execute commands not directly from user input in an
event loop. When a 'source' is run, execute_query_list() is called which
also executes the commands in cmdqueue, causing them to be executed
twice.
The fix is for execute_query_list() to not run the commands in cmdqueue.
For the non-interactive case, where the event loop won't be run, we call
execute_query_list() with cmdqueue so that the commands get run.
A test case is added to test_shell_interactive.py.
Change-Id: I453af2d4694d47e184031cb07ecd2af259ba20f3
Reviewed-on: http://gerrit.cloudera.org:8080/8063
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
This patch adds a new command "rerun" and a shortcut "@" to impala-shell.
Users can rerun a certain query by its index given by the history command.
A valid index is an integer in [1, history_length] or
[-history_length, -1]. Negative values index history in reverse order.
For example, "@1;" or "rerun 1;" reruns the first query shown in history
and "@-1;" reruns the last query. The rerun command itself won't appear
in history. The history index is 1-based and increasing. Old entries
might be truncated when impala-shell starts, and the indexes will be
realigned to 1, so the same index may refer to different commands among
multiple impala-shell instances.
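The index rules above can be sketched as a small lookup. The function
resolve_history_index is hypothetical and takes the history as a plain
list, oldest entry first.

```python
def resolve_history_index(index, history):
    """Map a 1-based (or negative) rerun index to a history entry."""
    length = len(history)
    if 1 <= index <= length:
        return history[index - 1]
    if -length <= index <= -1:
        # Negative values count from the most recent entry.
        return history[index]
    raise ValueError("index %d is out of range for %d entries" % (index, length))
```

Index 0 is rejected on purpose: it belongs to neither the positive range
[1, history_length] nor the negative range [-history_length, -1].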
Testing: A test case test_rerun is added to
shell/test_shell_interactive.py
Change-Id: Ifc28e8ce07845343267224c3b9ccb71b29a524d2
Reviewed-on: http://gerrit.cloudera.org:8080/7674
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Impala Public Jenkins
The shell uses Thrift's TSSLSocket to negotiate secure connections to
Impala. This socket uses a variable SSL_VERSION to determine which SSL
and TLS protocol versions it will connect to.
SSL_VERSION was hardcoded to be PROTOCOL_TLSv1, which only supports
TLSv1 servers and no other protocol version. Change the allowed version
to be PROTOCOL_SSLv23, which supports any TLS or SSL protocol. We rely
on the server not to allow SSLv2 or v3 connections.
Testing: Added a new custom cluster test to confirm that the shell can
connect to a TLSv1.2 cluster. Confirmed that the test is correctly
skipped on machines with an old version of OpenSSL that does not support
TLSv1.2.
Change-Id: I5487f82d110676b9c3c7a5305931da00c7f68ca0
Reviewed-on: http://gerrit.cloudera.org:8080/7675
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
For some queries, the exec summary will not be completely filled
in even if the query is FINISHED. In particular, the exec_stats field
may not be set. This was causing an error in our test code that
converts the exec summary to a more usable format.
The situation is essentially deterministic for some queries, but
it was being hidden by testing code that caught the error and
discarded it in most situations, leading to flaky tests.
This patch removes the 'try' that was hiding the error and makes
the code check for the presence of exec_stats and handle it rather
than generating an error.
I filed IMPALA-5783 for followup work to be more rigorous about
when the exec summary should and shouldn't be fully present.
Testing:
- Ran the affected tests in a loop and they are no longer flaky.
Change-Id: Id52ac62da2b01f9e163e97cbe4590f8db6b663d2
Reviewed-on: http://gerrit.cloudera.org:8080/7627
Tested-by: Impala Public Jenkins
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Function print_to_stderr() has a syntax error when an error message is
displayed.
I solved this problem by exchanging the position of variable and the
subsequent strings in function print_to_stderr().
Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6
Reviewed-on: http://gerrit.cloudera.org:8080/7187
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins
Help information of KEYVAL option in impala-shell is not clear enough.
I fix this issue by adding clear description to help information of
KEYVAL option.
Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
Reviewed-on: http://gerrit.cloudera.org:8080/7179
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
When given only the ldap_password_cmd option, impala-shell runs
successfully instead of reporting an error.
I solved this problem by throwing an error when --ldap_password_cmd is
used without LDAP auth, that is, the ldap_password_cmd option will only
take effect if the ldap option is present.
Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e
Reviewed-on: http://gerrit.cloudera.org:8080/7188
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins
This change avoids printing blank lines when the Impala
shell fetches 0 rows from a statement.
Change-Id: I6e18ce36be07ee90a16b007b1e30d5255ef8a839
Reviewed-on: http://gerrit.cloudera.org:8080/7055
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
Help information of query_file option in impala-shell misses stdin
description.
I fix this issue by adding stdin description to help information of
query_file option.
Change-Id: I0264174fd062497c72891b31cf9ac1ba6c00f716
Reviewed-on: http://gerrit.cloudera.org:8080/7178
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Impala Public Jenkins
The introduction has redundant '\' in impala-shell when using LDAP.
I fix this issue by removing extraneous '\' in introduction when
impala-shell using LDAP.
Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Reviewed-on: http://gerrit.cloudera.org:8080/7166
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins
Bug:
When Sentry-based authorization is enabled, a user that isn't authorized
to EXPLAIN a statement that uses a view can still access unauthorized
information, such as view's definition, by running the statement and
asking for the query profile or the execution summary.
Fix:
During query compilation, determine if the user can access the runtime
profile or the execution summary. Upon request for a runtime profile or
execution summary from a user, determine based on that information and
the user that is asking for the profile if the runtime profile
(or execution summary) will be returned or an authorization error.
The authorization rule enforced is the following:
- User A runs statement S, A asks for profile, A has profile access:
Runtime profile is returned
- User A runs statement S, A asks for profile, A doesn't have profile access:
Authorization error
- User A runs statement S, user B asks for profile:
Authorization error.
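The three cases above reduce to a single predicate. The function
profile_request_allowed and its parameter names are hypothetical;
has_profile_access stands for the decision recorded at query
compilation time for the user who ran the statement.

```python
def profile_request_allowed(query_user, requesting_user, has_profile_access):
    """Return True if the profile or exec summary may be returned.

    Only the user who ran the statement can see it, and only if access
    was granted when the statement was compiled.
    """
    return requesting_user == query_user and has_profile_access
```

Any False result corresponds to the authorization error in the list,
including every request made by a different user.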
This patch doesn't enforce access to the runtime profile or execution summary
through the Web UI.
Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
Reviewed-on: http://gerrit.cloudera.org:8080/7064
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Impala Public Jenkins
Allow users to keep a longer history of queries if desired. I
personally find it useful to keep a long history of queries to
reference and want to bump this up to a very large value, but
keep the default reasonable. Also change the config loader
to not freak out over unknown parameters so as not to break
for users that end up with new options set running on older
shells.
Testing: Created .impalarc as follows, now getting more history saved.
Put broken things in .impalarc and make sure they are logged as
warnings.
[impala]
history_max=1000
Change-Id: Iaf65bbecb8fd7f1105aac62b6745d6125a603d7f
Reviewed-on: http://gerrit.cloudera.org:8080/6335
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
When using connections secured with SSL, a connection close consists
of a bi-directional SSL_shutdown(). The second part of the
bi-directional shutdown requires that the client also close the socket
explicitly, and the server blocks till it gets the close
notification from the client.
This patch ensures that the above happens. Without this fix, the
impala-shell was found to hang over connections secured with SSL
when an error was encountered.
Change-Id: I814df93bbcd457ad3f96b4c1ef5d8b0ddd6d141f
Reviewed-on: http://gerrit.cloudera.org:8080/6587
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Impala Public Jenkins
The bug was in the third-party pkg_resources.py script. The version
check was broken because it matches any version with a "0.7" substring
instead of just versions starting with 0.7.
This is a known bug. setuptools even re-released 20.7.0 as version
20.8.0 to avoid it:
e5822f0d5b
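The difference between a substring match and a prefix match is shown
below. Both functions are hypothetical illustrations of the two kinds of
check, not the code in pkg_resources.py.

```python
def matches_0_7_substring(version):
    """Buggy style of check: true for any version containing '0.7'."""
    return "0.7" in version


def matches_0_7_prefix(version):
    """Intended style of check: true only for versions starting with '0.7'."""
    return version.startswith("0.7")
```

Version "20.7.0" contains the substring "0.7" and so trips the first
check, while the prefix check accepts "0.7.3" and rejects "20.7.0".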
Testing:
I was unable to reproduce this locally, but I think the fix is clear-cut
enough that this is ok.
Change-Id: I0565c0e6c1be7d82c3f35d2545ba044a684bb075
Reviewed-on: http://gerrit.cloudera.org:8080/5314
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Adds support in the shell to report the number of modified
rows for all DML operations, as well as the number of rows
with errors.
Testing: Added shell tests.
Change-Id: I3d3d7aa8d176e03ea58fb00f2a81fb3e34965aa1
Reviewed-on: http://gerrit.cloudera.org:8080/5103
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins