impala-shell does not set any socket timeout while connecting to the
impala server. This change sets a timeout on the socket before
connecting and unsets it back after successfully connecting. The default
timeout on this socket is 5 sec.
Usage: impala-shell --client_connect_timeout=<value in ms>
Testing:
1. Added a test where I create a random listening socket.
impala-shell (with ssl enabled) connects to this socket and
times out after 2 sec.
2. Created a kerberized impala cluster with ssl enabled and
connected to the impalad using an openssl client (block the
beeswax server thread to accept new connection) -
E.g. - openssl s_client -connect <IP Addr>:21000
Used impala-shell to connect to the same impalad later.
impala-shell timed out after the default of 5 sec.I verified
it manually.
Change-Id: I130fc47f7a83f591918d6842634b4e5787d00813
Reviewed-on: http://gerrit.cloudera.org:8080/11540
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Prior to this patch, Impala shell --var could not accept values from
other variables unlike the one in Impala interactive shell with the SET
command. This patch refactors the logic of variable substitution to
use the same logic in both interactive and command line shells.
Example:
$ impala-shell.sh \
--var="msg1=1" \
--var="msg2=\${var:msg1}2" \
--var="msg3=\${var:msg1}\${var:msg2}"
[localhost:21000] default> select ${var:msg3};
Query: select 112
+-----+
| 112 |
+-----+
| 112 |
+-----+
Testing:
- Added a new shell test
- Ran all shell tests
Change-Id: Ib5b9fda329c45f2e5682f3cbc76d29ceca2e226a
Reviewed-on: http://gerrit.cloudera.org:8080/11623
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
test_cancellation runs a shell process, executes a query, sleeps,
sends a sigint to the process, and then checks that the query is
cancelled. If the sigint is sent prior to the shell installing its
signal handler, the test can fail with a KeyboardInterrupt.
This patch removes the reliance on the sleep being long enough by
actually reading the output of the shell and only cancelling the
query once the shell shows that it has started running.
Testing:
- Ran test_cancellation in a loop.
Change-Id: I65302ffb838d5185f77853bc2e53296f3a701d93
Reviewed-on: http://gerrit.cloudera.org:8080/11255
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Thomas Marshall <thomasmarshall@cmu.edu>
This patch fixes the flaky test by rewriting the test to perform a
large query from a non-existent table instead since the test focuses
on Impala shell performance and not Impala performance in general.
This patch also reduces the time limit from 30 seconds to 10 seconds
and increases the number of columns in the query to 10K.
Testing:
- Ran all shell tests
Change-Id: Ic87891f34872da65aac5ce02caf01da1c050efa5
Reviewed-on: http://gerrit.cloudera.org:8080/11201
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This fixes a flakiness in test_multiline_queries_in_history wherein a
part of the shell prompt would be absorbed in a previous regex search
that would ultimately result in the failure of the subsequent regex
search that looks for the prompt.
Also fixed a few formatting issues flagged by flake8.
Change-Id: If7474f832a88bc29b321f21b050c9665294e63d5
Reviewed-on: http://gerrit.cloudera.org:8080/11175
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This adds an IMPALA_HISTFILE environment variable (and --history_file
argument) to the shell which overrides the default location of
~/.impalahistory for the shell history. The shell tests now override
this variable to /dev/null so they don't store history. The tests that
need history use a pytest fixture to use a temporary file for their
history. This allows so that they can run in parallel without stomping
on each other's history.
This also fixes a couple flaky test which were previously missing the
"execute_serially" annotation -- that annotation is no longer needed
after this fix.
A couple of the tests still need to be executed serially because they
look at metrics such as the number of executed or running queries, and
those metrics are unstable if other tests run in parallel.
I tested this by running:
./bin/impala-py.test tests/shell/test_shell_interactive.py \
-m 'not execute_serially' \
-n 80 \
--random
... several times in a row on an 88-core box. Prior to the change,
several would fail each time. Now they pass.
Change-Id: I1da5739276e63a50590dfcb2b050703f8e35fec7
Reviewed-on: http://gerrit.cloudera.org:8080/11045
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Todd Lipcon <todd@apache.org>
This change adds a new query option "timezone" which
defines the timezone used for utc<->local conversions.
The main goal is to simplify testing, but I think that
some users may also find it useful so it is added as a
"general" query option.
Examples:
set timezone=UTC;
set timezone="Europe/Budapest"
The timezones are validated, but as query options are not
sent to the coordinator immediately, the error checking
will only happen when running a query.
Leading/trailing " and 'characters are stripped because the
/ character cannot be entered unquoted in some contexts.
Currently the timezone has effect in the following cases:
-function now()
-conversions between unix time and timestamp if flag
use_local_tz_for_unix_timestamp_conversions is true
-reading parquet timestamps written by Hive if flag
convert_legacy_hive_parquet_utc_timestamps is true
In the near future Parquet timestamps's isAdjustedToUTC
property will be supported, which will decide whether
to do utc->local conversion on a per file+column basis.
This conversion will be also affected.
Testing:
- Extended test_local_tz_conversion.py to actually
test utc<->local conversion. Until now the effect
of flag use_local_tz_for_unix_timestamp_conversions
was practically untested.
- Added a shell test to check that the default of the
query option is the system's timezone.
- Added a shell test to check timezone validation.
Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272
Reviewed-on: http://gerrit.cloudera.org:8080/11064
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This change provides a way to modify command-line options like -B,
--output_file and --delimiter inside impala-shell without quitting
the shell and then restarting again.
Also fixed IMPALA-7286: command "unset" does not work for shell options
Testing:
Added tests for all new options in test_shell_interactive.py
Tested on Python 2.6 and Python 2.7
Change-Id: Id8d4487c24f24806223bfd5c54336914e3afd763
Reviewed-on: http://gerrit.cloudera.org:8080/10900
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes the slow performance in Impala shell, especially for
large queries by replacing all calls to sqlparse.format(sql_string,
strip_comments=True) with the custom implementation of strip comments
that does not use grouping. The code to strip leading comments was also
refactored to not use grouping.
* Benchmark running a query with 12K columns *
Before the patch:
$ time impala-shell.sh -f large.sql --quiet
real 2m4.154s
user 2m0.536s
sys 0m0.088s
After the patch:
$ time impala-shell.sh -f large.sql --quiet
real 0m3.885s
user 0m1.516s
sys 0m0.048s
Testing:
- Added a new test to test the Impala shell performance
- Ran all shell tests on Python 2.6 and Python 2.7
Change-Id: Idac9f3caed7c44846a8c922dbe5ca3bf3b095b81
Reviewed-on: http://gerrit.cloudera.org:8080/10939
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch removes write support for unsupported formats like Sequence,
Avro and compressed text. Also, the related query options
ALLOW_UNSUPPORTED_FORMATS and SEQ_COMPRESSION_MODE have been migrated
to the REMOVED query options type.
Testing:
Ran exhaustive build.
Change-Id: I821dc7495a901f1658daa500daf3791b386c7185
Reviewed-on: http://gerrit.cloudera.org:8080/10823
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
test_shell_commandline.py::TestImpalaShell::test_socket_opening
uses netcat to listen to an ephemeral port to verify the expected
socket opening behavior of impala-shell.
This port number is fixed to 42000. When this port happens to
be used by another outbound socket, this test will fail.
This change refactors the test to use socket.bind(). The port used
in this test is no longer fixed and will be picked automatically.
This change also adds the proper cleanup logics to the various
subprocess.Popen objects used in the test.
Change-Id: Idd64632ded936d49fc404bcac75588dd7886be44
Reviewed-on: http://gerrit.cloudera.org:8080/10747
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
After additional testing around IMPALA-2782, it was discovered
that impala-shell starts the session displaying the expected
hostname (as passed -i flag) on the prompt. This gives the
impression that the load balancer was bypassed, however the
actual TSSLSocket is still created with the hostname passed
in via the -b or --kerberos_host_fqdn flag.
This change ensures that the hostname used to create the
TSSLSocket will always be the one passed in via the -i flag
on impala-shell. This change is required by IMPALA-2782.
Testing:
Using netcat, we verified that the impala daemon host[:port]
value passed into the -i/--impalad option is indeed the one
impala-shell tries to connect to in both cases (with and
without -b)
Change-Id: Ibee05bd0dbe8c6ae108b890f0ae0f6900149773a
Reviewed-on: http://gerrit.cloudera.org:8080/10580
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes an issue where parseline is unable to deduce the
correct command when a statement has a leading comment.
Before:
> -- comment
> insert into table t values(100);
Fetched 1 row(s) in 0.01s
After:
> -- comment
> insert into table t values(100);
Modified 1 row(s) in 0.01s
Before (FE syntax error):
> /*comment*/ help;
After (show help correctly):
> /*comment*/ help;
Testing:
- Added shell tests
- Ran end-to-end shell tests on Python 2.6 and Python 2.7
Change-Id: I7ac7cb5a30e6dda73ebe761d9f0eb9ba038e14a7
Reviewed-on: http://gerrit.cloudera.org:8080/9933
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
This patch fixes the issue where non-matching quotes inside comments
will cause the shell to not terminate.
The fix is to strip any SQL comments before sending to shlex since shlex
does not understand SQL comments and will raise an exception when it
sees unmatched quotes regardless whether the quotes are in the comments or
not.
Testing:
- Added new shell tests
- Ran all end-to-end shell tests on Python 2.6 and Python 2.7
Change-Id: I2feae34026a7e63f3d31489f757f093a73ca5d2c
Reviewed-on: http://gerrit.cloudera.org:8080/10541
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
test_kudu_dml_reporting has been causing a large number of build
failures. Temporarily disable it while we figure out what's going on.
Also improve output of test_kudu_dml_reporting on failure.
Change-Id: I222e4c86a50f2450201fbad8b937e8fcf4fac31d
Reviewed-on: http://gerrit.cloudera.org:8080/10527
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes the issue where non-matching quotes inside comments
will cause the shell to not terminate.
The fix is to strip any SQL comments before sending to shlex since shlex
does not understand SQL comments and will raise an exception when it sees
unmatched quotes regardless whether the quotes are in the comments or
not.
Testing:
- Added new shell tests
- Ran all end-to-end shell tests
Change-Id: Ic899fdddc182947f73101ddbc2e3c8caf97d9085
Reviewed-on: http://gerrit.cloudera.org:8080/10474
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tweak the query so that it still runs for a long time but
can cancel the fragment quicker instead of being
stuck in a long sleep() call.
Change-Id: I0c90d4f5c277f7b0d5561637944b454f7a44c76e
Reviewed-on: http://gerrit.cloudera.org:8080/10499
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Tim Armstrong <tarmstrong@cloudera.com>
This patch fixes a bug in sqlparse where sqlparse incorrectly splits a
statement that has a new line inside double quotes. The bug in sqlparse
causes Impala shell to go to infinite loop when a statement contains a
new line inside double quotes.
The patch in sqlparse is based on the upstream fix at
https://github.com/andialbrecht/sqlparse/pull/396
Testing:
- Added new end-to-end shell tests
- Ran end-to-end shell tests
Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b
Reviewed-on: http://gerrit.cloudera.org:8080/9195
Reviewed-by: David Knupp <dknupp@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
test_shall_commandline:test_cancellation starts an Impala shell
process, runs a query, sleeps briefly, and then cancels the query by
sending a SIGINT to the process. This has been occasionally failing
with either the error 'KeyboardInterrupt' or with the query succeeding
instead of being cancelled.
The problem occurs if the process hasn't fully started up before the
SIGINT is sent - in particular, if ImpalaShell:__init__ hasn't
installed the signal handler, which happens sometimes depending on
concurrent load on the machine. Depending on the exact timing, this
may cause a 'KeyboardInterrupt' that isn't handled, or the signal
may be ignored and the query allowed to run to completion.
The solution is to increase the time spent sleeping.
Testing:
- I can reliably repro the problem locally by reducing the sleep time.
Change-Id: I5d13de6207807e4ba2e2e406a29d670f01d6c3a0
Reviewed-on: http://gerrit.cloudera.org:8080/10177
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Thrift 0.9.3 implements "ostream& operator<<(ostream&, T)" for thrift
data types while impala did the same to enums and special types
including TNetworkAddress and TUniqueId. To prepare for the upgrade of
thrift 0.9.3, this patch renames these impala defined functions. In the
absence of operator<<, assertion macros like DCHECK_EQ can no longer be
used on non-enum thrift defined types.
Change-Id: I9c303997411237e988ef960157f781776f6fcb60
Reviewed-on: http://gerrit.cloudera.org:8080/9168
Reviewed-by: Tianyi Wang <twang@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The bug is that PrettyOutputFormatter.format() returned a unicode
object, and Python cannot automatically write unicode objects to
output streams where there is no default encoding.
The fix is to convert to UTF-8 encoded in a regular string, which
can be output to any output device. This makes the output type
consistent with DelimitedOutputFormatter.format().
Based on code by Marcell Szabo.
Testing:
Added a basic test.
Played around in an interactive shell to make sure that unicode
characters still work in interactive mode.
Change-Id: I9de641ecf767a2feef3b9f48b344ef2d55e17a7f
Reviewed-on: http://gerrit.cloudera.org:8080/9928
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
We have seen this test fail because the fully-qualified domain name
differed between the python test process and the impala shell process
(see JIRA for details). The exact domain name is irrelevant to the test
- we only really care about whether the prompt appeared or not.
Change-Id: I24078ef97d56e5bb32fd866af861e3a1d19c8c44
Reviewed-on: http://gerrit.cloudera.org:8080/9831
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
When passing comamnd line options to a new instance of the
ImpalaShell, we ususally transfer the options to member
variables of that new instance. We weren't doing that with
all of the LDAP-related options, even though we wanted to
access them later. In some environments and under certain
conditions, this could then lead to a NameError exception
being thrown.
This patch takes away any reliance on the original options
object returned by parse_args() beyond the __init__()
method of the ImpalaShell class, by tranferring all LDAP
options to member variables. Also, a test has been added to
exercise the code path where the exception had been occurring.
Change-Id: I810850f569ef3f4487f7eeba81ca520dc955ac2e
Reviewed-on: http://gerrit.cloudera.org:8080/9744
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
This patch is to show query submitted and query progress web links in
Impala shell for CTE queries.
Testing:
- Ran end-to-end shell tests
Change-Id: Ie3352406e3b048be395a20405c8e6b911e663164
Reviewed-on: http://gerrit.cloudera.org:8080/9537
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
The semicolon was in the wrong place in one of the test queries and
the failure was swallowed silently.
This meant that one fewer prompt was displayed than expected. This
didn't cause a test failure because the prompt regex also matched the
"Connected to host:port" message printed in the shell preamble. I'm
unsure why this would cause the test failure but my best theory is that
in the failure case, the "Connected" and prompt messages are both
buffered when we evaluate the first prompt regex, and the regex swallows
up the whole input, rather than just the first instance.
Testing:
Tightened up the prompt regex and checked that the query actually
executed successfully. With these improvements, the broken query
text caused a test failure.
I looped the test for a while to make sure it was robust.
Added a couple of related test cases to make sure we aren't losing
coverage.
Change-Id: If917bbc8e87b83c188b6d5e1acad912892b8c6fe
Reviewed-on: http://gerrit.cloudera.org:8080/9441
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
The patch is to remove any comments in a statement when checking if a
statement ends with a semicolon delimiter.
For example:
Before (semicolon delimiter is needed at the end):
select 1 + 1; -- comment\n;
After (semicolon delimiter is no longer needed):
select 1 + 1; -- comment
Testing:
- Ran end-to-end tests in shell
Change-Id: I54f9a8f65214023520eaa010fc462a663d02d258
Reviewed-on: http://gerrit.cloudera.org:8080/9191
Reviewed-by: Fredy Wijaya <fwijaya@cloudera.com>
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Impala Public Jenkins
Adds a concept of a "removed" query option that has no effect but does
not return an error when a user attempts to set it. These options are
not returned by "set" or "set all" commands that are executed in
impala-shell or server-side.
These query options have been deprecated for several releases:
DEFAULT_ORDER_BY_LIMIT, ABORT_ON_DEFAULT_LIMIT_EXCEEDED,
V_CPU_CORES, RESERVATION_REQUEST_TIMEOUT, RM_INITIAL_MEM,
SCAN_NODE_CODEGEN_THRESHOLD, MAX_IO_BUFFERS
RM_INITIAL_MEM did still have an effect, but it was undocumented and
MEM_LIMIT should be used in preference.
DISABLE_CACHED_READS also had an effect but it was documented as
deprecated.
Otherwise the options had no effect at all.
Testing:
Ran exhaustive build.
Updated query option tests to reflect the new behaviour.
Cherry-picks: not for 2.x.
Change-Id: I9e742e9b0eca0e5c81fd71db3122fef31522fcad
Reviewed-on: http://gerrit.cloudera.org:8080/9118
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
This removes the deprecated option in time for 3.0.
Testing:
Ran core tests. Manually ran the shell with the argument to confirm that
it reported "no such option".
Cherry-picks: not for 2.x.
Change-Id: I8f430bad0578e150d5e80066b9e7572041af4a15
Reviewed-on: http://gerrit.cloudera.org:8080/9072
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Jenkins jobs occasionally hang on test_query_cancellation_during_fetch.
There was a workaround proposal submitted under this Jira ID, however,
apparently jobs still hang on this test randomly. Reverting the
workaround and skipping the test until further fix proposal provided.
This reverts commit 7810d1f9a2.
Change-Id: I51acee49b5a17c4852410b7568fd1d092b114a6d
Reviewed-on: http://gerrit.cloudera.org:8080/8972
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Apparently test_query_cancellation_during_fetch hangs occasionally
in Jenkins builds. The Impala debug page shows the query being
cancelled, however, on the host the ImpalaShell process related to
that query is still running.
Since I had no luck in reproducing the issue locally I only have a
theory what might be going on here: The query is cancelled
successfully on Impala backend and when the test tries to get the
stdout and stderr from the ImpalaShell it gets stuck. It might be
the case that ImpalaShell process fetching the query results holds
the stdout. According to the documentation of subprocess.communicate()
it may cause issues to fetch data when the data size is large or
unlimited, that we can consider to be the case here.
As a workaround there is a new optional parameter to
util.ImpalaShell to omit the stdout because this test wouldn't use
it anyway and we get rid of fetching the large result from
ImpalaShell.
Change-Id: I082c83b91b6d0c527de92c7992f0dc9d1b290433
Reviewed-on: http://gerrit.cloudera.org:8080/8852
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Impala shell can accidentally convert certain
literal strings to lowercase. Impala shell splits
each command into tokens and then converts the
first token to lowercase to figure out how it
should execute the command. The splitting is done
by spaces only. Thus, if the user types a TAB
after the SELECT, the first token after the split
becomes the SELECT plus whatever comes after it.
Testing:
TestImpalaShellInteractive.test_case_sensitive_command
TestImpalaShellInteractive.test_unexpected_conversion_for_literal_string_to_lowercase
TestImpalaShell.test_var_substitution
Change-Id: Ifdce9781d1d97596c188691b62a141b9bd137610
Reviewed-on: http://gerrit.cloudera.org:8080/8762
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
In the query cancellation tests it is essential to wait until the
query gets to a desired state (waiting_to_finish, fetching) and then
cancel it. Apparently, ASAN query execution happens slower than on a
Release build. As a result a hard coded timeout threshold is not
sufficient to cover all the builds, or should be set to a wastingly
high value.
As a solution the query state is checked on the Impala debug page in
intervals until it reaches the desired state or the maximum retry
attempt value is reached.
Change-Id: Ie0bff485a872df7be8efd784314a6ca91aaadd11
Reviewed-on: http://gerrit.cloudera.org:8080/8713
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Issue 1: When query is cancelled via CTRL-C while being executed in Impala-shell
then an exception is thrown from Impala backend saying 'Invalid query handle'.
This is because one ImpalaClient was making RPC's while another ImpalaClient
cancelled the query on the backend. As a result RPC handlers in ImpalaServer
try to access a ClientRequestState that had been cleared from the backend. The
issue is confidently reproducable both in wait_to_finish and in fetch states of
the query.
As a solution the query cancellation is indicated to ImpalaClient via a bool
flag. Once a cancellation originated exception reaches Impala shell this flag
is checked to decide whether to suppress the error or not.
Issue 2: Every time a query was cancelled a 'use db' command was issued
automatically. This happened to historical reasons but is not needed anymore
(see Jira for more details).
Change-Id: I6cefaf1dae78baae238289816a7cb9d210fb38e2
Reviewed-on: http://gerrit.cloudera.org:8080/8549
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Four display levels are introduced for each query option: REGULAR, ADVANCED,
DEVELOPMENT and DEPRECATED. When the query options are displayed in Impala
shell using SET then only the REGULAR and ADVANCED options are shown. A new
command called SET ALL shows all the options grouped by their option levels.
When the query options are displayed through the SET SQL statement then the
result set would contain an extra column indicating the level of each option.
Similarly to Impala shell here the SET command only diplays the REGULAR and
ADVANCED options while SET ALL shows them all.
If the Impala shell connects to an Impala daemon that predates this change
then all the options would be displayed in the REGULAR group.
Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Reviewed-on: http://gerrit.cloudera.org:8080/8447
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Impala Public Jenkins
The ImpalaShell didn't issue the 'USE <current-db>' command after
reconnecting to the Impala daemon. Therefore the client session
used the default DB after reconnection, not the previously selected DB.
Setting the current DB is done by the _validate_database method.
Before this commit it appended the "use <db>" command to the
command queue of the Cmd class. But, at this point we might already
have commands in the command queue that will run before the
"use <db>" command. In case of reconnection, we want to invoke
the USE command right away.
Also, the command processed by the precmd() method can entirely skip
the command queue, therefore it is not enough to insert the USE
command to the front of the command queue. We need to issue the
USE command with the onecmd() method to execute it immediately.
I extended the _validate_database method with an "immediately" flag.
If this flag is true, _validate_database will use the onecmd() method.
Otherwise, it will append the USE command to the command queue to
maintain the previous behaviour.
I added a new automated test suite named test_shell_interactive_reconnect.py
to the "custom cluster" tests. It sets the default database, and after
reconnection it checks if the shell set it again automatically.
One test case checks if the shell set the DB after manually reconnecting
to the impala daemon by issuing the CONNECT command.
The other test case checks if the shell set the DB after automatic
reconnection due to cluster restart.
I needed to backup the impala shell history file because I didn't
want to pollute it by the test cases (just like the way it is done in
tests/shell/test_shell_interactive.py). I created utility functions for
this in tests/shell/util.py and now test_shell_interactive.py and
the newly created test suite are using these utility functions.
Change-Id: I40dfa00ba0314d356fe8617446f516505c925e5e
Reviewed-on: http://gerrit.cloudera.org:8080/8368
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
Query options can be set from command line and impala rc as
key=value pairs, where key is case insensitive.
Examples:
command line:
impala-shell.sh -Q MT_DOP=1 --query_option=MAX_ERRORS=200
.impalarc:
[impala.query_options]
EXPLAIN_LEVEL=2
MT_DOP=2
The options set in command line will update the ones
in impalarc one by one, so the result of the example
above will be:
EXPLAIN_LEVEL=2
MT_DOP=1
MAX_ERRORS=200
Additional changes:
- 0 and 1 are accepted as bools in section [impala] to
make it more consistent with [impala.query_options]
- options that are expected to be bool but are not
0/1/true/false lead to error instead of warning
Change-Id: I26a3b67230c80a99bd246b6af205d558fec9a986
Reviewed-on: http://gerrit.cloudera.org:8080/8038
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
(Re-applies reverted commit 387bde0639.
The commit broke ASAN tests due to a race in how test infrastructure
re-uses connections. The fix for that is in an adjacent commit.)
When converting TQueryOptions to a map<string,string>, we now convert
unset options to the empty string. Within TQueryOptions we have optional
options (like mt_dop or compression_codec) with no default specified. In
this case, the user was seeing 0 for numeric types and the first enum
option for enumeration types (e.g., "NONE" in the compression case).
This was confusing as the implementation handles this "null" case
differently (e.g., using SNAPPY as the default codec in the case
reported in the JIRA).
When running "set" in impala-shell, the difference is as
follows:
- BUFFER_POOL_LIMIT: [0]
+ BUFFER_POOL_LIMIT: []
- COMPRESSION_CODEC: [NONE]
+ COMPRESSION_CODEC: []
- MT_DOP: [0]
+ MT_DOP: []
- RESERVATION_REQUEST_TIMEOUT: [0]
+ RESERVATION_REQUEST_TIMEOUT: []
- SEQ_COMPRESSION_MODE: [0]
+ SEQ_COMPRESSION_MODE: []
- V_CPU_CORES: [0]
+ V_CPU_CORES: []
Obviously, the empty string is a valid value for a string-typed option, where
it will be impossible to tell the difference between "unset" and "set to empty
string." Today, there are two string-typed options: debug_string defaults to ""
and request_pool has no default. An alternative would have been to use
a special token like "_unset" or to introduce a new field in the beeswax
Thrift ConfigVariable struct. I think the empty string approach
is clearest.
The other users of this state, which I believe are HiveServer2's OpenSession()
call and HiveServer2's response to a "SET" query are affected. They
benefit from the same fix, and a new test has been added to test_hs2.py.
I did a mild refactoring in the HS2 tests to write a helper method
for the very common pattern of excecuting a query.
Testing:
* Manual testing with impala-shell
* Modified impala-shell tests to check this explicitly for one case.
* Modified HS2 test to check this as well as the SET k=v statement,
which looked otherwise untested.
Change-Id: I29f5d8ab874cb1338077f16019a9537766cac0c4
Reviewed-on: http://gerrit.cloudera.org:8080/8096
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Impala Public Jenkins
Impala-shell crashes with 2 source commands on the same line and runs
a command multiple times if it shares the same line with a source
command.
The bug is caused by a misuse of cmdqueue. The cmdqueue member of
cmd.Cmd is used to execute commands not directly from user input in an
event loop. When a 'source' is run, execute_query_list() is called which
also executes the commands in cmdqueue, causing them to be executed
twice.
The fix is for execute_query_list() to not run the commands in cmdqueue.
For the non-interactive case, where the event loop won't be run, we call
execute_query_list() with cmdqueue so that the commands get run.
A test case is added to test_shell_interactive.py.
Change-Id: I453af2d4694d47e184031cb07ecd2af259ba20f3
Reviewed-on: http://gerrit.cloudera.org:8080/8063
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
Due to re-use of connections in the test infrastructure, this commit
is causing ASAN failures in the builds. That is being worked out
as part of IMPALA-5908, but, in the meanwhile, it's prudent
to revert.
This reverts commit 387bde0639.
Change-Id: I41bc8ab0f1df45bbd311030981a7c18989c2edc8
Reviewed-on: http://gerrit.cloudera.org:8080/8087
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Impala Public Jenkins
When converting TQueryOptions to a map<string,string>, we now convert
unset options to the empty string. Within TQueryOptions we have optional
options (like mt_dop or compression_codec) with no default specified. In
this case, the user was seeing 0 for numeric types and the first enum
option for enumeration types (e.g., "NONE" in the compression case).
This was confusing as the implementation handles this "null" case
differently (e.g., using SNAPPY as the default codec in the case
reported in the JIRA).
When running "set" in impala-shell, the difference is as
follows:
- BUFFER_POOL_LIMIT: [0]
+ BUFFER_POOL_LIMIT: []
- COMPRESSION_CODEC: [NONE]
+ COMPRESSION_CODEC: []
- MT_DOP: [0]
+ MT_DOP: []
- RESERVATION_REQUEST_TIMEOUT: [0]
+ RESERVATION_REQUEST_TIMEOUT: []
- SEQ_COMPRESSION_MODE: [0]
+ SEQ_COMPRESSION_MODE: []
- V_CPU_CORES: [0]
+ V_CPU_CORES: []
Obviously, the empty string is a valid value for a string-typed option, where
it will be impossible to tell the difference between "unset" and "set to empty
string." Today, there are two string-typed options: debug_string defaults to ""
and request_pool has no default. An alternative would have been to use
a special token like "_unset" or to introduce a new field in the beeswax
Thrift ConfigVariable struct. I think the empty string approach
is clearest.
The other users of this state, which I believe are HiveServer2's OpenSession()
call and HiveServer2's response to a "SET" query are affected. They
benefit from the same fix, and a new test has been added to test_hs2.py.
I did a mild refactoring in the HS2 tests to write a helper method
for the very common pattern of excecuting a query.
Testing:
* Manual testing with impala-shell
* Modified impala-shell tests to check this explicitly for one case.
* Modified HS2 test to check this as well as the SET k=v statement,
which looked otherwise untested.
Change-Id: I86bc06a58d67b099da911293202dae9e844c439b
Reviewed-on: http://gerrit.cloudera.org:8080/7886
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Impala Public Jenkins
This patch adds a new command "rerun" and a shortcut "@" to impala-shell
. Users can rerun a certain query by its index given by history command.
A valid index is an integer in [1, history_length] or
[-history_length, -1]. Negative values index history in reverse order.
For example, "@1;" or "rerun 1;" reruns the first query shown in history
and "@-1;" reruns the last query. The rerun command itself won't appear
in history. The history index is 1-based and increasing. Old entries
might be truncated when impala-shell starts, and the indexes will be
realigned to 1, so the same index may refer to different commands among
multiple impala-shell instances.
Testing: A test case test_rerun is added to
shell/test_shell_interactive.py
Change-Id: Ifc28e8ce07845343267224c3b9ccb71b29a524d2
Reviewed-on: http://gerrit.cloudera.org:8080/7674
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Impala Public Jenkins
Function print_to_stderr() has a syntax error when an error message is
displayed.
I solved this problem by exchanging the position of variable and the
subsequent strings in function print_to_stderr().
Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6
Reviewed-on: http://gerrit.cloudera.org:8080/7187
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins
This change avoids printing blank lines when the Impala
shell fetches 0 rows from a statement.
Change-Id: I6e18ce36be07ee90a16b007b1e30d5255ef8a839
Reviewed-on: http://gerrit.cloudera.org:8080/7055
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
This test is failing on distros that don't support Kudu, but it
shouldn't even be run.
Tested by setting KUDU_IS_SUPPORTED to false, and then trying to
run the test, confirming that it gets skipped. When the env var
KUDU_IS_SUPPORTED is true, the test runs.
Change-Id: Ia36319228d4e9cac9cb675f3207ef2ba39f24e7e
Reviewed-on: http://gerrit.cloudera.org:8080/5854
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Impala Public Jenkins
This commit also removes the now unused `DISTRIBUTE`, `SPLIT`, and
`BUCKETS` keywords that were going to be newly released in Impala 2.6,
but are now unused. Additionally, a few remaining uses of the
`DISTRIBUTE BY` syntax has been switched to `PARTITION BY`.
Change-Id: I32fdd5ef26c532f7a30220db52bdfbf228165922
Reviewed-on: http://gerrit.cloudera.org:8080/5382
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins