209 Commits

Author SHA1 Message Date
Jinchul
bfbcd1fe86 IMPALA-4664: Unexpected string conversion in Shell
Impala shell can accidentally convert certain
literal strings to lowercase. Impala shell splits
each command into tokens and then converts the
first token to lowercase to figure out how it
should execute the command. The splitting is done
by spaces only. Thus, if the user types a TAB
after the SELECT, the first token after the split
becomes the SELECT plus whatever comes after it.
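
The effect described above can be reproduced with a minimal sketch. The helper names are hypothetical, and splitting on any whitespace is shown only as one possible remedy, not necessarily what this patch does:

```python
def first_token_space_only(line):
    # Splitting on the space character only: a TAB stays glued to the keyword.
    return line.split(' ')[0]


def first_token_any_whitespace(line):
    # str.split() without arguments splits on any run of whitespace.
    return line.split()[0]


query = "SELECT\t'ABC' AS col"

# The literal is lowercased together with the keyword.
buggy = first_token_space_only(query).lower()

# Only the keyword itself is lowercased.
fixed = first_token_any_whitespace(query).lower()
```

With the space-only split the string literal ends up inside the "first token", so lowercasing the token also lowercases the literal.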

Testing:
TestImpalaShellInteractive.test_case_sensitive_command
TestImpalaShellInteractive.test_unexpected_conversion_for_literal_string_to_lowercase
TestImpalaShell.test_var_substitution

Change-Id: Ifdce9781d1d97596c188691b62a141b9bd137610
Reviewed-on: http://gerrit.cloudera.org:8080/8762
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-12-15 21:32:20 +00:00
Jinchul
9560d883e2 IMPALA-4506: Do not display some intro message if --quiet is set
Change-Id: I19c6d00dfbbe805ee9c525b72eb5703840e2f582
Reviewed-on: http://gerrit.cloudera.org:8080/8613
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
2017-11-30 18:30:47 +00:00
Gabor Kaszab
6d9da17288 IMPALA-1144: Fix exception when cancelling query in Impala-shell with CTRL-C
Issue 1: When query is cancelled via CTRL-C while being executed in Impala-shell
then an exception is thrown from Impala backend saying 'Invalid query handle'.
This is because one ImpalaClient was making RPC's while another ImpalaClient
cancelled the query on the backend. As a result RPC handlers in ImpalaServer
try to access a ClientRequestState that had been cleared from the backend. The
issue is consistently reproducible in both the wait_to_finish and fetch states of
the query.

As a solution the query cancellation is indicated to ImpalaClient via a bool
flag. Once a cancellation originated exception reaches Impala shell this flag
is checked to decide whether to suppress the error or not.
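
A rough sketch of the flag-based approach follows. The class, attribute and function names are hypothetical stand-ins, not the shell's actual ones:

```python
class SketchClient(object):
    """Hypothetical stand-in for the shell's client object."""

    def __init__(self):
        self.is_query_cancelled = False

    def cancel(self):
        # Set by the cancellation path (for example a CTRL-C handler).
        self.is_query_cancelled = True

    def fetch(self):
        # Simulates the backend rejecting an RPC for a cleared query.
        raise RuntimeError("Invalid query handle")


def run_query(client):
    try:
        return client.fetch()
    except RuntimeError:
        if client.is_query_cancelled:
            # The error is a side effect of the cancellation: suppress it.
            return "Cancelled"
        raise
```

An error that arrives without the flag being set is still propagated, so genuine failures are not hidden.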

Issue 2: Every time a query was cancelled a 'use db' command was issued
automatically. This happened for historical reasons but is not needed anymore
(see Jira for more details).

Change-Id: I6cefaf1dae78baae238289816a7cb9d210fb38e2
Reviewed-on: http://gerrit.cloudera.org:8080/8549
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-11-29 03:44:51 +00:00
Gabor Kaszab
88cb68cfbe IMPALA-2181: Add query option levels for display
Four display levels are introduced for each query option: REGULAR, ADVANCED,
DEVELOPMENT and DEPRECATED. When the query options are displayed in Impala
shell using SET then only the REGULAR and ADVANCED options are shown. A new
command called SET ALL shows all the options grouped by their option levels.

When the query options are displayed through the SET SQL statement then the
result set would contain an extra column indicating the level of each option.
Similarly to Impala shell, here the SET command only displays the REGULAR and
ADVANCED options while SET ALL shows them all.

If the Impala shell connects to an Impala daemon that predates this change
then all the options would be displayed in the REGULAR group.
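
The display rule can be sketched as below. The option names are made up for illustration, and representing a missing level as None for older daemons is an assumption:

```python
LEVELS = ["REGULAR", "ADVANCED", "DEVELOPMENT", "DEPRECATED"]


def options_to_display(options, show_all=False):
    """options maps name -> (value, level); level is None if the daemon
    predates option levels (assumed representation)."""
    visible = LEVELS if show_all else LEVELS[:2]
    grouped = {}
    for name in sorted(options):
        value, level = options[name]
        level = level or "REGULAR"  # old daemons: everything is REGULAR
        if level in visible:
            grouped.setdefault(level, []).append((name, value))
    return grouped


# Hypothetical option names, one per level.
sample = {
    "OPT_A": ("1", "REGULAR"),
    "OPT_B": ("2", "ADVANCED"),
    "OPT_C": ("3", "DEVELOPMENT"),
    "OPT_D": ("4", "DEPRECATED"),
    "OPT_E": ("5", None),
}
```

Calling options_to_display(sample) corresponds to SET, and options_to_display(sample, show_all=True) to SET ALL.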

Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Reviewed-on: http://gerrit.cloudera.org:8080/8447
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Impala Public Jenkins
2017-11-28 00:31:15 +00:00
Zoltan Borok-Nagy
6539e89c81 IMPALA-2235: Fix current db when shell auto-reconnects
The ImpalaShell didn't issue the 'USE <current-db>' command after
reconnecting to the Impala daemon. Therefore the client session
used the default DB after reconnection, not the previously selected DB.

Setting the current DB is done by the _validate_database method.
Before this commit it appended the "use <db>" command to the
command queue of the Cmd class. But, at this point we might already
have commands in the command queue that will run before the
"use <db>" command. In case of reconnection, we want to invoke
the USE command right away.

Also, the command processed by the precmd() method can entirely skip
the command queue, therefore it is not enough to insert the USE
command to the front of the command queue. We need to issue the
USE command with the onecmd() method to execute it immediately.

I extended the _validate_database method with an "immediately" flag.
If this flag is true, _validate_database will use the onecmd() method.
Otherwise, it will append the USE command to the command queue to
maintain the previous behaviour.
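
The difference between the two paths can be sketched with Python's standard cmd.Cmd. The class below is a simplified stand-in for the shell, and everything except onecmd() and cmdqueue is hypothetical:

```python
import cmd


class MiniShell(cmd.Cmd):
    """Simplified stand-in for the shell, used only for illustration."""

    def __init__(self):
        cmd.Cmd.__init__(self)
        self.executed = []
        self.current_db = "default"

    def do_use(self, db):
        self.executed.append("use " + db)
        self.current_db = db

    def validate_database(self, db, immediately=False):
        command = "use " + db
        if immediately:
            # Runs right now, ahead of anything already queued.
            self.onecmd(command)
        else:
            # Previous behaviour: runs whenever the event loop reaches it.
            self.cmdqueue.append(command)
```

With immediately=True the USE command takes effect before any queued command gets a chance to run against the wrong database.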

I added a new automated test suite named test_shell_interactive_reconnect.py
to the "custom cluster" tests. It sets the default database, and after
reconnection it checks if the shell set it again automatically.

One test case checks if the shell set the DB after manually reconnecting
to the impala daemon by issuing the CONNECT command.
The other test case checks if the shell set the DB after automatic
reconnection due to cluster restart.

I needed to back up the impala shell history file because I didn't
want to pollute it by the test cases (just like the way it is done in
tests/shell/test_shell_interactive.py). I created utility functions for
this in tests/shell/util.py and now test_shell_interactive.py and
the newly created test suite are using these utility functions.

Change-Id: I40dfa00ba0314d356fe8617446f516505c925e5e
Reviewed-on: http://gerrit.cloudera.org:8080/8368
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-11-15 22:42:22 +00:00
Csaba Ringhofer
0a0affb692 IMPALA-5736: Add impala-shell argument to set default query options
Query options can be set from command line and impala rc as
key=value pairs, where key is case insensitive.

Examples:
command line:
impala-shell.sh -Q MT_DOP=1 --query_option=MAX_ERRORS=200

.impalarc:
[impala.query_options]
EXPLAIN_LEVEL=2
MT_DOP=2

The options set in command line will update the ones
in impalarc one by one, so the result of the example
above will be:
EXPLAIN_LEVEL=2
MT_DOP=1
MAX_ERRORS=200

Additional changes:
- 0 and 1 are accepted as bools in section [impala] to
  make it more consistent with [impala.query_options]
- options that are expected to be bool but are not
  0/1/true/false lead to error instead of warning
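
The precedence and the stricter bool handling can be sketched as follows. The function names are hypothetical and the real shell's parsing may differ in detail:

```python
def merge_query_options(rc_options, cli_options):
    """Command-line options override .impalarc ones; keys are case insensitive."""
    merged = {}
    for source in (rc_options, cli_options):
        for key, value in source.items():
            merged[key.upper()] = value
    return merged


def parse_bool(value):
    """Accept 0/1/true/false (any case); anything else is an error."""
    lowered = value.strip().lower()
    if lowered in ("1", "true"):
        return True
    if lowered in ("0", "false"):
        return False
    raise ValueError("expected 0/1/true/false, got %r" % value)


rc = {"EXPLAIN_LEVEL": "2", "MT_DOP": "2"}
cli = {"mt_dop": "1", "MAX_ERRORS": "200"}
result = merge_query_options(rc, cli)
```

Using the values from the example above, result holds EXPLAIN_LEVEL=2, MT_DOP=1 and MAX_ERRORS=200.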

Change-Id: I26a3b67230c80a99bd246b6af205d558fec9a986
Reviewed-on: http://gerrit.cloudera.org:8080/8038
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
2017-11-03 00:11:31 +00:00
Tim Armstrong
1a74b24bd6 IMPALA-3998: deprecate --refresh_after_connect
This adds a warning to impala-shell if -r/--refresh_after_connect is
used, in anticipation of us removing the feature in a future version.

Change-Id: Id297f80c0f596a69ef8ecde948812b82d2a5c0fa
Reviewed-on: http://gerrit.cloudera.org:8080/8381
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins
2017-10-25 22:32:10 +00:00
Tianyi Wang
bd08ed4230 IMPALA-5416: Fix an impala-shell command recursion bug
Impala-shell crashes with 2 source commands on the same line and runs
a command multiple times if it shares the same line with a source
command.
The bug is caused by a misuse of cmdqueue. The cmdqueue member of
cmd.Cmd is used to execute commands not directly from user input in an
event loop. When a 'source' is run, execute_query_list() is called which
also executes the commands in cmdqueue, causing them to be executed
twice.
The fix is for execute_query_list() to not run the commands in cmdqueue.
For the non-interactive case, where the event loop won't be run, we call
execute_query_list() with cmdqueue so that the commands get run.
A test case is added to test_shell_interactive.py.

Change-Id: I453af2d4694d47e184031cb07ecd2af259ba20f3
Reviewed-on: http://gerrit.cloudera.org:8080/8063
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
2017-09-21 21:41:31 +00:00
Tianyi Wang
c871e007be IMPALA-992: Rerun past queries from history in shell
This patch adds a new command "rerun" and a shortcut "@" to impala-shell.
Users can rerun a certain query by its index given by the history command.
A valid index is an integer in [1, history_length] or
[-history_length, -1]. Negative values index history in reverse order.
For example, "@1;" or "rerun 1;" reruns the first query shown in history
and "@-1;" reruns the last query. The rerun command itself won't appear
in history. The history index is 1-based and increasing. Old entries
might be truncated when impala-shell starts, and the indexes will be
realigned to 1, so the same index may refer to different commands among
multiple impala-shell instances.
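
The index rule can be sketched as a small helper. The function name is hypothetical, and it returns a 0-based list position purely for illustration:

```python
def resolve_history_index(index, history_length):
    """Map a 1-based (or negative) history index to a 0-based list position."""
    if 1 <= index <= history_length:
        return index - 1
    if -history_length <= index <= -1:
        return history_length + index
    raise ValueError("index %d is out of range" % index)


history = ["select 1", "select 2", "select 3"]
first = history[resolve_history_index(1, len(history))]
last = history[resolve_history_index(-1, len(history))]
```

Index 0 and anything beyond the history length fall outside both ranges and are rejected.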

Testing: A test case test_rerun is added to
shell/test_shell_interactive.py

Change-Id: Ifc28e8ce07845343267224c3b9ccb71b29a524d2
Reviewed-on: http://gerrit.cloudera.org:8080/7674
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Impala Public Jenkins
2017-08-23 03:34:45 +00:00
Henry Robinson
e4a0e2f391 IMPALA-5775: Allow shell to support TLSv1, v1.1 and v1.2
The shell uses Thrift's TSSLSocket to negotiate secure connections to
Impala. This socket uses a variable SSL_VERSION to determine which SSL
and TLS protocol versions it will connect to.

SSL_VERSION was hardcoded to be PROTOCOL_TLSv1, which only supports
TLSv1 servers and no other protocol version. Change the allowed version
to be PROTOCOL_SSLv23, which supports any TLS or SSL protocol. We rely
on the server not to allow SSLv2 or v3 connections.

Testing: Added a new custom cluster test to confirm that the shell can
connect to a TLSv1.2 cluster. Confirmed that the test is correctly
skipped on machines with an old version of OpenSSL that does not support
TLSv1.2.

Change-Id: I5487f82d110676b9c3c7a5305931da00c7f68ca0
Reviewed-on: http://gerrit.cloudera.org:8080/7675
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-08-16 08:10:02 +00:00
Thomas Tauber-Marshall
6757b6235c IMPALA-5708: Test failure with invalid exec summary
For some queries, the exec summary will not be completely filled
in even if the query is FINISHED. In particular, the exec_stats field
may not be set. This was causing an error in our test code that
converts the exec summary to a more usable format.

The situation is essentially deterministic for some queries, but
it was being hidden by testing code that caught the error and
discarded it in most situations, leading to flaky tests.

This patch removes the 'try' that was hiding the error and makes
the code check for the presence of exec_stats and handle it rather
than generating an error.
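
A minimal sketch of the explicit check is shown below. The helper name and the use of a cardinality attribute on each stats entry are assumptions made for illustration:

```python
from types import SimpleNamespace


def total_cardinality(node):
    """Return the summed cardinality, or None if exec_stats is not filled in."""
    stats = getattr(node, "exec_stats", None)
    if not stats:
        # Handle the incomplete summary explicitly instead of catching an error.
        return None
    return sum(s.cardinality for s in stats)


complete = SimpleNamespace(
    exec_stats=[SimpleNamespace(cardinality=3), SimpleNamespace(cardinality=4)])
incomplete = SimpleNamespace(exec_stats=None)
```

Because the missing field is handled up front, no broad 'try' is needed and unrelated errors are no longer swallowed.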

I filed IMPALA-5783 for followup work to be more rigorous about
when the exec summary should and shouldn't be fully present.

Testing:
- Ran the affected tests in a loop and they are no longer flaky.

Change-Id: Id52ac62da2b01f9e163e97cbe4590f8db6b663d2
Reviewed-on: http://gerrit.cloudera.org:8080/7627
Tested-by: Impala Public Jenkins
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
2017-08-14 19:35:12 +00:00
davidxdh
19005e6e47 IMPALA-5513: Fix display message exception when using invalid KEYVAL
Function print_to_stderr() has a syntax error when an error message is
displayed.

I solved this problem by exchanging the position of variable and the
subsequent strings in function print_to_stderr().

Change-Id: Ib883499a88f39d91b69bea4291f1ce5dd264ccf6
Reviewed-on: http://gerrit.cloudera.org:8080/7187
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins
2017-07-18 01:23:36 +00:00
davidxdh
36cca141e0 IMPALA-5507: Add clear description to help information of KEYVAL option
Help information of KEYVAL option in impala-shell is not clear enough.

I fix this issue by adding clear description to help information of
KEYVAL option.

Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63
Reviewed-on: http://gerrit.cloudera.org:8080/7179
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
2017-07-11 11:05:14 +00:00
davidxdh
bd3b95e3a9 IMPALA-5514: Throw an error when --ldap_password_cmd is used without LDAP auth
When only the ldap_password_cmd option is given, impala-shell runs successfully.

I solved this problem by throwing an error when --ldap_password_cmd is
used without LDAP auth, that is, ldap_password_cmd option will only
take effect if the ldap option is present.
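
The check can be sketched with argparse, used here only as a stand-in for the shell's own option parsing. The error text is hypothetical:

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser(prog="shell-sketch")
    parser.add_argument("-l", "--ldap", action="store_true")
    parser.add_argument("--ldap_password_cmd")
    args = parser.parse_args(argv)
    if args.ldap_password_cmd and not args.ldap:
        # parser.error() prints the message and exits with status 2.
        parser.error("--ldap_password_cmd is only valid with --ldap")
    return args
```

Passing both options together is accepted, while --ldap_password_cmd on its own terminates with an error.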

Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e
Reviewed-on: http://gerrit.cloudera.org:8080/7188
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins
2017-07-11 08:43:11 +00:00
Vincent Tran
1fc7e65723 IMPALA-4418: Fixes extra blank lines in query result
This change avoids printing blank lines when the Impala
shell fetches 0 rows from a statement.

Change-Id: I6e18ce36be07ee90a16b007b1e30d5255ef8a839
Reviewed-on: http://gerrit.cloudera.org:8080/7055
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
2017-06-16 09:33:40 +00:00
davidxdh
e2532a96c8 IMPALA-5506: Add stdin description to help information of query_file option
Help information of query_file option in impala-shell misses stdin
description.

I fix this issue by adding stdin description to help information of
query_file option.

Change-Id: I0264174fd062497c72891b31cf9ac1ba6c00f716
Reviewed-on: http://gerrit.cloudera.org:8080/7178
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Impala Public Jenkins
2017-06-14 20:51:18 +00:00
davidxdh
9ef5ad48a8 IMPALA-5492: Fix incorrect newline character in the LDAP message
The introduction has redundant '\' in impala-shell when using LDAP.

I fix this issue by removing extraneous '\' in introduction when
impala-shell using LDAP.

Change-Id: I30c601ab255a4882260f7be23b5763ef8ec76d28
Reviewed-on: http://gerrit.cloudera.org:8080/7166
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins
2017-06-14 08:26:29 +00:00
Dimitris Tsirogiannis
aba37d3218 IMPALA-4965: Authorize access to runtime profile and exec summary
Bug:
When Sentry-based authorization is enabled, a user that isn't authorized
to EXPLAIN a statement that uses a view can still access unauthorized
information, such as view's definition, by running the statement and
asking for the query profile or the execution summary.

Fix:
During query compilation, determine if the user can access the runtime
profile or the execution summary. Upon request for a runtime profile or
execution summary from a user, determine based on that information and
the user that is asking for the profile if the runtime profile
(or execution summary) will be returned or an authorization error.

The authorization rule enforced is the following:
- User A runs statement S, A asks for profile, A has profile access:
  Runtime profile is returned
- User A runs statement S, A asks for profile, A doesn't have profile access:
  Authorization error
- User A runs statement S, user B asks for profile:
  Authorization error.
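
The three cases reduce to a single predicate, sketched here with hypothetical names:

```python
def can_view_profile(query_user, requesting_user, has_profile_access):
    """has_profile_access is the decision recorded during query compilation."""
    return requesting_user == query_user and has_profile_access
```

A False result corresponds to the authorization error in the rule above.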

This patch doesn't enforce access to the runtime profile or execution summary
through the Web UI.

Change-Id: I2255d587367c2d328590ae8534a5406c4b0c9b15
Reviewed-on: http://gerrit.cloudera.org:8080/7064
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Impala Public Jenkins
2017-06-10 02:08:37 +00:00
Zach Amsden
5905083bfe IMPALA-5127: Add history_max option
Allow users to keep a longer history of queries if desired.  I
personally find it useful to keep a long history of queries to
reference and want to bump this up to a very large value, but
keep the default reasonable.  Also change the config loader
to not freak out over unknown parameters so as not to break
for users that end up with new options set running on older
shells.

Testing: Created .impalarc as follows, now getting more history saved.
Put broken things in .impalarc and make sure they are logged as
warnings.

[impala]
history_max=1000
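
A tolerant loader can be sketched with configparser. The table of known options and the warning text are hypothetical; only history_max comes from this change:

```python
import configparser

# Hypothetical subset of known options and their converters.
KNOWN_OPTIONS = {"history_max": int}


def load_impala_section(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    loaded, warnings = {}, []
    if not parser.has_section("impala"):
        return loaded, warnings
    for key, raw in parser.items("impala"):
        if key not in KNOWN_OPTIONS:
            # Warn and carry on instead of aborting on unknown parameters.
            warnings.append("ignoring unknown option: %s" % key)
            continue
        loaded[key] = KNOWN_OPTIONS[key](raw)
    return loaded, warnings


rc_text = "[impala]\nhistory_max=1000\nnot_a_real_option=1\n"
loaded, warnings = load_impala_section(rc_text)
```

An older shell reading a newer .impalarc then logs a warning for each option it does not know and still starts.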

Change-Id: Iaf65bbecb8fd7f1105aac62b6745d6125a603d7f
Reviewed-on: http://gerrit.cloudera.org:8080/6335
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
2017-05-16 02:35:22 +00:00
Jim Apple
374f1121da IMPALA-3224: De-Cloudera non-docs JIRA URLs
John Russell is planning to fix the URLS in docs in a separate commit.

Fixed using:

    (git ls-files | xargs replace \
    'https://issues.cloudera.org/browse/IMPALA' 'IMPALA' --) && \
    git checkout HEAD docs

Change-Id: I28ea06e89341de234f9005fdc72a2e43f0ab8182
Reviewed-on: http://gerrit.cloudera.org:8080/6487
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
2017-05-07 04:44:57 +00:00
Sailesh Mukil
2a34076b2d IMPALA-5182: Explicitly close connection to impalad on error from shell
When using connections secured with SSL, a connection close consists
of a bi-directional SSL_shutdown(). The second part of the
bi-directional shutdown requires that the client also close the socket
explicitly, and the server blocks till it gets the close
notification from the client.

This patch ensures that the above happens. Without this fix, the
impala-shell was found to hang over connections secured with SSL
when an error was encountered.

Change-Id: I814df93bbcd457ad3f96b4c1ef5d8b0ddd6d141f
Reviewed-on: http://gerrit.cloudera.org:8080/6587
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Impala Public Jenkins
2017-04-20 23:14:33 +00:00
Tim Armstrong
cdbcdca670 IMPALA-4570: shell tarball breaks with certain setuptools versions
The bug was in the third-party pkg_resources.py script. The version
check was broken because it matches any version with a "0.7" substring
instead of just versions starting with 0.7.
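
The difference between the two checks can be illustrated as below. This is not the actual pkg_resources code, and the function names are hypothetical:

```python
def looks_like_0_7_substring(version):
    # Buggy: matches any version that merely contains "0.7".
    return "0.7" in version


def looks_like_0_7_prefix(version):
    # Intended: only versions that start with "0.7".
    return version.startswith("0.7")
```

The substring test wrongly treats "20.7.0" as a 0.7 release, which is the mismatch described above.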

This is a known bug. setuptools even re-released 20.7.0 as version
20.8.0 to avoid it:
e5822f0d5b

Testing:
I was unable to reproduce this locally, but I think the fix is clear-cut
enough that this is ok.

Change-Id: I0565c0e6c1be7d82c3f35d2545ba044a684bb075
Reviewed-on: http://gerrit.cloudera.org:8080/5314
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2016-12-03 03:34:16 +00:00
Matthew Jacobs
77a2941a42 IMPALA-3713,IMPALA-4439: Fix Kudu DML shell reporting
Adds support in the shell to report the number of modified
rows for all DML operations, as well as the number of rows
with errors.

Testing: Added shell tests.

Change-Id: I3d3d7aa8d176e03ea58fb00f2a81fb3e34965aa1
Reviewed-on: http://gerrit.cloudera.org:8080/5103
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2016-11-17 04:13:25 +00:00
Amos Bird
628685ae74 IMPALA-1654: General partition exprs in DDL operations.
This commit handles partition related DDL in a more general way. We can
now use compound predicates to specify a list of partitions in
statements like ALTER TABLE DROP PARTITION and COMPUTE INCREMENTAL
STATS, etc. It will also make sure some statements only accept one
partition at a time, such as PARTITION SET LOCATION and LOAD DATA. ALTER
TABLE ADD PARTITION remains using the old PartitionKeyValue's logic.

The changed partition related DDLs are as follows,

Table: p (i int) partitioned by (j int, k string)
Partitions:
+-------+---+-------+--------+------+--------------+-------------------+
| j     | k | #Rows | #Files | Size | Bytes Cached | Cache Replication |
+-------+---+-------+--------+------+--------------+-------------------+
| 1     | a | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 1     | b | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 1     | c | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 2     | d | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 2     | e | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 2     | f | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| Total |   | -1    | 0      | 0B   | 0B           |                   |
+-------+---+-------+--------+------+--------------+-------------------+

1. show files in p partition (j<2, k='a');
2. alter table p partition (j<2, k in ("b","c")) set cached in 'testPool';

// j can appear more than once,
3.1. alter table p partition (j<2, j>0, k<>"d") set uncached;
// it is the same as
3.2. alter table p partition (j<2 and j>0, not k="e") set uncached;
// we can also do 'or'
3.3. alter table p partition (j<2 or j>0, k like "%") set uncached;

// missing 'k' matches all values of k
4. alter table p partition (j<2) set fileformat textfile;
5. alter table p partition (k rlike ".*") set serdeproperties ("k"="v");
6. alter table p partition (j is not null) set tblproperties ("k"="v");
7. alter table p drop partition (j<2);
8. compute incremental stats p partition(j<2);

The remaining old partition related DDLs are as follows,

1. load data inpath '/path/from' into table p partition (j=2, k="d");
2. alter table p add partition (j=2, k="g");
3. alter table p partition (j=2, k="g") set location '/path/to';
4. insert into p partition (j=2, k="g") values (1), (2), (3);

With general partition expressions or partially specified partition specs,
partition predicates may match an empty set of partitions, regardless of
whether 'IF EXISTS' is specified.

Examples:

[localhost.localdomain:21000] >
alter table p drop partition (j=2, k="f");
Query: alter table p drop partition (j=2, k="f")
+-------------------------+
| summary                 |
+-------------------------+
| Dropped 1 partition(s). |
+-------------------------+
Fetched 1 row(s) in 0.78s
[localhost.localdomain:21000] >
alter table p drop partition (j=2, k<"f");
Query: alter table p drop partition (j=2, k<"f")
+-------------------------+
| summary                 |
+-------------------------+
| Dropped 2 partition(s). |
+-------------------------+
Fetched 1 row(s) in 0.41s
[localhost.localdomain:21000] >
alter table p drop partition (k="a");
Query: alter table p drop partition (k="a")
+-------------------------+
| summary                 |
+-------------------------+
| Dropped 1 partition(s). |
+-------------------------+
Fetched 1 row(s) in 0.25s
[localhost.localdomain:21000] > show partitions p;
Query: show partitions p
+-------+---+-------+--------+------+--------------+-------------------+
| j     | k | #Rows | #Files | Size | Bytes Cached | Cache Replication |
+-------+---+-------+--------+------+--------------+-------------------+
| 1     | b | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 1     | c | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| Total |   | -1    | 0      | 0B   | 0B           |                   |
+-------+---+-------+--------+------+--------------+-------------------+
Fetched 3 row(s) in 0.01s
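
The matching behaviour in the session above can be modelled with a toy sketch. It is not Impala's implementation: partitions are plain (j, k) tuples, and predicates are Python callables with hypothetical names:

```python
def split_partitions(parts, j=None, k=None):
    """Return (dropped, kept). A missing predicate matches every value."""
    def matches(part):
        j_val, k_val = part
        return (j is None or j(j_val)) and (k is None or k(k_val))

    dropped = [p for p in parts if matches(p)]
    kept = [p for p in parts if not matches(p)]
    return dropped, kept


parts = [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (2, "e"), (2, "f")]
dropped_counts = []

# drop partition (j=2, k="f")
dropped, parts = split_partitions(parts, j=lambda v: v == 2, k=lambda v: v == "f")
dropped_counts.append(len(dropped))

# drop partition (j=2, k<"f")
dropped, parts = split_partitions(parts, j=lambda v: v == 2, k=lambda v: v < "f")
dropped_counts.append(len(dropped))

# drop partition (k="a"); j is unspecified and matches all values
dropped, parts = split_partitions(parts, k=lambda v: v == "a")
dropped_counts.append(len(dropped))
```

The three calls drop 1, 2 and 1 partitions, leaving (1, "b") and (1, "c") as in the final SHOW PARTITIONS output.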

Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
Reviewed-on: http://gerrit.cloudera.org:8080/3942
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2016-11-15 03:27:36 +00:00
Matthew Jacobs
99ed6dc67a IMPALA-4134,IMPALA-3704: Kudu INSERT improvements
1.) IMPALA-4134: Use Kudu AUTO FLUSH
Improves performance of writes to Kudu up to 4.2x in
bulk data loading tests (load 200 million rows from
lineitem).

2.) IMPALA-3704: Improve errors on PK conflicts
The Kudu client reports an error for every PK conflict,
and all errors were being returned in the error status.
As a result, inserts/updates/deletes could return errors
with thousands of errors reported. This changes the error
handling to log all reported errors as warnings and
return only the first error in the query error status.

3.) Improve the DataSink reporting of the insert stats.
The per-partition stats returned by the data sink weren't
useful for Kudu sinks. Firstly, the number of appended rows
was not being displayed in the profile. Secondly, the
'stats' field isn't populated for Kudu tables and thus was
confusing in the profile, so it is no longer printed if it
is not set in the thrift struct.
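
The error handling from item 2 can be sketched in Python, although the actual change is in the backend. The function and logger names are hypothetical:

```python
import logging

log = logging.getLogger("kudu_sink_sketch")


def summarize_errors(errors):
    """Log every reported error as a warning; return only the first one."""
    for err in errors:
        log.warning("%s", err)
    return errors[0] if errors else None
```

The query status then carries a single error, while the full list remains available in the log.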

Testing: Ran local tests, including new tests to verify
the query profile insert stats. Manual cluster testing was
conducted of the AUTO FLUSH functionality, and that testing
informed the default mutation buffer value of 100MB which
was found to provide good results.

Change-Id: I5542b9a061b01c543a139e8722560b1365f06595
Reviewed-on: http://gerrit.cloudera.org:8080/4728
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
2016-10-25 02:06:10 +00:00
Thomas Tauber-Marshall
7fad3e5dc3 IMPALA-3002/IMPALA-1473: Cardinality observability cleanup
IMPALA-3002:
The shell prints an incorrect value for '#Rows' in the exec
summary for broadcast nodes due to incorrect logic around
whether to use max or agg stats. This patch makes the behavior
consistent with the way the backend treats exec summaries in
summary-util.cc. This incorrect logic was also duplicated in
the impala_beeswax test framework.

IMPALA-1473:
When there is a merging exchange with a limit, we may copy rows
into the output batch beyond the limit. In this case, we currently
update the output batch's size to reflect the limit, but we also
need to update ExecNode::num_rows_returned_ or the exec summary
may show that the exchange node returned more rows than it really
did.

Additionally, PlanFragmentExecutor::GetNext does not update
rows_produced_counter_ in some cases, leading the runtime profile
to display an incorrect value for 'RowsProduced'.

Change-Id: I386719370386c9cff09b8b35d15dc712dc6480aa
Reviewed-on: http://gerrit.cloudera.org:8080/4679
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
2016-10-15 01:25:51 +00:00
Harrison Sheinblatt
33a35ea4f8 Revert "Use impala-python when building shell tarball"
This reverts commit 34bc6a72db.

This change causes the impala-shell to segfault when run on
CentOS 5.10 using python 2.4.  We maintain python 2.4
compatibility, so reverting the change to build with 2.6.

Change-Id: I32f425c703a164279ea5b3268c3512fa980d39d9
Reviewed-on: http://gerrit.cloudera.org:8080/4176
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Tim Armstrong <tarmstrong@cloudera.com>
2016-08-31 01:35:02 +00:00
Sailesh Mukil
c23bf38a20 IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose
The webserver address was always configured as 0.0.0.0 (meaning that
the webserver could be reached on any IP for that machine) unless
otherwise specified. This is not a correct value to display to the
user. This patch returns the hostname of the node, when requested,
if the webserver host address is 0.0.0.0.
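
The substitution can be sketched in Python, although the real change is on the server side. The function name and the optional hostname parameter are hypothetical:

```python
import socket


def display_host(webserver_host, hostname=None):
    """Return an address that is meaningful to show to a user."""
    if webserver_host == "0.0.0.0":
        # The wildcard address says nothing about where to connect.
        return hostname or socket.gethostname()
    return webserver_host
```

An explicitly configured address is passed through unchanged.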

This patch also does not print the coordinator link for very simple
queries, as it's not necessary and is unnecessarily verbose.

This patch also does away with pinging the impalad an extra time per
query for finding the host time and webserver address. It instead
remembers the webserver address at connect time and displays client
local time for every query instead.

Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Reviewed-on: http://gerrit.cloudera.org:8080/3994
Tested-by: Internal Jenkins
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Sailesh Mukil <sailesh@cloudera.com>
2016-08-23 18:25:06 +00:00
Tim Armstrong
50e21247d6 IMPALA-3992: bad shell error message when running nonexistent file
Fix the error handling code and add a test.

Change-Id: Iebcf1dc8a1a08b400a2c769a9cff38ea02c8e525
Reviewed-on: http://gerrit.cloudera.org:8080/4022
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Internal Jenkins
2016-08-18 03:37:48 +00:00
Dan Hecht
dd906a81d5 IMPALA-3918: remove Cloudera copyright from the shell welcome message
Change-Id: I3b3dcad8997e5b58b4ffda42fc95e3dba1e8a641
Reviewed-on: http://gerrit.cloudera.org:8080/4005
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2016-08-17 00:35:17 +00:00
Sailesh Mukil
0d689d3626 IMPALA-3965: TSSLSocketWithWildcardSAN.py not exported as part of impala-shell build lib
TSSLSocketWithWildcardSAN.py was recently added to the impala-shell
as a part of IMPALA-3159. However, it was not exported as a part of
the shell tarball.

This change adds the file to the tarball.

Change-Id: I5a7ab8c20c0b20c21b7f8d008e39c940419e3c4d
Reviewed-on: http://gerrit.cloudera.org:8080/3872
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Internal Jenkins
2016-08-10 02:21:22 +00:00
Dan Hecht
f35ce46137 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
This file snuck in with the old copyright header after I put together
the original change for this JIRA.

Change-Id: I311e493bec7e63ea6dd7229140045d486540612a
Reviewed-on: http://gerrit.cloudera.org:8080/3867
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 21:01:51 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace or add the existing ASF license text with the one given
   on the website.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Tim Armstrong
34bc6a72db Use impala-python when building shell tarball
This is a minor build fix that allows buildall.sh to succeed on systems
where setuptools isn't installed for system python.

Change-Id: I33a24ea4f77e655acaa0de22211c9ef008f5e650
Reviewed-on: http://gerrit.cloudera.org:8080/3797
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2016-07-28 21:33:41 +00:00
Sailesh Mukil
45ff0f9e67 IMPALA-3159: impala-shell does not accept wildcard or SAN certificates
The impala-shell could not accept wildcard or SAN certificates
previously as the thrift library it depended on did not support them.
This patch subclasses TSSLSocket and adds the logic to take care of
the above mentioned cases by introducing the new
TSSLSocketWithWildcardSAN class.

The certificate matching logic is based on the python-ssl source code.
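
The two cases can be illustrated with a deliberately simplified sketch. It is not the TSSLSocketWithWildcardSAN code: it handles only a whole-label wildcard in the leftmost position, and all names are hypothetical:

```python
def hostname_matches(pattern, hostname):
    """Match a certificate name, allowing '*' as the entire leftmost label."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        return bool(h_labels[0]) and p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


def cert_matches(hostname, common_name, san_dns_names):
    """Prefer SAN entries when present; otherwise fall back to the CN."""
    names = san_dns_names or [common_name]
    return any(hostname_matches(name, hostname) for name in names)
```

A wildcard covers exactly one label here, so "*.example.com" does not match "a.b.example.com".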

Added custom cluster tests to test both wildcard matching and SAN
matching.

Added be/src/testutil/certificates-info.txt which contains all the
information about the certificates which are added for the tests.

This has been tested with Python2.4 and Python2.6.

Change-Id: I75e37012eeeb0bcf87a5edf875f0ff915daf8b89
Reviewed-on: http://gerrit.cloudera.org:8080/3765
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Internal Jenkins
2016-07-26 02:44:25 +00:00
Sailesh Mukil
900f148078 IMPALA-1671: Print time and link to coordinator web UI once query is submitted in shell
To help supportability and debugging, it's helpful to have the impala
shell print out the coordinator time and the link to the coordinator
web UI once the query is submitted.

This is done by calling the PingImpalaService() routine every time a
query is submitted, which returns the coordinator's hostname,
webserver port and the coordinator epoch time at that moment which the
shell then formats and prints out.
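
The formatting step might look like the sketch below. The message wording, the function name, and the assumption that the epoch time is in seconds are all hypothetical:

```python
from datetime import datetime, timezone


def format_submit_message(hostname, webserver_port, epoch_seconds):
    """Build a one-line message from the values returned by the ping."""
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return "Query submitted at: %s (Coordinator: http://%s:%d)" % (
        ts.strftime("%Y-%m-%d %H:%M:%S"), hostname, webserver_port)
```

UTC is used here only to keep the sketch deterministic.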

Added tests to verify these new messages.

Change-Id: I704eb64546e27c367830120241311fea6091266b
Reviewed-on: http://gerrit.cloudera.org:8080/3507
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Internal Jenkins
2016-07-14 19:04:45 +00:00
Henry Robinson
0dde1c2f86 IMPALA-3628: Fix cancellation from shell when security is enabled
To cancel a query, the shell will create a separate connection inside
its SIGINT handler, and send the cancellation RPC. However, this
connection did not start a secure connection if it needed to, meaning
that the cancellation attempt would just hang.

A workaround is to kill the shell process, which I expect is what users
have been doing with this bug which has been around since 2014.

Testing:

I added a custom cluster test that starts Impala with SSL
enabled, and wrote two tests - one just to check SSL connectivity, and
the other to mimic the existing test_cancellation which sends SIGINT to
the shell process. In doing so I refactored the shell testing code a bit
so that all tests use a single ImpalaShell object, rather than rolling
their own Popen() based approaches when they needed to do something
unusual, like cancel a query.

In the cancellation test on my machine, SIGINT can take a few tries to
be effective. I'm not sure if this is a timing thing - perhaps the
Python interpreter doesn't correctly pass signals through to a handler
if it's in a blocking call, for example. The test reliably passes within
~5 tries on my machine, so the test tries 30 times, once per second.
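
A retry loop of this shape can be sketched as follows. The helper name and the `is_done` callback are hypothetical, since the message does not say how the test detects that cancellation took effect; `proc` is assumed to be a `subprocess.Popen`-like object.

```python
import signal
import time


def send_sigint_until(proc, is_done, attempts=30, interval_s=1.0):
    """Send SIGINT repeatedly until is_done() reports success.

    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        proc.send_signal(signal.SIGINT)
        # Give the shell time to react before checking again.
        time.sleep(interval_s)
        if is_done():
            return attempt
    return None
```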

Change-Id: If99085e75708d92a08dbecf0131a2234fedad33a
Reviewed-on: http://gerrit.cloudera.org:8080/3302
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
2016-07-05 16:40:23 -07:00
Bharath Vissapragada
5ede8eb8a7 IMPALA-2336: Ignore trailing comments in non-interactive mode
This patch trims trailing comments while parsing queries in
non-interactive mode. Users often have comments at the end
of a script, and these should be ignored. Without this patch,
the script fails with an exception because the shell expects
valid SQL. Behavior in interactive mode is unchanged.
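
One simple way to approximate this trimming is sketched below. It is an assumption-laden illustration, not the patch's parser: the function name is hypothetical, and it only drops text after the last `;` when that text consists entirely of whitespace, `--` line comments, and `/* */` block comments.

```python
import re

# Whitespace, "-- ..." line comments, and "/* ... */" block comments only.
_ONLY_COMMENTS = re.compile(
    r"(?:\s|--[^\n]*(?=\n|\Z)|/\*(?:(?!\*/).)*\*/)*", re.DOTALL)


def strip_trailing_comments(script):
    """Drop comments that follow the last ';' of a script.

    The script is returned unchanged if anything other than comments and
    whitespace follows the last statement terminator.
    """
    head, sep, tail = script.rpartition(";")
    if sep and _ONLY_COMMENTS.fullmatch(tail):
        return head + sep
    return script
```

A trailing comment that itself contains a `;` is left alone by this sketch, which errs on the side of not touching the input.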

Change-Id: I723763ef7eedd03cf22058fadf06e9673a0d94d2
Reviewed-on: http://gerrit.cloudera.org:8080/3169
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2016-05-31 23:32:11 -07:00
Henry Robinson
a805e100b2 IMPALA-3397: Source query files from shell.
This patch allows you to write SOURCE <file> or SRC <file>, and have the
shell read the file and execute all the queries in it.

Change-Id: Ib05df3e755cd12e9e9562de6b353857940eace03
Reviewed-on: http://gerrit.cloudera.org:8080/2663
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Internal Jenkins
2016-05-12 14:17:54 -07:00
Andre Araujo
f3733aed84 IMPALA-2180: Extend SET command to allow setting variables in Impala Shell.
The SET command has been extended with the following syntax, to allow
setting of variables in the Impala Shell:

SET VAR:<variable_name>=<value>

The UNSET command has also been modified to allow:

UNSET VAR:<variable_name>
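
A minimal sketch of recognising these two forms is shown below. The function name and regular expressions are hypothetical, and storing names in upper case is an assumption made here to get case-insensitive lookups.

```python
import re

_SET_VAR = re.compile(r"^\s*SET\s+VAR:(\w+)\s*=\s*(.*?)\s*;?\s*$", re.IGNORECASE)
_UNSET_VAR = re.compile(r"^\s*UNSET\s+VAR:(\w+)\s*;?\s*$", re.IGNORECASE)


def apply_var_command(line, variables):
    """Update `variables` in place.

    Returns True if `line` was a SET VAR / UNSET VAR command, False if it
    should be handled some other way.
    """
    match = _SET_VAR.match(line)
    if match:
        variables[match.group(1).upper()] = match.group(2)
        return True
    match = _UNSET_VAR.match(line)
    if match:
        # Unsetting a variable that does not exist is silently ignored here.
        variables.pop(match.group(1).upper(), None)
        return True
    return False
```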

This patch builds on the changes in IMPALA-2179. The main change for
this patch was to ensure that all SET commands are processed by the
shell, rather than being sent to the front end as a query. For this
I had to modify the command sanitization function to remove comments
that happen in front of a SET command.

Comments can be a can of worms to parse, so I tried to be as strict
as possible to avoid collateral effects. Comments are only removed
if they happen right at the beginning of the line AND before a SET
command. NO other comments are touched, including comments before,
after or within queries.

Change-Id: I87e07385122187ab8d324346499896a3dfbbafe6
Reviewed-on: http://gerrit.cloudera.org:8080/679
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2016-02-10 10:17:18 +00:00
Andre Araujo
bcce19012d IMPALA-2179: Extend Impala shell to allow passing variables through the command line
This patch adds the command line option `--var` to allow the user to set
variables to be used in commands within the shell. It does *not* implement the
setting of variables through the SET command, as Hive does. That extension will
be implemented separately in IMPALA-2180.

The syntax for specifying a parameter in the command line is --var=KEY=VAL,
for example: --var=start_date=20150101

Variables are textually replaced by their value in the Impala shell commands.
Substitution works the same way in interactive sessions as for command
line queries and scripts (-q and -f options, respectively).

Variables can be referenced as ${VAR:VAR_NAME} (case-insensitive). The form
${HIVEVAR:VAR_NAME} can also be used for compatibility with Hive scripts.

To prevent any of the reference expressions above from being replaced you can
escape them with a backslash (e.g. \${VAR:VAR_NAME} and \${HIVEVAR:VAR_NAME}).
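
The substitution rules above can be sketched with a single regular expression. This is an illustration only: the function name is hypothetical, and two behaviours are assumptions not stated in the message, namely that the escaping backslash is removed from the output and that references to unknown variables are left untouched.

```python
import re

# Optional leading backslash, then ${VAR:NAME} or ${HIVEVAR:NAME}.
_VAR_REF = re.compile(r"(\\)?\$\{(?:HIVE)?VAR:(\w+)\}", re.IGNORECASE)


def substitute_variables(text, variables):
    """Replace variable references in text using an upper-case-keyed dict."""
    def replace(match):
        if match.group(1):
            # Escaped reference: drop the backslash, keep the reference text.
            return match.group(0)[1:]
        name = match.group(2).upper()
        # Assumption: unknown variables are left as written.
        return variables.get(name, match.group(0))
    return _VAR_REF.sub(replace, text)
```

For example, with `{"START_DATE": "20150101"}`, the text `d >= ${var:start_date}` becomes `d >= 20150101`.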

The Impala shell's SET command now also reports the set variables and their
values.

Change-Id: Ia491fae91256334bb60c9066d119fe9a1e9779dd
Reviewed-on: http://gerrit.cloudera.org:8080/611
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2016-02-10 00:20:46 +00:00
Henry Robinson
cca964c3c6 IMPALA-1934: Allow shell to retrieve LDAP password from shell cmd
Adds a new option --ldap_password_cmd that takes a string which is
executed as a shell command. The stdout results are used as the LDAP
password for this shell session.
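
Running a command and using its stdout as the password might look like the sketch below. The function name is hypothetical, and stripping trailing line endings from the output is an assumption; the message does not say how whitespace is handled.

```python
import subprocess


def read_password_from_command(command):
    """Run `command` through the system shell and return its stdout."""
    proc = subprocess.run(command, shell=True, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, universal_newlines=True)
    if proc.returncode != 0:
        # The negative case: the command failed, so there is no password.
        raise RuntimeError("password command failed with exit code %d: %s"
                           % (proc.returncode, proc.stderr.strip()))
    # Assumption: a trailing newline is not part of the password.
    return proc.stdout.rstrip("\r\n")
```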

Tests are added for the negative case (where the command fails for some
reason), but without tests for successful LDAP connections we can't test
the case where the password is correct.

Change-Id: Ib0362be5e167ff752e764ad2152c4c4b679f83c2
Reviewed-on: http://gerrit.cloudera.org:8080/1542
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Internal Jenkins
2016-01-19 23:41:25 +00:00
Casey Ching
cfb1ab5c2c IMPALA-2781: Fix shell error reporting after chdir
The original error reporting relied on $0 being accessible from the
current working dir, which failed if a script changed the working dir
and $0 was relative. This updates the error reporting command to cd back
to the original dir before accessing $0.

Change-Id: I2185af66e35e29b41dbe1bb08de24200bacea8a1
Reviewed-on: http://gerrit.cloudera.org:8080/1666
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2016-01-14 07:10:54 +00:00
Tim Armstrong
72c7428d27 Reduce log spam from shell tarball
Use -q flag so that less output is produced when building the eggs.

Change-Id: Ic356d9a84b30d2b1d8ba02558b2565e22bbfcda2
Reviewed-on: http://gerrit.cloudera.org:8080/1742
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2016-01-12 04:42:59 +00:00
Lars Volker
6f3e794058 Fix bin/set-pythonpath.sh for zsh
Change-Id: I706a42e48118bd16b769b571f7157543799018c5
Reviewed-on: http://gerrit.cloudera.org:8080/1587
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Internal Jenkins
2015-12-22 14:53:42 +00:00
Tim Armstrong
7f133b453e IMPALA-1325: don't suppress errors in shell
Remove a hack in impala-shell that suppressed any error messages that
included the string "Cancelled". This originally was necessary since
user-initiated cancellation messages were propagated back to the client,
but is no longer necessary. Certain errors that occur asynchronously
were suppressed, because the error is propagated by cancelling the
query.

Confirmed that no error messages are printed for manually cancelled
queries, and that error messages that were previously suppressed (from
my IMPALA-2298 patch) now show in the shell.

Change-Id: Iac53b1307768cbb07640ddc88b152ae71c71beab
Reviewed-on: http://gerrit.cloudera.org:8080/1529
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2015-12-19 05:43:59 +00:00
Casey Ching
e2bfb6ae2f Misc improvements to shell scripts about error reporting
Changes:
  1) Consistently use "set -euo pipefail".
  2) When an error happens, print the file and line.
  3) Consolidated some of the kill scripts.
  4) Added better error messages to the load data script.
  5) Changed use of #!/bin/sh to bash.

Change-Id: I14fef66c46c1b4461859382ba3fd0dee0fbcdce1
Reviewed-on: http://gerrit.cloudera.org:8080/1620
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2015-12-17 18:25:27 +00:00
Juan Yu
0df3b419a0 IMPALA-2309: Compute stats query return error if set LIVE_PROGRESS=true
The Impala shell cannot get the child query handles, so it cannot
query live progress for a COMPUTE STATS query. Disable the live
progress callback for COMPUTE STATS queries.

Change-Id: I2d2f342a805905a4fa868686e7c9e9362c2c2223
Reviewed-on: http://gerrit.cloudera.org:8080/1109
Reviewed-by: Juan Yu <jyu@cloudera.com>
Tested-by: Internal Jenkins
2015-10-05 11:30:37 -07:00
ishaan
cbfc65e041 IMPALA-2110: Readline bug in centos7 causes the shell to produce garbled output.
This patch addresses the problem by not importing readline if the shell is used in
non-interactive mode. More information on the readline bug can be found
here: https://bugs.python.org/issue19884
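
The idea of importing readline only when needed can be sketched as follows. The helper name and the `interactive` flag are hypothetical; how the shell decides that it is interactive is not shown here.

```python
def maybe_import_readline(interactive):
    """Import readline only for interactive sessions.

    Returns the module, or None when it is skipped or unavailable.
    """
    if not interactive:
        # Skipping the import avoids readline side effects on stdout.
        return None
    try:
        import readline
    except ImportError:
        return None
    return readline
```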

Change-Id: Ia9dae6308c3807d3fd323449766bf7e7371f800f
Reviewed-on: http://gerrit.cloudera.org:8080/511
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-09-04 21:51:26 +00:00
Henry Robinson
c1fd862238 IMPALA-1975: Automatically reconnect failed connections from the shell
This patch calls PingImpalaService() at the beginning of each command
loop (if the shell is currently connected). If the call fails, the shell
will try to reconnect. Reconnecting is best-effort - if it fails, the
command is processed anyway so as not to interfere with any commands
that might still give useful output in a disconnected state.
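
The best-effort behaviour described here can be sketched as a hook that runs before each command. The hook name and the client interface (`connected`, `ping()`, `connect()`) are hypothetical stand-ins, not the shell's actual classes.

```python
def before_command(client, line):
    """Ping before each command and try one reconnect on failure.

    The command line is always returned so that it is processed anyway.
    """
    if client.connected:
        try:
            client.ping()
        except Exception:
            try:
                client.connect()
            except Exception:
                # Reconnect failed: remember that, but do not block the command.
                client.connected = False
    return line
```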

Change-Id: I37cb2f4fc235fedff16d48ad5125b9a30bd7dfd0
Reviewed-on: http://gerrit.cloudera.org:8080/547
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Internal Jenkins
2015-08-05 01:00:54 +00:00