Commit Graph

23 Commits

Author SHA1 Message Date
jasonmfehr
e17fd9a0d5 IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.
When using the hs2 protocol with the http transport, include several
tracing http headers by default.  These headers are:

  * X-Request-Id        -- client defined string that identifies the
                           http request, this string is meaningful only
                           to the client
  * X-Impala-Session-Id -- session id generated by the Impala backend,
                           will be omitted on http calls that occur
                           before this id has been generated
  * X-Impala-Query-Id   -- query id generated by the Impala backend,
                           will be omitted on http calls that occur
                           before this id has been generated

The Impala shell includes these headers by default.  The command
line argument --no_http_tracing has been added to remove these
headers.

The Impala backend logs out these headers if they are on the http
request.  The log messages are written out at log level 2 (RPC).

Testing:
  - manual testing (verified using debugging proxy and impala logs)
  - new python test

Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f
Reviewed-on: http://gerrit.cloudera.org:8080/19428
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-02-10 02:09:17 +00:00
jasonmfehr
f2f6b4b580 IMPALA-11375 Impala shell outputs details of each RPC
When the Impala shell is using the hs2 protocol, it makes multiple RPCs
to the Impala daemon.  These calls pass Thrift objects back and forth.
This change adds the '--show_rpc' which outputs the details of the RPCs
to stdout and the '--rpc_file' flag which outputs the RPC details to the
specified file path.

RPC details include:
- operation name
- request attempt count
- Impala session/query ids (if applicable)
- call duration
- call status (success/failure)
- request Thrift objects
- response Thrift objects

Certain information is not included in the RPC details:
- Thrift object attributes named 'secret' or 'password'
  are redacted.
- Thrift objects with a type of TRowSet or TGetRuntimeProfileResp
  are not include as the information contained within them is
  already available in the standard output from the Impala shell.

Testing:
- Added new tests in the end-to-end test suite.

Change-Id: I36f8dbc96726aa2a573133acbe8a558299381f8b
Reviewed-on: http://gerrit.cloudera.org:8080/19388
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-01-12 23:31:14 +00:00
Daniel Becker
9b3385cee2 IMPALA-11799: Fix example of the hs2_fp_format shell option
IMPALA-10660 introduced the "hs2_fp_format" shell option. In the help
section describing the query option it states:

  Use '%16G' to match Beeswax protocol's floating-point output format

However, '%16G' is not accepted by the shell:

  bin/impala-shell.sh --hs2_fp_format='%16G'
  Invalid floating point format specification: %16G

This commit changes the example to '16G'.

Also corrected the name of the option in
shell/impala_shell_config_defaults.py from 'fp_format_specification' to
'hs2_fp_format'.

Change-Id: If53e69b495dfeb8d6d65878eff9580c5e12f793d
Reviewed-on: http://gerrit.cloudera.org:8080/19359
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-01-03 15:31:50 +00:00
Peter Rozsa
81e36d4584 IMPALA-10660: Impala shell prints DOUBLEs with less precision in HS2 than beeswax
This change adds a shell option called "hs2_fp_format"
which manipulates the print format of floating-point values in HS2.
It lets the user to specify a Python-based format specification
expression (https://docs.python.org/2.7/library/string.html#formatspec)
which will get parsed and applied to floating-point
column values. The default value is None, in this case the
formatting is the same as the state before this change.
This option does not support the Beeswax protocol, because Beeswax
converts all of the column values to strings in its response.

Tests: command line tests for various formatting options and
       for invalid formatting option

Change-Id: I424339266be66437941be8bafaa83fa0f2dfbd4e
Reviewed-on: http://gerrit.cloudera.org:8080/18990
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-09-23 14:34:52 +00:00
yx91490
c7784bde55 IMPALA-1682: Support printing the output of a query (rows) vertically.
In vertical mode, impala-shell will print each row in the format:
firstly print a line contains line number, then print this row's columns
line by line, each column line started with it's name and a colon.

To enable it: use shell option '-E' or '--vertical', or 'set VERTICAL=
true' in interactive mode. to disable it in interactive mode: 'set
VERTICAL=false'. NOTICE: it will be disabled if '-B' option or 'set
WRITE_DELIMITED=true' is specified.

Tests:
add methods in test_shell_interactive.py and test_shell_commandline.py.

Change-Id: I5cee48d5a239d6b7c0f51331275524a25130fadf
Reviewed-on: http://gerrit.cloudera.org:8080/18549
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-06-13 15:41:07 +00:00
Abhishek Rawat
8e755e7571 IMPALA-11126: impala-shell: Support configurable socket timeout for http
client

In 'hs2-http' mode, the socket timeout is None, which could cause
hang like symptoms in case of a problematic remote server.

Added support for configurable socket timeout using the new impala-shell
config option '--http_socket_timeout_s'. If a reasonable timeout is
set, impala-shell client can retry in case of connection issues, when
possible. The default value of '--http_socket_timeout_s' is set to None,
to prevent behavior changes for existing clients.

More details on socket timeout here:
https://docs.python.org/3/library/socket.html#socket-timeouts

Testing:
- Added tests for various timeout values in test_shell_commandline.py
- Ran e2e shell tests.

Change-Id: I29fa4ff96cdcf154c3aac7e43340af60d7d61e94
Reviewed-on: http://gerrit.cloudera.org:8080/18336
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
2022-04-01 16:31:19 +00:00
Steve Carlin
bb9fb663ce IMPALA-10778: Allow impala-shell to connect directly to HS2
Impala-shell already uses HS2 protocol to connect to Impalad.
This commit allows impala-shell to connect to any server (for
example, Hive) using the hs2 protocol. This will be done via
the "--strict_hs2_protocol" option.

When the "--strict_hs2_protocol" option is turned on, only features
supported by hs2 will work. For instance, "runtime-profile" is an
impalad specific feature and will be disabled.

The "--strict_hs2_protocol" will only work on servers that abide
by the strict definition of what is supported by HS2. So one will
be able to connect to Hive in this mode, but connections to Impala
will not work. Any feature supported by Hive (e.g. kerberos
authentication) should work as well.

Note: While authentication should work, the test framework is not
set up to create an HS2 server that does authentication at this point
so this feature should be used with caution.
Change-Id: I674a45640a4a7b3c9a577830dbc7b16a89865a9e
Reviewed-on: http://gerrit.cloudera.org:8080/17660
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-27 09:45:59 +00:00
David Knupp
bc9d7e063d IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
This is the main patch for making the the impala-shell cross-compatible with
python 2 and python 3. The goal is wind up with a version of the shell that will
pass python e2e tests irrepsective of the version of python used to launch the
shell, under the assumption that the test framework itself will continue to run
with python 2.7.x for the time being.

Notable changes for reviewers to consider:

- With regard to validating the patch, my assumption is that simply passing
  the existing set of e2e shell tests is sufficient to confirm that the shell
  is functioning properly. No new tests were added.

- A new pytest command line option was added in conftest.py to enable a user
  to specify a path to an alternate impala-shell executable to test. It's
  possible to use this to point to an instance of the impala-shell that was
  installed as a standalone python package in a separate virtualenv.

  Example usage:
  USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=/<path to virtualenv>/bin/impala-shell -sv shell/test_shell_commandline.py

  The target virtualenv may be based on either python3 or python2. However,
  this has no effect on the version of python used to run the test framework,
  which remains tied to python 2.7.x for the foreseeable future.

- The $IMPALA_HOME/bin/impala-shell.sh now sets up the impala-shell python
  environment independenty from bin/set-pythonpath.sh. The default version
  of thrift is thrift-0.11.0 (See IMPALA-9489).

- The wording of the header changed a bit to include the python version
  used to run the shell.

    Starting Impala Shell with no authentication using Python 3.7.5
    Opened TCP connection to localhost:21000
    ...

    OR

    Starting Impala Shell with LDAP-based authentication using Python 2.7.12
    Opened TCP connection to localhost:21000
    ...

- By far, the biggest hassle has been juggling str versus unicode versus
  bytes data types. Python 2.x was fairly loose and inconsistent in
  how it dealt with strings. As a quick demo of what I mean:

  Python 2.7.12 (default, Nov 12 2018, 14:36:49)
  [GCC 5.4.0 20160609] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> d = 'like a duck'
  >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8')
  True

  ...and yet there are weird unexpected gotchas.

  >>> d.decode('utf-8') == d.encode('utf-8')
  True
  >>> d.encode('utf-8') == bytearray(d, 'utf-8')
  True
  >>> d.decode('utf-8') == bytearray(d, 'utf-8')   # fails the eq property?
  False

  As a result, this was inconsistency was reflected in the way we handled
  strings in the impala-shell code, but things still just worked.

  In python3, there's a much clearer distinction between strings and bytes, and
  as such, much tighter type consistency is expected by standard libs like
  subprocess, re, sqlparse, prettytable, etc., which are used throughout the
  shell. Even simple calls that worked in python 2.x:

  >>> import re
  >>> re.findall('foo', b'foobar')
  ['foo']

  ...can throw exceptions in python 3.x:

  >>> import re
  >>> re.findall('foo', b'foobar')
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall
      return _compile(pattern, flags).findall(string)
  TypeError: cannot use a string pattern on a bytes-like object

  Exceptions like this resulted in a many, if not most shell tests failing
  under python 3.

  What ultimately seemed like a better approach was to try to weed out as many
  existing spurious str.encode() and str.decode() calls as I could, and try to
  implement what is has colloquially been called a "unicode sandwich" -- namely,
  "bytes on the outside, unicode on the inside, encode/decode at the edges."

  The primary spot in the shell where we call decode() now is when sanitising
  input...

  args = self.sanitise_input(args.decode('utf-8'))

  ...and also whenever a library like re required it. Similarly, str.encode()
  is primarily used where a library like readline or csv requires is.

- PYTHONIOENCODING needs to be set to utf-8 to override the default setting for
  python 2. Without this, piping or redirecting stdout results in unicode errors.

- from __future__ import unicode_literals was added throughout

Testing:

  To test the changes, I ran the e2e shell tests the way we always do (against
  the normal build tarball), and then I set up a python 3 virtual env with the
  shell installed as a package, and manually ran the tests against that.

  No effort has been made at this point to come up with a way to integrate
  testing of the shell in a python3 environment into our automated test
  processes.

Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615
Reviewed-on: http://gerrit.cloudera.org:8080/15524
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-04-18 05:13:50 +00:00
Alice Fan
e1d1428181 IMPALA-9384: Improve Impala shell usability by enabling live_progress in interactive mode
In order to improve usability, this patch makes Impala shell show query
processing status while the query is running. The patch enables shell
option live_progress by default when a user launches impala shell in
the interactive mode. The patch also adds a new command line flag
"--disable_live_progress", which allows a user to disable live_progress
at runtime. In the interactive mode, a user can disable live_progress
by either using the command line flag or setting the option as False in
the config file. As for in the non-interactive mode (when the -q or -f
options are used), live reporting is not supported. Impala-shell will
disable live_progress if the mode is detected.

Testing:
- Added and updated tests in test_shell_interactive.py and test_shell_commandline.py
- Successfully ran all shell related tests

Change-Id: I3765b775f663fa227e59728acffe4d5ea9a5e2d3
Reviewed-on: http://gerrit.cloudera.org:8080/15219
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
2020-03-09 21:28:19 +00:00
wzhou-code
6a23ec6985 IMPALA-6393: Add support for live_summary and live_progress in impalarc
This patch adds support for live_summary and live_progress in impalarc.

Testing:
1) Added unit-test cases in test_shell_commandline.py and
   test_shell_interactive.py for live_summary and live_progress.
2) Successfully ran all other tests in test_shell_interactive.py and
   test_shell_commandline.py

Change-Id: If4549b775a7966ad89d661d0349cc78754e13a86
Reviewed-on: http://gerrit.cloudera.org:8080/14927
Reviewed-by: Bikramjeet Vig <bikramjeet.vig@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-01-23 01:48:13 +00:00
Tim Armstrong
f1f3ae9ec2 IMPALA-7290: part 2: Add HS2 support to Impala shell
HS2 is added as an option via --protocol=hs2. The user-visible
differences in behaviour are minimal. Beeswax is still the
default and can be explicitly enabled via --protocol=beeswax
but will be deprecated. The default is unchanged because
changing the default could break certain workflows, e.g.
those that explicitly specify the port with -i or deployments
that hit --fe_service_threads for HS2 and somehow rely on
impala-shell not contributing to that limit. For most
workflows the change is transparent and we should change
the default in a major version change.

This support requires Impala-specific extensions to
the HS2 interface, similar to the existing extensions
to Beeswax. Thus the HS2 shell is only
forwards-compatible with newer Impala versions.
I considered trying to gracefully degrade when the
new extensions weren't present, but it didn't seem to be
worth the ongoing testing effort.

Differences between HS2 and Beeswax are abstracted into
ImpalaClient subclasses.
Here are the changes required to make it work:
* Switch to TBinaryProtocolAccelerated to avoid perf
  regression. The HS2 protocol requires decoding
  more primitive values (because its not a string-per-row),
  which was slow with the pure python implementation of
  TBinaryProtocol.
* Added bitarray module to efficiently unpack null indicators
* Minimise invasiveness of changes by transposing and stringifying
  the columnar results into rows in impala_client.py. The transposition
  needs to happen before display anyway.
* Add PingImpalaHS2Service() to get back version string and webserver
  address.
* Add CloseImpalaOperation() extension to return DML row counts. This
  possibly addresses IMPALA-1789, although we need to confirm that
  this is a sufficient solution.
* Add is_closed member to query handles to avoid shell independently
  tracking whether the query handle was closed or not.
* Include query status in HS2 log to match beeswax.
* HS2 GetLog() command now includes query status error message for
  consistency with beeswax.
* "set"/"set all" uses the client requests options, not the session
  default. This captures the effective value of TIMEZONE, which
  was previously missing. This also requires test changes where
  the tests set non-default values, e.g. for ABORT_ON_ERROR.
* "set all" on the server side returns REMOVED query options - the
  shell needs to know these so it can correctly ignore them.
* Clean up self.orig_cmd/self.last_leading comment argument
  passing to avoid implicit parameter passing through multiple
  function calls.
* Clean up argument handling in shell tests to consistently pass
  around lists of arguments instead of strings that are subject
  to shell tokenisation rules.
* Consistently close connections in the shell to avoid leaking
  HS2 sessions. This is enforced by making ImpalaShell a context
  manager and also eliminating all sys.exit() calls that would
  bypass the explicit connection closing.

Testing:
* Shell tests can run with both protocols
* Add tests for formatting of all types and NULL values
* Added testing for floating point output formatting, which does
  change as a result of switching to server-side vs client-side
  formatting.
* Verified that newly-added tests were actually going through HS2
  by disabling hs2 on the minicluster and running tests.
* Add checks to test_verify_metrics.py to ensure that no sessions
  are left open at the end of tests.

Performance:
Baseline from beeswax shell for large extract is as follows:

  $ time impala-shell.sh -B -q 'select * from tpch_parquet.orders' > /dev/null
  real    0m6.708s
  user    0m5.132s
  sys     0m0.204s

After this change it is somewhat slower, but we generally don't consider
bulk extract performance through the shell to be perf-critical:
  real    0m7.625s
  user    0m6.436s
  sys     0m0.256s

Change-Id: I6d5cc83d545aacc659523f29b1d6feed672e2a12
Reviewed-on: http://gerrit.cloudera.org:8080/12884
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-06-20 10:23:28 +00:00
Ethan Xue
487547ec44 IMPALA-6042: Allow Impala shell to use a global impalarc config
Currently, impalarc files can be specified on a per-user basis
(stored in ~/.impalarc), and they aren't created by default. The
Impala shell should pick up /etc/impalarc as well, in addition
to the user-specific configurations.

The intent here is to allow a "global" configuration of the shell
by a system administrator. The default path of the global config
file can be changed by setting the $IMPALA_SHELL_GLOBAL_CONFIG_FILE
environment variable.

Note that the options set in the user config file take precedence
over those in the global config file.

Change-Id: I3a3179b6d9c9e3b2b01d6d3c5847cadb68782816
Reviewed-on: http://gerrit.cloudera.org:8080/13313
Reviewed-by: Bikramjeet Vig <bikramjeet.vig@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-05-30 03:59:54 +00:00
aphadke
a18809ba1f IMPALA-7943: Bump the default client timeout set on impala-shell
As part of IMPALA-7555, we added a default socket timeout of 5 seconds
when connecting to an impalad. Under heavy load with kerberos and SSL
enabled, we could hit this default timeout. This change bumps up the
timeout to 60 secs to make the impala-shell more robust.

Change-Id: Ifc40069e86cbf93634320804efba003fb5551afe
Reviewed-on: http://gerrit.cloudera.org:8080/12051
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-12-07 06:57:24 +00:00
aphadke
2fb8ebaef2 IMPALA-7555: Set socket timeout in impala-shell
impala-shell does not set any socket timeout while connecting to the
impala server. This change sets a timeout on the socket before
connecting and unsets it back after successfully connecting. The default
timeout on this socket is 5 sec.
Usage: impala-shell --client_connect_timeout=<value in ms>

Testing:
1. Added a test where I create a random listening socket.
impala-shell (with ssl enabled) connects to this socket and
times out after 2 sec.

2. Created a kerberized impala cluster with ssl enabled and
connected to the impalad using an openssl client (block the
beeswax server thread to accept new connection) -

E.g. - openssl s_client -connect <IP Addr>:21000
Used impala-shell to connect to the same impalad later.
impala-shell timed out after the default of 5 sec.I verified
it manually.

Change-Id: I130fc47f7a83f591918d6842634b4e5787d00813
Reviewed-on: http://gerrit.cloudera.org:8080/11540
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-18 01:41:42 +00:00
Todd Lipcon
bf24a814cc IMPALA-6335. Allow most shell tests to run in parallel
This adds an IMPALA_HISTFILE environment variable (and --history_file
argument) to the shell which overrides the default location of
~/.impalahistory for the shell history. The shell tests now override
this variable to /dev/null so they don't store history. The tests that
need history use a pytest fixture to use a temporary file for their
history. This allows so that they can run in parallel without stomping
on each other's history.

This also fixes a couple flaky test which were previously missing the
"execute_serially" annotation -- that annotation is no longer needed
after this fix.

A couple of the tests still need to be executed serially because they
look at metrics such as the number of executed or running queries, and
those metrics are unstable if other tests run in parallel.

I tested this by running:

  ./bin/impala-py.test tests/shell/test_shell_interactive.py \
    -m 'not execute_serially' \
    -n 80 \
    --random

... several times in a row on an 88-core box. Prior to the change,
several would fail each time. Now they pass.

Change-Id: I1da5739276e63a50590dfcb2b050703f8e35fec7
Reviewed-on: http://gerrit.cloudera.org:8080/11045
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Todd Lipcon <todd@apache.org>
2018-08-08 03:39:39 +00:00
Vincent Tran
2c1fbecc9f IMPALA-2782: Allow impala-shell to connect directly to impalad when
configured with load balancer and kerberos.

This change adds an impala-shell option -b / --kerberos_host_fqdn.
This allows user to optionally specify the load-balancer's host so
that impala-shell will accept a direct connection to impala daemons
in a kerberized cluster.

Change-Id: I4726226a7a3817421b133f74dd4f4cf8c52135f9
Reviewed-on: http://gerrit.cloudera.org:8080/7241
Reviewed-by: <andy@phdata.io>
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins
2018-03-21 20:45:48 +00:00
Tim Armstrong
56464e4616 IMPALA-3998: Remove refresh_after_connect option from shell
This removes the deprecated option in time for 3.0.

Testing:
Ran core tests. Manually ran the shell with the argument to confirm that
it reported "no such option".

Cherry-picks: not for 2.x.

Change-Id: I8f430bad0578e150d5e80066b9e7572041af4a15
Reviewed-on: http://gerrit.cloudera.org:8080/9072
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2018-01-23 01:48:50 +00:00
Zach Amsden
5905083bfe IMPALA-5127: Add history_max option
Allow users to keep a longer history of queries if desired.  I
personally find it useful to keep a long history of queries to
reference and want to bump this up to a very large value, but
keep the default reasonable.  Also change the config loader
to not freak out over unknown parameters so as not to break
for users that end up with new options set running on older
shells.

Testing: Created .impalarc as follows, now getting more history saved.
Put broken things in .impalarc and make sure they are logged as
warnings.

[impala]
history_max=1000

Change-Id: Iaf65bbecb8fd7f1105aac62b6745d6125a603d7f
Reviewed-on: http://gerrit.cloudera.org:8080/6335
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Impala Public Jenkins
2017-05-16 02:35:22 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace or add the existing ASF license text with the one given
   on the website.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files_txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Martin Grund
ed18dd4a8b IMPALA-80: Dynamic progress reporting for the shell
This patch adds a way to allow for dynamic progress reporting in the
shell. There are two new command line flags for the shell

   --live_progress - will print the completed vs total # of scan ranges
   --live_summary - prints an updated exec summary

In addition to the command line flags, these options can be set from
within the shell using:

   set LIVE_SUMMARY=True
   set LIVE_PROGRESS=True

The new options will be listed under shell options. Both reports will be
updated at most every second, for longer running queries it will be
adjusted to the time between two RPC calls to get the query status. To
provide this information in the ExecSummary, the Thrift structure for
the ExecSummary was extended to contain a progress indicator. The output
is printed to stderr and only available in interactive mode.

An example video is available here:

https://asciinema.org/a/5wi7ypckx4ol4ha1hlg3e3q1k

Change-Id: I70b2ab5fa74dc2ba5bc3b338ef13ddc6ccf367d2
Reviewed-on: http://gerrit.cloudera.org:8080/508
Tested-by: Internal Jenkins
Reviewed-by: Martin Grund <mgrund@cloudera.com>
2015-07-17 17:59:29 +00:00
Abdullah Yousufi
cc517a26d5 IMPALA-1126: Remove strict unicode mode flag in shell
Removed the the --strict_unicode command-line option flag
because encoding to utf-8 is the same regardless if the
encoding policy is strict or ignore.

Change-Id: If71062d580072c51acb1bb401471ca818c97a9be
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3725
Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4078
2014-08-27 22:27:42 -07:00
Abdullah Yousufi
a80506ff3c Refactored impala-shell
This is a reorganization of the existing impala-shell.

The basic idea was to split up the shell into two components: one part
soley responsible for the CLI functionality, and another to represent
the impala client/connection that would interact with the Beeswax api and
execute queries, fetch results, etc.

One major change was to redo how the existing shell handled cancellation,
which was to create a thread for each rpc, so that Ctrl+C would not interrupt
the system calls and break the socket connection. In the new approach,
a new client instance is created to close the query and if the socket connection is
broken, the client reconnects. Cancellation currently works.

Change-Id: I0f371f68552c065b2317f967c6cf7483b44be3df
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3316
Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4008
2014-08-22 20:13:04 -07:00
Abdullah Yousufi
504c83fe78 IMPALA-601: Read shell configuration from a file
Optionally loads options from a file in the user's home directory, called
'.impalarc', though the path to another file can be passed in as a command-line
option. The file must have a case-sensitive [impala] header. Specifying
the option in the command line overwrites the config file's value
for the option for that instance of the shell. If an option is not
specified in the config file, its default value is used.

Change-Id: I218da2c1e10308c5b8729883fa625f0c284397a7
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2956
Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3629
2014-07-29 22:14:50 -07:00