Currently, the shell tarball maintains its own packaging code
and directory layout. This is complicated, and it requires
several Python packages to be checked directly into our repository.
To simplify it, this changes the shell tarball to be based on
pip installing the pypi package. Specifically, the new directory
structure for an unpacked shell tarball is:
impala-shell-4.5.0-SNAPSHOT/
  impala-shell
  install_py${PYTHON_VERSION}/
  install_py${ANOTHER_PYTHON_VERSION}/
For example, install_py2.7 is the Python 2.7 pip install of impala-shell,
and install_py3.8 is the Python 3.8 pip install of impala-shell. This means
that the impala-shell script simply picks the install for the
specified version of Python and uses that pip install directory.
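The selection step can be sketched as follows. This is only an
illustration of the directory naming described above, not the actual
impala-shell launcher script; the helper names install_dir_name and
pick_install_dir are hypothetical.

```python
import os
import sys


def install_dir_name(version_info=sys.version_info):
    """Build the directory name for a Python version, e.g. install_py3.8."""
    return "install_py{0}.{1}".format(version_info[0], version_info[1])


def pick_install_dir(tarball_root, version_info=sys.version_info):
    """Return the pip install directory for the given Python version.

    Hypothetical helper: assumes the unpacked tarball contains one
    install_py<major>.<minor> directory per supported interpreter.
    """
    path = os.path.join(tarball_root, install_dir_name(version_info))
    if not os.path.isdir(path):
        raise RuntimeError("no pip install for this Python version: " + path)
    return path
```

A caller would pass the unpacked tarball root, for example
pick_install_dir("impala-shell-4.5.0-SNAPSHOT"), and get an error if no
install exists for the running interpreter.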
To make this more consistent across different Linux distributions, this
upgrades pip in the virtualenv to the latest version.
With this, ext-py and pkg_resources.py can be removed.
This requires rearranging the shell build code. Specifically, this splits
out the code that generates impala_build_version.py so that it can run
before generating the pypi package. The shell tarball now has a dependency
on the pypi package and must run after it.
This builds on Michael Smith's work from IMPALA-11399.
Testing:
- Ran shell tests locally
- Built on CentOS 7, Red Hat 8 & 9, Ubuntu 20 & 22, SLES 15
Change-Id: Ifbb66ab2c5bc7180221f98d9bf5e38d62f4ac036
Reviewed-on: http://gerrit.cloudera.org:8080/20171
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Impala Interactive Shell
You can use the Impala shell tool (impala-shell) to connect to an Impala service. The shell allows you to set up databases and tables, insert data, and issue queries. For ad hoc queries and exploration, you can submit SQL statements in an interactive session. The impala-shell interpreter accepts all the same SQL statements listed in Impala SQL Statements, plus some shell-only commands that you can use for tuning performance and diagnosing problems.
To automate your work, you can specify command-line options to process a single statement or a script file. (Other avenues for Impala automation via Python are provided by Impyla or ODBC.)
Installing
$ pip install impala-shell
Online documentation
Quickstart
Non-interactive mode
Processing a single query, e.g., show tables:
$ impala-shell -i impalad-host.domain.com -d some_database -q 'show tables'
Processing a text file with a series of queries:
$ impala-shell -i impalad-host.domain.com -d some_database -f /path/to/queries.sql
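When these non-interactive calls are driven from a script, it can help to build the argument list in one place. The following is a minimal sketch, assuming impala-shell is on the PATH; the helper names build_impala_shell_cmd and run_impala_shell are hypothetical, and only the -i, -d, -q, and -f options shown above are used.

```python
import subprocess


def build_impala_shell_cmd(host, database=None, query=None, query_file=None):
    """Build an argv list for a non-interactive impala-shell call.

    Exactly one of query (passed with -q) or query_file (passed with -f)
    must be given.
    """
    if (query is None) == (query_file is None):
        raise ValueError("pass exactly one of query or query_file")
    cmd = ["impala-shell", "-i", host]
    if database is not None:
        cmd += ["-d", database]
    if query is not None:
        cmd += ["-q", query]
    else:
        cmd += ["-f", query_file]
    return cmd


def run_impala_shell(cmd):
    """Run the command and return its standard output as text.

    check=True raises CalledProcessError if the process exits non-zero.
    """
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with queries that contain spaces or quotes, for example run_impala_shell(build_impala_shell_cmd("impalad-host.domain.com", database="some_database", query="show tables")).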
Launching the interactive shell
To connect to an impalad host at the default service port (21000):
$ impala-shell -i impalad-host.domain.com
Starting Impala Shell without Kerberos authentication
Connected to impalad-host.domain.com:21000
Server version: impalad version 2.11.0-SNAPSHOT RELEASE (build d4596f9ca3ea32a8008cdc809a7ac9a3dea47962)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v3.0.0-SNAPSHOT (73e90d2) built on Thu Mar 8 00:59:00 PST 2018)
The '-B' command line flag turns off pretty-printing for query results. Use this
flag to remove formatting from results you want to save for later, or to benchmark
Impala.
***********************************************************************************
[impalad-host.domain.com:21000] >
Launching the interactive shell (secure mode)
To connect to a secure host using Kerberos and SSL:
$ impala-shell -k --ssl -i impalad-secure-host.domain.com
Disconnecting
To exit the shell when running interactively, press Ctrl-D at the shell prompt.