Commit Graph

40 Commits

Author SHA1 Message Date
Michael Smith
512a73771f IMPALA-14452: Fix impala-shell SSL with Python 3.12
Removes deprecated ImpalaHttpClient constructor that supported port and
path as it has been deprecated since at least 2020 and appears unused.

Removes cert_file and key_file as they were also never used, and if
required must now be passed in via ssl_context.

Updates TSSLSocket fixes for Thrift 0.16 and Python 3.12. _validate_cert
was removed by Thrift 0.16, but everything worked because Thrift used
ssl.match_hostname instead. With Python 3.12 ssl.match_hostname no
longer exists so we rely on OpenSSL to handle verification with
ssl.PROTOCOL_TLS_CLIENT.

Only uses ssl.PROTOCOL_TLS_CLIENT when match_hostname is unavailable to
avoid changing existing behavior. THRIFT-792 identifies that TSocket
suppresses connection errors, where we would otherwise see SSL hostname
verification errors like

    ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED]
    certificate verify failed: IP address mismatch, certificate is not
    valid for '::1'. (_ssl.c:1131)

Python 2.7.9 and 3.2 are minimum required versions; both have been EOL
for several years.
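
As a rough illustration of the fallback described above (a sketch under stated assumptions, not the code from this change): when ssl.match_hostname is missing, a context built with ssl.PROTOCOL_TLS_CLIENT leaves hostname verification to OpenSSL. The helper names hostname_check_handled_by_openssl and make_verifying_context are hypothetical.

```python
import ssl


def hostname_check_handled_by_openssl():
    """True when ssl.match_hostname is gone (Python 3.12+), so hostname
    verification has to be delegated to OpenSSL through the SSLContext."""
    return not hasattr(ssl, "match_hostname")


def make_verifying_context(ca_cert=None):
    """Build a client-side context. PROTOCOL_TLS_CLIENT turns on
    check_hostname and CERT_REQUIRED by default.
    ca_cert is an optional path to a CA bundle (hypothetical parameter)."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_cert is not None:
        context.load_verify_locations(cafile=ca_cert)
    return context


context = make_verifying_context()
print(context.check_hostname, context.verify_mode == ssl.CERT_REQUIRED)
```

A test that "hostname checking is configured" could assert on exactly these two context attributes.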

Testing:
- ran custom_cluster/{test_client_ssl.py,test_ipv6.py} on Ubuntu 24 with
  Python 3.12, OpenSSL 3.0.13.
- ran custom_cluster/test_client_ssl.py on RHEL 7.9 with Python 2.7.5
  and Python 3.6.8, OpenSSL 1.0.2k-fips.
- adds test that hostname checking is configured.

Change-Id: I046a9010ac4cb1f7d705935054b306cddaf8bdc7
Reviewed-on: http://gerrit.cloudera.org:8080/23519
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
2025-10-20 09:55:22 +00:00
Csaba Ringhofer
5cca1aa9e5 IMPALA-13820: add ipv6 support for webui/hs2/hs2-http/beeswax
Main changes:
- added flag external_interface to override hostname for
  beeswax/hs2/hs2-http port to allow testing ipv6 on these
  interfaces without forcing ipv6 on internal communication
- compile Squeasel with USE_IPV6 to allow ipv6 on webui (webui
  interface can be configured with existing flag webserver_interface)
- fixed the handling of [<ipv6addr>]:<port> style addresses in
  impala-shell (e.g. [::1]:21050) and test framework
- improved handling of custom clusters in test framework to
  allow webui/ImpalaTestSuite's clients to work with non
  standard settings (also fixes these clients with SSL)
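
The bracketed address form mentioned in the list above could be parsed roughly as follows. This is a minimal sketch, not the impala-shell implementation; parse_host_port and its default_port value are hypothetical.

```python
def parse_host_port(address, default_port=21050):
    """Split 'host', 'host:port' or '[ipv6]:port' into (host, port).

    default_port is only an illustrative fallback for addresses without a port.
    """
    if address.startswith("["):
        host, bracket, rest = address[1:].partition("]")
        if not bracket:
            raise ValueError("missing closing bracket in %r" % address)
        if rest.startswith(":"):
            return host, int(rest[1:])
        if rest:
            raise ValueError("unexpected text after bracket in %r" % address)
        return host, default_port
    if address.count(":") == 1:
        host, port = address.split(":")
        return host, int(port)
    # No colon at all, or an unbracketed IPv6 literal such as '::1'.
    return address, default_port


print(parse_host_port("[::1]:21050"))
```

Without the brackets, the colons inside an IPv6 literal cannot be told apart from the host/port separator, which is why the bracketed form is needed.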

Using ipv4 vs ipv6 vs dual stack can be configured by setting
the interface to bind to with flag webserver_interface and
external_interface. The Thrift server behind hs2/hs2-http/beeswax
only accepts a single host name and uses the first address
returned by getaddrinfo() that it can successfully bind to. This
means that unless an ipv6 address is used (like ::1) the behavior
will depend on the order of addresses returned by getaddrinfo():
63b7a263fc/lib/cpp/src/thrift/transport/TServerSocket.cpp (L481)
For dual stack the only way currently is to bind to "::",
as the Thrift server can only listen on a single socket.
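
To see which address a host name would resolve to first, the getaddrinfo() ordering can be inspected from Python. This is only a sketch of the lookup, not of the Thrift server code; candidate_addresses is a hypothetical helper.

```python
import socket


def candidate_addresses(host, port):
    """Return (family name, address) pairs in the order getaddrinfo() yields them."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM,
                               proto=socket.IPPROTO_TCP)
    return [(family.name, sockaddr[0]) for family, _, _, _, sockaddr in infos]


# A numeric address resolves only to its own family, so the result is fixed.
print(candidate_addresses("127.0.0.1", 21050))
# A host name such as "localhost" may yield IPv4, IPv6, or both, in an order
# that depends on the system configuration.
```

Passing a literal such as ::1 removes the dependency on resolver ordering, which matches the recommendation above.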

Testing:
- added custom cluster tests for ipv6 only/dual interface
  with and without SSL
- manually tested in dual stack environment with client on a
  different host
- among clients impala-shell and impyla are tested, but not
  JDBC/ODBC
- no tests yet on truly ipv6 only environment, as internal
  communication (e.g. krpc) is not ready for ipv6

To test manually the dev cluster can be started with ipv6 support:
dual mode:
bin/start-impala-cluster.py --impalad_args="--external_interface=:: --webserver_interface=::" --catalogd_args="--webserver_interface=::" --state_store_args="--webserver_interface=::"

ipv6 only:
bin/start-impala-cluster.py --impalad_args="--external_interface=::1 --webserver_interface=::1" --catalogd_args="--webserver_interface=::1" --state_store_args="--webserver_interface=::1"

Change-Id: I51ac66c568cc9bb06f4a3915db07a53c100109b6
Reviewed-on: http://gerrit.cloudera.org:8080/22527
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-06-21 14:00:31 +00:00
Riza Suminto
00dc79adf6 IMPALA-13907: Remove reference to create_beeswax_client
This patch replaces references to create_beeswax_client() with
create_hs2_client() or vector-based client creation, in preparation for
the hs2 test migration.

test_session_expiration_with_queued_query is changed to use impala.dbapi
directly from Impyla due to a limitation in ImpylaHS2Connection.

TestAdmissionControllerRawHS2 is migrated to use hs2 as default test
protocol.

Modify test_query_expiration.py to set query options through the client
instead of a SET query. test_query_expiration is slightly modified due to
a behavior difference in hs2 ImpylaHS2Connection.

Remove remaining reference to BeeswaxConnection.QueryState.

Fixed a bug in ImpylaHS2Connection.wait_for_finished_timeout().

Fix some easy flake8 issues caught through this command:
git show HEAD --name-only | grep '^tests.*py' \
  | xargs -I {} impala-flake8 {} \
  | grep -e U100 -e E111 -e E301 -e E302 -e E303 -e F...

Testing:
- Pass exhaustive tests.

Change-Id: I1d84251835d458cc87fb8fedfc20ee15aae18d51
Reviewed-on: http://gerrit.cloudera.org:8080/22700
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-03-29 18:37:45 +00:00
Csaba Ringhofer
e49ed3d243 IMPALA-13790: Fix test_wildcard_san_ssl / test_wildcard_ssl
These tests failed in various ways depending on OS/openssl version.
An issue identified is that the certificates contained CN=* while
wildcard subject should be like *.<domain>. Recreated wildcard
certs with *.impala.test common name and added some host names
that match them in bootstrap_system.sh.

Removed the @xfail from the tests as my expectation is that they
should work on all supported OS.

Tested on
- Ubuntu 20.04 / OpenSSL 1.1.1f
- Ubuntu 22.04 / OpenSSL 3.0.2
- RHEL 7.9     / OpenSSL 1.0.2k
- RHEL 8.6     / OpenSSL 1.1.1k
- Rocky 9.2    / OpenSSL 3.2.2

Change-Id: Ieedf682d06bdb6f8f68a5f77e41175e895b77ca9
Reviewed-on: http://gerrit.cloudera.org:8080/22569
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-03-06 19:42:24 +00:00
Michael Smith
e1098a6a02 IMPALA-13214: Skip wait_until_connected when shell exits
The ImpalaShell class expects to start impala-shell and interact with it
by sending instructions over stdin and reading the results. This
assumption was incorrect when used for impala-shell batch sessions,
where the process exits on its own. If there's a delay in
ImpalaShell.__init__ - between starting the process and polling to see
that it's running - for a batch process, ImpalaShell will fail the
assertion that process_status is None. This can be easily reproduced by
adding a small (0.1s) sleep after starting the new process.

Most batch runs of impala-shell happen through `run_impala_shell_cmd`.
Updated that function to only wait for a successful connection when
stdin input is supplied. Otherwise the command is assumed to be a batch
function and any failures will be detected during `get_result`. Removed
explicit use of `wait_until_connected` as redundant.
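
The race described above can be reproduced with any child process that exits on its own. The sketch below is not the ImpalaShell test class; start_process is a hypothetical stand-in that only decides to wait for a connection when stdin input is supplied.

```python
import subprocess
import sys


def start_process(args, stdin_input=None):
    """Start a child process. Only a session that is fed stdin input is
    expected to still be running right after startup."""
    proc = subprocess.Popen(args, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    wait_for_connection = stdin_input is not None
    return proc, wait_for_connection


# A batch-style child that finishes by itself, like a batch shell session.
proc, wait_for_connection = start_process(
    [sys.executable, "-c", "print('done')"])
out, _ = proc.communicate()
# poll() is not None here: asserting "still running" would fail for batch runs.
batch_exit_status = proc.poll()
print(wait_for_connection, batch_exit_status)
```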

Fixed cases in test_config_file that previously ignored WARNING before
the connection string because they did not specify
`wait_until_connected`.

Tested by running shell/test_shell_commandline.py with a 0.1s delay
before ImpalaShell polls.

Change-Id: I24e029b6192a17773760cb44fd7a4f87b71c0aae
Reviewed-on: http://gerrit.cloudera.org:8080/21598
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Reviewed-by: Kurt Deschler <kdeschle@cloudera.com>
2024-07-24 15:38:43 +00:00
Joe McDonnell
234d641d7b IMPALA-11961/IMPALA-12207: Add Redhat 9 / Ubuntu 22 support
This adds support for Redhat 9 / Ubuntu 22. It updates
to a newer toolchain that has those builds, and it adds
supporting code in bootstrap_system.sh.

Redhat 9 and Ubuntu 22 use python = python3, which requires
various changes to build scripts and tests. Ubuntu 22 uses
Python 3.10, which deprecates certain ssl.PROTOCOL_TLS* constants, so
this adapts test_client_ssl.py to that change until it
can be fully addressed in IMPALA-12219.

Various OpenSSL methods have been deprecated. As a workaround
until these can be addressed properly, this specifies
-Wno-deprecated-declarations. This can be removed once the
code is adapted to the non-deprecated APIs in IMPALA-12226.

Impala crashes with tcmalloc errors unless we update to a newer
gperftools, so this moves to gperftools 2.10. gperftools changed
the default for tcmalloc.aggressive_memory_decommit to off, so
this adapts our code to set it for backend tests. The gperftools
upgrade does not show any performance regression:

+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / none / none | 3.08    | -0.64%     | 2.20       | -0.37%         |
+----------+-----------------------+---------+------------+------------+----------------+

With newer Python versions, the impala-virtualenv command
fails to create a Python 3 virtualenv. This switches to
using Python 3's builtin venv command for Python >=3.6.

Kudu needed a newer version and LLVM required a couple patches.

Testing:
 - Ran a core job on Ubuntu 22 and Redhat 9. The tests run
   to completion without crashing. There are test failures
   that will be addressed in follow-up JIRAs.
 - Ran dockerised tests on Ubuntu 22.
 - Ran dockerised tests on Ubuntu 20 and Rocky 8.5.

Change-Id: If1fcdb2f8c635ecd6dc7a8a1db81f5f389c78b86
Reviewed-on: http://gerrit.cloudera.org:8080/20073
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-06-21 05:21:01 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"
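
A short illustration of the division difference, written for Python 3 (under Python 2 the same results require the "from __future__ import division" line mentioned above). The function names are made up for the example.

```python
def average(total, count):
    # "True" division: the result keeps its fractional part.
    return total / count


def midpoint_index(length):
    # Indices and counts need an integer, so use floor division explicitly.
    return length // 2


print(average(7, 2), midpoint_index(7))
```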

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Joe McDonnell
ba3518366a IMPALA-11952 (part 4): Fix odds and ends: Octals, long, lambda, etc.
There are a variety of small python 3 syntax differences:
 - Octal constants need to start with 0o rather than just 0
 - Long constants are not supported (i.e. numbers ending with L)
 - Lambda syntax is slightly different
 - The 'ur' string mode is no longer supported
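
The Python 3 forms of the constructs listed above look roughly like this. The lambda item is assumed to refer to the removal of tuple parameter unpacking; the variable names are illustrative only.

```python
# Octal constants use the 0o prefix ("0755" is a syntax error in Python 3).
mode = 0o755

# There is no separate long type and no "L" suffix; ints grow as needed.
big = 2 ** 64

# Lambdas cannot unpack tuple parameters; index into the argument instead.
# Python 2 only: lambda (key, value): value
by_value = sorted({"a": 2, "b": 1}.items(), key=lambda item: item[1])

# The ur"" prefix is gone; a plain raw string is already unicode.
pattern = r"\d+"

print(mode, big, by_value, pattern)
```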

Testing:
 - check-python-syntax.sh now passes

Change-Id: Ie027a50ddf6a2a0db4b34ec9b49484ce86947f20
Reviewed-on: http://gerrit.cloudera.org:8080/19554
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
2023-02-28 17:11:50 +00:00
Joe McDonnell
2b550634d2 IMPALA-11952 (part 2): Fix print function syntax
Python 3 now treats print as a function and requires
the parenthesis in invocation.

print "Hello World!"
is now:
print("Hello World!")

This fixes all locations to use the function
invocation. This is more complicated when the output
is being redirected to a file or when avoiding the
usual newline.

print >> sys.stderr , "Hello World!"
is now:
print("Hello World!", file=sys.stderr)

To support this properly and guarantee equivalent behavior
between python 2 and python 3, all files that use print
now add this import:
from __future__ import print_function
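
The "avoiding the usual newline" case mentioned above is handled with the end keyword. A small sketch, using an in-memory buffer so the output can be inspected:

```python
import io

buffer = io.StringIO()
# Python 2 suppressed the newline with a trailing comma: print "Hello",
print("Hello", end=" ", file=buffer)
print("World!", end="", file=buffer)
print(repr(buffer.getvalue()))
```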

This also fixes random flake8 issues that intersect with
the changes.

Testing:
 - check-python-syntax.sh shows no errors related to print

Change-Id: Ib634958369ad777a41e72d80c8053b74384ac351
Reviewed-on: http://gerrit.cloudera.org:8080/19552
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2023-02-28 17:11:50 +00:00
Joe McDonnell
2f74e956aa IMPALA-11472: Reduce test dimensions for TestClientSsl
With the addition of extra dimensions for variants
of impala-shell, TestClientSsl currently runs
four different shells against three different protocols
for a total of 12 dimensions. Some tests in TestClientSsl
take a while to run (e.g. test_wildcard_ssl takes 4 minutes
on some platforms). This can take over an hour to run.

This reduces the test dimensions to only test two
shells (dev python2 and dev python3) with two protocols
(HS2 and HS2-HTTP) for a total of 4 dimensions. This
should reduce the runtime significantly.

Testing:
 - Ran TestClientSsl locally and checked the test
   dimensions
 - Ran shell tests and checked that their test
   dimensions don't change

Change-Id: I3d4a4792a37cba2231d9999e8bfa2279ba029a05
Reviewed-on: http://gerrit.cloudera.org:8080/18843
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
2022-08-16 00:13:18 +00:00
Michael Smith
5263d13112 IMPALA-11314: Test PyPI package with system python
Sets up a virtualenv with system python to install the impala-shell PyPI
package into. Using system python provides better coverage for Python
versions likely to be used by customers. Runs impala-shell tests using
the PyPI package to provide better coverage for the artifact customers
will use.

Includes a PyPI install in notests_independent_targets because these
seem to be used for Python testing despite -notests.

Change-Id: I384ea6a7dab51945828cca629860400a23fa0c05
Reviewed-on: http://gerrit.cloudera.org:8080/18586
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2022-06-13 17:13:42 +00:00
Joe McDonnell
6a199be854 IMPALA-11249: Fix add_test_dimensions() locations to call super()
The original issue is that the strict HS2 shell tests
are not running in precommit or nightly jobs, but they
do run in local developer environments. Investigating
this showed that the shell tests were running with a
weird set of test dimensions that includes
table_format_and_file_extension. That dimension is only
used in test_insert.py::TestInsertFileExtension.

What is happening is that the shell tests and other
locations are running add_test_dimensions() without
calling super(..., cls).add_test_dimensions(). The
behavior is unclear, but there is clearly cross-talk
between the different tests that do this.

This changes all add_test_dimensions() locations to
call super(..., cls).add_test_dimensions() if they
don't already. Each location has been tuned to run
the same set of tests as before (except the shell
tests which now run the strict HS2 tests).
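
One way such cross-talk can arise is through shared class-level state. The sketch below is purely illustrative and assumes a mechanism the commit itself calls unclear; BaseSuite, WithoutSuper, and WithSuper are hypothetical classes, not the Impala test framework.

```python
class BaseSuite(object):
    dimensions = []  # class attribute, shared until a subclass rebinds it

    @classmethod
    def add_test_dimensions(cls):
        cls.dimensions = ["protocol"]  # gives cls its own fresh list


class WithoutSuper(BaseSuite):
    @classmethod
    def add_test_dimensions(cls):
        # No super() call: this appends to the list owned by BaseSuite.
        cls.dimensions.append("table_format_and_file_extension")


class WithSuper(BaseSuite):
    @classmethod
    def add_test_dimensions(cls):
        super(WithSuper, cls).add_test_dimensions()
        cls.dimensions.append("strict_hs2")


WithoutSuper.add_test_dimensions()
WithSuper.add_test_dimensions()
print(BaseSuite.dimensions, WithSuper.dimensions)
```

In this toy model the class that skips super() leaks its dimension into the base class, while the class that calls super() ends up with an isolated list.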

As part of this, several shell tests need to be
skipped or fixed for strict HS2.

Testing:
 - Ran core job
 - Ran tests locally to verify the set of tests
   didn't change.

Change-Id: Ib20fd479d3b91ed0ed89a0bc5623cd2a5a458614
Reviewed-on: http://gerrit.cloudera.org:8080/18557
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-05-26 03:42:51 +00:00
Thomas Tauber-Marshall
39c424d7c8 IMPALA-10454: Bump --ssl_minimum_version to tls1.2
TLS versions < 1.2 are now considered insecure. This patch improves
Impala's default security.

This is made possible now in part because Impala 4.0 dropped support
for Python versions < 2.7.9 (or 2.7.5 on certain distributions where
it has been patched) as lower Python versions do not support tls1.2

Testing:
- Existing SSL tests are updated to reflect the new default.

Change-Id: Ifed66646b041a061f9db92744710aef7453f39e4
Reviewed-on: http://gerrit.cloudera.org:8080/16988
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-28 04:19:39 +00:00
Tim Armstrong
78da6adab8 IMPALA-4238: make TestClientSsl more robust
This changes the test to wait until it is executing in the backend
before trying to cancel it. This should remove planning time as
a variable that might cause the test to be flaky (e.g. if planning
is slow on S3 because of the time taken to list files).

Also dump the /queries debug page when the assertion is hit to
aid debugging.

Change-Id: I0c884f76659005e7245a156ee33c249b86662b75
Reviewed-on: http://gerrit.cloudera.org:8080/16760
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-11-24 00:11:33 +00:00
Lars Volker
74c7b7e55f IMPALA-8863: Add support to run tests over HTTP/HS2
This change adds support to run backend tests over HTTP using a new
version of Impyla (0.16.1). It also adds a test that exercises
authentication over HTTP.

Change-Id: I7156558071781378fcb9c8941c0f4dd82eb0d018
Reviewed-on: http://gerrit.cloudera.org:8080/14059
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-11-26 22:46:40 +00:00
Thomas Tauber-Marshall
16e0d550c2 IMPALA-4057, IMPALA-4050: Fix --webserver_interface
This patch fixes two issues with --webserver_interface:
- When --webserver_interface was used in start-impala-cluster.py with a
  value that's different from --hostname, minicluster startup would
  appear to fail as liveness is determined by checking for the webui's
  availability at the address specified for --hostname.
- The value of --webserver_interface was applied correctly for the
  catalogd and statestored but not for impalads, due to the way
  ExecEnv constructed the Webserver.
- It is now possible to specify a hostname for webserver_interface
  instead of an IP. The webserver will resolve the hostname.

This patch also upgrades our version of psutil to the latest for the
function 'net_if_addrs'. This requires a few changes to our use of
psutil, mostly adding '()' to call functions that previously were
variables.

Testing:
- Added a custom cluster test that finds all available interfaces,
  binds the webserver to one of them, and checks that it's only
  available over that interface.

Change-Id: Ic7e75908426756d73f13a0fa3cfc21fc31da164c
Reviewed-on: http://gerrit.cloudera.org:8080/14266
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-09-20 20:43:26 +00:00
Bharath Vissapragada
72c9370856 IMPALA-8717: impala-shell support for HS2 HTTP endpoint
Adds impala-shell support to connect to HiveServer2 HTTP endpoint.
Relies on toolchain change at https://gerrit.cloudera.org/#/c/13725/.

Use --protocol='hs2-http' to enable this behavior.

Example usages:
---------------
impala-shell --protocol='hs2-http'  (No auth)
impala-shell --protocol='hs2-http' --ldap -u..... (PLAIN auth)
impala-shell --protocol='hs2-http' --ssl --ca_cert... (TLS)
impala-shell --protocol='hs2-http' --ldap --ssl --ca_cert... (LDAP +
TLS)

Limitations:
-----------
- Does not support Kerberos (-k) due to lack of SPNEGO support.

Testing:
--------
- Parameterized existing shell tests to support this combination.
- Added shell test coverage for LDAP auth.

Change-Id: I8323950857dfe1c1dfd5377fde79f87bc2ce9534
Reviewed-on: http://gerrit.cloudera.org:8080/13746
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
2019-07-29 05:43:48 +00:00
Robbie Zhang
d5673bf241 IMPALA-8595: Support TLSv1.2 with Python < 2.7.9 in shell
IMPALA-5690 replaced thrift 0.9.0 with 0.9.3 in which THRIFT-3505
changed transport/TSSLSocket.py.
In thrift 0.9.3, if the python version is lower than 2.7.9, TSSLSocket
uses PROTOCOL_TLSv1 by default and the SSL version is passed to
TSSLSocket as a parameter when calling TSSLSocket.__init__.
Although TLSv1.2 is supported by Python from 2.7.9, Red Hat/CentOS
support TLSv1.2 from 2.7.5 with upgraded python-libs. We need
impala-shell to support TLSv1.2 with Python 2.7.5 on Red Hat/CentOS.
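
One plausible way to avoid relying on the interpreter's version number is to probe the ssl module for the constant itself. This is a hedged sketch, not the patch that was merged; preferred_tls_protocol_name is a hypothetical helper.

```python
import ssl


def preferred_tls_protocol_name():
    """Prefer TLSv1.2 whenever this Python's ssl module exposes it,
    whatever the interpreter version; otherwise fall back to TLSv1."""
    for name in ("PROTOCOL_TLSv1_2", "PROTOCOL_TLSv1"):
        if hasattr(ssl, name):
            return name
    return None


print(preferred_tls_protocol_name())
```

A feature probe like this would pick up the patched python-libs on Red Hat/CentOS even though the reported version is still 2.7.5.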

TESTING:
impala-py.test tests/custom_cluster/test_client_ssl.py

Change-Id: I3fb6510f4b556bd8c6b1e86380379aba8be4b805
Reviewed-on: http://gerrit.cloudera.org:8080/13457
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-06-02 02:40:10 +00:00
Andrew Sherman
0a37377aa3 IMPALA-8333: Remove Impala Shell warnings part 2
Set IMPALA_TOOLCHAIN_BUILD_ID=40-193a30b3af to pickup "Patch Thrift to
0.9.3-p6 to eliminate ssl warnings"

Set IMPALA_THRIFT_VERSION=0.9.3-p6 to pick up the new thrift build,
which removes an unnecessary and confusing warning.

TESTING

Change the tests in test_client_ssl.py which were looking for specific
deprecation warnings to instead search for any Deprecation Warning.
Ran all end-to-end tests with new toolchain.
Built Impala on all supported platforms.

Change-Id: I8ae7e068894da5981fc083e690051da268bfde4d
Reviewed-on: http://gerrit.cloudera.org:8080/13404
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-05-30 22:34:17 +00:00
Tim Armstrong
0a9ea803d2 IMPALA-7290: part 1: clean up shell tests
This sets up the tests to be extensible to test shell
in both beeswax and HS2 modes.

Testing:
* Add test dimension containing only beeswax in preparation
  for HS2 dimension.
* Factor out hardcoded ports.
* Add tests for formatting of all types and NULL values.
* Merge date shell test into general type tests.
* Added testing for floating point output formatting, which does
  change as a result of switching to server-side vs client-side
  formatting.
* Use unique_database for tests that create tables.

Change-Id: Ibe5ab7f4817e690b7d3be08d71f8f14364b84412
Reviewed-on: http://gerrit.cloudera.org:8080/13083
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-04-30 11:30:45 +00:00
Thomas Tauber-Marshall
10b9195035 IMPALA-8407: Warn when Impala shell fails to connect due to tlsv1.2
When impala-shell is used to connect to an impala cluster with
--ssl_minimum_version=tlsv1.2, if the Python version being used is
< 2.7.9 the connection will fail due to a limitation of TSSLSocket.
See IMPALA-6990 for more details.

Currently, when this occurs, the error that gets printed is "EOF
occurred in violation of protocol", which is not very helpful. This
patch detects this situation and prints a more informative warning.
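
The detection could look something like the following. It is a sketch only: tls_warning_for and the wording of the warning are hypothetical, and the real shell code may check different conditions.

```python
def tls_warning_for(error_message, python_version):
    """Return a more helpful hint for the unhelpful EOF error, or None.

    python_version is a tuple such as (2, 7, 5).
    """
    if ("EOF occurred in violation of protocol" in error_message
            and python_version < (2, 7, 9)):
        return ("Warning: TLSv1.2 is not supported for Python < 2.7.9; "
                "the server may require it.")
    return None


print(tls_warning_for("EOF occurred in violation of protocol", (2, 7, 5)))
```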

Testing:
- Updated test_tls_v12 so that instead of being skipped on affected
  platforms, it runs and checks for the presence of the warning.

Change-Id: I3feddaccb9be3a15220ce9e59aa7ed41d41b8ab6
Reviewed-on: http://gerrit.cloudera.org:8080/13003
Reviewed-by: Thomas Marshall <tmarshall@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-04-18 23:19:04 +00:00
Andrew Sherman
00df388c5f IMPALA-8332: Remove Impala Shell warnings part 1
In Thrift 0.9.3 the TSSLSocket initializer TSSLSocket.__init__ prints
warnings if positional parameters are used. Change our usage of this
initializer to use named parameters.

Follow up work on "IMPALA-8333 Remove Impala Shell warnings part 2" will
remove one further warning message.

TESTING

Ran all end-to-end tests.
Added tests for the deprecation warnings to test_client_ssl.py.

Change-Id: I31f9a0bb12ca6a1da9129eacd29ac105b883e01b
Reviewed-on: http://gerrit.cloudera.org:8080/12837
Reviewed-by: Fredy Wijaya <fwijaya@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-03-22 23:11:18 +00:00
Fredy Wijaya
9c44853998 IMPALA-6591: Fix test_ssl flaky test
test_ssl has logic that waits for the number of in-flight queries to
be 1. However, the logic for wait_for_num_in_flight_queries(1) only
waits for the condition to be true for a period of time and does not
throw an exception when the time has elapsed and the condition is not
met. In other words, the logic in test_ssl that loops while the number
of in-flight queries is 1 never gets executed. I was able to simulate
this issue by making Impala shell take much longer to start.

Prior to this patch, in the event that Impala shell took much longer to
start, the test started sending the commands to Impala shell even when
Impala shell was not ready to receive commands. The patch fixes the
issue by waiting until Impala shell is connected. The patch also adds
asserts in other places that call wait_for_num_in_flight_queries and
updates the default behavior for Impala shell to wait until it is
connected.
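
The pitfall is general to any wait helper that reports a timeout through its return value. A minimal sketch, with wait_until as a hypothetical stand-in for wait_for_num_in_flight_queries:

```python
import time


def wait_until(condition, timeout_s, interval_s=0.01):
    """Poll condition() until it is true or timeout_s elapses.

    Returns the outcome instead of raising, so callers must check it.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval_s)
    return condition()


# Ignoring the result silently continues after a timeout; asserting on it
# turns the timeout into a visible test failure.
reached = wait_until(lambda: True, timeout_s=0.1)
assert reached, "condition was not met before the timeout"
```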

Testing:
- Ran core and exhaustive tests several times on CentOS 6 without any
  issue

Change-Id: I9805269d8b806aecf5d744c219967649a041d49f
Reviewed-on: http://gerrit.cloudera.org:8080/12047
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-12-12 22:44:34 +00:00
Fredy Wijaya
96f9765348 IMPALA-7893: Correctly handle Ctrl+C for cancelling a non-running query
This patch fixes the issue with Ctrl+C handling for cancelling a
non-running query to behave similar to Linux shell.

Before (pressing Ctrl+C does not do anything):
[localhost:21000] default> select

After (pressing Ctrl+C cancels the query and starts a new prompt):
[localhost:21000] default> select^C
[localhost:21000] default>

Testing:
- Added a new cancellation test
- Ran all shell E2E tests

Change-Id: I80d7b2c2350224d88d0bfeb1745d9ed76e83cf6d
Reviewed-on: http://gerrit.cloudera.org:8080/11990
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-11-28 10:28:39 +00:00
Thomas Tauber-Marshall
5e92d139b9 IMPALA-7678: Reapply "IMPALA-7660: Support ECDH ciphers for debug webserver"
This patch reverses the revert of IMPALA-7660.

The problem with IMPALA-7660 was that urllib.urlopen added the
'context' parameter in 2.7.9, so it isn't present on rhel7, which uses
2.7.5

The fix is to switch to using the 'requests' library, which supports
ssl connections on all the platforms Impala is supported on.

This patch also adds more info to the error message printed by
start-impala-cluster.py when the debug webserver cannot be reached yet
to help with debugging these issues in the future.

Testing:
- Ran full builds on rhel7, rhel6, and ubuntu16.

Change-Id: I679469ed7f27944f75004ec4b16d513e6ea6b544
Reviewed-on: http://gerrit.cloudera.org:8080/11625
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-17 05:39:32 +00:00
poojanilangekar
fec2d64e8f IMPALA-7678: Revert "IMPALA-7660: Support ECDH ciphers for debug webserver"
This reverts commit 0e1de31ba5.

Change-Id: Id4034a4323be741bc7d9fffcf17288aeb3649b31
Reviewed-on: http://gerrit.cloudera.org:8080/11616
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-08 22:19:54 +00:00
Thomas Tauber-Marshall
0e1de31ba5 IMPALA-7660: Support ECDH ciphers for debug webserver
A recent change (IMPALA-7519) added support for ecdh ciphers for the
beeswax/hs2 server. This patch pulls in a recent change on squeasel to
extend that support to the debug webserver.

It also fixes a bug that prevented start-impala-cluster.py from
completing successfully when the webserver is launched with ssl, due
to it trying to verify the availability of the webserver over http.

Testing:
- Added a custom cluster test that verifies start-impala-cluster.py
  runs successfully with webserver ssl enabled.
- Adds the webserver to an existing test for ecdh ciphers.

Change-Id: I80a6b370d5860812cde13229b5bcb2977814c73c
Reviewed-on: http://gerrit.cloudera.org:8080/11585
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-05 21:16:17 +00:00
Philip Zeyliger
d3cf6d3257 IMPALA-7629: Re-enable erroneously disabled TestClientSsl tests.
The fix for IMPALA-6990 had a bug, disabling some tests erroneously.
With this change, the tests run on Ubuntu 16.04 like so:

  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[] PASSED
  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_ecdh[] PASSED
  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_v12[] PASSED
  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_ssl[] xfail
  tests/custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_san_ssl[] xfail

The xfails are all "Inconsistent wildcard support on target platforms".

On centos7:

  custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[] PASSED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_ecdh[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_v12[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_ssl[] xfail
  custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_san_ssl[] xfail

On centos6:
  custom_cluster/test_client_ssl.py::TestClientSsl::test_ssl[] PASSED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_ecdh[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_tls_v12[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_ssl[] SKIPPED
  custom_cluster/test_client_ssl.py::TestClientSsl::test_wildcard_san_ssl[] SKIPPED

I used "curl --silent https://.../consoleText | grep test_client_ssl | sed -e 's/\[.*\]/[]/'"
to extract these from Jenkins output.
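
The sed expression above collapses the bracketed test parameters to "[]". The same substitution in Python, as a sketch (the sample line and its bracket contents are made up):

```python
import re


def strip_test_params(line):
    """Replace everything between the first '[' and the last ']' with '[]'."""
    return re.sub(r"\[.*\]", "[]", line)


line = ("custom_cluster/test_client_ssl.py::TestClientSsl::"
        "test_ssl[example-params] PASSED")
print(strip_test_params(line))
```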

Change-Id: I64879b8af39f967b0059797e7b36421ce0e58bed
Reviewed-on: http://gerrit.cloudera.org:8080/11530
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-02 01:54:41 +00:00
Tim Armstrong
09150f04ca IMPALA-7628: skip test_tls_ecdh on Python 2.6
This is a temporary workaround. On the CentOS 6 build that failed
test_tls_v12, test_wildcard_san_ssl and test_wildcard_ssl were
all skipped so I figured this will unblock the tests without
losing coverage on most platforms that have recent Python.

Change-Id: I94ae9d254d5fd337774a24106eb9b08585ac0b01
Reviewed-on: http://gerrit.cloudera.org:8080/11519
Reviewed-by: Thomas Marshall <thomasmarshall@cmu.edu>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-26 19:56:11 +00:00
Thomas Tauber-Marshall
cf7f221d2f IMPALA-7519: Support elliptic curve ssl ciphers
Thrift's SSLSocketFactory class does not support setting ciphers that
use ecdh. This patch modifies our existing subclass of
SSLSocketFactory to override the ciphers() method and enable ECDH.

The code for this was taken from be/src/kudu/security/tls_context.cc

Testing:
- Added a custom cluster test that verifies that a cluster with only
  ECDH ciphers enabled works.

Change-Id: I1666ceabec51b425e8a82be1cf519e2ac35fa5a6
Reviewed-on: http://gerrit.cloudera.org:8080/11376
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-09-24 20:24:26 +00:00
Sailesh Mukil
fdd9d059c2 IMPALA-6990: TestClientSsl.test_tls_v12 failing due to Python SSL error
When we upgraded to thrift-0.9.3, the TSSLSocket.py logic changed quite
a bit. Our RHEL7 machines come equipped with Python 2.7.5. Looking at
these comments, that means that we'll be unable to create a 'SSLContext'
but be able to explicitly specify ciphers:
88591e32e7/lib/py/src/transport/TSSLSocket.py (L37-L41)
    # SSLContext is not available for Python < 2.7.9
    _has_ssl_context = sys.hexversion >= 0x020709F0

    # ciphers argument is not available for Python < 2.7.0
    _has_ciphers = sys.hexversion >= 0x020700F0

If we cannot create a 'SSLContext', then we cannot use TLSv1.2 and have
to use TLSv1:
88591e32e7/lib/py/src/transport/TSSLSocket.py (L48-L49)
    # For python >= 2.7.9, use latest TLS that both client and server
    # supports.
    # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3.
    # For python < 2.7.9, use TLS 1.0 since TLSv1_X nor OP_NO_SSLvX is
    # unavailable.
    _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else \
        ssl.PROTOCOL_TLSv1

Our custom cluster test forces the server to use TLSv1.2 and also forces
a specific cipher:
2f22a6f67f/tests/custom_cluster/test_client_ssl.py (L118-L119)

So this combination of configuration values causes a failure in RHEL7
because we only allow a specific cipher which works with TLSv1.2, but
the client cannot use TLSv1.2 due to the Python version as mentioned above.

We've not noticed these failures on systems older than RHEL 7 because
the OpenSSL versions on those systems (< OpenSSL 1.0.1) don't support TLSv1.2.

To fix this, we need to change the Python version on RHEL 7 to be
>= Python 2.7.9. This patch skips the test if an older version of
Python than 2.7.9 is detected.
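
A version gate of this kind might look like the sketch below. It reuses
the hexversion threshold quoted from TSSLSocket.py above; the constant
and function names are assumptions for illustration, not the code added
by this patch.

```python
import sys

# Threshold quoted from Thrift's TSSLSocket.py: SSLContext needs >= 2.7.9.
# The constant name is hypothetical.
MIN_HEXVERSION_FOR_SSL_CONTEXT = 0x020709F0


def has_ssl_context(hexversion=None):
    """Return True if the interpreter is new enough to create an SSLContext.

    `hexversion` defaults to the running interpreter's sys.hexversion; it is
    a parameter only so that other versions can be checked in isolation.
    """
    if hexversion is None:
        hexversion = sys.hexversion
    return hexversion >= MIN_HEXVERSION_FOR_SSL_CONTEXT
```

A test could then be guarded with something like
pytest.mark.skipif(not has_ssl_context(), reason="...").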

Change-Id: I92c66ecaeb94b0c83ee6f1396c082709c21b3187
Reviewed-on: http://gerrit.cloudera.org:8080/10529
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-05-31 05:29:13 +00:00
Henry Robinson
c163ac1468 IMPALA-5816: xfail wildcard TLS cert tests
Wildcard support is not uniform across all platforms that Impala is
tested on. This patch xfails the wildcard tests in test_client_ssl.

A follow-up change will generate certificates on a per-host basis, which
should allow compatible wildcard certs to be generated for all platforms.

Change-Id: I86148739aa1c66c817eed8b727f68cfc08c178ed
Reviewed-on: http://gerrit.cloudera.org:8080/7908
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Impala Public Jenkins
2017-08-31 02:50:16 +00:00
Henry Robinson
81c3d883b9 IMPALA-5775: (Addendum) Make SSL cluster actually come up in test_client_ssl.py
The non-wildcard certs in test_client_ssl.py require that the hostname
of the process is 'localhost' for clients to validate them. This wasn't
the case for one test, and so the cluster wouldn't actually
start. Although the test would still pass (because the shell wasn't
actually checking the certificate), it's better hygiene to have the
cluster correctly configured to make sure we're testing what we think we
are.

Testing: test continues to pass

Change-Id: Idad8bbf3b8be853d3406bcbaed24909501500ea9
Reviewed-on: http://gerrit.cloudera.org:8080/7732
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Impala Public Jenkins
2017-08-24 02:23:21 +00:00
Henry Robinson
e4a0e2f391 IMPALA-5775: Allow shell to support TLSv1, v1.1 and v1.2
The shell uses Thrift's TSSLSocket to negotiate secure connections to
Impala. This socket uses a variable SSL_VERSION to determine which SSL
and TLS protocol versions it will connect to.

SSL_VERSION was hardcoded to be PROTOCOL_TLSv1, which only supports
TLSv1 servers and no other protocol version. Change the allowed version
to be PROTOCOL_SSLv23, which supports any TLS or SSL protocol. We rely
on the server not to allow SSLv2 or v3 connections.

Testing: Added a new custom cluster test to confirm that the shell can
connect to a TLSv1.2 cluster. Confirmed that the test is correctly
skipped on machines with an old version of OpenSSL that does not support
TLSv1.2.

Change-Id: I5487f82d110676b9c3c7a5305931da00c7f68ca0
Reviewed-on: http://gerrit.cloudera.org:8080/7675
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-08-16 08:10:02 +00:00
Jim Apple
e39f1676e1 IMPALA-4295: XFAIL wildcard SSL test
commit 9f61397fc4 exposed a bug (one
that was latent before the commit). I am XFAILing this now just to
green the build; IMPALA-4295 can be resolved when this issue is fixed
and not just XFAILed.

Change-Id: Ie809c6c6c967447d527927ebbc6b110095e7320a
Reviewed-on: http://gerrit.cloudera.org:8080/4784
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Tested-by: Internal Jenkins
2016-10-22 02:51:25 +00:00
Henry Robinson
9f61397fc4 IMPALA-2905: Handle coordinator fragment lifecycle like all others
The plan-root fragment instance that runs on the coordinator should be
handled like all others: started via RPC and run asynchronously. Without
this, the fragment requires special-case code throughout the
coordinator, and does not show up in system metrics etc.

This patch adds a new sink type, PlanRootSink, to the root fragment
instance so that the coordinator can pull row batches that are pushed by
the root instance. The coordinator signals completion to the fragment
instance via closing the consumer side of the sink, whereupon the
instance is free to complete.

Since the root instance now runs asynchronously with respect to the coordinator,
we add several coordination methods to allow the coordinator to wait for
a point in the instance's execution to be hit - e.g. to wait until the
instance has been opened.

Done in this patch:

* Add PlanRootSink
* Add coordination to PFE to allow coordinator to observe lifecycle
* Make FragmentMgr a singleton
* Removed dead code from Coordinator::Wait() and elsewhere.
* Moved result output exprs out of QES and into PlanRootSink.
* Remove special-case limit-based teardown of coordinator fragment, and
  supporting functions in PlanFragmentExecutor.
* Simplified lifecycle of PlanFragmentExecutor by separating Open() into
  Open() and Exec(), the latter of which drives the sink by reading
  rows from the plan tree.
* Add child profile to PlanFragmentExecutor to measure time spent in
  each lifecycle phase.
* Removed dependency between InitExecProfiles() and starting root
  fragment.
* Removed mostly dead-code handling of LIMIT 0 queries.
* Ensured that SET returns a result set in all cases.
* Fix test_get_log() HS2 test. Errors are only guaranteed to be visible
  after fetch calls return EOS, but test was assuming this would happen
  after first fetch.

Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df
Reviewed-on: http://gerrit.cloudera.org:8080/4402
Tested-by: Internal Jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2016-10-16 15:55:29 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace the existing ASF license text with the one given
   on the website, or add it where it is missing.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Sailesh Mukil
45ff0f9e67 IMPALA-3159: impala-shell does not accept wildcard or SAN certificates
The impala-shell could not accept wildcard or SAN certificates
previously as the thrift library it depended on did not support them.
This patch subclasses TSSLSocket and adds the logic to take care of
the above mentioned cases by introducing the new
TSSLSocketWithWildcardSAN class.

The certificate matching logic is based on the python-ssl source code.
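
For illustration only, wildcard and SAN matching can be sketched as
follows. This is a simplified version and not the code in
TSSLSocketWithWildcardSAN: it assumes a '*' is honored only as the whole
left-most label, and that the common name is consulted only when the
certificate carries no SAN DNS names. All function names are
hypothetical.

```python
def dnsname_matches(pattern, hostname):
    """Return True if `hostname` matches the certificate name `pattern`.

    Simplification: '*' is accepted only as the entire left-most label and
    matches exactly one non-empty label.
    """
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().split(".")
    if len(pattern_labels) != len(host_labels):
        return False
    # Everything right of the left-most label must match literally.
    if pattern_labels[1:] != host_labels[1:]:
        return False
    if pattern_labels[0] == "*":
        return bool(host_labels[0])
    return pattern_labels[0] == host_labels[0]


def cert_matches(hostname, common_name=None, san_dns_names=()):
    """Check `hostname` against SAN DNS names, else fall back to the CN."""
    names = list(san_dns_names)
    if not names and common_name:
        names = [common_name]
    return any(dnsname_matches(name, hostname) for name in names)
```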

Added custom cluster tests to test both wildcard matching and SAN
matching.

Added be/src/testutil/certificates-info.txt which contains all the
information about the certificates which are added for the tests.

This has been tested with Python 2.4 and Python 2.6.

Change-Id: I75e37012eeeb0bcf87a5edf875f0ff915daf8b89
Reviewed-on: http://gerrit.cloudera.org:8080/3765
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Internal Jenkins
2016-07-26 02:44:25 +00:00
Taras Bobrovytsky
609b80410e Clean up Python test import statements
Many of our test scripts have import statements that look like
"from xxx import *". It is a good practice to explicitly name what
needs to be imported. This commit implements this practice. Also,
unused import statements are removed.

Change-Id: I6a33bb66552ae657d1725f765842f648faeb26a8
Reviewed-on: http://gerrit.cloudera.org:8080/3444
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
2016-07-15 23:26:18 +00:00
Henry Robinson
0dde1c2f86 IMPALA-3628: Fix cancellation from shell when security is enabled
To cancel a query, the shell will create a separate connection inside
its SIGINT handler and send the cancellation RPC. However, this
connection did not start a secure connection if it needed to, meaning
that the cancellation attempt would just hang.

A workaround is to kill the shell process, which I expect is what users
have been doing, as this bug has been around since 2014.

Testing:

I added a custom cluster test that starts Impala with SSL
enabled, and wrote two tests - one just to check SSL connectivity, and
the other to mimic the existing test_cancellation which sends SIGINT to
the shell process. In doing so I refactored the shell testing code a bit
so that all tests use a single ImpalaShell object, rather than rolling
their own Popen() based approaches when they needed to do something
unusual, like cancel a query.

In the cancellation test on my machine, SIGINT can take a few tries to
be effective. I'm not sure if this is a timing thing - perhaps the
Python interpreter doesn't correctly pass signals through to a handler
if it's in a blocking call, for example. The test reliably passes within
~5 tries on my machine, so the test tries 30 times, once per second.
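
The retry strategy described above can be sketched as a small helper.
This is not the test's actual code: the function name, its parameters
and the injectable sleep are assumptions made for illustration.

```python
import time


def retry_until(action, succeeded, attempts=30, delay_s=1.0, sleep=time.sleep):
    """Repeat `action` until `succeeded()` is true or attempts run out.

    Each try calls action(), waits `delay_s` seconds, then checks
    succeeded(). Returns the number of tries used, or None on failure.
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    for attempt in range(1, attempts + 1):
        action()
        sleep(delay_s)
        if succeeded():
            return attempt
    return None
```

In a cancellation test, action could send SIGINT to the shell process
(for example via subprocess.Popen.send_signal) and succeeded could check
the shell's output for a cancellation message.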

Change-Id: If99085e75708d92a08dbecf0131a2234fedad33a
Reviewed-on: http://gerrit.cloudera.org:8080/3302
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
2016-07-05 16:40:23 -07:00