Commit Graph

2 Commits

Author SHA1 Message Date
Michael Smith
512a73771f IMPALA-14452: Fix impala-shell SSL with Python 3.12
Removes the deprecated ImpalaHttpClient constructor form that accepted
port and path; it has been deprecated since at least 2020 and appears
unused.

Removes cert_file and key_file, which were also never used; if
required, they must now be passed in via ssl_context.

Updates TSSLSocket fixes for Thrift 0.16 and Python 3.12. _validate_cert
was removed by Thrift 0.16, but everything worked because Thrift used
ssl.match_hostname instead. With Python 3.12, ssl.match_hostname no
longer exists, so we rely on OpenSSL to handle verification by using
ssl.PROTOCOL_TLS_CLIENT.

Only uses ssl.PROTOCOL_TLS_CLIENT when match_hostname is unavailable to
avoid changing existing behavior. THRIFT-792 identifies that TSocket
suppresses connection errors, where we would otherwise see SSL hostname
verification errors like

    ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED]
    certificate verify failed: IP address mismatch, certificate is not
    valid for '::1'. (_ssl.c:1131)
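
A minimal sketch of the protocol selection described above. It is an
illustration only, not the code from this change: the helper name
build_client_context and its ca_file parameter are hypothetical, and the
real TSSLSocket wiring is omitted.

```python
import ssl


def build_client_context(ca_file=None):
    """Hypothetical helper: pick the protocol based on match_hostname."""
    if hasattr(ssl, "match_hostname"):
        # Older Pythons: keep the legacy protocol so existing behavior is
        # unchanged. The hostname is assumed to be checked separately by
        # the caller via ssl.match_hostname after the handshake.
        ctx = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
        ctx.verify_mode = ssl.CERT_REQUIRED
    else:
        # Python 3.12+: ssl.match_hostname is gone. PROTOCOL_TLS_CLIENT
        # enables check_hostname and CERT_REQUIRED by default, so OpenSSL
        # performs hostname verification during the handshake.
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file is not None:
        ctx.load_verify_locations(ca_file)
    return ctx


ctx = build_client_context()
print("check_hostname:", ctx.check_hostname)
print("verify_mode:", ctx.verify_mode)
```

Either branch requires a valid certificate; only the place where the
hostname is checked differs.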

Python 2.7.9 and 3.2 are minimum required versions; both have been EOL
for several years.

Testing:
- ran custom_cluster/{test_client_ssl.py,test_ipv6.py} on Ubuntu 24 with
  Python 3.12, OpenSSL 3.0.13.
- ran custom_cluster/test_client_ssl.py on RHEL 7.9 with Python 2.7.5
  and Python 3.6.8, OpenSSL 1.0.2k-fips.
- adds test that hostname checking is configured.

Change-Id: I046a9010ac4cb1f7d705935054b306cddaf8bdc7
Reviewed-on: http://gerrit.cloudera.org:8080/23519
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
2025-10-20 09:55:22 +00:00
Joe McDonnell
ea0969a772 IMPALA-11980 (part 2): Fix absolute import issues for impala_shell
Python 3 changed the behavior of imports with PEP 328. Existing
imports become absolute unless they use the new relative import
syntax. This adapts the impala-shell code to use absolute
imports, fixing issues where it is imported from our test code.

There are several parts to this:
1. It moves impala shell code into shell/impala_shell.
   This matches the directory structure of the PyPI package.
2. It changes the imports in the shell code to be
   absolute paths (i.e. impala_shell.foo rather than foo).
   This fixes issues with Python 3 absolute imports.
   It also eliminates the need for ugly hacks in the PyPI
   package's __init__.py.
3. This changes Thrift generation to put it directly in
   $IMPALA_HOME/shell rather than $IMPALA_HOME/shell/gen-py.
   This means that the generated Thrift code is rooted in
   the same directory as the shell code.
4. This changes the PYTHONPATH to include $IMPALA_HOME/shell
   and not $IMPALA_HOME/shell/gen-py. This means that the
   test code is using the same import paths as the PyPI
   package.
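
The import behavior behind items 2 and 4 can be sketched with a
throwaway package. Everything here is hypothetical and for illustration
only: demo_shell stands in for impala_shell, demo_shell_util for one of
its modules, and the temporary directory for $IMPALA_HOME/shell on
PYTHONPATH.

```python
import importlib
import os
import sys
import tempfile


def write_file(path, body):
    with open(path, "w") as f:
        f.write(body)


def build_demo_package():
    """Create a throwaway package; return the directory containing it."""
    root = tempfile.mkdtemp()
    pkg = os.path.join(root, "demo_shell")
    os.mkdir(pkg)
    write_file(os.path.join(pkg, "__init__.py"), "")
    write_file(os.path.join(pkg, "demo_shell_util.py"),
               "def greet():\n    return 'hello'\n")
    # Python 2 style: sibling module imported by its bare name.
    write_file(os.path.join(pkg, "implicit_style.py"),
               "import demo_shell_util\n")
    # Absolute style: the import path is rooted at the package name.
    write_file(os.path.join(pkg, "absolute_style.py"),
               "from demo_shell.demo_shell_util import greet\n")
    return root


def can_import(name):
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


# Only the directory that contains the package goes on the path.
root = build_demo_package()
sys.path.insert(0, root)
importlib.invalidate_caches()

absolute_ok = can_import("demo_shell.absolute_style")
# On Python 3 the bare-name import is treated as absolute and fails,
# because demo_shell_util is not a top-level module.
implicit_ok = can_import("demo_shell.implicit_style")
print("absolute import works:", absolute_ok)
print("implicit relative import works:", implicit_ok)
```

Because only the package's parent directory is on the path, callers and
the package's own modules resolve names the same way.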

With all of these changes, the source code is very close
to the directory structure of the PyPI package. As long as
CMake has generated the thrift files and the Python version
file, only a few differences remain. This removes those
differences by moving the setup.py / MANIFEST.in and other
files from the packaging directory to the top-level
shell/ directory. This means that one can pip install
directly from the source code, i.e. pip install $IMPALA_HOME/shell

This also moves the shell tarball generation script to the
packaging directory and changes bin/impala-shell.sh to use
Python 3.

This sorts the imports using isort for the affected Python files.

Testing:
 - Ran a regular core job with Python 2
 - Ran a core job with Python 3 and verified that the absolute
   import issues are gone.

Change-Id: Ica75a24fa6bcb78999b9b6f4f4356951b81c3124
Reviewed-on: http://gerrit.cloudera.org:8080/22330
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-05-21 15:14:11 +00:00