7 Commits

Author SHA1 Message Date
Joe McDonnell
ea0969a772 IMPALA-11980 (part 2): Fix absolute import issues for impala_shell
Python 3 changed the behavior of imports with PEP 328. Existing
imports become absolute unless they use the new relative import
syntax. This adapts the impala-shell code to use absolute
imports, fixing issues where it is imported from our test code.

There are several parts to this:
1. It moves impala shell code into shell/impala_shell.
   This matches the directory structure of the PyPI package.
2. It changes the imports in the shell code to be
   absolute paths (i.e. impala_shell.foo rather than foo).
   This fixes issues with Python 3 absolute imports.
   It also eliminates the need for ugly hacks in the PyPI
   package's __init__.py.
3. This changes Thrift generation to put it directly in
   $IMPALA_HOME/shell rather than $IMPALA_HOME/shell/gen-py.
   This means that the generated Thrift code is rooted in
   the same directory as the shell code.
4. This changes the PYTHONPATH to include $IMPALA_HOME/shell
   and not $IMPALA_HOME/shell/gen-py. This means that the
   test code is using the same import paths as the PyPI
   package.
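
A minimal sketch of the absolute-import style described in items 2 and 4,
assuming a throwaway package named demo_shell_pkg (a hypothetical name, not
part of Impala). It builds the package in a temporary directory, puts only the
parent directory on sys.path (the role $IMPALA_HOME/shell plays above), and
imports a sibling module by its full package path:

```python
import importlib
import os
import sys
import tempfile

# Hypothetical package name, chosen so it cannot clash with a real install.
PKG = "demo_shell_pkg"

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, PKG)
os.mkdir(pkg_dir)


def write(name, text):
    """Write one source file into the demo package."""
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(text)


write("__init__.py", "")
write("util.py", "def greet():\n    return 'hello from util'\n")
# Absolute import: the package is named explicitly, instead of a bare
# "from util import greet" that only worked as an implicit relative import.
write("main.py",
      "from %s.util import greet\n\n\ndef run():\n    return greet()\n" % PKG)

# Only the directory that contains the package goes on the import path.
sys.path.insert(0, root)
importlib.invalidate_caches()
main = importlib.import_module(PKG + ".main")
result = main.run()
print(result)
```

Because main.py names the package explicitly, the same import line works
whether the package is found through the path or through an installed copy.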

With all of these changes, the source code is very close
to the directory structure of the PyPI package. As long as
CMake has generated the thrift files and the Python version
file, only a few differences remain. This removes those
differences by moving the setup.py / MANIFEST.in and other
files from the packaging directory to the top-level
shell/ directory. This means that one can pip install
directly from the source code, i.e. pip install $IMPALA_HOME/shell

This also moves the shell tarball generation script to the
packaging directory and changes bin/impala-shell.sh to use
Python 3.

This sorts the imports using isort for the affected Python files.

Testing:
 - Ran a regular core job with Python 2
 - Ran a core job with Python 3 and verified that the absolute
   import issues are gone.

Change-Id: Ica75a24fa6bcb78999b9b6f4f4356951b81c3124
Reviewed-on: http://gerrit.cloudera.org:8080/22330
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-05-21 15:14:11 +00:00
Csaba Ringhofer
f98b697c7b IMPALA-13929: Make 'functional-query' the default workload in tests
This change adds get_workload() to ImpalaTestSuite and removes it
from all test suites that already returned 'functional-query'.
get_workload() is also removed from CustomClusterTestSuite which
used to return 'tpch'.

All other changes besides impala_test_suite.py and
custom_cluster_test_suite.py are just mass removals of
get_workload() functions.
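
The effect of moving get_workload() into the base class can be sketched as
follows. These are simplified stand-ins, not the real classes; TestOtherWorkload
is a hypothetical suite and 'tpcds' is only an illustrative workload name:

```python
class ImpalaTestSuite(object):
    """Simplified stand-in for the real base class."""

    @classmethod
    def get_workload(cls):
        # Default provided by the base class after this change.
        return 'functional-query'


class CustomClusterTestSuite(ImpalaTestSuite):
    """No override any more, so the default is inherited."""


class TestOtherWorkload(ImpalaTestSuite):
    """Hypothetical suite that still needs a different workload."""

    @classmethod
    def get_workload(cls):
        return 'tpcds'


workloads = [suite.get_workload() for suite in
             (ImpalaTestSuite, CustomClusterTestSuite, TestOtherWorkload)]
print(workloads)
```

Suites that returned 'functional-query' can drop their override, while suites
with another workload keep theirs unchanged.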

The behavior is only changed in custom cluster tests that didn't
override get_workload(). By returning 'functional-query' instead
of 'tpch', exploration_strategy() will no longer return 'core' in
'exhaustive' test runs. See IMPALA-3947 on why workload affected
exploration_strategy. An example of an affected test is
TestCatalogHMSFailures which was skipped both in core and exhaustive
runs before this change.

get_workload() functions that return a different workload than
'functional-query' are not changed - it is possible that some of
these also don't handle exploration_strategy() as expected, but
individually checking these tests is out of scope in this patch.

Change-Id: I9ec6c41ffb3a30e1ea2de773626d1485c69fe115
Reviewed-on: http://gerrit.cloudera.org:8080/22726
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-08 07:12:55 +00:00
Riza Suminto
95f353ac4a IMPALA-13507: Allow disabling glog buffering via with_args fixture
We have plenty of custom_cluster tests that assert against the content of
Impala daemon log files while the process is still running, using
assert_log_contains() and its wrappers. The method specifically mentions
disabling glog buffering ('-logbuflevel=-1'), but not all
custom_cluster tests do that. This often results in flaky tests that are
hard to triage and often neglected if they do not run frequently in core
exploration.

This patch adds a boolean param 'disable_log_buffering' to
CustomClusterTestSuite.with_args so that a test can declare its intention
to inspect log files in a live minicluster. If it is True, the minicluster
is started with '-logbuflevel=-1' for all daemons. If it is False, any
call to assert_log_contains() logs a WARNING.
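
The two branches above can be sketched roughly as below. This is not the real
with_args implementation: build_daemon_args and warn_if_buffered are
hypothetical helpers, and '--example_flag=1' is a placeholder argument. Only
'-logbuflevel=-1' comes from the description above.

```python
import logging

LOG = logging.getLogger("custom_cluster_sketch")


def build_daemon_args(base_args, disable_log_buffering=False):
    """Return daemon args, adding -logbuflevel=-1 when buffering is disabled."""
    args = list(base_args)
    if disable_log_buffering:
        args.append("-logbuflevel=-1")
    return args


def warn_if_buffered(disable_log_buffering):
    """Log a WARNING when a log assertion runs with buffering still enabled.

    Returns True if a warning was logged.
    """
    if not disable_log_buffering:
        LOG.warning("Log assertion made without disable_log_buffering=True; "
                    "recent log lines may still be buffered.")
        return True
    return False


unbuffered = build_daemon_args(["--example_flag=1"], disable_log_buffering=True)
buffered = build_daemon_args(["--example_flag=1"])
print(unbuffered, buffered)
```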

Several complex custom_cluster tests are left unchanged and still
print such WARNING logs, including:
- TestQueryLive
- TestQueryLogTableBeeswax
- TestQueryLogOtherTable
- TestQueryLogTableHS2
- TestQueryLogTableAll
- TestQueryLogTableBufferPool
- TestStatestoreRpcErrors
- TestWorkloadManagementInitWait
- TestWorkloadManagementSQLDetails

This patch also fixes some small flake8 issues in the modified tests.

There is a sign of flakiness in test_query_live.py, where a test query
submitted to the coordinator fails because the sys.impala_query_live table
does not exist yet from the coordinator's perspective. This patch modifies
test_query_live.py to wait a few seconds until sys.impala_query_live
is queryable.
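
A generic polling helper in the spirit of that wait might look like the sketch
below. wait_until and table_is_queryable are hypothetical names, and the
"table" is simulated by a counter rather than a real query:

```python
import time


def wait_until(predicate, timeout_s=10.0, interval_s=0.5):
    """Poll predicate() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Simulated check: the table becomes "queryable" on the third attempt.
attempts = {"n": 0}


def table_is_queryable():
    attempts["n"] += 1
    return attempts["n"] >= 3


ok = wait_until(table_is_queryable, timeout_s=5.0, interval_s=0.01)
print(ok, attempts["n"])
```

In a real test the predicate would run a cheap query against the table and
return False while the query still fails.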

Testing:
- Pass custom_cluster tests in exhaustive exploration.

Change-Id: I56fb1746b8f3cea9f3db3514a86a526dffb44a61
Reviewed-on: http://gerrit.cloudera.org:8080/22015
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-11-05 04:49:05 +00:00
Joe McDonnell
e9fb8e717c IMPALA-12114: Pull in fix for THRIFT-5705 and add test
This pulls in a new toolchain to get a Thrift with
the patch for THRIFT-5705. This fixes an issue where
idle clients using TLS are needlessly disconnected due
to a bug in the read retry count logic inside Thrift.

Tests:
 - This modifies test_thrift_socket.py to make it do
   more idle polls and check that ImpalaShell is not
   disconnected. It fails without the THRIFT-5705 patch
   and passes now.

Change-Id: Ifc7704cba032a91b9fd0d5d54d1e0a7e17fb10bb
Reviewed-on: http://gerrit.cloudera.org:8080/19962
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Reviewed-by: Andrew Sherman <asherman@cloudera.com>
2023-06-02 15:57:37 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
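
A small illustration of the division change, as a sketch rather than code from
the patch. The snippet is executed from a string only so that the __future__
import is guaranteed to be the first statement of its own unit; in a real file
it simply goes at the top of the module:

```python
SNIPPET = '''
from __future__ import absolute_import, division, print_function

true_quotient = 7 / 2            # true division on both Python 2 and 3
floor_quotient = 7 // 2          # explicit integer division
rows = list(range(10))
middle = rows[len(rows) // 2]    # an index must stay an integer
'''

namespace = {}
exec(SNIPPET, namespace)

true_quotient = namespace["true_quotient"]
floor_quotient = namespace["floor_quotient"]
middle = namespace["middle"]
print(true_quotient, floor_quotient, middle)
```

Using '/' for the index would produce a float under true division, which is
why index and count computations were switched to '//'.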

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Joe McDonnell
ba3518366a IMPALA-11952 (part 4): Fix odds and ends: Octals, long, lambda, etc.
There are a variety of small Python 3 syntax differences:
 - Octal constants need to start with 0o rather than just 0
 - Long constants are not supported (i.e. numbers ending with L)
 - Lambda syntax is slightly different
 - The 'ur' string mode is no longer supported
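
For illustration, Python 3 spellings of the four items might look like this.
The values are made up, and the lambda case assumes the difference meant here
is the removal of tuple parameters (PEP 3113):

```python
import re

# Octal: Python 2 also accepted 0755; Python 3 requires the 0o prefix.
mode = 0o755

# Long: no trailing L; int has arbitrary precision.
big = 2 ** 64

# Lambda: tuple parameters such as "lambda (a, b): a + b" are gone,
# so the pair is indexed explicitly.
add_pair = lambda pair: pair[0] + pair[1]

# 'ur' strings: the prefix is gone; a plain raw string is already text.
pattern = re.compile(r"\d+")

print(mode, big, add_pair((1, 2)), pattern.findall("a1b22"))
```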

Testing:
 - check-python-syntax.sh now passes

Change-Id: Ie027a50ddf6a2a0db4b34ec9b49484ce86947f20
Reviewed-on: http://gerrit.cloudera.org:8080/19554
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
2023-02-28 17:11:50 +00:00
Riza Suminto
f917dc111c IMPALA-11674: Fix timeout detection for TSSLSocket
Functions IsPeekTimeoutTException() and IsReadTimeoutTException() in
be/src/rpc/thrift-util.cc make assumptions about the implementation of
read(), peek(), write() and write_partial() in TSocket.cpp and
TSSLSocket.cpp. The functions read() and peek() in TSSLSocket.cpp were
changed in versions 0.11.0 and 0.16.0 to throw a different exception on
timeout. This causes IsPeekTimeoutTException() and
IsReadTimeoutTException() to return the wrong value after upgrading Thrift,
which in turn causes TAcceptQueueServer::Peek() to rethrow the exception
to its caller TAcceptQueueServer::run(), which then closes the connection,
ignoring the idle_session_timeout query option.
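
The failure mode can be illustrated with a toy sketch in Python. The real
helpers are C++ and their actual checks are not shown in this message.
TransportError, is_timeout and both message strings are hypothetical
placeholders, not the real Thrift texts:

```python
class TransportError(Exception):
    """Hypothetical stand-in for a transport-layer exception."""


# Placeholder strings only; these are NOT the real Thrift messages.
OLD_TEXT = "old timeout text"
NEW_TEXT = "new timeout text"


def is_timeout(exc, known_messages):
    """Classify an exception as a timeout by matching its message."""
    return any(msg in str(exc) for msg in known_messages)


old_style = TransportError("read failed: " + OLD_TEXT)
new_style = TransportError("read failed: " + NEW_TEXT)

# Before the fix, only the old message is recognized, so a timeout raised in
# the new style is treated as a real error.
before_fix = tuple(is_timeout(e, (OLD_TEXT,)) for e in (old_style, new_style))

# After the fix, the new message is recognized as well.
after_fix = tuple(is_timeout(e, (OLD_TEXT, NEW_TEXT))
                  for e in (old_style, new_style))
print(before_fix, after_fix)
```

A check that keys on message text has to be revisited whenever the library
changes what it throws, which is what happened across the Thrift upgrade.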

The issue was reproducible through the following scenario:

1. From the local development environment, start the Impala cluster with
SSL enabled and idle_client_poll_period_s set to 5 seconds.

export CERT_DIR="$IMPALA_HOME/be/src/testutil"
export SSL_ARGS="--ssl_client_ca_certificate=$CERT_DIR/server-cert.pem
  --ssl_server_certificate=$CERT_DIR/server-cert.pem
  --ssl_private_key=$CERT_DIR/server-key.pem
  --hostname=localhost"
./bin/start-impala-cluster.py --state_store_args="$SSL_ARGS" \
  --catalogd_args="$SSL_ARGS" \
  --impalad_args="$SSL_ARGS --idle_client_poll_period_s=5"

2. Run impala-shell with a higher idle_session_timeout query option

impala-shell.sh --ssl -Q idle_session_timeout=100

3. Run a simple query like "show databases" and rerun it after 15
   seconds pass.

The second query run will fail with the following error message in impala-shell:
[localhost:21050] default> show databases;
Caught exception TLS/SSL connection has been closed (EOF) (_ssl.c:1829), type=<class 'ssl.SSLZeroReturnError'> in CloseSession.
Warning: close session RPC failed: TLS/SSL connection has been closed (EOF) (_ssl.c:1829), <class 'ssl.SSLZeroReturnError'>

This patch fixes the expected error message in IsReadTimeoutTException and
IsPeekTimeoutTException to correctly detect timeout errors from
TSSLSocket. Additionally, this patch fixes a typo in
NEW_THRIFT_VERSION_MSG.

Testing:
- Redid the scenario manually, with and without SSL, and confirmed that
  the second query completes without error.
- Added test_thrift_socket.py to begin verifying the IsPeekTimeoutTException
  function.

Change-Id: I6ad168a1c96d751a3c50d924e6ecaf6404e589ab
Reviewed-on: http://gerrit.cloudera.org:8080/19157
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
2022-10-21 08:30:55 +00:00