6 Commits

Author SHA1 Message Date
Joe McDonnell
c5a0ec8bdf IMPALA-11980 (part 1): Put all thrift-generated python code into the impala_thrift_gen package
This puts all of the thrift-generated python code into the
impala_thrift_gen package. This is similar to what Impyla
does for its thrift-generated python code, except that it
uses the impala_thrift_gen package rather than impala._thrift_gen.
This is a preparatory patch for fixing the absolute import
issues.

This patches all of the thrift files to add the python namespace.
This has code to apply the patching to the thirdparty thrift
files (hive_metastore.thrift, fb303.thrift) to do the same.

Putting all the generated python into a package makes it easier
to understand where the imports are getting code. When the
subsequent change rearranges the shell code, the thrift generated
code can stay in a separate directory.

This uses isort to sort the imports for the affected Python files
with the provided .isort.cfg file. This also adds an impala-isort
shell script to make it easy to run.

Testing:
 - Ran a core job

Change-Id: Ie2927f22c7257aa38a78084efe5bd76d566493c0
Reviewed-on: http://gerrit.cloudera.org:8080/20169
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
2025-04-15 17:03:02 +00:00
Csaba Ringhofer
f98b697c7b IMPALA-13929: Make 'functional-query' the default workload in tests
This change adds get_workload() to ImpalaTestSuite and removes it
from all test suites that already returned 'functional-query'.
get_workload() is also removed from CustomClusterTestSuite which
used to return 'tpch'.

All other changes besides impala_test_suite.py and
custom_cluster_test_suite.py are just mass removals of
get_workload() functions.

The behavior is only changed in custom cluster tests that didn't
override get_workload(). By returning 'functional-query' instead
of 'tpch', exploration_strategy() will no longer return 'core' in
'exhaustive' test runs. See IMPALA-3947 on why workload affected
exploration_strategy. An example for affected test is
TestCatalogHMSFailures which was skipped both in core and exhaustive
runs before this change.

get_workload() functions that return a different workload than
'functional-query' are not changed - it is possible that some of
these also don't handle exploration_strategy() as expected, but
individually checking these tests is out of scope in this patch.

Change-Id: I9ec6c41ffb3a30e1ea2de773626d1485c69fe115
Reviewed-on: http://gerrit.cloudera.org:8080/22726
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-08 07:12:55 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Lars Volker
74c7b7e55f IMPALA-8863: Add support to run tests over HTTP/HS2
This change adds support to run backend tests over HTTP using a new
version of Impyla (0.16.1). It also adds a test that exercises
authentication over HTTP.

Change-Id: I7156558071781378fcb9c8941c0f4dd82eb0d018
Reviewed-on: http://gerrit.cloudera.org:8080/14059
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-11-26 22:46:40 +00:00
Michael Ho
67cd55e044 IMPALA-7802: Close connections of idle client sessions
Previously, if idle session timeout is set either via
startup flag or query options, a client session will expire
after that set period of inactivity. However, the network
connection and the service thread of an expired session will
still be around until the session is closed by the client.
This is highly undesirable as these idle sessions still count
towards the quota bound by --fe_esrvice_threads, so if the
total number of sessions (including the idle ones) reaches
that upper bound, all incoming new session will block until
some of the existing sessions exit. There is no time bound on
when those expired sessions will be closed. In some sense,
leaving many idle sessions opened is a denial-of-service attack
on Impala.

This change implements support for closing expired client sessions.
In particular, a new flag --idle_client_poll_time_s is added to
specify a time interval in seconds of client's inactivity which
will cause an idle service thread of a client connection to wake up
and check if all sessions associated with the connection are idle.
If so, the connection will be closed. This allows the service threads
to be freed up without waiting for client to close the connections.

Testing done:
- core build
- new targeted test which verifies the connections of expired sessions
are closed.
- verified the flags function as expected in a secure cluster with Kerberos + SSL

Change-Id: I97c4fb8e1b741add273f8a913fb0967303683e38
Reviewed-on: http://gerrit.cloudera.org:8080/13607
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-06-21 02:34:09 +00:00
Thomas Tauber-Marshall
b1cb879577 IMPALA-1653: Don't close hiveserver2 session when connection is closed
Currently, when a client connection is closed, we always close any
session started over that connection. This is a requirement for
beeswax, which always ties sessions to connections, but it is not
required for hiveserver2, which allows sessions to be used across
connections with a session token.

This patch changes this behavior so that hiveserver2 sessions are no
longer closed when the corresponding connection is closed.

One downside of this change is that clients may inadvertently leave
sessions open indefinitely if they close their connection without
calling CloseSession(), which can waste space on the coordinator.
We already have a flag --idle_session_timeout, but this flag is off
by default and sessions that hit this timeout are expired but not
fully closed.

Rather than changing the default idle session behavior, which could
affect existing users, this patch mitigates this issue by adding a
new flag: --disconnected_session_timeout which is set to 1 hour by
default. When a session has had no open connections for longer than
this time, it will be closed and any associated queries will be
unregistered.

Testing:
- Added e2e tests.

Change-Id: Ia4555cd9b73db5b4dde92cd4fac4f9bfa3664d78
Reviewed-on: http://gerrit.cloudera.org:8080/13306
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-05-30 03:00:12 +00:00