Previously, if an idle session timeout was set via either a
startup flag or a query option, a client session would expire
after that period of inactivity. However, the network
connection and the service thread of an expired session would
remain until the client closed the session.
This is highly undesirable because these idle sessions still count
towards the quota set by --fe_service_threads: if the
total number of sessions (including the idle ones) reaches
that upper bound, all incoming new sessions will block until
some of the existing sessions exit. There is no time bound on
when those expired sessions will be closed. In effect,
leaving many idle sessions open amounts to a denial-of-service attack
on Impala.
This change implements support for closing expired client sessions.
In particular, a new flag --idle_client_poll_time_s specifies the
interval, in seconds of client inactivity, after which the idle service
thread of a client connection wakes up and checks whether all sessions
associated with the connection are idle. If so, the connection is closed.
This frees up service threads without waiting for the client to close
the connections.
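The check performed on each wake-up can be sketched roughly as follows. This is
an illustrative Python model, not the Impala implementation: the Session and
Connection classes and poll_idle_connection() are hypothetical names, and
keeping a connection with no sessions open is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Session:
    last_active: float     # time of the last client activity, in seconds
    idle_timeout_s: float  # idle timeout that applies to this session

    def is_expired(self, now: float) -> bool:
        return now - self.last_active >= self.idle_timeout_s


@dataclass
class Connection:
    sessions: List[Session] = field(default_factory=list)
    closed: bool = False


def poll_idle_connection(conn: Connection, now: float) -> bool:
    """Model of the periodic wake-up of an idle service thread.

    Closes the connection only if it has sessions and all of them have
    expired. Returns True if the connection is closed afterwards.
    """
    # Assumption: a connection without any session is left alone.
    if conn.sessions and all(s.is_expired(now) for s in conn.sessions):
        conn.closed = True
    return conn.closed


conn = Connection([Session(last_active=0.0, idle_timeout_s=10.0),
                   Session(last_active=5.0, idle_timeout_s=10.0)])
print(poll_idle_connection(conn, now=12.0))  # False: second session is still live
print(poll_idle_connection(conn, now=15.0))  # True: both sessions have expired
```

A single live session is enough to keep the connection, and therefore its
service thread, in use.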
Testing done:
- core build
- new targeted test which verifies the connections of expired sessions
are closed.
- verified the flags function as expected in a secure cluster with Kerberos + SSL
Change-Id: I97c4fb8e1b741add273f8a913fb0967303683e38
Reviewed-on: http://gerrit.cloudera.org:8080/13607
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
I used some ideas from Alex Leblang's abandoned patch:
https://gerrit.cloudera.org/#/c/137/ in order to run .test files through
HS2. The advantage of using Impyla is that much of the code will be
reusable for any Python client implementing the standard Python dbapi,
and it does not require us to implement yet another Thrift client.
This gives us better coverage of non-trivial result sets from HS2,
including handling of NULLs, error logs and more interesting result
sets than the basic HS2 tests.
I added HS2 coverage to TestQueries, which has a reasonable variety of
queries and covers the data types in alltypes. I also added
TestDecimalQueries, TestStringQuery and TestCharFormats to get coverage
of DECIMAL, CHAR and VARCHAR that aren't in alltypes. Coverage of
result sets with NULLs was limited so I added a couple of queries.
Places where results differ from Beeswax:
* Impyla is a Python dbapi client so must convert timestamps into Python datetime
objects, which only have microsecond precision. Therefore result
timestamps with nanosecond precision are truncated to microseconds.
* The HS2 interface reports the NULL type as BOOLEAN as a workaround for
IMPALA-914.
* The Beeswax interface reported VARCHAR as STRING, but HS2 reports
VARCHAR.
I dealt with different results by adding additional result sections so
that the expected differences between the clients/protocols were
explicit.
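To illustrate the first difference listed above, the following sketch shows how
a nanosecond-precision timestamp string loses its last three digits when it is
converted to a Python datetime. The parse_ns_timestamp() helper is hypothetical
and is not how Impyla performs the conversion; it only demonstrates the
precision limit of datetime.

```python
from datetime import datetime


def parse_ns_timestamp(text: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS[.fffffffff]', keeping only microseconds."""
    date_part, _, frac = text.partition(".")
    base = datetime.strptime(date_part, "%Y-%m-%d %H:%M:%S")
    # datetime stores at most six fractional digits, so anything beyond
    # microseconds is dropped (truncated, not rounded).
    micros = int(frac[:6].ljust(6, "0")) if frac else 0
    return base.replace(microsecond=micros)


ts = parse_ns_timestamp("2009-01-01 00:00:00.123456789")
print(ts)  # 2009-01-01 00:00:00.123456
```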
Limitations:
* Not all of the methods implemented for Beeswax are implemented for HS2, so some
tests that have more complicated interactions with the client will not
work with HS2 yet.
* We don't have a way to get the affected row count for inserts.
I also simplified the ImpalaConnection API by removing some unnecessary
methods and moved some generic methods to the base class.
Testing:
* Confirmed that it detected IMPALA-7588 by re-applying the buggy patch.
* Ran exhaustive and CentOS6 tests.
Change-Id: I9908ccc4d3df50365be8043b883cacafca52661e
Reviewed-on: http://gerrit.cloudera.org:8080/11546
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Implement asynchronous admission control queuing. This is achieved by
running the admission control code-path in a separate thread. Major
changes include: propagating cancellation to the admission control
thread and dequeuing thread, and adding a new Query Operation State
called "PENDING" that represents the state between completion of
planning and starting of query execution.
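The overall shape of this design can be sketched as below. This is a simplified
Python model under stated assumptions, not the C++ implementation: QuerySketch,
release_slot() and the polling interval are hypothetical, and only the PENDING
state name comes from this change.

```python
import threading
from enum import Enum


class State(Enum):
    PENDING = "PENDING"      # planning finished, execution not started yet
    RUNNING = "RUNNING"
    CANCELLED = "CANCELLED"


class QuerySketch:
    def __init__(self) -> None:
        self.state = State.PENDING
        self._lock = threading.Lock()
        self._slot_free = threading.Event()  # set by a hypothetical dequeuing thread
        self._cancel = threading.Event()
        self._thread = threading.Thread(target=self._wait_for_admission)

    def start(self) -> None:
        # Admission runs off the caller's thread, so this returns immediately.
        self._thread.start()

    def _wait_for_admission(self) -> None:
        while not self._cancel.is_set():
            if self._slot_free.wait(timeout=0.01):
                with self._lock:
                    # A cancellation that raced with admission wins.
                    if self.state is State.PENDING:
                        self.state = State.RUNNING
                return

    def release_slot(self) -> None:
        self._slot_free.set()

    def cancel(self) -> None:
        with self._lock:
            if self.state is State.PENDING:
                self.state = State.CANCELLED
        self._cancel.set()  # propagate cancellation to the admission thread

    def join(self) -> None:
        self._thread.join()


admitted = QuerySketch()
admitted.start()
admitted.release_slot()
admitted.join()

cancelled = QuerySketch()
cancelled.start()
cancelled.cancel()
cancelled.join()
print(admitted.state.value, cancelled.state.value)  # RUNNING CANCELLED
```

The state check under the lock is what keeps a query that was cancelled while
queued from starting execution afterwards.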
Testing:
- Added a deterministic end-to-end test and a session expiry test.
- Ran multiple stress tests successfully with a cancellation probability
of 60% and with different values for the following parameters:
max_requests, queue_wait_timeout_ms. Ensured that the impalad was in a
valid state afterwards (no orphan fragments or wrong metrics).
- Ran all exhaustive tests and ASAN core tests successfully.
- Ran data load successfully.
Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Reviewed-on: http://gerrit.cloudera.org:8080/10060
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This commit makes idle_session_timeout a query option.
idle_session_timeout currently can be set as a command line
option, which will be the default timeout for sessions.
HS2 sessions can override it with a smaller value by setting
it in the configuration overlay of HS2 OpenSession().
However, we can't override idle_session_timeout for JDBC/ODBC
connections, because we cannot put this in the connection string.
This commit works around this problem: it allows JDBC/ODBC
connections to set the session timeout as a query option
with the SET statement.
After this commit, the session timeout can be overridden to
any value, i.e. the command line flag idle_session_timeout
doesn't limit this option anymore.
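The change in how the effective timeout is resolved can be illustrated with the
sketch below. Both functions are hypothetical models of the behavior described
in this message, not Impala code, and they assume positive timeout values only;
None stands for a session that does not set the option.

```python
from typing import Optional


def effective_timeout_before(flag_s: int, hs2_override_s: Optional[int]) -> int:
    """Old behavior as described above: an HS2 session could only lower
    the timeout given by the command line flag."""
    if hs2_override_s is None:
        return flag_s
    return min(flag_s, hs2_override_s)


def effective_timeout_after(flag_s: int, option_s: Optional[int]) -> int:
    """New behavior: the query option, when set, replaces the flag value,
    whether it is smaller or larger."""
    return flag_s if option_s is None else option_s


print(effective_timeout_before(600, 3600))  # 600: capped by the flag
print(effective_timeout_after(600, 3600))   # 3600: the flag no longer limits it
```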
I created an automated test case in JdbcTest.java based on
test_hs2.py::test_concurrent_session_mixed_idle_timeout. I also
extended the test_session_expiration and test_set_and_unset
test suites.
Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e
Reviewed-on: http://gerrit.cloudera.org:8080/8490
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:
http://www.apache.org/legal/src-headers.html#headers
Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
http://www.apache.org/legal/src-headers.html#notice
to follow that format and add a line for Cloudera.
3) Replace the existing ASF license text with the one given
on the website, or add it if it is missing.
Much of this change was automatically generated via:
git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]
Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.
[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
modification to ORIG_LICENSE to match Impala's license text.
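For readers less familiar with Perl, the line filter applied by the perl
command above behaves roughly like the following Python sketch. The
strip_cloudera_copyright() function is a hypothetical equivalent written for
illustration and was not part of this change.

```python
import re

# Same pattern and case-insensitive matching as the perl one-liner.
COPYRIGHT_RE = re.compile(r"Copyright.*Cloudera", re.IGNORECASE)


def strip_cloudera_copyright(text: str) -> str:
    """Return text with every line matching the copyright pattern removed."""
    return "".join(
        line for line in text.splitlines(keepends=True)
        if not COPYRIGHT_RE.search(line)
    )


source = "# Copyright 2012 Cloudera Inc.\n#\nimport os\n"
print(strip_cloudera_copyright(source), end="")
```

Only whole matching lines are dropped, which is why files with missing or
unusual license text still needed the manual fixups mentioned above.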
Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
Many of our test scripts have import statements that look like
"from xxx import *". It is a good practice to explicitly name what
needs to be imported. This commit implements this practice. Also,
unused import statements are removed.
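A minimal before/after illustration of the practice, using a standard library
module rather than one of the actual test modules:

```python
# Before: the wildcard hides where names come from and may shadow others.
# from os.path import *

# After: only the names that are actually used are imported.
from os.path import basename, join

path = join("tests", "hs2", "test_hs2.py")
print(basename(path))  # test_hs2.py
```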
Change-Id: I6a33bb66552ae657d1725f765842f648faeb26a8
Reviewed-on: http://gerrit.cloudera.org:8080/3444
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
Many Python files had a hashbang and the executable bit set even though
they were not intended to be run as standalone scripts. That makes it
very difficult to determine which Python files are actually scripts.
A future patch will update the hashbang in real python scripts so they
use $IMPALA_HOME/bin/impala-python.
Change-Id: I04eafdc73201feefe65b85817a00474e182ec2ba
Reviewed-on: http://gerrit.cloudera.org:8080/599
Reviewed-by: Casey Ching <casey@cloudera.com>
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Internal Jenkins
The session expiration test would occasionally fail because two sessions ended up expiring
at approximately the same time. This patch ensures that no session is active when the
initial metric value is polled.
Change-Id: Ib62e7c23fb0c43e0e8ee0c17770d47df19964117
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4502
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
This has only failed once as far as I can tell, and we don't have any log
files for the failure. Nor have I been able to repro. So, let's increase the
test timeout to get a bit more slop in the case of the occasional long sleep
on the impalad side. If there is a bug where a session is stuck in the
checked-out state, the test will still find it.
Change-Id: Ib63a8a83d9c3e772787dc548473868c58f950eb8
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4494
Reviewed-by: Daniel Hecht <dhecht@cloudera.com>
Tested-by: jenkins
A crucial comparison was between time values with different units.
Tests didn't catch this because they only confirmed that sessions were
timed out within the correct time, not that they were *not* timed out
early.
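The class of bug can be illustrated with the sketch below. The function names
and the choice of units (idle time in milliseconds, timeout in seconds) are
assumptions made for the example; the message does not say which units were
involved.

```python
def is_expired_buggy(idle_ms: int, timeout_s: int) -> bool:
    # Bug: milliseconds are compared directly against seconds.
    return idle_ms >= timeout_s


def is_expired_fixed(idle_ms: int, timeout_s: int) -> bool:
    # Both sides are in milliseconds before comparing.
    return idle_ms >= timeout_s * 1000


# A test that only checks "expired once the timeout has passed" is
# satisfied by both versions.
print(is_expired_buggy(6000, 5), is_expired_fixed(6000, 5))  # True True

# Checking that the session is *not* expired early exposes the bug.
print(is_expired_buggy(500, 5), is_expired_fixed(500, 5))    # True False
```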
Change-Id: Ia8c57d3d70e4702996d0225b167142b7bf88d236
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1926
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>