This fixes all core e2e tests running on my local dockerised
minicluster build. I do not yet have a CI job or script running
but I wanted to get feedback on these changes sooner. The second
part of the change will include the CI script and any follow-on
fixes required for the exhaustive tests.
The following fixes were required:
* Detect docker_network from TEST_START_CLUSTER_ARGS.
* get_webserver_port() no longer depends on the caller passing in
  the default webserver port. It failed previously because it
  relied on start-impala-cluster.py setting -webserver_port
  for *all* processes.
* Add SkipIf markers for tests that don't make sense or are
  non-trivial to fix for containerised Impala.
* Support loading the Impala-lzo plugin from the host for tests that
  depend on it.
* Fix some tests that had 'localhost' hardcoded; they should use
  $INTERNAL_LISTEN_HOST, which defaults to localhost.
* Fix a bug with sorting impala daemons by backend port, which is
  the same for all dockerised impalads.
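As an illustration of the first fix in the list, here is a minimal sketch of
how a docker network name could be pulled out of TEST_START_CLUSTER_ARGS. The
helper name docker_network_from_env is hypothetical and this is not the code
from the change itself; it only assumes the variable holds a shell-style
argument string like the one exported in this commit message.

```python
import argparse
import os
import shlex


def docker_network_from_env(environ=os.environ):
    """Return the --docker_network value in TEST_START_CLUSTER_ARGS, or None.

    Hypothetical helper: the name and the parsing approach are assumptions.
    """
    raw_args = environ.get("TEST_START_CLUSTER_ARGS", "")
    # allow_abbrev=False so that only the exact flag name is matched.
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--docker_network", default=None)
    # parse_known_args() leaves every other cluster-start flag untouched.
    known, _unknown = parser.parse_known_args(shlex.split(raw_args))
    return known.docker_network
```

Callers can then branch on whether the result is None to decide if the
cluster under test is containerised.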
Testing:
I ran tests locally as follows after having set up a docker network and
starting other services:
./buildall.sh -noclean -notests -ninja
ninja -j $IMPALA_BUILD_THREADS docker_images
export TEST_START_CLUSTER_ARGS="--docker_network=impala-cluster"
export FE_TEST=false
export BE_TEST=false
export JDBC_TEST=false
export CLUSTER_TEST=false
./bin/run-all-tests.sh
Change-Id: Iee86cbd2c4631a014af1e8cef8e1cd523a812755
Reviewed-on: http://gerrit.cloudera.org:8080/12639
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This change removes the flag --use_krpc, which allowed users
to fall back to the Thrift-based implementation of DataStream
services. This flag was originally added during development of
IMPALA-2567. It has served its purpose.
As we port more ImpalaInternalServices to use KRPC, it's becoming
increasingly burdensome to maintain parallel implementations of the
RPC handlers. Therefore, going forward, KRPC is always enabled.
This change removes the Thrift-based implementation of DataStreamServices
and also simplifies some of the tests which were skipped when KRPC
was disabled.
Testing done: core debug build.
Change-Id: Icfed200751508478a3d728a917448f2dabfc67c3
Reviewed-on: http://gerrit.cloudera.org:8080/10835
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Implement asynchronous admission control queuing. This is achieved by
running the admission control code-path in a separate thread. Major
changes include: propagating cancellation to the admission control
thread and dequeuing thread, and adding a new Query Operation State
called "PENDING" that represents the state between completion of
planning and starting of query execution.
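The shape of this design can be sketched in a few lines of Python. This is
only an illustration under stated assumptions, not the Impala implementation:
the Query class, its method names, and the use of a threading.Event to stand
in for the admission decision are all hypothetical.

```python
import threading
from enum import Enum


class OperationState(Enum):
    PENDING = "PENDING"      # planning finished, admission not yet decided
    RUNNING = "RUNNING"
    CANCELLED = "CANCELLED"


class Query:
    """Hypothetical query object; names are illustrative only."""

    def __init__(self):
        self.state = OperationState.PENDING
        self._lock = threading.Lock()
        self._cancelled = threading.Event()
        self._thread = None

    def submit(self, admitted):
        """Start admission on its own thread and return immediately."""
        self._thread = threading.Thread(target=self._admit, args=(admitted,))
        self._thread.start()

    def _admit(self, admitted):
        # Wait in short slices so a cancellation is noticed while queued.
        while not self._cancelled.is_set():
            if admitted.wait(timeout=0.01):
                with self._lock:
                    # Only a still-pending query may start running.
                    if self.state is OperationState.PENDING:
                        self.state = OperationState.RUNNING
                return

    def cancel(self):
        """Propagate cancellation to the admission thread and wait for it."""
        with self._lock:
            self.state = OperationState.CANCELLED
            self._cancelled.set()
        self.join()

    def join(self):
        if self._thread is not None:
            self._thread.join()
```

The caller of submit() is never blocked on the queue, and a cancelled query
can never transition to RUNNING because both transitions happen under the
same lock.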
Testing:
- Added a deterministic end-to-end test and a session expiry test.
- Ran multiple stress tests successfully with a cancellation probability
of 60% and with different values for the following parameters:
max_requests, queue_wait_timeout_ms. Ensured that the impalad was in a
valid state afterwards (no orphan fragments or wrong metrics).
- Ran all exhaustive tests and ASAN core tests successfully.
- Ran data load successfully.
Change-Id: I989cf5b259afb8f5bc5c35590c94961c81ce88bf
Reviewed-on: http://gerrit.cloudera.org:8080/10060
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This change renames the label of the MemTracker in
KrpcDataStreamMgr for tracking payloads of early RPCs
to "Data Stream Manager Early RPCs". This is to distinguish
these RPCs from the deferred RPCs in a receiver. The early
RPCs refer to those RPCs which arrive before a receiver
is ready. The responses to these RPCs are deferred until
the receiver is created. The receiver may also defer
responses to RPCs if the deserialized payloads of RPCs in
an inbound queue exceed FLAGS_exchg_node_buffer_size_bytes.
In this case, the RPCs won't be responded to until the
inbound queue is drained.
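The deferral behaviour described above can be modelled roughly as follows.
This is a simplified sketch, not the KrpcDataStreamMgr code: the ReceiverQueue
class and its methods are hypothetical, and the exact admission rule (defer
whenever adding the payload would exceed the buffer size) is an assumption.

```python
from collections import deque


class ReceiverQueue:
    """Hypothetical model of a receiver's inbound queue."""

    def __init__(self, buffer_size_bytes):
        self.buffer_size_bytes = buffer_size_bytes
        self.batches = deque()    # (rpc_id, size) pairs already responded to
        self.deferred = deque()   # (rpc_id, size) pairs awaiting a response
        self.bytes = 0

    def add_batch(self, rpc_id, size):
        """Return True if the RPC is responded to now, False if deferred."""
        if self.bytes + size > self.buffer_size_bytes:
            self.deferred.append((rpc_id, size))
            return False
        self.batches.append((rpc_id, size))
        self.bytes += size
        return True

    def get_batch(self):
        """Dequeue one batch; return (rpc_id, newly responded rpc ids)."""
        rpc_id, size = self.batches.popleft()
        self.bytes -= size
        responded = []
        # Draining the queue frees space for deferred RPCs.
        while self.deferred and (
                self.bytes + self.deferred[0][1] <= self.buffer_size_bytes):
            d_id, d_size = self.deferred.popleft()
            self.batches.append((d_id, d_size))
            self.bytes += d_size
            responded.append(d_id)
        return rpc_id, responded
```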
Change-Id: I5bb72c28e8d660a6b78543dbc8b5b156e0e7c843
Reviewed-on: http://gerrit.cloudera.org:8080/9633
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Reviewed-by: Michael Ho <kwho@cloudera.com>
Tested-by: Impala Public Jenkins
The fix for IMPALA-6193 added a memory tracker for the memory consumed
by the payloads in the service queue of DataStreamService. This change
extends it by introducing a bound on the memory usage for that service
queue. In addition, it deprecates FLAGS_datastream_service_queue_depth
and replaces it with FLAGS_datastream_service_queue_mem_limit. These flags
only take effect when KRPC is in use, and KRPC was never enabled in any
previous release, so it seems safe to do this flag replacement. The new
flag FLAGS_datastream_service_queue_mem_limit directly dictates the amount
of memory which can be consumed by the service queue of DataStreamService.
This allows more direct control over the memory usage of the queue instead
of inferring it from the number of entries in the queue. The default value
of this flag is left at 0, in which case it will be set to 5% of the process
memory limit.
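A minimal sketch of the default-value rule and of a memory-bounded queue,
written in Python for brevity. The function and class names are hypothetical,
and rejecting any payload that does not fit is an assumed policy for
illustration; the actual service queue may behave differently.

```python
from collections import deque


def effective_queue_mem_limit(flag_value_bytes, process_mem_limit_bytes):
    """A flag value of 0 means 5% of the process memory limit."""
    if flag_value_bytes < 0:
        # Assumption: negative values are treated as invalid in this sketch.
        raise ValueError("memory limit must be non-negative")
    if flag_value_bytes == 0:
        return process_mem_limit_bytes * 5 // 100
    return flag_value_bytes


class BoundedServiceQueue:
    """Hypothetical queue bounded by payload bytes, not by entry count."""

    def __init__(self, mem_limit_bytes):
        self.mem_limit_bytes = mem_limit_bytes
        self.bytes_queued = 0
        self._payloads = deque()

    def try_enqueue(self, payload):
        """Return False (reject) if the payload would exceed the limit."""
        if self.bytes_queued + len(payload) > self.mem_limit_bytes:
            return False
        self._payloads.append(payload)
        self.bytes_queued += len(payload)
        return True

    def dequeue(self):
        payload = self._payloads.popleft()
        self.bytes_queued -= len(payload)
        return payload
```

Bounding by bytes means a few large payloads and many small ones are treated
consistently, which an entry-count limit cannot guarantee.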
Testing done: exhaustive debug builds. Updated data-stream-test to
exercise the case in which the payload is larger than the limit.
Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8aaaa248e880b9
Reviewed-on: http://gerrit.cloudera.org:8080/9282
Reviewed-by: Michael Ho <kwho@cloudera.com>
Tested-by: Impala Public Jenkins
This change adds a flag "--use_krpc" to start-impala-cluster.py. The
flag is currently passed as an argument to the impalad daemon. In the
future it will also enable KRPC for the catalogd and statestored
daemons.
This change also adds a flag "--test_krpc" to pytest. When running tests
using "impala-py.test --test_krpc", the test cluster will be started
by passing "--use_krpc" to start-impala-cluster.py (see above).
This change also adds a SkipIf to skip tests based on whether the
cluster was started with KRPC support or not.
- SkipIf.not_krpc can be used to mark a test that depends on KRPC.
- SkipIf.not_thrift can be used to mark a test that depends on Thrift
RPC.
This change adds a meta test to make sure that the new SkipIf decorators
work correctly. The test should be removed as soon as real tests have
been added with the new decorators.
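For illustration, a sketch of how such skip markers and a meta test could be
structured. To keep it dependency-free it uses the standard library's
unittest.skipIf as a stand-in for the pytest markers used in the test suite,
and the CLUSTER_USES_KRPC constant is a hypothetical placeholder for however
the framework learns that the cluster was started with "--use_krpc".

```python
import unittest

# Hypothetical: in the real framework this would come from the pytest option.
CLUSTER_USES_KRPC = True


class SkipIf:
    # Skip tests that depend on KRPC when the cluster runs Thrift RPC.
    not_krpc = unittest.skipIf(not CLUSTER_USES_KRPC, "Test requires KRPC")
    # Skip tests that depend on Thrift RPC when the cluster runs KRPC.
    not_thrift = unittest.skipIf(CLUSTER_USES_KRPC, "Test requires Thrift RPC")


class MetaTest(unittest.TestCase):
    """Checks that exactly one of the two decorators skips its test."""

    @SkipIf.not_krpc
    def test_krpc_only(self):
        self.assertTrue(CLUSTER_USES_KRPC)

    @SkipIf.not_thrift
    def test_thrift_only(self):
        self.assertFalse(CLUSTER_USES_KRPC)
```

Whichever way the cluster was started, one test runs and the other is
reported as skipped, so a failure indicates a broken marker.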
Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab
Reviewed-on: http://gerrit.cloudera.org:8080/9291
Reviewed-by: David Knupp <dknupp@cloudera.com>
Tested-by: Impala Public Jenkins
Some of the tests added in IMPALA-6193 rely on flags that are only
compiled for debug binaries. This change marks those tests as debug-only
so that they do not break the release tests.
Change-Id: I89ae25ee8c1aca3833c2d98e902ddaad2dd01aad
Reviewed-on: http://gerrit.cloudera.org:8080/9207
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
This change adds memory tracking to incoming transmit data RPCs when
using KRPC. We track memory against a global tracker called "Data Stream
Service" until it is handed over to the stream manager. There we track
it in a global tracker called "Data Stream Queued RPC Calls" until a
receiver registers and takes over the early sender RPCs. Inside the
receiver, memory for deferred RPCs is tracked against the fragment
instance's memtracker until we unpack the batches and add them to the
row batch queue.
The DCHECK in MemTracker::Close() verifies that all memory consumed by a
tracker gets released eventually. In addition, this change adds a
custom cluster test that makes sure that queued memory gets tracked by
inspecting the peak consumption of the new memtrackers.
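The hand-off between trackers can be pictured with the following Python
model. It is a sketch only: this MemTracker class is a simplified stand-in
for the C++ one, transfer_to() and track_early_rpc() are hypothetical
helpers, and the "Fragment Instance" label is an assumption.

```python
class MemTracker:
    """Simplified stand-in that records current and peak consumption."""

    def __init__(self, label):
        self.label = label
        self.consumption = 0
        self.peak = 0

    def consume(self, num_bytes):
        self.consumption += num_bytes
        self.peak = max(self.peak, self.consumption)

    def release(self, num_bytes):
        assert num_bytes <= self.consumption, "releasing more than consumed"
        self.consumption -= num_bytes

    def transfer_to(self, other, num_bytes):
        # Hypothetical helper: move accounting from this tracker to another.
        self.release(num_bytes)
        other.consume(num_bytes)

    def close(self):
        # Mirrors the idea of the DCHECK: nothing may remain consumed.
        assert self.consumption == 0, "%s leaked %d bytes" % (
            self.label, self.consumption)


def track_early_rpc(payload_bytes):
    """Walk one early sender RPC through the three trackers."""
    service = MemTracker("Data Stream Service")
    queued = MemTracker("Data Stream Queued RPC Calls")
    fragment = MemTracker("Fragment Instance")  # assumed label
    service.consume(payload_bytes)               # RPC arrives
    service.transfer_to(queued, payload_bytes)   # handed to stream manager
    queued.transfer_to(fragment, payload_bytes)  # receiver registers
    fragment.release(payload_bytes)              # batches unpacked
    for tracker in (service, queued, fragment):
        tracker.close()
    return service, queued, fragment
```

Inspecting the peak of each tracker after the fact shows that the payload
was accounted for at every stage, which is the idea behind the new test.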
Change-Id: I2df1204d2483313a8a18e5e3be6cec9e402614c4
Reviewed-on: http://gerrit.cloudera.org:8080/8914
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Impala Public Jenkins