mirror of
https://github.com/apache/impala.git
synced 2026-01-04 18:00:57 -05:00
This patch fixes a slightly pathological state that occurs when the statestore is under heavy load. The result of the bug is that subscribers cannot successfully re-register because the statestore never marks them as failed.

The exact sequence of events is as follows:

1. Subscriber registers with statestore.
2. Statestore does not send heartbeats in a timely fashion to subscriber. Subscriber times out.
3. Subscriber is restarted quickly. Statestore does not detect restart.
4. Subscriber's RegisterSubscriber() call fails, because statestore detects duplicate registration.
5. Subscriber restarts again. Since the statestore is slow to send heartbeats, it has not detected the restart; the subscriber receives a heartbeat message from the statestore and does not reject it.
6. Statestore continues to believe subscriber is alive, since the heartbeats are not being rejected.

To fix this, we add a registration ID to each successfully registered subscriber that is known to both subscriber and statestore. If the subscriber should restart and re-register, it receives a new registration ID. Whenever a heartbeat arrives, the subscriber compares its registration ID to the one sent by the statestore with the heartbeat, and rejects the heartbeat if they do not match.

We also allow re-registration of existing subscribers (getting rid of the dreaded "Duplicate subscription" message). A new registration overwrites an old one.

Change-Id: Ie32df3a586ccb375375ebfbcbec1aaeb930b6bfe
Reviewed-on: http://gerrit.ent.cloudera.com:8080/778
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
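The registration-ID scheme described in the commit message can be sketched as a small in-memory model. This is an illustration only, not the statestore's actual code: the `Statestore` and `Subscriber` classes, their method names, and the use of `uuid4` for IDs are all assumptions made for the example.

```python
import uuid


class Statestore(object):
    """Hypothetical in-memory model: one current registration ID per subscriber."""

    def __init__(self):
        self.registrations = {}  # subscriber ID -> current registration ID

    def register_subscriber(self, subscriber_id):
        # Re-registration is allowed: a new registration overwrites the old one
        # instead of failing as a duplicate.
        registration_id = uuid.uuid4().hex  # assumption: any unique token works
        self.registrations[subscriber_id] = registration_id
        return registration_id


class Subscriber(object):
    """Hypothetical subscriber process; a restart is modeled as a new instance."""

    def __init__(self, subscriber_id):
        self.subscriber_id = subscriber_id
        self.registration_id = None  # unknown until registration succeeds

    def register(self, statestore):
        self.registration_id = statestore.register_subscriber(self.subscriber_id)

    def on_heartbeat(self, registration_id):
        # Accept a heartbeat only if it carries the ID from our own registration.
        return (self.registration_id is not None
                and registration_id == self.registration_id)


statestore = Statestore()
original = Subscriber("subscriber-1")
original.register(statestore)
stale_id = original.registration_id

# The process restarts: same subscriber ID, but no registration ID yet.
restarted = Subscriber("subscriber-1")
rejected_before = not restarted.on_heartbeat(stale_id)

# Re-registering succeeds and yields a fresh ID, so stale heartbeats stay rejected.
restarted.register(statestore)
rejected_after = not restarted.on_heartbeat(stale_id)
accepted_current = restarted.on_heartbeat(statestore.registrations["subscriber-1"])
```

In this model a heartbeat queued before the restart is rejected both before and after the new process re-registers, which is the signal the statestore was missing in the six-step sequence above.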
26 lines
918 B
Python
Executable File
#!/usr/bin/env python
# Copyright (c) 2012 Cloudera, Inc. All rights reserved.
#
import pytest
import os
from tests.common.impala_test_suite import ImpalaTestSuite
from tests.common.impala_cluster import Process

class SimpleSubscriberProcess(Process):
  """Runs a subscriber binary that registers with the statestore and immediately exits,
  indicating its success in the exit code"""
  def __init__(self):
    binary_path = os.path.join(
      os.environ['IMPALA_HOME'], "be/build/debug/statestore/statestore-test-client")
    Process.__init__(self, [binary_path])

class TestStatestore(ImpalaTestSuite):
  def test_subscriber_restart(self):
    """Start several clients with the same subscriber ID to confirm that re-registration
    after a process restart works correctly (see IMPALA-620)"""
    s = SimpleSubscriberProcess()
    # Each iteration launches the client again, reusing the same subscriber ID.
    for i in range(5):
      s.start()
      rc, _, _ = s.wait()
      assert rc == 0