This is the first step in cleaning up the test logging. It provides a common connection
interface that adds tracing around all operations, so when a test fails the output is
executable SQL. It also logs actions such as when a connection is opened or closed, or
when an operation is cancelled. Currently only Beeswax connections are supported, but
I have a separate patch that adds support for executing over HS2 as well as Beeswax.
Example of new logging:
-- connecting to: localhost:21000
-- executing against localhost:21000
use functional;
SET disable_codegen=False;
SET abort_on_error=1;
SET batch_size=0;
SET num_nodes=0;
-- executing against localhost:21000
select a.timestamp_col from alltypessmall a inner join alltypessmall b on
(a.timestamp_col = b.timestamp_col)
where a.year=2009 and a.month=1 and b.year=2009 and b.month=1;
-- closing connection to: localhost:21000
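For illustration, a rough sketch of the kind of tracing wrapper this introduces; the
class and method names below are hypothetical, not the actual interface in the patch:
import logging

LOG = logging.getLogger("impala_connection")

class TracingConnection(object):
  """Wraps a client connection and logs every action as executable SQL."""

  def __init__(self, client, host_port):
    self._client = client        # e.g. a Beeswax client
    self._host_port = host_port  # "host:port" string

  def connect(self):
    LOG.info("-- connecting to: %s" % self._host_port)
    self._client.connect()

  def execute(self, sql):
    LOG.info("-- executing against %s\n%s;" % (self._host_port, sql))
    return self._client.execute(sql)

  def cancel(self):
    LOG.info("-- canceling operation on: %s" % self._host_port)
    self._client.cancel()

  def close(self):
    LOG.info("-- closing connection to: %s" % self._host_port)
    self._client.close()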
Change-Id: Iedc7d4d3a84bfeff6cc1daae6ed1ca97613d7700
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1133
Tested-by: jenkins
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Test suites that derive from common.CustomClusterTestSuite get a brand
new cluster for every test case, which they can configure as they wish
with custom arguments using the @with_args() decorator.
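For illustration, a sketch of how a derived suite might use the decorator; the import
path and flag name below are assumptions rather than the exact API:
# Sketch only: import path and argument format are assumptions.
from tests.common.custom_cluster_test_suite import CustomClusterTestSuite, with_args

class TestCustomStatestoreArgs(CustomClusterTestSuite):

  @with_args("-statestore_subscriber_timeout_seconds=2")
  def test_runs_against_freshly_started_cluster(self, vector):
    # A brand new cluster was started for this test case with the flag above.
    self.client.execute("select 1")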
A future improvement is to optionally have only one cluster per test
suite, so that multiple tests that share configuration options can run
more quickly.
Change-Id: I6abd5740e644996d7ca2800edf4ff11b839d1bc4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/882
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
This patch fixes a slightly pathological state that occurs when the
statestore is under heavy load. The result of the bug is that
subscribers cannot successfully re-register because the statestore never
marks them as failed.
The exact sequence of events is as follows:
1. Subscriber registers with state-store.
2. Statestore does not send heartbeats to the subscriber in a timely
fashion. Subscriber times out.
3. Subscriber is restarted quickly. Statestore does not detect
restart.
4. Subscriber's RegisterSubscriber() call fails, because statestore
detects duplicate registration.
5. Subscriber restarts again. Since the statestore is slow to send
heartbeats, it still has not detected the restart; the subscriber
receives a heartbeat from the statestore and does not reject it.
6. Statestore continues to believe subscriber is alive, since the
heartbeats are not being rejected.
To fix this, we assign each successfully registered subscriber a
registration ID that is known to both the subscriber and the statestore.
If the subscriber restarts and re-registers, it receives a new
registration ID. Whenever a heartbeat arrives, the subscriber compares
its registration ID to the one the statestore sent with the heartbeat,
and rejects the heartbeat if they do not match.
We also allow re-registration of existing subscribers (getting rid of
the dreaded "Duplicate subscription" message). A new registration
overwrites an old one.
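A rough Python sketch of the registration-ID check described above; the names are
illustrative, and the real logic lives in the C++ statestore and subscriber:
import uuid

class Statestore(object):
  def __init__(self):
    self.registrations = {}  # subscriber id -> registration id

  def register(self, subscriber_id):
    # A re-registration simply overwrites the old entry with a fresh ID.
    registration_id = uuid.uuid4()
    self.registrations[subscriber_id] = registration_id
    return registration_id

class Subscriber(object):
  def __init__(self, subscriber_id, statestore):
    self.subscriber_id = subscriber_id
    self.statestore = statestore
    self.registration_id = None

  def register(self):
    # Called on start and after every restart; always yields a new ID known to
    # both the subscriber and the statestore.
    self.registration_id = self.statestore.register(self.subscriber_id)

  def accept_heartbeat(self, heartbeat_registration_id):
    # Reject heartbeats carrying a registration ID from a previous incarnation,
    # so the statestore eventually notices the old registration has failed.
    return heartbeat_registration_id == self.registration_id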
Change-Id: Ie32df3a586ccb375375ebfbcbec1aaeb930b6bfe
Reviewed-on: http://gerrit.ent.cloudera.com:8080/778
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
This brings the process failure tests back online and adds a basic failure
test for the catalog service. The timeouts had to be adjusted to account for the
extra time it takes to load the catalog and for the additional statestore subscriber.
Note: the statestore 'live.backends' metric used in these tests needs to be renamed,
since it really means 'live.subscribers'; however, that requires some coordination
with other teams to make the change.
Also updated start-impala-cluster to check the catalog.ready flag to ensure the impalad
catalog is ready to accept queries.
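For illustration, a sketch of the kind of readiness poll this implies; the metrics URL
and JSON shape below are assumptions, not necessarily what start-impala-cluster does:
import json
import time
import urllib2

def wait_for_catalog_ready(impalad_host, timeout_seconds=60):
  # Poll the impalad debug webpage until the catalog.ready metric becomes true.
  deadline = time.time() + timeout_seconds
  while time.time() < deadline:
    metrics = json.load(urllib2.urlopen("http://%s:25000/jsonmetrics" % impalad_host))
    if metrics.get("catalog.ready"):
      return
    time.sleep(1)
  raise RuntimeError("impalad catalog not ready after %s seconds" % timeout_seconds)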
Change-Id: If22e25dba7dc83aa40bec937b5f82b815bed4645
Reviewed-on: http://gerrit.ent.cloudera.com:8080/730
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
The Impala CatalogService manages the caching and dissemination of cluster-wide metadata.
The CatalogService combines the metadata from the Hive Metastore, the NameNode,
and potentially additional sources in the future. The CatalogService uses the
StateStore to broadcast metadata updates across the cluster.
The CatalogService also directly handles metadata update requests from
impalad servers (DDL requests). It exposes a Thrift interface that allows impalads to
connect directly and execute their DDL operations.
The CatalogService has two main components: a C++ server that implements the StateStore
integration, the Thrift service implementation, and the export of the debug webpage/metrics;
and the Java Catalog, which manages the caching and updating of all
the metadata. For each StateStore heartbeat, a delta of all metadata updates is broadcast
to the rest of the cluster.
Some Notes On the Changes
---
* The metadata is all sent as thrift structs. To do this, all catalog objects (Tables/Views,
Databases, UDFs) have thrift structs that represent them. These are sent with each statestore
delta update.
* The existing Catalog class has been separated into two subclasses, an
ImpaladCatalog and a CatalogServiceCatalog. See the comments on those classes for more
details.
What is working:
* New CatalogService created
* Working with statestore delta updates and latest UDF changes
* DDL performed on Node 1 is now visible on all other nodes without a "refresh".
* Each DDL operation against the Catalog Service returns the catalog version that
contains the change. An impalad will wait for the statestore heartbeat that contains this
version before returning from the DDL command (a rough sketch of this wait follows the
TODO list below).
* All table types (HBase, HDFS, views) get their metadata propagated properly
* Block location information included in CS updates and used by Impalads
* Column and table stats included in CS updates and used by Impalads
* Query tests are all passing
Still TODO:
* Directly return catalog object metadata from DDL requests
* Poll the Hive Metastore to detect new/dropped/modified tables
* Reorganize the FE code for the Catalog Service. I don't think we want everything in the
same JAR.
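For illustration, a rough sketch of the version wait mentioned in the "What is working"
list above; the names here are hypothetical, and the real logic is split across the
impalad frontend and backend:
import threading

class ImpaladCatalog(object):
  def __init__(self):
    self._catalog_version = 0
    self._lock = threading.Condition()

  def update_from_statestore(self, new_version, catalog_objects):
    # Applied on every statestore heartbeat carrying a catalog topic delta.
    with self._lock:
      self._apply(catalog_objects)
      self._catalog_version = new_version
      self._lock.notify_all()

  def wait_for_version(self, version):
    # A DDL returns the catalog version containing its change; block until that
    # version has reached this impalad via the statestore before returning.
    with self._lock:
      while self._catalog_version < version:
        self._lock.wait()

  def _apply(self, catalog_objects):
    pass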
Change-Id: I8c61296dac28fb98bcfdc17361f4f141d3977eda
Reviewed-on: http://gerrit.ent.cloudera.com:8080/601
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
This change adds support for the statestore to send delta updates. It also updates
the scheduler to properly handle receiving delta and non-delta updates.
With this change, the statestore keeps a versioned log of the history of updates to
each topic, with each topic update getting a sequentially increasing version number that
is unique within the topic.
Subscribers also track the max version of each topic that they have successfully
processed. The statestore can use this information to send a subscriber a delta of
updates, rather than all items in the topic. For non-delta updates, the statestore
sends an update that includes all values in the topic.
Additionally, if a subscriber receives a delta update with an unexpected version range,
it can request a new delta update by setting the "from_version" field of the TTopicDelta
heartbeat response. The next statestore update will then be based on the version the
subscriber responded with.
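A minimal sketch of the delta mechanism described above; the field names mirror the
text, but the real structures are Thrift-defined and live in the C++ statestore:
class Topic(object):
  def __init__(self):
    self.entries = {}      # key -> (version, value)
    self.last_version = 0  # sequentially increasing, unique within the topic

  def put(self, key, value):
    self.last_version += 1
    self.entries[key] = (self.last_version, value)

  def delta_since(self, from_version):
    # Only the entries the subscriber has not yet processed.
    return dict((k, v) for k, (ver, v) in self.entries.items() if ver > from_version)

def build_update(topic, subscriber_max_version, is_delta):
  if is_delta:
    # Delta update: only items added or changed after the subscriber's max version.
    return topic.delta_since(subscriber_max_version)
  # Non-delta update: all values currently in the topic.
  return topic.delta_since(0)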
Still left to do: garbage collect unneeded deletions (those with a version
number smaller than min(subscriber.version_number)).
Change-Id: I53be0a647d7f7f3f8aea0d13cd9dc219411a1e5a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/217
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>