Commit Graph

19 Commits

Author SHA1 Message Date
David Knupp
f590bc0da6 IMPALA-4750: Rename test infra classes so they don't mimic test classes.
This patch addresses warning messages from pytest re: the imported
TestMatrix, TestVector, and TestDimension classes, which were being
collected as potential test classes. The fix was simply to prefix
the class names with "Impala":

git grep -l 'TestDimension' | xargs \
    sed -i 's/TestDimension/ImpalaTestDimension/g'

git grep -l 'TestMatrix' | xargs \
    sed -i 's/TestMatrix/ImpalaTestMatrix/g'

git grep -l 'TestVector' | xargs \
    sed -i 's/TestVector/ImpalaTestVector/g'

The tests all passed in an exhaustive run on the upstream jenkins
server:

http://jenkins.impala.io:8080/view/Utility/job/pre-review-test/8/

Change-Id: I06b7bc6fd99fbb637a47ba376bf9830705c1fce1
Reviewed-on: http://gerrit.cloudera.org:8080/5794
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
2017-01-26 23:40:22 +00:00
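
As a rough illustration of why the rename in the commit above silences the warnings: pytest's default python_classes setting collects classes whose names start with "Test". The sketch below models only that prefix rule. looks_like_test_class is a hypothetical helper written for this note, not a pytest API.

```python
# Sketch of pytest's default class-collection naming rule (prefix "Test").
# looks_like_test_class is a hypothetical helper, not part of pytest.

def looks_like_test_class(name, prefixes=("Test",)):
    """Return True if the class name matches one of the collection prefixes."""
    return name.startswith(tuple(prefixes))


old_names = ["TestMatrix", "TestVector", "TestDimension"]
# The commit prefixes each infrastructure class with "Impala".
new_names = ["Impala" + name for name in old_names]

for name in old_names + new_names:
    print(name, looks_like_test_class(name))
```

Under this rule the old names match the prefix and the renamed ones do not, which is the effect the commit relies on.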
Attila Jeges
60414f0633 IMPALA-4278: Don't abort Catalog startup quickly if HMS is not present
This change introduces a new catalogd startup option
(init_first_metastore_client_timeout_seconds) that specifies how many
seconds catalogd should keep retrying its first connection to HMS on
startup before giving up and exiting fatally.

Setting this startup option to a value that is greater than the HMS
startup time will allow CM to start Impala at the same time or even
before HMS.

The default value of init_first_metastore_client_timeout_seconds is
120 seconds.

Change-Id: I546d8fe9836004832ae40110c9fe22b3e704e11b
Reviewed-on: http://gerrit.cloudera.org:8080/5095
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Internal Jenkins
2016-11-18 03:12:12 +00:00
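
The retry-until-timeout behaviour described in the commit above can be sketched roughly as follows. This is not catalogd's implementation (which the commit does not show). connect_with_retry, its parameters, and the use of ConnectionError are assumptions made for illustration. Only the 120-second default comes from the commit message.

```python
import time


def connect_with_retry(connect, timeout_seconds=120, retry_interval_seconds=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Call connect() until it succeeds or timeout_seconds have elapsed.

    connect is any callable that raises ConnectionError while the remote
    service (HMS in the commit) is not yet reachable. Hypothetical helper.
    """
    deadline = clock() + timeout_seconds
    while True:
        try:
            return connect()
        except ConnectionError:
            # Give up (re-raise) once the deadline has passed.
            if clock() >= deadline:
                raise
            sleep(retry_interval_seconds)
```

A caller would pass a function that opens the metastore client. Choosing a timeout longer than the HMS startup time is what lets the two services start in either order.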
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace the existing ASF license text with the one given on the
   website, or add it if the file has none.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
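
For readers less familiar with the perl one-liner in the commit above, a Python sketch of the same line filter follows. It drops every line matching "Copyright.*Cloudera" case-insensitively. strip_cloudera_copyright is a hypothetical name, and the sketch works on a string rather than editing files in place.

```python
import re

# Same pattern as the perl filter: m#Copyright.*Cloudera#i
COPYRIGHT_RE = re.compile(r"Copyright.*Cloudera", re.IGNORECASE)


def strip_cloudera_copyright(text):
    """Return text with every line matching the copyright pattern removed."""
    kept = [line for line in text.splitlines(keepends=True)
            if not COPYRIGHT_RE.search(line)]
    return "".join(kept)
```

As the commit notes, a filter like this only removes the copyright line. Files with missing or unusual license text still need manual fixups.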
Taras Bobrovytsky
609b80410e Clean up Python test import statements
Many of our test scripts have import statements that look like
"from xxx import *". It is a good practice to explicitly name what
needs to be imported. This commit implements this practice. Also,
unused import statements are removed.

Change-Id: I6a33bb66552ae657d1725f765842f648faeb26a8
Reviewed-on: http://gerrit.cloudera.org:8080/3444
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
2016-07-15 23:26:18 +00:00
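
A small illustration of the import style the commit above adopts, using the standard library rather than Impala's test modules. module_name_from_path is a hypothetical helper.

```python
# Wildcard form the commit moves away from:
#   from os.path import *
# Explicit form: every imported name is visible at the top of the file.
from os.path import basename, splitext


def module_name_from_path(path):
    """Return the file name without directory or extension."""
    return splitext(basename(path))[0]
```

With explicit imports, a reader or a linter can tell where basename comes from and can flag names that are imported but never used.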
Casey Ching
074e5b4349 Remove hashbang from non-script python files
Many python files had a hashbang and the executable bit set though
they were not intended to be run as standalone scripts. That makes
determining which python files are actually scripts very difficult.
A future patch will update the hashbang in real python scripts so they
use $IMPALA_HOME/bin/impala-python.

Change-Id: I04eafdc73201feefe65b85817a00474e182ec2ba
Reviewed-on: http://gerrit.cloudera.org:8080/599
Reviewed-by: Casey Ching <casey@cloudera.com>
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Internal Jenkins
2015-08-04 05:26:07 +00:00
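
A minimal sketch of the kind of check and edit the commit above implies, operating on source text only. has_hashbang and strip_hashbang are hypothetical helpers. The commit does not say how the change was actually made, and clearing the executable bit is not covered here.

```python
def has_hashbang(first_line):
    """Return True if the line is a hashbang (interpreter) line."""
    return first_line.startswith("#!")


def strip_hashbang(source):
    """Return source without its leading hashbang line, if it has one."""
    lines = source.splitlines(keepends=True)
    if lines and has_hashbang(lines[0]):
        return "".join(lines[1:])
    return source
```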
Alex Behm
1327f4f07b IMPALA-491: Fixed worker process failure test and clients-in-use metric accounting.
Fixing the worker process failure test revealed a bug in our clients-in-use metrics
accounting in the client cache in the presence of node failures.
The pathological sequence leading to an incorrect metric was:
1. ClientConnection c'tor checks out an existing client and updates metrics
2. Existing client connection has broken pipe (or some other TException)
3. Recover by calling ClientConnection::Reopen()
4. Reopen() internally tries to create a new client connection, but the connection
   fails because the remote side is dead.
5. ClientConnection::client_ is set to NULL and therefore its d'tor
   does not properly decrement the clients-in-use for the client
   checked out in step 1.

Change-Id: I911210b23c4a024a4a8a84365a3d12268767b031
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3804
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-08-17 12:43:30 -07:00
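
The accounting problem in the commit above can be modelled in a few lines. This is a toy Python model, not the C++ client cache. The class and method names are simplified stand-ins, and tracking the checkout in a separate flag is just one possible way to keep the metric balanced. The commit message does not describe the actual fix.

```python
class ClientCache:
    """Toy cache that only tracks the clients-in-use metric."""

    def __init__(self):
        self.clients_in_use = 0

    def check_out(self):
        self.clients_in_use += 1
        return object()  # stand-in for a real client handle

    def release(self):
        self.clients_in_use -= 1


class ClientConnection:
    def __init__(self, cache):
        self._cache = cache
        self.client = cache.check_out()  # step 1: metric incremented
        self._checked_out = True         # tracked independently of client

    def reopen(self, remote_alive):
        """Steps 3-5: reconnect; on failure the client handle becomes None."""
        if remote_alive:
            self.client = object()
            return True
        self.client = None
        return False

    def close(self):
        # A buggy variant would test `self.client is not None` here and
        # skip the decrement after a failed reopen().
        if self._checked_out:
            self._cache.release()
            self._checked_out = False
```

Because close() keys off the checkout flag rather than the client handle, the metric returns to zero even when the reconnect fails against a dead remote node.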
Nong Li
a25400c94e Increase timeout in test_rows_availability to make sure query state is what we expect.
Change-Id: Id4feebcc7b7cecb07555009219e6420e48a0c82b
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3534
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3579
2014-07-22 12:12:13 -07:00
Lenni Kuff
6afea60704 Update test logging to print executable SQL statements and log all actions executed
This is the first step in cleaning up the test logging. It provides a common connection
interface that provides tracing around all operations. When a test fails the output will
be executable SQL. It also logs actions such as when a connection is opened, closed, or
when an operation is cancelled. Currently only beeswax connections are supported, but
I have a separate patch that adds support for executing using HS2 as well as Beeswax.

Example of new logging:
-- connecting to: localhost:21000
-- executing against localhost:21000
use functional;

SET disable_codegen=False;
SET abort_on_error=1;
SET batch_size=0;
SET num_nodes=0;

-- executing against localhost:21000
select a.timestamp_col from alltypessmall a inner join alltypessmall b on
(a.timestamp_col = b.timestamp_col)
where a.year=2009 and a.month=1 and b.year=2009 and b.month=1;
-- closing connection to: localhost:21000

Change-Id: Iedc7d4d3a84bfeff6cc1daae6ed1ca97613d7700
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1133
Tested-by: jenkins
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:54:40 -08:00
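
A rough sketch of a connection wrapper that emits trace output in the spirit of the example log in the commit above. TracingConnection is a hypothetical class. It does not talk to Impala at all and only shows how actions and statements could be logged as executable SQL. The exact layout of the real log may differ.

```python
class TracingConnection:
    """Hypothetical wrapper that logs every action as executable SQL."""

    def __init__(self, host_port, log=print):
        self._host_port = host_port
        self._log = log

    def connect(self):
        self._log("-- connecting to: %s" % self._host_port)

    def execute(self, sql, options=None):
        self._log("-- executing against %s" % self._host_port)
        # Query options are written as SET statements so the log can be replayed.
        for key, value in sorted((options or {}).items()):
            self._log("SET %s=%s;" % (key, value))
        statement = sql.rstrip()
        if not statement.endswith(";"):
            statement += ";"
        self._log(statement)
        # A real implementation would send the statement to the server here.

    def close(self):
        self._log("-- closing connection to: %s" % self._host_port)
```

Passing a different log callable (for example, a list's append method) makes the trace easy to capture when a test fails.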
Henry Robinson
9bc840dc85 Support for custom cluster configurations in some tests
Test suites that derive from common.CustomClusterTestSuite have a brand
new cluster for every test case, which they can configure as they wish
with custom arguments using the @with_args() decorator.

A future improvement is to optionally only have one cluster per test
suite, to allow multiple tests to run more quickly if they share
configuration options.

Change-Id: I6abd5740e644996d7ca2800edf4ff11b839d1bc4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/882
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:57 -08:00
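
The commit above names the @with_args() decorator but does not show its signature. The sketch below is one plausible shape for such a decorator, written as an assumption: it records the arguments on the test function so that a per-test setup step could read them when starting the cluster. ExampleSuite, cluster_args_for, and the flag "--some_flag=1" are all hypothetical.

```python
def with_args(*cluster_args):
    """Attach custom cluster start-up arguments to a test function (sketch)."""
    def decorate(test_func):
        test_func.cluster_args = list(cluster_args)
        return test_func
    return decorate


def cluster_args_for(test_func):
    """Return the arguments recorded by with_args, or [] if there are none."""
    return getattr(test_func, "cluster_args", [])


class ExampleSuite:
    # "--some_flag=1" is a placeholder, not a real Impala flag.
    @with_args("--some_flag=1")
    def test_with_custom_cluster(self):
        pass

    def test_with_default_cluster(self):
        pass
```

A setup hook in the suite base class could call cluster_args_for on the test about to run and pass the result to whatever starts the cluster.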
Henry Robinson
f241782966 IMPALA-620: Fix re-registration starvation bug in statestore
This patch fixes a slightly pathological state that occurs when the
statestore is under heavy load. The result of the bug is that
subscribers cannot successfully re-register because the statestore never
marks them as failed.

The exact sequence of events is as follows:

1. Subscriber registers with state-store.
2. Statestore does not send heartbeats in a timely fashion to
   subscriber. Subscriber times out.
3. Subscriber is restarted quickly. Statestore does not detect
   restart.
4. Subscriber's RegisterSubscriber() call fails, because statestore
   detects duplicate registration.
5. Subscriber restarts again. Since state-store is slow to send
   heartbeats, the state-store has not detected the restart and the
   subscriber receives a heartbeat message from the statestore and
   does not reject it.
6. Statestore continues to believe subscriber is alive, since the
   heartbeats are not being rejected.

To fix this, we add a registration ID to each successfully registered
subscriber that is known to both subscriber and statestore. If the
subscriber should restart and re-register, it receives a new
registration ID. Whenever a heartbeat arrives, it compares its
registration ID to that sent by the statestore with the heartbeat, and
rejects the heartbeat if they do not match.

We also allow re-registration of existing subscribers (getting rid of
the dreaded "Duplicate subscription" message). A new registration
overwrites an old one.

Change-Id: Ie32df3a586ccb375375ebfbcbec1aaeb930b6bfe
Reviewed-on: http://gerrit.ent.cloudera.com:8080/778
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-01-08 10:53:53 -08:00
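
The registration-ID scheme in the commit above can be sketched as a toy model. This is illustrative Python, not the statestore's C++ code. The class names, the use of uuid for IDs, and the method names are assumptions.

```python
import uuid


class Statestore:
    def __init__(self):
        self.registrations = {}  # subscriber_id -> registration_id

    def register(self, subscriber_id):
        # Re-registration is allowed: a new registration overwrites the old one.
        registration_id = uuid.uuid4().hex
        self.registrations[subscriber_id] = registration_id
        return registration_id


class Subscriber:
    def __init__(self, subscriber_id):
        self.subscriber_id = subscriber_id
        self.registration_id = None  # unknown until registration succeeds

    def register_with(self, statestore):
        self.registration_id = statestore.register(self.subscriber_id)

    def accept_heartbeat(self, registration_id):
        # Reject heartbeats addressed to an earlier incarnation.
        return registration_id == self.registration_id
```

A restarted subscriber starts with no registration ID, so a heartbeat carrying the ID of its previous incarnation is rejected. That rejection is what lets the statestore notice the restart.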
Lenni Kuff
2336ed99a4 Re-enable process failure tests + add simple failure tests for catalogd
This brings back online the process failure tests and adds a basic failure
test for the catalog service. The timeouts had to be adjusted to account for the
extra time it takes to load the catalog and also there is an additional state
store subscriber. Note: the statestore 'live.backends' metric which is used in these
tests needs to be renamed, it really means 'live.subscribers'. However, it requires some
coordination with other teams to make the change.
Also updated start-impala-cluster to check the catalog.ready flag to ensure the impalad
catalog is ready to accept queries.

Change-Id: If22e25dba7dc83aa40bec937b5f82b815bed4645
Reviewed-on: http://gerrit.ent.cloudera.com:8080/730
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:52 -08:00
Lenni Kuff
a2cbd2820e Add Catalog Service and support for automatic metadata refresh
The Impala CatalogService manages the caching and dissemination of cluster-wide metadata.
The CatalogService combines the metadata from the Hive Metastore, the NameNode,
and potentially additional sources in the future. The CatalogService uses the
StateStore to broadcast metadata updates across the cluster.
The CatalogService also directly handles executing metadata update requests from
impalad servers (DDL requests). It exposes a Thrift interface to allow impalads to
connect directly and execute their DDL operations.
The CatalogService has two main components - a C++ server that implements StateStore
integration, Thrift service implementation, and exporting of the debug webpage/metrics.
The other main component is the Java Catalog that manages caching and updating of all
the metadata. For each StateStore heartbeat, a delta of all metadata updates is broadcast
to the rest of the cluster.

Some Notes On the Changes
---
* The metadata is all sent as thrift structs. To do this all catalog objects (Tables/Views,
Databases, UDFs) have a thrift struct to represent them. These are sent with each statestore
delta update.
* The existing Catalog class has been separated into two separate sub-classes. An
ImpaladCatalog and a CatalogServiceCatalog. See the comments on those classes for more
details.

What is working:
* New CatalogService created
* Working with statestore delta updates and latest UDF changes
* DDL performed on Node 1 is now visible on all other nodes without a "refresh".
* Each DDL operation against the Catalog Service will return the catalog version that
  contains the change. An impalad will wait for the statestore heartbeat that contains this
  version before returning from the DDL command.
* All table types (Hbase, Hdfs, Views) getting their metadata propagated properly
* Block location information included in CS updates and used by Impalads
* Column and table stats included in CS updates and used by Impalads
* Query tests are all passing

Still TODO:
* Directly return catalog object metadata from DDL requests
* Poll the Hive Metastore to detect new/dropped/modified tables
* Reorganize the FE code for the Catalog Service. I don't think we want everything in the
  same JAR.

Change-Id: I8c61296dac28fb98bcfdc17361f4f141d3977eda
Reviewed-on: http://gerrit.ent.cloudera.com:8080/601
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:53:11 -08:00
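
The "wait for the catalog version" behaviour listed in the commit above can be sketched as follows. This is a simplified Python model, not the impalad implementation. CatalogVersionTracker and its methods are hypothetical names.

```python
import threading


class CatalogVersionTracker:
    """Tracks the newest catalog version applied locally (sketch)."""

    def __init__(self):
        self._version = 0
        self._cond = threading.Condition()

    def apply_update(self, version):
        """Called when a statestore heartbeat delivers catalog updates."""
        with self._cond:
            if version > self._version:
                self._version = version
                self._cond.notify_all()

    def wait_for_version(self, min_version, timeout=None):
        """Block until the local catalog reaches min_version.

        Returns False if the timeout expires first.
        """
        with self._cond:
            return self._cond.wait_for(
                lambda: self._version >= min_version, timeout)
```

In this model a DDL call would receive a version number from the catalog service, then call wait_for_version with it before returning to the client.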
Alex Behm
e4a24c8c1d Fixed the process failure test that was failing due to a race in
reading/writing a query's profile web page.

Change-Id: Ibf4a27aa17eb6439630d1616c2c719fc1ee2ba4e
Reviewed-on: http://gerrit.ent.cloudera.com:8080/553
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Alex Behm <alex.behm@cloudera.com>
2014-01-08 10:53:03 -08:00
Alex Behm
6a1cc58936 IMPALA-491: Improve error message when queries are cancelled due to BE nodes dying.
Change-Id: If9a47d9021b08385743093fbe8054b48119eaff9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/523
Tested-by: jenkins
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Alex Behm <alex.behm@cloudera.com>
2014-01-08 10:52:58 -08:00
Lenni Kuff
8095a963d0 Enable statestore to send delta updates
This change adds support for the statestore to send delta updates. It also updates
the scheduler to properly handle receiving delta and non-delta updates.

With this change, the statestore keeps a versioned log of the history of updates to
each topic, with each topic update getting a sequentially increasing version number that
is unique across the topic.
Subscribers also track the max version of each topic which they have successfully
processed. The statestore can use this information to send a delta of updates to a
subscriber, rather than all items in the topic. For non-delta updates, the statestore
will send an update that includes all values in the topic.

Additionally, if a client has received an unexpected delta update version range, it
can request a new delta update by setting the "from_version" field of the TTopicDelta
heartbeat response. The next state-store update will be based on the version the
client responded with.

Still left to do - Garbage collect un-needed deletions (those which have a version
number smaller than min(subscriber.version_number))

Change-Id: I53be0a647d7f7f3f8aea0d13cd9dc219411a1e5a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/217
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:52:45 -08:00
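
A toy model of the versioned topic described in the commit above. It is illustrative Python only. The Topic class, its methods, and the dictionary layout of an update are assumptions, and deletions (which the commit mentions as needing garbage collection) are left out.

```python
class Topic:
    """Keeps the latest value of each key with the version that wrote it."""

    def __init__(self):
        self._next_version = 1
        self._entries = {}  # key -> (version, value)

    def put(self, key, value):
        """Record an update and return its topic-unique version number."""
        version = self._next_version
        self._next_version += 1
        self._entries[key] = (version, value)
        return version

    def update_since(self, from_version):
        """Build an update for a subscriber that has processed from_version.

        from_version == 0 yields a non-delta update with every item.
        """
        items = {key: value
                 for key, (version, value) in self._entries.items()
                 if version > from_version}
        return {
            "is_delta": from_version > 0,
            "from_version": from_version,
            "to_version": self._next_version - 1,
            "items": items,
        }
```

A subscriber that sees an unexpected version range can simply ask again with the last version it processed, and the next update is computed from that point.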
Lenni Kuff
fa3c927b25 Fix test to wait for impalad num_live_backends=CLUSTER_SIZE after statestore restart
2014-01-08 10:51:30 -08:00
Lenni Kuff
9a4feb7391 Add impala local failure test framework and tests
2014-01-08 10:50:18 -08:00
ishaan
15658f384b Include targeted performance tests in experiments and add a new query
2014-01-08 10:49:02 -08:00
Lenni Kuff
4bd6279646 Add targeted failure injection test suite using debug failpoints
Change-Id: I013e913ad3c89f44524bf19638a1da2b83df7463
2014-01-08 10:47:54 -08:00