Commit Graph

329 Commits

Author SHA1 Message Date
Nong Li
2489e211f0 Update version to 1.2.2.
Change-Id: Id70f4af930050075a41b1953fc4c5c935bb5b671
2014-01-08 10:54:30 -08:00
Henry Robinson
6d9a7e290d Build Openldap as a thirdparty package
Change-Id: Ifbb0f468a23186f4160fceb462953bc321469c27
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1049
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
2014-01-08 10:54:20 -08:00
Henry Robinson
cb965d259a Build changes to use cyrus-sasl-2.1.23
Change-Id: Ie87e35945b6a415b0383cb75ffcae2fe35755623
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1047
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
2014-01-08 10:54:19 -08:00
Nong Li
b225477ae9 Bump version to 1.2.2-INTERNAL.
Change-Id: I256ef47b6e957a2723422e606d1b87f4e800bbf9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1032
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-01-08 10:54:17 -08:00
Lenni Kuff
01660374c6 Additional fe and testdata pom.xml cleanup
This change cleans up our FE pom.xml file by removing unneeded
dependencies and system dependencies (system dependencies are now pulled in
from the Maven release repository).

The upside is that our pom is cleaner and it will also help reduce the likelihood of
broken dependencies since Maven will pull in the right versions.  The downside
is that we now pull in quite a few more JARs.

Note: I was unable to find release artifacts for Sentry and Parquet so I am leaving
those as "system" for now.

Change-Id: I0b917b09a02243d78d89747591ab6bccacf7cf38

Saving changes

Change-Id: I3697a7b44884c40e077b3e354fef76625e1b881d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1011
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-01-08 10:54:17 -08:00
Lenni Kuff
e86ca62ec7 Do not append any JARs from thirdparty/ to the classpath
Change-Id: Id68c1bc118a1b8efebb6d035ca94a41cf1c4ded1
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1005
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-01-08 10:54:16 -08:00
Henry Robinson
ce2781c48d Remove bad quotes from thrift configure script
Change-Id: Id671f5366813378ead9362f67b082b7af705b005
Reviewed-on: http://gerrit.ent.cloudera.com:8080/994
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
2014-01-08 10:54:14 -08:00
Sean Mackrory
2b313a9782 IMP-1147. Impala build fails: PIC_LIB_PATH: unbound variable
Change-Id: Ifb173b553b9a52392b5d7caf3630032b89e89c2d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/992
Reviewed-by: Sean Mackrory <sean@cloudera.com>
Tested-by: Sean Mackrory <sean@cloudera.com>
2014-01-08 10:54:14 -08:00
Sean Mackrory
bb39e33101 IMP-1106. Allow libevent location to be overridden in Thrift dependency build
Change-Id: Ia4d92bb4bdfcb7ba29a36904afdb9fd5e398307d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/968
Reviewed-by: Henry Robinson <henry@cloudera.com>
Reviewed-by: Sean Mackrory <sean@cloudera.com>
Tested-by: Sean Mackrory <sean@cloudera.com>
2014-01-08 10:54:14 -08:00
ishaan
287953e87c Better error logging while loading data.
Change-Id: I67cbd9fd1d915ea043a731b7951f29fec25fc446
Reviewed-on: http://gerrit.ent.cloudera.com:8080/982
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
2014-01-08 10:54:13 -08:00
Lenni Kuff
6e09b90ea3 Properly set timeout in start-impala-cluster
Change-Id: I8cedf484d0ce9d2752e3970883f419ab51a82c3b
Reviewed-on: http://gerrit.ent.cloudera.com:8080/980
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
2014-01-08 10:54:13 -08:00
Lenni Kuff
e2b9b4a735 Bump version to v1.2.1
Change-Id: I8f1c9ae1fd0ad195fa7817d324d192c2386eac09
Reviewed-on: http://gerrit.ent.cloudera.com:8080/974
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:54:12 -08:00
ishaan
81b80c702c Upgrade thirdparty to use CDH4.5 bits.
The following changes have been made:
  -- Update hbase
  -- Update hive
  -- Update hadoop
  -- Update the parquet version to 1.2.5

Change-Id: Id6ceaef0e9eebab27ffd408160116fa84ed300fb
2014-01-08 10:54:09 -08:00
Lenni Kuff
6282d364a8 IMP-1134: DoAsUser and impersonator are reversed in audit logs
The audit logs currently have the "impersonator" field set to what we call the doAsUser
and the "user" field set as the connected user. They should be reversed.

Added basic tests to validate the correct event gets audited.

Change-Id: Idfa0aaa6c88debedc4993bd0489dbd3f696fcf17
Reviewed-on: http://gerrit.ent.cloudera.com:8080/958
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-01-08 10:54:03 -08:00
ishaan
bf5359be8d Cleanup Impala connections after data is loaded.
Change-Id: I152b09808740d5344462bcbaf4df4b71d88504cc
Reviewed-on: http://gerrit.ent.cloudera.com:8080/953
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
2014-01-08 10:54:02 -08:00
Lenni Kuff
6c25e78715 Add option to start-impala-cluster to only restart impalad
This helps speed up the restart time because we don't need to restart
the catalog server and reload the table metadata. This is useful if you
want to restart the impalad with a different command line parameter
or if you are making changes to only the impalad binary.

Change-Id: I0b714afaf7e508c450a353a53d67d95165de3486
Reviewed-on: http://gerrit.ent.cloudera.com:8080/897
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:59 -08:00
Lenni Kuff
f579ee8b25 Fix logging in load-data to print the query being executed
Change-Id: I4332e8d3a340f11e1bbb1f6c5126b0b9b4a2ad8e
Reviewed-on: http://gerrit.ent.cloudera.com:8080/949
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-01-08 10:53:58 -08:00
Henry Robinson
9bc840dc85 Support for custom cluster configurations in some tests
Test suites that derive from common.CustomClusterTestSuite have a brand
new cluster for every test case, which they can configure as they wish
with custom arguments using the @with_args() decorator.

A future improvement is to optionally only have one cluster per test
suite, to allow multiple tests to run more quickly if they share
configuration options.
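
A minimal sketch of how such a decorator could work, for illustration only. The
names with_args and CustomClusterTestSuite come from this message; start_cluster,
run_test, cluster_args, and LogLevelSuite are hypothetical stand-ins, and no real
cluster is started.

```python
def with_args(*cluster_args):
    """Record custom cluster start-up arguments on a test method (illustrative)."""
    def decorator(test_func):
        # Store the arguments on the function so the suite can read them later.
        test_func.cluster_args = list(cluster_args)
        return test_func
    return decorator


class CustomClusterTestSuite:
    """Stand-in suite that 'starts' a fresh cluster before every test case."""

    def start_cluster(self, cluster_args):
        # A real suite would launch processes here; this sketch only records the arguments.
        self.running_with = cluster_args

    def run_test(self, name):
        method = getattr(self, name)
        # Each test case gets its own cluster, configured from the decorator's arguments.
        self.start_cluster(getattr(method, "cluster_args", []))
        return method()


class LogLevelSuite(CustomClusterTestSuite):
    @with_args("--log_level=2")
    def test_verbose(self):
        return self.running_with
```

Calling LogLevelSuite().run_test("test_verbose") starts the stand-in cluster with
the decorated arguments before the test body runs.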

Change-Id: I6abd5740e644996d7ca2800edf4ff11b839d1bc4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/882
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:57 -08:00
ishaan
fcdcf1a9d8 Parallelize data loaded through Impala to speed up data loading.
Currently, we execute all the queries involved in data loading serially. This change
creates a separate .sql file for each file format, compression codec and compression
scheme combination, and executes all the files in parallel. Additionally, we now store all the
.sql files (independent of workload) in $IMPALA_HOME/data_load_files/<dataset_name>. Note
that only data loaded through Impala is parallelized; data loaded through Hive and HBase
remains serial.

On our build machines, the time taken to load all the data from snapshot was on the order
of 15 minutes.
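
The approach can be sketched roughly as follows. This is not the loader's actual
code: the function names and the load-<format>-<codec>-<scheme>.sql file naming
are assumptions, and execute_file stands for whatever runs one .sql file.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor


def table_format_combinations(file_formats, codecs, schemes):
    """Every (file format, compression codec, compression scheme) combination."""
    return list(itertools.product(file_formats, codecs, schemes))


def sql_file_name(dataset, combo):
    """One .sql file per combination (the naming pattern is hypothetical)."""
    file_format, codec, scheme = combo
    return f"data_load_files/{dataset}/load-{file_format}-{codec}-{scheme}.sql"


def load_in_parallel(sql_files, execute_file, max_workers=4):
    """Run execute_file on every .sql file concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(execute_file, sql_files))
```

Because each file holds the statements for one combination, the files do not
depend on each other and can be handed to the pool in any order.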

Change-Id: If8a862c43f0e75b506ca05d83eacdc05621cbbf8
Reviewed-on: http://gerrit.ent.cloudera.com:8080/804
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:53 -08:00
ishaan
b17bc45148 Disable process failure tests to stabilize builds.
Change-Id: I4a8c36f24ba2d0f4547b0e6eb70bcc2010ae1975
Reviewed-on: http://gerrit.ent.cloudera.com:8080/865
Reviewed-by: Nong Li <nong@cloudera.com>
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-01-08 10:53:52 -08:00
Lenni Kuff
2336ed99a4 Re-enable process failure tests + add simple failure tests for catalogd
This brings back online the process failure tests and adds a basic failure
test for the catalog service. The timeouts had to be adjusted to account for the
extra time it takes to load the catalog and also there is an additional state
store subscriber. Note: the statestore 'live.backends' metric which is used in these
tests needs to be renamed, it really means 'live.subscribers'. However, it requires some
coordination with other teams to make the change.
Also updated start-impala-cluster to check the catalog.ready flag to ensure the impalad
catalog is ready to accept queries.

Change-Id: If22e25dba7dc83aa40bec937b5f82b815bed4645
Reviewed-on: http://gerrit.ent.cloudera.com:8080/730
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:52 -08:00
Lenni Kuff
77e6430811 Add alias in impala-config.sh to gerrit-merge-verify script
This alias makes it easy to verify and merge Gerrit changes

Change-Id: Idb0b5c3e6c825721e375bdf0c86b5975df1ed4b9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/836
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:53:50 -08:00
ishaan
f6f8d9d19d Fix query_executor to set user specified Impala query options.
This is currently broken (query options do not get set via run-workload). If any
query options are provided to run-workload, it exits with an error. This patch
re-enables setting query options through run-workload and also moves their validation to
impala_beeswax.

Change-Id: I1df010990f9e57ebd4cf59ada5d9646a883df380
Reviewed-on: http://gerrit.ent.cloudera.com:8080/820
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-01-08 10:53:49 -08:00
Nong Li
1f92f36080 Bump version to 1.2.1-Internal.
Change-Id: I8b8ff39799652031d6d625da1d9a1d20e8acecb3
Reviewed-on: http://gerrit.ent.cloudera.com:8080/760
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-01-08 10:53:48 -08:00
Henry Robinson
f238f5fdcf New subsystem to support multiple authentication types
This patch reworks our Kerberos authentication layer to support multiple
authentication protocols, particularly PLAIN/SASL to support external
LDAP authentication.

There is now a system-wide AuthManager object, initialised by InitAuth()
which occurs during the usual InitCommonRuntime() setup. The AuthManager
is responsible for supplying AuthProvider objects to ThriftServers and
ThriftClients. The AuthProvider in turn generates Thrift transport
objects which are usually SASL-enabled, and which either employ GSSAPI
or PLAIN mechanisms.

In miscellaneous changes:

* Cyrus SASL now builds both with LDAP and the dummy '--enable-true'
  external authentication mechanisms enabled.
* To test PLAIN/SASL authentication, you must now include
  $IMPALA_HOME/thirdparty/${IMPALA_CYRUS_SASL_VERSION}/build/lib/sasl2 in
  FLAGS_sasl_path.
* The shell now has an option to authenticate using LDAP, and will
  prompt for a password at startup before doing so.
* Since the authentication code is almost entirely Thrift-specific, it
  has been moved to the rpc lib.

Change-Id: I771de50f05630efdf1606ab9f0f48146ad54595e
Reviewed-on: http://gerrit.ent.cloudera.com:8080/716
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-01-08 10:53:43 -08:00
Henry Robinson
bb1f48588d Disable saslauthd in Cyrus-Sasl build
We've had at least one case of Sasl failing to build during
Saslauthd. We don't use that component, so it's fine to disable it
rather than figure out the actual issue.

Change-Id: I1e16063970806823f7fe3b40a1b0e74a32c4b57f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/736
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
2014-01-08 10:53:41 -08:00
Henry Robinson
f02293bf5f Upgrade cyrus-sasl to 2.1.25
Change-Id: I1864c6fa0811f615777e9a7ed0aeef5494104449
Reviewed-on: http://gerrit.ent.cloudera.com:8080/733
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-01-08 10:53:40 -08:00
Henry Robinson
0aa120adca Allow building individual thirdparty components.
Now you can write:

./build_thirdparty -sasl -gflags

or similar to build individual thirdparty libraries, which is handy if
you're upgrading a single library or changing its build flags.

The behaviour with no command-line flags is the same as before this
patch, except that the 'git clean' is called only from the individual
library directories, rather than /thirdparty as before; this avoids
blowing away unchecked in directories while still removing build
artefacts as intended.

Change-Id: Iaafb6f6e42b0173c11eec3b08c8dea895dcd9199
Reviewed-on: http://gerrit.ent.cloudera.com:8080/725
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:40 -08:00
Lenni Kuff
af6d381401 IMPALA-565: Support user impersonation for HS2 authorization requests
This change adds support for user impersonation for HS2 authorization
requests. It adds a new flag (--authorized_proxy_user_config) that, if
set, allows users (e.g. hue) to impersonate another user. The user they
wish to impersonate is passed using the HS2 configuration property,
'impala.doas.user'.
The configuration also allows specifying the list of users a proxy user
can impersonate, or '*' to allow the proxy user to impersonate
any user. For example: hue=user1,user2,admin=*
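
As a rough illustration, the example value above could be interpreted like this.
The grammar is an assumption inferred only from that one example (a token
containing '=' starts a new proxy user); the real flag's parsing may differ, and
both function names are hypothetical.

```python
def parse_proxy_config(config):
    """Parse 'hue=user1,user2,admin=*' into {proxy_user: set of allowed users}."""
    allowed = {}
    current = None
    for token in config.split(","):
        token = token.strip()
        if not token:
            continue
        if "=" in token:
            # Assumed rule: 'name=user' opens the list for a new proxy user.
            current, _, token = token.partition("=")
            current = current.strip()
            token = token.strip()
            allowed.setdefault(current, set())
        if current is None:
            raise ValueError("user listed before any proxy user: %r" % token)
        if token:
            allowed[current].add(token)
    return allowed


def may_impersonate(allowed, proxy_user, do_as_user):
    """True if proxy_user may act as do_as_user; '*' permits any user."""
    users = allowed.get(proxy_user, set())
    return "*" in users or do_as_user in users
```

With the example value, hue may act as user1 or user2 only, while admin may act
as any user.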

Change-Id: I2a13e31e5bde2e6df47134458c803168415d0437
Reviewed-on: http://gerrit.ent.cloudera.com:8080/574
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:53:39 -08:00
ishaan
0cb16863ee run-workload should log a warning to console and not fail if abort_on_query_error is False and the
query fails.

This change also disables printing the runtime_profile to the console.

Change-Id: Ic7bc3406d6eddb67a514ecfb4a27add8c40a8604
Reviewed-on: http://gerrit.ent.cloudera.com:8080/687
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-01-08 10:53:25 -08:00
Nong Li
4800995d44 Add execution for Hive UDFs.
Change-Id: I6a5ad96fed77e2b8a2701f21a917a8eb7a11d500
Reviewed-on: http://gerrit.ent.cloudera.com:8080/458
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-01-08 10:53:25 -08:00
Lenni Kuff
13605ad834 Support catalogd in ImpalaCluster test library
Adds basic support for catalogd to our ImpalaCluster test library/object model.
This will allow us to write more programmatic tests targeting the catalogd process
including process failure tests and metric check validators.

Change-Id: I8e5f7bc73f999f105437c6d3d52c6d436a354d2d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/617
Tested-by: jenkins
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:53:16 -08:00
ishaan
aa530ce11d Change the order of fields stored in the benchmark results to fix performance comparisons.
Change-Id: I7b7ebd711adfe9a44cba92b55d35ef8dd97eba60
Reviewed-on: http://gerrit.ent.cloudera.com:8080/584
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-01-08 10:53:12 -08:00
Lenni Kuff
5a97258c1a Update table metadata loading to workaround Hive MetaStore bug HIVE-5457
There is a Hive Metastore concurrency bug (HIVE-5457) which causes concurrent
calls to getTable() to sometimes fail with DataNucleus exceptions. This
causes catalogd to fail to load ALL metadata for all tables. This fix is to
serialize our calls to getTable(). Additionally, tweaked the logging a bit and
improved start-impala-cluster to do a better job of reporting the status of catalog
initialization. It's too bad we have to serialize these calls, but we seem to be able
to run everything else in parallel with no problems (get col stats, block md, etc).
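
The shape of the workaround, sketched in Python rather than the catalog's Java:
one lock serializes the getTable() call while the rest of the per-table loading
still runs in parallel. All names here are illustrative; get_table and
load_extras are caller-supplied stand-ins, not real metastore APIs.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Single lock shared by all loader threads; only the getTable() step holds it.
_get_table_lock = threading.Lock()


def load_table(name, get_table, load_extras):
    """Fetch one table under the lock, then do the remaining work unlocked."""
    with _get_table_lock:
        table = get_table(name)
    # Column stats, block metadata, etc. can still overlap across threads.
    return name, table, load_extras(table)


def load_all(names, get_table, load_extras, workers=4):
    """Load many tables concurrently; results are returned in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda n: load_table(n, get_table, load_extras), names))
```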

Also added a couple of changes in our hive-site to match the defaults for our cluster
metastore deployments.

Change-Id: Ic70e2a9b8190a56510e430d8da3942dca252eb4c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/609
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:53:11 -08:00
Lenni Kuff
a2cbd2820e Add Catalog Service and support for automatic metadata refresh
The Impala CatalogService manages the caching and dissemination of cluster-wide metadata.
The CatalogService combines the metadata from the Hive Metastore, the NameNode,
and potentially additional sources in the future. The CatalogService uses the
StateStore to broadcast metadata updates across the cluster.
The CatalogService also directly handles executing metadata update requests from
impalad servers (DDL requests). It exposes a Thrift interface to allow impalads to
directly connect and execute their DDL operations.
The CatalogService has two main components - a C++ server that implements StateStore
integration, Thrift service implementation, and exporting of the debug webpage/metrics.
The other main component is the Java Catalog that manages caching and updating of all
the metadata. For each StateStore heartbeat, a delta of all metadata updates is broadcast
to the rest of the cluster.

Some Notes On the Changes
---
* The metadata is all sent as thrift structs. To do this all catalog objects (Tables/Views,
Databases, UDFs) have a thrift struct to represent them. These are sent with each statestore
delta update.
* The existing Catalog class has been separated into two separate sub-classes. An
ImpaladCatalog and a CatalogServiceCatalog. See the comments on those classes for more
details.

What is working:
* New CatalogService created
* Working with statestore delta updates and latest UDF changes
* DDL performed on Node 1 is now visible on all other nodes without a "refresh".
* Each DDL operation against the Catalog Service will return the catalog version that
  contains the change. An impalad will wait for the statestore heartbeat that contains this
  version before returning from the DDL command.
* All table types (Hbase, Hdfs, Views) getting their metadata propagated properly
* Block location information included in CS updates and used by Impalads
* Column and table stats included in CS updates and used by Impalads
* Query tests are all passing

Still TODO:
* Directly return catalog object metadata from DDL requests
* Poll the Hive Metastore to detect new/dropped/modified tables
* Reorganize the FE code for the Catalog Service. I don't think we want everything in the
  same JAR.
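
The "wait for the catalog version" behaviour described under "What is working"
can be pictured with the sketch below. It is a simplified model in Python, not
the impalad implementation; the class and method names are hypothetical.

```python
import threading


class CatalogVersionTracker:
    """Tracks the newest catalog version seen in statestore heartbeats."""

    def __init__(self):
        self._version = 0
        self._cond = threading.Condition()

    def on_heartbeat(self, version):
        """Called when a heartbeat delivers a catalog delta up to `version`."""
        with self._cond:
            if version > self._version:
                self._version = version
                self._cond.notify_all()

    def wait_for_version(self, version, timeout=None):
        """Block until the local catalog has reached `version`; False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: self._version >= version, timeout=timeout)
```

A DDL handler would call wait_for_version() with the version returned by the
Catalog Service, so the change is visible locally before the DDL returns.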

Change-Id: I8c61296dac28fb98bcfdc17361f4f141d3977eda
Reviewed-on: http://gerrit.ent.cloudera.com:8080/601
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:53:11 -08:00
ishaan
565d15579c Add the ability to use a workload as the unit of execution in the Impala benchmark runner.
At the moment, a query is the default unit of execution and parallelism in the Impala
performance suite. With this change, we now have the ability to treat a workload as the
unit of execution. A workload is defined as a unique combination of the dataset, scale
factor, a subset (or all) of the queries in the dataset, and a table format (file format,
compression codec and compression scheme).

It introduces two new command line options in bin/run-workload.py:
  * --execution_scope
    The default scope is 'query', and it maintains previous semantics. The
    new scope is 'workload', which toggles the unit of execution to a workload.
  * --shuffle_query_exec_order.
    Shuffles the order in which queries are executed (only applicable when the
    execution_scope is workload), defaults to False.
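
A rough sketch of what the two options mean. The flag names are the ones listed
above; the use of argparse, treating --shuffle_query_exec_order as a boolean
switch, and the execution_units helper are assumptions for illustration, not
the script's actual code.

```python
import argparse
import random


def build_parser():
    parser = argparse.ArgumentParser()
    # 'query' keeps the previous semantics; 'workload' runs a whole workload as one unit.
    parser.add_argument("--execution_scope", choices=["query", "workload"], default="query")
    parser.add_argument("--shuffle_query_exec_order", action="store_true", default=False)
    return parser


def execution_units(queries, scope, shuffle=False, rng=None):
    """Group queries into units of execution according to the scope."""
    if scope == "query":
        # Each query is its own unit; shuffling does not apply in this scope.
        return [[q] for q in queries]
    ordered = list(queries)
    if shuffle:
        (rng or random).shuffle(ordered)
    return [ordered]
```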

Change-Id: I790d75f0896210cda8eb999015b0be04246e4c45
Reviewed-on: http://gerrit.ent.cloudera.com:8080/503
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-01-08 10:53:07 -08:00
Henry Robinson
41c88219ab Fix PYTHONPATH for Thrift on non-Debian systems
Python modules on Redhat systems might be in lib or in lib64, unlike Debian systems which
symlink one to the other
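
One way to picture the lookup, as a Python sketch rather than the shell change
itself. The python*/site-packages layout under the install prefix and the
function name are assumptions.

```python
import glob
import os


def thrift_python_paths(prefix):
    """Find site-packages directories under both lib and lib64 of an install prefix."""
    paths = []
    seen = set()
    for libdir in ("lib", "lib64"):
        pattern = os.path.join(prefix, libdir, "python*", "site-packages")
        for path in sorted(glob.glob(pattern)):
            # Where one directory is a symlink to the other, list the location once.
            real = os.path.realpath(path)
            if real not in seen:
                seen.add(real)
                paths.append(path)
    return paths
```

The returned directories could then be joined with os.pathsep and appended to
PYTHONPATH.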

Change-Id: Ia1e2d362e3d7e13b87c70e7578644827a5234a91
Reviewed-on: http://gerrit.ent.cloudera.com:8080/544
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:00 -08:00
Henry Robinson
b9bc9a9e89 Add SSL support for client connections to Impala
This patch allows Impala to start either Beeswax or HS2 on an
SSL-secured port. SSL is a certificate-based authentication scheme,
where the server provides a certificate to the client as part of the
handshake process. The client verifies that certificate, either by
contacting a trusted third-party certificate authority (CA), or by
accepting a 'self-signed' certificate from the server that is also
provided to the client out-of-band; the client simply compares the two
certificate copies.

Once the certificate is verified, the client and server negotiate an
encryption key for the session, using a public key provided by the
server to encrypt that negotiation. Therefore the server has to have
access to a private key in order to decrypt the encryption key.

Both certificate and key are stored in industry standard .PEM
format. Impala uses the same certificate and key for both Beeswax and
HS2, and the files containing the certificate and key are provided via
--ssl_server_certificate and --ssl_private_key. If either is non-blank,
SSL is enabled for Beeswax and HS2.

The Python shell supports SSL as of this patch via new --ssl and
--ca_cert flags.

Finally, this patch also adds support for Impala's ThriftClients to use
SSL, paving the way for having the backend service use encryption on the
wire as well (although such a configuration is not used by this
patch). The client SSL support is only currently used for the new test
case.

This patch does not enable 'mutual' authentication, where clients
provide certificates to the server in order to authenticate
themselves. Impala has other authentication mechanisms for that purpose.
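
For illustration, a Python client could verify the server certificate along
these lines using the standard ssl module. This is a sketch, not the shell's
code: the function names are hypothetical, and ca_cert only plays the role of
the file passed via --ca_cert.

```python
import socket
import ssl


def client_ssl_context(ca_cert=None):
    """Context that requires a valid server certificate.

    With ca_cert set (for example a copy of a self-signed certificate obtained
    out-of-band), that PEM file is trusted in addition to the system CAs.
    """
    return ssl.create_default_context(cafile=ca_cert)


def open_secure_socket(host, port, ca_cert=None):
    """Connect and perform the handshake; raises ssl.SSLError if verification fails."""
    context = client_ssl_context(ca_cert)
    raw = socket.create_connection((host, port))
    try:
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

No client certificate is loaded here, which matches the absence of mutual
authentication noted above.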

Change-Id: I3942aa0d21b34b7cda748292f04a9523f35ee6d4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/514
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-01-08 10:53:00 -08:00
Lenni Kuff
79cdeac3d6 Consolidate test cluster under IMPALA_HOME/cluster_logs + store logs during data loading
Change-Id: I8f6239e4ccb0515c85bf80193a475788fb18dedb
Reviewed-on: http://gerrit.ent.cloudera.com:8080/518
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-01-08 10:52:56 -08:00
ishaan
533ca4d3e6 Fix start-impala-cluster.py to take the user specified log level into account.
Previously, the user specified command line parameter --log_level was not being
taken into account while starting the mini impala cluster.

Change-Id: I433412b6a7057585136d2ad887010881217d9676
Reviewed-on: http://gerrit.ent.cloudera.com:8080/520
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-01-08 10:52:56 -08:00
Henry Robinson
dbed012396 Move from 'Mongoose' to 'Squeasel' webserver
We now maintain our own internal version of the Mongoose webserver,
renamed to 'Squeasel' for differentiation. This patch imports the new
code, and swaps all mentions of mongoose or mg_ for squeasel / sq_.

In the future, we might consider making Squeasel a git subproject so
that we can pull in changes more easily.

Change-Id: I83b595dc336a32f2c8aba59eee420b71274b681b
Reviewed-on: http://gerrit.ent.cloudera.com:8080/485
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-01-08 10:52:55 -08:00
Lenni Kuff
533f1751a4 Buffer logs in start impala cluster
Change-Id: I84a79219c20bf2aeed2b90f6895577112b150663
Reviewed-on: http://gerrit.ent.cloudera.com:8080/481
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:52:49 -08:00
Lenni Kuff
a1f2f72f49 Add Impala DDL support for creation of AVRO tables + support for CREATE/ALTER SERDEPROPERTIES
This change adds Impala DDL support for creation of AVRO tables.
Additionally, it add Impala support for CREATE and ALTER SERDEPROPERTIES
which are used when creating Avro backed tables. This syntax is not
exactly the same as the Hive support since it introduces a new
fileformat (AVROFILE) that implies the needed Serialization library,
input format, and output format.

Change-Id: I5047e419198a89599e9d014fdedfee1a20437a7d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/464
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:52:48 -08:00
Lenni Kuff
88bc7ea39d Bump version from v1.2 to v1.2.0
Change-Id: Ie15d8d1d03e37766a5086063e1de579e87f66263
Reviewed-on: http://gerrit.ent.cloudera.com:8080/424
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-01-08 10:52:41 -08:00
Nong Li
69d6f20bbe Add option to bin/run-all-tests to allow for more iterations.
Change-Id: Ic440ed0454e275fa67b7d27e392acb803cf59d39
Reviewed-on: http://gerrit.ent.cloudera.com:8080/395
Tested-by: jenkins
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:52:39 -08:00
ishaan
53cd9eadab Treat HBase as a file format for functional tests
Change-Id: Ia01181a1e10eb108419122d347e9d869a69e8922
Reviewed-on: http://gerrit.ent.cloudera.com:8080/102
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-01-08 10:52:36 -08:00
Lenni Kuff
1b49174a0a Cleanup start-impala-cluster and use exec to start processes
This change cleans up start-impala-cluster to remove all the unneeded log4j setup code.
As part of this change, updated the start-impalad script to "exec" the impala binaries,
which removes the .sh wrapper script from the list of running processes.
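
The effect of "exec" can be demonstrated with a small Python sketch (POSIX
behaviour assumed; the helper names are hypothetical). The wrapper process
prints its PID and then replaces itself, so the program it launches reports the
same PID and no separate wrapper process remains.

```python
import os
import subprocess
import sys


def exec_binary(binary, args):
    """Replace the current process image with `binary`; does not return on success."""
    os.execv(binary, [binary] + list(args))


def pids_seen_with_exec():
    """Run a wrapper that execs a child; return the PIDs printed by each."""
    child_code = "import os; print(os.getpid())"
    wrapper_code = (
        "import os, sys\n"
        "print(os.getpid(), flush=True)\n"
        f"os.execv(sys.executable, [sys.executable, '-c', {child_code!r}])\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", wrapper_code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()
```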

Change-Id: I5dee49b72ff51012bf43ab9d2a3a21fd2b841ff5
Reviewed-on: http://gerrit.ent.cloudera.com:8080/270
Tested-by: jenkins
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-01-08 10:52:25 -08:00
Lenni Kuff
f7cd659a4c Run BE unit tests before all other tests
This change modifies run-all-tests to run the BE unit tests before all other tests.
This is beneficial for a number of reasons - it helps catch basic bugs earlier
(query tests will probably fail if there is a BE unit test failure) and also allows
us to keep the mini-impala-cluster running after a build to help with debugging.
Ideally, we could also run the FE unit tests before the query tests but there is
currently a dependency on the TPC-H temp tables generated by the query tests so this
cannot be done.

Change-Id: Id43dbac456236258cd9986e990779d27f5d41075
Reviewed-on: http://gerrit.ent.cloudera.com:8080/269
Tested-by: jenkins
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
2014-01-08 10:52:23 -08:00
Aaron Davidson
d0665481d1 Vary number of build threads based on number of cores
Simply makes buildall.sh and the make_*.sh commands use 2 * ncores
build threads. ncores includes logical CPUs.
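
The same rule expressed as a small Python sketch (the scripts themselves are
shell; these function names and the make invocation are illustrative
assumptions). os.cpu_count() reports logical CPUs, matching the note above.

```python
import os


def build_thread_count(multiplier=2):
    """Number of build threads: multiplier * logical CPU count (at least multiplier)."""
    cores = os.cpu_count() or 1  # cpu_count() may return None on some platforms
    return multiplier * cores


def make_command(target="all"):
    """Assemble a make invocation that uses the computed parallelism."""
    return ["make", f"-j{build_thread_count()}", target]
```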

Change-Id: Ib3fbf1f1c8362c5bd3afab61f4d3030a50c51c10
Reviewed-on: http://gerrit.ent.cloudera.com:8080/288
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-01-08 10:52:22 -08:00
Nong Li
b93219be38 Extend parquet dictionary compatibility checks to 1.2 internal.
Change-Id: I98c030caf557a7f0b530137d6de75c23a3c90c73
Reviewed-on: http://gerrit.ent.cloudera.com:8080/174
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Alex Behm <alex.behm@cloudera.com>
2014-01-08 10:52:07 -08:00