Commit Graph

412 Commits

Author SHA1 Message Date
ishaan
e408560c56 Perf Framework: Move exec functions to a separate file and deprecate Hive execution.
This patch does the following:
  - Removes code that deals with executing queries through Hive.
  - Gives the user the option to specify only the hostname for the Impalads.
  - Moves the execution functions to their own .py file.
  - Removes some duplicate code (exec_shell_cmd -> exec_process)

Change-Id: If49951c7bb5423ef9343d4d211f6da13d397325a
Reviewed-on: http://gerrit.cloudera.org:8080/862
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-09-22 10:58:32 -07:00
ishaan
b9cac28ae5 Add cdh5.7.0-SNAPSHOT Hadoop/HBase/Hive/LLAMA/Sentry dependencies.
Change-Id: Ifda9939b7ad035b9877d75fbedbeb15cfa7ce517
2015-09-16 15:03:38 -07:00
Juan Yu
9bd86d53c1 Remove "localhost" from JVM debug port option to allow remote debugging
Allowing a remote debugger to connect to the Impala daemon is very helpful
for debugging the frontend.
You can also remote debug Impala running on a real cluster by
setting JAVA_TOOL_OPTIONS="-agentlib:jdwp=transport=dt_socket,
server=y,address=30000,suspend=n,quiet=y"
in Impala Environment Advanced Configuration via CM.

Change-Id: I761c5b2229d107ca4559c220488838b85fc14d53
Reviewed-on: http://gerrit.cloudera.org:8080/671
Reviewed-by: Juan Yu <jyu@cloudera.com>
Tested-by: Internal Jenkins
2015-08-26 06:56:24 +00:00
Martin Grund
5afd5bc8f6 Toolchain Cleanup and ASAN Improvements
This patch provides the last fixes to finally enable the toolchain:

     - Remove static OpenSSL dependency
     - Fixing inline assembly problems in ASAN
     - Issues with non-relocatable LLVM 3.3 - adds manual system
       includes to fix issues with hardcoded header paths in clang.

When the toolchain is enabled and we build for ASAN we use a specific
toolchain file to build with LLVM-trunk as the main compiler. Even
though this uses LLVM-trunk for compiling the Impala code, this will use
LLVM 3.3 for codegen.  In addition, this enables us to follow up with
TSAN and LEAKSAN.

Change-Id: I0abb914ca3f192cb7edd83ead134bc9e2d02071f
Reviewed-on: http://gerrit.cloudera.org:8080/556
Tested-by: Internal Jenkins
Reviewed-by: Martin Grund <mgrund@cloudera.com>
2015-08-21 20:14:31 +00:00
Juan Yu
9d83d1de83 Revert "Remove localhost from JVM debug option to allow remote debugger connect to Impala daemon"
This reverts commit 51c56d57e8c59599ca789569be264ec0c4f25ef7.
2015-08-21 12:09:00 -07:00
Juan Yu
113603a563 Remove localhost from JVM debug option to allow remote debugger connect to Impala daemon
Change-Id: I31ce9d03adc3a4c88e257f3ec1883fba5e06bca0
2015-08-21 12:07:29 -07:00
Sailesh Mukil
51b9db9ecb impala-config.sh modified to use patched version of re2
Change-Id: I0d8fe1753d99068586f6b8e63f9d127f470acaf4
Reviewed-on: http://gerrit.cloudera.org:8080/660
Reviewed-by: Casey Ching <casey@cloudera.com>
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
2015-08-19 22:53:44 +00:00
Alex Behm
2372667ba3 Addendum: Quietly resolve FE dependencies in Jenkins runs.
I had missed a few places in my original commit:
2b43828fe22dbb17b4f6df875fc59af6772f3984

Change-Id: I55c0d0a79f6c3416f6ba64cfbf4c1dbb4293bd36
Reviewed-on: http://gerrit.cloudera.org:8080/616
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2015-08-14 00:23:41 +00:00
Skye Wanderman-Milne
f8134ff133 IMPALA-2187: Run py.test through impala python env.
The symptom of this bug was that we were seeing "ValueError: bad marshal data"
when trying to import from tests.hs2.test_hs2 during custom cluster tests.

The problem was that we were not running the custom cluster tests through the
new Impala Python virtualenv.

Some tests (properly running with the virtualenv) that run before the custom
cluster tests had caused the generation of pyc files for tests.hs2.test_hs2.
Those pyc files then appeared corrupted when executing the custom cluster
tests because the default python env is running a different version than the
virtualenv those pyc files were generated from in earlier tests.
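
To illustrate the failure mode described above, the following sketch compares the magic number stored in a .pyc header with the one the running interpreter expects. It assumes a modern Python 3 interpreter (importlib.util.MAGIC_NUMBER is not available in the Python 2.6 environment this commit targets), and pyc_matches_interpreter is a hypothetical helper, not part of the Impala test infrastructure.

```python
import importlib.util


def pyc_matches_interpreter(pyc_path):
    """Return True if the .pyc header carries this interpreter's magic number.

    The first four bytes of a .pyc file hold a magic number that changes
    whenever the bytecode format changes between Python versions.
    """
    with open(pyc_path, "rb") as f:
        magic = f.read(4)
    return magic == importlib.util.MAGIC_NUMBER
```

A stale .pyc written by a different interpreter version fails this check, which is the same mismatch that showed up as "bad marshal data".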

Change-Id: Ie9d8f90c65921247dd885804165f9b7271ea807b
Reviewed-on: http://gerrit.cloudera.org:8080/618
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2015-08-09 06:17:48 +00:00
Casey Ching
d202d6a967 Use "impala-python" (virtualenv) instead of system python
Python tests and infra scripts will now use "python" from the virtualenv
via $IMPALA_HOME/bin/impala-python. Some scripts could be simplified now
that python 2.6 and a dependable set of third-party libraries are
available but that is not done as part of this commit.

Change-Id: If1cf96898d6350e78ea107b9026b12ba63a4162f
Reviewed-on: http://gerrit.cloudera.org:8080/603
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Internal Jenkins
2015-08-06 02:09:09 +00:00
Casey Ching
ca5856b8f8 Python: Bootstrap a virtualenv and add impala-python command
This adds a bootstrap script and an "impala-python" command to
$IMPALA_HOME/bin that automatically runs the bootstrap and redirects to
the virtualenv python. Existing python scripts will later be updated to
use this new "impala-python" command.

The bootstrap script will build a virtualenv to ensure a minimum python
version (2.6) and a well known set of dependencies. The bootstrap script
can be run with python 2.4 but 2.6 must already be installed on the
system. The resulting virtualenv will use 2.6 at a minimum.

Only dependencies explicitly listed in requirements.txt will be
installed and available (no system packages will ever be used). No
packages will ever be downloaded when setting up the virtualenv. In the
future new dependencies can be added by editing the requirements.txt
file. Installation through requirements.txt is a standard pip feature.
When requirements.txt is updated, the next run of "impala-python"  will
rebuild the virtualenv.

Change-Id: I150595d7e09a45d5f2e3c30a845bc8d6a761eeed
Reviewed-on: http://gerrit.cloudera.org:8080/424
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2015-08-01 01:30:12 +00:00
ishaan
4b54107818 Add sentry-1.5.1-cdh5.5.0 to thirdparty.
Change-Id: I0100ac13b55b41022ce96ca92aef1aadd67d9786
2015-07-30 18:37:52 -07:00
Tim Armstrong
f6d8719007 IMPALA-1389: only install SASL to /tmp if necessary
The SASL install directory was hardcoded to a path under /tmp. This solved
a very specific problem with ~'s in paths but could not be overridden. This
can lead to lost time when /tmp is cleared out periodically or on reboots.
IMPALA_CYRUS_SASL_INSTALL_DIR now defaults to a path under thirdparty unless
there is a tilde in the path. It can also be overridden by setting the
IMPALA_CYRUS_SASL_INSTALL_DIR environment variable.
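
The precedence described above can be sketched in Python as follows. The real logic lives in a shell script; sasl_install_dir is a hypothetical helper, and the sub-directory names are placeholders, not the actual layout.

```python
import os


def sasl_install_dir(impala_home, env=None):
    """Pick the Cyrus SASL install directory.

    Precedence, following the commit message:
      1. IMPALA_CYRUS_SASL_INSTALL_DIR from the environment, if set.
      2. A path under thirdparty, unless that path contains a tilde.
      3. A path under /tmp as the fallback for tilde paths.
    """
    env = os.environ if env is None else env
    override = env.get("IMPALA_CYRUS_SASL_INSTALL_DIR")
    if override:
        return override
    # Placeholder sub-directory names (assumption).
    default = os.path.join(impala_home, "thirdparty", "cyrus-sasl", "build")
    if "~" in default:
        return os.path.join("/tmp", "impala-cyrus-sasl")
    return default
```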

Change-Id: I1aea2b51d265e3d1f04be0c915dcbee57c863be6
Reviewed-on: http://gerrit.cloudera.org:8080/536
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2015-07-29 23:45:16 +00:00
Casey Ching
23eec9fc30 Simplify shell cancellation tests
The tests were doing unnecessary things. One such thing that stopped
working with the virtualenv patch was searching for the shell process to
get the pid. The search was never needed since the process was spawned
with Popen which provides the pid directly.
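
A minimal sketch of the point above: a process spawned with Popen exposes its pid directly. The child command here is a stand-in (an assumption), not the Impala shell.

```python
import subprocess
import sys

# Spawn a child process; Popen exposes its pid directly, so there is no
# need to search the process table for it.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
child_pid = proc.pid

# Clean up the child once we are done with it.
proc.terminate()
proc.wait()
```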

Change-Id: I2455e58de4fdba8fd2770f0489fac8cddf6b90a0
Reviewed-on: http://gerrit.cloudera.org:8080/555
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2015-07-23 04:09:11 +00:00
ishaan
f14f044d23 Check for the existence of the psutil module before using it.
A previous change only imported psutil if it was present on the system and otherwise always
returned True for check_process_exists. This broke the functionality of --kill, wherein we
throw an error if the process has not been killed.

This patch adds a method to check for the existence of psutil and only checks whether a
process has been killed if psutil is available.

Change-Id: I679ce12dc7e2732a8a95d5825c31d8a1bec354ec
Reviewed-on: http://gerrit.cloudera.org:8080/541
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-07-21 10:29:25 +00:00
Harrison Sheinblatt
d767bfea8c IMPALA-2139: Correct IMPALA_AUX_TEST_HOME default
Capitalize default for IMPALA_AUX_TEST_HOME in
bin/impala-config.sh so that it correctly matches the directory name.

The jenkins scripts will not be affected because they explicitly set
the variable and already use the correct capitalization.

Change-Id: I674ddfd38bc1a13721674e433e03cc66baad2cfc
Reviewed-on: http://gerrit.cloudera.org:8080/543
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2015-07-21 04:28:28 +00:00
ishaan
b8d2eb97e6 Wrap the import of psutil in a try/catch to unbreak the full CDH build.
buildall invokes start-impala-cluster.py --kill --force in order to ensure that there are
no Impala daemons running on the system. The recent introduction of psutil in the start
script breaks the full CDH build as it's not a standard python module.

This patch only imports psutil inside the method where it's used and disables its usage with a
warning in the case that it's not found.
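
A minimal sketch of the lazy-import pattern described above. The function body and the None return value for "psutil unavailable" are assumptions for illustration, not the code in start-impala-cluster.py.

```python
import logging


def check_process_exists(name):
    """Return True/False if psutil can tell, or None if psutil is unavailable."""
    try:
        # Import lazily so the script still loads on systems without psutil.
        import psutil
    except ImportError:
        logging.warning("psutil not found; skipping process check for %s", name)
        return None
    for proc in psutil.process_iter():
        try:
            if proc.name() == name:
                return True
        except psutil.Error:
            # The process exited or is inaccessible; ignore it.
            continue
    return False
```

Returning a distinct value when psutil is missing lets the caller tell "not running" apart from "could not check".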

Change-Id: Ic2fce81b6d7af2722e0e23c2a580c30b86144aa1
Reviewed-on: http://gerrit.cloudera.org:8080/540
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-07-18 03:22:24 +00:00
Martin Grund
b472677e90 IMPALA-2001: Check for running processes during startup and killing.
This patch checks that the processes launched by start-impala-cluster.py
are actually started or killed. During startup, it checks if the
appropriate processes are really killed. This is useful in cases where a
similar process was started by another user, thus preventing this mini
cluster from starting. For the statestore and the catalog this patch adds an
additional check after the processes were launched to verify their
existence.

Change-Id: Idfd6a11fd72278ddf180dc537459582b4392a109
Reviewed-on: http://gerrit.cloudera.org:8080/521
Tested-by: Internal Jenkins
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
2015-07-17 22:31:29 +00:00
Matthew Jacobs
f4ba3c2370 Remove debug logging in start-impala-cluster.py
Change-Id: I72366cc877d81e70f04bf32f0938307d459b40fa
Reviewed-on: http://gerrit.cloudera.org:8080/522
Tested-by: Internal Jenkins
Reviewed-by: Martin Grund <mgrund@cloudera.com>
2015-07-13 22:00:19 +00:00
Matthew Jacobs
5908ef983c Change start-impala-cluster.py to append daemon args
start-impala-cluster.py takes arguments to pass through to
impalads, catalogd, and statestored, but all arguments must
be passed as a single string today. This changes the arg
parsing to also accept multiple arguments passed as
different parameters.

E.g. the following previously (and still) works:
> start-impala-cluster.py --impalad_args="-v=2 -memlimit=1G"

Now args can optionally be passed separately:
> start-impala-cluster.py --impalad_args=-v=2 --impalad_args=-memlimit=1G

This is helpful in general, but is needed to enable passing
through some arguments from environment variables in jenkins
jobs.
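
One way to get this behaviour is an option that accumulates repeated occurrences. The sketch below uses argparse with action="append"; this is an assumption, since the commit does not say which parser start-impala-cluster.py uses, and parse_impalad_args is a hypothetical helper.

```python
import argparse

parser = argparse.ArgumentParser()
# action="append" collects every occurrence of the flag into a list.
parser.add_argument("--impalad_args", action="append", default=[])


def parse_impalad_args(argv):
    """Join all --impalad_args occurrences into one argument string."""
    options = parser.parse_args(argv)
    return " ".join(options.impalad_args)
```

Both invocation styles shown above then yield the same string of daemon arguments.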

Change-Id: I32f7a75ec4ce8f5ce878b3e7f76880a731842c14
Reviewed-on: http://gerrit.cloudera.org:8080/510
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Reviewed-by: Casey Ching <casey@cloudera.com>
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
2015-07-09 21:07:45 +00:00
ishaan
8aea13e39f Add Sentry 1.5 to thirdparty.
Additionally, the pom.xml is changed to pull in the shaded sentry-provider-db jar to
account for the difference in thrift versions.

Change-Id: I24d64d2b21712e76d9ad51551ee87fd37a738641
2015-07-02 14:05:41 -07:00
ishaan
d9558cfdab Move the performance framework into its own folder.
This patch moves the files that deal with evaluating performance to their own folder.
Additionally, it also changes the code to adhere to python conventions, by using single
underscores instead of double underscores.

Change-Id: I9c96f51f33dfbc60d3121fa1ff68bfac6480e2c2
Reviewed-on: http://gerrit.cloudera.org:8080/471
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-06-23 03:31:39 +00:00
Martin Grund
81f247b171 Optional Impala Toolchain
This patch makes it possible to optionally enable the new Impala binary
toolchain. For now there are no major version differences between the
toolchain dependencies and what is currently kept in thirdparty.

To enable the toolchain, export the variable IMPALA_TOOLCHAIN to the
folder where the binaries are available.

In addition this patch moves gutil from the thirdparty directory into
the source tree of be/src to allow easy propagation of compiler and
linker flags. Furthermore, the thrift-cpp target was added as a
dependency to all targets that require the generated thrift sources to
be available before the build is started.

What is the new toolchain: The goal of the toolchain is to homogenize
the build environment and to make sure that Impala is built nearly
identically on every platform. To achieve this, we limit the flexibility
of using the system's host libraries and rather rely on a set of
custom-produced binaries including the necessary compiler.

Change-Id: If2dac920520e4a18be2a9a75b3184a5bd97a065b
Reviewed-on: http://gerrit.cloudera.org:8080/427
Reviewed-by: Adar Dembo <adar@cloudera.com>
Tested-by: Internal Jenkins
Reviewed-by: Martin Grund <mgrund@cloudera.com>
2015-06-13 03:11:44 +00:00
ishaan
377214c469 Use Isilon as the default file system when running Isilon tests.
This patch enables running Impala tests against Isilon as the default file system. The
intention is to run tests against a realistic deployment, i.e, Isilon replacing HDFS as
the underlying filesystem.

Specifically, it does the following:
  - Adds a new environment variable DEFAULT_FS, which points to HDFS by default.
  - Makes the fs.defaultFs property in core-site.xml use the DEFAULT_FS environment
    variable, such that all clients talk to Isilon implicitly.
  - Unset FILESYSTEM_PREFIX when the TARGET_FILESYSTEM is Isilon, since path prefixes
    are no longer needed.
  - Only starts the Hive Metastore and the Impala service stack when running
    tests against Isilon.

We don't start KMS/HBase because they're not relevant to Isilon. We also don't
start YARN, Hive and Llama because Hive queries are disabled with Isilon.

The scripts that start/stop Hive, YARN and Llama should be modified to point to a
filesystem other than HDFS in the future.

Change-Id: Id66bfb160fe57f66a64a089b465b536c6c514b63
Reviewed-on: http://gerrit.cloudera.org:8080/449
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-06-11 01:23:11 +00:00
Dan Hecht
d46de9bba1 IMPALA-1968: Part 1: Improve planner numNodes estimate for remote scans
This commit will be backported to 5.4.x to improve plans when using
Isilon and S3.

The planner currently estimates the number of backends that an HDFS scan
node will execute on as the number of datanodes holding block replicas
for the corresponding table.  This can be a bad estimate for various reasons:

1) It's completely wrong when the scan is remote (e.g. S3 or Isilon).
2) It doesn't account for partition pruning.
3) The size of the set of hosts holding block replicas may be larger than
   the number of scan ranges.

Improve the estimate by examining the scan ranges and taking locality into
account.  While this new estimate will eventually be used in all cases,
this change uses the new estimate only when there is a remote scan range
so as not to change plans produced for local ranges (since this commit will
be backported to 5.4.x).  So, this commit purposely addresses only case
1.  A follow-on commit will enable the new logic for all cases.

Also set up the S3PlannerTest so that we can enable it in the nightly
Jenkins S3 run.  It was inadvertently never enabled there.

Change-Id: I3fd3f7c5431a535fb044c98c326338c21b8a1898
Reviewed-on: http://gerrit.cloudera.org:8080/425
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2015-06-03 20:04:03 +00:00
Martin Grund
7d66dfad5f IMPALA-1830: Fix impala-config.sh for non-bash shells
impala-config.sh had three bugs that are fixed with this patch:

  1) Not finding the postgres driver will return from the script, not
  exit the shell.

  2) HADOOP_CLASSPATH uses a correct wildcard path specifier and does not
  rely on shell expansion anymore.

  3) IMPALA_HOME is set correctly even if BASH_SOURCE is not available
  for ZSH shells.

Change-Id: Ifbcf62c643cade43a9007f9bb780fc650760df0e
Reviewed-on: http://gerrit.cloudera.org:8080/407
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2015-05-22 23:49:59 +00:00
Martin Grund
1afe72830a IMPALA-1916: Replace Status::OK by Status::OK()
By doing so, we avoid unnecessarily calling the copy constructor for
Status OK objects and loading the value from memory (due to the old
Status::OK being a global). The impact of this patch was validated by
inspecting both optimized assembly code and generated IR code.

Applying this patch has some effect on the amount of generated code. The
new tool `get_code_size` will list the text, data, and bss sizes for all
archives that we produce in a release build. This patch reduces the code
size by ~20 kB.

      Text      Data    BSS
Old   10578622  576864  40825
New   10559367  576864  40809

The majority of the changes in this patch have been mechanically applied
using:

   find be/src -name "*.cc" -or -name "*.h" | xargs sed -i
   's/Status::OK;/Status::OK\(\);/'

A new micro-benchmark was added to determine the overhead of using
Status in hot code sections.

Machine Info: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
status:               Function     Rate (iters/ms)          Comparison
----------------------------------------------------------------------
             Call Status::OK()           9.555e+08                  1X
     Call static Status::Error           4.515e+07            0.04725X
   Call Status(Code, 'string')           9.873e+06            0.01033X
            Call w/ Assignment           5.422e+08             0.5674X
           Call Cond Branch OK           5.941e+06           0.006218X
        Call Cond Branch ERROR           7.047e+06           0.007375X
 Call Cond Branch Bool (false)           1.914e+10              20.03X
  Call Cond Branch Bool (true)           1.491e+11                156X
Call Cond Boost Optional (true)          3.935e+09              4.118X
Call Cond Boost Optional (false)         2.147e+10              22.47X

Change-Id: I1be6f4c52e2db8cba35b3938a236913faa321e9e
Reviewed-on: http://gerrit.cloudera.org:8080/351
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
2015-05-22 09:53:13 +00:00
Alex Behm
1bd3eca22f Quietly resolve dependencies in Jenkins runs to avoid log spew.
Change-Id: If38a683785f3c6c9d92f762a2dfd86f009ce9d84
Reviewed-on: http://gerrit.cloudera.org:8080/392
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2015-05-19 09:12:43 +00:00
Alex Behm
013d6f968f Clean up FE pom.xml to eliminate console spew.
This patch makes the following changes in our pom to reduce
the build time and significantly reduce console spew.

1. Remove jar-with-dependencies from package goal.
We have no need for creating an uber jar that contains the FE as well
as all its dependencies. Locally, we carefully construct our class path
manually (relying on copy-dependencies), and in Impala deployments
the FE jar is put together with the other dependencies, so the FE jar
does not need to be self-contained.

2. Silence copy-dependencies.
Changes the configuration of the maven-dependency-plugin to not
log every copied file to the console.

Change-Id: If351e4e800fd1ca1108f9a0f4d88f52a53fc211c
Reviewed-on: http://gerrit.cloudera.org:8080/378
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2015-05-18 07:20:07 +00:00
Matthew Jacobs
1fefe61745 Fix packaging build, with unsupported syntax on older Python
Older versions of Python did not support syntax
like "except Exception as e".

Change-Id: I6d874962e9de5db76eed312d61432c7d74341917
Reviewed-on: http://gerrit.cloudera.org:8080/383
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Matthew Jacobs <mj@cloudera.com>
2015-05-14 23:20:35 +00:00
Matthew Jacobs
cf0b6bc595 Add flag to easily enable Yarn and Llama in mini cluster
Adds a flag to start-impala-cluster.py (--enable_rm) to set up the
mini Impala cluster using Yarn and Llama. This hides a number of
flags that must be set on the impalads:
  -enable_rm
  -llama_addressess: set to the local llama service
  -fair_scheduler_allocation_path: set to the path of the fair-scheduler.xml
       in each node's hadoop conf directory
  -cgroup_hierarchy_path: set to a path in the CPU cgroup hierarchy which
       has the correct permissions for Impala to manage a child cgroup. The
       path comes from cgroups.py.

The new module cgroups.py was added to contain cgroups-related
utilities. Right now it provides paths to the CPU controller
hierarchy root and a path within the hierarchy that can be used
for impalads (i.e. have the proper permissions, one for each
cluster node).

Change-Id: Ic2181ec5613c180f240958c84f885c6b136a64d4
Reviewed-on: http://gerrit.cloudera.org:8080/369
Tested-by: Internal Jenkins
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
2015-05-14 21:15:24 +00:00
Matthew Jacobs
adf4b4863d Allow specifying cluster start args in run-all-tests.sh
This will enable jenkins jobs to specify custom arguments when
starting the mini cluster (via start-impala-cluster.py). This
will be used to create a jenkins job that runs tests with RM
enabled.

Change-Id: I96a2e8d90db448581bbf448f3df514381f79fb27
Reviewed-on: http://gerrit.cloudera.org:8080/380
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
2015-05-14 21:05:53 +00:00
ishaan
058978dccb Enable using Isilon as the underlying filesystem.
This patch enables the Impala test suite to run the end to end tests
against an Isilon namenode. There are a few caveats:
  - The fe test will currently not work.
  - Only loading data from both the test-warehouse snapshot and the metadata snapshot is
    supported.
  - The test suite cannot be run by multiple people (unless we have access to multiple
    Isilon namenodes)

Change-Id: I786b4e4f51b99e79ad42abc676f537ebfc189237
Reviewed-on: http://gerrit.cloudera.org:8080/356
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-05-12 01:28:19 +00:00
ishaan
9b6c54eac2 Test execution should continue even if a test fails.
This patch changes the workflow of test execution:
  - All FE/BE/JDBC tests are executed in spite of failures.
  - End to end pytests also continue to execute until a threshold is reached. This
    threshold is set to 10, but is overridable by the environment variable
    MAX_PYTEST_FAILURES

It also adds extra debugging information for end to end tests. Failures are reported
immediately after a test fails. The entire execution report is still displayed at the
end.
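
A minimal sketch of how such a threshold could be wired up, assuming the limit is passed to py.test through its --maxfail option. The commit does not say how the threshold is enforced, and pytest_failure_args is a hypothetical helper.

```python
import os


def pytest_failure_args(env=None):
    """Build the py.test argument that stops a run after too many failures."""
    env = os.environ if env is None else env
    # Default threshold of 10, overridable through MAX_PYTEST_FAILURES.
    max_failures = int(env.get("MAX_PYTEST_FAILURES", "10"))
    return ["--maxfail=%d" % max_failures]
```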

Change-Id: I3a4f446e74dbc6feb5799226e109fc1eebe48733
Reviewed-on: http://gerrit.cloudera.org:8080/326
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-05-01 03:32:35 +00:00
Alex Leblang
831331dda2 Changes to compile on CentOS 7
This commit contains changes that allow Impala to be compiled, but not
yet run, on CentOS 7. This commit adds some missing flags that specify
library locations.
Change-Id: I73aff2f75b6d0f7a3c13349665ac98a0286e7ebd
Reviewed-on: http://gerrit.cloudera.org:8080/313
Reviewed-by: Alex Leblang <alex.leblang@cloudera.com>
Tested-by: Internal Jenkins
2015-04-17 02:29:19 +00:00
ishaan
d947ef6f1f Run frontend tests before end to end tests.
Previously, some frontend tests depended upon state left behind by the end to end tests.
This is no longer the case, so we should run the faster tests first.

Change-Id: I0fa50a6916d76a4d0431e7fb2cc83e6e437b108b
Reviewed-on: http://gerrit.cloudera.org:8080/321
Reviewed-by: Alex Leblang <alex.leblang@cloudera.com>
Tested-by: Internal Jenkins
2015-04-08 03:27:10 +00:00
Martin Grund
fb9ab9c26a Separate IR targets to avoid build failure
When executing custom commands / custom targets in parallel, the
dependency resolution for these targets will happen in parallel as
well. If two custom targets are started sufficiently close together for
a clean build, each of the targets will execute all dependencies. In the
worst case, both targets will try to overwrite files the other target
has already written leading to either corrupt archives or missing code.

Some background can be found here:
http://www.cmake.org/pipermail/cmake/2011-July/045256.html

This patch separates the two IR targets and moves them to the end of the
compile chain.

Change-Id: I5b26ebd1c3421788fd22e6a09ef96dd6b944e89e
Reviewed-on: http://gerrit.cloudera.org:8080/318
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
2015-04-06 08:44:24 +00:00
Martin Grund
b605c57ee6 Apply -j to compile_to_ir_(no_)sse targets
Change-Id: Ifd042b5119274d3cd6972f34c97c1230d9c1b9f5
Reviewed-on: http://gerrit.cloudera.org:8080/315
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-04-04 01:36:34 +00:00
Casey Ching
6f1ce232f4 Use java from JAVA_HOME
Various build and test machines have multiple versions of java
installed and relying on the default "java" command being compatible
isn't practical (a machine may also build an older version of Impala
that might require a different java version). Since JAVA_HOME is already
required that can/should be used to determine which java binary to use.
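
A minimal sketch of the idea, assuming a hypothetical helper named java_binary; the commit does not show the actual scripts.

```python
import os


def java_binary(env=None):
    """Return the java executable under JAVA_HOME rather than the one on PATH."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        raise RuntimeError("JAVA_HOME must be set")
    return os.path.join(java_home, "bin", "java")
```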

This also includes a minor change to replace a block of code that was
using 4-space indent. Instead of using 2-space indent, that block was
replaced with one line.

Change-Id: I4b8698b2aa5411b5fa6c5bc06291625999478955
Reviewed-on: http://gerrit.cloudera.org:8080/310
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2015-04-03 00:13:22 +00:00
ishaan
548309dfb4 Add Sentry/HBase/Llama/Hadoop thirdparty dependencies for cdh5.5.0-SNAPSHOT
Change-Id: I266bb4d561ec87431b95ff1d09eda5be32e0bfaa
2015-03-13 12:15:42 -07:00
ishaan
65692e7125 Enable running jdbc tests independent of fe tests.
Currently, it's not possible to run jdbc tests without running fe tests. This patch
makes that possible.

Change-Id: I77ec336cc31b231b43008a99f7c3a9d48a0f3fda
Reviewed-on: http://gerrit.cloudera.org:8080/197
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-03-11 16:39:39 -07:00
ishaan
6df2d92f78 Make the number of test iterations configurable via an environment variable.
This patch enables the user to set an environment variable specifying the number of
iterations to run all the tests. The configuration can be overridden via the command line.
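
The precedence (command line over environment over a built-in default) can be sketched as follows. The variable name NUM_TEST_ITERATIONS, the --iterations flag, and the default of 1 are all hypothetical; the commit message names none of them.

```python
import argparse
import os


def num_iterations(argv, env=None):
    """Resolve the iteration count: command line beats environment beats 1."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    # The environment variable only supplies the default for the flag.
    parser.add_argument("--iterations", type=int,
                        default=int(env.get("NUM_TEST_ITERATIONS", "1")))
    return parser.parse_args(argv).iterations
```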

Change-Id: I1da545201f1b59697344e62ae8edf5f9fb3cd92d
Reviewed-on: http://gerrit.cloudera.org:8080/188
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-03-11 16:39:39 -07:00
ishaan
9d76d5898e Bump version to 2.3.0-cdh5-INTERNAL
Change-Id: I9f6f4fcc6c0518c03417703bfbc1b576eabe5e80

Conflicts:

	bin/save-version.sh
2015-03-06 16:53:27 -08:00
ishaan
92d79d2cc7 Bump version to 2.2.0-cdh5
Change-Id: If818475de7c5e40ef5fba2a301a772610bb8c0b0
2015-03-06 16:47:37 -08:00
Martin Grund
b582cdc22b IMPALA-1598: Adding Error Codes to Log Messages
This patch introduces the concept of error codes for errors that are
recorded in Impala and are going to be presented to the client. These
error codes are used to aggregate and group incoming error / warning
messages to reduce the spill on the shell and increase the usefulness of
the messages. By splitting the message string from the implementation,
it becomes possible to edit the string independently of the code and
pave the way for internationalization.

Error messages are defined as a combination of an enum value and a
string. Both are defined in the Error.thrift file that is automatically
generated using the script in common/thrift/generate_error_codes.py. The
goal of the script is to have a central understandable repository of
error messages. Adding new messages to this file will require rebuilding
the thrift part. The proxy class ErrorMessage is responsible for
representing an error and capturing the parameters that are used to format
the error message string.

When error messages are recorded they are recorded based on the
following algorithm:

- If an error message is of type GENERAL, do not aggregate this message
  and simply add it to the total number of messages
- If an error message is of a specific type, record the first error
  message as a sample and for all other occurrences increment the count.
- The coordinator will merge all error messages except the ones of type
  GENERAL and display a count.

For example, in the case of the parquet file spanning multiple blocks
the output will look like:

    Parquet files should not be split into multiple hdfs-blocks.
    file=hdfs://localhost:20500/fid.parq (1 of 321 similar)

All messages are always logged to VLOG. In the coordinator error
messages are merged across all backends to retain readability in the
case of large clusters.
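
The aggregation rules above can be sketched in Python as follows. This is an illustration only: the actual implementation is in the Impala backend, and the class name ErrorLog, its methods, and the error code used in the usage note are hypothetical.

```python
GENERAL = "GENERAL"


class ErrorLog(object):
    """Aggregates error messages by error code."""

    def __init__(self):
        self.general = []   # GENERAL messages are kept verbatim
        self.by_code = {}   # code -> [sample message, count]

    def record(self, code, message):
        if code == GENERAL:
            # GENERAL messages are never aggregated.
            self.general.append(message)
        elif code in self.by_code:
            # Later occurrences only bump the count.
            self.by_code[code][1] += 1
        else:
            # The first occurrence is kept as the sample.
            self.by_code[code] = [message, 1]

    def merge(self, other):
        """Coordinator-side merge of another backend's log into this one."""
        self.general.extend(other.general)
        for code, (sample, count) in other.by_code.items():
            if code in self.by_code:
                self.by_code[code][1] += count
            else:
                self.by_code[code] = [sample, count]

    def lines(self):
        out = list(self.general)
        for sample, count in self.by_code.values():
            out.append("%s (1 of %d similar)" % (sample, count))
        return out
```

For example, recording the same code on two backends and merging the logs produces one sample line with the combined count, while GENERAL messages are listed individually.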

The current version of this patch adds these new error codes to some of
the most important error messages as a reference implementation.

Change-Id: I1f1811631836d2dd6048035ad33f7194fb71d6b8
Reviewed-on: http://gerrit.cloudera.org:8080/39
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
2015-03-01 03:37:32 +00:00
ishaan
aa65f3de45 Add HBase version 1.0.0-cdh5.4.0-SNAPSHOT to thirdparty.
Change-Id: Iee1adcf778800f0d30a64c5818f7b62fd23cce5f
2015-02-25 19:58:56 -08:00
ishaan
21d24f5295 Infrastructure changes to enable the hive version change from 0.13.1 to 1.1.0
Specifically:
  - Hive needs some jars from hadoop/tools/lib
  - Hive has a dependency on apache.snapshots (added in fe/pom.xml)
  - Beeline has to be explicitly told not to use jline.

Change-Id: Id38956b748f8f667a39505c92355f0298f308718

Conflicts:

	testdata/bin/load-hive-builtins.sh
2015-02-23 20:27:13 -08:00
ishaan
18f84c0e07 Add hive-1.1.0-cdh5.4.0-SNAPSHOT
Change-Id: I5fef2a4b64d2ab7ed2ddca979ea6e2b270e5394d
2015-02-23 20:27:12 -08:00
ishaan
245a96b532 Add new hadoop 2.6
Change-Id: I18cf18c5c59084deea58adcbf0c8e39fbdf681f7
2015-02-12 09:44:00 -08:00
ishaan
2386fb84a8 Enable the data loading infrastructure to switch the underlying file system.
This patch enables loading data to S3 instead of HDFS. It is preliminary in nature;
as such, there are a few caveats:
 - The fe tests do not work.
 - Only loading from a test-warehouse snapshot and metastore snapshot is enabled.
 - Until Hive works with S3, only a subset of all the tests will work.

Change-Id: Ia66a5f836b4245e3b022a49de805eec337a51324
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5851
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
2015-02-03 01:02:42 -08:00