FlatBuffers version 1.6.0 is already included in the toolchain. This
commit adds it to the build system.
Change-Id: I2ca255ddf08ac846b454bfa1470ed67b1338d2b0
Reviewed-on: http://gerrit.cloudera.org:8080/6180
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Impala Public Jenkins
Add libev 4.20 to the Impala build. This is a dependency of KRPC.
FindLibEv.cmake was taken from Apache Kudu.
Change-Id: Iaf0646533592e6a8cd929a8cb015b83a7ea3008f
Reviewed-on: http://gerrit.cloudera.org:8080/5659
Tested-by: Impala Public Jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
This patch adds Protobuf 2.6.1 to Impala's build, and bumps the
toolchain version so that the dependency is available. Protobuf is
unused in this commit, but is required for KRPC.
FindProtobuf.cmake includes some utility CMake methods to generate
source code from Protobuf definitions. It is taken from Kudu.
Change-Id: Ic9357fe0f201cbf7df1ba19fe4773dfb6c10b4ef
Reviewed-on: http://gerrit.cloudera.org:8080/5657
Tested-by: Impala Public Jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
This change prevents us from depending on LLAMA to build.
Note that the LLAMA MiniKDC is left in - it is a test
utility that does not depend on LLAMA itself.
IMPALA-4292 tracks cleaning this up.
Testing:
Ran a private build to verify that all tests pass.
Change-Id: If2e5e21d8047097d56062ded11b0832a1d397fe0
Reviewed-on: http://gerrit.cloudera.org:8080/4739
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Internal Jenkins
Alas, poor Llama! I knew him, Impala: a system
of infinite jest, of most excellent fancy: we hath
borne him on our back a thousand times; and now, how
abhorred in my imagination it is!
Done:
* Removed QueryResourceMgr, ResourceBroker, CGroupsMgr
* Removed untested 'offline' mode and NM failure detection from
ImpalaServer
* Removed all Llama-related Thrift files
* Removed RM-related arguments to MemTracker constructors
* Deprecated all RM-related flags, printing a warning if enable_rm is
set
* Removed expansion logic from MemTracker
* Removed VCore logic from QuerySchedule
* Removed all reservation-related logic from Scheduler
* Removed RM metric descriptions
* Various misc. small class changes
Not done:
* Remove RM flags (--enable_rm etc.)
* Remove RM query options
* Changes to RequestPoolService (see IMPALA-4159)
* Remove estimates of VCores / memory from plan
Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef
Reviewed-on: http://gerrit.cloudera.org:8080/4445
Tested-by: Internal Jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
One problem uncovered while trying to build Impala on Ubuntu16 is
that the functions 'isnan' and 'isinf' both appear in std::
(from <cmath>) and in boost::math::, but we're currently using
them without qualifiers in several places, leading to a conflict.
This patch prefaces all uses with 'std::' to disambiguate, and also
adds <cmath> imports to all files that use those functions, for
the sake of explicitness.
Another problem is that bin/make_impala.sh uses the system cmake,
which may not be compatible with the toolchain binaries. This patch
updates impala-config.sh to add the toolchain cmake to PATH, so
that we'll use it wherever we use cmake.
Change-Id: Iaa1520c1e4aa4175468ac342b14c1262fa745f7a
Reviewed-on: http://gerrit.cloudera.org:8080/3800
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:
http://www.apache.org/legal/src-headers.html#headers
Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
http://www.apache.org/legal/src-headers.html#notice
to follow that format and add a line for Cloudera.
3) Replace or add the existing ASF license text with the one given
on the website.
Much of this change was automatically generated via:
git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]
Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.
[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
modification to ORIG_LICENSE to match Impala's license text.
Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
The generated data is identical to the pregenerated tpch.tar.gz
and tpcds.tar.gz data that was used previously and was not
publicly accessible.
This adds a "preload" hook to bin/load-data.py that can execute custom
logic for each data set. This is used to call the TPC-H and TPC-DS data
generation utilities that are already available in the Impala toolchain.
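A minimal sketch of how a per-data-set preload hook could be dispatched. This
is not the code in bin/load-data.py: PRELOAD_HOOKS, run_preload and the hook
bodies are hypothetical names used only for illustration.

```python
from typing import Callable, Dict

# Hypothetical registry: data set name -> callable run before loading.
# The lambdas stand in for calls to the real data generation utilities.
PRELOAD_HOOKS: Dict[str, Callable[[], str]] = {
    "tpch": lambda: "generated tpch data",
    "tpcds": lambda: "generated tpcds data",
}


def run_preload(dataset: str) -> str:
    """Run the preload hook for `dataset`, if one is registered."""
    hook = PRELOAD_HOOKS.get(dataset)
    if hook is None:
        # Data sets without custom logic simply skip the preload step.
        return "no preload step for %s" % dataset
    return hook()


if __name__ == "__main__":
    for name in ("tpch", "tpcds", "functional"):
        print(run_preload(name))
```

Keeping the hooks in a lookup table means a data set with no custom logic
needs no special casing in the loader.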
Testing:
Ran private test job with loading from snapshot disabled and without
the tpch/tpcds tarballs available.
Change-Id: Ieccfbd7d8d4a91bffddbe35abb7f5572e71a71cf
Reviewed-on: http://gerrit.cloudera.org:8080/3761
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
This change updates the toolchain bootstrapping script
to download the CDH components (hadoop, hbase, hive, llama,
llama-minikdc and sentry) from the toolchain S3 bucket to
the toolchain directory if the environment variable
$DOWNLOAD_CDH_COMPONENTS is true. By default, it is false,
which means the CDH components in the thirdparty directory
will be used instead.
To build the ASF tree (https://git-wip-us.apache.org/repos/asf?p=incubator-impala.git),
set $DOWNLOAD_CDH_COMPONENTS to true. Currently, the CDH
components in S3 are snapshots from the thirdparty directory
at 688d0efcd38731e8e27a8236dbdca21c8fd571a1. Once the integration
jenkins job (impala-cdh5-trunk-core-integration) is modified
to upload the latest stable builds to the S3 buckets, we can
remove the thirdparty directory and always use the CDH components
in the toolchain directory.
Note that bootstrap_toolchain.py will not overwrite existing
directories in the toolchain directory. To force a refresh of
components in the toolchain directory, a user should delete the
cached copy in the toolchain directory and execute
bootstrap_toolchain.py again. This behavior allows users to
develop locally without network connection once the toolchain
has been bootstrapped.
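The "never overwrite an existing directory" behavior can be sketched roughly
as follows. This is an illustration under assumptions, not the code in
bootstrap_toolchain.py: bootstrap_component and the `download` callable are
hypothetical.

```python
import os
import tempfile


def bootstrap_component(toolchain_dir, name, download):
    """Fetch `name` into `toolchain_dir` unless a cached copy already exists.

    `download` is a hypothetical callable that populates the destination
    directory; it stands in for the real download-and-unpack step.
    Returns True if a download happened, False if the cached copy was kept.
    """
    dest = os.path.join(toolchain_dir, name)
    if os.path.isdir(dest):
        # An existing directory is never overwritten, so no network is needed.
        return False
    os.makedirs(dest)
    download(dest)
    return True


if __name__ == "__main__":
    root = tempfile.mkdtemp()

    def fake_download(dest):
        # Stand-in for the real download: just drop a marker file.
        open(os.path.join(dest, "MARKER"), "w").close()

    print(bootstrap_component(root, "hadoop", fake_download))  # downloads
    print(bootstrap_component(root, "hadoop", fake_download))  # keeps cache
```

Deleting the component's directory and calling the function again triggers a
fresh download, which matches the refresh procedure described above.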
Change-Id: I16fa79db0005554cc0a116e74775647ba99f8dda
Reviewed-on: http://gerrit.cloudera.org:8080/3333
Reviewed-by: Michael Ho <kwho@cloudera.com>
Tested-by: Internal Jenkins
Previously boost related symbols (and others) would get defined in the
Kudu client stub with a non-functional implementation. If these
implementations were used at runtime they would crash Impala.
Change-Id: I54292095692ce38c255a8df48cf8f3f655d797b0
Reviewed-on: http://gerrit.cloudera.org:8080/2864
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
This change adds Breakpad crash handling support to catalogd, impalad,
and statestored. The destination folder for minidump files can be
configured via the 'minidump_path' command line flag. Leaving it empty
will disable minidump generation. The daemons will rotate minidump
files. The number of files to keep can be configured with the
'max_minidumps' command line flag.
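The rotation policy can be illustrated with a short sketch. The daemons
implement this in C++; the version below is only an approximation under
stated assumptions: rotate_minidumps is a hypothetical helper, minidump files
are assumed to end in ".dmp", and age is assumed to be judged by modification
time.

```python
import os
import tempfile


def rotate_minidumps(minidump_path, max_minidumps):
    """Delete the oldest *.dmp files so at most `max_minidumps` remain.

    An empty `minidump_path` means minidumps are disabled, so nothing is done.
    Returns the names of the removed files, oldest first.
    """
    if not minidump_path:
        return []
    dumps = [f for f in os.listdir(minidump_path) if f.endswith(".dmp")]
    # Oldest first; the file name breaks ties between equal timestamps.
    dumps.sort(key=lambda f: (os.path.getmtime(os.path.join(minidump_path, f)), f))
    excess = dumps[:max(0, len(dumps) - max_minidumps)]
    for name in excess:
        os.remove(os.path.join(minidump_path, name))
    return excess


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    for i in range(5):
        path = os.path.join(d, "dump%d.dmp" % i)
        open(path, "w").close()
        os.utime(path, (1000 + i, 1000 + i))  # make the ordering explicit
    print(rotate_minidumps(d, 2))
    print(sorted(os.listdir(d)))
```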
Change-Id: I7a37a38488716ffe34296f3490ae291bbb7228d6
Reviewed-on: http://gerrit.cloudera.org:8080/2028
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Internal Jenkins
This is the same as the previous LLVM upgrade patch, except we've
removed the libtinfo dependency, so we assume we're building against an
LLVM that doesn't require that.
This requires various changes for Impala to be fully functional with the
new version of LLVM.
The original JIT was removed from LLVM, so we need to switch to the new
MCJIT API and implementation.
MCJIT only supports module-at-a-time compilation, so the module must
be finalised before any compilation happens. We didn't depend on the
old behaviour deeply, but various small fixes were required.
MCJIT requires that every IR module has a name.
We relied on the old JIT's workaround for the __dso_handle symbol,
which we have to emulate for MCJIT with a custom memory manager
until we can get rid of global initialisers in cross-compiled code.
LLVM made a number of incompatible API changes and reorganised headers.
Clang took over responsibility for padding structs by marking structs
as packed and inserting bytes so that members are aligned correctly
(previously it relied on LLVM aligning struct members based on the
target's alignment rules). This means Impala also needs to manually
pad its structs since clang-emitted structs look to LLVM like they have
do not need to be inlined.
Our inlining pass would require some modification to work and is
redundant with LLVM's inlining pass, so was removed along with the
unused subexpr elimination pass.
There were various issues with __builtin_add_overflow and
__builtin_mul_overflow that are newly available in LLVM 3.8.
First, LLVM emitted a call to a function in libclang_rt, which
we don't link in and has symbols that conflict with
the gcc runtime library. Second, the performance actually regressed
by using the builtins (I tested this manually by copying across the
definition of the required function).
Change-Id: I60b18a40a2df3f1adf326721f0df2a639d53a7c2
Reviewed-on: http://gerrit.cloudera.org:8080/2866
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
Reverting until we can sort out libtinfo build dependencies on various
OSes.
This reverts commit 1e77048be06aeb511e3483193db4257c8dbc7cf3.
Change-Id: I281b0b040941d9e4e6a5199c5d228471ad8c031c
Reviewed-on: http://gerrit.cloudera.org:8080/2857
Tested-by: Internal Jenkins
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
This requires various changes for Impala to be fully functional with the
new version of LLVM.
The original JIT was removed from LLVM, so we need to switch to the new
MCJIT API and implementation.
MCJIT only supports module-at-a-time compilation, so the module must
be finalised before any compilation happens. We didn't depend on the
old behaviour deeply, but various small fixes were required.
MCJIT requires that every IR module has a name.
We relied on the old JIT's workaround for the __dso_handle symbol,
which we have to emulate for MCJIT with a custom memory manager
until we can get rid of global initialisers in cross-compiled code.
LLVM made a number of incompatible API changes and reorganised headers.
Clang took over responsibility for padding structs by marking structs
as packed and inserting bytes so that members are aligned correctly
(previously it relied on LLVM aligning struct members based on the
target's alignment rules). This means Impala also needs to manually
pad its structs since clang-emitted structs look to LLVM like they have
do not need to be inlined.
Our inlining pass would require some modification to work and is
redundant with LLVM's inlining pass, so was removed along with the
unused subexpr elimination pass.
LLVM now depends on another system library libtinfo, so we use
llvm-config to get the required system libs directly.
There were various issues with __builtin_add_overflow and
__builtin_mul_overflow that are newly available in LLVM 3.8.
First, LLVM emitted a call to a function in libclang_rt, which
we don't link in and has symbols that conflict with
the gcc runtime library. Second, the performance actually regressed
by using the builtins (I tested this manually by copying across the
definition of the required function).
Change-Id: I17d7afd05ad3b472a0bfe035bfc3daada5597b2d
Reviewed-on: http://gerrit.cloudera.org:8080/2486
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
The directory structure of the newer Kudu toolchain artifacts has
changed. Now the root directory is split into /release and /debug. A few
little updates are needed to the build and service scripts.
Since the toolchain no longer provides stubs for platforms that Kudu
doesn't support, the stubs need to be generated. This will be done as
part of the toolchain bootstrapping.
Also this upgrades Kudu to 0.8 RC1.
Developers will need to run bin/create-test-configuration.sh after
pulling in this change. Otherwise the Kudu service will fail to start.
Change-Id: I625903bd92afece0ad819a96fc275d5812b5eb2a
Reviewed-on: http://gerrit.cloudera.org:8080/2720
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
If SKIP_TOOLCHAIN_BOOTSTRAP is set, toolchain bootstrap is skipped. This
means that even if you are running on a supported OS, your custom-built
toolchain artifacts will always be used.
Also use Ubuntu 14.04 toolchain artifacts for Ubuntu 15.10.
I have been using the artifacts locally for a while and it has been
working fine.
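The two behaviors above can be sketched together. This is an illustration
only, not the code in bootstrap_toolchain.py: ARTIFACT_RELEASE, its keys and
plan_bootstrap are hypothetical, and treating any non-empty value of
SKIP_TOOLCHAIN_BOOTSTRAP as "set" is an assumption.

```python
import os

# Hypothetical table: detected release -> release whose prebuilt artifacts
# are downloaded. Releases without their own artifacts reuse Ubuntu 14.04's.
ARTIFACT_RELEASE = {
    "ubuntu14.04": "ubuntu14.04",
    "ubuntu15.04": "ubuntu14.04",
    "ubuntu15.10": "ubuntu14.04",
}


def plan_bootstrap(release, env=None):
    """Return (action, artifact_release) for the detected `release`.

    action is "skip", "download" or "unsupported"; artifact_release is only
    set for "download".
    """
    env = os.environ if env is None else env
    # Assumption: any non-empty value counts as "set".
    if env.get("SKIP_TOOLCHAIN_BOOTSTRAP"):
        return ("skip", None)  # custom-built artifacts are used as they are
    target = ARTIFACT_RELEASE.get(release)
    if target is None:
        return ("unsupported", None)
    return ("download", target)


if __name__ == "__main__":
    print(plan_bootstrap("ubuntu15.10", env={}))
    print(plan_bootstrap("ubuntu15.10", env={"SKIP_TOOLCHAIN_BOOTSTRAP": "true"}))
```

Checking the environment variable before the release lookup is what lets a
custom toolchain win even on a supported OS.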
Change-Id: If3bae187cc8a829c693711482c0ec656e41b7bf2
Reviewed-on: http://gerrit.cloudera.org:8080/2665
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
The stubs in Impala broke during the merge commit. This commit removes
the stubs in hopes of improving robustness of the build. The original
problem (Kudu clients are only available for some OSs) is now addressed
by moving the stubbing into a dummy Kudu client. The dummy client only
allows linking to succeed; if any client method is called, Impala will
crash. Before calling any such method, Kudu availability must be
checked.
Change-Id: I4bf1c964faf21722137adc4f7ba7f78654f0f712
Reviewed-on: http://gerrit.cloudera.org:8080/2585
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
With this commit the bootstrap_toolchain.py script can work on Ubuntu
15.04 systems by using the 14.04 prebuilt artifacts.
Change-Id: Ie61576cb3dc350420cfd327d85cdcd028dd0032c
Reviewed-on: http://gerrit.cloudera.org:8080/2283
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Internal Jenkins
There were a few review items pointed out on the review-only
version of the final impala-kudu merge. Since that patch was
a purely mechanical patch, those are addressed here.
Change-Id: Ibc4b30180a8f23394c7afc32b32668b05f142eff
Reviewed-on: http://gerrit.cloudera.org:8080/2545
Reviewed-by: David Ribeiro Alves <david.alves@cloudera.com>
Tested-by: Internal Jenkins
1) Add Ubuntu 12 to the unsupported OSs list.
2) Update Kudu sink stub.
3) Don't try to download Kudu if it isn't supported.
Change-Id: I6412ea0c79c9f2a2e3285b532372076ca437400d
Reviewed-on: http://gerrit.cloudera.org:8080/2547
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Casey Ching <casey@cloudera.com>
This is for review purposes only. This patch will be merged with David's
big merge patch.
Changes:
1) Make Kudu compilation dependent on the OS since not all OSs support
Kudu.
2) Only run Kudu related tests when Kudu is supported (see #1).
3) Look for Kudu locally, but in a different location. To use a local
build of Kudu, set KUDU_BUILD_DIR to the path Kudu was built in and
set KUDU_CLIENT_DIR to the path Kudu was installed in.
Example:
git clone https://github.com/cloudera/kudu.git
...build 3rd party etc...
mkdir -p $KUDU_BUILD_DIR
cd $KUDU_BUILD_DIR
cmake <path to Kudu source dir>
make
DESTDIR=$KUDU_CLIENT_DIR make install
4) Look for Kudu in the toolchain if not using a local Kudu build.
5) Add Kudu service startup scripts. The Kudu in the toolchain is
actually a parcel that has been renamed (the contents were not
modified in any way), which means the Kudu service binaries are there.
Those binaries are now used to run the Kudu service.
Change-Id: I3db88cbd27f2ea2394f011bc8d1face37411ed58
Make bootstrap_toolchain.py fall back to checking the existence of
directories if the platform is not supported. This is the desired
behaviour if a custom toolchain build is used: we want to be sure the
packages exist and report an error otherwise, but we don't want to fail
the build.
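A rough sketch of the fallback check, assuming a hypothetical helper
check_custom_toolchain and placeholder package names (not the code in
bootstrap_toolchain.py):

```python
import logging
import os
import tempfile


def check_custom_toolchain(toolchain_dir, packages):
    """Report missing package directories without failing.

    Logs one error per missing directory and returns the missing names, so
    the caller can carry on with the build instead of aborting.
    """
    missing = [
        p for p in packages
        if not os.path.isdir(os.path.join(toolchain_dir, p))
    ]
    for p in missing:
        logging.error("Missing toolchain package directory: %s",
                      os.path.join(toolchain_dir, p))
    return missing


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "pkg-a"))  # placeholder package name
    print(check_custom_toolchain(root, ["pkg-a", "pkg-b"]))
```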
Change-Id: I1232653f2fc3e889aa8bdf436035ab6eb0c17411
Reviewed-on: http://gerrit.cloudera.org:8080/2251
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
This patch depends on the llvm-3.3-no-asserts-p1 build being added to
the native toolchain. I tested by building with my local modified
toolchain. A previous commit also disabled automatic bootstrapping of
the toolchain on build machines, so to download the new module
automatically, I changed the build scripts to always bootstrap, but to
skip downloading packages that were already present.
The logic is changed so that LLVM without assertions is always used,
except for debug builds which link against the libraries with
assertions built in.
We want to always use the same clang to generate the IR, so that the IR
we are testing in debug mode is the same as in release mode. This
requires separating the LLVM binaries search from the LLVM libraries
search. It also requires the root CMakeLists.txt to know about debug
versus release builds so it can decide which library to use, so I
refactored some of that logic too.
This change fixes the lock contention problem of IMPALA-2980 (since a
global lock is acquired only to check an assertion) and generally
improves codegen times. On a simple inner join query I saw
OptimizationTime reduced from ~240ms to ~150ms and PrepareTime reduced
from ~120ms to ~90ms.
Change-Id: I4977815a42c66a74e34ebb6e5cf3931f51ed461a
Reviewed-on: http://gerrit.cloudera.org:8080/2231
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
Until now, we statically linked SASL into Impala and simply hoped
that it was ABI compatible with the version on the machine where Impala
is installed. However, SASL is also a security library, so Impala
should not statically link it but rather implicitly depend on it.
The only complication of this patch is that there was an API breaking
change in SASL that this patch has to deal with as it compiles on
different platforms.
In addition, this patch changes the behavior and default of the
--sasl_path command line option. If the option is not set, we rely on
the automatic resolution of the SASL plugin path by the dynamic library
and only if the option is set, we will override it with the custom value.
I tested this patch on Ubuntu and CentOS 6 with the two different
versions and everything worked fine.
Change-Id: I0523b47f15a63ac385e9036c5b76d43a55bb6771
Reviewed-on: http://gerrit.cloudera.org:8080/1692
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
This patch adds logic to automatically download the pre-built toolchain
packages to the local developer machine using the bootstrap_toolchain.py
script in case they are not present. There is no manual user
intervention necessary to initiate the download process.
If desired the script can always be called to re-download the
dependencies from a correctly sourced Impala environment.
Change-Id: I636160efeadfac4b5c1feb478da5ae5da0c9fd00
Reviewed-on: http://gerrit.cloudera.org:8080/1429
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins