Commit Graph

23 Commits

Author SHA1 Message Date
Michael Smith
c3dc7f9667 IMPALA-13147: Limit concurrency of link jobs
Configure separate compile and link pools for ninja. Configures link
parallelism based on expected memory use, which can be reduced by
setting IMPALA_MINIMAL_DEBUG_INFO=true or IMPALA_SPLIT_DEBUG_INFO=true.

Adds IMPALA_MAKE_CMD to simplify using the ninja build tool for all make
operations in scripts. Install ninja on Ubuntu. Adds a '-make' option to
buildall.sh to force using 'make'.

Adds MOLD_JOBS=1 to avoid overloading the system when trying 'mold' and
linking test binaries. However 'mold' is not selected as the default
due to test failures around SASL/GSSAPI (see IMPALA-14527).

Switches bin/jenkins/all-tests.sh to use ninja and removes the guard in
bootstrap_development.sh limiting IMPALA_BUILD_THREADS as it's no longer
needed with ninja.

SKIP_BE_TEST_PATTERN in run-backend-tests is unused (only used with
TARGET_FILESYSTEM=local) so I don't attempt to make it work with ninja.

Tested with local 'IMPALA_SPLIT_DEBUG_INFO=true buildall.sh -skiptests'
with default (make) and IMPALA_MAKE_CMD=ninja.

Change-Id: I0952dc19ace5c9c42bed0d2ffb61499656c0a2db
Reviewed-on: http://gerrit.cloudera.org:8080/23572
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Reviewed-by: Pranav Lodha <pranav.lodha@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-12-15 21:43:07 +00:00
Michael Smith
d217b9ecc6 IMPALA-14450: Simplify Java version selection
Removes IMPALA_JAVA_HOME_OVERRIDE and updates version selection. In
order of priority
1. If IMPALA_JDK_VERSION is set, use the OS JDK version from a known
   location. This is primarily used when also installing the JDK as part
   of automated builds.
2. If JAVA_HOME is set, use it.
3. Look for the system default JDK.

The IMPALA_JDK_VERSION variable is no longer modified to avoid issues
when sourcing impala-config.sh multiple times. JAVA_HOME will be
modified if IMPALA_JDK_VERSION is set; both must be unset to restore
using the system default Java.

If switching between JDKs, now prefer setting JAVA_HOME. If relying on
system Java, unset JAVA_HOME after e.g. update-java-alternatives.

The detected Java version is set in IMPALA_JAVA_TARGET, which is used to
add Java 9+ options and configure the Java compilation target.

Eliminates IMPALA_JDK_VERSION_NUM as it's value was always identical to
IMPALA_JAVA_TARGET.

Stops printing from impala-config-java.sh. It made the output from
impala-config.sh look strange, and the decisions can all be clearly
determined from impala-config.sh printed variables later or the packages
installed in bootstrap_system.sh.

Fixes JAVA_HOME in bootstrap_build.sh on ARM64 systems.

Change-Id: I68435ca69522f8310221a0f3050f13d86568b9da
Reviewed-on: http://gerrit.cloudera.org:8080/23434
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-09-19 01:51:47 +00:00
Laszlo Gaal
89d2b23509 IMPALA-14139: Enable Impala builds on Ubuntu 24.04
Update the following elements of the Impala build environment to enable
builds on Ubuntu 24.04:

- Recognize and handle (where necessary) Ubuntu 24.04 in various
  bootstrap scripts (bootstrap_system.sh, bootstrap_toolchain.py, etc.)
- Bump IMPALA_TOOLCHAIN_ID to an official toolchain build that contains
  Ubuntu 24.04-specific binary packages
- Bump binutils to 2.42, and
- Bump the GDB version to 12.1-p1, as required by the new toolchain
  version
- Update unique_ptr usage syntax in  be/src/util/webserver-test.cc to
  compensate for new GLIBC funtion prototypes:

System headers in Ubuntu 24.04 adopted attributes on several widely
used function prototypes. Such attributes are not considered to be part
of the function's signature during template evaluation, so GCC throws a
warning when such a function is passed as a template argument, which
breaks the build, as warnings are treated as errors.

webserver-test.cc uses pclose() as the deleter for a unique_ptr in a
utility function. This patch encapsulates pclose() and its attributes in
an explicit specialization for std::default_delete<>, "hiding" the
attributes inside a functor.

The particular solution was inspired by Anton-V-K's proposal in
https://gist.github.com/t-mat/5849549

This commit builds on an earlier patch for the same purpose by Michael
Smith: https://gerrit.cloudera.org/c/23058/

Change-Id: Ia4454b0c359dbf579e6ba2f9f9c44cfa3f1de0d2
Reviewed-on: http://gerrit.cloudera.org:8080/23384
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
2025-09-15 16:10:42 +00:00
Laszlo Gaal
ad7888898b IMPALA-13223: Fix bootstrap-build.sh for platforms without Python2
bin/bootstrap-build.sh did not distinguish between various version of
the Ubuntu platform, and attempted to install unversioned Python
packages (python-dev and python-setuptools) even on newer versions
that don't support Python 2 any longer (e.g. Ubuntu 22.04 and 24.04).
On older Ubuntu versions these packages are still useful, so at this
point it is not feasible just to drop them.

This patch makes these packages optional: they are added to the list of
packages to be installed only if they actually exist for the platform.

The patch also extends the package list with some basic packages that
are needed when bin/bootstrap_build.sh is run inside an Ubuntu 22.04
Docker container.

Tests: ran a compile-only build on Ubuntu 20.04 (still has Python 2) and
on Ubuntu 22.04 (does not support Python 2 any more).

Change-Id: I94ade35395afded4e130b79eab8c27c6171b50d6
Reviewed-on: http://gerrit.cloudera.org:8080/21800
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-08-21 22:03:52 +00:00
Laszlo Gaal
a53117214f IMPALA-14255: Install Java 17 in bootstrap_build.sh
This patch bumps the Java version installed during bootstrap_build.sh to
Java 17 to keep the precommit environment consistently on Java 17.

bin/bootstrap_build.sh is a simplified setup script used instead of
bootstrap_system.sh when setting up an Impala environment limited to
compilation only. It is used mainly during the initial, lightweight
precommit checks on jenkins.impala.io when a patchset is submitted for
review.

This setup script was not updated with the Java version change from Java
8 to Java 17, so it became out of synch with the general assumption of
building Impala 5.x versions for Java 17.

This patch also removes a special case reserved for Ubuntu 14.04, which
is now not supported by Impala.

Tested automatically by submitting the patch for review.

Change-Id: I796c6004e13aeca536b339fee765f79f39cc2ea1
Reviewed-on: http://gerrit.cloudera.org:8080/23201
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2025-07-23 21:36:56 +00:00
Joe McDonnell
b21e0f031b IMPALA-14125: Avoid downloading maven from archive.apache.org
Some builds failed because archive.apache.org was unreachable.
This gets the maven download from the native toolchain s3 bucket
instead.

Change-Id: Ib1eec38f12209abf88c1cf2976db47d0ca04ab5b
Reviewed-on: http://gerrit.cloudera.org:8080/22973
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-06-05 23:07:30 +00:00
Joe McDonnell
8dd935a98f IMPALA-13558: Workaround Python 2 tarfile issue by patching tarfile.py
Ubuntu 20.04 introduced a bug in their Python 2.7's tarfile
functionality with 2.7.18-1~20.04.5. See
https://bugs.launchpad.net/ubuntu/+source/python2.7/+bug/2089071
This breaks the Impala build with a message like
"tarfile.ReadError: invalid header".

The bug has an attached tarfile.py with a workaround for the
issue. This change bootstrap_system.sh and boostrap_build.sh to
detect the bad tarfile.py and replace it with the patched tarfile.py.
Since this is comparing the hash, this will become a no-op once
Ubuntu fixes the issue.

Testing:
 - Ran a build on Ubuntu 20.04

Change-Id: I1d0691611cf53ae6dd1099b97f0aa15b450e0996
Reviewed-on: http://gerrit.cloudera.org:8080/22088
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-11-21 07:55:48 +00:00
jasonmfehr
af5f3acd67 IMPALA-13300: Upgrade Maven to 3.9.8
Maven version 3.9.7 consumed an upgraded version of the resolver
plugin that contains a fix around file locking. Issues with locking
files are seen occasionally on builds.

This patch consumes Maven 3.9.8 since it is the latest version
available at this time.

Testing was performed by running only the download code in a Redhat 8
docker container.

Change-Id: I509dd94799b99bf637a583eadc2905bc32a87c87
Reviewed-on: http://gerrit.cloudera.org:8080/21674
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Andrew Sherman <asherman@cloudera.com>
2024-08-14 17:41:08 +00:00
Laszlo Gaal
d83b48cf72 IMPALA-13014: Upgrade Maven to 3.9.6
IMPALA-12212 upgraded Maven to 3.9.2 to gain access to the parallel
dependency resolver in the 3.9.x line. The Maven project has published
several new releases since 3.9.2, fixing various issues with the new
resolver, and also fixing problems with concurrent access to the
local Maven cache.

Pick up the latest version to gain access to these new fixes.

Change-Id: I726618d084f4f0737f5b876879a90c17b0c3777c
Reviewed-on: http://gerrit.cloudera.org:8080/21332
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-07-14 03:19:52 +00:00
Laszlo Gaal
ee069687fc IMPALA-12212: Bump Maven to 3.9.2, pull dependencies in parallel
Maven 3.9.x offers a new dependency resolver, HttpClient, which allows
downloading project dependencies in parallel.

This patch bumps the Maven version installed by bootstrap_system.sh to
v3.9.2, and adds the flags enabling the new resolver to download
dependencies (including POM files) in parallel. Parallelism is set to
10 threads.

The flags are added to a project-specific Maven setting file in the
newly created java/.mvn directory. The settings file is added to the
RAT exclusion list in bin/rat_exclude_files.txt.

The --show-version flag is added for debugging purposes.

The same flags are added to the JAMM subproject as well.

The new resolver in Maven 3.9 has also changed the warning message
emitted for missing component checksums, so the new warning string
is added to the filter in bin/mvn-quiet.sh
Unfortunately Maven 3.9 has also changed the way it responds to missing
checksum files: the resolver now emits a stack trace when checksums
cannot be determined, and missing checksums are not explicitly ignored.

Detailed documentation for the new Maven resolver in Maven 3.9.0+ is
located at:
https://maven.apache.org/guides/mini/guide-resolver-transport.html
resolver configuration reference:
https://maven.apache.org/resolver/configuration.html

Tests:
- verified in a core-mode test run with Maven 3.9.2 installed
- verified in a local build using an earlier version of Maven
  to verify that the new default setting does not cause regressions
  with the old dependency resolver.

Change-Id: I75d05215effc724f5bd471646fb352f37443e185
Reviewed-on: http://gerrit.cloudera.org:8080/20142
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
2023-07-24 18:50:34 +00:00
Joe McDonnell
234d641d7b IMPALA-11961/IMPALA-12207: Add Redhat 9 / Ubuntu 22 support
This adds support for Redhat 9 / Ubuntu 22. It updates
to a newer toolchain that has those builds, and it adds
supporting code in bootstrap_system.sh.

Redhat 9 and Ubuntu 22 use python = python3, which requires
various changes to build scripts and tests. Ubuntu 22 uses
Python 3.10, which deprecates certain ssl.PROTOCOL_TLS, so
this adapts test_client_ssl.py to that change until it
can be fully addressed in IMPALA-12219.

Various OpenSSL methods have been deprecated. As a workaround
until these can be addressed properly, this specifies
-Wno-deprecated-declarations. This can be removed once the
code is adapted to the non-deprecated APIs in IMPALA-12226.

Impala crashes with tcmalloc errors unless we update to a newer
gperftools, so this moves to gperftools 2.10. gperftools changed
the default for tcmalloc.aggressive_memory_decommit to off, so
this adapts our code to set it for backend tests. The gperftools
upgrade does not show any performance regression:

+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / none / none | 3.08    | -0.64%     | 2.20       | -0.37%         |
+----------+-----------------------+---------+------------+------------+----------------+

With newer Python versions, the impala-virtualenv command
fails to create a Python 3 virtualenv. This switches to
using Python 3's builtin venv command for Python >=3.6.

Kudu needed a newer version and LLVM required a couple patches.

Testing:
 - Ran a core job on Ubuntu 22 and Redhat 9. The tests run
   to completion without crashing. There are test failures
   that will be addressed in follow-up JIRAs.
 - Ran dockerised tests on Ubuntu 22.
 - Ran dockerised tests on Ubuntu 20 and Rocky 8.5.

Change-Id: If1fcdb2f8c635ecd6dc7a8a1db81f5f389c78b86
Reviewed-on: http://gerrit.cloudera.org:8080/20073
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-06-21 05:21:01 +00:00
Laszlo Gaal
1e7aa8e097 IMPALA-12199: Point Maven 3.5.4 download location to Apache Archives
bootstrap_build.sh still used the old (non-archive) download location
to fetch Impala's preferred Maven version, v3.5.4. Unfortunately this
location holds only the last few releases, and with a recent release
this version was pushed out and removed from this location.

This patch realigns bootstrap_build.sh with bootstrap_system.sh, which
already has the long-term stable location for Maven 3.5.4, pointing to
the Apache Archive download repo.

Change-Id: Ied6fc226e6634a21b34c0d897bb410ce86b199af
Reviewed-on: http://gerrit.cloudera.org:8080/20029
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-06-09 22:10:42 +00:00
stiga-huang
45ea094fa2 IMPALA-11716: Bump up gcovr version to 4.2
IMPALA-9999 upgrades to GCC version to 10.4 which generates new gcov
format that the current gcovr version (3.4) can't parse. This patch
upgrades gcovr to the latest Python2-compatible version (4.2). Also adds
Jinja2, MarkupSafe and lxml as the required dependent packages. The
development packages of libxml2 and libxslt are also added in
bootstrap_system.sh and bootstrap_build.sh.

This patch also fixes a failure due to the gcov executable not found in
PATH.

Tests:
 - Verified builds on Ubuntu 16.04 and CentOS 7.9
 - Verified coverage_helper.sh work after this patch

Change-Id: I9458fa0dc97d69f88a4e8a3313dc9440215dfd52
Reviewed-on: http://gerrit.cloudera.org:8080/19226
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-11-11 00:09:05 +00:00
Joe McDonnell
ba4cb95b62 IMPALA-11257: Fix CMake warnings for module names and cmake_minimum_required
This fixes a few different CMake warnings:
1. This removes cmake_minimum_required invocations except for the
   top-most CMakeLists.txt. This eliminates the warnings like this:
     Compatibility with CMake < 2.8.12 will be removed from a future version of
     CMake.

     Update the VERSION argument <min> value or use a ...<max> suffix to tell
     CMake that the project does not need compatibility with older versions.
   Moving to a later version also required setting CMAKE_ENABLE_EXPORTS
   to continue exporting symbols.
2. This modifies the module names so that they match the corresponding
   module names from Find*.cmake. This is mostly dealing with case
   differences. This address warnings like:
     The package name passed to `find_package_handle_standard_args` (PROTOBUF)
     does not match the name of the calling package (Protobuf).  This can lead
     to problems in calling code that expects `find_package` result variables
     (e.g., `_FOUND`) to follow a certain pattern.
   This fixed the detection logic for KerberosPrograms, and so it required
   adding more Kerberos packages to bin/bootstrap_build.sh.
3. This adds a missing .cc suffix. This addresses the following warning:
     CMake Warning (dev) at be/src/util/CMakeLists.txt:141 (add_library):
     Policy CMP0115 is not set: Source file extensions must be explicit.  Run
     "cmake --help-policy CMP0115" for policy details.  Use the cmake_policy
     command to set the policy and suppress this warning.

These fixes mostly match how these warnings were handled in
Apache Kudu.

Testing:
 - Ran GVO

Change-Id: I2a97dd07cdd0831e90882a2035415ac71d670147
Reviewed-on: http://gerrit.cloudera.org:8080/18444
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-08-11 05:48:36 +00:00
Michael Smith
64b324ac40 IMPALA-11389: Include Python 3 eggs in tarball
Build Python 3 eggs for the shell tarball so it works with both Python 2
and Python 3. The impala-shell script selects eggs based on the
available Python version.

Inlines thrift for impala-shell so we can easily build Python 2 and
Python 3 versions, consistent with other libraries. The impala-shell
version should always be at least as new as IMPALA_THRIFT_PY_VERSION.

Thrift 0.13.0+ wraps all exceptions during TSocket read/write operations
in TTransportException. Specifically socket.error that we got as raw
exceptions are now wrapped. Unwraps them before raising to preserve
prior behavior.

A specific Python version can be selected with IMPALA_PYTHON_EXECUTABLE;
otherwise it will use 'python', and if unavailable try 'python3'.

Adds tests for impala-shell tarball with Python 3.

Change-Id: I94f86de9e2a6303151c2f0e6454b5f629cbc9444
Reviewed-on: http://gerrit.cloudera.org:8080/18653
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-07-14 23:52:04 +00:00
Michael Smith
181fd94068 IMPALA-8373: Test impala-shell with python3
Sets up a python3 virtualenv, installs impala-shell into it, and runs
tests.

Change-Id: I8e123aecd53a7ded44a7da7eb8c8b853cebbfc56
Reviewed-on: http://gerrit.cloudera.org:8080/18588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2022-06-13 17:13:42 +00:00
Joe McDonnell
fb282852ef IMPALA-9107 (part 2): Add script to use the m2 archive tarball
This adds a script to find an appropriate m2 archive
tarball, download it, and use it to prepopulate the
~/.m2 directory.

The script uses the JSON interface for Jenkins to search through
the all-build-options-ub1604 builds on jenkins.impala.io to
find one that:
1. Is building the "master" branch
2. Has the m2_archive.tar.gz
Then, it downloads the m2 archive and uses it to populate ~/.m2.
It does not overwrite or remove any files already in ~/.m2.

The build scripts that call populate_m2_directory.py do not
rely on the script succeeding. They will continue even if
the script fails.

This also modifies the build-all-flag-combinations.sh script
to only build the m2 archive if the GENERATE_M2_ARCHIVE
environment variable is true. GENERATE_M2_ARCHIVE=true will
clear out the ~/.m2 directory to build an accurate m2 archive.
Precommit jobs will use GENERATE_M2_ARCHIVE=false, which
will allow them to use the m2 archive to speed up the build.

Testing:
 - Ran gerrify-verify-dryrun
 - Tested locally

Change-Id: I5065658d8c0514550927161855b0943fa7b3a402
Reviewed-on: http://gerrit.cloudera.org:8080/15735
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-06-12 01:54:16 +00:00
Laszlo Gaal
556e63f72d IMPALA-9390: Change ant and maven download URLs to valid locations
Looks like there was a change in the Apache download infrastructure,
and the URLs used by the bootstrap scripts to download ant and maven
are no longer valid.

Manually traversing the download pages yielded the new locations,
https://downloads.apache.org/..., which was used to update
the bootstrap scripts.

Change-Id: I299bdd9cbe999ff4d483c022fa69ec88b37612f6
Reviewed-on: http://gerrit.cloudera.org:8080/15229
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
2020-02-20 22:16:35 +00:00
Todd Lipcon
327b938214 IMPALA-8516. Update maven for Jenkins builds
This changes Maven to download and install on both Ubuntu and Redhat for
the Jenkins builds (previously it was only installed on Redhat).

The version number is kept at 3.5.4 even though a newer release is
available upstream. The new release fails to build Impala due to an
XML-parsing bug causing it to fail to resolve the parquet pom [1]

This should hopefully address some of the hang issues we've seen
previously with the older version of Maven that shipped with the version
of Ubuntu we have on Ubuntu 16.04.

[1] https://github.com/codehaus-plexus/plexus-utils/issues/65

Change-Id: I793409eb4e9f4533b75bfe089a497c0ea62ad1ff
Reviewed-on: http://gerrit.cloudera.org:8080/13268
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Todd Lipcon <todd@apache.org>
2019-05-10 13:37:16 +00:00
Jim Apple
2187d36b36 IMPALA-6045: Make build scripts more friendly to Ubuntu 16.04
This commit bundles two changes.

The first extracts from bootstrap_development.sh the commands to
prepare a system to build-and-test without actually doing so. This
enables custom build commands that might not load the test data, for
instance.

The second changes bootstrap_build.sh to work with Ubuntu 16.04. It
should still work with Ubuntu 14.04, but I don't anticipate that being
part of the Jenkins pre-merge job anymore, so I removed that from the
comment at the top of the file explaining what it does.

Change-Id: I8196a2a87bce5893a349a1b290c3f3d04fd80317
Reviewed-on: http://gerrit.cloudera.org:8080/8262
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Jim Apple <jbapple-impala@apache.org>
2017-10-12 20:54:36 +00:00
Henry Robinson
f51c4435c9 IMPALA-4669: [SECURITY] Add security library to build
* Minor compilation fix
* Add krb5 as a non-toolchain dependency
* Handle legacy versions of libkrb5.so by providing implementation of
  krb5_is_config_principal().
* Link against openssl from the toolchain if 1.0.0 or higher not found
  on build machine.
* Update LICENSE.txt and NOTICE.txt re: OpenSSL code in x509_check_host.{h,c}.

Change-Id: I4f327810066bee7f3ac107b0295480fb9ed45e14
Reviewed-on: http://gerrit.cloudera.org:8080/5717
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
2017-08-15 00:47:26 +00:00
Sailesh Mukil
50bd015f2d IMPALA-5333: Add support for Impala to work with ADLS
This patch leverages the AdlFileSystem in Hadoop to allow
Impala to talk to the Azure Data Lake Store. This patch has
functional changes as well as adds test infrastructure for
testing Impala over ADLS.

We do not support ACLs on ADLS since the Hadoop ADLS
connector does not integrate ADLS ACLs with Hadoop users/groups.

For testing, we use the azure-data-lake-store-python client
from Microsoft. This client seems to have some consistency
issues. For example, a drop table through Impala will delete
the files in ADLS, however, listing that directory through
the python client immediately after the drop, will still show
the files. This behavior is unexpected since ADLS claims to be
strongly consistent. Some tests have been skipped due to this
limitation with the tag SkipIfADLS.slow_client. Tracked by
IMPALA-5335.

The azure-data-lake-store-python client also only works on CentOS 6.6
and over, so the python dependencies for Azure will not be downloaded
when the TARGET_FILESYSTEM is not "adls". While running ADLS tests,
the expectation will be that it runs on a machine that is at least
running CentOS 6.6.
Note: This is only a test limitation, not a functional one. Clusters
with older OSes like CentOS 6.4 will still work with ADLS.

Added another dependency to bootstrap_build.sh for the ADLS Python
client.

Testing: Ran core tests with and without TARGET_FILESYSTEM as
'adls' to make sure that all tests pass and that nothing breaks.

Change-Id: Ic56b9988b32a330443f24c44f9cb2c80842f7542
Reviewed-on: http://gerrit.cloudera.org:8080/6910
Tested-by: Impala Public Jenkins
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
2017-05-25 19:35:24 +00:00
Jim Apple
26805809f7 IMPALA-4512: Add a script that builds Impala on stock Ubuntu 14.04.
This is a simpler alternative to bootstrap_development.sh - it
acquires enough dependencies to build, but does not attempt to load
the test data or even build the tests. This is sometimes a lightweight
testing method used by Apache PPMC members who are voting on a release
of an incubating project.

Change-Id: If34e398052a61dfda9825b1cf3a918eb61736048
Reviewed-on: http://gerrit.cloudera.org:8080/5154
Reviewed-by: Jim Apple <jbapple-impala@apache.org>
Tested-by: Impala Public Jenkins
2016-11-29 22:10:58 +00:00