9 Commits

Author SHA1 Message Date
Joe McDonnell
13fbe510c0 IMPALA-9838: Switch to GCC 7.5.0
This upgrades GCC and libstdc++ to version 7.5.0. There
have been ABI changes since 4.9.2, so this means that
the native-toolchain produced with the new compiler is
not interoperable with one produced by the old compiler.
To allow that transition, IMPALA_TOOLCHAIN_PACKAGES_HOME
is now a subdirectory of IMPALA_TOOLCHAIN
(toolchain-packages-gcc${IMPALA_GCC_VERSION}) to distinguish
it from the old packages.

Some Python packages in the impala-python virtualenv are
compiled using the toolchain GCC and now use the new ABI.
This leads to two changes:
1. When constructing the LD_LIBRARY_PATH for impala-python,
we include the GCC libstdc++ libraries. Otherwise, certain
Python packages that use C++ fail on older OSes like Centos 7.
This fixes IMPALA-9804.
2. Since developers work on various branches, this changes
the virtualenv's directory location to a directory with
the GCC version in the name. This allows the virtualenv
built with GCC 7 to coexist with the current virtualenv
built with GCC 4.9.2. The location for the old virtualenv is
${IMPALA_HOME}/infra/python/env. The new location is
${IMPALA_HOME}/infra/python/env-gcc${IMPALA_GCC_VERSION}. This
required updating several impala-python scripts.

There are various odds-and-ends related to the transition:
1. Due to the small string optimization, the size of std::string
changed, which means that various data structures also changed
in size. This required updating some static asserts.
2. There is a bug in clang-tidy that reports a use-after-free
for some code using std::shared_ptr. Clang is not modeling
the shared_ptr correctly, so it is a false-positive. As a workaround,
this disables the clang-analyzer-cplusplus.NewDelete diagnostic.
3. Various small compilation fixes (includes, etc).

Performance testing:
 - Ran single-node performance tests on TPC-H for the following
   configurations:
    - TPC-H Parquet scale 30 with normal configurations
    - TPC-H Parquet scale 30 with codegen disabled
    - TPC-H Kudu scale 10
   None found any significant regressions. Full results are
   posted on the JIRA.
 - Ran single-node performance tests on targeted-perf scale 10.
   No significant regressions.
 - The size of binaries (impalad, etc) is slightly smaller with the new GCC:
   GCC 4.9.2 release impalad binary: 545664
   GCC 7.5.0 release impalad binary: 539900
 - Compilation in DEBUG mode is roughly 15-25% faster

Functional testing:
 - Ran core jobs, exhaustive release jobs, UBSAN

Change-Id: Ia0beb2b618ba669c9699f8dbc0c52d1203d004e4
Reviewed-on: http://gerrit.cloudera.org:8080/16045
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-06-15 23:42:12 +00:00
Tim Armstrong
c43c03c5ee IMPALA-3926: part 2: avoid setting LD_LIBRARY_PATH
This removes LD_LIBRARY_PATH and LD_PRELOAD from the
developer's shell and cleans it up. With the preceding
change, toolchain utilities like clang can be run without
a special LD_LIBRARY_PATH.

This fixes a bug where libjvm.so was registered as a
static instead of a shared library, which adds it to the
RUNPATH variable in the binary, which provides a default
search location that can be overriden by LD_LIBRARY_PATH.

Impala binaries don't have the rpath baked in for some
libraries, including Impala-lzo, libgcc and libstdc++.
, so we still need to set LD_LIBRARY_PATH when running
those. That is solved with wrapper scripts that sets
the environment variables only when invoking those
binaries, e.g. starting a daemon or running a backend
test. I added three scripts because there were 3 sets
of environment variables. The scripts are:
* run-binary.sh: just sets LD_LIBRARY_PATH
* run-jvm-binary.sh: sets LD_LIBRARY_PATH and CLASSPATH
* start-daemon.sh: sets LD_LIBRARY_PATH and CLASSPATH and
  kerberos-related environment variables.

The binaries, in almost all cases, work fine without
those tweaks, because libstdc++ and libgcc are picked
up along with libkuduclient.so from the toolchain (they
are in the same directory). I decided to leave good enough
alone here. run-binary.sh and friends can be used in
any remaining edge cases to run binaries.

An alternative to the 3 scripts would be to have an
uber-script that set all the variables, but I felt
that it was better to be specific about what
each binary needed. Cleaning the LD_LIBRARY_PATH
mess up has given me a distaste for scattershot
setting of environment variables. I am open to
revisiting this.

Testing:
* Ran tests on centos 7
* Manually tested that my dev env with
 LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu continued
 to work (for now). All ubuntu 16.04 and 18.04 dev
 envs that were set up with bootstrap_development.sh
 will be in this state.

Change-Id: I61c83e6cca6debb87a12135e58ee501244bc9603
Reviewed-on: http://gerrit.cloudera.org:8080/14494
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-05-07 08:50:44 +00:00
Tim Armstrong
c187601b0f Remove Python 2.4 workarounds in start-impala-cluster.py
The workarounds are:
* Trying to handle the json module not being present. That module was
  added in Python 2.6 and also isn't used in this file!
* Trying to handle import failure of ImpalaCluster. It appears that
  this is be a workaround for the script being run without the
  correct PYTHONPATH (set by set-pythonpath.sh) - see IMPALA-7989.
  This is fixed in this patch by sourcing it in
  bin/impala-python-common.sh

Testing:
Ran a build on CentOS 6 with Python 2.4 and confirmed that
it was able to start the cluster OK.

Reran a build that previously failed with the python import error.

Change-Id: I80680804e61fcca81ac2b76604ef8fad70b031f5
Reviewed-on: http://gerrit.cloudera.org:8080/12100
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-12-18 01:10:37 +00:00
David Knupp
6e5ec22b12 IMPALA-7399: Emit a junit xml report when trapping errors
This patch will cause a junitxml file to be emitted in the case of
errors in build scripts. Instead of simply echoing a message to the
console, we set up a trap function that also writes out to a
junit xml report that can be consumed by jenkins.impala.io.

Main things to pay attention to:

- New file that gets sourced by all bash scripts when trapping
  within bash scripts:

  https://gerrit.cloudera.org/c/11257/1/bin/report_build_error.sh

- Installation of the python lib into impala-python venv for use
  from within python files:

  https://gerrit.cloudera.org/c/11257/1/bin/impala-python-common.sh

- Change to the generate_junitxml.py file itself, for ease of
  https://gerrit.cloudera.org/c/11257/1/lib/python/impala_py_lib/jenkins/generate_junitxml.py

Most of the other changes are to source the new report_build_error.sh
script to set up the trap function.

Change-Id: Idd62045bb43357abc2b89a78afff499149d3c3fc
Reviewed-on: http://gerrit.cloudera.org:8080/11257
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-08-23 18:33:58 +00:00
Zoltan Ivanfi
a60ba6d274 IMPALA-4006: dangerous rm -rf statements in scripts
Quoted variable substitutions in rm -rf commands and in many other
places. This prevents disasters if those variables contain whitespace.

Redirected output of the cd commands to /dev/null. This prevents
polluting the target variable with the directory name when the CDPATH
environment variable is set.

Change-Id: I7503794180dee99eeb979e67f34e3b2edade70fe
Reviewed-on: http://gerrit.cloudera.org:8080/4078
Tested-by: Internal Jenkins
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
2016-09-01 21:26:52 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace or add the existing ASF license text with the one given
   on the website.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files_txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Michael Brown
5112e65be2 Revert "Revert "Add Kudu test helpers""
This reverts commit f8dd5413b65d30646c3745dfc738ed812d50a51f and
effectively re-adds commit 9248dcb70478b8f93f022893776a0960f45fdc28. The
difference between this patch and its original is that I fixed the
changes introduced in infra/python/bootstrap_virtualenv.py to be
python2.4-compatible:

- removed the use of str.format(), preferring a str.join() pattern
- removed the call of the exit() builtin to prefer sys.exit()

The only testing I did for this patch was to ensure
CDH Impala-packaging-on-demand works.

Change-Id: I02ed97473868eacf45b25abe89b41e6fa2fce325
Reviewed-on: http://gerrit.cloudera.org:8080/3160
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
2016-05-24 16:40:59 -07:00
Shiraz Ali
08eff2bc09 Revert "Add Kudu test helpers"
This reverts commit 9248dcb70478b8f93f022893776a0960f45fdc28.
2016-05-20 08:46:00 -07:00
casey
36b524f68c Add Kudu test helpers
Changes:

1) Add the python Kudu module to the virtualenv. Building the virtualenv
is much slower now because Cython and numpy are required. To help with
the rebuild time --no-cache was removed. That option was added to help
when using the dev version of impyla, the version number would be the
same but the module contents were different and the cache used the old
module contents.

2) Add some py.test fixtures to help create Kudu and Impala connections.

Change-Id: I8e5e22b38d5bd09a36238e66a69aa42d1a941de7
Reviewed-on: http://gerrit.cloudera.org:8080/2855
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
2016-05-19 19:45:48 -07:00