To remove the dependency on Python 2, existing scripts need to use
python3 rather than python. These commands find those
locations (for impala-python and regular python):
git grep impala-python | grep -v impala-python3 | grep -v impala-python-common | grep -v init-impala-python
git grep bin/python | grep -v python3
This removes or switches most of these locations by various means:
1. If a python file has a #!/bin/env impala-python (or python) but
doesn't have a main function, it removes the hash-bang and makes
sure that the file is not executable.
2. Most scripts can simply switch from impala-python to impala-python3
(or python to python3) with minimal changes.
3. The cm-api pypi package (which doesn't support Python 3) has been
replaced by the cm-client pypi package and interfaces have changed.
Rather than migrating the code (which hasn't been used in years), this
deletes the old code and stops installing cm-api into the virtualenv.
The code can be restored and revamped if there is any interest in
interacting with CM clusters.
4. This switches tests/comparison over to impala-python3, but this code has
bit-rotted. Some pieces can be run manually, but it can't be fully
verified with Python 3. It shouldn't hold back the migration on its own.
5. This also replaces locations of impala-python in comments / documentation /
READMEs.
6. kazoo (used for interacting with HBase) needed to be upgraded to a
version that supports Python 3. The newest version of kazoo requires
upgrades of other component versions, so this uses kazoo 2.8.0 to avoid
needing other upgrades.
The two remaining uses of impala-python are:
- bin/cmake_aux/create_virtualenv.sh
- bin/impala-env-versioned-python
These will be removed separately when we drop Python 2 support
completely. In particular, these are useful for testing impala-shell
with Python 2 until we stop supporting Python 2 for impala-shell.
The docker-based tests still use /usr/bin/python, but this can
be switched over independently (and doesn't impact impala-python)
Testing:
- Ran core job
- Ran build + dataload on Centos 7, Redhat 8
- Manual testing of individual scripts (except some bitrotted areas like the
random query generator)
Change-Id: If209b761290bc7e7c716c312ea757da3e3bca6dc
Reviewed-on: http://gerrit.cloudera.org:8080/23468
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
This adds support for Redhat 9 / Ubuntu 22. It updates
to a newer toolchain that has those builds, and it adds
supporting code in bootstrap_system.sh.
Redhat 9 and Ubuntu 22 use python = python3, which requires
various changes to build scripts and tests. Ubuntu 22 uses
Python 3.10, which deprecates certain ssl.PROTOCOL_TLS, so
this adapts test_client_ssl.py to that change until it
can be fully addressed in IMPALA-12219.
Various OpenSSL methods have been deprecated. As a workaround
until these can be addressed properly, this specifies
-Wno-deprecated-declarations. This can be removed once the
code is adapted to the non-deprecated APIs in IMPALA-12226.
Impala crashes with tcmalloc errors unless we update to a newer
gperftools, so this moves to gperftools 2.10. gperftools changed
the default for tcmalloc.aggressive_memory_decommit to off, so
this adapts our code to set it for backend tests. The gperftools
upgrade does not show any performance regression:
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / none / none | 3.08 | -0.64% | 2.20 | -0.37% |
+----------+-----------------------+---------+------------+------------+----------------+
With newer Python versions, the impala-virtualenv command
fails to create a Python 3 virtualenv. This switches to
using Python 3's builtin venv command for Python >=3.6.
Kudu needed a newer version and LLVM required a couple patches.
Testing:
- Ran a core job on Ubuntu 22 and Redhat 9. The tests run
to completion without crashing. There are test failures
that will be addressed in follow-up JIRAs.
- Ran dockerised tests on Ubuntu 22.
- Ran dockerised tests on Ubuntu 20 and Rocky 8.5.
Change-Id: If1fcdb2f8c635ecd6dc7a8a1db81f5f389c78b86
Reviewed-on: http://gerrit.cloudera.org:8080/20073
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
We're starting to see environments where the system Python ('python') is
Python 3. Updates utility and build scripts to work with Python 3, and
updates check-pylint-py3k.sh to check scripts that use system python.
Fixes other issues found during a full build and test run with Python
3.8 as the default for 'python'.
Fixes a impala-shell tip that was supposed to have been two tips (and
had no space after period when they were printed).
Removes out-of-date deploy.py and various Python 2.6 workarounds.
Testing:
- Full build with /usr/bin/python pointed to python3
- run-all-tests passed with python pointed to python3
- ran push_to_asf.py
Change-Id: Idff388aff33817b0629347f5843ec34c78f0d0cb
Reviewed-on: http://gerrit.cloudera.org:8080/19697
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
ANSI escape sequences do a variety of actions in the
terminal like adding color to compilation warnings.
generate_junitxml.py currently hits an error when trying
to generate JUnitXML for compilation output that contains
ANSI escape sequences.
This changes generate_junitxml.py to strip ANSI
escape sequences from the strings incorporated into
JUnitXML (e.g. the error output of a compiler).
The solution is based off the discussion at:
https://stackoverflow.com/questions/14693701
Testing:
- A case where generate_junitxml.py was failing to
generate JUnitXML now generates valid JUnitXML.
The output still contains all the compiler warnings
and information needed to diagnose the issue.
Change-Id: I9654a6b13350cb9582ec908b8807b630636a1ed0
Reviewed-on: http://gerrit.cloudera.org:8080/19708
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Python 3 now treats print as a function and requires
the parenthesis in invocation.
print "Hello World!"
is now:
print("Hello World!")
This fixes all locations to use the function
invocation. This is more complicated when the output
is being redirected to a file or when avoiding the
usual newline.
print >> sys.stderr , "Hello World!"
is now:
print("Hello World!", file=sys.stderr)
To support this properly and guarantee equivalent behavior
between python 2 and python 3, all files that use print
now add this import:
from __future__ import print_function
This also fixes random flake8 issues that intersect with
the changes.
Testing:
- check-python-syntax.sh shows no errors related to print
Change-Id: Ib634958369ad777a41e72d80c8053b74384ac351
Reviewed-on: http://gerrit.cloudera.org:8080/19552
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
This wraps each command executed by CMake with a wrapper that
generates a JUnitXML file if the command fails. If the command
succeeds, the wrapper does nothing. The wrapper applies to C++
compilation, linking, and custom shell commands (such as
building the frontend via maven). It does not apply to failures
coming from CMake itself. It can be disabled by setting
DISABLE_CMAKE_JUNITXML.
The command output can include Unicode (e.g. smart quotes for
g++), so this also updates generate_junitxml.py to handle
Unicode.
The wrapper interacts poorly with add_custom_command/add_custom_target
CMake commands that use 'cd directory && do_something', so this
switches those locations (in /docker) to use CMake's WORKING_DIRECTORY.
Testing:
- Verified it does not impact a successful build (including with
ccache and/or distcc).
- Verified it generates JUnitXML for C++ and Java compilation
failures.
- Verified it doesn't use the wrapper when DISABLE_CMAKE_JUNITXML
is set.
Change-Id: If71f2faf3ab5052b56b38f1b291fee53c390ce23
Reviewed-on: http://gerrit.cloudera.org:8080/12668
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Python 2.6's ElementTree.write() does not have an xml_declaration
argument, so junitxml_prune_notrun.py fails on python 2.6.
This fixes junitxml_prune_notrun.py by using minidom to write
the output. This mirrors how bin/generate_junitxml.py outputs
XML.
Verified that tests now pass on python 2.6 and python 2.7 does
not change.
Change-Id: I9ef8fb77b1ac8c51e3dfb6b04690ae9ccc490d62
Reviewed-on: http://gerrit.cloudera.org:8080/12479
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
There are 100+ backend tests and each requires 400+ MB
of disk space when statically linked (the default).
This requires a large amount of disk space and adds
considerable link time.
This introduces a framework to link multiple backend
tests into a single executable. Currently it does
this for several tests in be/src/util. It saves
about 10GB of space.
It maintains several of the same properties that
the current tests have:
1. "make <testname>" rebuilds that test.
2. It generates an executable shell script with the
same name as the original backend test that runs
the same subset of tests.
3. It generates JUnitXML and log files with names that
match the test name.
4. One can run the shell script with "--gtest_filter"
and run a subset of the tests.
5. ctest commands such as ctest -R continue to function.
It validates at build time that every test linked into
the unified executable is covered by an equivalent test
filter pattern. This means that every test in the unified
executable will run as part of normal testing.
Introducing the framework along with a limited number
of trial backend tests gives us a chance to evaluate
this change before continuing to convert tests.
Change-Id: Ia03ef38719b1fbc0fe2025e16b7b3d3dd4488842
Reviewed-on: http://gerrit.cloudera.org:8080/12124
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
With python 2.6, the syntax "{}".format(1) doesn't work:
$docker run centos:6 python -c 'print "{}".format(1)'
Traceback (most recent call last):
File "<string>", line 1, in <module>
ValueError: zero length field name in format
generate_junitxml was using this incantation and failing.
I've updated the syntax to be py2.6-friendly, and tested
it like so:
$docker run -v $(pwd):/mnt centos:6 bash -c "yum install -y python-argparse; /mnt/lib/python/impala_py_lib/jenkins/generate_junitxml.py --phase phase --step step --stdout out --stderr err; cat /extra_junit_xml_logs/*.xml"
[output from yum...]
Installed:
python-argparse.noarch 0:1.2.1-2.1.el6
Complete!
Generated: ./extra_junit_xml_logs/generate_junitxml.phase.step.20180904_18_04_56.xml
<?xml version="1.0" ?>
<testsuites errors="0" failures="0" tests="1" time="0.0">
<testsuite disabled="0" errors="0" failures="0" file="None" log="None" name="generate_junitxml.phase.step" skipped="0" tests="1" time="0" timestamp="2018-09-04 18:04:56+00:00" url="None">
<testcase classname="generate_junitxml.phase" name="step">
<system-out>
out
</system-out>
<system-err>
err
</system-err>
</testcase>
</testsuite>
</testsuites>
Change-Id: Ic0c1e837a9ed6c2d59906aed1d1098bde6f5d815
Reviewed-on: http://gerrit.cloudera.org:8080/11384
Reviewed-by: David Knupp <dknupp@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The original patch for this Jira relied on a third party python lib
for generating Junit XML output, That proved to be limiting because
setting up the necessary virtualenv across a variety of dev and test
scenarios (private dev environment, jenkins.impala.io, and others)
proved to be confusing and messy.
This update to the script maintains the same functionality and the
same interface, but uses only the python standard library. A symlink
has also been added to Impala/bin for convenience.
Change-Id: I958ee0d8420b6a4197aaf0a7e0538a566332ea97
Reviewed-on: http://gerrit.cloudera.org:8080/11235
Reviewed-by: David Knupp <dknupp@cloudera.com>
Tested-by: David Knupp <dknupp@cloudera.com>
This patch adds a script to generate junit XML reports for arbitrary
build steps. It's also being used to seed the creation of an internal
python library for Impala development that can be pip installed into
a development environment.
Change-Id: If6024d74075ea69b8ee20d1fc3cc9c1ff821ba5b
Reviewed-on: http://gerrit.cloudera.org:8080/11128
Reviewed-by: David Knupp <dknupp@cloudera.com>
Tested-by: David Knupp <dknupp@cloudera.com>