Commit Graph

8 Commits

Author SHA1 Message Date
Joe McDonnell
1913ab46ed IMPALA-14501: Migrate most scripts from impala-python to impala-python3
To remove the dependency on Python 2, existing scripts need to use
python3 rather than python. These commands find those
locations (for impala-python and regular python):
git grep impala-python | grep -v impala-python3 | grep -v impala-python-common | grep -v init-impala-python
git grep bin/python | grep -v python3

This removes or switches most of these locations by various means:
1. If a python file has a #!/bin/env impala-python (or python) but
   doesn't have a main function, it removes the hash-bang and makes
   sure that the file is not executable.
2. Most scripts can simply switch from impala-python to impala-python3
   (or python to python3) with minimal changes.
3. The cm-api pypi package (which doesn't support Python 3) has been
   replaced by the cm-client pypi package and interfaces have changed.
   Rather than migrating the code (which hasn't been used in years), this
   deletes the old code and stops installing cm-api into the virtualenv.
   The code can be restored and revamped if there is any interest in
   interacting with CM clusters.
4. This switches tests/comparison over to impala-python3, but this code has
   bit-rotted. Some pieces can be run manually, but it can't be fully
   verified with Python 3. It shouldn't hold back the migration on its own.
5. This also replaces locations of impala-python in comments / documentation /
   READMEs.
6. kazoo (used for interacting with HBase) needed to be upgraded to a
   version that supports Python 3. The newest version of kazoo requires
   upgrades of other component versions, so this uses kazoo 2.8.0 to avoid
   needing other upgrades.

The two remaining uses of impala-python are:
 - bin/cmake_aux/create_virtualenv.sh
 - bin/impala-env-versioned-python
These will be removed separately when we drop Python 2 support
completely. In particular, these are useful for testing impala-shell
with Python 2 until we stop supporting Python 2 for impala-shell.

The docker-based tests still use /usr/bin/python, but this can
be switched over independently (and doesn't impact impala-python)

Testing:
 - Ran core job
 - Ran build + dataload on Centos 7, Redhat 8
 - Manual testing of individual scripts (except some bitrotted areas like the
   random query generator)

Change-Id: If209b761290bc7e7c716c312ea757da3e3bca6dc
Reviewed-on: http://gerrit.cloudera.org:8080/23468
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2025-10-22 16:30:17 +00:00
Michael Smith
0a42185d17 IMPALA-9627: Update utility scripts for Python 3 (part 2)
We're starting to see environments where the system Python ('python') is
Python 3. Updates utility and build scripts to work with Python 3, and
updates check-pylint-py3k.sh to check scripts that use system python.

Fixes other issues found during a full build and test run with Python
3.8 as the default for 'python'.

Fixes a impala-shell tip that was supposed to have been two tips (and
had no space after period when they were printed).

Removes out-of-date deploy.py and various Python 2.6 workarounds.

Testing:
- Full build with /usr/bin/python pointed to python3
- run-all-tests passed with python pointed to python3
- ran push_to_asf.py

Change-Id: Idff388aff33817b0629347f5843ec34c78f0d0cb
Reviewed-on: http://gerrit.cloudera.org:8080/19697
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2023-04-26 18:52:23 +00:00
Joe McDonnell
2b550634d2 IMPALA-11952 (part 2): Fix print function syntax
Python 3 now treats print as a function and requires
the parenthesis in invocation.

print "Hello World!"
is now:
print("Hello World!")

This fixes all locations to use the function
invocation. This is more complicated when the output
is being redirected to a file or when avoiding the
usual newline.

print >> sys.stderr , "Hello World!"
is now:
print("Hello World!", file=sys.stderr)

To support this properly and guarantee equivalent behavior
between python 2 and python 3, all files that use print
now add this import:
from __future__ import print_function

This also fixes random flake8 issues that intersect with
the changes.

Testing:
 - check-python-syntax.sh shows no errors related to print

Change-Id: Ib634958369ad777a41e72d80c8053b74384ac351
Reviewed-on: http://gerrit.cloudera.org:8080/19552
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2023-02-28 17:11:50 +00:00
Tim Armstrong
c43c03c5ee IMPALA-3926: part 2: avoid setting LD_LIBRARY_PATH
This removes LD_LIBRARY_PATH and LD_PRELOAD from the
developer's shell and cleans it up. With the preceding
change, toolchain utilities like clang can be run without
a special LD_LIBRARY_PATH.

This fixes a bug where libjvm.so was registered as a
static instead of a shared library, which adds it to the
RUNPATH variable in the binary, which provides a default
search location that can be overriden by LD_LIBRARY_PATH.

Impala binaries don't have the rpath baked in for some
libraries, including Impala-lzo, libgcc and libstdc++.
, so we still need to set LD_LIBRARY_PATH when running
those. That is solved with wrapper scripts that sets
the environment variables only when invoking those
binaries, e.g. starting a daemon or running a backend
test. I added three scripts because there were 3 sets
of environment variables. The scripts are:
* run-binary.sh: just sets LD_LIBRARY_PATH
* run-jvm-binary.sh: sets LD_LIBRARY_PATH and CLASSPATH
* start-daemon.sh: sets LD_LIBRARY_PATH and CLASSPATH and
  kerberos-related environment variables.

The binaries, in almost all cases, work fine without
those tweaks, because libstdc++ and libgcc are picked
up along with libkuduclient.so from the toolchain (they
are in the same directory). I decided to leave good enough
alone here. run-binary.sh and friends can be used in
any remaining edge cases to run binaries.

An alternative to the 3 scripts would be to have an
uber-script that set all the variables, but I felt
that it was better to be specific about what
each binary needed. Cleaning the LD_LIBRARY_PATH
mess up has given me a distaste for scattershot
setting of environment variables. I am open to
revisiting this.

Testing:
* Ran tests on centos 7
* Manually tested that my dev env with
 LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu continued
 to work (for now). All ubuntu 16.04 and 18.04 dev
 envs that were set up with bootstrap_development.sh
 will be in this state.

Change-Id: I61c83e6cca6debb87a12135e58ee501244bc9603
Reviewed-on: http://gerrit.cloudera.org:8080/14494
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-05-07 08:50:44 +00:00
Joe McDonnell
2cb93b7ac8 Improve error handling for validation of unified backend executable
bin/validate-unified-backend-test-filters.py currently does not
handle the case where the call to run the unified backend test
executable fails. Instead, it produces a very confusing message
about invalid filters that does not mention that the execution
failed.

This checks the return code when running the unified backend
test executable and produces a reasonable message if the
return code is not 0.

Change-Id: Ia84cd074be79bc9d99b0621a1ae216f3debf5833
Reviewed-on: http://gerrit.cloudera.org:8080/12895
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-04-03 03:47:32 +00:00
Joe McDonnell
cf92d5b23f Usability fixups for the unified backend test executable
This fixes the error messages in bin/validate-unified-backend-test-filters.py
to be clearer. It also modifies the dependency graph so that
"make $TESTNAME" will run validation on the unified backend executable.

Change-Id: Ic206044c8f119b6ee164eb2e6121ec67e71dafa8
Reviewed-on: http://gerrit.cloudera.org:8080/12620
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-03-01 03:27:30 +00:00
Joe McDonnell
e9936b02cf IMPALA-8247: Fix tests missing from unified backend executable
When converting the tests in be/src/util to the unified backend
test executable in IMPALA-8071, some tests did not get linked
into the executable appropriately. This means that they stopped
running. Specifically, the tests in bit-stream-utils-test.cc
and system-state-info-test.cc were left out.

This adds them back and modifies bin/validate-unified-backend-filters.py
to check both directions:
 - If a filter is specified, it must match a test in the executable.
 - If the executable has a test, some filter must match it.
I ran the backend tests on debug and ASAN with this change.

Change-Id: I80d8b5779cb0ac663e32c72722e0c73b68a43d2e
Reviewed-on: http://gerrit.cloudera.org:8080/12584
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
2019-02-26 17:28:23 +00:00
Joe McDonnell
3ce34a81b2 IMPALA-8071: Initial unified backend test framework
There are 100+ backend tests and each requires 400+ MB
of disk space when statically linked (the default).
This requires a large amount of disk space and adds
considerable link time.

This introduces a framework to link multiple backend
tests into a single executable. Currently it does
this for several tests in be/src/util. It saves
about 10GB of space.

It maintains several of the same properties that
the current tests have:
1. "make <testname>" rebuilds that test.
2. It generates an executable shell script with the
   same name as the original backend test that runs
   the same subset of tests.
3. It generates JUnitXML and log files with names that
   match the test name.
4. One can run the shell script with "--gtest_filter"
   and run a subset of the tests.
5. ctest commands such as ctest -R continue to function.

It validates at build time that every test linked into
the unified executable is covered by an equivalent test
filter pattern. This means that every test in the unified
executable will run as part of normal testing.

Introducing the framework along with a limited number
of trial backend tests gives us a chance to evaluate
this change before continuing to convert tests.

Change-Id: Ia03ef38719b1fbc0fe2025e16b7b3d3dd4488842
Reviewed-on: http://gerrit.cloudera.org:8080/12124
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-02-08 19:00:21 +00:00