impala

mirror of https://github.com/apache/impala.git synced 2025-12-19 18:12:08 -05:00

Author	SHA1	Message	Date
Joe McDonnell	1913ab46ed	IMPALA-14501: Migrate most scripts from impala-python to impala-python3 To remove the dependency on Python 2, existing scripts need to use python3 rather than python. These commands find those locations (for impala-python and regular python): git grep impala-python \| grep -v impala-python3 \| grep -v impala-python-common \| grep -v init-impala-python git grep bin/python \| grep -v python3 This removes or switches most of these locations by various means: 1. If a python file has a #!/bin/env impala-python (or python) but doesn't have a main function, it removes the hash-bang and makes sure that the file is not executable. 2. Most scripts can simply switch from impala-python to impala-python3 (or python to python3) with minimal changes. 3. The cm-api pypi package (which doesn't support Python 3) has been replaced by the cm-client pypi package and interfaces have changed. Rather than migrating the code (which hasn't been used in years), this deletes the old code and stops installing cm-api into the virtualenv. The code can be restored and revamped if there is any interest in interacting with CM clusters. 4. This switches tests/comparison over to impala-python3, but this code has bit-rotted. Some pieces can be run manually, but it can't be fully verified with Python 3. It shouldn't hold back the migration on its own. 5. This also replaces locations of impala-python in comments / documentation / READMEs. 6. kazoo (used for interacting with HBase) needed to be upgraded to a version that supports Python 3. The newest version of kazoo requires upgrades of other component versions, so this uses kazoo 2.8.0 to avoid needing other upgrades. The two remaining uses of impala-python are: - bin/cmake_aux/create_virtualenv.sh - bin/impala-env-versioned-python These will be removed separately when we drop Python 2 support completely. In particular, these are useful for testing impala-shell with Python 2 until we stop supporting Python 2 for impala-shell. 
The docker-based tests still use /usr/bin/python, but this can be switched over independently (and doesn't impact impala-python) Testing: - Ran core job - Ran build + dataload on Centos 7, Redhat 8 - Manual testing of individual scripts (except some bitrotted areas like the random query generator) Change-Id: If209b761290bc7e7c716c312ea757da3e3bca6dc Reviewed-on: http://gerrit.cloudera.org:8080/23468 Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Michael Smith <michael.smith@cloudera.com>	2025-10-22 16:30:17 +00:00
Joe McDonnell	234d641d7b	IMPALA-11961/IMPALA-12207: Add Redhat 9 / Ubuntu 22 support This adds support for Redhat 9 / Ubuntu 22. It updates to a newer toolchain that has those builds, and it adds supporting code in bootstrap_system.sh. Redhat 9 and Ubuntu 22 use python = python3, which requires various changes to build scripts and tests. Ubuntu 22 uses Python 3.10, which deprecates certain ssl.PROTOCOL_TLS, so this adapts test_client_ssl.py to that change until it can be fully addressed in IMPALA-12219. Various OpenSSL methods have been deprecated. As a workaround until these can be addressed properly, this specifies -Wno-deprecated-declarations. This can be removed once the code is adapted to the non-deprecated APIs in IMPALA-12226. Impala crashes with tcmalloc errors unless we update to a newer gperftools, so this moves to gperftools 2.10. gperftools changed the default for tcmalloc.aggressive_memory_decommit to off, so this adapts our code to set it for backend tests. The gperftools upgrade does not show any performance regression: +----------+-----------------------+---------+------------+------------+----------------+ \| Workload \| File Format \| Avg (s) \| Delta(Avg) \| GeoMean(s) \| Delta(GeoMean) \| +----------+-----------------------+---------+------------+------------+----------------+ \| TPCH(42) \| parquet / none / none \| 3.08 \| -0.64% \| 2.20 \| -0.37% \| +----------+-----------------------+---------+------------+------------+----------------+ With newer Python versions, the impala-virtualenv command fails to create a Python 3 virtualenv. This switches to using Python 3's builtin venv command for Python >=3.6. Kudu needed a newer version and LLVM required a couple patches. Testing: - Ran a core job on Ubuntu 22 and Redhat 9. The tests run to completion without crashing. There are test failures that will be addressed in follow-up JIRAs. - Ran dockerised tests on Ubuntu 22. - Ran dockerised tests on Ubuntu 20 and Rocky 8.5. 
Change-Id: If1fcdb2f8c635ecd6dc7a8a1db81f5f389c78b86 Reviewed-on: http://gerrit.cloudera.org:8080/20073 Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-06-21 05:21:01 +00:00
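The switch to Python 3's built-in venv described in the commit above can be sketched as follows. This is only an illustration, not the code from the commit: the function name `virtualenv_command` is hypothetical, and the fallback to the `virtualenv` package for older interpreters is an assumption.

```python
def virtualenv_command(python_exe, version_info, env_dir):
    """Return the argv list used to create a virtual environment."""
    if tuple(version_info) >= (3, 6):
        # Python 3's built-in venv module; no third-party package needed.
        return [python_exe, "-m", "venv", env_dir]
    # Assumption: older interpreters fall back to the virtualenv package.
    return [python_exe, "-m", "virtualenv", env_dir]
```

A caller would pass `sys.version_info[:2]` for the interpreter that will own the environment and hand the result to `subprocess.check_call`.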
Michael Smith	0a42185d17	IMPALA-9627: Update utility scripts for Python 3 (part 2) We're starting to see environments where the system Python ('python') is Python 3. Updates utility and build scripts to work with Python 3, and updates check-pylint-py3k.sh to check scripts that use system python. Fixes other issues found during a full build and test run with Python 3.8 as the default for 'python'. Fixes an impala-shell tip that was supposed to have been two tips (and had no space after the period when they were printed). Removes out-of-date deploy.py and various Python 2.6 workarounds. Testing: - Full build with /usr/bin/python pointed to python3 - run-all-tests passed with python pointed to python3 - ran push_to_asf.py Change-Id: Idff388aff33817b0629347f5843ec34c78f0d0cb Reviewed-on: http://gerrit.cloudera.org:8080/19697 Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Michael Smith <michael.smith@cloudera.com>	2023-04-26 18:52:23 +00:00
Joe McDonnell	2f73239607	IMPALA-12045: Strip ANSI escape sequences for JUnitXML ANSI escape sequences do a variety of actions in the terminal like adding color to compilation warnings. generate_junitxml.py currently hits an error when trying to generate JUnitXML for compilation output that contains ANSI escape sequences. This changes generate_junitxml.py to strip ANSI escape sequences from the strings incorporated into JUnitXML (e.g. the error output of a compiler). The solution is based off the discussion at: https://stackoverflow.com/questions/14693701 Testing: - A case where generate_junitxml.py was failing to generate JUnitXML now generates valid JUnitXML. The output still contains all the compiler warnings and information needed to diagnose the issue. Change-Id: I9654a6b13350cb9582ec908b8807b630636a1ed0 Reviewed-on: http://gerrit.cloudera.org:8080/19708 Reviewed-by: Michael Smith <michael.smith@cloudera.com> Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-04-10 18:39:16 +00:00
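A minimal sketch of stripping ANSI escape sequences before text is embedded in JUnitXML, as the commit above describes. The regular expression follows the commonly cited pattern from the linked Stack Overflow discussion; whether generate_junitxml.py uses this exact expression is an assumption, and `strip_ansi` is a hypothetical name.

```python
import re

# Matches 7-bit ANSI escape sequences: ESC followed by either a
# single-character command or a CSI sequence such as "\x1b[31m".
ANSI_ESCAPE = re.compile(r"\x1b(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])")


def strip_ansi(text):
    """Return text with ANSI escape sequences removed."""
    return ANSI_ESCAPE.sub("", text)
```

For example, colored compiler output such as "\x1b[1;31merror:\x1b[0m unused variable" would be reduced to its plain text, which is safe to place in an XML text node.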
Joe McDonnell	2b550634d2	IMPALA-11952 (part 2): Fix print function syntax Python 3 now treats print as a function and requires parentheses in the invocation. print "Hello World!" is now: print("Hello World!") This fixes all locations to use the function invocation. This is more complicated when the output is being redirected to a file or when avoiding the usual newline. print >> sys.stderr, "Hello World!" is now: print("Hello World!", file=sys.stderr) To support this properly and guarantee equivalent behavior between python 2 and python 3, all files that use print now add this import: from __future__ import print_function This also fixes random flake8 issues that intersect with the changes. Testing: - check-python-syntax.sh shows no errors related to print Change-Id: Ib634958369ad777a41e72d80c8053b74384ac351 Reviewed-on: http://gerrit.cloudera.org:8080/19552 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Michael Smith <michael.smith@cloudera.com>	2023-02-28 17:11:50 +00:00
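The print changes in the commit above can be illustrated with a short sketch that behaves the same on Python 2.7 and Python 3. The helper names `report` and `progress` are hypothetical and exist only for this example.

```python
from __future__ import print_function

import sys


def report(message, stream=None):
    """Write message plus a newline to stream (stderr when none is given)."""
    # Replaces the Python 2 statement form: print >> stream, message
    print(message, file=stream if stream is not None else sys.stderr)


def progress(marker, stream=None):
    """Write marker without the usual trailing newline."""
    # Replaces the Python 2 trailing-comma form: print marker,
    print(marker, end="", file=stream if stream is not None else sys.stdout)
```

With the `__future__` import in place, `print` is a function under both interpreters, so the `file=` and `end=` keyword arguments are available everywhere.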
Joe McDonnell	1f3160b4c0	IMPALA-8304: Generate JUnitXML if a command run by CMake fails This wraps each command executed by CMake with a wrapper that generates a JUnitXML file if the command fails. If the command succeeds, the wrapper does nothing. The wrapper applies to C++ compilation, linking, and custom shell commands (such as building the frontend via maven). It does not apply to failures coming from CMake itself. It can be disabled by setting DISABLE_CMAKE_JUNITXML. The command output can include Unicode (e.g. smart quotes for g++), so this also updates generate_junitxml.py to handle Unicode. The wrapper interacts poorly with add_custom_command/add_custom_target CMake commands that use 'cd directory && do_something', so this switches those locations (in /docker) to use CMake's WORKING_DIRECTORY. Testing: - Verified it does not impact a successful build (including with ccache and/or distcc). - Verified it generates JUnitXML for C++ and Java compilation failures. - Verified it doesn't use the wrapper when DISABLE_CMAKE_JUNITXML is set. Change-Id: If71f2faf3ab5052b56b38f1b291fee53c390ce23 Reviewed-on: http://gerrit.cloudera.org:8080/12668 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-10-09 15:52:05 +00:00
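A rough sketch of the wrapper idea from the commit above: run a command, do nothing extra on success, and write a small JUnitXML file on failure. This is not the wrapper Impala ships. The function name `run_with_junitxml`, the report layout, and the choice to capture and replay output are all assumptions made for illustration.

```python
import subprocess
import sys
import xml.etree.ElementTree as ET


def run_with_junitxml(cmd, report_path):
    """Run cmd; if it fails, write a minimal JUnitXML report. Returns the exit code."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    # Decode leniently: compiler output may contain Unicode such as smart quotes.
    out_text = out.decode("utf-8", "replace")
    err_text = err.decode("utf-8", "replace")
    # Replay the captured output so the build log still shows it.
    sys.stdout.write(out_text)
    sys.stderr.write(err_text)
    if proc.returncode != 0:
        suite = ET.Element("testsuite", name="cmake_command", tests="1", failures="1")
        case = ET.SubElement(suite, "testcase", name=" ".join(cmd))
        failure = ET.SubElement(case, "failure",
                                message="exit code %d" % proc.returncode)
        failure.text = err_text
        ET.ElementTree(suite).write(report_path, encoding="utf-8",
                                    xml_declaration=True)
    return proc.returncode
```

Returning the command's own exit code keeps the build system's view of success or failure unchanged; the report is a side effect only.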
Joe McDonnell	4344914bb3	IMPALA-8193: Fix python 2.6 issue in junit_prune_notrun.py Python 2.6's ElementTree.write() does not have an xml_declaration argument, so junitxml_prune_notrun.py fails on python 2.6. This fixes junitxml_prune_notrun.py by using minidom to write the output. This mirrors how bin/generate_junitxml.py outputs XML. Verified that tests now pass on python 2.6 and python 2.7 does not change. Change-Id: I9ef8fb77b1ac8c51e3dfb6b04690ae9ccc490d62 Reviewed-on: http://gerrit.cloudera.org:8080/12479 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-02-16 04:48:01 +00:00
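The minidom approach mentioned in the commit above can be sketched like this. It is an illustration of the technique, not the code in junitxml_prune_notrun.py, and `to_xml_string` is a hypothetical name. The point is that minidom emits the XML declaration itself, so no `xml_declaration` argument to `ElementTree.write()` is needed.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def to_xml_string(root):
    """Serialize an ElementTree element, including an XML declaration, via minidom."""
    raw = ET.tostring(root, encoding="utf-8")
    # minidom's toxml() prepends the XML declaration on its own.
    return minidom.parseString(raw).toxml()
```

The resulting string can then be written to the output file with an ordinary `open(...).write(...)` call.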
Joe McDonnell	3ce34a81b2	IMPALA-8071: Initial unified backend test framework There are 100+ backend tests and each requires 400+ MB of disk space when statically linked (the default). This requires a large amount of disk space and adds considerable link time. This introduces a framework to link multiple backend tests into a single executable. Currently it does this for several tests in be/src/util. It saves about 10GB of space. It maintains several of the same properties that the current tests have: 1. "make <testname>" rebuilds that test. 2. It generates an executable shell script with the same name as the original backend test that runs the same subset of tests. 3. It generates JUnitXML and log files with names that match the test name. 4. One can run the shell script with "--gtest_filter" and run a subset of the tests. 5. ctest commands such as ctest -R continue to function. It validates at build time that every test linked into the unified executable is covered by an equivalent test filter pattern. This means that every test in the unified executable will run as part of normal testing. Introducing the framework along with a limited number of trial backend tests gives us a chance to evaluate this change before continuing to convert tests. Change-Id: Ia03ef38719b1fbc0fe2025e16b7b3d3dd4488842 Reviewed-on: http://gerrit.cloudera.org:8080/12124 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-02-08 19:00:21 +00:00
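The build-time coverage check described in the commit above could look roughly like the sketch below. It assumes test names take the form "Suite.Test" and that filter patterns use simple wildcards; `fnmatch` stands in for gtest's own filter matching, which is an approximation. The function name `uncovered_tests` is hypothetical.

```python
import fnmatch


def uncovered_tests(test_names, filter_patterns):
    """Return the tests that no filter pattern matches."""
    return [name for name in test_names
            if not any(fnmatch.fnmatchcase(name, pattern)
                       for pattern in filter_patterns)]
```

A build step would fail when the returned list is non-empty, since any test in it would be linked into the unified executable but never run.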
Philip Zeyliger	304e02cf62	Fix generate_junitxml for python2.6. With python 2.6, the syntax "{}".format(1) doesn't work: $docker run centos:6 python -c 'print "{}".format(1)' Traceback (most recent call last): File "<string>", line 1, in <module> ValueError: zero length field name in format generate_junitxml was using this incantation and failing. I've updated the syntax to be py2.6-friendly, and tested it like so: $docker run -v $(pwd):/mnt centos:6 bash -c "yum install -y python-argparse; /mnt/lib/python/impala_py_lib/jenkins/generate_junitxml.py --phase phase --step step --stdout out --stderr err; cat /extra_junit_xml_logs/*.xml" [output from yum...] Installed: python-argparse.noarch 0:1.2.1-2.1.el6 Complete! Generated: ./extra_junit_xml_logs/generate_junitxml.phase.step.20180904_18_04_56.xml <?xml version="1.0" ?> <testsuites errors="0" failures="0" tests="1" time="0.0"> <testsuite disabled="0" errors="0" failures="0" file="None" log="None" name="generate_junitxml.phase.step" skipped="0" tests="1" time="0" timestamp="2018-09-04 18:04:56+00:00" url="None"> <testcase classname="generate_junitxml.phase" name="step"> <system-out> out </system-out> <system-err> err </system-err> </testcase> </testsuite> </testsuites> Change-Id: Ic0c1e837a9ed6c2d59906aed1d1098bde6f5d815 Reviewed-on: http://gerrit.cloudera.org:8080/11384 Reviewed-by: David Knupp <dknupp@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-09-04 21:58:08 +00:00
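The Python 2.6 incompatibility in the commit above comes from auto-numbered fields such as "{}", which were added in Python 2.7. A small sketch of the 2.6-friendly form uses explicit field indexes instead. The helper name `report_name` is hypothetical; the name format mirrors the "generate_junitxml.phase.step" value visible in the sample output.

```python
def report_name(phase, step):
    """Build a report name using explicit field indexes."""
    # "{0}" and "{1}" work on Python 2.6 and later; a bare "{}" raises
    # "ValueError: zero length field name in format" on Python 2.6.
    return "generate_junitxml.{0}.{1}".format(phase, step)
```

The old percent-style formatting ("%s.%s" % (phase, step)) is another option that works on every Python version.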
David Knupp	6e5ec22b12	IMPALA-7399: Emit a junit xml report when trapping errors This patch will cause a junitxml file to be emitted in the case of errors in build scripts. Instead of simply echoing a message to the console, we set up a trap function that also writes out to a junit xml report that can be consumed by jenkins.impala.io. Main things to pay attention to: - New file that gets sourced by all bash scripts when trapping within bash scripts: https://gerrit.cloudera.org/c/11257/1/bin/report_build_error.sh - Installation of the python lib into impala-python venv for use from within python files: https://gerrit.cloudera.org/c/11257/1/bin/impala-python-common.sh - Change to the generate_junitxml.py file itself, for ease of https://gerrit.cloudera.org/c/11257/1/lib/python/impala_py_lib/jenkins/generate_junitxml.py Most of the other changes are to source the new report_build_error.sh script to set up the trap function. Change-Id: Idd62045bb43357abc2b89a78afff499149d3c3fc Reviewed-on: http://gerrit.cloudera.org:8080/11257 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-08-23 18:33:58 +00:00
David Knupp	0286142b67	IMPALA-7399: Remove third-party dependencies from junit xml script The original patch for this Jira relied on a third party python lib for generating Junit XML output. That proved to be limiting because setting up the necessary virtualenv across a variety of dev and test scenarios (private dev environment, jenkins.impala.io, and others) proved to be confusing and messy. This update to the script maintains the same functionality and the same interface, but uses only the python standard library. A symlink has also been added to Impala/bin for convenience. Change-Id: I958ee0d8420b6a4197aaf0a7e0538a566332ea97 Reviewed-on: http://gerrit.cloudera.org:8080/11235 Reviewed-by: David Knupp <dknupp@cloudera.com> Tested-by: David Knupp <dknupp@cloudera.com>	2018-08-16 21:59:52 +00:00
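Building a JUnit XML report with only the Python standard library, as the commit above describes, can be sketched as follows. The element layout is modeled on the sample output quoted in the generate_junitxml fix for Python 2.6 (testsuites, testsuite, testcase, system-out, system-err), but the function name `build_report` and the reduced attribute set are assumptions, not the script's real interface.

```python
import xml.etree.ElementTree as ET


def build_report(phase, step, stdout_text="", stderr_text=""):
    """Return a <testsuites> element describing one build step."""
    suites = ET.Element("testsuites", tests="1", failures="0", errors="0")
    suite = ET.SubElement(
        suites, "testsuite",
        name="generate_junitxml.{0}.{1}".format(phase, step), tests="1")
    case = ET.SubElement(
        suite, "testcase",
        classname="generate_junitxml.{0}".format(phase), name=step)
    # Captured output is attached to the test case for later diagnosis.
    ET.SubElement(case, "system-out").text = stdout_text
    ET.SubElement(case, "system-err").text = stderr_text
    return suites
```

Because only `xml.etree.ElementTree` is used, the sketch runs without a virtualenv, which is the property the commit was after.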
David Knupp	8f9f91f38b	IMPALA-7399: Add script in lib/python to generate junit XML. This patch adds a script to generate junit XML reports for arbitrary build steps. It's also being used to seed the creation of an internal python library for Impala development that can be pip installed into a development environment. Change-Id: If6024d74075ea69b8ee20d1fc3cc9c1ff821ba5b Reviewed-on: http://gerrit.cloudera.org:8080/11128 Reviewed-by: David Knupp <dknupp@cloudera.com> Tested-by: David Knupp <dknupp@cloudera.com>	2018-08-09 20:53:48 +00:00

12 Commits