Commit Graph

9 Commits

Author SHA1 Message Date
Joe McDonnell
1913ab46ed IMPALA-14501: Migrate most scripts from impala-python to impala-python3
To remove the dependency on Python 2, existing scripts need to use
python3 rather than python. These commands find those
locations (for impala-python and regular python):
git grep impala-python | grep -v impala-python3 | grep -v impala-python-common | grep -v init-impala-python
git grep bin/python | grep -v python3

This removes or switches most of these locations by various means:
1. If a python file has a #!/bin/env impala-python (or python) but
   doesn't have a main function, it removes the hash-bang and makes
   sure that the file is not executable.
2. Most scripts can simply switch from impala-python to impala-python3
   (or python to python3) with minimal changes.
3. The cm-api pypi package (which doesn't support Python 3) has been
   replaced by the cm-client pypi package and interfaces have changed.
   Rather than migrating the code (which hasn't been used in years), this
   deletes the old code and stops installing cm-api into the virtualenv.
   The code can be restored and revamped if there is any interest in
   interacting with CM clusters.
4. This switches tests/comparison over to impala-python3, but this code has
   bit-rotted. Some pieces can be run manually, but it can't be fully
   verified with Python 3. It shouldn't hold back the migration on its own.
5. This also replaces locations of impala-python in comments / documentation /
   READMEs.
6. kazoo (used for interacting with HBase) needed to be upgraded to a
   version that supports Python 3. The newest version of kazoo requires
   upgrades of other component versions, so this uses kazoo 2.8.0 to avoid
   needing other upgrades.

The two remaining uses of impala-python are:
 - bin/cmake_aux/create_virtualenv.sh
 - bin/impala-env-versioned-python
These will be removed separately when we drop Python 2 support
completely. In particular, these are useful for testing impala-shell
with Python 2 until we stop supporting Python 2 for impala-shell.

The docker-based tests still use /usr/bin/python, but this can
be switched over independently (and doesn't impact impala-python)

Testing:
 - Ran core job
 - Ran build + dataload on Centos 7, Redhat 8
 - Manual testing of individual scripts (except some bitrotted areas like the
   random query generator)

Change-Id: If209b761290bc7e7c716c312ea757da3e3bca6dc
Reviewed-on: http://gerrit.cloudera.org:8080/23468
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2025-10-22 16:30:17 +00:00
Michael Smith
0a42185d17 IMPALA-9627: Update utility scripts for Python 3 (part 2)
We're starting to see environments where the system Python ('python') is
Python 3. Updates utility and build scripts to work with Python 3, and
updates check-pylint-py3k.sh to check scripts that use system python.

Fixes other issues found during a full build and test run with Python
3.8 as the default for 'python'.

Fixes a impala-shell tip that was supposed to have been two tips (and
had no space after period when they were printed).

Removes out-of-date deploy.py and various Python 2.6 workarounds.

Testing:
- Full build with /usr/bin/python pointed to python3
- run-all-tests passed with python pointed to python3
- ran push_to_asf.py

Change-Id: Idff388aff33817b0629347f5843ec34c78f0d0cb
Reviewed-on: http://gerrit.cloudera.org:8080/19697
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2023-04-26 18:52:23 +00:00
Tamas Mate
175dae33d9 IMPALA-11974: Fix xrange call in collect_minidumps.py
Python3 deprecates xrange operator, this commit replaces it with the new
range operator similar to earlier replacements in IMPALA-11974.

Testing:
 - Tested that Python3 parses this file successfully

Change-Id: I155725d8edac9222064c88912b19f0591c690afe
Reviewed-on: http://gerrit.cloudera.org:8080/19727
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-04-13 23:53:59 +00:00
Joe McDonnell
2b550634d2 IMPALA-11952 (part 2): Fix print function syntax
Python 3 now treats print as a function and requires
the parenthesis in invocation.

print "Hello World!"
is now:
print("Hello World!")

This fixes all locations to use the function
invocation. This is more complicated when the output
is being redirected to a file or when avoiding the
usual newline.

print >> sys.stderr , "Hello World!"
is now:
print("Hello World!", file=sys.stderr)

To support this properly and guarantee equivalent behavior
between python 2 and python 3, all files that use print
now add this import:
from __future__ import print_function

This also fixes random flake8 issues that intersect with
the changes.

Testing:
 - check-python-syntax.sh shows no errors related to print

Change-Id: Ib634958369ad777a41e72d80c8053b74384ac351
Reviewed-on: http://gerrit.cloudera.org:8080/19552
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2023-02-28 17:11:50 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace or add the existing ASF license text with the one given
   on the website.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files_txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Lars Volker
7b9e9c6349 IMPALA-3800: Python 2.6 support for collect_minidumps.py
Python 2.6 is the default python version shipped with CentOS 6.6 and
the minidump collection script needs to run there, too. However, the
tarfile library version shipped with python 2.6 does not support using
TarFile objects as context managers. This change wraps the usage of
tarfile.TarFile in contextlib.closing to support python 2.6, too.

Testing: I verified that this script runs with python 2.7 on testdata
generated using the generate_minidump_collection_testdata.py script. I
also copied an updated version to a CentOS 6.6 cluster and verified that
the problem is resolved there.

Change-Id: Ic9028626fa829ff7571d2a731c2fcd7e15e2ce36
Reviewed-on: http://gerrit.cloudera.org:8080/3525
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Lars Volker <lv@cloudera.com>
2016-07-05 13:37:25 -07:00
Lars Volker
d16e83214a IMPALA-3581: Change location of minidump folders to log_dir
Currently the default minidump location is /tmp/impala-minidumps, which can be wiped on
reboot on various distributions. This change moves the default location to
FLAGS_log_dir/minidumps/$daemon. The additional trailing $daemon folder is kept to prevent
name collisions in case of local test clusters and strangely configured installations.

For local test clusters the minidumps will be written to
$IMPALA_HOME/logs/cluster/minidumps/{catalogd,impalad,statestored}.

Change-Id: Idecf5a314bfb8b0870e8aa4819c4fb39a107702f
Reviewed-on: http://gerrit.cloudera.org:8080/3171
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Internal Jenkins
2016-05-31 23:32:11 -07:00
Taras Bobrovytsky
afdc32e409 CDH-39818 ADDENDUM: Change time format in minidump collection script
Before this patch, the Breakpad minidump collection script expected the
start_time and end_time to be specified in epoch seconds UTC. This
patch changes this format to milliseconds (as requested by CM).

Change-Id: I9b91bffbf0d4ab37753566ea9c31bbb01ac41623
Reviewed-on: http://gerrit.cloudera.org:8080/3163
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
2016-05-23 08:40:20 -07:00
Taras Bobrovytsky
ff0cd823cc CDH-39818: Add Breakpad minidump collection script
Add two scripts:
collect_minidumps.py - generates a compressed tarball that contains
minidumps generated by Breakpad.
generate_minidump_collection_testdata.py - generates testdata for the
above script.

Change-Id: I85b3643133e28eca07507ac2a79acbf73128456f
Reviewed-on: http://gerrit.cloudera.org:8080/2997
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
2016-05-13 15:52:53 -07:00