To remove the dependency on Python 2, existing scripts need to use
python3 rather than python. These commands find those
locations (for impala-python and regular python):
git grep impala-python | grep -v impala-python3 | grep -v impala-python-common | grep -v init-impala-python
git grep bin/python | grep -v python3
This removes or switches most of these locations by various means:
1. If a python file has a #!/bin/env impala-python (or python) but
doesn't have a main function, it removes the hash-bang and makes
sure that the file is not executable.
2. Most scripts can simply switch from impala-python to impala-python3
(or python to python3) with minimal changes.
3. The cm-api pypi package (which doesn't support Python 3) has been
replaced by the cm-client pypi package and interfaces have changed.
Rather than migrating the code (which hasn't been used in years), this
deletes the old code and stops installing cm-api into the virtualenv.
The code can be restored and revamped if there is any interest in
interacting with CM clusters.
4. This switches tests/comparison over to impala-python3, but this code has
bit-rotted. Some pieces can be run manually, but it can't be fully
verified with Python 3. It shouldn't hold back the migration on its own.
5. This also replaces locations of impala-python in comments / documentation /
READMEs.
6. kazoo (used for interacting with HBase) needed to be upgraded to a
version that supports Python 3. The newest version of kazoo requires
upgrades of other component versions, so this uses kazoo 2.8.0 to avoid
needing other upgrades.
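Item 1 above (dropping the hash-bang from files that are not scripts) can be
sketched roughly as follows. The helper name is hypothetical, and using a
"__main__" guard to decide whether a file is a script is an assumption made
for illustration, not necessarily the check this change applied.

```python
import os
import stat


def strip_shebang_if_not_script(path):
    """Remove a python shebang and the executable bits from a non-script file.

    Hypothetical helper: a file counts as a script only if it mentions
    "__main__", which is a heuristic. Returns True if the file was modified.
    """
    with open(path) as f:
        lines = f.readlines()
    has_shebang = bool(lines) and lines[0].startswith("#!") and "python" in lines[0]
    has_main = any("__main__" in line for line in lines)
    if not has_shebang or has_main:
        return False
    # Rewrite the file without its first line.
    with open(path, "w") as f:
        f.writelines(lines[1:])
    # Clear the user, group and other executable bits.
    mode = os.stat(path).st_mode
    os.chmod(path, mode & ~(stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
    return True
```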
The two remaining uses of impala-python are:
- bin/cmake_aux/create_virtualenv.sh
- bin/impala-env-versioned-python
These will be removed separately when we drop Python 2 support
completely. In particular, these are useful for testing impala-shell
with Python 2 until we stop supporting Python 2 for impala-shell.
The docker-based tests still use /usr/bin/python, but this can
be switched over independently (and doesn't impact impala-python).
Testing:
- Ran core job
- Ran build + dataload on CentOS 7 and Red Hat 8
- Manual testing of individual scripts (except some bitrotted areas like the
random query generator)
Change-Id: If209b761290bc7e7c716c312ea757da3e3bca6dc
Reviewed-on: http://gerrit.cloudera.org:8080/23468
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
We're starting to see environments where the system Python ('python') is
Python 3. Updates utility and build scripts to work with Python 3, and
updates check-pylint-py3k.sh to check scripts that use system python.
Fixes other issues found during a full build and test run with Python
3.8 as the default for 'python'.
Fixes an impala-shell tip that was supposed to have been two tips (and
was printed with no space after the period).
Removes out-of-date deploy.py and various Python 2.6 workarounds.
Testing:
- Full build with /usr/bin/python pointed to python3
- run-all-tests passed with python pointed to python3
- ran push_to_asf.py
Change-Id: Idff388aff33817b0629347f5843ec34c78f0d0cb
Reviewed-on: http://gerrit.cloudera.org:8080/19697
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
Python 3 removes the xrange built-in, so this commit replaces it with
range, similar to the earlier replacements in IMPALA-11974.
Adds universal_newlines to Popen so that we return text in Python 3.
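A minimal sketch of both changes, assuming a hypothetical run_command helper
(the script's real helper may be named and structured differently):

```python
import subprocess
import sys


def run_command(cmd):
    """Run cmd and return (return code, stdout, stderr) as text."""
    # universal_newlines=True makes Popen return str rather than bytes on
    # Python 3; on Python 2 the output was already str.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    stdout, stderr = proc.communicate()
    return proc.returncode, stdout, stderr


def collect_samples(count):
    """Run a trivial command count times and return the text outputs."""
    outputs = []
    # range() exists on both Python 2 and 3, unlike xrange().
    for i in range(count):
        _, out, _ = run_command([sys.executable, "-c", "print(%d)" % i])
        outputs.append(out)
    return outputs
```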
Testing:
- Ran '$IMPALA_HOME/bin/diagnostics/collect_diagnostics.py --pid <pid>
--minidumps 2 1 --minidumps_dir $IMPALA_HOME/logs/cluster/minidumps
--stacks 2 1' with Python 2/3 and inspected the results.
Change-Id: I52f075825d47613293b106a7c50d4499c19cd3f4
Reviewed-on: http://gerrit.cloudera.org:8080/19746
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Currently, the diagnostics script expects a full path to the actual
directory to which process minidumps are written. This is, however,
inconsistent with Impala's --minidump_path configuration.
Impala creates a subdirectory under FLAGS_minidump_path (for ex:
<FLAGS_minidump_path>/impalad) to which it writes the minidumps.
This commit fixes the diagnostic script input --minidump_dir to be
consistent with the above behavior from Impala. It now looks for
minidumps under the directory <--minidump_path>/<process-name>.
The users of this script are expected to fix their input args
accordingly.
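The lookup described above might look like the sketch below. The function
names are hypothetical, and filtering on a ".dmp" suffix is an assumption
about how the minidump files are named.

```python
import os


def minidump_dir_for(minidump_path, process_name):
    """Return <minidump_path>/<process-name>, e.g. <minidump_path>/impalad."""
    return os.path.join(minidump_path, process_name)


def list_minidumps(minidump_path, process_name):
    """Return the paths of *.dmp files for process_name, oldest first."""
    dump_dir = minidump_dir_for(minidump_path, process_name)
    if not os.path.isdir(dump_dir):
        return []
    paths = [os.path.join(dump_dir, name) for name in os.listdir(dump_dir)
             if name.endswith(".dmp")]
    return sorted(paths, key=os.path.getmtime)
```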
Change-Id: I9e59f108a1f29a33768a39d0f4554d96e2dcd381
Reviewed-on: http://gerrit.cloudera.org:8080/11353
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
collect_diagnostics.py was missing a shebang (#!) at its
beginning as well as the executable bit. Since it's
meant to be runnable standalone, I've added those.
I was able to also simplify the argument handling around
the single required --pid argument.
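As an illustration only, a required --pid argument could be declared with
argparse as below. This is a sketch; the parser in the script may be set up
differently, and the integer type is an assumption.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Collect diagnostics (sketch).")
    # required=True makes argparse exit with a usage error if --pid is missing,
    # so no hand-written check is needed afterwards.
    parser.add_argument("--pid", required=True, type=int,
                        help="PID of the Impala process to inspect.")
    return parser.parse_args(argv)
```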
Change-Id: If4d021c2f6f9dec62d6865d32ec0419e41a2441c
Reviewed-on: http://gerrit.cloudera.org:8080/11178
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Without the fix, the diagnostics tar file included the entire
directory structure of the diagnostics root dir.
Before:
=======
$ tar tf /tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh.tar.gz
tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/
tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/stacks/
tmp/impala-diagnostics-2018-05-08-11-59-39-spv8Eh/stacks/jstack-0.txt
....
After:
=====
$ tar tf /tmp/impala-diagnostics-2018-05-08-12-01-51-Y0nlQI.tar.gz
impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/
impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/stacks/
impala-diagnostics-2018-05-08-12-01-51-Y0nlQI/stacks/jstack-0.txt
.....
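One way to get the "After" layout is to pass an explicit arcname when adding
the directory to the archive. This is a sketch with a hypothetical function
name, not necessarily how the fix was implemented.

```python
import os
import tarfile


def archive_dir(src_dir, tar_path):
    """Write src_dir to a gzipped tar whose entries start at its basename."""
    src_dir = os.path.abspath(src_dir)
    # try/finally instead of "with": TarFile is not a context manager on
    # Python 2.6.
    tar = tarfile.open(tar_path, "w:gz")
    try:
        # arcname replaces the full source path that would otherwise be
        # stored for every entry.
        tar.add(src_dir, arcname=os.path.basename(src_dir))
    finally:
        tar.close()
```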
Tested with python 2.6
Change-Id: I540f6c228a0315780d45cf11961f124478b5dd0c
Reviewed-on: http://gerrit.cloudera.org:8080/10347
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This commit adds the necessary tooling to automate diagnostics
collection for Impala daemons. The following diagnostics are supported:
1. Native core dump (+ shared libs)
2. GDB/Java thread dump (pstack + jstack)
3. Java heap dump (jmap)
4. Minidumps (using breakpad) *
5. Profiles
Given the required inputs, the script outputs a zip compressed
impala diagnostic bundle with all the diagnostics collected.
The script can be run manually with the following command.
python collect_diagnostics.py --help
Tested with python 2.6 and later.
* minidumps collected here correspond to the state of the Impala
process at the time this script is triggered. This is different
from collect_minidumps.py which archives the entire minidump
directory.
Change-Id: I166e726f1dd1ce81187616e4f06d2404fa379bf8
Reviewed-on: http://gerrit.cloudera.org:8080/10056
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Bharath Vissapragada <bharathv@cloudera.com>
A couple of things do not work in python2.6:
-- Multiple with statements in the same context
-- shutil.make_archive()
I need a little more time to test the fix with python2.6.
Meanwhile, reverting this to unblock others. I'll resubmit
the fix when I'm confident that it works with python2.6.
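For reference, the first incompatibility can be avoided by nesting the with
statements, since the multi-item form was only added in Python 2.7. The
function below is a hypothetical example, not code from the script.
shutil.make_archive() is likewise new in Python 2.7, so a 2.6-compatible
version has to build the archive another way.

```python
def copy_text(src_path, dst_path):
    """Copy a text file using with statements that also parse on Python 2.6."""
    # Python 2.7+ accepts "with open(a) as src, open(b, 'w') as dst:".
    # Nesting the two statements is equivalent and works on 2.6 as well.
    with open(src_path) as src:
        with open(dst_path, "w") as dst:
            dst.write(src.read())
```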
This reverts commit 2883c99500.
Change-Id: I221ede9d5eb4d89ea20992cc27a8284803af3223
Reviewed-on: http://gerrit.cloudera.org:8080/9872
Reviewed-by: Michael Ho <kwho@cloudera.com>
Tested-by: Michael Ho <kwho@cloudera.com>
This commit adds the necessary tooling to automate diagnostics
collection for Impala daemons. The following diagnostics are supported:
1. Native core dump (+ shared libs)
2. GDB/Java thread dump (pstack + jstack)
3. Java heap dump (jmap)
4. Minidumps (using breakpad) *
5. Profiles
Given the required inputs, the script outputs a zip compressed
impala diagnostic bundle with all the diagnostics collected.
The script can be run manually with the following command.
python collect_diagnostics.py --help
* minidumps collected here correspond to the state of the Impala
process at the time this script is triggered. This is different
from collect_minidumps.py which archives the entire minidump
directory.
Change-Id: Ib29caec7c3be5b6a31e60461294979c318300f64
Reviewed-on: http://gerrit.cloudera.org:8080/9815
Reviewed-by: Lars Volker <lv@cloudera.com>
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins