Commit Graph

6 Commits

Author SHA1 Message Date
Joe McDonnell
0a0001e1a8 Revert "IMPALA-9718: Delete pkg_resources from IMPALA_HOME/shell/"
The fix for IMPALA-9718 introduced test failures on Centos 7.
See IMPALA-9735.

This reverts commit 75d98b4b08.

Change-Id: Id09c55435f432a8626a45079f58860d6e27ac55e
Reviewed-on: http://gerrit.cloudera.org:8080/15881
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2020-05-07 23:15:32 +00:00
David Knupp
75d98b4b08 IMPALA-9718: Delete pkg_resources from IMPALA_HOME/shell/
This file was originally copied here in 2013 because it was somehow
not availble on SUSE Linux systems. However, pkg_resources is part
of the stdlib, and should be available now on any system. Just to
sure about that, I tested on a SUSE Linux machine to be sure.

systest@dknupp-sles-test:~> python
Python 2.7.13 (default, Jan 11 2017, 10:56:06) [GCC] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkg_resources
>>> pkg_resources.__file__
'/usr/lib/python2.7/site-packages/pkg_resources/__init__.pyc'
>>>

This version is now woefully out of date, and has had to be patched a
couple times for bugs. We should get rid of it.

Change-Id: I2a06f21177a6fa561478c3dd243d9262deb62db4
Reviewed-on: http://gerrit.cloudera.org:8080/15855
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: David Knupp <dknupp@cloudera.com>
2020-05-05 07:16:18 +00:00
David Knupp
bc9d7e063d IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.
This is the main patch for making the the impala-shell cross-compatible with
python 2 and python 3. The goal is wind up with a version of the shell that will
pass python e2e tests irrepsective of the version of python used to launch the
shell, under the assumption that the test framework itself will continue to run
with python 2.7.x for the time being.

Notable changes for reviewers to consider:

- With regard to validating the patch, my assumption is that simply passing
  the existing set of e2e shell tests is sufficient to confirm that the shell
  is functioning properly. No new tests were added.

- A new pytest command line option was added in conftest.py to enable a user
  to specify a path to an alternate impala-shell executable to test. It's
  possible to use this to point to an instance of the impala-shell that was
  installed as a standalone python package in a separate virtualenv.

  Example usage:
  USE_THRIFT11_GEN_PY=true impala-py.test --shell_executable=/<path to virtualenv>/bin/impala-shell -sv shell/test_shell_commandline.py

  The target virtualenv may be based on either python3 or python2. However,
  this has no effect on the version of python used to run the test framework,
  which remains tied to python 2.7.x for the foreseeable future.

- The $IMPALA_HOME/bin/impala-shell.sh now sets up the impala-shell python
  environment independenty from bin/set-pythonpath.sh. The default version
  of thrift is thrift-0.11.0 (See IMPALA-9489).

- The wording of the header changed a bit to include the python version
  used to run the shell.

    Starting Impala Shell with no authentication using Python 3.7.5
    Opened TCP connection to localhost:21000
    ...

    OR

    Starting Impala Shell with LDAP-based authentication using Python 2.7.12
    Opened TCP connection to localhost:21000
    ...

- By far, the biggest hassle has been juggling str versus unicode versus
  bytes data types. Python 2.x was fairly loose and inconsistent in
  how it dealt with strings. As a quick demo of what I mean:

  Python 2.7.12 (default, Nov 12 2018, 14:36:49)
  [GCC 5.4.0 20160609] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> d = 'like a duck'
  >>> d == str(d) == bytes(d) == unicode(d) == d.encode('utf-8') == d.decode('utf-8')
  True

  ...and yet there are weird unexpected gotchas.

  >>> d.decode('utf-8') == d.encode('utf-8')
  True
  >>> d.encode('utf-8') == bytearray(d, 'utf-8')
  True
  >>> d.decode('utf-8') == bytearray(d, 'utf-8')   # fails the eq property?
  False

  As a result, this was inconsistency was reflected in the way we handled
  strings in the impala-shell code, but things still just worked.

  In python3, there's a much clearer distinction between strings and bytes, and
  as such, much tighter type consistency is expected by standard libs like
  subprocess, re, sqlparse, prettytable, etc., which are used throughout the
  shell. Even simple calls that worked in python 2.x:

  >>> import re
  >>> re.findall('foo', b'foobar')
  ['foo']

  ...can throw exceptions in python 3.x:

  >>> import re
  >>> re.findall('foo', b'foobar')
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/data0/systest/venvs/py3/lib/python3.7/re.py", line 223, in findall
      return _compile(pattern, flags).findall(string)
  TypeError: cannot use a string pattern on a bytes-like object

  Exceptions like this resulted in a many, if not most shell tests failing
  under python 3.

  What ultimately seemed like a better approach was to try to weed out as many
  existing spurious str.encode() and str.decode() calls as I could, and try to
  implement what is has colloquially been called a "unicode sandwich" -- namely,
  "bytes on the outside, unicode on the inside, encode/decode at the edges."

  The primary spot in the shell where we call decode() now is when sanitising
  input...

  args = self.sanitise_input(args.decode('utf-8'))

  ...and also whenever a library like re required it. Similarly, str.encode()
  is primarily used where a library like readline or csv requires is.

- PYTHONIOENCODING needs to be set to utf-8 to override the default setting for
  python 2. Without this, piping or redirecting stdout results in unicode errors.

- from __future__ import unicode_literals was added throughout

Testing:

  To test the changes, I ran the e2e shell tests the way we always do (against
  the normal build tarball), and then I set up a python 3 virtual env with the
  shell installed as a package, and manually ran the tests against that.

  No effort has been made at this point to come up with a way to integrate
  testing of the shell in a python3 environment into our automated test
  processes.

Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615
Reviewed-on: http://gerrit.cloudera.org:8080/15524
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-04-18 05:13:50 +00:00
David Knupp
ed15c2c58f IMPALA-3343: Part 1 -- Fix simple python 2->3 syntax errors
In an effort to keep the work of reviewing the changes more manageable
with regard to making the impala-shell python3 compatible, I'm trying
to break the patches up into smaller chunks.

The first patch is the easiest one -- simply addressing the handful of
syntax issues that aren't python 3 compatible, namely changing the
print statements to function calls, changing the way we catch exceptions,
and adding a few simple branches to work around the removal of such
things as dict.iteritems().

We needed the print function imported from __future__ because it allows
us to pass in a file descriptor, e.g., sys.stderr.

Notably, there's nothing in this patch related to string/bytes/unicode
changes from python 2 to 3.

Change-Id: I9a515da01ef03d5936cb1a4d9e4bc6d105386b1d
Reviewed-on: http://gerrit.cloudera.org:8080/15487
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-03-20 03:10:07 +00:00
Tim Armstrong
cdbcdca670 IMPALA-4570: shell tarball breaks with certain setuptools versions
The bug was in the third-party pkg_resources.py script. The version
check was broken because it matches any version with a "0.7" substring
instead of just versions starting with 0.7.

This is a known bug. setuptools even re-released 20.7.0 as version
20.8.0 to avoid it:
e5822f0d5b

Testing:
I was unable to reproduce this locally, but I think the fix is clear-cut
enough that this is ok.

Change-Id: I0565c0e6c1be7d82c3f35d2545ba044a684bb075
Reviewed-on: http://gerrit.cloudera.org:8080/5314
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2016-12-03 03:34:16 +00:00
ishaan
083b66e6f2 Make the shell work on SLES by adding pkg_resources.py from setuptools. 2014-01-08 10:48:07 -08:00