Python 3 moved several things around or removed deprecated
functions / fields:
- sys.maxint was removed, but sys.maxsize provides similar functionality
- long was removed, but int provides the same range
- file() was removed, but open() already provided the same functionality
- Exception.message was removed, but str(exception) is equivalent
- Some encodings (like hex) were moved to codecs.encode()
- string.letters -> string.ascii_letters
- string.lowercase -> string.ascii_lowercase
- string.strip was removed
This fixes all of those locations. Python 3 also has slightly different
rounding behavior from round(), so this changes round() to use future's
builtins.round() to get the Python 3 behavior.
This fixes the following pylint warnings:
- file-builtin
- long-builtin
- invalid-str-codec
- round-builtin
- deprecated-string-function
- sys-max-int
- exception-message-attribute
Testing:
- Ran cores tests
Change-Id: I094cd7fd06b0d417fc875add401d18c90d7a792f
Reviewed-on: http://gerrit.cloudera.org:8080/19591
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
1. Python 3 requires absolute imports within packages. This
can be emulated via "from __future__ import absolute_import"
2. Python 3 changed division to "true" division that doesn't
round to an integer. This can be emulated via
"from __future__ import division"
This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.
I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
Testing:
- Ran core tests
Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
Python 3 now treats print as a function and requires
the parenthesis in invocation.
print "Hello World!"
is now:
print("Hello World!")
This fixes all locations to use the function
invocation. This is more complicated when the output
is being redirected to a file or when avoiding the
usual newline.
print >> sys.stderr , "Hello World!"
is now:
print("Hello World!", file=sys.stderr)
To support this properly and guarantee equivalent behavior
between python 2 and python 3, all files that use print
now add this import:
from __future__ import print_function
This also fixes random flake8 issues that intersect with
the changes.
Testing:
- check-python-syntax.sh shows no errors related to print
Change-Id: Ib634958369ad777a41e72d80c8053b74384ac351
Reviewed-on: http://gerrit.cloudera.org:8080/19552
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
Currently, the query events (audits and lineages) are logged as a
part of query unregistration. This delays the event logging in cases
where the Unregister() is delayed by client for some reason (ex: Hue
does not call Unregister until the browser tab is closed) or the client
goes away without calling Unregister and the query timeout kicks in.
This patch moves this event logging to an earlier stage in the query
lifecycle. Moved the event logging related code into ClientRequestState
for easier code refactoring.
The conditions under which the events are logged are slightly
modified by this patch. Without the patch, events are logged for
unsuccessful queries if atleast a single fetch is perfomed. This patch
relaxes this guarantee to log events for any query that reaches
the FINISHED state (rows are available to fetch by the client) and does
not wait for a fetch to be performed. This simplifies the coordinator
state machine by avoiding unnecessary synchronization.
Added some test coverage for coordinator side code paths for writing
lineages. fe specific lineage tests only verified the correctness of
lineage created but did not test whether it was being flushed correctly
to the disk.
Change-Id: I639b9c1acb9806b29292cd85be2863688453ca2e
Reviewed-on: http://gerrit.cloudera.org:8080/14143
Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:
http://www.apache.org/legal/src-headers.html#headers
Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
http://www.apache.org/legal/src-headers.html#notice
to follow that format and add a line for Cloudera.
3) Replace or add the existing ASF license text with the one given
on the website.
Much of this change was automatically generated via:
git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files_txt | xargs fix_apache_license.py [1]
Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.
[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
modification to ORIG_LICENSE to match Impala's license text.
Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
Also fix a stale comment in the avro scanner header.
The main work here is to fix the handling of empty result sets in the
test result verifier. This is a problem because we wanted to verify
that the results in the test file were a superset of the rows
returned, and this was thrown off by superflous '' rows in the expected
and actual result sets.
The basic problem is that the way test file sections
was parsed conflated an empty result section with non-empty result
section that had a single empty string. I.e.:
---- RESULTS
====
vs
---- RESULTS
====
both got resolved to [''].
Change-Id: Ia007e558d92c7e4ce30be90446fdbb1f50a0ebc4
Reviewed-on: http://gerrit.cloudera.org:8080/3413
Tested-by: Internal Jenkins
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Many of our test scripts have import statements that look like
"from xxx import *". It is a good practice to explicitly name what
needs to be imported. This commit implements this practice. Also,
unused import statements are removed.
Change-Id: I6a33bb66552ae657d1725f765842f648faeb26a8
Reviewed-on: http://gerrit.cloudera.org:8080/3444
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
Many python files had a hashbang and the executable bit set though
they were not intended to be run a standalone script. That makes
determining which python files are actually scripts very difficult.
A future patch will update the hashbang in real python scripts so they
use $IMPALA_HOME/bin/impala-python.
Change-Id: I04eafdc73201feefe65b85817a00474e182ec2ba
Reviewed-on: http://gerrit.cloudera.org:8080/599
Reviewed-by: Casey Ching <casey@cloudera.com>
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Internal Jenkins