This puts all of the thrift-generated python code into the
impala_thrift_gen package. This is similar to what Impyla
does for its thrift-generated python code, except that it
uses the impala_thrift_gen package rather than impala._thrift_gen.
This is a preparatory patch for fixing the absolute import
issues.
This patches all of the thrift files to add the python namespace.
This has code to apply the patching to the thirdparty thrift
files (hive_metastore.thrift, fb303.thrift) to do the same.
Putting all the generated python into a package makes it easier
to understand where the imports are getting code. When the
subsequent change rearranges the shell code, the thrift generated
code can stay in a separate directory.
This uses isort to sort the imports for the affected Python files
with the provided .isort.cfg file. This also adds an impala-isort
shell script to make it easy to run.
Testing:
- Ran a core job
Change-Id: Ie2927f22c7257aa38a78084efe5bd76d566493c0
Reviewed-on: http://gerrit.cloudera.org:8080/20169
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
1. Python 3 requires absolute imports within packages. This
can be emulated via "from __future__ import absolute_import"
2. Python 3 changed division to "true" division that doesn't
round to an integer. This can be emulated via
"from __future__ import division"
This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.
I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
Testing:
- Ran core tests
Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
IMPALA-7312 added the query option FETCH_ROWS_TIMEOUT_MS, but it only
applies to fetch requests against a query that has already transitioned
to the 'FINISHED' state. This patch changes the timeout so that it
applies to queries in the 'RUNNING' state as well. Before this patch,
fetch requests issued while a query was 'RUNNING' blocked until the query
transitioned to the 'FINISHED' state, and then it fetched results and
returned them. After this patch, fetch requests against queries in the
'RUNNING' state will block for 'FETCH_ROWS_TIMEOUT_MS' and then return.
For HS2 clients, fetch requests that return while a query is 'RUNNING'
set their TStatusCode to STILL_EXECUTING_STATUS. For Beeswax clients,
fetch requests that return while a query is 'RUNNING' set the 'ready'
flag to false. For both clients, hasMoreRows is set to true.
If the following sequence of events occurs:
* A fetch request is issued and blocks on a 'RUNNING' query
* The query transitions to the 'FINISHED' state
* The fetch request attempts to read multiple batches
Then the time spent waiting for the query to finish is deducted from
the timeout used when waiting for rows to be produced by the Coordinator
fragment.
Fixed a bug in the current usage of FETCH_ROWS_TIMEOUT_MS where the
time units for FETCH_ROWS_TIMEOUT_MS and MonotonicStopWatch were not
being converted properly.
Tests:
* Moved existing fetch timeout tests from hs2/test_fetch.py into a new
test file hs2/test_fetch_timeout.py.
* Added several new tests to hs2/test_fetch_timeout.py to validate that
the timeout is applied to 'RUNNING' queries and that the timeout applies
across a 'RUNNING' and 'FINISHED' query.
* Added new tests to query_test/test_fetch.py to validate the timeout
while using the Beeswax protocol.
Change-Id: I2cba6bf062dcc1af19471d21857caa797c1ea4a4
Reviewed-on: http://gerrit.cloudera.org:8080/14332
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>