14 Commits

Author SHA1 Message Date
Joe McDonnell
3398f20afe IMPALA-14491: Fix run-workload.py's handling of HS2's exec summary
Recently, we switched bin/run-workload.py to use HS2. It turns
out that the HS2 client code is not producing the same data
structure for the exec summary. report_benchmark_results.py
relies on that data structure and fails for HS2.

This changes the HS2 client code to use the same representation
as the beeswax client. A function that does this conversion
(build_summary_table_from_thrift) already exists for our regular
tests, so this change reuses it.

Testing:
 - Ran bin/run-workload.py twice to produce json files and
   processed them with report_benchmark_results.py. This
   failed before the change and passed afterward.

Change-Id: I0a041bdebe748b6b3a05b552584e0ca2327cff67
Reviewed-on: http://gerrit.cloudera.org:8080/23597
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-10-25 16:37:46 +00:00
Riza Suminto
f28a32fbc3 IMPALA-13916: Change BaseTestSuite.default_test_protocol to HS2
This is the final patch to move all Impala e2e and custom cluster tests
to use the HS2 protocol by default. Only beeswax-specific tests still
run against the beeswax protocol by default. We can remove them once
Impala officially removes beeswax support.

HS2 error message formatting in impala-hs2-server.cc is adjusted slightly
to match the formatting in impala-beeswax-server.cc.

Move TestWebPageAndCloseSession from webserver/test_web_pages.py to
custom_cluster/test_web_pages.py to disable glog log buffering.

Testing:
- Passed exhaustive tests, except for some known and unrelated flaky
  tests.

Change-Id: I42e9ceccbba1e6853f37e68f106265d163ccae28
Reviewed-on: http://gerrit.cloudera.org:8080/22845
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Jason Fehr <jfehr@cloudera.com>
2025-05-20 14:32:10 +00:00
Joe McDonnell
aa4050b4d9 IMPALA-11976: Fix use of deprecated functions/fields removed in Python 3
Python 3 moved several things around or removed deprecated
functions / fields:
 - sys.maxint was removed, but sys.maxsize provides similar functionality
 - long was removed, but int provides the same range
 - file() was removed, but open() already provided the same functionality
 - Exception.message was removed, but str(exception) is equivalent
 - Some encodings (like hex) were moved to codecs.encode()
 - string.letters -> string.ascii_letters
 - string.lowercase -> string.ascii_lowercase
 - string.strip was removed

This fixes all of those locations. Python 3 also has slightly different
rounding behavior from round(), so this changes round() to use future's
builtins.round() to get the Python 3 behavior.
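
A minimal sketch of the replacements listed above, assuming it is run
under Python 3. The variable names are illustrative only and do not come
from the Impala code:

```python
import codecs
import string
import sys

# Each comment shows the Python 2 spelling that the line replaces.
largest = sys.maxsize                       # was: sys.maxint
big = int(2) ** 64                          # was: long(2) ** 64
hex_text = codecs.encode(b"impala", "hex").decode("ascii")  # was: "impala".encode("hex")
letters = string.ascii_letters              # was: string.letters
lowercase = string.ascii_lowercase          # was: string.lowercase
stripped = "  query  ".strip()              # was: string.strip("  query  ")

try:
    raise ValueError("bad input")
except ValueError as e:
    message = str(e)                        # was: e.message

# Python 3's round() rounds exact halves to the nearest even integer.
half_to_even_low = round(2.5)
half_to_even_high = round(3.5)
```

On Python 2, the same rounding result would need the round() from the
future package's builtins module, as described above.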

This fixes the following pylint warnings:
 - file-builtin
 - long-builtin
 - invalid-str-codec
 - round-builtin
 - deprecated-string-function
 - sys-max-int
 - exception-message-attribute

Testing:
 - Ran core tests

Change-Id: I094cd7fd06b0d417fc875add401d18c90d7a792f
Reviewed-on: http://gerrit.cloudera.org:8080/19591
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted it to the integer
division '//' operator where an integer result was needed (e.g. for
indices, counts of records, etc). Some code was also using relative
imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
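
A small sketch of the two kinds of division, assuming Python 3 semantics.
The function names are hypothetical and not taken from the Impala code:

```python
# On Python 2, the file would begin with:
#   from __future__ import absolute_import, division, print_function
# Python 3 behaves this way without the import.


def average_ms(total_ms, num_queries):
    """A fractional result is wanted here, so use true division."""
    return total_ms / num_queries


def middle_index(rows):
    """An index must be an integer, so use floor division explicitly."""
    return len(rows) // 2
```

With these definitions, average_ms(7, 2) gives 3.5 and
middle_index([1, 2, 3, 4, 5]) gives 2 on either Python version once the
__future__ import is in place.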

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Joe McDonnell
c71de994b0 IMPALA-11952 (part 1): Fix except syntax
Python 3 does not support this old except syntax:

except Exception, e:

Instead, it needs to be:

except Exception as e:

This uses impala-futurize to fix all locations of
the old syntax.

Testing:
 - The check-python-syntax.sh no longer shows errors
   for except syntax.

Change-Id: I1737281a61fa159c8d91b7d4eea593177c0bd6c9
Reviewed-on: http://gerrit.cloudera.org:8080/19551
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2023-02-28 17:11:50 +00:00
Thomas Tauber-Marshall
8b5cb576d2 IMPALA-8173: Fix KeyError in run-workload.py
A recent change (IMPALA-7694) causes run-workload.py to fail with a
KeyError due to trying to construct an ImpalaBeeswaxResult without a
query id.

This patch fixes that issue and two related issues:
- This problem was not caught by automated testing even though we run
  run-workload.py in run-all-tests.sh because of an issue where
  queries that fail to produce results are silently ignored.
- Fixes an issue where queries that fail to produce results would
  confusingly produce the error:
  'NoneType' object has no attribute 'time_taken'

Testing:
- Ran run-workload.py locally and demonstrated that it works now.
- Ran run-all-tests.sh locally and demonstrated that it fails now when
  the KeyError issue isn't fixed.

Change-Id: I5b8a3c3dd7499335b9290d5667c194e8c0eabd12
Reviewed-on: http://gerrit.cloudera.org:8080/12397
Reviewed-by: Thomas Marshall <thomasmarshall@cmu.edu>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-02-12 00:03:36 +00:00
Jim Apple
e1c9cbd077 IMPALA-7585: support LDAP in run-workload.py
This patch just threads through the user, password, and ssl settings
all the way back to the ImpalaBeeswaxClient.

Change-Id: Ibfa987d8a027f50bc1ba3db5aa355331442a74ba
Reviewed-on: http://gerrit.cloudera.org:8080/11938
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: David Knupp <dknupp@cloudera.com>
2018-11-19 18:46:38 +00:00
Todd Lipcon
30bb0b3d89 tests: ensure consistent logging format across tests
Many of the test modules included calls to 'logging.basicConfig' at
global scope in their implementation. This meant that by just importing
one of these files, other tests would inherit their logging format. This
is typically a bad idea in Python -- modules should not have side
effects like this on import.

The format was additionally inconsistent. In some cases we had a "--"
prepended to the format, and in others we didn't. The "--" is very
useful since it lets developers copy-paste query-test output back into
the shell to reproduce an issue.

This patch fixes the above by centralizing the logging configuration in
a pytest hook that runs prior to all pytests. A few other non-pytest
related tools now configure logging in their "main" code which is only
triggered when the module is executed directly.
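
A rough sketch of the pattern for a standalone tool, assuming a
hypothetical format string and logger name (neither is taken from the
Impala code). Importing this module does not touch the logging
configuration; only running it directly does:

```python
import logging

# Hypothetical format; the leading "--" keeps log lines valid SQL comments.
LOG_FORMAT = "-- %(levelname)s: %(message)s"

# Creating a logger at import time is fine: it has no global side effects.
LOG = logging.getLogger("example_tool")


def main():
    # Logging is configured only when the tool itself is the entry point.
    logging.basicConfig(level=logging.INFO, format=LOG_FORMAT)
    LOG.info("select 1")


if __name__ == "__main__":
    main()
```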

I tested that, with this change, logs still show up properly in the .xml
output files from 'run-tests.py' as well as when running tests manually
from impala-py.test

Change-Id: I55ef0214b43f87da2d71804913ba4caa964f789f
Reviewed-on: http://gerrit.cloudera.org:8080/11225
Reviewed-by: Philip Zeyliger <philip@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-08-18 04:21:00 +00:00
Thomas Tauber-Marshall
e6e2baea33 IMPALA-4372: 'Describe formatted' returns types in upper case
A recent change caused 'describe formatted' to display the types
in all upper case, but we want 'describe formatted' to match Hive's
'describe' output, which displays the types in lower case.

This patch also fixes several problems with test_describe_formatted,
which was encountering an error but reporting success.

Change-Id: I274b97d4d1247244247fb38a5ca7f4c10bba8d22
Reviewed-on: http://gerrit.cloudera.org:8080/4861
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Internal Jenkins
2016-11-15 05:38:12 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace or add the existing ASF license text with the one given
   on the website.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files.txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Taras Bobrovytsky
609b80410e Clean up Python test import statements
Many of our test scripts have import statements that look like
"from xxx import *". It is a good practice to explicitly name what
needs to be imported. This commit implements this practice. Also,
unused import statements are removed.
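
As an illustration of the practice, using a standard library module
rather than an Impala test module (the helper name is hypothetical):

```python
# Instead of: from os.path import *
from os.path import basename, join


def result_file(directory, query_name):
    """Build the path of a JSON result file from explicitly imported helpers."""
    return join(directory, basename(query_name) + ".json")
```

With explicit names, a reader can see where basename and join come from,
and tools such as pylint can flag imports that are no longer used.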

Change-Id: I6a33bb66552ae657d1725f765842f648faeb26a8
Reviewed-on: http://gerrit.cloudera.org:8080/3444
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
2016-07-15 23:26:18 +00:00
Kapil Rastogi
e1c5959b4d Reuse session for executing queries (Hive on Spark)
Change-Id: I06c798dc311d63eb0a875450fd26d06db4e84a03
Reviewed-on: http://gerrit.cloudera.org:8080/2374
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Internal Jenkins
2016-05-12 14:17:54 -07:00
Kapil Rastogi
f7740e0fa8 Add Hive support to Impala workload runner
This also allows us to process results (saved in JSON file)
and upload them to a DB for storage/reporting purposes.

Change-Id: I4f65889a61d50a23e1f4588c208cc79ff660ec0a
Reviewed-on: http://gerrit.cloudera.org:8080/1637
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-12-19 04:44:23 +00:00
ishaan
e408560c56 Perf Framework: Move exec functions to a separate file and deprecate Hive execution.
This patch does the following:
  - Removes code that deals with executing queries through Hive.
  - Gives the user the option to specify only the hostname for the Impalads.
  - Moves the execution functions to their own .py file.
  - Removes some duplicate code (exec_shell_cmd -> exec_process)

Change-Id: If49951c7bb5423ef9343d4d211f6da13d397325a
Reviewed-on: http://gerrit.cloudera.org:8080/862
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Internal Jenkins
2015-09-22 10:58:32 -07:00