Commit Graph

115 Commits

Author SHA1 Message Date
Joe McDonnell
11396d3146 IMPALA-13384: Only install gcovr deps for coverage builds
IMPALA-13279 upgraded gcovr to 7.2 and moved it from python 2 to
python 3.8. gcovr has several dependencies that require native
compilation, and this increased the cost of initializing the
Python 3 virtualenv substantially:

Without gcovr: 1m43.279s
With gcovr and deps: 6m35.107s

This moves gcovr to its own requirements file and only installs
gcovr if this is a coverage build (detected from the
.cmake_build_type file).
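
A minimal sketch of the detection described above. It assumes the build
type file holds just the build type name and that coverage build types
contain the substring "COVERAGE"; the function names, the default file
name, and the matching rule are illustrative and not taken from the patch.

```python
import os
import tempfile


def is_coverage_build(impala_home, filename=".cmake_build_type"):
    """Return True if the recorded CMake build type looks like a coverage build.

    Assumption: the file contains only the build type name, and coverage
    build types contain the substring "COVERAGE".
    """
    path = os.path.join(impala_home, filename)
    if not os.path.isfile(path):
        # No recorded build type: treat it as a non-coverage build.
        return False
    with open(path) as f:
        build_type = f.read().strip()
    return "COVERAGE" in build_type.upper()


def make_demo_home(build_type, filename=".cmake_build_type"):
    """Create a throwaway directory with a build type file (demo helper only)."""
    home = tempfile.mkdtemp()
    with open(os.path.join(home, filename), "w") as f:
        f.write(build_type + "\n")
    return home
```

A bootstrap script could then install the separate gcovr requirements
file only when is_coverage_build() returns True.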

Testing:
 - Verified that a coverage build does install gcovr and
   produce a report

Change-Id: I1d0fd6d21273053aaf2acee39fcb83d9093d49a2
Reviewed-on: http://gerrit.cloudera.org:8080/21849
Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-09-27 00:25:41 +00:00
Joe McDonnell
ed94f31a25 IMPALA-13279: Upgrade gcovr to 7.2
In some environments, the code coverage report is empty even
though the tests ran successfully and gcno/gcda files are
written properly.

This upgrades to gcovr 7.2, which does not show the same
problem. gcovr 7.2 requires Python 3.8, so this switches to use
Python 3.8 from the toolchain and installs gcovr in the Python 3
virtualenv.

gcovr 7.2 outputs logging to stderr, so this also modifies
bin/coverage_helper.sh to redirect stderr to stdout.

Testing:
 - Verified that this can generate a report locally and on
   the affected environment

Change-Id: I5b1aaa92c65f54149a3e7230cbe56d5286f1051a
Reviewed-on: http://gerrit.cloudera.org:8080/21647
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-08-06 22:28:16 +00:00
Joe McDonnell
5071f54a4c IMPALA-12825: Install thrift into the impala-python virtualenv
impala-python currently gets its Thrift from the toolchain
by adding the appropriate Thrift toolchain directories to
the PYTHONPATH. This is a problem when switching to Python 3,
because the toolchain Thrift was built with Python 2 and
this can produce complicated bugs. In general, it is also
not a good idea to get Python dependencies from the toolchain.

This switches to installing Thrift into the impala-python
virtualenv, which lets the different Python versions have
their own copy of compiled files.

Testing:
 - Ran a core job

Change-Id: Ib36e8a1ce8d446b69b08e81ea458f95c158e28f5
Reviewed-on: http://gerrit.cloudera.org:8080/21046
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-01 08:06:56 +00:00
zhangyifan27
45682c132f IMPALA-12229: Support soft-delete Kudu table
Adds 'kudu_table_reserve_seconds' query option to set reserved time
for deleted Impala managed Kudu tables. The default value is 0.
This option can prevent users from deleting important Kudu tables
by mistake.

Testing:
- Added e2e tests.

Change-Id: I3020567bb6cfe4dd48ef17906f8de674f37217e7
Reviewed-on: http://gerrit.cloudera.org:8080/20773
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-12-14 00:12:55 +00:00
Gergely Farkas
04bdb4d32c IMPALA-12552: Fix Kerberos authentication issue that occurs
in python 3 environment when kerberos_host_fqdn option is used

In Python 2, the sasl layer does not accept unicode strings,
so we have to explicitly encode the kerberos_host_fqdn string
to ascii. However, this is not the case in python 3, where
we have to omit the encode, because if we don't do this,
impala-shell wants to use the following service principal
during Kerberos auth:
my_service_name/b'my.kerberos.host.fqdn'@MY.REALM
instead of the correct one, which is:
my_service_name/my.kerberos.host.fqdn@MY.REALM
(This is because the output of the encode function
is a byte array in python 3.)
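
The difference can be reproduced with plain string formatting. This is
only an illustration: the real principal is assembled inside the
SASL/Kerberos layer, and the helper names below are hypothetical.

```python
import sys


def fqdn_for_sasl(host_fqdn):
    """Return the host FQDN in the form the SASL layer expects.

    Python 2: encode to an ASCII byte string (unicode is not accepted).
    Python 3: pass the text string through unchanged.
    """
    if sys.version_info[0] == 2:
        return host_fqdn.encode("ascii")
    return host_fqdn


def principal(service, host, realm):
    """Format a principal the way a naive string format would."""
    return "{0}/{1}@{2}".format(service, host, realm)


good = principal("my_service_name",
                 fqdn_for_sasl("my.kerberos.host.fqdn"), "MY.REALM")
# Encoding unconditionally yields bytes on Python 3, which format as b'...'.
bad = principal("my_service_name",
                "my.kerberos.host.fqdn".encode("ascii"), "MY.REALM")
```

On Python 3, 'bad' carries the b'...' wrapper shown above while 'good'
is the expected principal.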

Tested with new unit tests and with a snapshot build
manually in CDP PVC DS.

Change-Id: I8b157d76824ad67faf531a529256a8afe2ab9d49
Reviewed-on: http://gerrit.cloudera.org:8080/20691
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
2023-11-17 20:08:42 +00:00
Joe McDonnell
7b502f7c96 IMPALA-12240: Put gcc on the PATH when building the impala-python venv
On some systems, we have seen the build for the impala-python
virtualenv refer to system gcc directly, even though we have
specified Impala toolchain's gcc via CC. When the system gcc
is newer than Impala's gcc, it fails to execute because it needs
symbols that are not present in Impala's libstdc++:

gcc: /home/joe/impala/toolchain/toolchain-packages-gcc10.4.0/gcc-10.4.0/lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by gcc)

This adds the toolchain gcc to the PATH when building the impala-python
virtualenv. This means that any direct reference to gcc will use our
compiler rather than system gcc. We continue to have CC pointed to
our compiler.
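
A rough sketch of the environment setup, assuming the virtualenv build
runs as a subprocess that receives an explicit environment. The function
name and the example directory are hypothetical; the real bootstrap code
may build its environment differently.

```python
import os


def env_with_toolchain_gcc(gcc_bin_dir, base_env=None):
    """Return a copy of the environment that prefers the toolchain gcc.

    The toolchain bin directory is prepended to PATH so that a bare "gcc"
    resolves to it, and CC keeps pointing at the same compiler.
    """
    env = dict(os.environ if base_env is None else base_env)
    old_path = env.get("PATH")
    if old_path:
        env["PATH"] = gcc_bin_dir + os.pathsep + old_path
    else:
        env["PATH"] = gcc_bin_dir
    env["CC"] = os.path.join(gcc_bin_dir, "gcc")
    return env
```

The returned dictionary can be passed as the env argument of
subprocess.check_call when invoking pip.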

Testing:
 - Ran a build on Redhat 9 where the issue presented

Change-Id: Ia5ddd6a88b41a3f8ba04d13538b3de2d9499cbf5
Reviewed-on: http://gerrit.cloudera.org:8080/20114
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-06-24 02:28:55 +00:00
Michael Smith
0a42185d17 IMPALA-9627: Update utility scripts for Python 3 (part 2)
We're starting to see environments where the system Python ('python') is
Python 3. Updates utility and build scripts to work with Python 3, and
updates check-pylint-py3k.sh to check scripts that use system python.

Fixes other issues found during a full build and test run with Python
3.8 as the default for 'python'.

Fixes an impala-shell tip that was supposed to have been two tips (and
had no space after period when they were printed).

Removes out-of-date deploy.py and various Python 2.6 workarounds.

Testing:
- Full build with /usr/bin/python pointed to python3
- run-all-tests passed with python pointed to python3
- ran push_to_asf.py

Change-Id: Idff388aff33817b0629347f5843ec34c78f0d0cb
Reviewed-on: http://gerrit.cloudera.org:8080/19697
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2023-04-26 18:52:23 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
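
The kind of call-site decision described above can be illustrated as
follows. The two functions are made-up examples, not code from the
patch; they only show where true division is wanted and where '//' is
needed.

```python
# On Python 2, each file would start with:
#   from __future__ import absolute_import, division, print_function
# On Python 3 these behaviors are already the default.


def average(total, count):
    """True division: the result may be fractional."""
    return total / count


def midpoint_index(items):
    """Integer division: the result is used as a list index."""
    return len(items) // 2
```

With true division in effect, using '/' in midpoint_index would produce
a float and fail when used as an index, which is why such locations
were converted to '//'.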

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Joe McDonnell
566df80891 IMPALA-11959: Add Python 3 virtualenv
This adds a Python 3 equivalent to the impala-python
virtualenv based on the toolchain Python 3.7.16.
This modifies bootstrap_virtualenv.py to support
the two different modes. This adds py2-requirements.txt
and py3-requirements.txt to allow some differences
between the Python 2 and Python 3 virtualenvs.

Here are some specific package changes:
 - allpairs is replaced with allpairspy, as allpairs did
   not support Python 3.
 - requests is upgraded slightly, because otherwise it has issues
   with idna==2.8.
 - pylint is limited to Python 3, because we are adding it
   and don't need it on both
 - flake8 is limited to Python 2, because it will take
   some work to switch to a version that works on Python 3
 - cm_api is limited to Python 2, because it doesn't support
   Python 3
 - pytest-random does not support Python 3 and it is unused,
   so it is removed
 - Bump the version of setuptools-scm to support Python 3

This adds impala-pylint, which can be used to do further
Python 3 checks via --py3k. This also adds a bin/check-pylint-py3k.sh
script to enforce specific py3k checks. The banned py3k warnings
are specified in the bin/banned_py3k_warnings.txt. This is currently
empty, but this can ratchet up the py3k strictness over time
to avoid regressions.

This pulls in a new toolchain with the fix for IMPALA-11956
to get Python 3.7.16.

Testing:
 - Hand tested that the allpairs libraries produce the
   same results
 - The python3 virtualenv has no influence on regular
   tests yet

Change-Id: Ica4853f440c9a46a79bd5fb8e0a66730b0b4efc0
Reviewed-on: http://gerrit.cloudera.org:8080/19567
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Michael Smith
0c72c98f91 IMPALA-9627: Update utility scripts for Python 3
Updates utility scripts that don't use impala-python to work with Python
3 so we can build on systems that don't include Python 2 (such as SLES
15 SP4).

Primarily adds 'universal_newlines=True' to subprocess calls so they
return text rather than binary data in Python 3 with a change that's
compatible with Python 2.
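
A small example of the pattern, using the current interpreter as the
child process so that it runs anywhere. The wrapper name is
illustrative.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd and return its stdout as text on both Python 2 and 3.

    Without universal_newlines=True, Python 3 returns bytes here.
    """
    return subprocess.check_output(cmd, universal_newlines=True)


out = run_and_capture([sys.executable, "-c", "print('hello')"])
```

Because 'out' is text, callers can split or compare it against string
literals without an explicit decode step.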

Testing:
- built in SLES 15 SP4 container with Python 3

Change-Id: I7f4ce71fa1183aaeeca55d0666aeb113640c5cf2
Reviewed-on: http://gerrit.cloudera.org:8080/19559
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2023-03-01 04:53:49 +00:00
Joe McDonnell
ff62a4df39 IMPALA-11951: Add tools for checking/fixing python 3 syntax
This adds the bin/check-python-syntax.sh script, which
runs "python -m compileall" for all python files in
Impala with both python2 and python3. This detects
syntax errors in the python files. This will be
incorporated into precommit once it is clean.
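
The check can be approximated from Python as below. This is a sketch of
the idea, not the contents of bin/check-python-syntax.sh; the helper
names are hypothetical and the interpreter to test is passed in by the
caller.

```python
import os
import subprocess
import sys
import tempfile


def syntax_ok(interpreter, path):
    """Return True if every .py file under path compiles with interpreter.

    "python -m compileall" exits with a non-zero status when any file
    fails to compile.
    """
    with open(os.devnull, "w") as devnull:
        rc = subprocess.call([interpreter, "-m", "compileall", "-q", path],
                             stdout=devnull, stderr=devnull)
    return rc == 0


def write_file(directory, name, text):
    """Write a small source file into directory (demo helper only)."""
    with open(os.path.join(directory, name), "w") as f:
        f.write(text)
```

Running syntax_ok once per interpreter (for example a python2 and a
python3 binary) gives the two-version check described above.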

This also adds future to the impala-python virtualenv.
This provides the futurize script (exposed via
impala-futurize), which can be used to automatically
fix some py2/py3 issues. Future also provides the
builtins library, which can provide python 3
functionality on python 2.

Testing:
 - Ran impala-futurize locally
 - Ran the script repeatedly while fixing syntax errors

Change-Id: Iae2c51bc6ddc9b6a04469ee1b8284227fed3bd45
Reviewed-on: http://gerrit.cloudera.org:8080/19550
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2023-02-28 17:11:50 +00:00
Joe McDonnell
a9cfc7b33f IMPALA-11624: Bump Impyla dependency to 0.18.0
IMPALA_THRIFT_PY_VERSION is also bumped to 0.16.0p3.
As 0.16.0p3 Thrift does not contain Python related
patches and Impyla 0.18.0 depends on Thrift 0.16.0,
now we are consistently using Thrift 0.16.0 in all
Python code. This also bumps the Thrift in the
shell's ext-py directory to 0.16.0 (based on the
Thrift 0.16.0 pypi tarball with the egg directory
removed).

Testing:
 - Ran a GVO job

Change-Id: I7265558b0e07959c606cba73cd251c3edfcb3ed5
Reviewed-on: http://gerrit.cloudera.org:8080/18456
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-02-27 20:39:26 +00:00
stiga-huang
45ea094fa2 IMPALA-11716: Bump up gcovr version to 4.2
IMPALA-9999 upgrades the GCC version to 10.4, which generates a new gcov
format that the current gcovr version (3.4) can't parse. This patch
upgrades gcovr to the latest Python2-compatible version (4.2). Also adds
Jinja2, MarkupSafe and lxml as the required dependent packages. The
development packages of libxml2 and libxslt are also added in
bootstrap_system.sh and bootstrap_build.sh.

This patch also fixes a failure due to the gcov executable not found in
PATH.

Tests:
 - Verified builds on Ubuntu 16.04 and CentOS 7.9
 - Verified coverage_helper.sh works after this patch

Change-Id: I9458fa0dc97d69f88a4e8a3313dc9440215dfd52
Reviewed-on: http://gerrit.cloudera.org:8080/19226
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-11-11 00:09:05 +00:00
Joe McDonnell
cff286e751 IMPALA-9999: Switch to GCC 10.4
This upgrades GCC and libstdc++ to version 10.4. This
required patching or upgrading several dependencies
so they could compile with GCC 10. The toolchain
companion change has details on what items needed
to be upgraded and why.

The toolchain companion change switches GCC to build
with toolchain binutils rather than host binutils. This
means that the python virtualenv initialization needs
to include binutils on the path.

This disables two warnings introduced in the new GCC
versions (Wclass-memaccess and Winit-list-lifetime).
These two warnings occur in our code and also in
dependencies like LLVM and rapidjson. These are not
critical warnings, so they can be addressed
independently and reenabled later.

Binary sizes increase, particularly when including
debug symbols:
                         | GCC 7.5     | GCC 10.4
impalad RELEASE stripped |  83204768   |  88702824
impalad RELEASE          | 707278904   | 971711456
impalad DEBUG stripped   | 106677672   |  97391944
impalad DEBUG            | 725864760   | 867647512

Testing:
 - Multiple test jobs (core, release exhaustive, ASAN)
 - Performance testing for TPC-H and TPC-DS shows
   a modest improvement (2-4%).
 - Code compiles without warnings on debug and release

Change-Id: Ibe6857b822925226d39fd4d6413457ef6bbaabec
Reviewed-on: http://gerrit.cloudera.org:8080/18134
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
2022-09-20 15:50:18 +00:00
Michael Smith
8b1002aa6a IMPALA-11398: Update flake8 for indent-size=2
Updates flake8 to the latest Python 2-compatible version so we can use
indent-size=2. Our code uses 2-space indents and we have previously
worked around or disabled flake8 checks that rely on 4-space indenting.

Change-Id: Ia701f6e3d86be451ae86d041b799c8a10aee2d93
Reviewed-on: http://gerrit.cloudera.org:8080/18669
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-06-30 09:29:41 +00:00
wzhou-code
397d1d15a2 IMPALA-10745: Support Kerberos over HTTP for impala-shell
This patch ports the implementation of GSSAPI authentication over http
transport from Impyla (https://github.com/cloudera/impyla/pull/415) to
impala-shell.

The implementation adds a new dependency on 'kerberos' python module,
which is a pip-installed module distributed under Apache License Version
2.
When using impala-shell with Kerberos over http, it is assumed that the
host has a preexisting kinit-cached Kerberos ticket that impala-shell
can pass to the server automatically without the user having to reenter the
password.

Testing:
 - Passed exhaustive tests.
 - Tested manually on a real cluster with a full Kerberos setup.

Change-Id: Ia59ba4004490735162adbd468a00a962165c5abd
Reviewed-on: http://gerrit.cloudera.org:8080/18493
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-05-10 03:22:41 +00:00
Michael Smith
e6ed98c22b IMPALA-11201: update gitignore files
Updates gitignore for files generated during bootstrap_development.
Fixes deleting tracked files in be/src/thirdparty. Includes ignore rules
for past versions of shell dependencies and updates ignores for current
versions.

Change-Id: I03deba5e7fb151ef8e34039becdcc3fb47684084
Reviewed-on: http://gerrit.cloudera.org:8080/18499
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-05-10 03:06:59 +00:00
yx91490
f566e7dee7 IMPALA-10994: Normalize the pip package name part of download URL.
According to PEP 503, pip repo servers don't support unnormalized URL
access, and some package names in
'infra/python/deps/*requirements.txt' are unnormalized, e.g. 'Cython'.
pip_download.py concatenates $PYPI_MIRROR and the package name to build
the download URL directly, so the URL may be unnormalized.

Fix this by normalizing the package name in the download URL using the
method recommended in PEP 503.
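
The normalization rule from PEP 503 is a single regular expression
substitution followed by lowercasing. The URL helper below assumes the
"simple" index layout from the same PEP; how pip_download.py actually
combines $PYPI_MIRROR with the name is not shown here, so treat
package_index_url as a hypothetical illustration.

```python
import re


def normalize_package_name(name):
    """Normalize a package name as recommended by PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def package_index_url(mirror, name):
    """Build a simple-index URL for a package (assumed URL layout)."""
    return "{0}/simple/{1}/".format(mirror.rstrip("/"),
                                    normalize_package_name(name))
```

For example, 'Cython' becomes 'cython' and 'setuptools_scm' becomes
'setuptools-scm' before being placed in the URL.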

Change-Id: I479df0ad7acf3c650b8f5317372261d5e2840864
Reviewed-on: http://gerrit.cloudera.org:8080/17987
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-11-26 08:17:10 +00:00
Attila Jeges
c8aa5796d9 IMPALA-10879: Add parquet stats to iceberg manifest
This patch adds parquet stats to iceberg manifest as per-datafile
metrics.

The following metrics are supported:
- column_sizes :
  Map from column id to the total size on disk of all regions that
  store the column. Does not include bytes necessary to read other
  columns, like footers.

- null_value_counts :
  Map from column id to number of null values in the column.

- lower_bounds :
  Map from column id to lower bound in the column serialized as
  binary. Each value must be less than or equal to all non-null,
  non-NaN values in the column for the file.

- upper_bounds :
  Map from column id to upper bound in the column serialized as
  binary. Each value must be greater than or equal to all non-null,
  non-NaN values in the column for the file.

The corresponding parquet stats are collected by 'ColumnStats'
(in 'min_value_', 'max_value_', 'null_count_' members) and
'HdfsParquetTableWriter::BaseColumnWriter' (in
'total_compressed_byte_size_' member).

Testing:
- New e2e test was added to verify that the metrics are written to the
  Iceberg manifest upon inserting data.
- New e2e test was added to verify that lower_bounds/upper_bounds
  metrics are used to prune data files on querying iceberg tables.
- Existing e2e tests were updated to work with the new behavior.
- BE test for single-value serialization.

Relevant Iceberg documentation:
- Manifest:
  https://iceberg.apache.org/spec/#manifests
- Values in lower_bounds and upper_bounds maps should be Single-value
  serialized to binary:
  https://iceberg.apache.org/spec/#appendix-d-single-value-serialization

Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Reviewed-on: http://gerrit.cloudera.org:8080/17806
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Attila Jeges <attilaj@cloudera.com>
2021-09-02 21:34:41 +00:00
wzhou-code
237ed5e873 IMPALA-10874: Upgrade impyla to the latest version
This patch upgrades impyla to the latest version 0.18a1, which supports
cookie retention for LDAP authentications. Also adds unit-test cases
for Impyla's HTTP test with LDAP authentication.

Testing:
 - Passed core tests.

Change-Id: I990e5cdde4e98d6ab3581fe48f53a5d0590ce492
Reviewed-on: http://gerrit.cloudera.org:8080/17795
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-25 05:52:35 +00:00
Csaba Ringhofer
94f67a3432 IMPALA-7825: Upgrade Thrift version to 0.11.0
Before this patch Impala mainly used Thrift 0.9.3, but it was
possible to compile Impala shell with Thrift 0.11.0, so the 0.11.0
Thrift lib was already included in the toolchain.

Most of the changes are related to replacing boost:: with std::
shared_ptr-s in cpp code (this is a continuation of patch by Sahil).

The Thrift upgrade also needs an Impyla release with Thrift 0.11.0, as
Impala's test framework relies on Impyla. A thrift_sasl release is also
needed, because it currently pins Thrift version to 0.9.3 for Python 2.

The current patch uses alpha releases from Impyla and thrift_sasl that
use thrift 0.11.0.

Notable side effects:
- old logic to compile thrift for impala-shell with 0.11.0 was removed
- impala_shell's utf8 handling had to be updated as the new 0.11.0
  compilation happens with no_utf8strings. This also made things a
  bit faster, e.g. the following is ~0.22s instead of ~0.25s
  shell/impala_shell.py \
    -B -q "select * from functional_parquet.alltypes;" > /dev/null
- THRIFT-3921 changed the stream operators to print an enum's name
  instead of its number, leading to slightly different messages
  in some cases.
- "templates" was added to the Thrift generator's parameters to avoid
  a compilation issue (related to IMPALA-10600). I didn't notice any
  change in compilation time. This option generates .tcc files with
  templatized readers/writers for Thrift types. Currently we don't
  use these, but they could potentially speed up (de)serialization.

Testing:
- ran Impyla's test suite with Python 2 and 3
- ran core tests

Change-Id: Idd13f177b4f7acc07872ea6399035aa180ef6ab6
Reviewed-on: http://gerrit.cloudera.org:8080/17170
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-04-27 13:36:54 +00:00
Jim Apple
f18e0d72a7 Upgrade urllib3 to 1.24.2
Change-Id: Ib18c76e66db2920e7e05a63b5bcd79854b819cd9
Reviewed-on: http://gerrit.cloudera.org:8080/17270
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
2021-04-06 09:26:45 +00:00
Joe McDonnell
ede22a63a5 IMPALA-10608 followup: Detect the virtualenv tarball version
When rebasing from an older commit, the version change
in virtualenv can cause there to be multiple virtualenv
tarballs of different versions in the infra/python/deps
directory. bootstrap_virtualenv.py currently doesn't
handle this gracefully, because it is looking for all
virtualenv*.tar.gz files and fails when it finds more
than one.

This changes bootstrap_virtualenv.py to get the virtualenv
version from the requirements.txt file and only look
for the tarball with that version. If it fails to get
the version, it falls back to the old method.
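
A sketch of the lookup with its fallback, assuming the requirements
file pins virtualenv with a 'virtualenv == <version>' line. The function
names and the error handling are illustrative rather than the actual
bootstrap_virtualenv.py code.

```python
import glob
import os
import re
import tempfile


def virtualenv_version(requirements_text):
    """Return the pinned virtualenv version, or None if it is not found."""
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        match = re.match(r"^virtualenv\s*==\s*(\S+)$", line)
        if match:
            return match.group(1)
    return None


def find_virtualenv_tarball(deps_dir, requirements_text):
    """Locate the virtualenv tarball, preferring the pinned version."""
    version = virtualenv_version(requirements_text)
    if version is not None:
        pattern = "virtualenv-{0}.tar.gz".format(version)
    else:
        # Old behavior: accept any virtualenv tarball.
        pattern = "virtualenv*.tar.gz"
    matches = glob.glob(os.path.join(deps_dir, pattern))
    if len(matches) != 1:
        raise RuntimeError("Expected exactly one match for " + pattern)
    return matches[0]


def make_demo_deps(names):
    """Create a temp directory containing empty files (demo helper only)."""
    deps_dir = tempfile.mkdtemp()
    for name in names:
        open(os.path.join(deps_dir, name), "w").close()
    return deps_dir
```

With two tarballs present, the pinned version selects one of them,
while the fallback pattern would still fail as before.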

Testing:
 - Copied virtualenv-16.7.10.tar.gz to virtualenv-16.7.9.tar.gz
   and verified that bootstrap_virtualenv.py works

Change-Id: Iebfa9ba5e223d5187414e02e24f34562418fae40
Reviewed-on: http://gerrit.cloudera.org:8080/17249
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2021-03-31 22:33:43 +00:00
Joe McDonnell
e7fc18c4ea IMPALA-10608: Update kudu-python version and remove some unused packages
This updates kudu-python to version 1.14.0 (from 1.2.0).
As part of this, it disables ccache for bootstrap_virtualenv.py.
ccache wasn't working anyway, because pip install uses random
temporary directories. It also needs to copy a few files to
the build directory for the Kudu install. The advantage to
upgrading is that the new version no longer has a numpy dependency.

Additionally, this modifies a few minor packages:
 - virtualenv moves to the latest version prior to the rewrite
   that accompanied version 20 (i.e. 16.7.10).
 - setuptools moves to the last version that supports python 2.7 (44.1.1)
 - remove boto3, ipython, and ordereddict

These changes speed up installing the virtualenv
Before:
real    3m11.956s
user    2m49.620s
sys     0m14.266s
After:
real    1m38.798s
user    1m33.591s
sys     0m8.112s

Testing:
 - Hand tests, GVO run

Change-Id: Ib47770df9e46de448fe2bffef7abe2c3aa942fb9
Reviewed-on: http://gerrit.cloudera.org:8080/17231
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-31 03:17:24 +00:00
Joe McDonnell
9670c1455d IMPALA-10606 (part 2): Clean up ordering of requirements.txt
This is a followup to the original IMPALA-10606 that reorders
the requirements.txt alphabetically. No versions changed
as part of this.

Change-Id: I2f13ec8f8af80c4bac5da30d08a2ea4c56806d27
Reviewed-on: http://gerrit.cloudera.org:8080/17229
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2021-03-31 03:17:24 +00:00
Joe McDonnell
1142c7b58e IMPALA-10606: Simplify impala-python virtualenv bootstrapping
Bootstrapping the impala-python virtualenv requires multiple
rounds of pip installs with different sets of requirements.
This consolidates the requirements.txt, stage2-requirements.txt,
and compiled-requirements.txt into a single requirements.txt.
This will make it easier to upgrade python packages.

This also splits out setuptools into its own
setuptools-requirements.txt. Setuptools is used during the
pip install for several of the dependencies. Recent versions
of setuptools do not support Python 2, but some of the install
tools (like easy_install) don't know how to pick a version
of setuptools that works with Python 2. Splitting it out to its
own requirements file lets us pin the version.

To make review easier, this does not change any of the versions
of the dependencies. It also leaves the stage2-requirements.txt
and compiled-requirements.txt split out in separate sections
of requirements.txt. These will later be turned into a single
alphabetical list.

Testing:
 - Tested impala-python locally
 - Ran GVO

Change-Id: I8e920e5a257f1e0613065685078624a50d59bf2e
Reviewed-on: http://gerrit.cloudera.org:8080/17226
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2021-03-31 03:17:24 +00:00
Jim Apple
103774a8e5 Update Python requests package to 2.20.0
See https://2.python-requests.org/en/master/community/updates/#id8.
This is currently only used in the tests, but it's best to fix
this now.

While here, remove now-false note about required support for Python
2.6.

Change-Id: I092a641a12f38cdb45b0062c31ffb51c0c664800
Reviewed-on: http://gerrit.cloudera.org:8080/17215
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
2021-03-30 01:27:04 +00:00
Jim Apple
e5d5dbc30a Update Paramiko to 2.4.2.
See https://www.paramiko.org/changelog.html#2.4.2. This shouldn't
directly apply to Impala deployments, but it is best to fix this in
test now.

Change-Id: If9cc9ea4a0763c8b5303ca4e8482761ee2f53efa
Reviewed-on: http://gerrit.cloudera.org:8080/17214
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-22 19:34:00 +00:00
Joe McDonnell
60f8f87b09 IMPALA-10274: Initialize impala-python as part of the CMake build
Initializing the impala-python virtualenv takes a couple minutes,
so it is useful to do that in parallel to the rest of the build.
This moves the impala-python initialization to its own step
in the CMake build. It stops using impala-python for commands
invoked from buildall.sh or the CMake build to avoid premature
or concurrent initializations of impala-python. Then, it adds
a dedicated step to initialize impala-python.

Testing:
 - Ran a core job and a couple builds
 - Rebuilt and verified that impala-python is not reinitialized
   if it is already initialized

Change-Id: Ieff51263c55bd234028fed7101c94b4a928590f0
Reviewed-on: http://gerrit.cloudera.org:8080/16607
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-02-04 17:03:57 +00:00
Tim Armstrong
b8a2b75466 IMPALA-10225: bump impyla version to 0.17a1
Update a couple of tests with the new improved error messages.

Change-Id: I70a0e883275f3c29e2b01fd5bab7725857c8a1ed
Reviewed-on: http://gerrit.cloudera.org:8080/16562
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-10-10 02:08:22 +00:00
guojingfeng
7baa31ea04 IMPALA-10093: Replace urllib with wget to download python deps
When building Impala inside a company internal network, pip_download.py
failed to download dependency eggs from the https endpoint, even though
system proxies such as http_proxy and https_proxy were set correctly.
This is an issue with Python 2's urllib. This replaces urllib with wget,
which works well with system proxies such as https_proxy.
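
One way such a download could look, assuming wget is on the PATH and is
given only the standard -O option. The function names are hypothetical
and the options used by the actual patch are not shown here.

```python
import subprocess


def wget_command(url, dest_path):
    """Build the wget invocation that saves url to dest_path."""
    return ["wget", url, "-O", dest_path]


def download(url, dest_path):
    """Download url with wget, which reads http_proxy/https_proxy itself."""
    subprocess.check_call(wget_command(url, dest_path))
```

Delegating to wget means proxy handling comes from the tool rather than
from the Python HTTP stack.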

Change-Id: I146d93312701fd682420cb65cf4738bc030f3cfb
Reviewed-on: http://gerrit.cloudera.org:8080/16344
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-09-11 16:43:35 +00:00
Tim Armstrong
6ec6aaae8e IMPALA-3695: Remove KUDU_IS_SUPPORTED
Testing:
Ran exhaustive tests.

Change-Id: I059d7a42798c38b570f25283663c284f2fcee517
Reviewed-on: http://gerrit.cloudera.org:8080/16085
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-06-18 01:11:18 +00:00
Joe McDonnell
13fbe510c0 IMPALA-9838: Switch to GCC 7.5.0
This upgrades GCC and libstdc++ to version 7.5.0. There
have been ABI changes since 4.9.2, so this means that
the native-toolchain produced with the new compiler is
not interoperable with one produced by the old compiler.
To allow that transition, IMPALA_TOOLCHAIN_PACKAGES_HOME
is now a subdirectory of IMPALA_TOOLCHAIN
(toolchain-packages-gcc${IMPALA_GCC_VERSION}) to distinguish
it from the old packages.

Some Python packages in the impala-python virtualenv are
compiled using the toolchain GCC and now use the new ABI.
This leads to two changes:
1. When constructing the LD_LIBRARY_PATH for impala-python,
we include the GCC libstdc++ libraries. Otherwise, certain
Python packages that use C++ fail on older OSes like Centos 7.
This fixes IMPALA-9804.
2. Since developers work on various branches, this changes
the virtualenv's directory location to a directory with
the GCC version in the name. This allows the virtualenv
built with GCC 7 to coexist with the current virtualenv
built with GCC 4.9.2. The location for the old virtualenv is
${IMPALA_HOME}/infra/python/env. The new location is
${IMPALA_HOME}/infra/python/env-gcc${IMPALA_GCC_VERSION}. This
required updating several impala-python scripts.
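
The directory naming can be expressed as a small helper. The function
is an illustrative sketch; the scripts in the patch may compute the
path in shell rather than Python.

```python
import os


def virtualenv_dir(impala_home, gcc_version=None):
    """Return the impala-python virtualenv directory.

    With a GCC version, the new location env-gcc<version> is used;
    without one, the old location "env" is returned.
    """
    base = os.path.join(impala_home, "infra", "python")
    if gcc_version:
        return os.path.join(base, "env-gcc" + gcc_version)
    return os.path.join(base, "env")
```

Because the version is part of the path, a virtualenv built with GCC
7.5.0 and one built with GCC 4.9.2 resolve to different directories.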

There are various odds-and-ends related to the transition:
1. Due to the small string optimization, the size of std::string
changed, which means that various data structures also changed
in size. This required updating some static asserts.
2. There is a bug in clang-tidy that reports a use-after-free
for some code using std::shared_ptr. Clang is not modeling
the shared_ptr correctly, so it is a false-positive. As a workaround,
this disables the clang-analyzer-cplusplus.NewDelete diagnostic.
3. Various small compilation fixes (includes, etc).

Performance testing:
 - Ran single-node performance tests on TPC-H for the following
   configurations:
    - TPC-H Parquet scale 30 with normal configurations
    - TPC-H Parquet scale 30 with codegen disabled
    - TPC-H Kudu scale 10
   None found any significant regressions. Full results are
   posted on the JIRA.
 - Ran single-node performance tests on targeted-perf scale 10.
   No significant regressions.
 - The size of binaries (impalad, etc) is slightly smaller with the new GCC:
   GCC 4.9.2 release impalad binary: 545664
   GCC 7.5.0 release impalad binary: 539900
 - Compilation in DEBUG mode is roughly 15-25% faster

Functional testing:
 - Ran core jobs, exhaustive release jobs, UBSAN

Change-Id: Ia0beb2b618ba669c9699f8dbc0c52d1203d004e4
Reviewed-on: http://gerrit.cloudera.org:8080/16045
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-06-15 23:42:12 +00:00
Joe McDonnell
56ee90c598 IMPALA-9760: Add IMPALA_TOOLCHAIN_PACKAGES_HOME to prepare for GCC7
The locations for native-toolchain packages in IMPALA_TOOLCHAIN
currently do not include the compiler version. This means that
the toolchain can't distinguish between native-toolchain packages
built with gcc 4.9.2 versus gcc 7.5.0. The collisions can cause
issues when switching back and forth between branches.

This introduces the IMPALA_TOOLCHAIN_PACKAGES_HOME environment
variable, which is a location inside IMPALA_TOOLCHAIN that would
hold native-toolchain packages. Currently, it is set to the same
as IMPALA_TOOLCHAIN, so there is no difference in behavior.
This lays the groundwork to add the compiler version to this
path when switching to GCC7.
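
A minimal sketch of how a script might resolve package locations through
the new variable. The helper names and the "<name>-<version>" directory
layout are assumptions made for illustration; only the two environment
variable names come from this change.

```python
import os


def toolchain_packages_home(env):
    """Return the directory holding native-toolchain packages.

    Falls back to IMPALA_TOOLCHAIN when IMPALA_TOOLCHAIN_PACKAGES_HOME
    is unset, which matches the "no difference in behavior" default.
    """
    return env.get("IMPALA_TOOLCHAIN_PACKAGES_HOME", env["IMPALA_TOOLCHAIN"])


def package_dir(env, name, version):
    # Assumed layout: <packages home>/<name>-<version>
    return os.path.join(toolchain_packages_home(env),
                        "{0}-{1}".format(name, version))
```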

Testing:
 - The only impediment to building with
   IMPALA_TOOLCHAIN_PACKAGES_HOME=$IMPALA_TOOLCHAIN/test is
   Impala-lzo. With a custom Impala-lzo, compilation succeeds.
   Either Impala-lzo will be fixed or it will be removed.
 - Core tests

Change-Id: I1ff641e503b2161baf415355452f86b6c8bfb15b
Reviewed-on: http://gerrit.cloudera.org:8080/15991
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-05-30 16:25:37 +00:00
Laszlo Gaal
b921d982b5 IMPALA-9668: Obey SKIP_TOOLCHAIN_BOOTSTRAP during virtualenv bootstrap
IMPALA-9626 broke the use case where the toolchain binaries are not
downloaded from the native-toolchain S3 bucket, because
SKIP_TOOLCHAIN_BOOTSTRAP is set to true.

Fix this use case by checking SKIP_TOOLCHAIN_BOOTSTRAP in
bin/bootstrap_environment.py:
- if true: just check if the specified version of the Python binary is
  present at the expected toolchain location. If it is there, use it,
  otherwise throw an exception and abort the bootstrap process.
- in any other case: proceed to download the Python binary as in
  bootstrap_toolchain.py.
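
The branching described above could look roughly like the sketch below.
The function name, its parameters, and the download callback are
hypothetical; this is not the code in the patch.

```python
import os


def find_toolchain_python(env, python_path, download):
    """Return the path of the toolchain Python binary.

    env: mapping of environment variables (e.g. os.environ).
    python_path: expected location of the binary in the toolchain.
    download: callable that fetches the binary (hypothetical).
    """
    if env.get("SKIP_TOOLCHAIN_BOOTSTRAP", "false") == "true":
        # Custom toolchain: never download, only verify the binary exists.
        if not os.path.isfile(python_path):
            raise Exception(
                "Python binary not found at {0}".format(python_path))
        return python_path
    # Any other value: download the binary, then use it.
    download()
    return python_path
```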

Test:
- simulate the custom toolchain setup by downloading the toolchain
  binaries from the S3 bucket, copying them to a separate directory,
  symlinking them into Impala/toolchain, then executing buildall.sh
  with SKIP_TOOLCHAIN_BOOTSTRAP set to "true".

Change-Id: Ic51b3c327b3cebc08edff90de931d07e35e0c319
Reviewed-on: http://gerrit.cloudera.org:8080/15759
Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-04-22 21:56:01 +00:00
David Knupp
c26e3db4bd IMPALA-9362: Upgrade sqlparse 0.1.19 -> 0.3.1
Upgrades the impala-shell's bundled version of sqlparse to 0.3.1.
There were some API changes in 0.2.0+ that required a re-write of
the StripLeadingCommentFilter in impala_shell.py. A slight perf
optimization was also added to avoid using the filter altogether
if no leading comment is readily discernible.
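
One way to picture the fast path is a cheap prefix check that runs before
any parsing. The function below is a hypothetical sketch, not the check
used in impala_shell.py, and it assumes SQL comments start with "--" or
"/*".

```python
def may_have_leading_comment(query):
    """Cheap test for whether a query might start with a comment.

    Only when this returns True would the (more expensive) comment
    stripping filter need to run.
    """
    stripped = query.lstrip()
    return stripped.startswith("--") or stripped.startswith("/*")
```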

As 0.1.19 was the last version of sqlparse to support python 2.6,
this patch also breaks Impala's compatibility with python 2.6.

No new tests were added, but all existing tests passed without
modification.

Change-Id: I77a1fd5ae311634a18ee04b8c389d8a3f3a6e001
Reviewed-on: http://gerrit.cloudera.org:8080/15642
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-04-17 05:04:23 +00:00
Laszlo Gaal
c97191b6a5 IMPALA-9626: Use Python from the toolchain for Impala
Historically Impala used the Python2 version that was available on
the hosting platform, as long as that version was at least v2.6.
This caused a constant headache, as all Python syntax had to be kept
compatible with Python 2.6 (for Centos 6). It also caused a recent problem
on Centos 8: here the system Python version was compiled with the
system's GCC version (v8.3), which was much more recent than the Impala
standard compiler version (GCC 4.9.2). When the Impala virtualenv was
built, the system Python version supplied C compiler switches for modules
containing native code that were unknown to the Impala version of GCC,
thus breaking virtualenv installation.

This patch changes the Impala virtualenv to always use the Python2
version from the toolchain, which is built with the toolchain compiler.

This ensures that
- Impala always has a known Python 2.7 version for all its scripts,
- virtualenv modules based on native code will always be installable, as
  the Python environment and the modules are built with the same compiler
  version.

Additional changes:
- Add an auto-use fixture to conftest.py to check that the tests are
  being run with Python 2.7.x
- Make bootstrap_toolchain.py independent from the Impala virtualenv:
  remove the dependency on the "sh" library

Tests:
- Passed core-mode tests on CentOS 7.4
- Passed core-mode tests in Docker-based mode for centos:7
  and ubuntu:16.04

Most content in this patch was developed but not published earlier
by Tim Armstrong.

Change-Id: Ic7b40cef89cfb3b467b61b2d54a94e708642882b
Reviewed-on: http://gerrit.cloudera.org:8080/15624
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-04-16 01:08:00 +00:00
David Knupp
5c541512f0 IMPALA-9582: Upgrade thrift_sasl to 0.4.2 for impala-shell
Change-Id: Iff739ebeaf5b022a7418883b638b5c5d17885f3b
Reviewed-on: http://gerrit.cloudera.org:8080/15610
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-04-01 04:22:38 +00:00
David Knupp
ed70492580 IMPALA-3343: Part 3 - Fix py2->3 changes re: libs, built-ins, imports
A few built-ins were changed in python 3 -- e.g., xrange became range,
ConfigParser became configparser, etc. We can redefine some of those
things in a single place, and import them from there as needed. Other
items may also be added as we go along.
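
A minimal sketch of what such a single compatibility module might
contain, covering only the two examples named above. The actual contents
and layout of the module in this patch may differ.

```python
# Hypothetical compatibility module: define version-dependent names once,
# then import them from here wherever they are needed.

try:
    xrange = xrange  # Python 2: the built-in already exists
except NameError:
    xrange = range  # Python 3: range is already lazy

try:
    import ConfigParser as configparser  # Python 2 module name
except ImportError:
    import configparser  # Python 3 module name
```

Callers then use xrange and configparser without caring which
interpreter is running.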

Change-Id: Ibd3d86df524666a98cbfa463756adac48bd1f8a3
Reviewed-on: http://gerrit.cloudera.org:8080/15514
Reviewed-by: David Knupp <dknupp@cloudera.com>
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-03-21 19:52:07 +00:00
David Knupp
df875dc05b IMPALA-9424: Add six to shell/ext-py
Change-Id: I003e0008c138ee1f2c290775553d4cfc66e9b7fe
Reviewed-on: http://gerrit.cloudera.org:8080/15293
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-02-28 01:37:38 +00:00
Lars Volker
74c7b7e55f IMPALA-8863: Add support to run tests over HTTP/HS2
This change adds support to run backend tests over HTTP using a new
version of Impyla (0.16.1). It also adds a test that exercises
authentication over HTTP.

Change-Id: I7156558071781378fcb9c8941c0f4dd82eb0d018
Reviewed-on: http://gerrit.cloudera.org:8080/14059
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-11-26 22:46:40 +00:00
Tim Armstrong
9d8b846825 IMPALA-9171: Revert "IMPALA-9098: bump Impyla to 0.16.1"
This reverts commit 7b135949d9.

Change-Id: I1469f6ca5e6cff19fb999a17d627741c64b14899
Reviewed-on: http://gerrit.cloudera.org:8080/14748
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Tim Armstrong <tarmstrong@cloudera.com>
2019-11-20 16:43:06 +00:00
Tim Armstrong
7b135949d9 IMPALA-9098: bump Impyla to 0.16.1
This includes a bugfix for fetch requests returning 0 rows,
which was resulting in Impyla truncating results for
some tests.

Testing:
Ran exhaustive tests.

Change-Id: Iee3afdcc2f25f0e094c6d2531a83da79045d01be
Reviewed-on: http://gerrit.cloudera.org:8080/14733
Reviewed-by: David Knupp <dknupp@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-11-18 23:27:01 +00:00
Thomas Tauber-Marshall
16e0d550c2 IMPALA-4057, IMPALA-4050: Fix --webserver_interface
This patch fixes two issues with --webserver_interface and adds
hostname support:
- When start-impala-cluster.py was run with a --webserver_interface
  value different from --hostname, minicluster startup would
  appear to fail, because liveness is determined by checking for the
  webui's availability at the address specified for --hostname.
- The value of --webserver_interface was applied correctly for the
  catalogd and statestored but not for impalads, due to the way
  ExecEnv constructed the Webserver.
- It is now possible to specify a hostname for webserver_interface
  instead of an IP. The webserver will resolve the hostname.

This patch also upgrades our version of psutil to the latest for the
function 'net_if_addrs'. This requires a few changes to our use of
psutil, mostly adding '()' to call functions that previously were
variables.

Testing:
- Added a custom cluster test that finds all available interfaces,
  binds the webserver to one of them, and checks that it is only
  available over that interface.

Change-Id: Ic7e75908426756d73f13a0fa3cfc21fc31da164c
Reviewed-on: http://gerrit.cloudera.org:8080/14266
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-09-20 20:43:26 +00:00
Tim Armstrong
f1f3ae9ec2 IMPALA-7290: part 2: Add HS2 support to Impala shell
HS2 is added as an option via --protocol=hs2. The user-visible
differences in behaviour are minimal. Beeswax is still the
default and can be explicitly enabled via --protocol=beeswax
but will be deprecated. The default is unchanged because
changing the default could break certain workflows, e.g.
those that explicitly specify the port with -i or deployments
that hit --fe_service_threads for HS2 and somehow rely on
impala-shell not contributing to that limit. For most
workflows the change is transparent and we should change
the default in a major version change.

This support requires Impala-specific extensions to
the HS2 interface, similar to the existing extensions
to Beeswax. Thus the HS2 shell is only
forwards-compatible with newer Impala versions.
I considered trying to gracefully degrade when the
new extensions weren't present, but it didn't seem to be
worth the ongoing testing effort.

Differences between HS2 and Beeswax are abstracted into
ImpalaClient subclasses.
Here are the changes required to make it work:
* Switch to TBinaryProtocolAccelerated to avoid perf
  regression. The HS2 protocol requires decoding
  more primitive values (because it's not a string-per-row),
  which was slow with the pure python implementation of
  TBinaryProtocol.
* Added bitarray module to efficiently unpack null indicators
* Minimise invasiveness of changes by transposing and stringifying
  the columnar results into rows in impala_client.py. The transposition
  needs to happen before display anyway.
* Add PingImpalaHS2Service() to get back version string and webserver
  address.
* Add CloseImpalaOperation() extension to return DML row counts. This
  possibly addresses IMPALA-1789, although we need to confirm that
  this is a sufficient solution.
* Add is_closed member to query handles to avoid shell independently
  tracking whether the query handle was closed or not.
* Include query status in HS2 log to match beeswax.
* HS2 GetLog() command now includes query status error message for
  consistency with beeswax.
* "set"/"set all" uses the client requests options, not the session
  default. This captures the effective value of TIMEZONE, which
  was previously missing. This also requires test changes where
  the tests set non-default values, e.g. for ABORT_ON_ERROR.
* "set all" on the server side returns REMOVED query options - the
  shell needs to know these so it can correctly ignore them.
* Clean up self.orig_cmd/self.last_leading comment argument
  passing to avoid implicit parameter passing through multiple
  function calls.
* Clean up argument handling in shell tests to consistently pass
  around lists of arguments instead of strings that are subject
  to shell tokenisation rules.
* Consistently close connections in the shell to avoid leaking
  HS2 sessions. This is enforced by making ImpalaShell a context
  manager and also eliminating all sys.exit() calls that would
  bypass the explicit connection closing.
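
The transposition and null handling from the list above can be sketched
in plain Python. This is an illustration only: the function names are
hypothetical, it uses bit arithmetic instead of the bitarray module, and
it assumes each column arrives as a (values, nulls) pair where bit i of
the nulls bitmap (least significant bit first within each byte) marks
row i as NULL.

```python
def is_null(nulls, i):
    """Return True if bit i of the null bitmap is set."""
    byte_index, bit_index = divmod(i, 8)
    if byte_index >= len(nulls):
        return False  # bitmap shorter than the column: no more NULLs
    return bool(nulls[byte_index] & (1 << bit_index))


def columns_to_rows(columns, null_str="NULL"):
    """Transpose columnar results into rows of strings.

    columns: list of (values, nulls) pairs; nulls is a bytearray.
    """
    num_rows = len(columns[0][0]) if columns else 0
    rows = []
    for i in range(num_rows):
        rows.append([null_str if is_null(nulls, i) else str(values[i])
                     for values, nulls in columns])
    return rows
```

For example, two columns of three values each, with NULLs in the last
row of the first column and the middle row of the second, come back as
three rows of two strings.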

Testing:
* Shell tests can run with both protocols
* Add tests for formatting of all types and NULL values
* Added testing for floating point output formatting, which does
  change as a result of switching to server-side vs client-side
  formatting.
* Verified that newly-added tests were actually going through HS2
  by disabling hs2 on the minicluster and running tests.
* Add checks to test_verify_metrics.py to ensure that no sessions
  are left open at the end of tests.

Performance:
Baseline from beeswax shell for large extract is as follows:

  $ time impala-shell.sh -B -q 'select * from tpch_parquet.orders' > /dev/null
  real    0m6.708s
  user    0m5.132s
  sys     0m0.204s

After this change it is somewhat slower, but we generally don't consider
bulk extract performance through the shell to be perf-critical:
  real    0m7.625s
  user    0m6.436s
  sys     0m0.256s

Change-Id: I6d5cc83d545aacc659523f29b1d6feed672e2a12
Reviewed-on: http://gerrit.cloudera.org:8080/12884
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-06-20 10:23:28 +00:00
Joe McDonnell
6b09612e76 IMPALA-8344: Add support for running the minicluster with S3Guard
Some tests can fail on S3 because some operations are only eventually
consistent. S3Guard stores extra metadata in a DynamoDB table to solve
several consistency issues.

This adds support for running the minicluster on S3 with S3Guard.
S3Guard is configured by the following environment variables:
S3GUARD_ENABLED: defaults to false, set to true to enable S3Guard
S3GUARD_DYNAMODB_TABLE: name of the DynamoDB table to use. This must
  be exclusively owned by this minicluster. The dataload scripts
  initialize this table and will purge entries if the table already
  exists. The table should be in the same region as the S3_BUCKET
  for the minicluster.
S3GUARD_DYNAMODB_REGION: AWS region for S3GUARD_DYNAMODB_TABLE
These environment variables only impact S3 configurations.
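
A small sketch of how a script could read these settings. The function
name and the returned dictionary are hypothetical; only the variable
names and the "false" default come from the description above.

```python
def s3guard_config(env):
    """Collect S3Guard settings from a mapping of environment variables."""
    if env.get("S3GUARD_ENABLED", "false") != "true":
        return {"enabled": False}
    # Table and region are required once S3Guard is enabled;
    # a missing variable raises KeyError.
    return {
        "enabled": True,
        "table": env["S3GUARD_DYNAMODB_TABLE"],
        "region": env["S3GUARD_DYNAMODB_REGION"],
    }
```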

The support comes from three pieces:
1. Configuration changes in core-site.xml to add the appropriate
   parameters.
2. Updating dataload to initialize/purge the s3guard dynamodb table
   and import data appropriately.
3. Update tests to manipulate files through the HDFS command line
   rather than through s3 utilities. This takes the filesystem
   utility code for ABFS (which actually calls HDFS command line),
   makes it generic, and uses it for S3Guard.

Testing:
 - Ran multiple rounds of s3 tests
 - Aborted tests in the middle and restarted the s3 tests (to test
   the s3guard reinitialization code)

Change-Id: I3c748529a494bb6e70fec96dc031523ff79bf61d
Reviewed-on: http://gerrit.cloudera.org:8080/13020
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Sahil Takiar <stakiar@cloudera.com>
2019-05-23 18:25:46 +00:00
Akshesh
e4388740f7 Make infra/python compatible with both Python 2 & 3
Change-Id: If4285a021bb581f88425daa52ef8a3f844017d82
Reviewed-on: http://gerrit.cloudera.org:8080/13070
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Akshesh Doshi <aksheshdoshi@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-04-25 20:43:45 +00:00
Lars Volker
25c3dfb774 IMPALA-8158: Retrieve thrift profiles through Impyla 0.15.0
This change updates Impyla to 0.15.0 and then uses Impyla to retrieve
thrift profiles through the HS2 api.

Unfortunately, some of the current usages of get_thrift_profile rely on
the Beeswax query states and the ImpylaHS2Connection does not have the
required functionality yet. We will have to update these in a future
change, once we have unified the query states.

This change also adds a self-contained test for IMPALA-2063

Change-Id: I769a99f0843297dd2b20f2f5b1a9046c97bb131e
Reviewed-on: http://gerrit.cloudera.org:8080/12530
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-04-18 03:11:33 +00:00
Fredy Wijaya
ed9e1d499b IMPALA-8340: Rewrite fixes in IMPALA-8317 and IMPALA-8337
Fixes in IMPALA-8317 and IMPALA-8337 introduced third-party dependencies
into the Impala shell, which is problematic in multi-Python environments.
This patch rewrites the fixes to handle duplicate options with an
alternative approach that has no third-party dependencies. For example:

[impala]
keyval=msg1=hello,keyval=msg2=world

Testing:
- Ran all shell tests on Python 2.6 and 2.7.
- Ran make_shell_tarball.sh and ran Impala shell from the tarball
  without any issue.

Change-Id: Ifc0bf391ba26cf5a34f622a4157d7287453cc539
Reviewed-on: http://gerrit.cloudera.org:8080/12844
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-03-26 04:03:46 +00:00
Fredy Wijaya
eb95c912cb IMPALA-8337: Fix OrderedDict Python compatibility issue in Impala shell
IMPALA-8317 added a patch that uses OrderedDict, which is not available
on Python 2.6 or lower. The patch in IMPALA-8317 also requires a newer
version of configparser to handle duplicate keys, which is not available
in Python 2.6. This patch fixes the issue by using the OrderedDict from
the ordereddict third-party library when running on Python 2.6 or lower
and using configparser from the backport.
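
The OrderedDict fallback can be sketched as a guarded import. This is an
illustration of the approach, not the exact code in the patch, and it
assumes the third-party package is importable as ordereddict.

```python
try:
    from collections import OrderedDict  # available on Python 2.7+
except ImportError:
    # Python 2.6 or lower: fall back to the third-party backport.
    from ordereddict import OrderedDict
```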

Testing:
- Ran all E2E shell tests on Python 2.6 and 2.7

Change-Id: Iab1a33542319e9bb75d806bfd38b158995b54aa9
Reviewed-on: http://gerrit.cloudera.org:8080/12830
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-03-23 04:14:29 +00:00