Commit Graph

12034 Commits

Author SHA1 Message Date
Joe McDonnell
cbb35ebccd IMPALA-13326: Prefer python3 for tarball packaged impala-shell
The tarball packaging for impala-shell ships support for
multiple Python versions (including both Python 2 and Python 3).
The impala-shell script determines which Python to use
and uses the corresponding installation. Historically, impala-shell
has preferred the "python" executable (which can be Python 2) to
the "python3" executable. Since Python 2 is deprecated, this flips
the preference to prefer "python3" to "python".

This continues to respect IMPALA_PYTHON_EXECUTABLE as before, but
it adds an IMPALA_SHELL_PYTHON_FALLBACK variable to determine
whether to fall back to the regular logic. This defaults to
true, allowing fallback, to maintain existing behavior. The
shell end-to-end tests set this to false to lock in the
Python version.
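
The selection order described above can be sketched roughly as follows.
This is only an illustration in Python, not the actual impala-shell
script: the function name choose_python and the injected `which` lookup
are hypothetical, and the exact meaning of the fallback (here: fall back
to the regular search when the requested executable is not found) is an
assumption.

```python
import shutil


def choose_python(env, which=shutil.which):
    """Return the interpreter name to use, or None if none is usable.

    Hypothetical sketch: `env` is a dict of environment variables and
    `which` is any callable that returns a path (or None) for a name.
    """
    explicit = env.get("IMPALA_PYTHON_EXECUTABLE")
    # Assumption: any value other than "true" disables the fallback.
    fallback = env.get("IMPALA_SHELL_PYTHON_FALLBACK", "true").lower() == "true"
    if explicit:
        if which(explicit):
            return explicit
        if not fallback:
            # The requested interpreter is missing and fallback is disabled.
            return None
    # Regular logic: "python3" is now preferred over "python".
    for candidate in ("python3", "python"):
        if which(candidate):
            return candidate
    return None
```

With both interpreters installed and no variables set, this sketch picks
"python3"; with fallback disabled and an unavailable
IMPALA_PYTHON_EXECUTABLE, it returns None instead of silently switching
versions.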

Testing:
 - Ran shell tests

Change-Id: If0e32e8eee672e4dc66e725722f5150cd1e4c9a6
Reviewed-on: http://gerrit.cloudera.org:8080/22953
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
2025-05-28 23:22:12 +00:00
Riza Suminto
4ab0ce139d IMPALA-12162: (addendum) Move test_parallel_checksum
test_parallel_checksum only needs to run over a single exec option
dimension and the text/none format. Leaving it in TestInsertQueries
exercises test_parallel_checksum over the 'compression_codec' query
option (in exhaustive builds). The CTAS fails when
compression_codec != none since the target table is in text format
and writing to a compressed text table is not supported.

This patch moves test_parallel_checksum under
TestInsertNonPartitionedTable, which has such a limited test dimension.
It also adds an assertion that the CTAS query is successful.

Change-Id: I2b2bc34ae48a2355ee1e6f6e9e42da9076adf96b
Reviewed-on: http://gerrit.cloudera.org:8080/22948
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-28 20:35:26 +00:00
Joe McDonnell
b90d636407 IMPALA-14104: Fix TestDecimalFuzz on Python 3
TestDecimalFuzz uses division to calculate the number of
iterations for certain tests. On Python 3, division produces
a float and range() will not take a float as an argument.
In theory, the "from __future__ import division" was supposed
to produce the same behavior on Python 2 and 3, but in practice,
the "from builtins import range" allows a float argument to
range() on Python 2 but not Python 3.

This fixes the issue by explicitly casting to an integer.
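
A minimal illustration of the problem and the fix on Python 3. The
helper names iteration_count, run_iterations and range_accepts are
hypothetical and are not taken from TestDecimalFuzz.

```python
def range_accepts(value):
    """Return True if range() accepts the value, False if it raises TypeError."""
    try:
        range(value)
        return True
    except TypeError:
        return False


def iteration_count(total, divisor):
    # On Python 3, "/" is true division and yields a float (10 / 4 == 2.5),
    # so the result is cast to an integer before it reaches range().
    return int(total / divisor)


def run_iterations(total, divisor):
    # Collect the loop indices to show how many iterations actually run.
    return [i for i in range(iteration_count(total, divisor))]
```

Here range_accepts(10 / 4) is False on Python 3, while
run_iterations(10, 4) runs two iterations because int(2.5) truncates
to 2.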

Testing:
 - Ran TestDecimalFuzz with Python 3

Change-Id: I4cd4daecde690bf41a4e412c02c23cbb6ae5a14c
Reviewed-on: http://gerrit.cloudera.org:8080/22955
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-28 15:55:52 +00:00
Joe McDonnell
71b47dfdb4 IMPALA-14103: Fix TestAdmissionControllerStress on Python 3
TestAdmissionControllerStress has an invalid except clause
where it catches Exception as well as ImpalaHiveServer2Service.
This is an error on Python 3, because ImpalaHiveServer2Service
is not an exception class. This changes the except clause to
only catch Exception.
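
A small sketch of this failure mode on Python 3. NotAnException is a
hypothetical stand-in for a non-exception class such as
ImpalaHiveServer2Service; the test's real code is not shown here.

```python
class NotAnException(object):
    """Hypothetical stand-in for a class that does not derive from BaseException."""


def invalid_handler():
    try:
        raise ValueError("boom")
    except (Exception, NotAnException):
        # Python 3 rejects this clause while matching the raised exception
        # and raises TypeError instead of running this handler.
        return "handled"


def fixed_handler():
    try:
        raise ValueError("boom")
    except Exception:
        # Only real exception classes are listed, so the handler runs.
        return "handled"
```

Note that the invalid clause only fails when an exception is actually
raised inside the try block, which is why such a bug can stay hidden
until the error path is exercised.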

Testing:
 - Ran TestAdmissionControllerStress locally

Change-Id: Iefe9306cd6b76bd27ca5be1d62b05aff1e5deafe
Reviewed-on: http://gerrit.cloudera.org:8080/22954
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2025-05-28 15:55:52 +00:00
stiga-huang
b37f4509fa IMPALA-14089: Support REFRESH on multiple partitions
Currently we just support REFRESH on the whole table or a specific
partition:
  REFRESH [db_name.]table_name [PARTITION (key_col1=val1 [, key_col2=val2...])]

If users want to refresh multiple partitions, they have to submit
multiple statements each for a single partition. This has some
drawbacks:
 - It requires holding the table write lock inside catalogd multiple
   times, which increases lock contention with other read/write
   operations on the same table, e.g. getPartialCatalogObject requests
   from coordinators.
 - The catalog version of the table will be increased multiple times.
   Coordinators in local catalog mode are more likely to see different
   versions between their getPartialCatalogObject requests, so they have
   to retry planning to resolve InconsistentMetadataFetchException.
 - Partitions are reloaded in sequence. They should be reloaded in
   parallel like we do when refreshing the whole table.

This patch extends the syntax to refresh multiple partitions in one
statement:
  REFRESH [db_name.]table_name
  [PARTITION (key_col1=val1 [, key_col2=val2...])
   [PARTITION (key_col1=val3 [, key_col2=val4...])...]]
Example:
  REFRESH foo PARTITION(p=0) PARTITION(p=1) PARTITION(p=2);
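
For client code that generates such statements, a helper along these
lines could assemble the extended syntax. build_refresh_statement is a
hypothetical name, not part of Impala or its test framework, and the
partition values are assumed to be passed in already formatted as SQL
literals.

```python
def build_refresh_statement(table, partitions=()):
    """Build a REFRESH statement with zero or more PARTITION clauses.

    Each partition is a dict mapping a partition key column to its value.
    Values are inserted as-is, so callers must supply valid SQL literals.
    """
    clauses = []
    for spec in partitions:
        pairs = ", ".join("{0}={1}".format(key, value) for key, value in spec.items())
        clauses.append("PARTITION ({0})".format(pairs))
    # One statement covers all partitions instead of one statement each.
    return " ".join(["REFRESH " + table] + clauses)
```

Calling it with no partitions yields a plain whole-table REFRESH, so the
same helper covers both the old and the extended form.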

TResetMetadataRequest is extended to have a list of partition specs for
this. If the list has only one item, we still use the existing logic of
reloading a specific partition. If the list has more than one item,
partitions will be reloaded in parallel. This is implemented in
CatalogServiceCatalog#reloadTable(). Previously it always invokes
HdfsTable#load() with partitionsToUpdate=null. Now the parameter is
set when TResetMetadataRequest has the partition list.

HMS notification events of RELOAD type will be fired for each partition
if enable_reload_events is turned on. Once HIVE-28967 is resolved, we
can fire a single event for multiple partitions.

Updated docs in impala_refresh.xml.

Tests:
 - Added FE and e2e tests

Change-Id: Ie5b0deeaf23129ed6e1ba2817f54291d7f63d04e
Reviewed-on: http://gerrit.cloudera.org:8080/22938
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-28 05:18:53 +00:00
Riza Suminto
063b90c433 IMPALA-14098: Fix test_pool_config_change_while_queued
test_pool_config_change_while_queued has been failing because it does
not find the
admission-controller.pool-max-query-mem-limit.root.invalidTestPool
metric reaching 0. This patch increases the mem_limit config in
ResourcePoolConfig.__wait_for_impala_to_pickup_config_change() from 10G
to 20G to ensure that the trigger query is always rejected and refreshes
the pool config.

Testing:
Looped the test 10 times in exhaustive mode; all runs passed.

Change-Id: If903840f81d54d58947fe596ecc0c86e6a234b60
Reviewed-on: http://gerrit.cloudera.org:8080/22946
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-27 19:35:32 +00:00
Xuebin Su
607bad042a IMPALA-3841: Enable late materialization for collections
This patch enables late materialization for collections to avoid the
cost of materializing collections that will never be accessed by the
query.

For a collection column, late materialization takes effect only when the
collection column is not used in any predicate, including the `!empty()`
predicate added by the planner. Otherwise we need to read every row to
evaluate the predicate and cannot skip any. Therefore, this patch skips
registering the `!empty()` predicates if the query contains zipping
unnests. This can affect performance if the table contains many empty
collections, but should be noticeable only in very extreme cases.

The late materialization threshold is set to 1 in HdfsParquetScanner
when there is any collection that can be skipped.

This patch also adds the detail of `HdfsScanner::parse_status_` to the
error message returned by the HdfsParquetScanner to help figure out the
root cause.

Performance:
- Tests with the queries involving collection columns in table
  `tpch_nested_parquet.customer` show that when the selectivity is low,
  the single-threaded (1 impalad and MT_DOP=1) scanning time can be
  reduced by about 50%, while when the selectivity is high, the scanning
  time almost does not change.
- For queries not involving collections, performance A/B testing
  shows no regression on TPC-H.

Testing:
- Added a runtime profile counter NumTopLevelValuesSkipped to record
  the total number of top-level values skipped for all columns. The
  counter only counts the values that are not skipped as a page.
- Added e2e test cases in test_parquet_late_materialization.py to ensure
  that late materialization works using the new counter.

Change-Id: Ia21bdfa6811408d66d74367e0a9520e20951105f
Reviewed-on: http://gerrit.cloudera.org:8080/22662
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-27 15:45:52 +00:00
Riza Suminto
4d9612f514 IMPALA-14097: Fix test_log_fragments.py
test_log_fragments.py has been broken since the HS2 switch in
IMPALA-13916. The test looks for a log line from
ImpalaServer::QueryToTQueryContext in impala-beeswax-server.cc:

  VLOG_QUERY << "query: " << ThriftDebugString(query);

This log line does not appear when using the HS2 protocol. The equivalent
log line in impala-hs2-server.cc is in ImpalaServer::ExecuteStatement:

  VLOG_QUERY << "ExecuteStatement(): request=" << ...

This patch also removes the redundant TExecuteStatementReq logging in
ImpalaServer::TExecuteStatementReqToTQueryContext. Either
ImpalaServer::ExecuteStatement or ImpalaServer::ExecutePlannedStatement
has already logged the same TExecuteStatementReq.

Testing:
- Ran and passed test_log_fragments.py by itself.

Change-Id: I93e1fb6c7ba50f47023ca0c382a884093187b847
Reviewed-on: http://gerrit.cloudera.org:8080/22947
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-27 10:02:59 +00:00
Joe McDonnell
f4e7551094 IMPALA-14087: Fix shell live_progress output display issue on Python 3
When running the shell in a terminal with live_progress=true, live
progress overwrites its output by using the ANSI up character to
rewrite lines with updates on the query progress. On Python 3,
we found that the updates to clear the live progress were overwriting
the actual output in the terminal, e.g.

+----------+
| count(*) |
+----------+
Fetched 1 row(s) in 5.20s

To avoid this, the live progress lines need to be fully flushed to stderr
before starting to output the result to stdout. This adds a flush call
in OverwritingStdErrOutputStream::clear() to force this.
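
The idea can be sketched as below. OverwritingStream is a simplified,
hypothetical stand-in for the shell's overwriting stderr stream, and the
escape sequences chosen here are an assumption; the point is only the
flush at the end of clear().

```python
import sys


class OverwritingStream(object):
    """Simplified, hypothetical sketch of a stream that overwrites its last output."""

    UP = "\x1b[A"           # ANSI "cursor up one line"
    CLEAR_LINE = "\x1b[2K"  # ANSI "erase entire line"

    def __init__(self, stream=None):
        self.stream = stream if stream is not None else sys.stderr
        self.last_line_count = 0

    def write(self, text):
        # Erase the previous progress lines, then print the new ones.
        self.clear()
        self.stream.write(text)
        self.last_line_count = text.count("\n")

    def clear(self):
        for _ in range(self.last_line_count):
            self.stream.write(self.UP + self.CLEAR_LINE)
        self.last_line_count = 0
        # Flush so the erase sequences reach the terminal before any
        # result rows are written to stdout.
        self.stream.flush()
```

Without the flush, buffered erase sequences could be emitted after the
result table has already been printed, which matches the symptom shown
above.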

Testing:
 - Hand tested queries with live progress
 - Added test that redirects stdout and stderr to the same file and
   verifies that no ANSI up character comes after the query output

Change-Id: Id2e21224253f76b2a04767a57b3ade49ce2c914f
Reviewed-on: http://gerrit.cloudera.org:8080/22941
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
2025-05-24 04:29:14 +00:00
Michael Smith
cee2d01f52 IMPALA-12162: Use thread pool to collect checksums
Refactors ParallelFileMetadataLoader to be usable for multiple types of
metadata. Uses it to collect checksums for new files in parallel.

Testing: adds test that multiple loading threads are used and checksum
does not take too long.

Change-Id: I314621104e4757620c0a90d41dd6875bf8855b51
Reviewed-on: http://gerrit.cloudera.org:8080/22872
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-23 15:46:42 +00:00
Michael Smith
ef174d3aa5 IMPALA-12162: Checksum files before lock in INSERT
Collect file metadata - file checksums and ACID directory path - before
acquiring the table lock. Table lock doesn't prevent files from being
deleted from the underlying filesystem, and these operations can take
time, blocking other operations that depend on the table lock.

Fires InsertEvents with partial data if there are errors collecting
checksum or acidDirPath on individual files to provide best-effort
information. Hive defaults to empty string for these values when not
specified.

IMPALA-10254 has been resolved, so removes the exception for
FeIcebergTable and associated TODO.

Change-Id: I18f9686f5d53cf1e7c384684c25427fb5353e2af
Reviewed-on: http://gerrit.cloudera.org:8080/22871
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-22 07:51:08 +00:00
Zoltan Borok-Nagy
5c415545ea IMPALA-14023: Fix test_scan_metrics_in_profile in non-HDFS builds
test_scan_metrics_in_profile was querying pre-written Iceberg V2
tables. The position delete files of such tables contain hard-coded
URIs of data files, i.e. URIs that start with "hdfs://localhost...".
Therefore the test only worked well in HDFS builds.

This patch splits the test into two parts:

* test_scan_metrics_in_profile_basic: it works on all storage systems
  as it only works on Iceberg tables that don't have delete files.
* test_scan_metrics_in_profile_with_deletes: uses Iceberg tables
  that have delete files, therefore it is only executed on HDFS.

Change-Id: I80a7c6469a7f56b58254e1327a05ef7b3dc9c9ff
Reviewed-on: http://gerrit.cloudera.org:8080/22931
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-22 05:44:29 +00:00
Surya Hebbar
e419964250 IMPALA-13615: Support row grouping of instances based on fragment names
In the "Fragment Instances" page of a query, even though it is possible
to sort the rows based on the fragment's name, it is difficult to
distinguish between fragments and their instances.

With row grouping based on the fragment's name, it becomes easier to
distinguish one fragment's instances from another's.

The lexicographical sorting of instances can still be done based on
different columns, which splits the fragment's group and orders the rows
lexicographically based only on the column's values.

Row grouping has been implemented using the "RowGroup" extension
for datatables - https://datatables.net/extensions/rowgroup/.

The datatable library and its extensions have been added under the
directory - "www/datatables".

The datatable library's license has been updated according to
version 1.13.2; it had not been updated previously.

The related row grouping extension's license has also been included.

Change-Id: If2b7ed6e2a6d605553242a7db4dbeaa7fcae4606
Reviewed-on: http://gerrit.cloudera.org:8080/22226
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-22 05:42:18 +00:00
Daniel Becker
067b25e526 IMPALA-14067: Bump glog version to 0.6.0 in Impala
Some minor changes were needed on the Impala side because of changes in
glog (for example some variables and function parameters were changed
from signed to unsigned integer types).

Testing:
 - passed exhaustive DEBUG tests
 - core ASAN tests

Change-Id: Ifbe341265fd7aa7be8fe304b9fda31b4470237cf
Reviewed-on: http://gerrit.cloudera.org:8080/22906
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-22 04:20:40 +00:00
Riza Suminto
125df65322 IMPALA-14042: (Addendum) limit test_rename_drop in exhaustive mode
test_rename_drop continues to be flaky in non-HDFS environments. HMS
seems to be slower to respond in non-HDFS environments. There is also a
possibility of table lock contention with the background TableLoader and
CatalogDelta threads that causes the DROP query to happen after RENAME
instead of before.

This patch limits test_rename_drop to HDFS and exhaustive mode in
the meantime.

Change-Id: Ie55e6d6093367c454cf3e31ed8a409b6e091193d
Reviewed-on: http://gerrit.cloudera.org:8080/22933
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2025-05-21 22:22:19 +00:00
Riza Suminto
a0b3ae4e02 IMPALA-11396: Deflake test_low_mem_limit_orderby_all
test_low_mem_limit_orderby_all is flaky when test_mem_limit is 100
or 120 in the test vector. The minimum mem_limit to run this test is
120MB + 30MB = 150MB. Thus, these test vectors expect one of
MEM_LIMIT_ERROR_MSGS to be thrown because mem_limit (test_mem_limit)
is not enough.

Parquet scan under this low mem_limit sometimes throws a "Couldn't skip
rows in column" error instead. This possibly indicates that memory
exhaustion happens while reading the parquet page index or during late
materialization (see IMPALA-5843, IMPALA-9873, IMPALA-11134). This patch
attempts to deflake the test by adding "Couldn't skip rows in column" to
MEM_LIMIT_ERROR_MSGS.

Change-Id: I43a953bc19b40256e3a8fe473b1498bbe477c54d
Reviewed-on: http://gerrit.cloudera.org:8080/22932
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-21 22:01:11 +00:00
Surya Hebbar
6656016069 IMPALA-14032: Fix broken query timeline after webUI refactor in IMPALA-13389
Updated the identifiers in the following query timeline script according
to the new declarations.

Change-Id: I49a3e5405588edd07836605bff2efc00b9fa3ee9
Reviewed-on: http://gerrit.cloudera.org:8080/22857
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
2025-05-21 15:14:28 +00:00
Joe McDonnell
ea0969a772 IMPALA-11980 (part 2): Fix absolute import issues for impala_shell
Python 3 changed the behavior of imports with PEP 328. Existing
imports become absolute unless they use the new relative import
syntax. This adapts the impala-shell code to use absolute
imports, fixing issues where it is imported from our test code.

There are several parts to this:
1. It moves impala shell code into shell/impala_shell.
   This matches the directory structure of the PyPI package.
2. It changes the imports in the shell code to be
   absolute paths (i.e. impala_shell.foo rather than foo).
   This fixes issues with Python 3 absolute imports.
   It also eliminates the need for ugly hacks in the PyPI
   package's __init__.py.
3. This changes Thrift generation to put it directly in
   $IMPALA_HOME/shell rather than $IMPALA_HOME/shell/gen-py.
   This means that the generated Thrift code is rooted in
   the same directory as the shell code.
4. This changes the PYTHONPATH to include $IMPALA_HOME/shell
   and not $IMPALA_HOME/shell/gen-py. This means that the
   test code is using the same import paths as the PyPI
   package.

With all of these changes, the source code is very close
to the directory structure of the PyPI package. As long as
CMake has generated the thrift files and the Python version
file, only a few differences remain. This removes those
differences by moving the setup.py / MANIFEST.in and other
files from the packaging directory to the top-level
shell/ directory. This means that one can pip install
directly from the source code. i.e. pip install $IMPALA_HOME/shell

This also moves the shell tarball generation script to the
packaging directory and changes bin/impala-shell.sh to use
Python 3.

This sorts the imports using isort for the affected Python files.

Testing:
 - Ran a regular core job with Python 2
 - Ran a core job with Python 3 and verified that the absolute
   import issues are gone.

Change-Id: Ica75a24fa6bcb78999b9b6f4f4356951b81c3124
Reviewed-on: http://gerrit.cloudera.org:8080/22330
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-05-21 15:14:11 +00:00
Joe McDonnell
df8cb46f75 IMPALA-14038: Pull in KUDU-3663 to handle certs with RSASSA-PSS
The existing KRPC code to determine the hash algorithm for a
certificate does not handle RSASSA-PSS signatures as the hash
algorithm is configurable for RSASSA-PSS. This was addressed
in Kudu with KUDU-3663. That fix uses OpenSSL 1.1.1's
X509_get_signature_info() function, which is able to determine
the hash algorithm even for RSASSA-PSS. This is similar to the
fix that Postgres did in a similar situation. It does not support
RSASSA-PSS on OpenSSL 1.0.2, but it improves the error message
in that case.

Testing:
 - Kudu added a unit test that passes

Change-Id: I1df2ce4cac2ed13ea0668ffeaadff10dc83a3d38
Reviewed-on: http://gerrit.cloudera.org:8080/22923
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-21 02:48:42 +00:00
Riza Suminto
f28a32fbc3 IMPALA-13916: Change BaseTestSuite.default_test_protocol to HS2
This is the final patch to move all Impala e2e and custom cluster tests
to use the HS2 protocol by default. Only beeswax-specific tests still
test against the beeswax protocol by default. We can remove them once
Impala officially removes beeswax support.

HS2 error message formatting in impala-hs2-server.cc is adjusted a bit
to match with formatting in impala-beeswax-server.cc.

Move TestWebPageAndCloseSession from webserver/test_web_pages.py to
custom_cluster/test_web_pages.py to disable glog log buffering.

Testing:
- Passed exhaustive tests, except for some known and unrelated flaky
  tests.

Change-Id: I42e9ceccbba1e6853f37e68f106265d163ccae28
Reviewed-on: http://gerrit.cloudera.org:8080/22845
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Jason Fehr <jfehr@cloudera.com>
2025-05-20 14:32:10 +00:00
gaurav1086
929130b735 IMPALA-13813: OAuth/JWT Avoid key verification on every rpc call
This patch optimizes the OAuth/JWT flow by setting
cookies in order to avoid token verification in every
RPC call. The default cookie expiry time is 1 day.
This is only valid for hs2-http protocol.
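
The flow can be sketched as follows. This is a generic illustration of
"verify the token once, then trust a signed cookie", not Impala's
implementation: CookieAuthenticator, the cookie layout, the HMAC scheme
and the demo secret are all assumptions made for the example.

```python
import hashlib
import hmac
import time


class CookieAuthenticator(object):
    """Hypothetical sketch: verify a token once, then accept a signed cookie."""

    def __init__(self, verify_token, secret=b"demo-secret", max_age_s=24 * 60 * 60):
        self.verify_token = verify_token  # callable: token -> username, raises if invalid
        self.secret = secret
        self.max_age_s = max_age_s        # default cookie expiry: 1 day
        self.token_verifications = 0
        self.cookie_auth_successes = 0

    def _sign(self, payload):
        return hmac.new(self.secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()

    def _make_cookie(self, user, now):
        payload = "{0}|{1}".format(user, int(now))
        return payload + "|" + self._sign(payload)

    def _check_cookie(self, cookie, now):
        parts = cookie.rsplit("|", 2)
        if len(parts) != 3:
            return None
        user, issued, signature = parts
        if not hmac.compare_digest(signature, self._sign(user + "|" + issued)):
            return None
        if now - int(issued) > self.max_age_s:
            return None  # expired: the caller must present the token again
        return user

    def authenticate(self, token=None, cookie=None, now=None):
        """Return (username, cookie); falls back to the token if the cookie is unusable."""
        now = time.time() if now is None else now
        if cookie is not None:
            user = self._check_cookie(cookie, now)
            if user is not None:
                self.cookie_auth_successes += 1
                return user, cookie
        user = self.verify_token(token)
        self.token_verifications += 1
        return user, self._make_cookie(user, now)
```

For a session of N calls that keeps sending the returned cookie, this
sketch performs one token verification and N - 1 cookie authentications,
which is the same counting the modified tests check for.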

Testing: Modified existing custom cluster tests:
test_jwt_auth_valid and test_oauth_auth_valid:
- total jwt token verification success count = 1:
  Reason: The jwt/oauth token is verified only the first time
  and then the cookie is set, so the token does not need to be
  re-verified for subsequent rpc queries.
- total cookie auth success = rpc count - 1:
  Reason: After the first verification, all subsequent
  authentication is cookie based.
- Benchmarking the query SELECT 1; executed 10,000
  times with OAuth authentication showed a total time
  of 2.16s with the cookie enabled vs. 2.38s
  without the cookie. This indicates a modest
  performance gain (~9%) when cookie support is
  enabled. The time command output in both scenarios
  is:

  With cookie enabled:
  - real 2.16
  - user 0.99
  - sys 0.21

  With cookie disabled:
  - real 2.38
  - user 1.12
  - sys 0.22

Change-Id: I0e3e5d9cf8bdb99920611b06571515e05e15164e
Reviewed-on: http://gerrit.cloudera.org:8080/22600
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-20 04:46:19 +00:00
stiga-huang
dd2d44492d IMPALA-14062: Adds missing timeline items in constructing PartitionDeltaUpdater
PartitionDeltaUpdater has two sub-classes, PartNameBasedDeltaUpdater and
PartBasedDeltaUpdater. They are used in reloading metadata of a table.
Their constructors invoke HMS RPCs which could be slow and should be
tracked in the catalog timeline.

This patch adds missing timeline items for those HMS RPCs.

Tests:
 - Added e2e tests

Change-Id: Id231c2b15869aac2dae3258817954abf119da802
Reviewed-on: http://gerrit.cloudera.org:8080/22917
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-20 02:28:54 +00:00
Joe McDonnell
c7f9469919 IMPALA-14031: Enable keepalive by default for client connections
Keepalive is useful for detecting dead client connections. In
particular, it has come up several times with load balancers that
drop idle connections. In IMPALA-13253, we added startup flags
that can turn on keepalive and tune the timeouts, but that change
did not enable keepalive by default.

This enables keepalive by default by setting
client_keepalive_probe_period_s to 10 minutes / 600 seconds. The
usual OS setting has a probe period of 2 hours / 7200 seconds.
This uses a more aggressive default, because a client connection
counts towards the fe_service_threads limit. Detecting dead clients
quickly frees up those threads.
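
At the socket level, the setting corresponds to something like the
sketch below. This is an illustration using Python's standard socket
module, not Impala's server code; enable_keepalive is a hypothetical
helper, and mapping client_keepalive_probe_period_s to the Linux
TCP_KEEPIDLE option is an assumption.

```python
import socket


def enable_keepalive(sock, probe_period_s=600):
    """Turn on TCP keepalive and, where supported, shorten the idle period.

    Hypothetical helper: 600 seconds mirrors the new default described
    above, versus the usual OS default of 7200 seconds.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE is platform specific (available on Linux), so guard it.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, probe_period_s)
    return sock
```

A dead peer is then detected after the idle period plus the OS probe
retries, instead of holding a service thread indefinitely.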

This has no impact on idle connections where the TCP connection is
still alive. That is handled through separate controls.

Testing:
 - There is already a custom cluster test for keepalive that
   continues to pass
 - Ran core tests

Change-Id: Ie358b5cabaff884516f7117d20d18c25124e59d5
Reviewed-on: http://gerrit.cloudera.org:8080/22861
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
2025-05-19 18:14:00 +00:00
Riza Suminto
44e9b6f97d IMPALA-14078: Reorganize test_ranger.py to share minicluster
test_ranger.py is a custom cluster test consisting of 41 test methods.
Each test method requires a minicluster restart. With IMPALA-13503, we can
reorganize the TestRanger class into 3 separate test classes:
TestRangerIndependent, TestRangerLegacyCatalog, and
TestRangerLocalCatalog. Both TestRangerLegacyCatalog and
TestRangerLocalCatalog can maintain the same minicluster without
restarting it in between.

Testing:
- Ran and passed test_ranger.py in exhaustive mode.
- Confirmed that no test is missing after reorganization.

Change-Id: I01ff2b3e98fccfffa8bcdfe1177be98634363b56
Reviewed-on: http://gerrit.cloudera.org:8080/22905
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-17 10:28:43 +00:00
Riza Suminto
16eaedf4e5 IMPALA-13850 (part 3): Fix TSAN issue at AcceptRequest
The nightly TSAN build revealed an issue in the part 2 patch. This patch
attempts to fix it by changing the boolean variable into an AtomicBoolean.

Testing:
Passed TSAN core tests.

Change-Id: I8dcd3c8e105d8dc6ac04096060dbf6e185651aa5
Reviewed-on: http://gerrit.cloudera.org:8080/22907
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-17 05:00:33 +00:00
Joe McDonnell
d3814f445f IMPALA-14077: Remove references to shaded packages from other projects
IDEs can mistakenly add imports that reference shaded versions of
classes that are available normally. For example, one might use
org.apache.curator.shaded.com.google.common.base.Preconditions
vs
com.google.common.base.Preconditions

Some of these usages have crept back into the codebase, so this
removes those obvious cases. Some commands to find these cases:
git grep import | grep com.google | grep -v "import com"
git grep import | grep shade
git grep import | grep relocate

Testing:
 - Ran a build

Change-Id: I7d09c757cb2a29a8e3187f05f3f71984fa897810
Reviewed-on: http://gerrit.cloudera.org:8080/22904
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-16 05:24:47 +00:00
Riza Suminto
3593a47a71 IMPALA-14060: Remove ImpalaConnection.get_default_configuration()
This patch removes ImpalaConnection.get_default_configuration() after
the refactoring done in IMPALA-14039.

Testing:
Ran and passed test_queries.py::TestQueries.

Change-Id: Idf2a3a5b7b427a46ddd288bb7fbb16ba2803735d
Reviewed-on: http://gerrit.cloudera.org:8080/22903
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-16 01:19:34 +00:00
Joe McDonnell
78d1c2cd3a IMPALA-14049: Fix TSAN issue with HdrHistogram in expr-test
IMPALA-13978 switched HdrHistogram from using a gscoped_ptr
to unique_ptr. This has been causing TSAN issues during the
teardown for expr-test. gscoped_ptr doesn't null out the
pointer when it gets destructed, but unique_ptr does. This
is a data race with the threads that are still running and
trying to access the metrics.

The full solution would be to have an orderly shutdown of
all the threads before destructing things. That is a large
project that would touch many different components. As a
short-term fix, this avoids the TSAN issue by leaking
the statestore metrics.

We should consider fixing IMPALA-9314 and implementing
orderly shutdown.

Testing:
 - Ran expr-test in a loop with TSAN and didn't see this
   particular issue. There are other shutdown issues with
   much lower frequency that have different symptoms.

Change-Id: I73c3f4db16c6ffa272f2512e9871db5743be7a54
Reviewed-on: http://gerrit.cloudera.org:8080/22900
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-15 21:39:30 +00:00
skatiyal
fae38aa77d IMPALA-13866: Add the timestamp in /jvm-threadz page
Enhanced the /jvm-threadz WebUI page to include a timestamp for every
jstack capture. This helps when comparing multiple jstacks captured via
the WebUI.

Change-Id: Ic0cb95028e175328c94aa2ad9df1f841efcde948
Reviewed-on: http://gerrit.cloudera.org:8080/22877
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-15 17:58:58 +00:00
Riza Suminto
6831076983 IMPALA-14072: Fix NPE in Catalogd during rename.
test_rename_drop fails with an NPE after IMPALA-14042. This is because
CatalogServiceCatalog.renameTable() returns null when it cannot find the
database of oldTableName. This patch changes renameTable() to return
Pair.create(null, null) for that scenario.

Refactor test_rename_drop slightly to ensure that invalidating the
renamed table and dropping it are successful.

Testing:
- Added a checkNotNull precondition in
  CatalogOpExecutor.alterTableOrViewRename()
- Increased the catalogd_table_rename_delay delay to 6s to ensure that
  the DROP query happens in Catalogd before renameTable() is called.
  Manually observed that no NPE is shown anymore.

Change-Id: I7a421a71cf3703290645e85180de8e9d5e86368a
Reviewed-on: http://gerrit.cloudera.org:8080/22899
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-15 09:42:31 +00:00
Riza Suminto
ed6c19cf0c IMPALA-14071: Refactor helper methods around cardinality bounding
There are multiple ways to do cardinality multiplication that also avoid
integer overflow. Some helper methods available are:
- MathUtil.saturatingMultiplyCardinalities()
- PlanNode.checkedMultiply()
- LongMath.saturatedMultiply()

This patch intends to simplify things by:
- Merging MathUtil.saturatingMultiplyCardinalities() with
  PlanNode.checkedMultiply() into MathUtil.multiplyCardinalities().
- Merging MathUtil.saturatingAddCardinalities() with
  PlanNode.checkedAdd() into MathUtil.addCardinalities().
- Moving PlanNode.smallestValidCardinality() to MathUtil.
- Making MathUtil.saturatingMultiply() and MathUtil.saturatingAdd() simple
  wrappers for LongMath.saturatedMultiply() and LongMath.saturatedAdd()
  respectively.

multiplyCardinalities(), addCardinalities(), and
smallestValidCardinality() each have a cardinality Preconditions check.
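
The saturating behaviour can be illustrated with the sketch below. It is
written in Python rather than Java and is not the MathUtil code: the
function names only mirror the ones above, the validity check is
simplified to "non-negative" (an assumption; how the real methods treat
special values such as an unknown cardinality is not described here),
and LONG_MAX stands in for Java's Long.MAX_VALUE.

```python
LONG_MAX = 2 ** 63 - 1  # Java's Long.MAX_VALUE


def _check_cardinality(value):
    # Assumption for this sketch: only non-negative cardinalities are valid.
    if value < 0:
        raise ValueError("invalid cardinality: {0}".format(value))


def multiply_cardinalities(a, b):
    """Multiply two cardinalities, clamping at LONG_MAX instead of overflowing."""
    _check_cardinality(a)
    _check_cardinality(b)
    # Python integers do not overflow, so the clamp emulates saturation.
    return min(LONG_MAX, a * b)


def add_cardinalities(a, b):
    """Add two cardinalities, clamping at LONG_MAX instead of overflowing."""
    _check_cardinality(a)
    _check_cardinality(b)
    return min(LONG_MAX, a + b)
```

Clamping keeps an overflowing estimate at the largest representable
value instead of wrapping around to a negative number, which would
otherwise look like an invalid cardinality downstream.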

Harden cardinality calculation in several places by using
multiplyCardinalities() and addCardinalities() accordingly. Added a sanity
check, PlanNode.verifyCardinality(), that is evaluated at the end of
PlanNode.computeStats(). This ensures that cardinality_ and
inputCardinality_ are always valid after PlanNode.computeStats().

Also fixed a bug in ExchangeNode.estimateTotalQueueByteSize() by preventing
calculation against a negative cardinality or a non-positive number of nodes.

Testing:
Passed the following FE and EE tests:
CardinalityTest
MathUtilTest
PlannerTest#testSpillableBufferSizing+testResourceRequirements
TpcdsCpuCostPlannerTest
TpcdsPlannerTest
metadata/test_explain.py

Change-Id: I505ab11cfa1024feb4ceac4cffe9c3283be228ce
Reviewed-on: http://gerrit.cloudera.org:8080/22897
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-15 04:00:52 +00:00
Riza Suminto
02fb5e1ccb IMPALA-13937: (Addendum) Replace diff with manual bash script
diff requires BASE_IMAGE to have diffutils preinstalled. However, not all
BASE_IMAGEs have it preinstalled. This patch replaces the diff invocation
with a manual bash script.

Testing:
Completed build with ubi8:latest that does not have diffutils
preinstalled.

Change-Id: I58e9ef7c344caffd198664e3f9683f54ce2c1914
Reviewed-on: http://gerrit.cloudera.org:8080/22898
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-14 21:23:36 +00:00
Riza Suminto
f18cfaf0db IMPALA-14028: Refactor cancel_query_and_validate_state with HS2
cancel_query_and_validate_state is a helper method used to test query
cancellation with concurrent fetch. It still uses the beeswax client by
default.

This patch changes the test method to use the HS2 protocol by default. The
changes include the following:
1. Set TGetOperationStatusResp.operationState to
   TOperationState::ERROR_STATE if returning abnormally.
2. Use separate MinimalHS2Client for
   (execute_async, fetch, get_runtime_profile) vs cancel vs close.
   Cancellation through KILL QUERY still instantiates a new
   ImpylaHS2Connection client.
3. Implement required missing methods in MinimalHS2Client.
4. Change MinimalHS2Client logging pattern to match with other clients.

Testing:
Passed test_cancellation.py and TestResultSpoolingCancellation in core
exploration mode. Also fixed default_test_protocol to HS2 for these tests.

Change-Id: I626a1a06eb3d5dc9737c7d4289720e1f52d2a984
Reviewed-on: http://gerrit.cloudera.org:8080/22853
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-05-14 20:20:14 +00:00
Riza Suminto
f7cf4f8446 IMPALA-14070: Use checkedMultiply in SortNode.java
The maxRowsInHeaps calculation may overflow because it uses simple
multiplication. This patch fixes the bug by calculating it using
checkedMultiply(). A broader refactoring will be done by IMPALA-14071.

Testing:
Added EE test TestTopNHighNdv that exercises the issue.

Change-Id: Ic6712b94f4704fd8016829b2538b1be22baaf2f7
Reviewed-on: http://gerrit.cloudera.org:8080/22896
Reviewed-by: Abhishek Rawat <arawat@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-14 04:11:48 +00:00
Surya Hebbar
7ad7a86c0e IMPALA-13624: Implement textual representation for aggregate event sequences
This adds support for a summarized textual representation of timestamps
for the event sequences present in the aggregated profile.

With the verbose format present in profile V1 and V2, it becomes
difficult to analyze an event's timestamps across instances.

The event sequences are now displayed in a histogram format, based on
the number of timestamps present, in order to support an easier view
for skew analysis and other possible use cases.
(i.e. based on json_profile_event_timestamp_limit)

The summary generated from aggregated instance-level timestamps
(i.e. IMPALA-13304) is used to achieve this within the profile V2,
which covers the possibility of missing events.

Example,
  Verbosity::DEFAULT
  json_profile_event_timestamp_limit = 5 (default)

  Case #1, Number of instances exceeded limit
    Node Lifecycle Event Timeline Summary :
     - Open Started (4s880ms):
        Min: 2s312ms, Avg: 3s427ms, Max: 4s880ms, Count: 12
        HistogramCount: 4, 4, 0, 0, 4

  Case #2, Number of instances within the limit

    Node Lifecycle Event Timeline:
     - Open Started: 5s885ms, 1s708ms, 3s434ms
     - Open Finished: 5s885ms, 1s708ms, 3s435ms
     - First Batch Requested: 5s885ms, 1s708ms, 3s435ms
     - First Batch Returned: 6s319ms, 2s123ms, 3s570ms
     - Last Batch Returned: 7s878ms, 2s123ms, 3s570ms

With Verbosity::EXTENDED or more, all events and timestamps are printed
with full verbosity as before.
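
The two cases above can be sketched roughly as follows. This is a
hypothetical Python illustration, not the code in the profile writer:
summarize_event() and its return shape are made up for the example, and the
use of equal-width buckets between the minimum and maximum is an
assumption.

```python
def summarize_event(timestamps_ns, limit=5):
    """Summarize one event's per-instance timestamps.

    Returns the raw list when it fits within `limit`, otherwise a dict with
    min/avg/max/count and a histogram of `limit` equal-width buckets.
    """
    if len(timestamps_ns) <= limit:
        return list(timestamps_ns)
    lo, hi = min(timestamps_ns), max(timestamps_ns)
    histogram = [0] * limit
    for ts in timestamps_ns:
        if hi == lo:
            idx = 0  # all timestamps are equal, so use the first bucket
        else:
            # Integer bucket index; the maximum is folded into the last bucket.
            idx = min((ts - lo) * limit // (hi - lo), limit - 1)
        histogram[idx] += 1
    return {
        "min": lo,
        "avg": sum(timestamps_ns) / len(timestamps_ns),
        "max": hi,
        "count": len(timestamps_ns),
        "histogram": histogram,
    }
```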

Tests:
For test_profile_tool.py, updated the generated outputs for text
and JSON profiles.

Change-Id: I4bcc0e2e7fccfa8a184cfa8a3a96d68bfe6035c0
Reviewed-on: http://gerrit.cloudera.org:8080/22245
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
2025-05-14 00:21:54 +00:00
Surya Hebbar
bc0de10966 IMPALA-14069: Factor possibility of zero timestamps in aggregated event sequences
Currently, the missing event timestamps are substituted by zeros
and then reported (i.e. unreported_event_instance_idxs) within
event sequences of the JSON profile. See IMPALA-13555 for more details.

Even with micro/nanosecond precision, some event timestamps are recorded
as zeros (i.e. Prepare Finished - 0ns).

The current implementation of aggregated event sequences was incorrectly
considering these zeros as substituted missing timestamps.

Although these can be distinguished from missing timestamps through
the exposed 'unreported_event_instance_idxs', it is more helpful
to represent missing values as negative values or constants (i.e. -1).

This representation is favorable for summary and visualization, and is
necessary for skipping missing values and maintaining alignment between
instance timestamps.
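
A minimal sketch of why a negative sentinel helps, assuming -1 marks a
missing timestamp. The MISSING constant and summarize_reported() are
hypothetical names for this example and are not taken from the patch.

```python
MISSING = -1  # sentinel assumed for this sketch; 0 remains a valid timestamp


def summarize_reported(timestamps_ns):
    """Return (min, max, reported_count) over the timestamps that were reported.

    Entries keep their per-instance positions in the input list; negative
    entries are skipped and None is returned if nothing was reported.
    """
    reported = [ts for ts in timestamps_ns if ts >= 0]
    if not reported:
        return None
    return min(reported), max(reported), len(reported)
```

With this convention, a genuine 0ns timestamp still counts toward the
minimum, while a missing instance is skipped without shifting the indexes
of the other instances.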

The patch also fixes null values in "info_strings" fields within
the JSON profile.

Fixed runtime-profile-test to consider negative values (i.e. -1) as missing
event timestamps, instead of 0.

Updated the generated profiles in testdata/impala-profiles.

Change-Id: I9f1efd2aad5f62084075cd8f9169ef72c66942b6
Reviewed-on: http://gerrit.cloudera.org:8080/22893
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-05-14 00:21:54 +00:00
jasonmfehr
0293a1bc08 IMPALA-12427: Documentation for Workload Management
This change adds documentation for the Workload Management feature.

Change-Id: I9c228dfaa3f6060add6e5bd8058551a4d362f460
Reviewed-on: http://gerrit.cloudera.org:8080/22706
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-13 07:14:19 +00:00
Riza Suminto
2040f66569 IMPALA-14042: Deflake TestConcurrentRename.test_rename_drop
TestConcurrentRename.test_rename_drop has been flaky because the
INVALIDATE query may arrive ahead of the ALTER TABLE RENAME query. This
patch deflakes it by replacing the sleep with an admission control wait and
a catalog version check. The first INVALIDATE query will only start after
the catalog version has increased since the CREATE TABLE query.
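
The catalog version check can be pictured as a polling wait instead of a
fixed sleep. The sketch below is a hypothetical Python helper, not the code
in the test: wait_for_version_above() and the get_version callable are
assumptions for illustration.

```python
import time


def wait_for_version_above(get_version, baseline, timeout_s=10.0, interval_s=0.05):
    """Poll get_version() until it returns a value greater than baseline.

    Returns the observed version, or raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        version = get_version()
        if version > baseline:
            return version
        if time.monotonic() >= deadline:
            raise TimeoutError("version stayed at or below %d" % baseline)
        time.sleep(interval_s)
```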

Testing:
Looped the test 50 times and it passed every time.

Change-Id: I2539d5755aae6d375400b9a1289a658d0e7ba888
Reviewed-on: http://gerrit.cloudera.org:8080/22876
Reviewed-by: Yida Wu <wydbaggio000@gmail.com>
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-05-12 23:10:40 +00:00
Noemi Pap-Takacs
8170ec124d IMPALA-11672: Update 'transient_lastDdlTime' for Iceberg tables
'transient_lastDdlTime' table property was not updated for Iceberg
tables before this change. Now it is updated after DDL operations
including DROP PARTITION as well.

Renaming an Iceberg table is an exception:
Iceberg does not keep track of the table name in the metadata files,
so there is no Iceberg transaction to change it.
The table name is a concept that exists only in the catalog.
If we rename the table, we only edit our catalog entry, but the metadata
stored on the file system - the table's state - does not change.
Therefore renaming an Iceberg table does not change the
'transient_lastDdlTime' table property because rename is a
catalog-level operation for Iceberg tables, and not table-level.

Testing:
 - added managed and non-managed Iceberg table DDL tests to
   test_last_ddl_update.py

Change-Id: I7e5f63b50bd37c80faf482c4baf4221be857c54b
Reviewed-on: http://gerrit.cloudera.org:8080/22831
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-12 18:35:07 +00:00
Surya Hebbar
7756e5bc32 IMPALA-13473: Add support for JS code analysis and linting with ESLint
This patch adds support for JS code analysis and linting to webUI
scripts using ESLint.

Support for enforcing code style and quality is particularly beneficial,
as the codebase for client-side scripts is consistently growing.

This has been implemented to work alongside other code style enforcement
rules present within 'critique-gerrit-review.py', which runs on the
existing jenkins job 'gerrit-auto-critic', to produce gerrit comments.

In the case of webUI scripts, ESLint's code analysis and linting checks
are performed to produce these comments.

As a shared NodeJS installation can be used for JS tests as well as
linting, a separate common script "bin/nodejs/setup_nodejs.sh"
has been added to assist with the NodeJS installation.

To ensure quicker run times for the jenkins job, NodeJS tarball is
cached within "${HOME}/.cache" directory, after the initial installation.

ESLint's packages and dependencies are cached locally using NPM's own
package management.

NodeJS and ESLint dependencies are retrieved and executed only if
the patchset changes any ".js" files, so the check runs with
minimal overhead.

After analysis, comments are generated for all the violations according
to the specified rules.

A custom formatter has been added to extract, format and filter the
violations in JSON form.

These generated code style violations are formatted into the required
JSON form according to gerrit's REST API, similar to comments generated
by flake8. These are then posted to gerrit as comments
on the respective patchset from jenkins over SSH.
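
The conversion step can be sketched roughly as follows. This is a Python
illustration only (the patch uses a custom ESLint formatter written in JS):
eslint_to_gerrit_comments() is a hypothetical name, the input is assumed to
follow the shape of ESLint's JSON results (filePath, messages with ruleId,
message and line), and the output is assumed to follow the "comments" map
of Gerrit's review input.

```python
import os


def eslint_to_gerrit_comments(eslint_results, repo_root):
    """Group ESLint messages by repo-relative path as Gerrit-style comments."""
    comments = {}
    for result in eslint_results:
        path = os.path.relpath(result["filePath"], repo_root)
        for msg in result.get("messages", []):
            comments.setdefault(path, []).append({
                "line": msg.get("line", 0),
                "message": "[%s] %s" % (msg.get("ruleId"), msg["message"]),
            })
    # Files without violations get no entry in the map.
    return {"comments": comments}
```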

The following code style and quality rules have been added using ESLint.
  - Disallow unused variables
  - Enforce strict equality (=== and !==)
  - Require curly braces for all control statements (if, while, etc.)
  - Enforce semicolons at the end of statements
  - Enforce double quotes for strings
  - Set maximum line length to 90
  - Disallow `var`, use `let` or `const`
  - Prefer `const` where possible
  - Disallow multiple empty lines
  - Enforce spacing around infix operators (eg. +, =)
  - Disallow the use of undeclared variables
  - Require parentheses around arrow function arguments
  - Require a space before blocks
  - Enforce consistent spacing inside braces
  - Disallow shadowing variables declared in the outer scope
  - Disallow constant conditions in if statements, loops, etc
  - Disallow unnecessary parentheses in expressions
  - Disallow duplicate arguments in function definitions
  - Disallow duplicate keys in object literals
  - Disallow unreachable code after return, throw, continue, etc
  - Disallow reassigning function parameters
  - Require functions to always consistently return or not return at all
  - Enforce consistent use of dot notation wherever possible
  - Enforce spacing around the colon in object literal properties
  - Disallow optional chaining, where undefined values are not allowed

The required linting packages have been added as dependencies in the
"www/scripts" directory.

All the test scripts and related dependencies have been moved to -
$IMPALA_HOME/tests/webui/js_tests.

All the custom ESLint formatter scripts and related dependencies
have been moved to -
$IMPALA_HOME/tests/webui/linting.

A combination of NodeJS's 'prefix' argument and the NODE_PATH environment
variable is used to separate the dependencies from the webUI scripts.
This supports running the tests from a remote directory (i.e. tests/webui)
by modifying the required base paths.

The JS scripts need to be updated according to these linting rules,
as per IMPALA-13986.

Change-Id: Ieb3d0a9221738e2ac6fefd60087eaeee4366e33f
Reviewed-on: http://gerrit.cloudera.org:8080/21970
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-11 01:07:14 +00:00
Riza Suminto
a3146ca722 IMPALA-13959: (addendum) Let test pass regardless of JDK version.
This patch modifies test_change_parquet_column_type to let it pass
regardless of the test JDK version. The assertion is changed from using
a string match to a regex.

Testing:
Ran and passed the test with both JDK8 and JDK17.

Change-Id: I5bd3eebe7b1e52712033dda488f0c19882207f9d
Reviewed-on: http://gerrit.cloudera.org:8080/22874
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-09 21:51:43 +00:00
Riza Suminto
be16a02fa8 IMPALA-13850 (part 2): Fix bug found by test_restart_services.py
This patch stabilizes test_restart_catalogd_with_local_catalog in
test_restart_services.py after the first part of IMPALA-13850 was merged.

IMPALA-13850 (part 1) makes local catalog mode send the statestore update
twice: the first announces its availability and service id, while
the second is the full topic update. There is a short window in which
CatalogD accepts getCatalogObject() requests before the very first
CatalogServiceCatalog.reset() is initiated and obtains the write lock. A
request that gets through in that window might see an empty catalog, which
results in query failures because the db/table does not exist.

This patch blocks CatalogServiceThriftIf.AcceptRequest() until
CatalogServiceCatalog.reset() is initiated. Catalog version 100 is used to
signal that the initial reset has begun. Later in part 3, when we implement
in-place metadata cache reset, AcceptRequest() can unblock faster when
reset() releases the write lock in between catalog cache initialization.
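
The blocking behavior can be pictured with a small gate object. The sketch
below is a hypothetical Python analogy, not the C++ and Java code in the
patch: ResetGate and its methods are made-up names, and it uses a
threading.Event where the patch signals through the catalog version.

```python
import threading


class ResetGate:
    """Blocks request handlers until the first catalog reset has begun."""

    def __init__(self):
        self._reset_started = threading.Event()

    def mark_reset_started(self):
        # Called once by the thread that initiates the first reset.
        self._reset_started.set()

    def accept_request(self, timeout_s=None):
        # Returns True once the reset has begun, False if the wait timed out.
        return self._reset_started.wait(timeout_s)
```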

Testing:
- Looped and passed test_restart_catalogd_with_local_catalog 100 times.

Change-Id: I97f6f692506de0bbf2e1445f83bed824dc8298fd
Reviewed-on: http://gerrit.cloudera.org:8080/22844
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-09 20:35:19 +00:00
Zoltan Borok-Nagy
afa329fd89 IMPALA-13931: TestIcebergRestCatalog.test_rest_catalog_basic failed at setup
There were several issues with test_rest_catalog_basic which made it
fail in environments that used Ozone or S3.

Missing dependency of Ozone and S3 classes:
* This is resolved in iceberg-rest-catalog-test/pom.xml by adding
  a dependency to impala-executor-deps

Hadoop configuration was not initialized properly:
* run-iceberg-rest-server.sh used Maven to run Iceberg REST Catalog in
  which case Maven is in charge of setting the CLASSPATH but the
  core-site/ozone-site/etc. config files were not on it, so the
  REST Catalog used a default Hadoop configuration that wasn't good
  for our environment.
* To overcome the CLASSPATH problem now we create a runnable JAR in
  iceberg-rest-catalog-test/pom.xml and also generate the proper
  CLASSPATH during compilation.
* run-iceberg-rest-server.sh now uses java -cp to run the REST
  Catalog

S3 builds threw NoSuchMethodException for the "create" method of
ApacheHttpClientConfigurations:
* The Iceberg library dynamically loads its http client builders
  to work around an error, see details in
  https://github.com/apache/iceberg/issues/6715
* So the Iceberg lib dynamically wants to load the "create" method
  of its own ApacheHttpClientConfigurations class but it fails
  with NoSuchMethodException.
* The critical code is invoked from Impala's IcebergMetadataScanner's
  ScanMetadataTable() method which happens to be invoked through
  JNI from the C++ backend.
* The context class loader of such threads is NULL, which means
  Java will use the bootstrap class loader to load classes and methods,
  but that doesn't have the proper resources on its classpath.
* To overcome this issue we set the context class loader for the thread
  to the class loader that originally loaded the IcebergMetadataScanner
  class.

Change-Id: I9dc0e30aeaff0b8de41426ba38506383b4af472c
Reviewed-on: http://gerrit.cloudera.org:8080/22818
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
2025-05-09 17:01:56 +00:00
Laszlo Gaal
cc703b3c3a IMPALA-13937: Use simpler chmod syntax to set +t on /var/tmp in Docker build
Some Docker base images contain basic Unix utilities implemented by
Busybox instead of the usual linux-coreutils package. The chmod command
in the Busybox implementation seems to ignore certain syntax variants:
the current invocation for setting the sticky bit (+t) on /var/tmp got
silently ignored, while chmod indicated success, returning 0 to the
calling script.

This patch changes the chmod call to a slightly simpler syntax, which was
tested to be understood by Busybox and coreutils both; and adds a simple
inline check to assert that the directories required by Kerberos
- exist
- and have the required ownership and permission structure.

The assertion fails the Docker build if setting up /tmp and /var/tmp in
a Kerberos-compatible way did not succeed.
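
The kind of assertion described above can be sketched as follows. This is a
Python illustration rather than the inline shell check in the Docker build:
is_sticky_world_writable_dir() is a hypothetical helper, the expected mode
1777 is an assumption based on the usual layout of /tmp and /var/tmp, and
the ownership check is omitted.

```python
import os
import stat


def is_sticky_world_writable_dir(path):
    """True if path is a directory with mode bits rwxrwxrwx plus the sticky bit."""
    st = os.stat(path)
    return stat.S_ISDIR(st.st_mode) and stat.S_IMODE(st.st_mode) == 0o1777
```

Checking the resulting mode, instead of trusting the exit code of chmod,
is what catches the case where the command reports success without
applying the change.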

Change-Id: I20c52dc70fb73337efcd6d12652bf99c3c473ff9
Reviewed-on: http://gerrit.cloudera.org:8080/22811
Reviewed-by: Peter Rozsa <prozsa@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-09 15:33:08 +00:00
Laszlo Gaal
b17f22048a IMPALA-14029: Add Kerberos utilities to Docker image build
The Kerberos utility package was missing from the OS package list of
the Docker container build when the base image was detected as being
a hardened Wolfi-based image. This prevented Impala coordinators from
renewing their Kerberos tickets in containerized and Kerberized
environments.

This patch adds the Kerberos utility package to the list of installed
packages for such minimal containers.

Change-Id: I84f295ac8ae4c000868abff0342b922beb141b5b
Reviewed-on: http://gerrit.cloudera.org:8080/22854
Reviewed-by: Norbert Luksa <norbert.luksa@cloudera.com>
Tested-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
2025-05-09 15:33:08 +00:00
Zoltan Borok-Nagy
04735598d6 IMPALA-13718: Skip reloading Iceberg tables when metadata JSON file is the same
With this patch Impala skips reloading Iceberg tables when metadata
JSON file is the same, as this means that the table is essentially
unchanged.

This can help in situations when the event processor is lagging behind
and we have an Iceberg table that is updated frequently. Imagine the
case when Impala gets 100 events for an Iceberg table. In this case
after processing the first event, our internal representation of
the Iceberg table is already up-to-date, there is no need to do the
reload 100 times.

We cannot use the internal icebergApiTable_'s metadata location,
as the following statement might silently refresh the metadata
in 'current()':

 icebergApiTable_.operations().current().metadataFileLocation()

To guarantee that we check against the actual loaded metadata
this patch introduces a new member to store the metadata location.
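
The idea can be sketched roughly as follows. This is a hypothetical Python
illustration, not the Java code in the patch: CachedIcebergTable, its
fields, and refresh() are made-up names, and a counter stands in for the
actual reload.

```python
class CachedIcebergTable:
    """Remembers the metadata location that was actually loaded."""

    def __init__(self):
        self.loaded_metadata_location = None
        self.reload_count = 0

    def refresh(self, current_metadata_location):
        """Reload only if the metadata file differs; return True if reloaded."""
        if current_metadata_location == self.loaded_metadata_location:
            return False
        self.reload_count += 1  # stands in for the expensive reload
        self.loaded_metadata_location = current_metadata_location
        return True
```

Under these assumptions, 100 events that all point at the same metadata
file trigger a single reload.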

Testing
 * added e2e tests for REFRESH, also for event processing

Change-Id: I16727000cb11d1c0591875a6542d428564dce664
Reviewed-on: http://gerrit.cloudera.org:8080/22432
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Noemi Pap-Takacs <npaptakacs@cloudera.com>
2025-05-09 11:37:01 +00:00
Peter Rozsa
ad19828df5 IMPALA-14040: Remove Kudu masters property from FeCatalog
This change removes getDefaultKuduMasterHosts from FeCatalog and makes
the Kudu masters lookup bound directly to the backend config.

Tests are adjusted for the new property lookup.

Change-Id: Idcf31a724bb7bd00268accf21c0b997d92e9b23a
Reviewed-on: http://gerrit.cloudera.org:8080/22863
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Daniel Becker <daniel.becker@cloudera.com>
2025-05-09 08:39:08 +00:00
Riza Suminto
f2acd2381f IMPALA-14039: __restore_query_options should unset query option
ImpalaTestSuite.__restore_query_options() attempts to restore the client's
configuration with what it understands as the "default" query options.

Since IMPALA-13930, ImpalaConnection.get_default_configuration() parses
the default query options from TQueryOption fields. Therefore, it might
not respect the server's defaults that come from the
--default_query_options flag.

ImpalaTestSuite.__restore_query_options() should simply unset any
configuration that was previously set, by running a SET query like this:

SET query_option="";

This patch also changes execute_query_using_vector() to simply unset
the client's configuration.

Follow up cleanup will be tracked through IMPALA-14060.

Testing:
Ran and passed test_queries.py::TestQueries.

Change-Id: I884986b9ecbcabf0b34a7346220e6ea4142ca923
Reviewed-on: http://gerrit.cloudera.org:8080/22862
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-09 00:48:58 +00:00
Daniel Becker
eb79fbea2b IMPALA-14033: Document the integration of Iceberg ScanMetrics in the query profile
This change documents the integration of Iceberg ScanMetrics into
Impala query profiles.

Change-Id: I49d27ecd0f37ffed58afb8abea04bf592d68f11c
Reviewed-on: http://gerrit.cloudera.org:8080/22859
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
2025-05-07 09:36:26 +00:00
Riza Suminto
3210ec58c5 IMPALA-14006: Bound max_instances in CreateInputCollocatedInstances
IMPALA-11604 (part 2) changes how many instances to create in
Scheduler::CreateInputCollocatedInstances. This works when the left
child fragment of a parent fragment is distributed across nodes.
However, if the left child fragment instance is limited to only 1
node (the case of UNPARTITIONED fragment), the scheduler might
over-parallelize the parent fragment by scheduling too many instances in
a single node.

This patch attempts to mitigate the issue in two ways. First, it adds
bounding logic in PlanFragment.traverseEffectiveParallelism() to lower
parallelism further if the left (probe) side of the child fragment is
not well distributed across nodes.

Second, it adds TQueryExecRequest.max_parallelism_per_node to relay
information from Analyzer.getMaxParallelismPerNode() to the scheduler.
With this information, the scheduler can do additional sanity checks to
prevent Scheduler::CreateInputCollocatedInstances from
over-parallelizing a fragment. Note that this sanity check can also cap
MAX_FS_WRITERS option under a similar scenario.
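
The scheduler-side sanity check can be pictured as a simple cap. The sketch
below is a hypothetical Python illustration and not the logic in
Scheduler::CreateInputCollocatedInstances: bound_instances() and its
parameters are assumptions made for the example.

```python
def bound_instances(requested_instances, num_nodes, max_parallelism_per_node):
    """Cap a fragment's instance count by nodes * per-node parallelism."""
    if num_nodes < 1 or max_parallelism_per_node < 1:
        raise ValueError("num_nodes and max_parallelism_per_node must be positive")
    return min(requested_instances, num_nodes * max_parallelism_per_node)
```

For example, a request for 48 instances on a single node with a per-node
limit of 8 would be capped at 8 under these assumptions.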

Added ScalingVerdict enum and TRACE log it to show the scaling decision
steps.

Testing:
- Add planner test and e2e test that exercise the corner case under
  COMPUTE_PROCESSING_COST=1 option.
- Manually commented out the bounding logic in
  traverseEffectiveParallelism() and confirmed that the scheduler's sanity
  check still enforces the bounding.

Change-Id: I65223b820c9fd6e4267d57297b1466d4e56829b3
Reviewed-on: http://gerrit.cloudera.org:8080/22840
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-07 03:34:15 +00:00