3568 Commits

stiga-huang
dd2d44492d IMPALA-14062: Adds missing timeline items in constructing PartitionDeltaUpdater
PartitionDeltaUpdater has two sub-classes, PartNameBasedDeltaUpdater and
PartBasedDeltaUpdater. They are used when reloading the metadata of a
table. Their constructors invoke HMS RPCs, which can be slow and should
therefore be tracked in the catalog timeline.

This patch adds missing timeline items for those HMS RPCs.

Tests:
 - Added e2e tests

Change-Id: Id231c2b15869aac2dae3258817954abf119da802
Reviewed-on: http://gerrit.cloudera.org:8080/22917
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-20 02:28:54 +00:00
Riza Suminto
44e9b6f97d IMPALA-14078: Reorganize test_ranger.py to share minicluster
test_ranger.py is a custom cluster test consisting of 41 test methods.
Each test method requires a minicluster restart. With IMPALA-13503, we
can reorganize the TestRanger class into 3 separate test classes:
TestRangerIndependent, TestRangerLegacyCatalog, and
TestRangerLocalCatalog. Both TestRangerLegacyCatalog and
TestRangerLocalCatalog can keep the same minicluster running without
restarting it in between, as sketched below.
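
A minimal sketch of the reorganization shape (Python; the grouping
mechanism and class bodies are elided, only the class names come from
this commit):

  # Tests that need per-test startup flags keep restarting the cluster.
  class TestRangerIndependent(CustomClusterTestSuite):
      ...

  # Tests sharing identical startup flags can reuse one running
  # minicluster across all methods in the class.
  class TestRangerLegacyCatalog(CustomClusterTestSuite):
      ...

  class TestRangerLocalCatalog(CustomClusterTestSuite):
      ...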

Testing:
- Run and pass test_ranger.py in exhaustive mode.
- Confirmed that no test is missing after reorganization.

Change-Id: I01ff2b3e98fccfffa8bcdfe1177be98634363b56
Reviewed-on: http://gerrit.cloudera.org:8080/22905
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-17 10:28:43 +00:00
Riza Suminto
3593a47a71 IMPALA-14060: Remove ImpalaConnection.get_default_configuration()
This patch removes ImpalaConnection.get_default_configuration() after
the refactoring done in IMPALA-14039.

Testing:
Run and pass test_queries.py::TestQueries.

Change-Id: Idf2a3a5b7b427a46ddd288bb7fbb16ba2803735d
Reviewed-on: http://gerrit.cloudera.org:8080/22903
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-16 01:19:34 +00:00
skatiyal
fae38aa77d IMPALA-13866: Add the timestamp in /jvm-threadz page
Enhanced the /jvm-threadz WebUI page to include a timestamp for every
jstack capture. This helps when comparing multiple jstacks captured via
the WebUI.

Change-Id: Ic0cb95028e175328c94aa2ad9df1f841efcde948
Reviewed-on: http://gerrit.cloudera.org:8080/22877
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-15 17:58:58 +00:00
Riza Suminto
6831076983 IMPALA-14072: Fix NPE in Catalogd during rename.
test_rename_drop fails with an NPE after IMPALA-14042. This happens
because CatalogServiceCatalog.renameTable() returns null when it cannot
find the database of oldTableName. This patch changes renameTable() to
return Pair.create(null, null) in that scenario.

Refactor test_rename_drop slightly to ensure that invalidating the
renamed table and dropping it are successful.

Testing:
- Added checkNotNull precondition in
  CatalogOpExecutor.alterTableOrViewRename()
- Increased catalogd_table_rename_delay to 6s to ensure that the DROP
  query happens in Catalogd before renameTable() is called. Manually
  observed that no NPE is shown anymore.

Change-Id: I7a421a71cf3703290645e85180de8e9d5e86368a
Reviewed-on: http://gerrit.cloudera.org:8080/22899
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-15 09:42:31 +00:00
Riza Suminto
f18cfaf0db IMPALA-14028: Refactor cancel_query_and_validate_state with HS2
cancel_query_and_validate_state is a helper method used to test query
cancellation with a concurrent fetch. It still uses the beeswax client
by default.

This patch changes the test method to use the HS2 protocol by default.
The changes include the following (a sketch of point 2 follows the
list):
1. Set TGetOperationStatusResp.operationState to
   TOperationState::ERROR_STATE if returning abnormally.
2. Use a separate MinimalHS2Client for
   (execute_async, fetch, get_runtime_profile) vs cancel vs close.
   Cancellation through KILL QUERY still instantiates a new
   ImpylaHS2Connection client.
3. Implement the missing required methods in MinimalHS2Client.
4. Change the MinimalHS2Client logging pattern to match the other
   clients.
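
A minimal sketch of the separate-clients idea from point 2 (Python; the
constructor arguments and helper names are assumptions, not the actual
test code):

  # One client drives the query lifecycle; an independent connection
  # issues the cancel so it is never blocked by an in-flight fetch on
  # the first client.
  exec_client = MinimalHS2Client(host, port)
  cancel_client = MinimalHS2Client(host, port)

  handle = exec_client.execute_async(query)
  fetcher = start_fetch_thread(exec_client, handle)  # concurrent fetch
  cancel_client.cancel(handle)
  fetcher.join()
  exec_client.close_query(handle)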

Testing:
Pass test_cancellation.py and TestResultSpoolingCancellation in core
exploration mode. Also set default_test_protocol to HS2 for these tests.

Change-Id: I626a1a06eb3d5dc9737c7d4289720e1f52d2a984
Reviewed-on: http://gerrit.cloudera.org:8080/22853
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-05-14 20:20:14 +00:00
Riza Suminto
f7cf4f8446 IMPALA-14070: Use checkedMultiply in SortNode.java
The maxRowsInHeaps calculation may overflow because it uses plain
multiplication. This patch fixes the bug by computing it with
checkedMultiply(). A broader refactoring will be done in IMPALA-14071.
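
For illustration, a Python sketch of the overflow check that an int64
checkedMultiply() performs (Python ints never overflow, so the range
check is explicit here):

  INT64_MAX = 2**63 - 1
  INT64_MIN = -2**63

  def checked_multiply(a, b):
      # Reject any product outside the signed 64-bit range instead of
      # silently wrapping around like plain Java long multiplication.
      product = a * b
      if not INT64_MIN <= product <= INT64_MAX:
          raise OverflowError("int64 overflow: %d * %d" % (a, b))
      return product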

Testing:
Added an e2e test, TestTopNHighNdv, that exercises the issue.

Change-Id: Ic6712b94f4704fd8016829b2538b1be22baaf2f7
Reviewed-on: http://gerrit.cloudera.org:8080/22896
Reviewed-by: Abhishek Rawat <arawat@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-14 04:11:48 +00:00
Riza Suminto
2040f66569 IMPALA-14042: Deflake TestConcurrentRename.test_rename_drop
TestConcurrentRename.test_rename_drop has been flaky because the
INVALIDATE query may arrive ahead of the ALTER TABLE RENAME query. This
patch deflakes it by replacing the sleep with an admission control wait
and a catalog version check. The first INVALIDATE query only starts
after the catalog version has increased since the CREATE TABLE query.
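
A minimal sketch of the version-check wait (Python; the metric name and
helper are assumptions, not the exact test code):

  import time

  def wait_for_catalog_version_beyond(service, baseline, timeout_s=60):
      # Poll until the catalog version advances past the version
      # recorded at CREATE TABLE time, then let INVALIDATE proceed.
      deadline = time.time() + timeout_s
      while time.time() < deadline:
          if service.get_metric_value("catalog.curr-version") > baseline:
              return
          time.sleep(0.1)
      raise AssertionError("catalog version stuck at/below %d" % baseline)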

Testing:
Loop the test 50x and pass them all.

Change-Id: I2539d5755aae6d375400b9a1289a658d0e7ba888
Reviewed-on: http://gerrit.cloudera.org:8080/22876
Reviewed-by: Yida Wu <wydbaggio000@gmail.com>
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-05-12 23:10:40 +00:00
Noemi Pap-Takacs
8170ec124d IMPALA-11672: Update 'transient_lastDdlTime' for Iceberg tables
The 'transient_lastDdlTime' table property was not updated for Iceberg
tables before this change. Now it is updated after DDL operations,
including DROP PARTITION.

Renaming an Iceberg table is an exception:
Iceberg does not keep track of the table name in the metadata files,
so there is no Iceberg transaction to change it.
The table name is a concept that exists only in the catalog.
If we rename the table, we only edit our catalog entry, but the metadata
stored on the file system - the table's state - does not change.
Therefore renaming an Iceberg table does not change the
'transient_lastDdlTime' table property because rename is a
catalog-level operation for Iceberg tables, and not table-level.

Testing:
 - added managed and non-managed Iceberg table DDL tests to
   test_last_ddl_update.py

Change-Id: I7e5f63b50bd37c80faf482c4baf4221be857c54b
Reviewed-on: http://gerrit.cloudera.org:8080/22831
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-12 18:35:07 +00:00
Surya Hebbar
7756e5bc32 IMPALA-13473: Add support for JS code analysis and linting with ESLint
This patch adds support for JS code analysis and linting to webUI
scripts using ESLint.

Enforcing code style and quality is particularly beneficial, as the
codebase for client-side scripts is steadily growing.

This has been implemented to work alongside other code style enforcement
rules present within 'critique-gerrit-review.py', which runs on the
existing jenkins job 'gerrit-auto-critic', to produce gerrit comments.

In the case of webUI scripts, ESLint's code analysis and linting checks
are performed to produce these comments.

As a shared NodeJS installation can be used for JS tests as well as
linting, a separate common script, "bin/nodejs/setup_nodejs.sh", has
been added to assist with the NodeJS installation.

To ensure quicker run times for the jenkins job, the NodeJS tarball is
cached in the "${HOME}/.cache" directory after the initial installation.

ESLint's packages and dependencies are cached through NPM's own package
management and are also cached locally.

NodeJS and the ESLint dependencies are fetched and run only if the
patchset changes any ".js" files, keeping the overhead minimal.

After analysis, comments are generated for all the violations according
to the specified rules.

A custom formatter has been added to extract, format and filter the
violations in JSON form.

These generated code style violations are formatted into the required
JSON form according to gerrit's REST API, similar to comments generated
by flake8. These are then posted to gerrit as comments
on the respective patchset from jenkins over SSH.

The following code style and quality rules have been added using ESLint.
  - Disallow unused variables
  - Enforce strict equality (=== and !==)
  - Require curly braces for all control statements (if, while, etc.)
  - Enforce semicolons at the end of statements
  - Enforce double quotes for strings
  - Set maximum line length to 90
  - Disallow `var`, use `let` or `const`
  - Prefer `const` where possible
  - Disallow multiple empty lines
  - Enforce spacing around infix operators (eg. +, =)
  - Disallow the use of undeclared variables
  - Require parentheses around arrow function arguments
  - Require a space before blocks
  - Enforce consistent spacing inside braces
  - Disallow shadowing variables declared in the outer scope
  - Disallow constant conditions in if statements, loops, etc
  - Disallow unnecessary parentheses in expressions
  - Disallow duplicate arguments in function definitions
  - Disallow duplicate keys in object literals
  - Disallow unreachable code after return, throw, continue, etc
  - Disallow reassigning function parameters
  - Require functions to always consistently return or not return at all
  - Enforce consistent use of dot notation wherever possible
  - Enforce spacing around the colon in object literal properties
  - Disallow optional chaining, where undefined values are not allowed

The required linting packages have been added as dependencies in the
"www/scripts" directory.

All the test scripts and related dependencies have been moved to
$IMPALA_HOME/tests/webui/js_tests.

All the custom ESLint formatter scripts and related dependencies have
been moved to $IMPALA_HOME/tests/webui/linting.

A combination of NodeJS's 'prefix' argument and the NODE_PATH
environment variable is used to separate the dependencies from the webUI
scripts, and to support running the tests from a remote directory (i.e.
tests/webui) by modifying the required base paths.

The JS scripts need to be updated according to these linting rules,
as per IMPALA-13986.

Change-Id: Ieb3d0a9221738e2ac6fefd60087eaeee4366e33f
Reviewed-on: http://gerrit.cloudera.org:8080/21970
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-11 01:07:14 +00:00
Riza Suminto
a3146ca722 IMPALA-13959: (addendum) Let test pass regardless of JDK version.
This patch modifies test_change_parquet_column_type so that it passes
regardless of the JDK version used in the test. The assertion is changed
from an exact string match to a regex.

Testing:
Run and pass the test with both JDK8 and JDK17.

Change-Id: I5bd3eebe7b1e52712033dda488f0c19882207f9d
Reviewed-on: http://gerrit.cloudera.org:8080/22874
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-09 21:51:43 +00:00
Zoltan Borok-Nagy
04735598d6 IMPALA-13718: Skip reloading Iceberg tables when metadata JSON file is the same
With this patch, Impala skips reloading Iceberg tables when the
metadata JSON file is the same, since that means the table is
essentially unchanged.

This can help in situations where the event processor is lagging behind
and an Iceberg table is updated frequently. Imagine the case when Impala
gets 100 events for an Iceberg table. After processing the first event,
our internal representation of the Iceberg table is already up-to-date;
there is no need to do the reload 100 times.

We cannot use the internal icebergApiTable_'s metadata location,
as the following statement might silently refresh the metadata
in 'current()':

 icebergApiTable_.operations().current().metadataFileLocation()

To guarantee that we check against the actually loaded metadata, this
patch introduces a new member that stores the metadata location.
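
A minimal sketch of the skip check (Python pseudocode with hypothetical
names; the actual change is in the Java catalog):

  def maybe_reload(table, event_metadata_location):
      # Compare against the location we actually loaded; querying
      # operations().current() could silently refresh and always match.
      if table.loaded_metadata_location == event_metadata_location:
          return  # same metadata JSON file: table unchanged, skip reload
      do_full_reload(table)
      table.loaded_metadata_location = event_metadata_location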

Testing
 * added e2e tests for REFRESH, also for event processing

Change-Id: I16727000cb11d1c0591875a6542d428564dce664
Reviewed-on: http://gerrit.cloudera.org:8080/22432
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Noemi Pap-Takacs <npaptakacs@cloudera.com>
2025-05-09 11:37:01 +00:00
Riza Suminto
f2acd2381f IMPALA-14039: __restore_query_options should unset query option
ImpalaTestSuite.__restore_query_options() attempts to restore the
client's configuration to what it understands as the "default" query
options.

Since IMPALA-13930, ImpalaConnection.get_default_configuration() parses
the default query options from the TQueryOption fields. Therefore, it
might not respect the server's defaults that come from the
--default_query_options flag.

ImpalaTestSuite.__restore_query_options() should simply unset any
configuration that was previously set, by running a SET query like this:

SET query_option="";

This patch also changes execute_query_using_vector() to simply unset the
client's configuration.

Follow-up cleanup will be tracked in IMPALA-14060.
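
A minimal sketch of the unset pattern (hypothetical helper; the real
code lives in ImpalaTestSuite):

  def restore_query_options(client, options_set_by_test):
      # Setting an option to the empty string unsets it, letting the
      # server fall back to its own --default_query_options values.
      for option in options_set_by_test:
          client.execute('SET %s="";' % option)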

Testing:
Run and pass test_queries.py::TestQueries.

Change-Id: I884986b9ecbcabf0b34a7346220e6ea4142ca923
Reviewed-on: http://gerrit.cloudera.org:8080/22862
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-09 00:48:58 +00:00
Riza Suminto
3210ec58c5 IMPALA-14006: Bound max_instances in CreateInputCollocatedInstances
IMPALA-11604 (part 2) changed how many instances to create in
Scheduler::CreateInputCollocatedInstances. This works when the left
child fragment of a parent fragment is distributed across nodes.
However, if the left child fragment instance is limited to only 1 node
(the case of an UNPARTITIONED fragment), the scheduler might
over-parallelize the parent fragment by scheduling too many instances on
a single node.

This patch attempts to mitigate the issue in two ways. First, it adds
bounding logic in PlanFragment.traverseEffectiveParallelism() to lower
parallelism further if the left (probe) side of the child fragment is
not well distributed across nodes.

Second, it adds TQueryExecRequest.max_parallelism_per_node to relay
information from Analyzer.getMaxParallelismPerNode() to the scheduler.
With this information, the scheduler can do additional sanity checks to
prevent Scheduler::CreateInputCollocatedInstances from
over-parallelizing a fragment. Note that this sanity check can also cap
MAX_FS_WRITERS option under a similar scenario.

Added a ScalingVerdict enum and TRACE-log it to show the scaling
decision steps.
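
A minimal sketch of the sanity check (Python pseudocode, hypothetical
names; the real logic is split between PlanFragment's
traverseEffectiveParallelism() and the scheduler):

  def cap_instances(requested, max_parallelism_per_node, num_hosting_nodes):
      # A fragment can never usefully run more instances than the
      # per-node parallelism limit times the nodes that host it.
      return min(requested, max_parallelism_per_node * num_hosting_nodes)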

Testing:
- Added a planner test and an e2e test that exercise the corner case
  under the COMPUTE_PROCESSING_COST=1 option.
- Manually commented out the bounding logic in
  traverseEffectiveParallelism() and confirmed that the scheduler's
  sanity check still enforces the bounding.

Change-Id: I65223b820c9fd6e4267d57297b1466d4e56829b3
Reviewed-on: http://gerrit.cloudera.org:8080/22840
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-07 03:34:15 +00:00
Riza Suminto
c0c6cc9df4 IMPALA-12201: Stabilize TestFetch
This patch attempts to stabilize TestFetch by using HS2 as the test
protocol. test_rows_sent_counters is modified to use the default
hs2_client. test_client_fetch_time_stats and
test_client_fetch_time_stats_incomplete are modified to use
MinimalHS2Connection, which has a simpler fetching mechanism
(ImpylaHS2Connection always fetches 10240 rows at a time).

Implemented the minimal functions needed to wait for the finished state
and pull the runtime profile in MinimalHS2Connection.

Testing:
Loop the test 50 times and pass them all.

Change-Id: I52651df37a318357711d26d2414e025cce4185c3
Reviewed-on: http://gerrit.cloudera.org:8080/22847
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-07 00:45:08 +00:00
Riza Suminto
cb496104d9 IMPALA-14027: Implement HS2 NULL_TYPE using TStringValue
HS2 NULL_TYPE should be implemented using TStringValue.

However, due to an incompatibility with the Hive JDBC driver
implementation at the time, Impala chose to implement the NULL type
using TBoolValue (see IMPALA-914, IMPALA-1370).

HIVE-4172 might be the root cause of that decision. Today, the Hive JDBC
driver (org.apache.hive.jdbc.HiveDriver) no longer has that issue, as
shown in this reproduction after applying this patch:

./bin/run-jdbc-client.sh -q "select null" -t NOSASL
Using JDBC Driver Name: org.apache.hive.jdbc.HiveDriver
Connecting to: jdbc:hive2://localhost:21050/;auth=noSasl
Executing: select null
----[START]----
NULL
----[END]----
Returned 1 row(s) in 0.343s

Thus, we can reimplement NULL_TYPE using TStringValue to match
HiveServer2 behavior.

Testing:
- Pass core tests.

Change-Id: I354110164b360013d9893f1eb4398c3418f80472
Reviewed-on: http://gerrit.cloudera.org:8080/22852
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-06 19:41:17 +00:00
Michael Smith
912114b6cd IMPALA-14022: Use longer timeouts for rename test
Extends timeouts for test_alter_table_rename_independent to allow more
time for catalog updates.

Change-Id: Ie3dcd7b93a37fc6d078fd562ae1c356596a758a6
Reviewed-on: http://gerrit.cloudera.org:8080/22856
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2025-05-06 16:55:35 +00:00
Michael Smith
8a4803f895 IMPALA-14022: Run test_alter_table_rename_independent serially
Run new test test_alter_table_rename_independent serially to avoid
delays from other catalog activity that make the test less predictable.

Change-Id: I6033dc533721a649dd476f2bf2c8c63a7a78ef59
Reviewed-on: http://gerrit.cloudera.org:8080/22841
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2025-05-01 15:53:34 +00:00
Fang-Yu Rao
8f7d2246ec IMPALA-12554: (Addendum) Add a flag to not consolidate requests by default
This patch adds a startup flag so that by default the catalog server
will not consolidate the grant/revoke requests sent to the Ranger server
when there are multiple columns involved in the GRANT/REVOKE statement.

Testing:
 - Added 2 end-to-end tests to make sure the grant/revoke requests
   sent to the Ranger server would be consolidated only when the flag
   is explicitly added when we start the catalog server.

Change-Id: I4defc59c048be1112380c3a7254ffa8655eee0af
Reviewed-on: http://gerrit.cloudera.org:8080/22833
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-01 11:34:38 +00:00
Yida Wu
d95c06cd6c IMPALA-14001: Start EXEC_TIME_LIMIT_S timer after backend execution begins
This patch fixes an issue where EXEC_TIME_LIMIT_S was inaccurately
enforced by including the planning time in its countdown. The timer
for EXEC_TIME_LIMIT_S is now started only after the coordinator
reaches the "Ready to start on the backends" state, ensuring that
this time limit applies strictly to the execution phase.

This patch also adds a DebugAction PLAN_CREATE in the planning phase
for testing purposes.

Tests:
Passed core tests.
Added an e2e test case, query_test/test_exec_time_limit.py.

Change-Id: I825e867f1c9a39a9097d1c97ee8215281a009d7d
Reviewed-on: http://gerrit.cloudera.org:8080/22837
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-05-01 09:31:53 +00:00
Venu Reddy
5db760662f IMPALA-12709: Add support for hierarchical metastore event processing
At present, the metastore event processor is single threaded.
Notification events are processed sequentially, with a maximum of 1000
events fetched and processed in a single batch. Multiple locks are used
to address the concurrency issues that may arise when catalog DDL
operation processing and metastore event processing try to access or
update catalog objects concurrently. Waiting for a lock, or for the file
metadata loading of a table, can slow the event processing and delay the
events that follow, even though those events may not depend on the
previous one. Altogether it can take a very long time to synchronize all
the HMS events.

The existing metastore event processing is turned into multi-level
event processing with the enable_hierarchical_event_processing flag,
which is not enabled by default. The idea is to segregate the events
based on their dependencies, maintain the order of events within each
dependency, and process them independently as much as possible. The
following 3 main classes implement the three-level threaded event
processing (a sketch of the dispatch flow follows the list):
1. EventExecutorService
   It provides the methods to initialize, start, clear, stop and
   drive metastore event processing in hierarchical mode. It is
   instantiated from MetastoreEventsProcessor and its methods are
   invoked from there. Upon receiving an event to process,
   EventExecutorService queues the event to the appropriate
   DbEventExecutor.
2. DbEventExecutor
   An instance of this class has an execution thread and manages the
   events of multiple databases with DbProcessors. A DbProcessor
   instance is maintained to store the context of each database within
   the DbEventExecutor. On each scheduled execution, the input events
   of a DbProcessor are segregated to the appropriate TableProcessors,
   and the database events that are eligible for processing are
   processed.
   Once a DbEventExecutor is assigned to a database, a DbProcessor is
   created, and subsequent events belonging to that database are queued
   to the same DbEventExecutor thread for further processing. Hence,
   linearizability is ensured for the events within a database. Each
   DbEventExecutor instance has a fixed list of TableEventExecutors.
3. TableEventExecutor
   An instance of this class has an execution thread and processes the
   events of multiple tables with TableProcessors. A TableProcessor
   instance is maintained to store the context of each table within a
   TableEventExecutor. On each scheduled execution, events from the
   TableProcessors are processed.
   Once a TableEventExecutor is assigned to a table, a TableProcessor
   is created, and subsequent table events are processed by the same
   TableEventExecutor thread. Hence, linearizability is guaranteed in
   processing the events of a particular table.
   - All the events of a table are processed in the same order they
     occurred.
   - Events of different tables are processed in parallel when those
     tables are assigned to different TableEventExecutors.
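
The promised sketch of the dispatch flow (Python; hash-based executor
assignment and all names here are illustrative assumptions, the real
implementation is in the Java catalog):

  import queue
  import threading

  class SerialExecutor:
      """One thread draining one FIFO queue: submissions run in order."""
      def __init__(self):
          self.q = queue.Queue()
          threading.Thread(target=self._run, daemon=True).start()

      def submit(self, fn):
          self.q.put(fn)

      def _run(self):
          while True:
              self.q.get()()

  class HierarchicalDispatcher:
      def __init__(self, num_db_execs, num_table_execs_per_db):
          # Each db-level executor owns a fixed list of table-level ones.
          self.db_execs = [
              (SerialExecutor(),
               [SerialExecutor() for _ in range(num_table_execs_per_db)])
              for _ in range(num_db_execs)]

      def dispatch(self, db, table, process_fn):
          # A database is pinned to one db-level executor and a table to
          # one table-level executor under it, so per-db and per-table
          # ordering holds while unrelated tables proceed in parallel.
          db_exec, table_execs = self.db_execs[hash(db) % len(self.db_execs)]
          if table is None:
              db_exec.submit(process_fn)  # database-level event
          else:
              table_execs[hash(table) % len(table_execs)].submit(process_fn)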

The following new events are added:
1. DbBarrierEvent
   This event wraps a database event. It is used to synchronize all
   the TableProcessors belonging to a database before processing the
   database event. It acts as a barrier that holds back the processing
   of table events that occurred after the database event until the
   database event has been processed on the DbProcessor.
2. RenameTableBarrierEvent
   This event wraps an alter table event for a rename. It is used to
   synchronize the source and target TableProcessors when processing
   the rename table event. It ensures the source TableProcessor removes
   the table first and only then allows the target TableProcessor to
   create the renamed table.
3. PseudoCommitTxnEvent and PseudoAbortTxnEvent
   CommitTxnEvent and AbortTxnEvent can involve multiple tables in a
   transaction, and processing these events modifies multiple table
   objects. Pseudo events are introduced such that a pseudo event is
   created for each table involved in the transaction, and these pseudo
   events are processed independently at their respective
   TableProcessors.

The following new flags are introduced:
1. enable_hierarchical_event_processing
   To enable the hierarchical event processing on catalogd.
2. num_db_event_executors
   To set the number of database level event executors.
3. num_table_event_executors_per_db_event_executor
   To set the number of table level event executors within a
   database event executor.
4. min_event_processor_idle_ms
   To set the minimum time to retain idle db processors and table
   processors on the database event executors and table event
   executors respectively, when they do not have events to process.
5. max_outstanding_events_on_executors
   To set the limit of maximum outstanding events to process on
   event executors.

Also changed the hms_event_polling_interval_s type from int to double to
support millisecond-precision intervals.

TODOs:
1. We need to redefine the lag in hierarchical processing mode.
2. Need a mechanism to capture the actual event processing time in
   hierarchical processing mode. Currently, with
   enable_hierarchical_event_processing set to true, lastSyncedEventId_
   and lastSyncedEventTimeSecs_ are updated upon event dispatch to
   EventExecutorService for processing on the respective DbEventExecutor
   and/or TableEventExecutor. So lastSyncedEventId_ and
   lastSyncedEventTimeSecs_ do not actually mean the events have been
   processed.
3. Hierarchical processing mode currently has a mechanism to show the
   total number of outstanding events on all the db and table executors
   at the moment. Observability needs to be enhanced further for this
   mode.
Filed a jira (IMPALA-13801) to fix them.

Testing:
 - Executed existing end-to-end tests.
 - Added fe and end-to-end tests with
   enable_hierarchical_event_processing.
 - Added event processing performance tests.
 - Executed the existing tests with hierarchical processing mode
   enabled. lastSyncedEventId_ is now also used in the new
   sync_hms_events_wait_time_s feature (IMPALA-12152). Some tests fail
   when hierarchical processing mode is enabled because
   lastSyncedEventId_ does not actually mean the event has been
   processed in this mode. This needs to be fixed/verified with the
   above jira (IMPALA-13801).

Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6
Reviewed-on: http://gerrit.cloudera.org:8080/21031
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-30 11:51:03 +00:00
stiga-huang
076536d508 IMPALA-13999: Refactor test_hms_event_sync_basic to be smaller parallel tests
test_hms_event_sync_basic is not a simple test: it actually tests
several kinds of statements in sequence.

This refactors it into smaller parallel tests, so there are more
concurrent HMS events to be processed and bugs are easier to reveal.

Renamed some tests to use shorter names.

Tests:
 - Ran all parallel tests of TestEventSyncWaiting 32 times.

Change-Id: I8a2be548697f6259961b83dc91230306f38e03ad
Reviewed-on: http://gerrit.cloudera.org:8080/22829
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-30 04:39:30 +00:00
Eyizoha
faf322dd41 IMPALA-12927: Support specifying format for reading JSON BINARY columns
Currently, Impala always assumes that the data in the binary columns of
JSON tables is base64 encoded. However, before HIVE-21240, Hive wrote
binary data to JSON tables without base64 encoding it, instead writing
it as escaped strings. After HIVE-21240, Hive defaults to base64
encoding binary data when writing to JSON tables and introduces the
serde property 'json.binary.format' to indicate the encoding method of
binary data in JSON tables.

To maintain consistency with Hive and avoid correctness issues caused by
reading data in an incorrect manner, this patch also introduces the
serde property 'json.binary.format' to specify the reading method for
binary data in JSON tables. Currently, this property supports reading in
either base64 or rawstring formats, same as Hive.

Additionally, this patch introduces a query option 'json_binary_format'
to achieve the same effect. This query option will only take effect for
JSON tables where the serde property 'json.binary.format' is not set.
The reading format of binary columns in JSON tables can be configured
globally by setting 'default_query_options'. Note that the default
value of 'json_binary_format' is 'NONE': Impala prohibits reading binary
columns of JSON tables that either have no 'json.binary.format' set
while 'json_binary_format' is 'NONE', or have an invalid
'json.binary.format' value, and reports an error so that an incorrect
format is never used without the user noticing.

Testing:
  - Enabled existing binary type E2E tests for JSON tables
  - Added new E2E test for 'json.binary.format'

Change-Id: Idf61fa3afc0f33caa63fbc05393e975733165e82
Reviewed-on: http://gerrit.cloudera.org:8080/22289
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-29 16:16:12 +00:00
stiga-huang
391cbd0824 IMPALA-13974: (Addendum) Skip TestEventSyncWaiting in non-HDFS builds
TestEventSyncWaiting depends on running statements on Hive which doesn't
work in non-HDFS builds. This patch skips the test in these builds.

Change-Id: I947ad23456c01bc76df0ad154519360549143b80
Reviewed-on: http://gerrit.cloudera.org:8080/22828
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-29 09:21:05 +00:00
stiga-huang
56b465d91f IMPALA-13829: Postpone catalog deleteLog GC for waitForHmsEvent requests
When a db/table is removed in the catalog cache, catalogd assigns it a
new catalog version and puts it into the deleteLog. This is used by the
catalog update thread to collect deletion updates. Once the catalog
update thread collects a range of updates, it triggers a GC in the
deleteLog to clear items older than the last sent catalog version. The
deletions will eventually be broadcast by the statestore to all the
coordinators.

However, waitForHmsEvent requests are also consumers of the deleteLog
and could be impacted by these GCs. waitForHmsEvent is a catalogd RPC
used by coordinators when a query wants to wait until the related
metadata is in sync with HMS. The response of waitForHmsEvent returns
the latest metadata including the deletions on related dbs/tables. If
the related deletions in the deleteLog are GCed just before the
waitForHmsEvent request collects the results, they will be missing from
the response. The coordinator might then keep using stale metadata of
non-existing dbs/tables.

This is a quick fix for the issue: it postpones the deleteLog GC by a
configurable number of topic updates, similar to what we have done for
the TopicUpdateLog. A thorough fix might need to carefully choose the
version to GC, or let impalad wait for the deletions from the statestore
to arrive.

A new flag, catalog_delete_log_ttl, is added for this. The deleteLog
items can survive for catalog_delete_log_ttl catalog updates. The
default is 60, so a deletion can survive for at least 120s. That should
be safe enough: the GCed deletions must have arrived on the impalad side
after 60 rounds of catalog updates; otherwise that impalad is abnormal
and already has other, more severe issues, e.g. lots of stale tables due
to metadata being out of sync with catalogd.
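
A minimal sketch of the postponed GC (Python pseudocode, hypothetical
names):

  class DeleteLog:
      def __init__(self, ttl_rounds=60):  # catalog_delete_log_ttl
          self.ttl_rounds = ttl_rounds
          self.entries = {}        # catalog version -> deleted object
          self.sent_versions = []  # last-sent version per topic update

      def on_topic_update(self, last_sent_version):
          # Instead of GCing everything below last_sent_version right
          # away, only GC entries below the version that was sent
          # ttl_rounds topic updates ago, so waitForHmsEvent readers
          # still see recent deletions.
          self.sent_versions.append(last_sent_version)
          if len(self.sent_versions) > self.ttl_rounds:
              floor = self.sent_versions.pop(0)
              self.entries = {v: o for v, o in self.entries.items()
                              if v >= floor}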

Note that postponing deleteLog GCs might increase memory consumption.
But since most of its memory is used by db/table/partition names, the
memory usage should still be trivial compared to other metadata like
file descriptors and incremental stats in live catalog objects.

This patch also removed some unused imports.

Tests:
 - Added an e2e test with a debug action to reproduce the issue. Ran
   the test 100 times. Without the fix, it consistently failed within
   2-3 runs.

Change-Id: I2441440bca2b928205dd514047ba742a5e8bf05e
Reviewed-on: http://gerrit.cloudera.org:8080/22816
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-29 07:41:41 +00:00
stiga-huang
0e3ae5c339 IMPALA-13996: Deflake test_too_many_files by creating dedicated tables
TestAllowIncompleteData.test_too_many_files depends on
tpch_parquet.lineitem having exactly 3 data files. This is false in
erasure coding builds, in which tpch_parquet.lineitem has only 2 data
files.

This fixes the test to use dedicated tables created in the test.

Change-Id: I28cec8ec4bc59f066aa15a7243b7163639706cc7
Reviewed-on: http://gerrit.cloudera.org:8080/22824
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-29 06:53:31 +00:00
Michael Smith
295f74ef12 IMPALA-13989: Invalidate table on rename failure
Handles the error "Table/view rename succeeded in the Hive Metastore,
but failed in Impala's Catalog Server" rather than failing the table
rename. This error happens when the catalog state catches up to the
alter event from our alter_table RPC to HMS before we call renameTable()
explicitly in the catalog. The catalog can update independently due to a
concurrent 'invalidate metadata' call.

In that case we use the oldTbl definition we already have - updated from
the delete log if possible - and fetch the new table definition with
invalidateTable() to continue, automating most of the work that the
error message suggested users do via 'invalidate metadata <tbl>'.

Updated the test_concurrent_ddls test to remove handle_rename_failure
and ran the tests a dozen times. Adds concurrency tests with
simultaneous rename and invalidate metadata that previously would fail.

Change-Id: Ic2a276b6e5ceb35b7f3ce788cc47052387ae8980
Reviewed-on: http://gerrit.cloudera.org:8080/22807
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-29 06:24:21 +00:00
Riza Suminto
96ae16b60b IMPALA-13584: Add option to show num rows report in impala-shell
In beeswax, all statements except USE print 'Fetched X row(s) in Ys',
while in HS2 some statements (REFRESH, INVALIDATE METADATA) do not print
it. While these statements always return 0 rows, the amount of time
spent in the statement can be useful.

This patch modifies impala-shell to print the elapsed time for such a
query, even if the query is not expected to return result metadata.
Added the --beeswax_compat_num_rows option to impala-shell. It defaults
to False. If this option is set to True, 'Fetched 0 row(s) in' will be
printed for all Impala protocols, just like beeswax. The one exception
is the USE query, which remains silent.

Testing:
- Added test_beeswax_compat_num_rows in test_shell_interactive.py.
- Pass test_shell_interactive.py.

Change-Id: Id76ede98c514f73ff1dfa123a0d951e80e7508b4
Reviewed-on: http://gerrit.cloudera.org:8080/22813
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-28 19:13:39 +00:00
Gabor Kaszab
3c24706c72 IMPALA-13268: Integrate Iceberg ScanMetrics into Impala query profiles
When calling planFiles() on an Iceberg table, it can give us some
metrics like total planning time, number of data/delete files and
manifests, how many of these could be skipped etc.

This change integrates these metrics into the query profile, under the
"Frontend" section. These metrics are per-table, so if multiple tables
are scanned for the query there will be multiple sections in the
profile.

Note that we only have these metrics for a table if Iceberg needs to be
used for planning for that table, e.g. if a predicate is pushed down to
Iceberg or if there is time travel. For tables where Iceberg was not
used in planning, the profile will contain a short note describing this.

To facilitate pairing the metrics with scans, the metrics header
references the plan node responsible for the scan. This will always be
the top level node for the scan, so it can be a SCAN node, a JOIN node
or a UNION node depending on whether the table has delete files.

Testing:
 - added EE tests in iceberg-scan-metrics.tests
 - added a test in PlannerTest.java that asserts on the number of
   metrics; if it changes in a new Iceberg release, the test will fail
   and we can update our reporting

Change-Id: I080ee8eafc459dad4d21356ac9042b72d0570219
Reviewed-on: http://gerrit.cloudera.org:8080/22501
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
2025-04-28 08:54:30 +00:00
stiga-huang
f092646f84 IMPALA-13993: waitForHmsEvent should check table events under missing dbs
When a db is missing in the catalog cache, the waitForHmsEvent request
currently just checks whether there are pending database events on it,
assuming that processing the CREATE_DATABASE event will add the db with
its table list. However, that's wrong, since processing the
CREATE_DATABASE event just adds the db with an empty table list. We
should still wait for pending events on the underlying tables.

Tests:
 - Added an e2e test which, without the fix, consistently fails when
   running concurrently with other tests in TestEventSyncWaiting.

Change-Id: I3fe74fcf0bf4dbac4a3584f6603279c0a2730b0c
Reviewed-on: http://gerrit.cloudera.org:8080/22817
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Quanlong Huang <huangquanlong@gmail.com>
2025-04-28 01:06:36 +00:00
Michael Smith
74bd0832ed IMPALA-13631: (Addendum) Test slow concurrent alters
Adds a test that multiple slow concurrent alters complete in parallel
rather than being executed sequentially.

The test creates tables, and ensures Impala's catalog is up-to-date for
their creation so that starting ALTER TABLE will be fast. Then starts
two ALTER TABLE RENAMES asynchronously - with debug_action ensuring each
takes at least 5 seconds - and waits for them to finish.

Verifies that concurrent alters are no longer blocked on "Got catalog
version read lock" and complete in less than 10 seconds.
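
A minimal sketch of the test shape (Python, hypothetical client helper
names):

  import time

  def run_concurrent_renames(client_a, client_b):
      # debug_action slows each ALTER to >= 5s; if the two serialized
      # on the catalog version lock, the pair would need >= 10s, so the
      # wall-clock bound below proves they ran in parallel.
      start = time.time()
      h1 = client_a.execute_async("ALTER TABLE t1 RENAME TO t1_renamed")
      h2 = client_b.execute_async("ALTER TABLE t2 RENAME TO t2_renamed")
      client_a.wait_for_finished_timeout(h1, timeout=10)
      client_b.wait_for_finished_timeout(h2, timeout=10)
      assert time.time() - start < 10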

Change-Id: I87d077aaa295943a16e6da60a2755dd289f3a132
Reviewed-on: http://gerrit.cloudera.org:8080/22804
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-26 16:06:16 +00:00
Mihaly Szjatinya
5e39afc2d9 IMPALA-10268: Validate the debug actions when they are set
This patch extracts the existing verification of the DEBUG_ACTION query
option format into the pre-planner stage SetQueryOption(), in order to
prevent failures at the execution stage. It also localizes the
verification code for the two existing types of debug actions.

There are two types of debug actions: global ones, e.g.
'RECVR_ADD_BATCH:FAIL', and ExecNode debug actions, e.g.
'0:GETNEXT:FAIL'. The two types are implemented independently in the
source code, both with verification code intertwined with execution. In
addition, global debug actions subdivide into C++ and Java variants,
though the two are more or less synchronized.

In case of global debug actions, most of the code inside existing
DebugActionImpl() consists of verification, therefore it makes sense to
make a wrapper around it for separating out the execution code.

Things are worse for ExecNode debug actions, where verification code
consists of two parts, one in DebugOptions() constructor and another one
in ExecNode::ExecDebugActionImpl(). Additionally, some verification in
constructor produces warnings, while ExecDebugActionImpl() verification
either fails on DCHECK() or (in a single case) returns an error. For
this case, a reasonable solution seems to be simply calling the
constructor for a temporary object and extracting verification code from
ExecNode::ExecDebugActionImpl(). This has the drawback of having the
same warning being produced two times.

Finally, having extracted verification code for both types, logic in
impala::SetQueryOption() combines the two verification mechanisms.

Note: In the long run, it is better to write a single verification
routine for both Global and ExecNode debug actions, ideally as part of a
general unification of the two existing debug_action mechanisms. With
this in mind, the current patch intends to preserve current behavior,
while avoiding complex refactoring.

Change-Id: I53816aba2c79b556688d3b916883fee7476fdbb5
Reviewed-on: http://gerrit.cloudera.org:8080/22734
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-25 22:25:19 +00:00
stiga-huang
7d6fe8c6c8 IMPALA-13487: Add profile counters for memory allocation in parquet scanners
This patch adds profile counters for memory allocation and freeing in
MemPools, which are useful for detecting tcmalloc contention.

The following counters are added:
 - Thread level page faults: TotalThreadsMinorPageFaults,
   TotalThreadsMajorPageFaults.
 - MemPool counters for tuple_mem_pool and aux_mem_pool of the scratch
   batch in columnar scanners:
    - ScratchBatchMemAllocDuration
    - ScratchBatchMemFreeDuration
    - ScratchBatchMemAllocBytes
 - MemPool counters for data_page_pool of ParquetColumnChunkReader
    - ParquetDataPagePoolAllocBytes
    - ParquetDataPagePoolAllocDuration
    - ParquetDataPagePoolFreeBytes
    - ParquetDataPagePoolFreeDuration
 - MemPool counters for the fragment level RowBatch
    - RowBatchMemPoolAllocDuration
    - RowBatchMemPoolAllocBytes
    - RowBatchMemPoolFreeDuration
    - RowBatchMemPoolFreeBytes
 - Duration in HdfsColumnarScanner::GetCollectionMemory() which includes
   memory allocation for collection values and memcpy when doubling the
   tuple buffer:
    - MaterializeCollectionGetMemTime

Here is an example of a memory-bound query:
  Fragment Instance
    - RowBatchMemPoolAllocBytes: 0 (Number of samples: 0)
    - RowBatchMemPoolAllocDuration: 0.000ns (Number of samples: 0)
    - RowBatchMemPoolFreeBytes: (Avg: 719.25 KB (736517) ; Min: 4.00 KB (4096) ; Max: 4.12 MB (4321922) ; Sum: 1.93 GB (2069615013) ; Number of samples: 2810)
    - RowBatchMemPoolFreeDuration: (Avg: 132.027us ; Min: 0.000ns ; Max: 21.999ms ; Sum: 370.997ms ; Number of samples: 2810)
    - TotalStorageWaitTime: 47.999ms
    - TotalThreadsInvoluntaryContextSwitches: 2 (2)
    - TotalThreadsMajorPageFaults: 0 (0)
    - TotalThreadsMinorPageFaults: 549.63K (549626)
    - TotalThreadsTotalWallClockTime: 9s646ms
      - TotalThreadsSysTime: 1s508ms
      - TotalThreadsUserTime: 1s791ms
    - TotalThreadsVoluntaryContextSwitches: 8.85K (8852)
    - TotalTime: 9s648ms
    ...
    HDFS_SCAN_NODE (id=0):
      - ParquetDataPagePoolAllocBytes: (Avg: 2.36 MB (2477480) ; Min: 4.00 KB (4096) ; Max: 4.12 MB (4321922) ; Sum: 1.02 GB (1090091508) ; Number of samples: 440)
      - ParquetDataPagePoolAllocDuration: (Avg: 1.263ms ; Min: 0.000ns ; Max: 39.999ms ; Sum: 555.995ms ; Number of samples: 440)
      - ParquetDataPagePoolFreeBytes: (Avg: 1.28 MB (1344350) ; Min: 4.00 KB (4096) ; Max: 1.53 MB (1601012) ; Sum: 282.06 MB (295757000) ; Number of samples: 220)
      - ParquetDataPagePoolFreeDuration: (Avg: 1.927ms ; Min: 0.000ns ; Max: 19.999ms ; Sum: 423.996ms ; Number of samples: 220)
      - ScratchBatchMemAllocBytes: (Avg: 486.33 KB (498004) ; Min: 4.00 KB (4096) ; Max: 512.00 KB (524288) ; Sum: 1.19 GB (1274890240) ; Number of samples: 2560)
      - ScratchBatchMemAllocDuration: (Avg: 1.936ms ; Min: 0.000ns ; Max: 35.999ms ; Sum: 4s956ms ; Number of samples: 2560)
      - ScratchBatchMemFreeDuration: 0.000ns (Number of samples: 0)
      - DecompressionTime: 1s396ms
      - MaterializeCollectionGetMemTime: 4s899ms
      - MaterializeTupleTime: 6s656ms
      - ScannerIoWaitTime: 47.999ms
      - TotalRawHdfsOpenFileTime: 0.000ns
      - TotalRawHdfsReadTime: 360.997ms
      - TotalTime: 9s254ms

The fragment instance took 9s648ms to finish, of which 370.997ms was
spent releasing memory of the final RowBatch. The majority of the time
was spent in the scan node (9s254ms), mostly in DecompressionTime +
MaterializeTupleTime + ScannerIoWaitTime + TotalRawHdfsReadTime. The
largest of these is MaterializeTupleTime (6s656ms).

ScratchBatchMemAllocDuration shows that invoking std::malloc() in
materializing the scratch batches took 4s956ms overall.
MaterializeCollectionGetMemTime shows that allocating memory for
collections and copying memory in doubling the tuple buffer took
4s899ms. So materializing the collections took most of the time.

Note that DecompressionTime (1s396ms) also includes memory allocation
duration tracked by the sum of ParquetDataPagePoolAllocDuration
(555.995ms). So memory allocation also takes a significant portion of
time here.

The other observation is that TotalThreadsTotalWallClockTime is much
higher than TotalThreadsSysTime + TotalThreadsUserTime, and there is a
large number of TotalThreadsVoluntaryContextSwitches. So the thread is
waiting for resources (e.g. a lock) for a long duration. In the above
case, it's waiting for locks in tcmalloc memory allocation (an off-cpu
flame graph is needed to reveal this).

Implementation of the MemPool counters:
Add MemPoolCounters to MemPool to track malloc/free durations and bytes.
Note that the counters are not updated in the destructor since it's
expected that all chunks are freed or transferred before the destructor
is called.

MemPool is widely used in the code base. This patch only exposes MemPool
counters in three places:
 - the scratch batch in columnar scanners
 - the ParquetColumnChunkReader of parquet scanners
 - the final RowBatch reset by FragmentInstanceState

This patch also moves GetCollectionMemory() from HdfsScanner to
HdfsColumnarScanner since it's only used by parquet and orc scanners.

PrettyPrint of SummaryStatsCounter is updated to also show the sum of
the values if they are not speeds or percentages.

Tests
 - tested in manually reproducing the memory-bound queries
 - ran perf-AB-test on tpch (sf=42) and didn't see significant
   performance change
 - added e2e tests
 - updated expected files of observability/test_profile_tool.py since
   SummaryStatsCounter now prints the sum in most cases. Also updated
   get_bytes_summary_stats_counter and
   test_get_bytes_summary_stats_counter accordingly.

Change-Id: I982315d96e6de20a3616f3bd2a2b4866d1ff4710
Reviewed-on: http://gerrit.cloudera.org:8080/22062
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-25 02:53:32 +00:00
Michael Smith
ef8f8ca27b IMPALA-13631: (Addendum) Retry aborted concurrent DDLs
TestConcurrentDdls has several exceptions it considers acceptable for
testing; it would accept the query failure and continue with other
cases. That was fine for existing queries, but if an ALTER RENAME fails
subsequent queries will also fail because the table does not have the
expected name.

With IMPALA-13631, there are three exception cases we need to handle:
1. "Table/view rename succeeded in the Hive Metastore, but failed in
   Impala's Catalog Server" happens when the HMS alter_table RPC
   succeeds but local catalog has changed. INVALIDATE METADATA on the
   target table is sufficient to bring things in sync.
2. "CatalogException: Table ... was modified while operation was in
   progress, aborting execution" can safely be retried.
3. "Couldn't retrieve the catalog topic update for the SYNC_DDL
   operation" happens when SYNC_DDL=1 and the DDL runs on a stale table
   object that's removed from the cache by a global INVALIDATE.

Adds --max_wait_time_for_sync_ddl_s=10 in catalogd_args for the last
exception to occur; otherwise the query would just time out.

Tested by running test_concurrent_ddls.py 15 times. The 1st exception
previously would show up within 3-4 runs, while the 2nd exception
happens pretty much every run.

Change-Id: I04d071b62e4f306466a69ebd9e134a37d4327b77
Reviewed-on: http://gerrit.cloudera.org:8080/22802
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2025-04-24 04:05:10 +00:00
Riza Suminto
0816986b15 IMPALA-13987: Fix stress_catalog_init_delay_ms check in RELEASE
stress_catalog_init_delay_ms does not exist in RELEASE builds, causing
a KeyError in impala_cluster.py. This patch fixes it by specifying a
default value when inspecting the
ImpaladService.get_flag_current_values() return value.
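
A minimal sketch of the fix pattern (the flag dict shape is an
assumption):

  flags = impalad_service.get_flag_current_values()
  # dict.get() with a default avoids the KeyError when a DEBUG-only
  # flag is missing from a RELEASE binary.
  delay_ms = int(flags.get("stress_catalog_init_delay_ms", 0))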

Testing:
Ran start-impala-cluster.py against a RELEASE build and it works.

Change-Id: Ia4400a7e711d21d23cc37878f18f2e0389b741b0
Reviewed-on: http://gerrit.cloudera.org:8080/22803
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-23 03:30:52 +00:00
stiga-huang
ab92a300fc IMPALA-13631: alterTableOrViewRename shouldn't hold catalog versionLock during external RPCs
Catalog versionLock is a lock used to synchronize reads/writes of
catalogVersion. It can be used to perform atomic bulk catalog operations
since catalogVersion cannot change externally while the lock is being
held. All other catalog operations will be blocked if the current thread
holds the lock. So it shouldn't be held for a long time, especially when
the current thread is invoking external RPCs for a table.

CatalogOpExecutor.alterTable() is one place that could hold the lock
for a long time. If the ALTER operation is a RENAME, it holds the lock
until alterTableOrViewRename() finishes. HMS RPCs are invoked in this
method to perform the operation, which might take an unpredictable
amount of time. The motivation for holding this lock is that RENAME is
implemented as a DROP + ADD in the catalog cache, so the operation can
be atomic. However, that doesn't mean we need the lock before operating
on the cache in CatalogServiceCatalog.renameTable(): we actually acquire
the lock again in this method, so there is no need to keep holding it
while invoking HMS RPCs.

This patch removes holding the lock in alterTableOrViewRename().

Tests
 - Added e2e test for concurrent rename operations.
 - Also added some rename operations in test_concurrent_ddls.py

Change-Id: Ie5f443b1e167d96024b717ce70ca542d7930cb4b
Reviewed-on: http://gerrit.cloudera.org:8080/22789
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
2025-04-21 22:54:05 +00:00
stiga-huang
4ddacac14f IMPALA-11402: Add limit on files fetched by a single getPartialCatalogObject request
getPartialCatalogObject is a catalogd RPC used by local catalog mode
coordinators to fetch metadata on-demand from catalogd.
For a table with a huge number (e.g. 6M) of files, catalogd might hit
an OOM from exceeding the JVM array limit when serializing the response
of a getPartialCatalogObject request for all partitions (and thus all
files).
This patch adds a new flag, catalog_partial_fetch_max_files, to define
the max number of file descriptors allowed in a response of
getPartialCatalogObject. Catalogd will truncate the response at the
partition level when it's too big, and only return a subset of the
requested partitions. The coordinator should send new requests to fetch
the remaining partitions. Note that it's possible for the table metadata
to change between the requests. The coordinator will detect the catalog
version change and throw an InconsistentMetadataFetchException for the
planner to replan the query. This is an existing mechanism for other
kinds of table metadata.

Here are some measurements of the number of files in a single response
with the corresponding byte array size and response duration:
 * 1000000: 371.71MB, 1s487ms
 * 2000000: 744.51MB, 4s035ms
 * 3000000: 1.09GB, 6s643ms
 * 4000000: 1.46GB, duration not measured due to GC pauses
 * 5000000: 1.82GB, duration not measured due to GC pauses
 * 6000000: >2GB (hit OOM)
We choose 1000000 as the default value for now and can tune it in the
future.

Tests:
 - Added custom-cluster test
 - Ran e2e tests in local-catalog mode with
   catalog_partial_fetch_max_files=1000 so the new codes are used.

Change-Id: Ibb13fec20de5a17e7fc33613ca5cdebb9ac1a1e5
Reviewed-on: http://gerrit.cloudera.org:8080/22559
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-21 15:20:59 +00:00
Riza Suminto
a29319e4b9 IMPALA-13970: Add NaN and Infinity parsing in ImpylaHS2ResultSet
This patch adds NaN, Infinity, and boolean parsing in ImpylaHS2ResultSet
to match the beeswax results. TestQueriesJsonTables is changed to test
all client protocols.
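
A minimal sketch of the conversion (Python; the real code is in
ImpylaHS2ResultSet's value conversion, and this standalone helper is an
illustration):

  import math

  def convert_result_value(val):
      # Render HS2 values the way beeswax does: float specials as
      # 'NaN'/'Infinity' and booleans in lower case.
      if isinstance(val, bool):
          return str(val).lower()  # 'true' / 'false'
      if isinstance(val, float):
          if math.isnan(val):
              return "NaN"
          if math.isinf(val):
              return "Infinity" if val > 0 else "-Infinity"
      return str(val)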

Testing:
Run and pass TestQueriesJsonTables.

Change-Id: I739a88e9dfa418d3a3c2d9d4181b4add34bc6b93
Reviewed-on: http://gerrit.cloudera.org:8080/22785
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-04-18 15:27:22 +00:00
Riza Suminto
9000c83efc IMPALA-13971: Deflake TestAdmissionController.test_user_loads_rules
TestAdmissionController.test_user_loads_rules is flaky because the last
query, which should exceed the user quota, sometimes does not fail. The
test executes queries in a round-robin fashion across all impalads.
These impalads are expected to synchronize their user quota counts
through statestore updates.

This patch attempts to deflake the test by raising the heartbeat wait
time from 1 heartbeat period to 3 heartbeat periods. It also changes the
rejected query to a fast version of SLOW_QUERY (without the sleep) so
the test can fail fast if the query is not rejected.

Testing:
Loop the test 50 times and pass them all.

Change-Id: Ib2ae8e1c2edf174edbf0e351d3c2ed06a0539f08
Reviewed-on: http://gerrit.cloudera.org:8080/22787
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-04-18 15:27:22 +00:00
Riza Suminto
648209b172 IMPALA-13967: Move away from setting user parameter in execute
ImpalaConnection.execute and ImpalaConnection.execute_async have a
'user' parameter to set the specific user to run the query as. This is
mainly a legacy of BeeswaxConnection, which allows using one client to
run queries under different usernames.

BeeswaxConnection and ImpylaHS2Connection actually allow specifying one
user per client. Doing so simplifies user-specific tests such as
test_ranger.py, which often instantiates separate clients for the admin
user and a regular user. There is no need to specify the 'user'
parameter anymore when calling execute() or execute_async(), which
reduces potential bugs from forgetting to set it or setting it to an
incorrect value.

This patch applies the one-user-per-client practice as much as possible
in test_ranger.py, test_authorization.py, and
test_admission_controller.py. Unused code and pytest fixtures are
removed. A few flake8 issues are addressed too. Their
default_test_protocol() is overridden to return 'hs2'.

ImpylaHS2Connection.execute() and ImpylaHS2Connection.execute_async()
are slightly modified to assume ImpylaHS2Connection.__user if the 'user'
parameter is None. BeeswaxConnection remains unchanged.

Extended ImpylaHS2ResultSet.__convert_result_value() to lower-case
boolean return values to match the beeswax results.
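
A minimal sketch of the one-user-per-client practice (Python; the
factory signature here is an assumption):

  # One client per user up front, no per-call 'user' parameter.
  admin_client = self.create_impala_client(protocol="hs2", user="admin")
  user_client = self.create_impala_client(protocol="hs2", user="non_admin")

  admin_client.execute(
      "GRANT SELECT ON TABLE functional.alltypes TO USER non_admin")
  user_client.execute("SELECT count(*) FROM functional.alltypes")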

Testing:
Run and pass all modified tests in exhaustive exploration.

Change-Id: I20990d773f3471c129040cefcdff1c6d89ce87eb
Reviewed-on: http://gerrit.cloudera.org:8080/22782
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-04-18 15:27:22 +00:00
Riza Suminto
182aa5066e IMPALA-13958: Revisit hs2_parquet_constraint and hs2_text_constraint
hs2_parquet_constraint and hs2_text_constraint are meant to extend the
test vector dimension to also test a non-default test protocol (other
than beeswax), but limit it to only run against the 'parquet/none' or
'text/none' format respectively.

This patch modifies these constraints into
default_protocol_or_parquet_constraint and
default_protocol_or_text_constraint respectively, such that full file
format coverage happens for the default_test_protocol configuration and
is limited for the other protocols. hs2_parquet_constraint is dropped
entirely from test_utf8_strings.py because that test is already
constrained to the single 'parquet/none' file format.
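
Conceptually, such a constraint is a predicate over the test vector; a
sketch under assumed field names (the real constraints live in the test
infrastructure):

    DEFAULT_TEST_PROTOCOL = 'beeswax'        # assumed env-driven default

    def default_protocol_or_parquet_constraint(vector):
        # Full file-format coverage on the default protocol; other
        # protocols only run against parquet/none (field names assumed).
        table_format = vector.get_value('table_format')
        return (vector.get_value('protocol') == DEFAULT_TEST_PROTOCOL or
                (table_format.file_format == 'parquet' and
                 table_format.compression_codec == 'none'))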

The num-modified-rows validation in date-fileformat-support.test and
date-partitioning.test is changed to check the NumModifiedRows counter
from the profile.

Fix TestQueriesJsonTables to always run with the beeswax protocol
because its assertions rely on beeswax-specific return values.

Run impala-isort and fix a few flake8 issues in the modified test files.

Testing:
Run and pass the affected test files using exhaustive exploration and
env var DEFAULT_TEST_PROTOCOL=hs2. Confirmed that full file format
coverage happens for the hs2 protocol. Note that
DEFAULT_TEST_PROTOCOL=beeswax is still the default.

Change-Id: I8be0a628842e29a8fcc036180654cd159f6a23c8
Reviewed-on: http://gerrit.cloudera.org:8080/22775
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-17 22:50:58 +00:00
stiga-huang
fb789df3be IMPALA-13684: Improve waitForHmsEvent() to only wait for related events
waitForHmsEvent is a catalogd RPC for coordinators to send a requested
db/table names to catalogd and wait until it's safe (i.e. no stale
metadata) to start analyzing the statement. The wait time is configured
by query option sync_hms_events_wait_time_s. Currently, when this option
is enabled, catalogd waits until it syncs to the latest HMS event
regardless what the query is.

This patch reduces waiting by only checking related events and waiting
until the last related event has been processed. In the ideal case, if
there are no related pending events, the query doesn't need to wait at
all.

Related pending events are determined as follows (see the sketch after
this list):
 - For queries that need the db list, i.e. SHOW DATABASES, check pending
   CREATE/ALTER/DROP_DATABASE events on all dbs. ALTER_DATABASE events
   are checked in case the ownership changes and impacts visibility.
 - For db statements like SHOW FUNCTIONS, CREATE/ALTER/DROP DATABASE,
   check pending CREATE/ALTER/DROP events on that db.
   - For db statements that require the table list, i.e. SHOW TABLES,
     also check CREATE_TABLE, DROP_TABLE events under that db.
 - For table statements,
   - check all database events on related db names.
   - If there are loaded transactional tables, check all the pending
     COMMIT_TXN, ABORT_TXN events. Note that these events might modify
     multiple transactional tables and we don't know their table names
     until they are processed. To be safe, wait for all transactional
     events.
   - For all the other table names,
     - if they are all missing/unloaded in the catalog, check all the
       pending CREATE_TABLE, DROP_TABLE events on them for their
       existence.
     - Otherwise, some of them are loaded, check all the table events on
       them. Note that we can fetch events on multiple tables under the
       same db in a single fetch.
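
A Python-flavored sketch of the decision rules above; all names are
illustrative, and the real implementation is Java code in catalogd:

    def related_pending_events(stmt, catalog, pending):
        # Queries needing the db list: all pending database events matter.
        if stmt.needs_db_list():
            return [e for e in pending if e.is_db_event()]
        # Db statements: database events on that db, plus CREATE/DROP_TABLE
        # under it when the table list is needed (e.g. SHOW TABLES).
        if stmt.is_db_statement():
            related = [e for e in pending
                       if e.is_db_event() and e.db == stmt.db]
            if stmt.needs_table_list():
                related += [e for e in pending if e.db == stmt.db and
                            e.type in ('CREATE_TABLE', 'DROP_TABLE')]
            return related
        # Table statements: database events on related db names first.
        related = [e for e in pending
                   if e.is_db_event() and e.db in stmt.dbs]
        if any(catalog.is_loaded_transactional(t) for t in stmt.tables):
            # Txn events may modify tables we cannot name until they are
            # processed, so wait for all pending transactional events.
            related += [e for e in pending
                        if e.type in ('COMMIT_TXN', 'ABORT_TXN')]
        others = [t for t in stmt.tables
                  if not catalog.is_loaded_transactional(t)]
        if all(not catalog.is_loaded(t) for t in others):
            # Only existence matters for tables not loaded in the catalog.
            related += [e for e in pending if e.table in others and
                        e.type in ('CREATE_TABLE', 'DROP_TABLE')]
        else:
            related += [e for e in pending if e.table in others]
        return related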

If the statement has a SELECT part, views will be expanded so the
underlying tables are checked as well. For performance, this feature
assumes that views won't be changed into tables, and vice versa; that is
a rare occurrence in regular jobs, and users should use INVALIDATE for
such cases.

This patch leverages the HMS API to fetch events of several tables under
the same db in batch. MetastoreEventsProcessor.MetaDataFilter is
improved for this.

Tests:
 - Added test for multiple tables in a single query.
 - Added test with views.
 - Added test for transactional tables.
 - Ran CORE tests.

Change-Id: Ic033b7e197cd19505653c3ff80c4857cc474bcfc
Reviewed-on: http://gerrit.cloudera.org:8080/22571
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-17 09:06:19 +00:00
Riza Suminto
55feffb41b IMPALA-13850 (part 1): Wait until CatalogD active before resetting
In HA mode, CatalogD initialization can fail to complete within a
reasonable time. Log messages showed that CatalogD was blocked trying to
acquire "CatalogServer.catalog_lock_" when calling
CatalogServer::UpdateActiveCatalogd() during statestore subscriber
registration. catalog_lock_ was held by GatherCatalogUpdatesThread,
which was calling GetCatalogDelta(), which in turn waits for the java
lock versionLock_ held by the thread doing CatalogServiceCatalog.reset().

This patch removes the catalog reset in the JniCatalog constructor. In
turn, catalogd-server.cc is now responsible for triggering the metadata
reset (Invalidate Metadata) only if:

1. It is the active CatalogD, and
2. The gathering thread has collected the first topic update, or
   CatalogD is set with a catalog_topic_mode other than "minimal".

The latter prerequisite ensures that coordinators are not blocked
waiting for a full topic update in on-demand metadata mode. This is all
managed by a new thread method, TriggerResetMetadata, that monitors for
these conditions and triggers the initial metadata reset.
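
The gating condition amounts to something like this sketch (the actual
code is a C++ thread in catalogd-server.cc, so these names are
illustrative only):

    def should_trigger_initial_reset(server):
        # 1. Only the active CatalogD may reset metadata.
        if not server.is_active():
            return False
        # 2. Either the gathering thread has collected the first topic
        #    update, or the topic mode is not "minimal" (on-demand mode).
        return (server.first_topic_update_collected() or
                server.catalog_topic_mode != 'minimal')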

Note that this is a behavior change in on-demand catalog
mode (catalog_topic_mode=minimal). Previously, on-demand catalog mode
would send the full database list in its first catalog topic update.
This behavior change is OK since coordinators can request metadata
on demand.

After this patch, catalog-server.active-status and the /healthz page can
turn to true and OK respectively even if the very first metadata reset
is still ongoing. Observers that care about having fully populated
metadata should check other metrics such as catalog.num-db,
catalog.num-tables, or the /catalog page content.

Updated the start-impala-cluster.py readiness check to wait for at least
one table to be seen by coordinators, except during create-load-data.sh
execution (there are no tables yet) and when use_local_catalog=true (the
local catalog cache does not start with any table). Modified startup
flag checking from reading the actual command line args to reading the
'/varz?json' page of the daemon. Cleaned up impala_service.py to fix
some flake8 issues.

Slightly updated TestLocalCatalogCompactUpdates::test_restart_catalogd
so that the unique_database cleanup succeeds.

Testing:
- Refactor test_catalogd_ha.py to reduce repeated code, use
  unique_database fixture, and additionally validate /healthz page of
  both active and standby catalogd. Changed it to test using hs2
  protocol by default.
- Run and pass test_catalogd_ha.py and test_concurrent_ddls.py.
- Pass core tests.

Change-Id: I58cc66dcccedb306ff11893f2916ee5ee6a3efc1
Reviewed-on: http://gerrit.cloudera.org:8080/22634
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-04-17 01:59:54 +00:00
stiga-huang
f22b805c88 IMPALA-13936: REFRESH should wait for ALTER ownership events
Coordinator uses collectTableRefs() to collect table names used by a
statement. For ResetMetadataStmt used by REFRESH and INVALIDATE METADATA
commands, it's intended to not return the table name in
collectTableRefs() to avoid triggering unneccessary table metadata
loading. However, when this method is used for the HMS event sync
feature, we do want to know what the table is. Thus, catalogd can return
the latest metadata of it after waiting for HMS events are synced. This
bug leads to REFRESH/INVALIDATE not waiting for HMS ALTER ownership
events to be synced. REFRESH/INVALIDATE statements might unexpectedly
fail or succeed due to stale ownership info in coordinators.

To avoid changing the existing logic of collectTableRefs(), this patch
uses getTableName() directly for REFRESH statements, since we know it's
a single-table statement. There are other kinds of such single-table
statements, like DROP TABLE. To be generic, the patch introduces a new
interface, SingleTableStmt, for all statements that have a single table
name. If a statement is a SingleTableStmt, we use getTableName()
directly instead of collectTableRefs() in collectRequiredObjects().

This improves how the coordinator collects table names for single-table
statements. E.g. "DROP TABLE mydb.foo" previously had two candidate
table names - "mydb.foo" and "default.mydb" (assuming the session db is
"default"). Now it just collects "mydb.foo", and catalogd can return
less metadata in the response.
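
A Python-flavored sketch of the dispatch (the frontend is Java, so
these signatures are illustrative only):

    class SingleTableStmt(object):
        # Marker interface (sketch) for statements with exactly one table.
        def get_table_name(self):
            raise NotImplementedError

    def collect_required_objects(stmt):
        # Single-table statements skip collectTableRefs() entirely and
        # contribute exactly one candidate table name.
        if isinstance(stmt, SingleTableStmt):
            return [stmt.get_table_name()]
        return stmt.collect_table_refs()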

Tests:
 - Added FE tests for collectRequiredObjects() where coordinators
   collect db/table names.
 - Added authorization tests on altering the ownership in Hive and
   running queries in Impala.

Change-Id: I813007e9ec42392d0f6d3996331987c138cc4fb8
Reviewed-on: http://gerrit.cloudera.org:8080/22743
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-16 17:04:42 +00:00
Riza Suminto
b46d541501 IMPALA-13961: Remove usage of ImpalaBeeswaxResult.schema
An equivalent of ImpalaBeeswaxResult.schema is not implemented in
ImpylaHS2ResultSet. However, the column_labels and column_types fields
are implemented for both.

This patch removes usage of ImpalaBeeswaxResult.schema and replaces it
with either the column_labels or column_types field. Tests that used to
access ImpalaBeeswaxResult.schema are migrated to test using the hs2
protocol by default. Also fixes flake8 issues in the modified test
files.
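
For example, an assertion written against the beeswax-only schema
attribute migrates roughly like this (attribute shapes are assumptions):

    def check_first_column(result):
        # Old, beeswax-only style (sketch):
        #   assert result.schema.fieldSchemas[0].name == 'id'
        # New style, implemented by both result classes:
        assert result.column_labels[0] == 'id'
        assert result.column_types[0] == 'INT'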

Testing:
Run and pass modified test files in exhaustive exploration.

Change-Id: I060fe2d3cded1470fd09b86675cb22442c19fbee
Reviewed-on: http://gerrit.cloudera.org:8080/22776
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-16 06:28:11 +00:00
Joe McDonnell
c5a0ec8bdf IMPALA-11980 (part 1): Put all thrift-generated python code into the impala_thrift_gen package
This puts all of the thrift-generated python code into the
impala_thrift_gen package. This is similar to what Impyla does for its
thrift-generated python code, except that Impala uses the
impala_thrift_gen package rather than impala._thrift_gen. This is a
preparatory patch for fixing the absolute import issues.

This patch modifies all of the thrift files to add the python namespace,
and includes code to apply the same patching to the third-party thrift
files (hive_metastore.thrift, fb303.thrift).

Putting all the generated python into a package makes it easier to
understand where the imports are getting code from. When the subsequent
change rearranges the shell code, the thrift-generated code can stay in
a separate directory.
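
The import-site effect is roughly the following (TCLIService is used
here only as an assumed example of a generated module):

    # Before: generated modules imported from the top level.
    #   from TCLIService import TCLIService
    # After: everything lives under the impala_thrift_gen package.
    #   from impala_thrift_gen.TCLIService import TCLIService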

This uses isort to sort the imports of the affected Python files with
the provided .isort.cfg file. It also adds an impala-isort shell script
to make isort easy to run.

Testing:
 - Ran a core job

Change-Id: Ie2927f22c7257aa38a78084efe5bd76d566493c0
Reviewed-on: http://gerrit.cloudera.org:8080/20169
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
2025-04-15 17:03:02 +00:00
stiga-huang
c0174bb438 IMPALA-13960: Add catalog timeline item for prepareInsertEventData
When enable_insert_events is set to true (the default), Impala fires HMS
INSERT events for each INSERT statement. Preparing the data of the
InsertEvents actually takes time since it fetches checksums of all the
new files. This patch adds a catalog timeline item to reveal this step.

Before this patch, the duration of "Got Metastore client" before "Fired
Metastore events" could be long:

    Catalog Server Operation: 65.762ms
       - Got catalog version read lock: 12.724us (12.724us)
       - Got catalog version write lock and table write lock: 224.572us (211.848us)
       - Got Metastore client: 418.346us (193.774us)
       - Got Metastore client: 29.001ms (28.583ms) <---- Unexpected long
       - Fired Metastore events: 52.665ms (23.663ms)

After this patch, the timeline shows that what actually takes the time
is "Prepared InsertEvent data":

    Catalog Server Operation: 61.597ms
       - Got catalog version read lock: 7.129us (7.129us)
       - Got catalog version write lock and table write lock: 114.476us (107.347us)
       - Got Metastore client: 200.040us (85.564us)
       - Prepared InsertEvent data: 25.335ms (25.135ms)
       - Got Metastore client: 25.342ms (7.009us)
       - Fired Metastore events: 46.625ms (21.283ms)

Tests:
 - Added e2e test

Change-Id: Iaef1cae7e8ca1c350faae8666ab1369717736978
Reviewed-on: http://gerrit.cloudera.org:8080/22778
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-15 16:24:56 +00:00
Riza Suminto
50a98dce46 IMPALA-13959: Fix TestHmsIntegration.test_change_parquet_column_type
TestHmsIntegration.test_change_parquet_column_type fails in exhaustive
mode due to missing int parsing introduced by IMPALA-13920.

This patch adds the missing int parsing. It also fixes flake8 issues in
test_hms_integration.py, including an unused vector fixture.

Testing:
Run and pass test_hms_integration.py in exhaustive mode.

Change-Id: If5fb9f96b4087e86b0ebaac7135e14b7a14936ea
Reviewed-on: http://gerrit.cloudera.org:8080/22774
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-14 12:04:48 +00:00
Riza Suminto
047cf9ff4d IMPALA-13954: Validate num inserted rows via NumModifiedRows counter
This patch changes the way tests validate the number of inserted rows,
from checking the beeswax-specific result to checking the
NumModifiedRows counter from the query profile.
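
A hedged sketch of that check; the counter's exact formatting in the
profile text is an assumption:

    import re

    def sum_num_modified_rows(profile_text):
        # Sum every NumModifiedRows counter found in the profile (sketch).
        return sum(int(n) for n in
                   re.findall(r'NumModifiedRows: (\d+)', profile_text))

    # Hypothetical usage in a test:
    #   result = client.execute("insert into t values (1), (2)")
    #   assert sum_num_modified_rows(result.runtime_profile) == 2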

Remove the skipping over the hs2 protocol in test_chars.py and refactor
test_date_queries.py a bit to reduce test skipping. Added HS2_TYPES in
tests that require it and fixed some flake8 issues.

Testing:
Run and pass all affected tests.

Change-Id: I96eae9967298f75b2c9e4d0662fcd4a62bf5fffc
Reviewed-on: http://gerrit.cloudera.org:8080/22770
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
2025-04-11 20:31:50 +00:00
Riza Suminto
0ed4e869de IMPALA-13930: ImpylaHS2Connection should only open cursor as needed
Before this patch, ImpylaHS2Connection unconditionally opened a
cursor (and HS2 session) as soon as it connected, followed by running a
"SET ALL" query to populate the default query options.

This patch changes the behavior of ImpylaHS2Connection to open the
default cursor only when a query needs to run for the first time. This
helps preserve assertions for tests that are sensitive to client
connections, like IMPALA-13925. Default query options are now parsed
from a newly instantiated TQueryOptions object rather than by issuing a
"SET ALL" query or making a BeeswaxService.get_default_configuration()
RPC.
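
A minimal sketch of the lazy-open behavior, assuming an impyla-style
connection object underneath:

    class LazyHS2Connection(object):
        def __init__(self, conn):
            self._conn = conn
            self._cursor = None              # no HS2 session opened yet

        def _get_cursor(self):
            # The cursor (and its HS2 session) is opened only on first
            # use, so merely constructing the client leaves no server
            # state behind.
            if self._cursor is None:
                self._cursor = self._conn.cursor()
            return self._cursor

        def execute(self, sql):
            return self._get_cursor().execute(sql)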

Fix test_query_profile_contains_query_compilation_metadata_cached_event
slightly by setting the 'sync_ddl' option because the test is flaky
without it.

Tweak test_max_hs2_sessions_per_user to run queries so that sessions
will open.

Deduplicate test cases between utc-timestamp-functions.test and
local-timestamp-functions.test. Rename TestUtcTimestampFunctions to
TestTimestampFunctions, and expand it to also test
local-timestamp-functions.test and
file-formats-with-local-tz-conversion.test. The table_format is now
constrained to 'text/none' because it is unnecessary to permute other
table formats.

Deprecate the 'use_local_tz_for_unix_timestamp_conversions' flag in
favor of the query option with the same name. Filed IMPALA-13953 to
update the documentation of the
'use_local_tz_for_unix_timestamp_conversions' flag/option.

Testing:
Run and pass a few pytests such as:
test_admission_controller.py
test_observability.py
test_runtime_filters.py
test_session_expiration.py
test_set.py

Change-Id: I9d5e3e5c11ad386b7202431201d1a4cff46cbff5
Reviewed-on: http://gerrit.cloudera.org:8080/22731
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-11 04:37:14 +00:00