PartitionDeltaUpdater has two subclasses, PartNameBasedDeltaUpdater and
PartBasedDeltaUpdater. They are used when reloading the metadata of a table.
Their constructors invoke HMS RPCs which can be slow and should be
tracked in the catalog timeline.
This patch adds the missing timeline items for those HMS RPCs.
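As a rough illustration only (not the actual patch code), the idea is to record a
catalog timeline item right after each potentially slow HMS call; the Timeline class
and the method names below are assumptions made for this sketch:
  import java.util.ArrayList;
  import java.util.List;

  // Stand-in for Impala's catalog timeline abstraction.
  class Timeline {
    private final List<String> events = new ArrayList<>();
    private final long startNs = System.nanoTime();
    void markEvent(String label) {
      events.add(String.format("%s: %.3fms", label, (System.nanoTime() - startNs) / 1e6));
    }
  }

  class PartitionDeltaUpdaterSketch {
    // Hypothetical constructor-time work: fetch partition metadata from HMS, then
    // mark the timeline so slow RPCs become visible in the catalog timeline.
    PartitionDeltaUpdaterSketch(Timeline timeline) {
      fetchPartitionsFromHms();
      timeline.markEvent("Fetched partition metadata from HMS");
    }
    private void fetchPartitionsFromHms() { /* slow HMS RPC in the real code */ }
  }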
Tests:
- Added e2e tests
Change-Id: Id231c2b15869aac2dae3258817954abf119da802
Reviewed-on: http://gerrit.cloudera.org:8080/22917
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
test_ranger.py is a custom cluster test consisting of 41 test methods.
Each test method requires a minicluster restart. With IMPALA-13503, we can
reorganize the TestRanger class into 3 separate test classes:
TestRangerIndependent, TestRangerLegacyCatalog, and
TestRangerLocalCatalog. Both TestRangerLegacyCatalog and
TestRangerLocalCatalog can keep the same minicluster without
restarting it in between.
Testing:
- Run and pass test_ranger.py in exhaustive mode.
- Confirmed that no test is missing after reorganization.
Change-Id: I01ff2b3e98fccfffa8bcdfe1177be98634363b56
Reviewed-on: http://gerrit.cloudera.org:8080/22905
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
test_rename_drop fails with an NPE after IMPALA-14042. This is because
CatalogServiceCatalog.renameTable() returns null when it cannot find the
database of oldTableName. This patch changes renameTable() to return
Pair.create(null, null) for that scenario.
Refactor test_rename_drop slightly to ensure that invalidating the
renamed table and dropping it are successful.
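A minimal sketch of the null-handling change, with a local Pair class standing in for
the one used by the catalog code (method and variable names are illustrative):
  // Returning a Pair whose elements are null, instead of a null Pair, lets callers
  // inspect result.first/result.second without hitting an NPE.
  class Pair<A, B> {
    final A first;
    final B second;
    Pair(A first, B second) { this.first = first; this.second = second; }
    static <A, B> Pair<A, B> create(A a, B b) { return new Pair<>(a, b); }
  }

  class RenameTableSketch {
    Pair<Object, Object> renameTable(Object oldDb, Object oldTable) {
      if (oldDb == null) {
        // Before: "return null;" here made callers that dereference the pair fail
        // with a NullPointerException. After: signal "not found" with null elements.
        return Pair.create(null, null);
      }
      return Pair.create(new Object(), new Object());
    }
  }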
Testing:
- Add checkNotNull precondition in
CatalogOpExecutor.alterTableOrViewRename()
- Increase the catalogd_table_rename_delay delay to 6s to ensure that the DROP
  query arrives at catalogd before renameTable() is called. Manually
  observed that no NPE is shown anymore.
Change-Id: I7a421a71cf3703290645e85180de8e9d5e86368a
Reviewed-on: http://gerrit.cloudera.org:8080/22899
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
cancel_query_and_validate_state is a helper method used to test query
cancellation with concurrent fetches. It still uses the beeswax client by
default.
This patch changes the test method to use the HS2 protocol by default. The
changes include the following:
1. Set TGetOperationStatusResp.operationState to
TOperationState::ERROR_STATE if returning abnormally.
2. Use separate MinimalHS2Client for
(execute_async, fetch, get_runtime_profile) vs cancel vs close.
Cancellation through KILL QUERY still instantiates a new
ImpylaHS2Connection client.
3. Implement the required missing methods in MinimalHS2Client.
4. Change the MinimalHS2Client logging pattern to match the other clients.
Testing:
Pass test_cancellation.py and TestResultSpoolingCancellation in core
exploration mode. Also fix default_test_protocol to HS2 for these tests.
Change-Id: I626a1a06eb3d5dc9737c7d4289720e1f52d2a984
Reviewed-on: http://gerrit.cloudera.org:8080/22853
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
The maxRowsInHeaps calculation may overflow because it uses simple
multiplication. This patch fixes the bug by calculating it with
checkedMultiply(). A broader refactoring will be done by IMPALA-14071.
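For illustration only (the variable names below are made up, and the real patch may
handle overflow differently), Guava's LongMath.checkedMultiply throws an
ArithmeticException instead of silently wrapping around:
  import com.google.common.math.LongMath;

  class MaxRowsInHeapsSketch {
    static long maxRowsInHeaps(long rowsPerHeap, long numHeaps) {
      // Before: rowsPerHeap * numHeaps could silently wrap to a negative value for
      // large inputs, breaking downstream comparisons against row counts.
      try {
        return LongMath.checkedMultiply(rowsPerHeap, numHeaps);
      } catch (ArithmeticException e) {
        // Saturate on overflow in this sketch; a safe upper bound either way.
        return Long.MAX_VALUE;
      }
    }
  }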
Testing:
Add an ee test, TestTopNHighNdv, that exercises the issue.
Change-Id: Ic6712b94f4704fd8016829b2538b1be22baaf2f7
Reviewed-on: http://gerrit.cloudera.org:8080/22896
Reviewed-by: Abhishek Rawat <arawat@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
TestConcurrentRename.test_rename_drop has been flaky because the
INVALIDATE query may arrive ahead of the ALTER TABLE RENAME query. This
patch deflakes it by replacing the sleep with an admission control wait and a
catalog version check. The first INVALIDATE query will only start after the
catalog version has increased since the CREATE TABLE query.
Testing:
Looped the test 50x and all runs passed.
Change-Id: I2539d5755aae6d375400b9a1289a658d0e7ba888
Reviewed-on: http://gerrit.cloudera.org:8080/22876
Reviewed-by: Yida Wu <wydbaggio000@gmail.com>
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
The 'transient_lastDdlTime' table property was not updated for Iceberg
tables before this change. Now it is updated after DDL operations,
including DROP PARTITION.
Renaming an Iceberg table is an exception:
Iceberg does not keep track of the table name in the metadata files,
so there is no Iceberg transaction to change it.
The table name is a concept that exists only in the catalog.
If we rename the table, we only edit our catalog entry, but the metadata
stored on the file system - the table's state - does not change.
Therefore renaming an Iceberg table does not change the
'transient_lastDdlTime' table property because rename is a
catalog-level operation for Iceberg tables, and not table-level.
Testing:
- added managed and non-managed Iceberg table DDL tests to
test_last_ddl_update.py
Change-Id: I7e5f63b50bd37c80faf482c4baf4221be857c54b
Reviewed-on: http://gerrit.cloudera.org:8080/22831
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds support for JS code analysis and linting to webUI
scripts using ESLint.
Support for enforcing code style and quality is particularly beneficial,
as the codebase for client-side scripts is consistently growing.
This has been implemented to work alongside other code style enforcement
rules present within 'critique-gerrit-review.py', which runs on the
existing jenkins job 'gerrit-auto-critic', to produce gerrit comments.
In the case of webUI scripts, ESLint's code analysis and linting checks
are performed to produce these comments.
As a shared NodeJS installation can be used for JS tests as well as
linting, a separate common script "bin/nodejs/setup_nodejs.sh"
has been added for assisting with the NodeJS installation.
To ensure quicker run times for the jenkins job, the NodeJS tarball is
cached within the "${HOME}/.cache" directory after the initial installation.
ESLint's packages and dependencies are cached using NPM's own package
management and are also cached locally.
NodeJS and ESLint dependencies are retrieved and executed only if the
patchset changes any ".js" files, keeping the overhead minimal.
After analysis, comments are generated for all the violations according
to the specified rules.
A custom formatter has been added to extract, format and filter the
violations in JSON form.
These generated code style violations are formatted into the required
JSON form according to gerrit's REST API, similar to comments generated
by flake8. These are then posted to gerrit as comments
on the respective patchset from jenkins over SSH.
The following code style and quality rules have been added using ESLint.
- Disallow unused variables
- Enforce strict equality (=== and !==)
- Require curly braces for all control statements (if, while, etc.)
- Enforce semicolons at the end of statements
- Enforce double quotes for strings
- Set maximum line length to 90
- Disallow `var`, use `let` or `const`
- Prefer `const` where possible
- Disallow multiple empty lines
- Enforce spacing around infix operators (eg. +, =)
- Disallow the use of undeclared variables
- Require parentheses around arrow function arguments
- Require a space before blocks
- Enforce consistent spacing inside braces
- Disallow shadowing variables declared in the outer scope
- Disallow constant conditions in if statements, loops, etc
- Disallow unnecessary parentheses in expressions
- Disallow duplicate arguments in function definitions
- Disallow duplicate keys in object literals
- Disallow unreachable code after return, throw, continue, etc
- Disallow reassigning function parameters
- Require functions to always consistently return or not return at all
- Enforce consistent use of dot notation wherever possible
- Enforce spacing around the colon in object literal properties
- Disallow optional chaining in contexts where undefined values are not allowed
The required linting packages have been added as dependencies in the
"www/scripts" directory.
All the test scripts and related dependencies have been moved to -
$IMPALA_HOME/tests/webui/js_tests.
All the custom ESLint formatter scripts and related dependencies
have been moved to -
$IMPALA_HOME/tests/webui/linting.
A combination of NodeJS's 'prefix' argument and the NODE_PATH environment
variable is used to separate the dependencies and the webUI scripts.
Running the tests from a remote directory (i.e. tests/webui) is supported
by modifying the required base paths.
The JS scripts need to be updated according to these linting rules,
as per IMPALA-13986.
Change-Id: Ieb3d0a9221738e2ac6fefd60087eaeee4366e33f
Reviewed-on: http://gerrit.cloudera.org:8080/21970
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch modifies test_change_parquet_column_type to let it pass
regardless of the test JDK version. The assertion is changed from a
string match to a regex.
Testing:
Run and pass the test with both JDK8 and JDK17.
Change-Id: I5bd3eebe7b1e52712033dda488f0c19882207f9d
Reviewed-on: http://gerrit.cloudera.org:8080/22874
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
With this patch Impala skips reloading Iceberg tables when the metadata
JSON file is the same, as this means that the table is essentially
unchanged.
This can help in situations when the event processor is lagging behind
and we have an Iceberg table that is updated frequently. Imagine the
case when Impala gets 100 events for an Iceberg table. In this case,
after processing the first event, our internal representation of
the Iceberg table is already up-to-date; there is no need to do the
reload 100 times.
We cannot use the internal icebergApiTable_'s metadata location,
as the following statement might silently refresh the metadata
in 'current()':
icebergApiTable_.operations().current().metadataFileLocation()
To guarantee that we check against the actually loaded metadata,
this patch introduces a new member to store the metadata location.
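A minimal sketch of the idea, with illustrative member and method names (the real
field in the patch may be named differently):
  class IcebergTableReloadSketch {
    // Location of the metadata JSON file the currently loaded state was built from.
    private String loadedMetadataLocation_;

    // If the metadata JSON file is unchanged, the table content is unchanged, so the
    // (potentially expensive) reload can be skipped.
    boolean needsReload(String newMetadataLocation) {
      return !newMetadataLocation.equals(loadedMetadataLocation_);
    }

    // Record the location corresponding to the loaded state. Asking the Iceberg API's
    // current() for it would not work here, as current() may silently refresh.
    void onReloadFinished(String metadataLocation) {
      loadedMetadataLocation_ = metadataLocation;
    }
  }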
Testing
* added e2e tests for REFRESH, also for event processing
Change-Id: I16727000cb11d1c0591875a6542d428564dce664
Reviewed-on: http://gerrit.cloudera.org:8080/22432
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Noemi Pap-Takacs <npaptakacs@cloudera.com>
ImpalaTestSuite.__restore_query_options() attempts to restore the client's
configuration with what it understands as the "default" query options.
Since IMPALA-13930, ImpalaConnection.get_default_configuration() parses
the default query options from TQueryOption fields. Therefore, it might
not respect the server's defaults that come from the --default_query_options
flag.
ImpalaTestSuite.__restore_query_options() should simply unset any
configuration that was previously set, by running a SET query like this:
SET query_option="";
This patch also changes execute_query_using_vector() to simply unset the
client's configuration.
Follow-up cleanup will be tracked through IMPALA-14060.
Testing:
Run and pass test_queries.py::TestQueries.
Change-Id: I884986b9ecbcabf0b34a7346220e6ea4142ca923
Reviewed-on: http://gerrit.cloudera.org:8080/22862
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
IMPALA-11604 (part 2) changes how many instances to create in
Scheduler::CreateInputCollocatedInstances. This works when the left
child fragment of a parent fragment is distributed across nodes.
However, if the left child fragment instance is limited to only 1
node (the case of an UNPARTITIONED fragment), the scheduler might
over-parallelize the parent fragment by scheduling too many instances on
a single node.
This patch attempts to mitigate the issue in two ways. First, it adds
bounding logic in PlanFragment.traverseEffectiveParallelism() to lower
parallelism further if the left (probe) side of the child fragment is
not well distributed across nodes.
Second, it adds TQueryExecRequest.max_parallelism_per_node to relay
information from Analyzer.getMaxParallelismPerNode() to the scheduler.
With this information, the scheduler can do additional sanity checks to
prevent Scheduler::CreateInputCollocatedInstances from
over-parallelizing a fragment. Note that this sanity check can also cap
MAX_FS_WRITERS option under a similar scenario.
Added a ScalingVerdict enum and TRACE-log it to show the scaling decision
steps.
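For illustration only (the names requestedInstances, maxParallelismPerNode, and
numNodes are assumptions, not the actual scheduler code), the sanity check amounts
to bounding the instance count by what the nodes are allowed to run:
  class ParallelismCapSketch {
    // Cap the parent fragment's instance count so a single node never ends up with
    // more instances than the per-node parallelism limit allows.
    static int capInstances(int requestedInstances, int maxParallelismPerNode,
        int numNodes) {
      int upperBound = maxParallelismPerNode * numNodes;
      return Math.min(requestedInstances, upperBound);
    }
  }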
Testing:
- Add a planner test and an e2e test that exercise the corner case under
  the COMPUTE_PROCESSING_COST=1 option.
- Manually comment the bounding logic in traverseEffectiveParallelism()
and confirm that the scheduler's sanity check still enforces the
bounding.
Change-Id: I65223b820c9fd6e4267d57297b1466d4e56829b3
Reviewed-on: http://gerrit.cloudera.org:8080/22840
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch attempts to stabilize TestFetch by using HS2 as the test protocol.
test_rows_sent_counters is modified to use the default hs2_client.
test_client_fetch_time_stats and test_client_fetch_time_stats_incomplete
are modified to use MinimalHS2Connection, which has a simpler fetch
mechanism (ImpylaHS2Connection always fetches 10240 rows at a
time).
Implemented the minimal functions needed to wait for the finished state and
pull the runtime profile in MinimalHS2Connection.
Testing:
Looped the test 50 times and all runs passed.
Change-Id: I52651df37a318357711d26d2414e025cce4185c3
Reviewed-on: http://gerrit.cloudera.org:8080/22847
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The HS2 NULL_TYPE should be implemented using TStringValue.
However, due to an incompatibility with the Hive JDBC driver implementation
at the time, Impala chose to implement the NULL type using TBoolValue (see
IMPALA-914, IMPALA-1370).
HIVE-4172 might be the root cause for that decision. Today, the Hive
JDBC driver (org.apache.hive.jdbc.HiveDriver) does not have that issue anymore,
as shown in this reproduction after applying this patch:
./bin/run-jdbc-client.sh -q "select null" -t NOSASL
Using JDBC Driver Name: org.apache.hive.jdbc.HiveDriver
Connecting to: jdbc:hive2://localhost:21050/;auth=noSasl
Executing: select null
----[START]----
NULL
----[END]----
Returned 1 row(s) in 0.343s
Thus, we can reimplement NULL_TYPE using TStringValue to match
HiveServer2 behavior.
Testing:
- Pass core tests.
Change-Id: I354110164b360013d9893f1eb4398c3418f80472
Reviewed-on: http://gerrit.cloudera.org:8080/22852
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds a startup flag so that by default the catalog server
will not consolidate the grant/revoke requests sent to the Ranger server
when there are multiple columns involved in the GRANT/REVOKE statement.
Testing:
- Added 2 end-to-end tests to make sure the grant/revoke requests
sent to the Ranger server would be consolidated only when the flag
is explicitly added when we start the catalog server.
Change-Id: I4defc59c048be1112380c3a7254ffa8655eee0af
Reviewed-on: http://gerrit.cloudera.org:8080/22833
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes an issue where EXEC_TIME_LIMIT_S was inaccurately
enforced by including the planning time in its countdown. The timer
for EXEC_TIME_LIMIT_S is now started only after the coordinator
reaches the "Ready to start on the backends" state, ensuring that
this time limit applies strictly to the execution phase.
This patch also adds a DebugAction PLAN_CREATE in the planning phase
for testing purposes.
Tests:
Passed core tests.
Adds an ee testcase query_test/test_exec_time_limit.py.
Change-Id: I825e867f1c9a39a9097d1c97ee8215281a009d7d
Reviewed-on: http://gerrit.cloudera.org:8080/22837
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
At present, the metastore event processor is single-threaded. Notification
events are processed sequentially, with a maximum limit of 1000 events
fetched and processed in a single batch. Multiple locks are used to
address the concurrency issues that may arise when catalog DDL
operation processing and metastore event processing try to
access/update the catalog objects concurrently. Waiting for a lock or
for file metadata loading of a table can slow the event processing and can
affect the processing of the events following it, even though those events
may not depend on the previous event. Altogether it can take a very
long time to synchronize all the HMS events.
The existing metastore event processing is turned into multi-level
event processing with the enable_hierarchical_event_processing flag. It
is not enabled by default. The idea is to segregate the events based on
their dependencies, maintain the order of events as they occur within
a dependency, and process them independently as much as possible.
The following 3 main classes represent the three-level threaded event
processing (an illustrative dispatch sketch follows this list).
1. EventExecutorService
   It provides the necessary methods to initialize, start, clear, and
   stop metastore event processing in hierarchical mode. It is
   instantiated from MetastoreEventsProcessor and its methods are
   invoked from MetastoreEventsProcessor. Upon receiving an event to
   process, EventExecutorService queues the event to the appropriate
   DbEventExecutor for processing.
2. DbEventExecutor
   An instance of this class has an execution thread and manages events
   of multiple databases with DbProcessors. A DbProcessor instance
   is maintained to store the context of each database within the
   DbEventExecutor. On each scheduled execution, input events on a
   DbProcessor are segregated to the appropriate TableProcessors for
   event processing, and the database events that are eligible for
   processing are processed.
   Once a DbEventExecutor is assigned to a database, a DbProcessor
   is created, and the subsequent events belonging to the database
   are queued to the same DbEventExecutor thread for further processing.
   Hence, linearizability is ensured when dealing with events within
   the database. Each instance of DbEventExecutor has a fixed list
   of TableEventExecutors.
3. TableEventExecutor
   An instance of this class has an execution thread and processes
   events of multiple tables with TableProcessors. A TableProcessor
   instance is maintained to store the context of each table within
   a TableEventExecutor. On each scheduled execution, events from
   TableProcessors are processed.
   Once a TableEventExecutor is assigned to a table, a TableProcessor
   is created, and the subsequent table events are processed by the same
   TableEventExecutor thread. Hence, linearizability is guaranteed
   in processing events of a particular table.
   - All the events of a table are processed in the same order they
     have occurred.
   - Events of different tables are processed in parallel when those
     tables are assigned to different TableEventExecutors.
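The following sketch is purely illustrative (it is not the patch's code; plain JDK
single-thread executors stand in for the executor classes): routing every event
through an executor chosen by its db name serializes events of the same db while
letting different dbs proceed in parallel, and the same keying idea applies one
level down for tables.
  import java.util.concurrent.ExecutorService;
  import java.util.concurrent.Executors;

  class HierarchicalDispatchSketch {
    private final ExecutorService[] dbExecutors;

    HierarchicalDispatchSketch(int numDbExecutors) {
      dbExecutors = new ExecutorService[numDbExecutors];
      for (int i = 0; i < numDbExecutors; i++) {
        // One single-threaded executor per slot preserves per-key event order.
        dbExecutors[i] = Executors.newSingleThreadExecutor();
      }
    }

    void dispatch(String dbName, Runnable processEvent) {
      // The same db always maps to the same executor, so its events stay ordered;
      // events of distinct dbs can be processed concurrently.
      int idx = Math.floorMod(dbName.hashCode(), dbExecutors.length);
      dbExecutors[idx].submit(processEvent);
    }
  }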
The following new events are added:
1. DbBarrierEvent
   This event wraps a database event. It is used to synchronize all
   the TableProcessors belonging to the database before processing the
   database event. It acts as a barrier to restrict the processing
   of table events that occurred after the database event until the
   database event is processed on the DbProcessor.
2. RenameTableBarrierEvent
This event wraps an alter table event for rename. It is used to
synchronize the source and target TableProcessors to
process the rename table event. It ensures the source
TableProcessor removes the table first and then allows the target
TableProcessor to create the renamed table.
3. PseudoCommitTxnEvent and PseudoAbortTxnEvent
   CommitTxnEvent and AbortTxnEvent can involve multiple tables in
   a transaction, and processing these events modifies multiple table
   objects. Pseudo events are introduced such that a pseudo event is
   created for each table involved in the transaction, and these
   pseudo events are processed independently at the respective
   TableProcessors.
The following new flags are introduced:
1. enable_hierarchical_event_processing
To enable the hierarchical event processing on catalogd.
2. num_db_event_executors
To set the number of database level event executors.
3. num_table_event_executors_per_db_event_executor
To set the number of table level event executors within a
database event executor.
4. min_event_processor_idle_ms
To set the minimum time to retain idle db processors and table
processors on the database event executors and table event
executors respectively, when they do not have events to process.
5. max_outstanding_events_on_executors
   To set the maximum number of outstanding events to process on the
   event executors.
Changed the hms_event_polling_interval_s type from int to double to support
millisecond-precision intervals.
TODOs:
1. We need to redefine the lag in the hierarchical processing mode.
2. Need a mechanism to capture the actual event processing time
   in hierarchical processing mode. Currently, with
   enable_hierarchical_event_processing set to true, lastSyncedEventId_ and
   lastSyncedEventTimeSecs_ are updated upon event dispatch to
   EventExecutorService for processing on the respective DbEventExecutor
   and/or TableEventExecutor. So lastSyncedEventId_ and
   lastSyncedEventTimeSecs_ don't actually mean the events are processed.
3. Hierarchical processing mode currently has a mechanism to show the
   total number of outstanding events on all the db and table executors
   at the moment. Observability needs to be enhanced further for this mode.
Filed a jira [IMPALA-13801] to fix them.
Testing:
- Executed existing end-to-end tests.
- Added fe and end-to-end tests with enable_hierarchical_event_processing.
- Added event processing performance tests.
- Have executed the existing tests with hierarchical processing
  mode enabled. lastSyncedEventId_ is now used in the new feature of
  sync_hms_events_wait_time_s (IMPALA-12152) as well. Some tests fail when
  hierarchical processing mode is enabled because lastSyncedEventId_ does
  not actually mean the event is processed in this mode. This needs to be
  fixed/verified with the above jira [IMPALA-13801].
Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6
Reviewed-on: http://gerrit.cloudera.org:8080/21031
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
test_hms_event_sync_basic is not a simple test; it actually tests
several kinds of statements in sequence.
This refactors it into smaller parallel tests so there are more
concurrent HMS events to be processed, making it easier to reveal bugs.
Renamed some tests to use shorter names.
Tests:
- Ran all parallel tests of TestEventSyncWaiting 32 times.
Change-Id: I8a2be548697f6259961b83dc91230306f38e03ad
Reviewed-on: http://gerrit.cloudera.org:8080/22829
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Currently, Impala always assumes that the data in the binary columns of
JSON tables is base64 encoded. However, before HIVE-21240, Hive wrote
binary data to JSON tables without base64 encoding it, instead writing
it as escaped strings. After HIVE-21240, Hive defaults to base64
encoding binary data when writing to JSON tables and introduces the
serde property 'json.binary.format' to indicate the encoding method of
binary data in JSON tables.
To maintain consistency with Hive and avoid correctness issues caused by
reading data in an incorrect manner, this patch also introduces the
serde property 'json.binary.format' to specify the reading method for
binary data in JSON tables. Currently, this property supports reading in
either base64 or rawstring formats, same as Hive.
Additionally, this patch introduces a query option 'json_binary_format'
to achieve the same effect. This query option will only take effect for
JSON tables where the serde property 'json.binary.format' is not set.
The reading format of binary columns in JSON tables can be configured
globally by setting the 'default_query_options'. It should be noted that
the default value of 'json_binary_format' is 'NONE', and Impala will
prohibit reading binary columns of JSON tables that either have
no 'json.binary.format' set while 'json_binary_format' is 'NONE', or
have an invalid 'json.binary.format' value set. An error message is
provided to avoid using an incorrect format without the user noticing.
Testing:
- Enabled existing binary type E2E tests for JSON tables
- Added new E2E test for 'json.binary.format'
Change-Id: Idf61fa3afc0f33caa63fbc05393e975733165e82
Reviewed-on: http://gerrit.cloudera.org:8080/22289
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
When a db/table is removed in the catalog cache, catalogd assigns it a
new catalog version and puts it into the deleteLog. This is used by the
catalog update thread to collect deletion updates. Once the catalog
update thread collects a range of updates, it triggers a GC in the
deleteLog to clear items older than the last sent catalog version. The
deletions will eventually be broadcast by the statestore to all the
coordinators.
However, waitForHmsEvent requests are also consumers of the deleteLog
and could be impacted by these GCs. waitForHmsEvent is a catalogd RPC
used by coordinators when a query wants to wait until the related
metadata is in sync with HMS. The response of waitForHmsEvent returns
the latest metadata, including the deletions on related dbs/tables.
If the related deletions in the deleteLog are GCed just before the
waitForHmsEvent request collects the results, they will be missing from
the response. The coordinator might keep using stale metadata of
non-existent dbs/tables.
This is a quick fix for the issue that postpones deleteLog GC for a
configurable number of topic updates, similar to what we have done on
the TopicUpdateLog. A thorough fix might need to carefully choose the
version to GC, or let impalad wait for the deletions from the statestore
to arrive.
A new flag, catalog_delete_log_ttl, is added for this. The deleteLog
items can survive for catalog_delete_log_ttl catalog updates. The
default is 60, so a deletion can survive for at least 120s. It should be
safe enough, i.e. the GCed deletions must have arrived on the impalad
side after 60 rounds of catalog updates; otherwise that is an abnormal
impalad that already has other more severe issues, e.g. lots of stale
tables due to metadata out of sync with catalogd.
Note that postponing deleteLog GCs might increase the memory
consumption. But since most of its memory is used by db/table/partition
names, the memory usage should still be trivial compared to other
metadata like file descriptors and incremental stats in live catalog
objects.
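A rough sketch of the postponed GC, under the assumption (names are illustrative)
that each GC candidate remembers the topic-update round in which it was sent and is
only dropped catalog_delete_log_ttl rounds later:
  import java.util.Map;
  import java.util.TreeMap;

  class DeleteLogGcSketch {
    private final int ttlRounds;  // e.g. --catalog_delete_log_ttl=60
    // last-sent catalog version -> topic-update round in which it was sent
    private final TreeMap<Long, Long> sentVersions = new TreeMap<>();
    private long currentRound = 0;

    DeleteLogGcSketch(int ttlRounds) { this.ttlRounds = ttlRounds; }

    void onTopicUpdateCollected(long lastSentVersion) {
      currentRound++;
      sentVersions.put(lastSentVersion, currentRound);
      // Instead of immediately GCing everything <= lastSentVersion, only GC versions
      // whose topic update went out at least ttlRounds rounds ago, so concurrent
      // waitForHmsEvent requests can still see recent deletions.
      long gcUpTo = -1;
      for (Map.Entry<Long, Long> e : sentVersions.entrySet()) {
        if (currentRound - e.getValue() >= ttlRounds) gcUpTo = e.getKey();
      }
      if (gcUpTo >= 0) {
        removeDeleteLogItemsNotGreaterThan(gcUpTo);
        sentVersions.headMap(gcUpTo, true).clear();
      }
    }

    private void removeDeleteLogItemsNotGreaterThan(long version) {
      // In the real code: drop deleteLog entries with catalog version <= 'version'.
    }
  }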
This patch also removed some unused imports.
Tests:
- Added e2e test with a debug action to reproduce the issue. Ran the
test 100 times. Without the fix, it consistently fails when runs for
2-3 times.
Change-Id: I2441440bca2b928205dd514047ba742a5e8bf05e
Reviewed-on: http://gerrit.cloudera.org:8080/22816
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
TestAllowIncompleteData.test_too_many_files depends on
tpch_parquet.lineitem having exactly 3 data files. This is not true in
erasure coding builds, in which tpch_parquet.lineitem has only 2 data
files.
This fixes the test to use dedicated tables created in the test.
Change-Id: I28cec8ec4bc59f066aa15a7243b7163639706cc7
Reviewed-on: http://gerrit.cloudera.org:8080/22824
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Handles the error "Table/view rename succeeded in the Hive Metastore,
but failed in Impala's Catalog Server" rather than failing the table
rename. This error happens when the catalog state catches up to the alter
event from our alter_table RPC to HMS before we call renameTable explicitly
in the catalog. The catalog can update independently due to a concurrent
'invalidate metadata' call.
In that case we use the oldTbl definition we already have - updated from
the delete log if possible - and fetch the new table definition with
invalidateTable to continue, automating most of the work that the error
message suggested users do via 'invalidate metadata <tbl>'.
Updated the test_concurrent_ddls test to remove handle_rename_failure
and ran the tests a dozen times. Adds concurrency tests with
simultaneous rename and invalidate metadata that previously would fail.
Change-Id: Ic2a276b6e5ceb35b7f3ce788cc47052387ae8980
Reviewed-on: http://gerrit.cloudera.org:8080/22807
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
In beeswax, all statements with the exception of USE print
'Fetched X row(s) in Ys', while in HS2 some statements (REFRESH,
INVALIDATE METADATA) do not print it. While these statements always
return 0 rows, the amount of time spent on the statement can be
useful.
This patch modifies impala-shell to let it print the elapsed time for
such a query, even if the query is not expected to return result metadata.
Added the --beeswax_compat_num_rows option in impala-shell. It defaults to
False. If this option is set (True), 'Fetched 0 row(s) in' will be
printed for all Impala protocols, just like beeswax. One exception for
this is the USE query, which will remain silent.
Testing:
- Added test_beeswax_compat_num_rows in test_shell_interactive.py.
- Pass test_shell_interactive.py.
Change-Id: Id76ede98c514f73ff1dfa123a0d951e80e7508b4
Reviewed-on: http://gerrit.cloudera.org:8080/22813
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
When calling planFiles() on an Iceberg table, it can give us some
metrics like total planning time, number of data/delete files and
manifests, how many of these could be skipped etc.
This change integrates these metrics into the query profile, under the
"Frontend" section. These metrics are per-table, so if multiple tables
are scanned for the query there will be multiple sections in the
profile.
Note that we only have these metrics for a table if Iceberg needs to be
used for planning for that table, e.g. if a predicate is pushed down to
Iceberg or if there is time travel. For tables where Iceberg was not
used in planning, the profile will contain a short note describing this.
To facilitate pairing the metrics with scans, the metrics header
references the plan node responsible for the scan. This will always be
the top level node for the scan, so it can be a SCAN node, a JOIN node
or a UNION node depending on whether the table has delete files.
Testing:
- added EE tests in iceberg-scan-metrics.tests
- added a test in PlannerTest.java that asserts on the number of
metrics; if it changes in a new Iceberg release, the test will fail
and we can update our reporting
Change-Id: I080ee8eafc459dad4d21356ac9042b72d0570219
Reviewed-on: http://gerrit.cloudera.org:8080/22501
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
When a db is missing in the catalog cache, the waitForHmsEvent request
currently just checks whether there are pending database events on it,
assuming that processing the CREATE_DATABASE event will add the db with
the table list. However, that's wrong since processing the CREATE_DATABASE
event just adds the db with an empty table list. We should still wait
for pending events on the underlying tables.
Tests:
- Added an e2e test which consistently fails when running concurrently with
  other tests in TestEventSyncWaiting without the fix.
Change-Id: I3fe74fcf0bf4dbac4a3584f6603279c0a2730b0c
Reviewed-on: http://gerrit.cloudera.org:8080/22817
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Quanlong Huang <huangquanlong@gmail.com>
Adds a test that multiple slow concurrent alters complete in parallel
rather than being executed sequentially.
The test creates tables, and ensures Impala's catalog is up-to-date for
their creation so that starting ALTER TABLE will be fast. Then starts
two ALTER TABLE RENAMES asynchronously - with debug_action ensuring each
takes at least 5 seconds - and waits for them to finish.
Verifies that concurrent alters are no longer blocked on "Got catalog
version read lock" and complete in less than 10 seconds.
Change-Id: I87d077aaa295943a16e6da60a2755dd289f3a132
Reviewed-on: http://gerrit.cloudera.org:8080/22804
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch aims to extract existing verifications on DEBUG_ACTION query
option format onto pre-planner stage SetQueryOption(), in order to
prevent failures on execution stage. Also, it localizes verification
code for two existing types of debug actions.
There are two types of debug actions, global e.g. 'RECVR_ADD_BATCH:FAIL'
and ExecNode debug actions, e.g. '0:GETNEXT:FAIL'. Two types are
implemented independently in source code, both having verification code
intertwined with execution. In addition, global debug actions subdivide
into C++ and Java, the two being more or less synchronized though.
In case of global debug actions, most of the code inside existing
DebugActionImpl() consists of verification, therefore it makes sense to
make a wrapper around it for separating out the execution code.
Things are worse for ExecNode debug actions, where verification code
consists of two parts, one in DebugOptions() constructor and another one
in ExecNode::ExecDebugActionImpl(). Additionally, some verification in the
constructor produces warnings, while ExecDebugActionImpl() verification
either fails on a DCHECK() or (in a single case) returns an error. For
this case, a reasonable solution seems to be simply calling the
constructor for a temporary object and extracting the verification code from
ExecNode::ExecDebugActionImpl(). This has the drawback of the
same warning being produced twice.
Finally, having extracted verification code for both types, logic in
impala::SetQueryOption() combines the two verification mechanisms.
Note: In the long run, it is better to write a single verification
routine for both Global and ExecNode debug actions, ideally as part of a
general unification of the two existing debug_action mechanisms. With
this in mind, the current patch intends to preserve current behavior,
while avoiding complex refactoring.
Change-Id: I53816aba2c79b556688d3b916883fee7476fdbb5
Reviewed-on: http://gerrit.cloudera.org:8080/22734
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds some profile counters for memory allocation and free in
MemPools, which are useful to detect tcmalloc contention.
The following counters are added:
- Thread level page faults: TotalThreadsMinorPageFaults,
TotalThreadsMajorPageFaults.
- MemPool counters for tuple_mem_pool and aux_mem_pool of the scratch
batch in columnar scanners:
- ScratchBatchMemAllocDuration
- ScratchBatchMemFreeDuration
- ScratchBatchMemAllocBytes
- MemPool counters for data_page_pool of ParquetColumnChunkReader
- ParquetDataPagePoolAllocBytes
- ParquetDataPagePoolAllocDuration
- ParquetDataPagePoolFreeBytes
- ParquetDataPagePoolFreeDuration
- MemPool counters for the fragment level RowBatch
- RowBatchMemPoolAllocDuration
- RowBatchMemPoolAllocBytes
- RowBatchMemPoolFreeDuration
- RowBatchMemPoolFreeBytes
- Duration in HdfsColumnarScanner::GetCollectionMemory() which includes
memory allocation for collection values and memcpy when doubling the
tuple buffer:
- MaterializeCollectionGetMemTime
Here is an example of a memory-bound query:
Fragment Instance
- RowBatchMemPoolAllocBytes: 0 (Number of samples: 0)
- RowBatchMemPoolAllocDuration: 0.000ns (Number of samples: 0)
- RowBatchMemPoolFreeBytes: (Avg: 719.25 KB (736517) ; Min: 4.00 KB (4096) ; Max: 4.12 MB (4321922) ; Sum: 1.93 GB (2069615013) ; Number of samples: 2810)
- RowBatchMemPoolFreeDuration: (Avg: 132.027us ; Min: 0.000ns ; Max: 21.999ms ; Sum: 370.997ms ; Number of samples: 2810)
- TotalStorageWaitTime: 47.999ms
- TotalThreadsInvoluntaryContextSwitches: 2 (2)
- TotalThreadsMajorPageFaults: 0 (0)
- TotalThreadsMinorPageFaults: 549.63K (549626)
- TotalThreadsTotalWallClockTime: 9s646ms
- TotalThreadsSysTime: 1s508ms
- TotalThreadsUserTime: 1s791ms
- TotalThreadsVoluntaryContextSwitches: 8.85K (8852)
- TotalTime: 9s648ms
...
HDFS_SCAN_NODE (id=0):
- ParquetDataPagePoolAllocBytes: (Avg: 2.36 MB (2477480) ; Min: 4.00 KB (4096) ; Max: 4.12 MB (4321922) ; Sum: 1.02 GB (1090091508) ; Number of samples: 440)
- ParquetDataPagePoolAllocDuration: (Avg: 1.263ms ; Min: 0.000ns ; Max: 39.999ms ; Sum: 555.995ms ; Number of samples: 440)
- ParquetDataPagePoolFreeBytes: (Avg: 1.28 MB (1344350) ; Min: 4.00 KB (4096) ; Max: 1.53 MB (1601012) ; Sum: 282.06 MB (295757000) ; Number of samples: 220)
- ParquetDataPagePoolFreeDuration: (Avg: 1.927ms ; Min: 0.000ns ; Max: 19.999ms ; Sum: 423.996ms ; Number of samples: 220)
- ScratchBatchMemAllocBytes: (Avg: 486.33 KB (498004) ; Min: 4.00 KB (4096) ; Max: 512.00 KB (524288) ; Sum: 1.19 GB (1274890240) ; Number of samples: 2560)
- ScratchBatchMemAllocDuration: (Avg: 1.936ms ; Min: 0.000ns ; Max: 35.999ms ; Sum: 4s956ms ; Number of samples: 2560)
- ScratchBatchMemFreeDuration: 0.000ns (Number of samples: 0)
- DecompressionTime: 1s396ms
- MaterializeCollectionGetMemTime: 4s899ms
- MaterializeTupleTime: 6s656ms
- ScannerIoWaitTime: 47.999ms
- TotalRawHdfsOpenFileTime: 0.000ns
- TotalRawHdfsReadTime: 360.997ms
- TotalTime: 9s254ms
The fragment instance took 9s648ms to finish, of which 370.997ms was spent
releasing memory of the final RowBatch. The majority of the time is
spent in the scan node (9s254ms). Mostly it's DecompressionTime +
MaterializeTupleTime + ScannerIoWaitTime + TotalRawHdfsReadTime. The
majority is MaterializeTupleTime (6s656ms).
ScratchBatchMemAllocDuration shows that invoking std::malloc() in
materializing the scratch batches took 4s956ms overall.
MaterializeCollectionGetMemTime shows that allocating memory for
collections and copying memory in doubling the tuple buffer took
4s899ms. So materializing the collections took most of the time.
Note that DecompressionTime (1s396ms) also includes memory allocation
duration tracked by the sum of ParquetDataPagePoolAllocDuration
(555.995ms). So memory allocation also takes a significant portion of
time here.
The other observation is TotalThreadsTotalWallClockTime is much higher
than TotalThreadsSysTime + TotalThreadsUserTime and there is a large
number of TotalThreadsVoluntaryContextSwitches. So the thread is waiting
for resources (e.g. lock) for a long duration. In the above case, it's
waiting for locks in tcmalloc memory allocation (need off-cpu flame
graph to reveal this).
Implementation of MemPool counters
Add MemPoolCounters in MemPool to track malloc/free duration and bytes.
Note that the counters are not updated in the destructor since it's expected
that all chunks are freed or transferred before calling the destructor.
MemPool is widely used in the code base. This patch only exposes MemPool
counters in three places:
- the scratch batch in columnar scanners
- the ParquetColumnChunkReader of parquet scanners
- the final RowBatch reset by FragmentInstanceState
This patch also moves GetCollectionMemory() from HdfsScanner to
HdfsColumnarScanner since it's only used by parquet and orc scanners.
PrettyPrint of SummaryStatsCounter is updated to also show the sum of
the values if they are not speeds or percentages.
Tests
- tested in manually reproducing the memory-bound queries
- ran perf-AB-test on tpch (sf=42) and didn't see significant
performance change
- added e2e tests
- updated expected files of observability/test_profile_tool.py because
  SummaryStatsCounter now prints the sum in most cases. Also updated
  get_bytes_summary_stats_counter and
  test_get_bytes_summary_stats_counter accordingly.
Change-Id: I982315d96e6de20a3616f3bd2a2b4866d1ff4710
Reviewed-on: http://gerrit.cloudera.org:8080/22062
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
TestConcurrentDdls has several exceptions it considers acceptable for
testing; it would accept the query failure and continue with other
cases. That was fine for existing queries, but if an ALTER RENAME fails
subsequent queries will also fail because the table does not have the
expected name.
With IMPALA-13631, there are three exception cases we need to handle:
1. "Table/view rename succeeded in the Hive Metastore, but failed in
Impala's Catalog Server" happens when the HMS alter_table RPC
succeeds but local catalog has changed. INVALIDATE METADATA on the
target table is sufficient to bring things in sync.
2. "CatalogException: Table ... was modified while operation was in
progress, aborting execution" can safely be retried.
3. "Couldn't retrieve the catalog topic update for the SYNC_DDL
operation" happens when SYNC_DDL=1 and the DDL runs on a stale table
object that's removed from the cache by a global INVALIDATE.
Adds --max_wait_time_for_sync_ddl_s=10 in catalogd_args for the last
exception to occur. Otherwise the query will just time out.
Tested by running test_concurrent_ddls.py 15 times. The 1st exception
previously would show up within 3-4 runs, while the 2nd exception
happens pretty much every run.
Change-Id: I04d071b62e4f306466a69ebd9e134a37d4327b77
Reviewed-on: http://gerrit.cloudera.org:8080/22802
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
stress_catalog_init_delay_ms does not exist in RELEASE builds, causing a
KeyError in impala_cluster.py. This patch fixes it by specifying a default
value when inspecting the ImpaladService.get_flag_current_values() return
value.
Testing:
Ran start-impala-cluster.py in a RELEASE build and it works.
Change-Id: Ia4400a7e711d21d23cc37878f18f2e0389b741b0
Reviewed-on: http://gerrit.cloudera.org:8080/22803
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Catalog versionLock is a lock used to synchronize reads/writes of
catalogVersion. It can be used to perform atomic bulk catalog operations
since catalogVersion cannot change externally while the lock is being
held. All other catalog operations will be blocked if the current thread
holds the lock. So it shouldn't be held for a long time, especially when
the current thread is invoking external RPCs for a table.
CatalogOpExecutor.alterTable() is one place that could hold the lock for
a long time. If the ALTER operation is a RENAME, it holds the lock until
alterTableOrViewRename() finishes. HMS RPCs are invoked in this method
to perform the operation, which might take an unpredictable amount of
time. The motivation for holding this lock is that RENAME is implemented
as a DROP + ADD in the catalog cache, so this operation can be atomic.
However, that doesn't mean we need the lock before operating on the cache
in CatalogServiceCatalog.renameTable(). We actually acquire the lock again
in this method, so there is no need to keep holding the lock while
invoking HMS RPCs.
This patch removes holding the lock in alterTableOrViewRename().
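A minimal sketch of the change in lock scope, using a JDK read-write lock as a
stand-in for the catalog's versionLock (the method names are illustrative, not the
actual catalog code):
  import java.util.concurrent.locks.ReentrantReadWriteLock;

  class RenameLockScopeSketch {
    private final ReentrantReadWriteLock versionLock = new ReentrantReadWriteLock();

    // Before: the lock was held across the potentially slow HMS RPC, blocking all
    // other catalog operations for an unpredictable amount of time.
    void renameHoldingLock() {
      versionLock.writeLock().lock();
      try {
        alterTableInHms();       // slow external RPC
        renameInCatalogCache();  // fast in-memory DROP + ADD
      } finally {
        versionLock.writeLock().unlock();
      }
    }

    // After: the HMS RPC runs without the lock; only the in-memory cache mutation
    // (which re-acquires the lock internally in the real code) is the atomic section.
    void renameWithoutHoldingLock() {
      alterTableInHms();
      versionLock.writeLock().lock();
      try {
        renameInCatalogCache();
      } finally {
        versionLock.writeLock().unlock();
      }
    }

    private void alterTableInHms() { /* HMS alter_table RPC in the real code */ }
    private void renameInCatalogCache() { /* remove old entry, add new entry */ }
  }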
Tests
- Added e2e test for concurrent rename operations.
- Also added some rename operations in test_concurrent_ddls.py
Change-Id: Ie5f443b1e167d96024b717ce70ca542d7930cb4b
Reviewed-on: http://gerrit.cloudera.org:8080/22789
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
getPartialCatalogObject is a catalogd RPC used by local catalog mode
coordinators to fetch metadata on-demand from catalogd.
For a table with a huge number (e.g. 6M) of files, catalogd might hit an
OOM by exceeding the JVM array limit when serializing the response of
a getPartialCatalogObject request for all partitions (thus all files).
This patch adds a new flag, catalog_partial_fetch_max_files, to define
the max number of file descriptors allowed in a response of
getPartialCatalogObject. Catalogd will truncate the response at the
partition level when it's too big, and only return a subset of the
requested partitions. The coordinator should send new requests to fetch the
remaining partitions. Note that it's possible that the table metadata
changes between the requests. The coordinator will detect the catalog
version change and throw an InconsistentMetadataFetchException for the
planner to replan the query. This is an existing mechanism for other
kinds of table metadata.
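A sketch of the truncation idea only (the generic types and names below are made up,
and the actual response building differs): stop adding partitions once the running
file-descriptor count would exceed the configured limit.
  import java.util.ArrayList;
  import java.util.List;

  class PartialFetchTruncationSketch {
    // Returns the prefix of the requested partitions whose total file count fits
    // under maxFiles (--catalog_partial_fetch_max_files); the coordinator is
    // expected to re-request the partitions that were left out.
    static <P> List<P> selectPartitions(List<P> requestedParts, List<Integer> numFiles,
        long maxFiles) {
      List<P> selected = new ArrayList<>();
      long totalFiles = 0;
      for (int i = 0; i < requestedParts.size(); i++) {
        totalFiles += numFiles.get(i);
        // Always include at least one partition so the coordinator makes progress,
        // then truncate at partition granularity once the limit is reached.
        if (!selected.isEmpty() && totalFiles > maxFiles) break;
        selected.add(requestedParts.get(i));
      }
      return selected;
    }
  }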
Here are some metrics on the number of files in a single response and
the corresponding byte array size and duration of serializing that response:
* 1000000: 371.71MB, 1s487ms
* 2000000: 744.51MB, 4s035ms
* 3000000: 1.09GB, 6s643ms
* 4000000: 1.46GB, duration not measured due to GC pauses
* 5000000: 1.82GB, duration not measured due to GC pauses
* 6000000: >2GB (hit OOM)
1000000 is chosen as the default value for now. We can tune it in the
future.
Tests:
- Added custom-cluster test
- Ran e2e tests in local-catalog mode with
catalog_partial_fetch_max_files=1000 so the new codes are used.
Change-Id: Ibb13fec20de5a17e7fc33613ca5cdebb9ac1a1e5
Reviewed-on: http://gerrit.cloudera.org:8080/22559
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds NaN, Infinity, and boolean parsing in ImpylaHS2ResultSet
to match the beeswax results. TestQueriesJsonTables is changed to test
all client protocols.
Testing:
Run and pass TestQueriesJsonTables.
Change-Id: I739a88e9dfa418d3a3c2d9d4181b4add34bc6b93
Reviewed-on: http://gerrit.cloudera.org:8080/22785
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
TestAdmissionController.test_user_loads_rules is flaky: the last query,
which should exceed the user quota, sometimes does not fail. The test
executes queries in a round-robin fashion across all impalads. These
impalads are expected to synchronize the user quota counts through
statestore updates.
This patch attempts to deflake the test by raising the heartbeat wait
time from 1 heartbeat period to 3 heartbeat periods. It also changes the
to-be-rejected query to a fast version of SLOW_QUERY (without the sleep)
so the test can fail fast if it is not rejected.
Testing:
Looped the test 50 times and all runs passed.
Change-Id: Ib2ae8e1c2edf174edbf0e351d3c2ed06a0539f08
Reviewed-on: http://gerrit.cloudera.org:8080/22787
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
ImpalaConnection.execute and ImpalaConnection.execute_async have a 'user'
parameter to set a specific user to run the query. This is mainly a legacy
of BeeswaxConnection, which allows using 1 client to run queries under
different usernames.
BeeswaxConnection and ImpylaHS2Connection actually allow specifying one
user per client. Doing so simplifies user-specific tests such as
test_ranger.py, which often instantiates separate clients for the admin user
and a regular user. There is no need to specify the 'user' parameter anymore
when calling execute() or execute_async(), reducing potential bugs
from forgetting to set it or setting it to an incorrect value.
This patch applies the one-user-per-client practice as much as possible for
test_ranger.py, test_authorization.py, and test_admission_controller.py.
Unused code and pytest fixtures are removed. A few flake8 issues are
addressed too. Their default_test_protocol() is overridden to return
'hs2'.
ImpylaHS2Connection.execute() and ImpylaHS2Connection.execute_async()
are slightly modified to assume ImpylaHS2Connection.__user if the 'user'
parameter is None. BeeswaxConnection remains unchanged.
Extend ImpylaHS2ResultSet.__convert_result_value() to lower-case boolean
return values to match the beeswax results.
Testing:
Run and pass all modified tests in exhaustive exploration.
Change-Id: I20990d773f3471c129040cefcdff1c6d89ce87eb
Reviewed-on: http://gerrit.cloudera.org:8080/22782
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
hs2_parquet_constraint and hs2_text_constraint are meant to extend the test
vector dimension to also test a non-default test protocol (other than
beeswax), but limit it to only run against the 'parquet/none' or 'text/none'
format respectively.
This patch modifies these constraints into
default_protocol_or_parquet_constraint and
default_protocol_or_text_constraint respectively, such that full file
format coverage happens for the default_test_protocol configuration and is
limited for the other protocols. Drop hs2_parquet_constraint entirely
from test_utf8_strings.py because that test is already constrained to a
single 'parquet/none' file format.
The num-modified-rows validation in date-fileformat-support.test and
date-partitioning.test is changed to check the NumModifiedRows counter
from the profile.
Fix TestQueriesJsonTables to always run with the beeswax protocol because
its assertions rely on beeswax-specific return values.
Run impala-isort and fix a few flake8 issues in the modified test files.
Testing:
Run and pass the affected test files using exhaustive exploration and
env var DEFAULT_TEST_PROTOCOL=hs2. Confirmed that full file format
coverage happens for the hs2 protocol. Note that
DEFAULT_TEST_PROTOCOL=beeswax is still the default.
Change-Id: I8be0a628842e29a8fcc036180654cd159f6a23c8
Reviewed-on: http://gerrit.cloudera.org:8080/22775
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
waitForHmsEvent is a catalogd RPC for coordinators to send the requested
db/table names to catalogd and wait until it's safe (i.e. no stale
metadata) to start analyzing the statement. The wait time is configured
by the query option sync_hms_events_wait_time_s. Currently, when this option
is enabled, catalogd waits until it syncs to the latest HMS event
regardless of what the query is.
This patch reduces waiting by only checking related events and waiting
until the last related event has been processed. In the ideal case, if
there are no related pending events, the query doesn't need to
wait.
Related pending events are determined as follows:
- For queries that need the db list, i.e. SHOW DATABASES, check pending
CREATE/ALTER/DROP_DATABASE events on all dbs. ALTER_DATABASE events
are checked in case the ownership changes and impacts visibility.
- For db statements like SHOW FUNCTIONS, CREATE/ALTER/DROP DATABASE,
check pending CREATE/ALTER/DROP events on that db.
- For db statements that require the table list, i.e. SHOW TABLES,
also check CREATE_TABLE, DROP_TABLE events under that db.
- For table statements,
- check all database events on related db names.
- If there are loaded transactional tables, check all the pending
COMMIT_TXN, ABORT_TXN events. Note that these events might modify
multiple transactional tables and we don't know their table names
until they are processed. To be safe, wait for all transactional
events.
- For all the other table names,
- if they are all missing/unloaded in the catalog, check all the
pending CREATE_TABLE, DROP_TABLE events on them for their
existence.
- Otherwise, some of them are loaded, check all the table events on
them. Note that we can fetch events on multiple tables under the
same db in a single fetch.
If the statement has a SELECT part, views will be expanded so underlying
tables will be checked as well. For performance, this feature assumes
that views won't be changed to tables, and vice versa. This is a rare
use case in regular jobs. Users should use INVALIDATE for such a case.
This patch leverages the HMS API to fetch events of several tables under
the same db in batch. MetastoreEventsProcessor.MetaDataFilter is
improved for this.
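An illustrative sketch of the core idea (the PendingEvent shape and method names are
assumptions, not the patch's classes): from the pending HMS events, keep only those
that touch the query's db/table names, and wait for the newest such event id rather
than for the global latest event id.
  import java.util.List;
  import java.util.Set;

  class RelatedEventWaitSketch {
    static class PendingEvent {
      long eventId;
      String dbName;
      String tableName;  // null for database-level events
    }

    // Returns the highest event id the query must wait for, or -1 if none of the
    // pending events are related to the given db/table names.
    static long lastRelatedEventId(List<PendingEvent> pendingEvents,
        Set<String> relatedDbs, Set<String> relatedTables) {
      long lastRelated = -1;
      for (PendingEvent e : pendingEvents) {
        boolean related = relatedDbs.contains(e.dbName)
            || (e.tableName != null
                && relatedTables.contains(e.dbName + "." + e.tableName));
        if (related) lastRelated = Math.max(lastRelated, e.eventId);
      }
      return lastRelated;
    }
  }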
Tests:
- Added test for multiple tables in a single query.
- Added test with views.
- Added test for transactional tables.
- Ran CORE tests.
Change-Id: Ic033b7e197cd19505653c3ff80c4857cc474bcfc
Reviewed-on: http://gerrit.cloudera.org:8080/22571
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
In HA mode, CatalogD initialization can fail to complete within a
reasonable time. Log messages showed that CatalogD is blocked trying to
acquire "CatalogServer.catalog_lock_" when calling
CatalogServer::UpdateActiveCatalogd() during statestore subscriber
registration. catalog_lock_ was held by GatherCatalogUpdatesThread, which
was calling GetCatalogDelta(), which waits for the Java lock versionLock_,
which was held by the thread doing CatalogServiceCatalog.reset().
This patch removes the catalog reset in the JniCatalog constructor. In turn,
catalogd-server.cc is now responsible for triggering the metadata
reset (Invalidate Metadata) only if:
1. It is the active CatalogD, and
2. The gathering thread has collected the first topic update, or CatalogD is
   set with a catalog_topic_mode other than "minimal".
The latter prerequisite ensures that coordinators are not
blocked waiting for a full topic update in on-demand metadata mode. This
is all managed by a new thread method, TriggerResetMetadata, that monitors
and triggers the initial metadata reset.
Note that this is a behavior change in on-demand catalog
mode (catalog_topic_mode=minimal). Previously, on-demand catalog mode
would send the full database list in its first catalog topic update. This
behavior change is OK since coordinators can request metadata on-demand.
After this patch, catalog-server.active-status and the /healthz page can
turn to true and OK respectively even if the very first metadata reset
is still ongoing. Observers that care about having fully populated
metadata should check other metrics such as catalog.num-db,
catalog.num-tables, or the /catalog page content.
Updated the start-impala-cluster.py readiness check to wait for at least 1
table to be seen by coordinators, except during create-load-data.sh
execution (there is no table yet) and when use_local_catalog=true (the local
catalog cache does not start with any table). Modified the startup flag
checking from reading the actual command line args to reading the
'/varz?json' page of the daemon. Cleaned up impala_service.py to fix some
flake8 issues.
Slightly updated TestLocalCatalogCompactUpdates::test_restart_catalogd so
that the unique_database cleanup is successful.
Testing:
- Refactor test_catalogd_ha.py to reduce repeated code, use the
  unique_database fixture, and additionally validate the /healthz page of
  both the active and standby catalogd. Changed it to test using the hs2
  protocol by default.
- Run and pass test_catalogd_ha.py and test_concurrent_ddls.py.
- Pass core tests.
Change-Id: I58cc66dcccedb306ff11893f2916ee5ee6a3efc1
Reviewed-on: http://gerrit.cloudera.org:8080/22634
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
The coordinator uses collectTableRefs() to collect the table names used by a
statement. For ResetMetadataStmt, used by the REFRESH and INVALIDATE METADATA
commands, it's intentional not to return the table name in
collectTableRefs() to avoid triggering unnecessary table metadata
loading. However, when this method is used for the HMS event sync
feature, we do want to know what the table is, so that catalogd can return
the latest metadata for it after waiting for the HMS events to be synced.
This bug leads to REFRESH/INVALIDATE not waiting for HMS ALTER ownership
events to be synced. REFRESH/INVALIDATE statements might unexpectedly
fail or succeed due to stale ownership info in coordinators.
To avoid changing the existing logic of collectTableRefs(), this patch
uses getTableName() directly for REFRESH statements since we know it's a
single-table statement. There are other kinds of such single-table
statements, like DROP TABLE. To be generic, this patch introduces a new
interface, SingleTableStmt, for all such statements that have a single
table name. If a statement is a SingleTableStmt, we use getTableName()
directly instead of collectTableRefs() in collectRequiredObjects().
This improves coordinator in collecting table names for single-table
statements. E.g. "DROP TABLE mydb.foo" previously has two candidate
table names - "mydb.foo" and "default.mydb" (assuming the session db is
"default"). Now it just collects "mydb.foo". Catalogd can return less
metadata in the response.
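The dispatch can be illustrated with a small Python sketch (the real code
is Java in the frontend; the names below only loosely mirror it):

  class SingleTableStmt:
      """Marker for statements that reference exactly one table (e.g. REFRESH, DROP TABLE)."""
      def get_table_name(self):
          raise NotImplementedError

  class RefreshStmt(SingleTableStmt):
      def __init__(self, table_name):
          self._table_name = table_name
      def get_table_name(self):
          return self._table_name

  def collect_table_refs(stmt, session_db):
      # Stand-in for the generic, broader table-ref collection logic.
      return []

  def collect_candidate_tables(stmt, session_db="default"):
      if isinstance(stmt, SingleTableStmt):
          # Exactly one candidate, e.g. just "mydb.foo" for "DROP TABLE mydb.foo".
          return [stmt.get_table_name()]
      # Otherwise fall back to the generic collection, which may also yield
      # extra candidates such as "default.mydb".
      return collect_table_refs(stmt, session_db)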
Tests:
- Added FE tests for collectRequiredObjects() where coordinators
collect db/table names.
- Added authorization tests on altering the ownership in Hive and
running queries in Impala.
Change-Id: I813007e9ec42392d0f6d3996331987c138cc4fb8
Reviewed-on: http://gerrit.cloudera.org:8080/22743
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
An equivalent of ImpalaBeeswaxResult.schema is not implemented in
ImpylaHS2ResultSet. However, the column_labels and column_types fields are
implemented in both.
This patch removes usage of ImpalaBeeswaxResult.schema and replaces it
with either the column_labels or column_types field. Tests that used to
access ImpalaBeeswaxResult.schema are migrated to use the HS2 protocol by
default. Also fixes flake8 issues in the modified test files.
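A hedged before/after of such a migrated assertion (the column name and
type below are purely illustrative):

  def check_first_column(result):
      # Before (Beeswax-only, no longer used):
      #     assert result.schema.fieldSchemas[0].name == "id"
      # After: works against result objects from either client, since both
      # expose column_labels and column_types.
      assert result.column_labels[0].lower() == "id"
      assert result.column_types[0].upper() == "INT"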
Testing:
Run and pass modified test files in exhaustive exploration.
Change-Id: I060fe2d3cded1470fd09b86675cb22442c19fbee
Reviewed-on: http://gerrit.cloudera.org:8080/22776
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This puts all of the thrift-generated python code into the
impala_thrift_gen package. This is similar to what Impyla
does for its thrift-generated python code, except that it
uses the impala_thrift_gen package rather than impala._thrift_gen.
This is a preparatory patch for fixing the absolute import
issues.
This patches all of the thrift files to add the Python namespace, and
includes code to apply the same patching to the third-party thrift
files (hive_metastore.thrift, fb303.thrift).
Putting all the generated Python into a package makes it easier to
understand where the imports are getting code from. When the subsequent
change rearranges the shell code, the thrift-generated code can stay in a
separate directory.
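The effect on imports, roughly (the submodule path below is an assumption
used only for illustration; the impala_thrift_gen package name is the real
one):

  # Old, top-level style (generated modules sat directly on the Python path):
  #     from TCLIService import TCLIService
  # New, package-qualified style:
  from impala_thrift_gen.TCLIService import TCLIService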
This uses isort to sort the imports for the affected Python files
with the provided .isort.cfg file. This also adds an impala-isort
shell script to make it easy to run.
Testing:
- Ran a core job
Change-Id: Ie2927f22c7257aa38a78084efe5bd76d566493c0
Reviewed-on: http://gerrit.cloudera.org:8080/20169
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
When enable_insert_events is set to true (the default), Impala fires HMS
INSERT events for each INSERT statement. Preparing the data of the
InsertEvents actually takes time since it fetches checksums of all the
new files. This patch adds a catalog timeline item to reveal this step.
Before this patch, the duration of "Got Metastore client" before "Fired
Metastore events" could be long:
Catalog Server Operation: 65.762ms
- Got catalog version read lock: 12.724us (12.724us)
- Got catalog version write lock and table write lock: 224.572us (211.848us)
- Got Metastore client: 418.346us (193.774us)
- Got Metastore client: 29.001ms (28.583ms) <---- Unexpected long
- Fired Metastore events: 52.665ms (23.663ms)
After this patch, the timeline shows that what actually takes the time is
"Prepared InsertEvent data":
Catalog Server Operation: 61.597ms
- Got catalog version read lock: 7.129us (7.129us)
- Got catalog version write lock and table write lock: 114.476us (107.347us)
- Got Metastore client: 200.040us (85.564us)
- Prepared InsertEvent data: 25.335ms (25.135ms)
- Got Metastore client: 25.342ms (7.009us)
- Fired Metastore events: 46.625ms (21.283ms)
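A rough sketch of how the new timeline item could be checked from a query
profile in an e2e test (the helper and attribute names are assumptions):

  def assert_insert_event_timeline_item(client, table_name):
      result = client.execute("insert into %s values (1)" % table_name)
      # The 'runtime_profile' attribute is an assumed accessor for this sketch.
      assert "Prepared InsertEvent data" in result.runtime_profile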
Tests:
- Added e2e test
Change-Id: Iaef1cae7e8ca1c350faae8666ab1369717736978
Reviewed-on: http://gerrit.cloudera.org:8080/22778
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
TestHmsIntegration.test_change_parquet_column_type fails in exhaustive
mode due to missing int parsing introduced by IMPALA-13920.
This patch adds the missing int parsing. It also fixes flake8 issues
in test_hms_integration.py, including an unused vector fixture.
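A hypothetical illustration of the kind of parsing that was missing (the
actual code and values in test_hms_integration.py differ from this
sketch):

  # Values read back through the test harness arrive as strings, so they
  # must be parsed before being compared to integers.
  raw_value = "1"
  assert int(raw_value) == 1   # explicit int parsing instead of "1" == 1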
Testing:
Run and pass test_hms_integration.py in exhaustive mode.
Change-Id: If5fb9f96b4087e86b0ebaac7135e14b7a14936ea
Reviewed-on: http://gerrit.cloudera.org:8080/22774
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch changes the way tests validate the number of inserted rows:
from checking the Beeswax-specific result to checking the NumModifiedRows
counter in the query profile.
Removes skipping over the HS2 protocol in test_chars.py and refactors
test_date_queries.py a bit to reduce test skipping. Added HS2_TYPES in
tests that require it and fixed some flake8 issues.
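Validating the row count from the profile can be sketched like this (the
counter name comes from the text above; the exact profile formatting and
the helper are assumptions):

  import re

  def get_num_modified_rows(runtime_profile):
      """Extracts the first NumModifiedRows value found in a runtime profile string."""
      match = re.search(r"NumModifiedRows:\s*(\d+)", runtime_profile)
      return int(match.group(1)) if match else None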
Testing:
Run and pass all affected tests.
Change-Id: I96eae9967298f75b2c9e4d0662fcd4a62bf5fffc
Reviewed-on: http://gerrit.cloudera.org:8080/22770
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
Before this patch, ImpylaHS2Connection unconditionally opened a
cursor (and HS2 session) as it connected, followed by running a "SET
ALL" query to populate the default query options.
This patch changes the behavior of ImpylaHS2Connection to open the
default cursor only when a query is first issued. This helps preserve
assertions for tests that are sensitive to client connections, like
IMPALA-13925. Default query options are now parsed from a newly
instantiated TQueryOptions object rather than issuing a "SET ALL" query or
making a BeeswaxService.get_default_configuration() RPC.
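A minimal sketch of the lazy-cursor pattern (not the actual
ImpylaHS2Connection code; impyla's DB-API connection.cursor() is the only
real call assumed here):

  class LazyCursorConnection:
      """Wraps an impyla DB-API connection and opens the cursor on first use."""
      def __init__(self, conn):
          self._conn = conn      # already-established impyla connection
          self._cursor = None    # HS2 session/cursor not opened yet

      def _get_cursor(self):
          if self._cursor is None:
              self._cursor = self._conn.cursor()  # opens the HS2 session lazily
          return self._cursor

      def execute(self, sql):
          cursor = self._get_cursor()
          cursor.execute(sql)
          return cursor.fetchall()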
Fix test_query_profile_contains_query_compilation_metadata_cached_event
slightly by setting the 'sync_ddl' option because the test is flaky
without it.
Tweak test_max_hs2_sessions_per_user to run queries so that sessions
will open.
Deduplicate test cases between utc-timestamp-functions.test and
local-timestamp-functions.test. Rename TestUtcTimestampFunctions to
TestTimestampFunctions, and expand it to also test
local-timestamp-functions.test and
file-formats-with-local-tz-conversion.test. The table_format is now
constrained to 'text/none' because it is unnecessary to permute other
table formats.
Deprecate the 'use_local_tz_for_unix_timestamp_conversions' flag in favor
of the query option with the same name. Filed IMPALA-13953 to update the
documentation of the 'use_local_tz_for_unix_timestamp_conversions'
flag/option.
Testing:
Run and pass a few pytests such as:
test_admission_controller.py
test_observability.py
test_runtime_filters.py
test_session_expiration.py
test_set.py
Change-Id: I9d5e3e5c11ad386b7202431201d1a4cff46cbff5
Reviewed-on: http://gerrit.cloudera.org:8080/22731
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>