When EventProcessor is paused, e.g. due to a global INVALIDATE METADATA
operation, alterTableOrViewRename() doesn't fetch the event id of
the ALTER_TABLE event. This causes the createEventId of the new table
to be -1 and the DeleteEventLog entry of the old table to be missing, so
stale ALTER_TABLE RENAME events could incorrectly remove the new table
or add the old table back.
The other case is the fallback invalidation added in IMPALA-13989
that handles a rename failing inside the catalog (but succeeding in HMS).
There the createEventId is also set to -1.
This patch fixes these by always setting a correct/meaningful
createEventId. When fetching the ALTER_TABLE event fails, we try to use
the event id from before the HMS operation. It could be slightly stale
but is much better than -1.
Modified CatalogServiceCatalog#isEventProcessingActive() to just check
if event processing is enabled and renamed it to
isEventProcessingEnabled(). Note that this method is only used in DDLs
that check their self-events. We should allow these checks even when
EventProcessor is not in the ACTIVE state, so that when EventProcessor
recovers, fields like createEventId in tables are still correct.
Removed the code tracking in-flight events at the end of rename since
the new table is in an unloaded state and only the createEventId is useful.
The catalog version used there is also incorrect since it's not used in
CatalogServiceCatalog#renameTable(), so it doesn't make sense to use it.
Removed the InProgressTableModification parameter of
alterTableOrViewRename() since it's not used anymore.
This patch also fixes a bug in getRenamedTableFromEvents() where it
always returned the first event id in the list instead of the id of the
rename event it finds.
Tests
- Added an e2e test and ran it 40 times.
Change-Id: Ie7c305e5aaafc8bbdb85830978182394619fad08
Reviewed-on: http://gerrit.cloudera.org:8080/23291
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The Impala documentation lists true as the default value for the
RETRY_FAILED_QUERIES query option. However, the actual default value
is false.
Fixes the documentation to reflect the correct default value.
Change-Id: I88522f7195262fad9365feb18e703546c7b651be
Reviewed-on: http://gerrit.cloudera.org:8080/23288
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Construction of the impala-virtualenv fails since PyPI released version
7.0.0 of pbr. This blocks all precommit runs, since the Impala
virtualenv is required for all end-to-end tests.
The failure happens during pywebhdfs==0.3.2 installation. It is expected
to pull the pinned version pbr==3.1.1, but the latest pbr==7.0.0 was
pulled instead. pbr==7.0.0 then broke with this error message:
ModuleNotFoundError: No module named 'packaging.requirements'
This patch adds a workaround in bootstrap_virtualenv.py to install
packaging==24.1 early for python3. Installing it early unblocks
`make -j impala_python3`. The packaging==24.1 package is already
listed in infra/python/deps/gcovr-requirements.txt, which is installed in
a later step and only in the python3 virtualenv.
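The workaround boils down to an early pip install, roughly like the
sketch below (the helper name is hypothetical; the real logic is in
bootstrap_virtualenv.py):
  import subprocess

  def preinstall_packaging(venv_python):
      # Install packaging before pywebhdfs/pbr so that pbr's setup hooks
      # can import packaging.requirements.
      subprocess.check_call(
          [venv_python, "-m", "pip", "install", "packaging==24.1"])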
Testing:
Passed shell/ tests on Ubuntu 22.04 and Rocky 9.2.
Change-Id: I0167fb5e1e0637cdde64d0d3beaf6b154afc06b1
Reviewed-on: http://gerrit.cloudera.org:8080/23292
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Jason Fehr <jfehr@cloudera.com>
PlanNode's list of runtime filters includes both runtime filters
consumed and produced. The code for incorporating runtime filters
into the tuple cache key doesn't make a distinction between the
two. This means that JoinNodes that produce runtime filters hash
their children more than once. This only applies to mt_dop=0,
because mt_dop>0 produces the runtime filter from a separate build
side fragment. This hasn't produced a noticeable issue, but it is
still wrong. This change ignores produced runtime filters when building
the tuple cache key.
Testing:
- Added a test case in TupleCacheTest
Change-Id: I5d132a5cf7de1ce19b55545171799d8f38bb8c3d
Reviewed-on: http://gerrit.cloudera.org:8080/23227
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
EventCounter has been removed in HADOOP-17254, so the log4j configuration
should also be updated to avoid errors.
With this patch, an HDFS cluster can be started up with no errors after
running `./bin/create-test-configurations.sh`.
Change-Id: Id092ed7c9d1e3929daf36d05e0305d1d27de8207
Reviewed-on: http://gerrit.cloudera.org:8080/23287
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
LZ4 has a high compression mode that gets higher compression ratios
(at the cost of higher compression time) while maintaining the fast
decompression speed. This type of compression would be useful for
workloads that write data once and read it many times.
This adds support for specifying a compression level for the
LZ4 codec. Compression level 1 is the current fast API. Compression
levels between LZ4HC_CLEVEL_MIN (3) and LZ4HC_CLEVEL_MAX (12) use
the high compression API. This lines up with the behavior of the lz4
command line.
TPC-H scale factor 42 comparison:
Compression codec | Avg Time (s) | Geomean Time (s) | Lineitem Size (GB) | Compression time for lineitem (s)
------------------+--------------+------------------+--------------------+------------------------------
Snappy | 2.75 | 2.08 | 8.76 | 7.436
LZ4 level 1 | 2.58 | 1.91 | 9.1 | 6.864
LZ4 level 3 | 2.58 | 1.93 | 7.9 | 43.918
LZ4 level 9 | 2.68 | 1.98 | 7.6 | 125.0
Zstd level 3 | 3.03 | 2.31 | 6.36 | 17.274
Zstd level 6 | 3.10 | 2.38 | 6.33 | 44.955
LZ4 level 3 is about 10% smaller in data size while being about as fast as
regular LZ4. It compresses at about the same speed as Zstd level 6.
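The level-1 vs high-compression tradeoff can be reproduced outside Impala
with the python lz4 bindings (illustration only; Impala's codec is
implemented in C++ against the LZ4/LZ4HC C API):
  import lz4.frame

  data = b"1998-09-02|DELIVER IN PERSON|TRUCK|regular courts above|" * 100000
  fast = lz4.frame.compress(data, compression_level=1)  # fast API
  hc = lz4.frame.compress(data, compression_level=9)    # high compression API
  assert lz4.frame.decompress(hc) == data               # decompression unchanged
  print(len(fast), len(hc))                             # hc output is smaller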
Testing:
- Ran perf-AB-test with lz4 high compression levels
- Added test cases to decompress-test
Change-Id: Ie7470ce38b8710c870cacebc80bc02cf5d022791
Reviewed-on: http://gerrit.cloudera.org:8080/23254
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This change updates the way column names are
projected in the SQL query generated for JDBC
external tables. Instead of relying on optional
mapping or default behavior, all column names are now
explicitly quoted using appropriate quote characters.
Column names are now wrapped with quote characters
based on the JDBC driver being used:
1. Backticks (`) for Hive, Impala and MySQL
2. Double quotes (") for all other databases
This helps support case-sensitive or
reserved column names.
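A minimal sketch of the quoting rule (the function name and driver
strings are illustrative; the actual logic lives in the JDBC data source
code):
  def quote_columns(columns, driver):
      # Backticks for Hive, Impala and MySQL; double quotes for the rest.
      quote = '`' if driver.lower() in ('hive', 'impala', 'mysql') else '"'
      return [quote + col + quote for col in columns]

  # quote_columns(['select', 'Mixed Case'], 'postgresql')
  #   -> ['"select"', '"Mixed Case"']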
Change-Id: I5da5bc7ea5df8f094b7e2877a0ebf35662f93805
Reviewed-on: http://gerrit.cloudera.org:8080/23066
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
This patch modifies the creation of Iceberg tables in 5 test files.
Previously these tables were created outside of /test-warehouse, which
could lead to issues because we only clear the /test-warehouse
directory in bin/jenkins/release_cloud_resources.sh. This means
subsequent executions might see data from earlier runs.
Change-Id: I97ce512db052b6e7499187079a184c1525692592
Reviewed-on: http://gerrit.cloudera.org:8080/23188
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Adds representation of Impala select queries using OpenTelemetry
traces.
Each Impala query is represented as its own individual OpenTelemetry
trace. The one exception is retried queries which will have an
individual trace for each attempt. These traces consist of a root span
and several child spans. Each child span has the root as its parent.
No child span has another child span as its parent. Each child span
represents one high-level query lifecycle stage. Each child span also
has span attributes that further describe the state of the query.
Child spans:
1. Init
2. Submitted
3. Planning
4. Admission Control
5. Query Execution
6. Close
Each child span contains a mix of universal attributes (available on
all spans) and query phase specific attributes. For example, the
"ErrorMsg" attribute, present on all child spans, is the error
message (if any) at the end of that particular query phase. One
example of a child span specific attribute is "QueryType" on the
Planning span. Since query type is first determined during query
planning, the "QueryType" attribute is present on the Planning span
and has a value of "QUERY" (since only selects are supported).
Since queries can run for lengthy periods of time, the Init span
communicates the beginning of a query along with global query
attributes. For example, span attributes include query id, session
id, sql, user, etc.
Once the query has closed, the root span is closed.
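The span layout described above roughly maps to the following
OpenTelemetry pattern, shown with the Python SDK purely as an
illustration (Impala's implementation lives in the backend; attribute
values are placeholders):
  from opentelemetry import trace
  from opentelemetry.sdk.trace import TracerProvider
  from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

  provider = TracerProvider()
  provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
  trace.set_tracer_provider(provider)
  tracer = trace.get_tracer("impala.query")

  # One trace per query: a root span plus a flat list of child spans,
  # one per lifecycle phase. No child span is the parent of another.
  with tracer.start_as_current_span("Query") as root:
      root.set_attribute("QueryId", "example-query-id")  # placeholder
      for phase in ("Init", "Submitted", "Planning", "Admission Control",
                    "Query Execution", "Close"):
          with tracer.start_as_current_span(phase) as span:
              span.set_attribute("ErrorMsg", "")  # universal attribute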
Testing accomplished with new custom cluster tests.
Generated-by: Github Copilot (GPT-4.1, Claude Sonnet 3.7)
Change-Id: Ie40b5cd33274df13f3005bf7a704299ebfff8a5b
Reviewed-on: http://gerrit.cloudera.org:8080/22924
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The existing code incorrectly attempts to drop the corresponding
Kudu table when the creation of a Kudu external table in HMS fails, due
to an erroneous negation in the if condition (fortunately, there are
additional Preconditions checks in KuduCatalogOpExecutor.dropTable
causing such attempts to always fail). Additionally, when creating a
Kudu synchronized table, if the table creation fails in HMS, it will
unexpectedly skip deleting the corresponding Kudu table, resulting in an
"already exists in Kudu" error when retrying the table creation.
Removed the incorrect negation in the if condition to align with the
intended behavior described in the comment.
Testing:
- Existing tests cover this change.
Change-Id: I67d1cb333526fa41f247757997a6f7cf60d26c0b
Reviewed-on: http://gerrit.cloudera.org:8080/23181
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Before this patch, USE_APACHE_COMPONENTS overwrote all USE_APACHE_*
variables, but we should support using specific Apache components.
After this patch, if USE_APACHE_COMPONENTS is not false, the USE_APACHE_
{HADOOP,HBASE,HIVE,TEZ,RANGER} variables will be set to true. Otherwise,
the individual values of USE_APACHE_{HADOOP,HBASE,HIVE,TEZ,RANGER} are used.
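Expressed as a small Python sketch of the precedence rule (the real
logic lives in the bash environment scripts; this is only pseudocode):
  COMPONENTS = ["HADOOP", "HBASE", "HIVE", "TEZ", "RANGER"]

  def resolve_apache_flags(use_apache_components, overrides):
      if use_apache_components:
          return {c: True for c in COMPONENTS}
      # Otherwise honor the individually set USE_APACHE_* values.
      return {c: overrides.get(c, False) for c in COMPONENTS}

  # resolve_apache_flags(False, {"HIVE": True}) -> only HIVE is True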
Test:
- Built and ran a test cluster with USE_APACHE_HIVE=true
and USE_APACHE_COMPONENTS=false.
Change-Id: I33791465a3b238b56f82d749e3dbad8215f3b3bc
Reviewed-on: http://gerrit.cloudera.org:8080/23211
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
IMPALA-13947 has an incorrect fixture edit that causes the following error:
common/custom_cluster_test_suite.py:396: in setup_method
pytest.fail("Cannot specify with_args on both class and methods")
E Failed: Cannot specify with_args on both class and methods
This patch moves the with_args fixture of test_catalog_restart up to
the class level.
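The fix has roughly this shape (the catalogd flag below is only an
illustrative argument; the point is that with_args now lives solely on
the class):
  from tests.common.custom_cluster_test_suite import CustomClusterTestSuite

  @CustomClusterTestSuite.with_args(
      catalogd_args="--hms_event_polling_interval_s=1")  # illustrative
  class TestMetadataReplicas(CustomClusterTestSuite):

      def test_catalog_restart(self):
          # No method-level with_args here, so setup_method no longer fails.
          pass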
Testing:
Ran and passed TestMetadataReplicas in exhaustive mode.
Change-Id: I9016eac859fb01326b3d1e0a8e8e135f03d696bb
Reviewed-on: http://gerrit.cloudera.org:8080/23280
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Reviewed-by: Xuebin Su <xsu@cloudera.com>
Tested-by: Quanlong Huang <huangquanlong@gmail.com>
On the first cut of creating the Calcite planner, the Calcite planner
was standalone and ran its own JniFrontend.
In the current version, the parsing, validating, and single node
planning are called from the Impala framework.
There is some code in the first cut regarding the
"ImpalaTypeCoercionFactory" class which handles deriving the correct
data type for various expressions, for instance (found in exprs.test):
select count(*) from alltypesagg where
10.1 in (tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col)
Without this patch, the query returns the following error:
UDF ERROR: Decimal expression overflowed
This code can be found in CalciteValidator.java, but was accidentally omitted
from CalciteAnalysisDriver.
Change-Id: I74c4c714504400591d1ec6313f040191613c25d9
Reviewed-on: http://gerrit.cloudera.org:8080/23039
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Steve Carlin <scarlin@cloudera.com>
This commit enables the Calcite planner join optimization rule to make use
of table and column statistics in Impala.
The ImpalaRelMetadataProvider class provides the metadata classes to the
rule optimizer.
All the ImpalaRelMd* classes are extensions of Calcite Metadata classes. The
ones overridden are:
ImpalaRelMdRowCount:
This provides the cardinality of a given type of RelNode.
The default implementation in the RelMdRowCount is used for some of the
RelNodes. The ones overridden are:
TableScan: Gets the row count from the Table object.
Filter: Calls the FilterSelectivityEstimator and adjusts the number of
rows based on the selectivity of the filter condition.
Join: Uses our own algorithm to determine the number of rows that will
be created by the join condition using the JoinRelationInfo (more on this
below).
ImpalaRelMdDistinctRowCount:
This provides the number of distinct rows returned by the RelNode.
The default implementation in the RelMdDistinctRowCount is used for
some of the RelNodes. The ones overridden are:
TableScan: Uses the stats. If stats are not defined, all rows will
be marked as distinct.
Aggregate: For some reason, Calcite sometimes returns a number of
distinct rows greater than the number of rows, which doesn't make
sense. So this ensures the number of distinct rows never exceeds
the number of rows.
Filter: The number of distinct rows is reduced by the calculated
selectivity.
Join: same as aggregate.
ImpalaRelMdRowSize:
Provides the Impala interpreted size of the Calcite datatypes.
ImpalaRelMdSelectivity:
The selectivity is calculated within the RowCount. An initial attempt
was made to use this class for selectivity, but it seemed rather clunky
since the row counts and selectivity are very closely intertwined and
the pruned row counts (a future commit) made this even more complicated.
So the selectivity metadata is overridden for all our RelNodes as full
selectivity (1.0).
As mentioned above, the FilterSelectivityEstimator class tries to approximate
the number of rows filtered out by the given condition. Some work still
needs to be done to make this more in line with the Expr selectivities; a Jira
will be filed for this.
The JoinRelationInfo is the helper class that estimates the number of rows
that will be output by the Join RelNode. The join condition is split up into
multiple conditions broken up by the AND keyword. This first pass has some major
flaws which need to be corrected, including:
- Only equality conditions limit the number of rows. Non-equality conditions
will be ignored. If there are only non-equality conditions, the cardinality
will be the equivalent of a cross join.
- Left joins take the maximum of the calculated join and the total number
of rows on the left side. This can probably be improved upon if we find
the matching rows provide a cardinality that is greater than one for each
row. (Of course, right joins and outer joins have this same logic).
Change-Id: I9d5bb50eb562c28e4b7c7a6529d140f98e77295c
Reviewed-on: http://gerrit.cloudera.org:8080/23122
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Steve Carlin <scarlin@cloudera.com>
In table level REFRESH, we check whether the partition is actually
changed and skip updating unchanged partitions in catalog. However, in
partition REFRESH, we always drop and add the partition. This leads to
unnecessarily dropping the partition metadata and column statistics and
adding them back again. This patch adds a check to verify whether the
partition really changed before reloading it, to avoid the
unnecessary drop-add sequence.
Change-Id: I72d5d20fa2532d49313d5e88f2d66f98b9537b2e
Reviewed-on: http://gerrit.cloudera.org:8080/22962
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Normally, AdmissionState entries in admissiond are cleaned up when
a query is released. However, for requests that are rejected,
query release is never called, and their AdmissionState was not
removed from admission_state_map_, resulting in a memory leak over
time.
This leak was less noticeable because AdmissionState entries were
relatively small. However, when admissiond is run as a standalone
process, each AdmissionState includes a profile sidecar, which
can be large, making the leak much more significant.
This change adds logic to remove AdmissionState entries when the
admission request is rejected.
Testing:
Added test_admission_state_map_mem_leak as a regression test.
Change-Id: I9fba4f176c648ed7811225f7f94c91342a724d10
Reviewed-on: http://gerrit.cloudera.org:8080/23257
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Abhishek Rawat <arawat@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Local catalog mode has been the default and has worked well in downstream
Impala for over 5 years. This patch turns on local catalog mode by
default (--catalog_topic_mode=minimal and --use_local_catalog=true) as
the preferred mode going forward.
Implemented LocalCatalog.setIsReady() to facilitate using local catalog
mode for FE tests. Some FE tests fail due to behavior differences in
local catalog mode like IMPALA-7539. This is probably OK since Impala
now largely hands over FileSystem permission checks to Apache Ranger.
The following custom cluster tests are pinned to evaluate under legacy
catalog mode because their behavior changed in local catalog mode:
TestCalcitePlanner.test_calcite_frontend
TestCoordinators.test_executor_only_lib_cache
TestMetadataReplicas
TestTupleCacheCluster
TestWorkloadManagementSQLDetailsCalcite.test_tpcds_8_decimal
In TestHBaseHmsColumnOrder.test_hbase_hms_column_order, set the
--use_hms_column_order_for_hbase_tables=true flag for both impalad and
catalogd to get a consistent column order in either local or legacy
catalog mode.
Changed TestCatalogRpcErrors.test_register_subscriber_rpc_error
assertions to be more fine grained by matching individual query id.
Moved most of the test methods from TestRangerLegacyCatalog to
TestRangerLocalCatalog, except for some that do need to run in legacy
catalog mode. Also renamed TestRangerLocalCatalog to
TestRangerDefaultCatalog. Table ownership issue in local catalog mode
remains unresolved (see IMPALA-8937).
Testing:
Pass exhaustive tests.
Change-Id: Ie303e294972d12b98f8354bf6bbc6d0cb920060f
Reviewed-on: http://gerrit.cloudera.org:8080/23080
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This commit enhances the distributed planner's costing model for
broadcast joins by introducing the `broadcast_cost_scale_factor` query
option. This option enables users to fine-tune the planner's decision
between broadcast and partitioned joins.
Key changes:
- The total broadcast cost is scaled by the new
`broadcast_cost_scale_factor` query option, allowing users to favor or
penalize broadcast joins as needed when setting query hint is not
feasible.
- Updated the planner logic and test cases to reflect the new costing
model and options.
This addresses scenarios where the default costing could lead to
suboptimal join distribution choices, particularly in a large-scale
cluster where the number of executors can increase broadcast cost, while
choosing a partitioned strategy can lead to data skew. Admins can set
`broadcast_cost_scale_factor` to less than 1.0 to make DistributedPlanner
favor broadcast joins over partitioned joins (with the possible downside
of higher memory usage per query and higher network transmission).
Existing query hints still take precedence over this option. Note that
this option is applied independently of the `broadcast_to_partition_factor`
option (see IMPALA-10287). In an MT_DOP>1 setup, it should be sufficient to
set `use_dop_for_costing=True` and tune `broadcast_to_partition_factor`
only.
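A hedged usage sketch via impyla (host, tables and the 0.5 value are
illustrative; only the option name comes from this patch):
  from impala.dbapi import connect

  conn = connect(host="coordinator-host", port=21050)
  cur = conn.cursor()
  # Values below 1.0 make broadcast joins look cheaper to the planner.
  cur.execute("SET BROADCAST_COST_SCALE_FACTOR=0.5")
  cur.execute("EXPLAIN SELECT count(*) FROM big_fact f "
              "JOIN small_dim d ON f.k = d.k")
  for (line,) in cur.fetchall():
      print(line)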
Testing:
Added FE tests.
Change-Id: I475f8a26b2171e87952b69f66a5c18f77c2b3133
Reviewed-on: http://gerrit.cloudera.org:8080/23258
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
In the Webserver, while assigning or closing the compressed buffer's
memory tracker, no lock was held across threads, causing
TSAN build failures.
The critical section for this memory tracker is only needed during
startup of the Webserver and is rarely used. So, only a regular mutex
has been used instead of a shared mutex with concurrent reads.
Change-Id: Ife9198e911e526a9a0e88bdb175b4502a5bc2662
Reviewed-on: http://gerrit.cloudera.org:8080/23250
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Some tests for catalogd HA failover have a lightweight verifier function
that finishes quickly before coordinator notices catalogd HA failover,
e.g. when the verifier function runs a statement that doesn't trigger
catalogd RPCs.
If the test finishes in such a state, the coordinator will use the stale
active catalogd address in cleanup, i.e. when dropping unique_database, and
fail quickly since that catalogd is now passive. Retrying the statement
immediately usually won't help since the coordinator hasn't updated the
active catalogd address yet.
Note that we also retry the verifier function immediately when it fails
due to the coordinator talking to the stale catalogd address. That works
since the previous active catalogd is not running, so the catalogd RPCs
fail and get retried. The retry interval is 3s (configured by
catalog_client_rpc_retry_interval_ms) and we retry it for at least 2
times (customized by catalog_client_connection_num_retries in the
tests). The duration is usually enough for coordinator to update the
active catalogd address. But depending on this duration is a bit tricky.
This patch adds a wait before the verifier function to make sure
the coordinator has updated the active catalogd address. This also makes
sure the cleanup of unique_database won't fail due to a stale active catalogd
address.
Tests:
- Ran test_catalogd_ha.py
Change-Id: I45e4a20170fdcce8282f1762f81a290689777aed
Reviewed-on: http://gerrit.cloudera.org:8080/23252
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Quanlong Huang <huangquanlong@gmail.com>
The admission service uses the statestore as the only source of
truth to determine whether a coordinator is down. If the statestore
reports a coordinator is down, all running and queued queries
associated with it should be cancelled or rejected.
In IMPALA-12057, we introduced logic to reject queued queries if
the corresponding coordinator has been removed, along with tests
for that behavior.
This patch adds additional test cases to cover other failure
scenarios, such as the coordinator or the statestore going down
with running queries, and verifies that the behavior is as expected
in each case.
Tests:
Passed exhaustive tests.
Change-Id: If617326cbc6fe2567857d6323c6413d98c92d009
Reviewed-on: http://gerrit.cloudera.org:8080/23217
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Abhishek Rawat <arawat@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Adds verification code to ensure the IMPALA_TOOLCHAIN_COMMIT_HASH
environment variable matches the commit hash in the
IMPALA_TOOLCHAIN_BUILD_ID_AARCH64 and
IMPALA_TOOLCHAIN_BUILD_ID_X86_64 environment variables.
Generated-by: Github Copilot (Claude Sonnet 3.7)
Change-Id: I348698356a014413875f6b8b54a005bf89b9793a
Reviewed-on: http://gerrit.cloudera.org:8080/23243
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Before this patch, the coordinator just invalidates the catalog cache when
it witnesses the catalog service id changing in DDL/DML responses or
statestore catalog updates. This is enough in the legacy catalog mode
since these are the only ways the coordinator gets metadata from
catalogd. However, in local catalog mode, the coordinator sends
getPartialCatalogObject requests to fetch metadata from catalogd. If the
request is now served by a new catalogd (e.g. due to HA failover), the
coordinator should invalidate its catalog cache in case catalog versions
overlap on the same table and stale metadata is unintentionally reused.
To ensure performance, catalogServiceIdLock_ in CatalogdMetaProvider is
refactored to be a ReentrantReadWriteLock. Most of the usages on it just
need the read lock.
This patch also adds the catalog service id in the profile.
Tests:
- Ran test_warmed_up_metadata_failover_catchup 50 times.
- Ran FE tests: CatalogdMetaProviderTest and LocalCatalogTest.
- Ran CORE tests
Change-Id: I751e43f5d594497a521313579defc5b179dc06ce
Reviewed-on: http://gerrit.cloudera.org:8080/23236
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Quanlong Huang <huangquanlong@gmail.com>
TestEventProcessing.test_event_based_replication turns flaky when
replication of a database lags because it has too many events to
replicate. Case III in the test turns flaky because the event
processor has to process so many ALTER_PARTITIONS events that the valid
writeId list can be inaccurate while the replication is not complete.
So a 20 sec timeout is introduced in case III after replication so
that the event processor will process events after the replication
process is completely done.
Testing:
- Looped the test 100 times to avoid flakiness
Change-Id: I89fcd951f6a65ab7fe97c4f23554d93d9ba12f4e
Reviewed-on: http://gerrit.cloudera.org:8080/22131
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Riza Suminto <riza.suminto@cloudera.com>
Several tests on catalogd HA failover have a loop of the following
pattern:
- Do some operations
- Kills the active catalogd
- Verifies some results
- Starts the killed catalogd
After starting the killed catalogd, the test gets the new active and
standby catalogds and checks their /healthz pages immediately. This could
fail if the web pages are not registered yet. The cause is that when
starting catalogd, we just wait for its 'statestore-subscriber.connected'
metric to be True. This doesn't guarantee that the web pages are
initialized. This patch adds a wait for this, i.e. when getting the web
pages hits a 404 (Not Found) error, wait and retry.
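The added wait is essentially a poll-until-not-404 loop, something like
the following (hypothetical helper; the real one lives in the test
utilities):
  import time
  import requests

  def wait_for_web_page(url, timeout_s=60):
      deadline = time.time() + timeout_s
      while time.time() < deadline:
          resp = requests.get(url)
          if resp.status_code != 404:  # page registered, e.g. /healthz is up
              return resp
          time.sleep(0.5)
      raise AssertionError(
          "web page %s not registered within %ds" % (url, timeout_s))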
Another flaky issue in these failover tests is that cleaning up
unique_database could fail because impalad still uses the old active
catalogd address even in RPC failure retries (IMPALA-14228). This patch
adds a retry on the DROP DATABASE statement to work around this.
Sets disable_log_buffering to True so the killed catalogd has complete
logs.
Sets catalog_client_connection_num_retries to 2 to save time in
the coordinator retrying RPCs to the killed catalogd. This reduces the
duration of test_warmed_up_metadata_failover_catchup from 100s to 50s.
Tests:
- Ran all (15) failover tests in test_catalogd_ha.py 10 times (each
round takes 450s).
Change-Id: Iad42a55ed7c357ed98d85c69e16ff705a8cae89d
Reviewed-on: http://gerrit.cloudera.org:8080/23235
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Quanlong Huang <huangquanlong@gmail.com>
This adds be/src/gutil and be/src/kudu to the list of excluded
locations for the code coverage report. These directories are third-party
code that has been vendored into the Impala repository. There
is a fair amount of unused code in those directories simply because it
is easier to maintain that way. Impala's tests aren't intending to test
that code.
Testing:
- Ran code coverage with the updated list
Change-Id: I7f3aa971e50b2c454e9ca607fb9d49d7cc3593ae
Reviewed-on: http://gerrit.cloudera.org:8080/23084
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This adds more tests in test_catalogd_ha.py for warm failover.
Refactored _test_metadata_after_failover to run in the following way:
- Run DDL/DML in the active catalogd.
- Kill the active catalogd and wait until the failover finishes.
- Verify the DDL/DML results in the new active catalogd.
- Restart the killed catalogd
It accepts two methods as parameters to perform the DDL/DML and the
verification. In the last step, the killed catalogd is started so we keep
having 2 catalogds and can merge these into a single test by invoking
_test_metadata_after_failover for different method pairs. This saves
some test time.
The following DDL/DML statements are tested:
- CreateTable
- AddPartition
- REFRESH
- DropPartition
- INSERT
- DropTable
After each failover, the table is verified to be warmed up (i.e. loaded).
Also validates flags at startup to make sure enable_insert_events and
enable_reload_events are both set to true when warm failover is enabled,
i.e. --catalogd_ha_reset_metadata_on_failover=false.
Change-Id: I6b20adeb0bd175592b425e521138c41196347600
Reviewed-on: http://gerrit.cloudera.org:8080/23206
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
CatalogD availability is improving since reading is_active_ no longer
requires holding catalog_lock_. However, during a failover scenario,
requests may slip into the passive-turned-active CatalogD and obtain
stale metadata.
This patch improves the situation in two steps. First, it adds a new
mutex ha_transition_lock_ that must be obtained by AcceptRequest() in HA
mode. This mutex protects both CatalogServer::WaitPendingResetStarts() and
CatalogServer::UpdateActiveCatalogd(). WaitPendingResetStarts() will
only exit and return to AcceptRequest() after the triggered_first_reset_
flag is True (initial metadata reset has completed) or
min_catalog_resets_to_serve_ is met. If only the latter happens, the
request will go through the Catalog JVM and subsequently be blocked by
CatalogResetManager.waitOngoingMetadataFetch() until the metadata reset
has progressed beyond the requested database/table.
Second, it increments numCatalogResetStarts_ on every global reset
(Invalidate Metadata) initiated by catalog-server.cc.
CatalogServer::MarkPendingMetadataReset() matches this logic to
increment min_catalog_resets_to_serve_ before setting
triggered_first_reset_ flag to False (consequently waking up
TriggerResetMetadata thread).
Rename WaitForCatalogReady() to
WaitCatalogReadinessForWorkloadManagement() since this wait mechanism is
specific to Workload Management initialization and has stricter
requirements.
Removed CatalogServer::IsActive() since the only call site is replaced
with CatalogServer::WaitHATransition().
Testing:
Added test_metadata_after_failover_with_delayed_reset and
test_metadata_after_failover_with_hms_sync.
Change-Id: I370d21319335318e441ec3c3455bac4227803900
Reviewed-on: http://gerrit.cloudera.org:8080/23194
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
IMPALA-14094 adds statistics for the Calcite planner. The row count
statistics for the original planner are estimated within HdfsScanNode
when the statistics are missing because they were not computed with
the compute statistics command.
This commit refactors this estimation code so that it is shareable.
Change-Id: I522e5105867fa1c85df5c04a4bc6cdd5d63443f0
Reviewed-on: http://gerrit.cloudera.org:8080/23185
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch bumps the Java version installed during bootstrap_build.sh to
Java 17 to keep the precommit environment consistently on Java 17.
bin/bootstrap_build.sh is a simplified setup script used instead of
bootstrap_system.sh when setting up an Impala environment limited to
compilation only. It is used mainly during the initial, lightweight
precommit checks on jenkins.impala.io when a patchset is submitted for
review.
This setup script was not updated with the Java version change from Java
8 to Java 17, so it became out of sync with the general assumption of
building Impala 5.x versions for Java 17.
This patch also removes a special case reserved for Ubuntu 14.04, which
is now not supported by Impala.
Tested automatically by submitting the patch for review.
Change-Id: I796c6004e13aeca536b339fee765f79f39cc2ea1
Reviewed-on: http://gerrit.cloudera.org:8080/23201
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
Currently, the hash trace accumulates up the plan tree and is
displayed only for tuple cache nodes. This means that tuple cache
nodes high in a large plan can have hundreds of lines of hash trace
output without an indication of which contributions came from
which nodes.
This changes the hash trace in two ways:
1. It displays each plan node's individual contribution to the hash
trace. This only contains a summary of the hash contributed by
the child, so the hash trace does not accumulate up the plan tree.
Since each node is displaying its own contribution, the tuple
cache node does not display the hash trace itself.
2. This adds structure to the hash trace to include a comment for
each contribution to the hash trace. This allows a cleaner display
of the individual pieces of a node's hash trace. It also gives
extra information about the specific contributions into the hash.
It should be possible to trace the contribution through the plan
tree.
This also changes the output to only display the hash trace with
explain_level=EXTENDED or higher (i.e. it won't be displayed with
STANDARD).
Example output:
tuple cache hash trace:
TupleDescriptor 0: TTupleDescriptor(id:0, byteSize:0, numNullBytes:0, tableId:1, tuplePath:[])
Table: TTableName(db_name:functional, table_name:alltypes)
PlanNode:
[TPlanNode(node_id:0, node_type:HDFS_SCAN_NODE, num_children:0, limit:-1, row_tuples:[0], nullable_tu]
[ples:[false], disable_codegen:false, pipelines:[], hdfs_scan_node:THdfsScanNode(tuple_id:0, random_r]
[eplica:false, use_mt_scan_node:false, is_partition_key_scan:false, file_formats:[]), resource_profil]
[e:TBackendResourceProfile(min_reservation:0, max_reservation:0))]
Query options hash: TQueryOptionsHash(hi:-2415313890045961504, lo:-1462668909363814466)
Testing:
- Modified TupleCacheInfoTest and TupleCacheTest to use the new hash trace
Change-Id: If53eda24e7eba264bc2d2f212b63eab9dc97a74c
Reviewed-on: http://gerrit.cloudera.org:8080/23017
Reviewed-by: Yida Wu <wydbaggio000@gmail.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tuple cache correctness verification is failing as the code in
debug-util.cc used for printing the text version of tuples does
not support printing structs. It hits a DCHECK and kills Impala.
This adds support for printing structs to debug-util.cc, fixing
tuple cache correctness verification for complex types. To print
structs correctly, each slot needs to know its field name. The
ColumnType has this information, but it requires a field idx to
lookup the name. This is the last index in the absolute path for
this slot. However, the materialized path can be truncated to
remove some indices at the end. Since we need that information to
resolve the field name, this adds the struct field idx to the
TSlotDescriptor to pass it to the backend.
This also adds a counter to the profile to track when correctness
verification is on. This is useful for testing.
Testing:
- Added a custom cluster test using nested types with
correctness verification
- Examined some of the text files
Change-Id: Ib9479754c2766a9dd6483ba065e26a4d3a22e7e9
Reviewed-on: http://gerrit.cloudera.org:8080/23075
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
test_hms_event_sync_timeout adds a sleep in events processing and runs a
SELECT in Impala after an INSERT in Hive. The Impala SELECT statement is
submitted with sync_hms_events_wait_time_s=2 and it's expected that
changes done in Hive haven't been applied in catalogd yet. However, the
change is applied by a single event (ADD_PARTITION) and the event
processing delay is just 2s, which is no longer enough. Sometimes the
event is applied just before the waitForHmsEvent request times out, so
the query still sees the latest results and fails the test.
This increases the event processing delay to 4s to deflake the test.
Change-Id: I91e9cbf234360446422259e274161a01a43ea3d9
Reviewed-on: http://gerrit.cloudera.org:8080/23207
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Consumes the new toolchain builds that compiled the OpenTelemetry-cpp
SDK libraries against the standard C++ library instead of the SDK's
nostd translation layer.
Change-Id: Icf06710d5f7987f43cb8bae5450b657f251f199b
Reviewed-on: http://gerrit.cloudera.org:8080/23192
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Jason Fehr <jfehr@cloudera.com>
This patch allows reading columns with integer logical type as decimals.
This can occur when we're trying to read files that were written as INT but
the column was altered to a suitable DECIMAL. In this case the precision
is based on the physical type and equals 9 or 18 for int32 and int64,
respectively.
Test:
* add new e2e tests
Change-Id: I56006eb3cca28c81ec8467d77b35005fbf669680
Reviewed-on: http://gerrit.cloudera.org:8080/22922
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Adds helper scripts and configurations to run an OpenTelemetry OTLP
collector and a Jaeger instance. The collector is configured to
receive telemetry data on port 55888 via OTLP-over-http and to
forward traces to a Jaeger-all-in-one container receiving data on
port 4317.
Testing was accomplished by running this setup locally and verifying traces appeared in
the Jaeger UI.
Generated-by: Github Copilot (GPT-4.1)
Change-Id: I198c00ddc99a87c630a6f654042bffece2c9d0fd
Reviewed-on: http://gerrit.cloudera.org:8080/23100
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds support for serving all the webUI content with gzip
content encoding.
For large JSONs and text profiles, Impala's webUI renderings may be
hindered by the user's network bandwidth.
As the browser's native gzip decompression is very fast, e.g. 300-400MB/s,
combining it with a faster compression level (i.e. gzip Z_BEST_SPEED) in the
backend results in significant speed increases, i.e. faster load times.
During compression, instead of multiple reallocations, existing string
data is reinterpreted to reduce memory usage.
In case of failure during compression, the content is served in plain
format as before.
Currently, none of the memory allocations are tracked for the
rapidjson-generated documents (or any string served by a daemon webserver),
so it is helpful to display the peak memory usage of the single buffer
used to serve all webUI content.
In the future, it is recommended to implement and use custom allocators
for all large served strings and rapidjson generated documents.
(See IMPALA-14178, IMPALA-14179)
Memory trackers within ExecEnv are now initialized before enabling
the webserver, allowing their use as parent memory trackers.
For now, the memory used by the compressed buffer for each compressed
response is tracked
(i.e. through the "WebserverCompressedBuffer" MemTracker).
Example:
For Impala daemon, it is included in the execution environment's
process memory tracker and displayed on the /memz page as follows.
# After serving a general webpage like /memz
WebserverCompressedBuffer: Total=0 Peak=227.56 KB
# After serving a query profile text / JSON
WebserverCompressedBuffer: Total=0 Peak=4.09 MB
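The header behavior can be spot-checked with a short requests snippet
like the one below (port 25000 is the default impalad webUI port; this
is not the added test itself):
  import requests

  # Ask for gzip; the webserver should answer with "Content-Encoding: gzip".
  resp = requests.get("http://localhost:25000/memz",
                      headers={"Accept-Encoding": "gzip"})
  print(resp.headers.get("Content-Encoding"))

  # Without gzip in Accept-Encoding the content is served in plain format.
  resp = requests.get("http://localhost:25000/memz",
                      headers={"Accept-Encoding": "identity"})
  print(resp.headers.get("Content-Encoding"))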
Tests:
* Added new tests to validate plain and gzipped content encoding headers
in test_web_pages.py - TestWebPage:test_content_encoding
in util/webserver-test.cc - Webserver::ContentEncodingHeadersTest
* The pre-existing tests validate the content
in test_web_pages.py, all tests request and validate gzipped content
in util/webserver-test.cc, all tests request and validate plain text
* Performance:
Approximate improvements for a TPC-DS 14 query ran locally with 3 nodes
with defaults
-> JSON profile : 4.53MB to 428.94KB
Without throttling / Raw local: 421ms to 421ms
Based on firefox's throttling(8 mbps): 8s to 2s
-> Text profile : 1.24MB to 219KB
Without throttling / Raw local: 281ms to 281ms
Based on firefox's throttling(8 mbps): 1.3s to 281ms
Change-Id: I431088a30337bbef2c8d6e16dd15fb6572db0f15
Reviewed-on: http://gerrit.cloudera.org:8080/22599
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
After IMPALA-14074, the passive catalogd can have a warmed up metadata
cache during failover (with catalogd_ha_reset_metadata_on_failover=false
and a non-empty warmup_tables_config_file). However, it could still use
a stale metadata cache when some pending HMS events generated by the
previous active catalogd are not applied yet.
This patch adds a wait during HA failover to ensure HMS events before
the failover happens are all applied on the new active catalogd. The
timeout is configured by a new flag which defaults to 300 (5 minutes):
catalogd_ha_failover_catchup_timeout_s. When timeout happens, by default
catalogd will fallback to resetting all metadata. Users can decide
whether to reset or continue using the current cache. This is configured
by another flag, catalogd_ha_reset_metadata_on_failover_catchup_timeout.
Since the passive catalogd depends on HMS event processing to keep its
metadata up-to-date with the active catalogd, this patch adds validation
to avoid starting catalogd with catalogd_ha_reset_metadata_on_failover
set to false and hms_event_polling_interval_s <= 0.
This patch also makes catalogd_ha_reset_metadata_on_failover a
non-hidden flag so it's shown in the /varz web page.
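A custom cluster test can exercise the new behavior with arguments along
these lines (the HA start flag and the values shown are assumptions for
illustration; only the flag names come from this patch):
  from tests.common.custom_cluster_test_suite import CustomClusterTestSuite

  @CustomClusterTestSuite.with_args(
      start_args="--enable_catalogd_ha",  # assumed HA start flag
      catalogd_args="--catalogd_ha_reset_metadata_on_failover=false "
                    "--catalogd_ha_failover_catchup_timeout_s=60 "
                    "--catalogd_ha_reset_metadata_on_failover_catchup_timeout=true")
  class TestCatalogdHAFailoverCatchup(CustomClusterTestSuite):

      def test_failover_catchup(self):
          pass  # kill the active catalogd, then verify on the new active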
Tests:
- Ran test_warmed_up_metadata_after_failover 200 times. Without the
fix, it usually fails in several runs.
- Added new tests for the new flags.
Change-Id: Icf4fcb0e27c14197f79625749949b47c033a5f31
Reviewed-on: http://gerrit.cloudera.org:8080/23174
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
For storage systems that support block location information (HDFS,
Ozone) we always retrieve it with the assumption that we can use it for
scheduling, to do local reads. But it's also typical that Impala is not
co-located with the storage system, not even in on-prem deployments.
E.g. when Impala runs in containers, and even if they are co-located,
we don't try to figure out which container runs on which machine.
In such cases we should not reach out to the storage system to collect
file information because it can be very expensive for large tables and
we won't benefit from it at all. Since currently there is no easy way
to tell if Impala is co-located with the storage system this patch
adds configuration options to disable block location retrieval during
table loading.
It can be disabled globally via Hadoop Configuration:
'impala.preload-block-locations-for-scheduling': 'false'
We can restrict it to filesystem schemes, e.g.:
'impala.preload-block-locations-for-scheduling.scheme.hdfs': 'false'
When multiple storage systems are configured with the same scheme, we
can still control block location loading based on authority, e.g.:
'impala.preload-block-locations-for-scheduling.authority.mycluster': 'false'
The latter only disables block location loading for URIs like
'hdfs://mycluster/warehouse/tablespace/...'
If block location loading is disabled by any of the switches, it cannot
be re-enabled by another, i.e. the most restrictive setting prevails.
E.g:
disable scheme 'hdfs', enable authority 'mycluster'
==> hdfs://mycluster/ is still disabled
disable globally, enable scheme 'hdfs', enable authority 'mycluster'
==> hdfs://mycluster/ is still disabled, as everything else is.
Testing:
* added unit tests for FileSystemUtil
* added unit tests for the file metadata loaders
* custom cluster tests with custom Hadoop configuration
Change-Id: I1c7a6a91f657c99792db885991b7677d2c240867
Reviewed-on: http://gerrit.cloudera.org:8080/23175
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The test didn't wait in wait_for_finished_timeout() long
enough and ignored its return value, so it could continue
execution before the query was actually finished.
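The fix boils down to waiting long enough and asserting on the return
value, e.g. a sketch like this (the helper and timeout are illustrative;
wait_for_finished_timeout is the method named above):
  def run_and_wait_finished(self, query, timeout_s=120):
      handle = self.client.execute_async(query)
      finished = self.client.wait_for_finished_timeout(handle, timeout=timeout_s)
      assert finished, "query did not finish within %ds" % timeout_s
      return handle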
Change-Id: I339bd338cfd3873cc4892f012066034a6f7d4e12
Reviewed-on: http://gerrit.cloudera.org:8080/23180
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Currently, the tuple cache keys do not include partition
information in either the planner key or the fragment instance
key. However, the partition is actually important for correctness.
First, there are settings defined on the table and partition that
can impact the results. For example, for processing text files,
the separator, escape character, etc are specified at the table
level. This impacts the rows produced from a given file. There
are other such settings stored at the partition level (e.g.
the JSON binary format).
Second, it is possible to have two partitions pointed at the same
filesystem location. For example, scale_db.num_partitions_1234_blocks_per_partition_1
is a table that has all partitions pointing to the same
location. In that case, the cache can't tell the partitions
apart based on the files alone. This is an exotic configuration.
Incorporating an identifier of the partition (e.g. the partition
keys/values) allows the cache to tell the difference.
To fix this, we incorporate partition information into the
key. At planning time, when incorporating the scan range information,
we also incorporate information about the associated partitions.
This moves the code to HdfsScanNode and changes it to iterate over
the partitions, hashing both the partition information and the scan
ranges. At runtime, the TupleCacheNode looks up the partition
associated with a scan node and hashes the additional information
on the HdfsPartitionDescriptor.
This includes some test-only changes to make it possible to run the
TestBinaryType::test_json_binary_format test case with tuple caching.
ImpalaTestSuite::_get_table_location() (used by clone_table()) now
detects a fully-qualified table name and extracts the database from it.
It only uses the vector to calculate the database if the table is
not fully qualified. This allows a test to clone a table without
needing to manipulate its vector to match the right database. This
also changes _get_table_location() so that it does not switch into the
database. This required reworking test_scanners_fuzz.py to use absolute
paths for queries. It turns out that some tests in test_scanners_fuzz.py
were running in the wrong database and running against uncorrupted
tables. After this is corrected, some tests can crash Impala. This
xfails those tests until this can be fixed (tracked by IMPALA-14219).
Testing:
- Added a frontend test in TupleCacheTest for a table with
multiple partitions pointed at the same place.
- Added custom cluster tests testing both issues
Change-Id: I3a7109fcf8a30bf915bb566f7d642f8037793a8c
Reviewed-on: http://gerrit.cloudera.org:8080/23074
Reviewed-by: Yida Wu <wydbaggio000@gmail.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
If an Iceberg table contains delete files, queries where it is on the
right side of a left anti-join fail:
select *
from alltypes a
LEFT ANTI JOIN
iceberg_v2_positional_update_all_rows b
ON a.id = b.i;
AnalysisException: Illegal column/field reference
'b.input__file__name' of semi-/anti-joined table 'b'
This is because semi-joined tuples need to be made visible explicitly in
order for paths pointing inside them to be resolvable, see
Analyzer::resolvePaths().
This commit adds code to IcebergScanPlanner to make the tuple containing
the virtual fields visible if it is semi-joined.
Testing:
- Added regression tests in iceberg-v2-read-position-deletes.test.
Change-Id: I19de9c7c7ed1d61cde281d270c4cc3ce0b7c582d
Reviewed-on: http://gerrit.cloudera.org:8080/23147
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds authorization tests for the case when
Impala only connects to an Iceberg REST Catalog. To make
the tests faster it also implements REFRESH AUTHORIZATION
without CatalogD.
Testing:
* custom cluster tests added with Ranger + Iceberg REST Catalog
Change-Id: I30d506e04537c5ca878ab9cf58792bc8a6b560c3
Reviewed-on: http://gerrit.cloudera.org:8080/23118
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Noemi Pap-Takacs <npaptakacs@cloudera.com>
Some other tests like tests/query_test/test_cancellation.py might create
tables under the tpch db, which fails the assertion in TestWarmupCatalog
that assumes there are 8 tables under it.
This fixes the test by fetching the table list of the tpch db at runtime
instead of hard-coding it.
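The runtime lookup amounts to something like the helper below
(hypothetical sketch using the test framework's client; the actual
assertion in the test may differ):
  def _get_tpch_tables(self):
      # Fetch the current table list instead of hard-coding the 8 names.
      return sorted(self.client.execute("SHOW TABLES IN tpch").data)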
Change-Id: I0aca8ee19146f2e63e7cd82177d9fce0b8c6736a
Reviewed-on: http://gerrit.cloudera.org:8080/23173
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>