IMPALA-14447: Parallelize table loading in getMissingTables()

StmtMetadataLoader.getMissingTables() loads missing tables in a serial
manner. In local catalog mode, a large number of serial table loads can
incur significant round-trip latency to CatalogD. This patch
parallelizes the table loading by using an executor service to look up
and gather all non-null FeTables from the given TableName set.
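
For illustration only, here is a minimal sketch of the idea, assuming a
hypothetical per-table helper lookupTable(); it is not the actual
StmtMetadataLoader code:

import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

static List<FeTable> loadTablesInParallel(Set<TableName> tableNames,
    int numThreads) throws InterruptedException, ExecutionException {
  ExecutorService pool = Executors.newFixedThreadPool(numThreads);
  try {
    List<Future<FeTable>> futures = new ArrayList<>();
    for (TableName name : tableNames) {
      // Each lookup may be a round trip to CatalogD in local catalog mode.
      futures.add(pool.submit(() -> lookupTable(name)));
    }
    List<FeTable> loaded = new ArrayList<>();
    for (Future<FeTable> f : futures) {
      FeTable tbl = f.get();
      if (tbl != null) loaded.add(tbl);  // gather only non-null FeTables
    }
    return loaded;
  } finally {
    pool.shutdown();
  }
}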

Modify LocalCatalog.loadDbs() and LocalDb.loadTableNames() slightly to
make them thread-safe. Change FrontendProfile.Scope to support nested
scopes referencing the same FrontendProfile instance.
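
As a rough illustration of the nested-scope behavior (a simplified
sketch, not FrontendProfile's actual API), a scope can be modeled as a
thread-local that keeps pointing at the profile that is already active
and restores the previous state on close:

final class ProfileScopeSketch {
  // Thread-local holding the profile of the currently open scope, if any.
  private static final ThreadLocal<Object> CURRENT = new ThreadLocal<>();

  // Opening a scope while one is already active reuses the active
  // profile, so nested scopes all reference the same instance.
  static AutoCloseable open(Object profile) {
    Object prev = CURRENT.get();
    CURRENT.set(prev != null ? prev : profile);
    return () -> CURRENT.set(prev);  // restore the outer state on close
  }
}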

Added a new flag, max_stmt_metadata_loader_threads, to control the
maximum number of threads to use for loading table metadata during
query compilation. It defaults to 8 threads per query compilation.

If there is only one table to load, max_stmt_metadata_loader_threads is
set to 1, or a RejectedExecutionException is raised, fall back to
loading the tables serially.
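
A minimal sketch of this fallback decision, where loadSerially() and
loadInParallel() are hypothetical helpers rather than the actual method
names:

import java.util.List;
import java.util.Set;
import java.util.concurrent.RejectedExecutionException;

List<FeTable> loadTables(Set<TableName> tableNames, int maxLoaderThreads) {
  if (tableNames.size() <= 1 || maxLoaderThreads == 1) {
    return loadSerially(tableNames);
  }
  try {
    int threads = Math.min(maxLoaderThreads, tableNames.size());
    return loadInParallel(tableNames, threads);
  } catch (RejectedExecutionException e) {
    // The executor refused new tasks; fall back to the serial path.
    return loadSerially(tableNames);
  }
}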

Testing:
Ran and passed a few tests, such as test_catalogd_ha.py,
test_concurrent_ddls.py, and test_observability.py.
Added FE test CatalogdMetaProviderTest.testProfileParallelLoad.
Manually ran the following query and observed parallel loading by
enabling TRACE level logging in CatalogdMetaProvider.java.

use functional;
select count(*) from alltypesnopart
union select count(*) from alltypessmall
union select count(*) from alltypestiny
union select count(*) from alltypesagg;

Change-Id: I97a5165844ae846b28338d62e93a20121488d79f
Reviewed-on: http://gerrit.cloudera.org:8080/23436
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>

@@ -302,6 +302,10 @@ DEFINE_int32(iceberg_catalog_num_threads, 16,
"Maximum number of threads to use for Iceberg catalog operations. These threads are "
"shared among concurrent Iceberg catalog operation (ie., ExpireSnapshot).");
DEFINE_int32(max_stmt_metadata_loader_threads, 8,
"Maximum number of threads to use for loading table metadata during query "
"compilation.");
// These coefficients have not been determined empirically. The write coefficient
// matches the coefficient for a broadcast sender in DataStreamSink. The read
// coefficient matches the coefficient for an exchange receiver in ExchandeNode.
@@ -358,6 +362,7 @@ DEFINE_validator(query_cpu_count_divisor, &ValidatePositiveDouble);
DEFINE_validator(min_processing_per_thread, &ValidatePositiveInt64);
DEFINE_validator(query_cpu_root_factor, &ValidatePositiveDouble);
DEFINE_validator(iceberg_catalog_num_threads, &ValidatePositiveInt32);
DEFINE_validator(max_stmt_metadata_loader_threads, &ValidatePositiveInt32);
DEFINE_validator(tuple_cache_cost_coefficient_write_bytes, &ValidateNonnegativeDouble);
DEFINE_validator(tuple_cache_cost_coefficient_write_rows, &ValidateNonnegativeDouble);
DEFINE_validator(tuple_cache_cost_coefficient_read_bytes, &ValidateNonnegativeDouble);
@@ -590,6 +595,7 @@ Status PopulateThriftBackendGflags(TBackendGflags& cfg) {
cfg.__set_tuple_cache_cost_coefficient_read_rows(
FLAGS_tuple_cache_cost_coefficient_read_rows);
cfg.__set_min_jdbc_scan_cardinality(FLAGS_min_jdbc_scan_cardinality);
cfg.__set_max_stmt_metadata_loader_threads(FLAGS_max_stmt_metadata_loader_threads);
return Status::OK();
}