When debugging stale metadata, it helps to know which catalog version of each table a query used and when catalogd loaded that version. This patch exposes this information in the query profile for each referenced table. E.g.

  Original Table Versions: tpch.customer, 2249, 1726052668932, Wed Sep 11 19:04:28 CST 2024
  tpch.nation, 2255, 1726052790140, Wed Sep 11 19:06:30 CST 2024
  tpch.orders, 2257, 1726052803258, Wed Sep 11 19:06:43 CST 2024
  tpch.lineitem, 2254, 1726052785384, Wed Sep 11 19:06:25 CST 2024
  tpch.supplier, 2256, 1726052794235, Wed Sep 11 19:06:34 CST 2024

Each line consists of the table name, catalog version, loaded timestamp and the timestamp string.

Implementation:
The loaded timestamp is updated whenever a CatalogObject updates its catalog version in catalogd. It's passed to impalads with the TCatalogObject broadcast by statestore, or in DDL/DML responses.

Currently, the loaded timestamp is added for table, view, function, data source, and hdfs cache pool in catalogd. However, only those of table and view are used in impalad. For the loaded timestamp of other types, users can check them in the /catalog WebUI of catalogd.

Tests:
 - Adds e2e test

Change-Id: I94b2fd59ed5aca664d6db4448c61ad21a88a4f98
Reviewed-on: http://gerrit.cloudera.org:8080/21782
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
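For anyone reading these profile lines while chasing stale metadata, the following is a minimal Python sketch of how one "Original Table Versions" entry could be split into its four fields. It is not part of this patch or of Impala's code. The names TableVersion, parse_table_version and loaded_time_utc are hypothetical, and the sketch assumes the loaded timestamp is in milliseconds since the Unix epoch, which is consistent with the sample values above.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple

# Hypothetical prefix constant; it matches the profile key shown in the example.
PROFILE_KEY = "Original Table Versions:"


class TableVersion(NamedTuple):
    table: str            # fully qualified table name, e.g. "tpch.customer"
    catalog_version: int  # catalog version of the table used by the query
    loaded_ms: int        # assumed: epoch milliseconds when catalogd loaded it
    loaded_str: str       # human-readable timestamp string from the profile


def parse_table_version(line: str) -> TableVersion:
    """Parse one 'table, version, loaded timestamp, timestamp string' entry."""
    line = line.strip()
    # The first entry may carry the profile key in front of it.
    if line.startswith(PROFILE_KEY):
        line = line[len(PROFILE_KEY):]
    # Split into at most four fields so the timestamp string stays intact.
    table, version, loaded_ms, loaded_str = [f.strip() for f in line.split(",", 3)]
    return TableVersion(table, int(version), int(loaded_ms), loaded_str)


def loaded_time_utc(entry: TableVersion) -> datetime:
    """Convert the loaded timestamp to a timezone-aware UTC datetime."""
    seconds, millis = divmod(entry.loaded_ms, 1000)
    # Integer seconds plus a timedelta avoids float rounding of milliseconds.
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base + timedelta(milliseconds=millis)


if __name__ == "__main__":
    sample = ("Original Table Versions: tpch.customer, 2249, 1726052668932, "
              "Wed Sep 11 19:04:28 CST 2024")
    entry = parse_table_version(sample)
    print(entry.table, entry.catalog_version, loaded_time_utc(entry).isoformat())
```

Comparing the catalog versions parsed this way against the versions shown in the /catalog WebUI of catalogd is one way to tell whether an impalad planned a query against an older version of a table.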