mirror of
https://github.com/apache/impala.git
synced 2026-01-09 06:05:09 -05:00
Creating Kudu clients is very expensive as each will fetch metadata from the Kudu master, so we should minimize the number of Kudu clients that get created. This patch stores a map from Kudu master addressed to Kudu clients in KuduUtil to be used across the FE and catalog. Another patch has already addressed the BE. Future work will consider providing a way to invalidate the stored Kudu clients in case something goes wrong (IMPALA-5685) This relies on two changes on the Kudu side: one that clears non-covered range entries from the client's cache on table open (d07ecd6ded01201c912d2e336611a6a941f48d98), and one that automatically refreshes auth tokens when they expire (603c1578c78c0377ffafdd9c427ebfd8a206bda3). This patch disables some tests that no longer work as they relied on Kudu metadata loading operations timing out, but since we're reusing clients the metadata is already loaded when the test is run. Testing: - Ran a stress test on a 10 node cluster: scan of a small Kudu table, 1000 concurrent queries, load on the Kudu master was reduced signficantly, from ~50% cpu to ~5%. (with the BE changes included) - Ran the Kudu e2e tests. - Manually ran a test with concurrent INSERTs and 'ALTER TABLE ADD PARTITION' (which is affected by the Kudu side change mentiond above) and verified correctness. Change-Id: I9b0b346f37ee43f7f0eefe34a093eddbbdcf2a5e Reviewed-on: http://gerrit.cloudera.org:8080/6898 Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com> Tested-by: Impala Public Jenkins