impala

mirror of https://github.com/apache/impala.git synced 2025-12-19 18:12:08 -05:00

Author	SHA1	Message	Date
Joe McDonnell	c5a0ec8bdf	IMPALA-11980 (part 1): Put all thrift-generated python code into the impala_thrift_gen package This puts all of the thrift-generated python code into the impala_thrift_gen package. This is similar to what Impyla does for its thrift-generated python code, except that it uses the impala_thrift_gen package rather than impala._thrift_gen. This is a preparatory patch for fixing the absolute import issues. This patches all of the thrift files to add the python namespace. This has code to apply the patching to the thirdparty thrift files (hive_metastore.thrift, fb303.thrift) to do the same. Putting all the generated python into a package makes it easier to understand where the imports are getting code. When the subsequent change rearranges the shell code, the thrift generated code can stay in a separate directory. This uses isort to sort the imports for the affected Python files with the provided .isort.cfg file. This also adds an impala-isort shell script to make it easy to run. Testing: - Ran a core job Change-Id: Ie2927f22c7257aa38a78084efe5bd76d566493c0 Reviewed-on: http://gerrit.cloudera.org:8080/20169 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>	2025-04-15 17:03:02 +00:00
Csaba Ringhofer	f98b697c7b	IMPALA-13929: Make 'functional-query' the default workload in tests This change adds get_workload() to ImpalaTestSuite and removes it from all test suites that already returned 'functional-query'. get_workload() is also removed from CustomClusterTestSuite which used to return 'tpch'. All other changes besides impala_test_suite.py and custom_cluster_test_suite.py are just mass removals of get_workload() functions. The behavior is only changed in custom cluster tests that didn't override get_workload(). By returning 'functional-query' instead of 'tpch', exploration_strategy() will no longer return 'core' in 'exhaustive' test runs. See IMPALA-3947 on why workload affected exploration_strategy. An example for affected test is TestCatalogHMSFailures which was skipped both in core and exhaustive runs before this change. get_workload() functions that return a different workload than 'functional-query' are not changed - it is possible that some of these also don't handle exploration_strategy() as expected, but individually checking these tests is out of scope in this patch. Change-Id: I9ec6c41ffb3a30e1ea2de773626d1485c69fe115 Reviewed-on: http://gerrit.cloudera.org:8080/22726 Reviewed-by: Riza Suminto <riza.suminto@cloudera.com> Reviewed-by: Daniel Becker <daniel.becker@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2025-04-08 07:12:55 +00:00
Riza Suminto	95f353ac4a	IMPALA-13507: Allow disabling glog buffering via with_args fixture We have plenty of custom_cluster tests that assert against content of Impala daemon log files while the process is still running using assert_log_contains() and it's wrappers. The method specifically mention about disabling glog buffering ('-logbuflevel=-1'), but not all custom_cluster tests do that. This often result in flaky test that hard to triage and often neglected if it does not frequently run in core exploration. This patch adds boolean param 'disable_log_buffering' into CustomClusterTestSuite.with_args for test to declare intention to inspect log files in live minicluster. If it is True, start minicluster with '-logbuflevel=-1' for all daemons. If it is False, log WARNING on any calls to assert_log_contains(). There are several complex custom_cluster tests that left unchanged and print out such WARNING logs, such as: - TestQueryLive - TestQueryLogTableBeeswax - TestQueryLogOtherTable - TestQueryLogTableHS2 - TestQueryLogTableAll - TestQueryLogTableBufferPool - TestStatestoreRpcErrors - TestWorkloadManagementInitWait - TestWorkloadManagementSQLDetails This patch also fixed some small flake8 issues on modified tests. There is a flakiness sign at test_query_live.py where test query is submitted to coordinator and fail because sys.impala_query_live table has not exist yet from coordinator's perspective. This patch modify test_query_live.py to wait for few seconds until sys.impala_query_live is queryable. Testing: - Pass custom_cluster tests in exhaustive exploration. Change-Id: I56fb1746b8f3cea9f3db3514a86a526dffb44a61 Reviewed-on: http://gerrit.cloudera.org:8080/22015 Reviewed-by: Jason Fehr <jfehr@cloudera.com> Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2024-11-05 04:49:05 +00:00
Michael Smith	8d4497be09	IMPALA-13252: Consistently use PrintId to print TUniqueId Some logging formats TUniqueId inconsistently by relying on the Thrift to_string/toString generated printers. This makes it difficult to track a specific query through logs. Adds operator<<(ostream, TUniqueId) to simplify logging TUniqueId correctly, uses PrintId instead of toString in Java, and adds a verifier to test_banned_log_messages to ensure TUniqueId is not printed in logs. Change-Id: If01bf20a240debbbd4c0a22798045ea03f17b28e Reviewed-on: http://gerrit.cloudera.org:8080/21606 Reviewed-by: Yida Wu <wydbaggio000@gmail.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2024-07-27 04:26:44 +00:00
wzhou-code	ae95c43eda	IMPALA-12286: Make CatalogD HA robust IMPALA-12155 added support for CatalogD HA. The statestore assigns the roles for the catalogd in the HA pair. When an active catalogd is elected, statestore sends RPCs to coordinators and catalogds to notify the new active catalogd. But the RPCs could fail due to network conditions. To avoid missing the notification of new active catalogd, this patch makes statestore to resend RPCs to subscribers if there are RPC failures. This patch also rename the metric "catalog-server.ha-active-status" to "catalog-server.active-status" since this metric is also set for catalogd when CatalogD HA is not enabled. As discussed in IMPALA-12267, catalogd need to re-generate its Catalog Service ID when it becomes active. Testing: - Added unit-test cases for CatalogD HA with simulated RPC failures. - Passed core tests. Change-Id: Ibdfea022031c3cc1cbaf4ad52e947720a5d5630f Reviewed-on: http://gerrit.cloudera.org:8080/20220 Reviewed-by: Andrew Sherman <asherman@cloudera.com> Reviewed-by: Abhishek Rawat <arawat@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>	2023-07-21 07:34:32 +00:00
wzhou-code	a82830896b	IMPALA-12150: Use protocol version to isolate cluster components Some Thrift request/response structs in CatalogService were changed to add new variables in the middle, which caused cross version incompatibility issue for CatalogService. Impala cluster membership is managed by the statestore. During upgrade scenarios where different versions of Impala daemons are upgraded one at a time, the upgraded daemons have incompatible message formats. Even through protocol versions numbers were already defined for Statestore and Catalog Services, they were not used. The Statestore and Catalog server did not check the protocol version in the requests, which allowed incompatible Impala daemons to join one cluster. This causes unexpected query failures during rolling upgrade. We need a way to detect this and enforce that some rules are followed: - Statestore refuses the registration requests from incompatible subscribers. - Catalog server refuses the requests from incompatible clients. - Scheduler assigns tasks to a group of compatible executors. This patch isolate Impala daemons into separate clusters based on protocol versions of Statestore service to prevent incompatible Impala daemons from communicating with each other. It covers the Thrift RPC communications between catalogd and coordinators, and communication between statestore and its subscribers (executor, coordinators, catalogd and admissiond). This change should work for future upgrade. Following changes were made: - Bump StatestoreServiceVersion and CatalogServiceVersion to V2 for all requests of Statestore and Catalog services. - Update the request and response structs in CatalogService to ensure each Thrift request struct has protocol version and each Thrift response struct has returned status. - Update the request and response struct in StatestoreService to ensure each Thrift request struct has protocol version and each Thrift response struct has returned status. - Add subscriber type so that statestore could distinguish different types of subscribers. - Statestore checks protocol version for registration requests from subscribers. It refuses the requests with incompatible version. - Catalog server checks protocol version for Catalog service APIs, and returns error for requests with incompatible version. - Catalog daemon sends its address and the protocol version of Catalog service when it registers to statestore, statestore forwards the address and the protocol version of Catalog service to all subscribers during registration. - Add UpdateCatalogd API for StatestoreSubscriber service so that the coordinators could receive the address and the protocol version of Catalog service from statestore if the coordinators register to statestore before catalog daemon. - Add GetProtocolVersion API for Statestore service so that the subscribers can check the protocol version of statestore before calling RegisterSubscriber API. - Add starting flag tolerate_statestore_startup_delay. It is off by default. When it's enabled, the subscriber is able to tolerate the delay of the statestore's availability. The subscriber's process will not exit if it cannot register with the specified statestore on startup. But instead it enter into Recovery mode, it will loop, sleep and retry till it successfully register with the statestore. This flag should be enabled during rolling upgrade. CatalogServiceVersion is defined in CatalogService.thrift. In future, if we make non backward version compatible changes in the request or response structures for CatalogService APIs, we need to bump the protocol version of Catalog service. StatestoreServiceVersion is defined in StatestoreService.thrift. Similarly if we make non backward version compatible changes in the request or response structures for StatestoreService APIs, we need to bump the protocol version of Statestore service. Message formats for KRPC communications between coordinators and executors, and between admissiond and coordinators are defined in proto files under common/protobuf. If we make non backward version compatible changes in these structures, we need to bump the protocol version of Statestore service. Testing: - Added end-to-end unit tests. - Passed the core tests. - Ran manual test to verify old version of executors cannot register with new version of statestore, and new version of executors cannot register with old version of statestore. Change-Id: If61506dab38c4d1c50419c1b3f7bc4f9ee3676bc Reviewed-on: http://gerrit.cloudera.org:8080/19959 Reviewed-by: Andrew Sherman <asherman@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-06-21 04:24:16 +00:00
Joe McDonnell	eb66d00f9f	IMPALA-11974: Fix lazy list operators for Python 3 compatibility Python 3 changes list operators such as range, map, and filter to be lazy. Some code that expects the list operators to happen immediately will fail. e.g. Python 2: range(0,5) == [0,1,2,3,4] True Python 3: range(0,5) == [0,1,2,3,4] False The fix is to wrap locations with list(). i.e. Python 3: list(range(0,5)) == [0,1,2,3,4] True Since the base operators are now lazy, Python 3 also removes the old lazy versions (e.g. xrange, ifilter, izip, etc). This uses future's builtins package to convert the code to the Python 3 behavior (i.e. xrange -> future's builtins.range). Most of the changes were done via these futurize fixes: - libfuturize.fixes.fix_xrange_with_import - lib2to3.fixes.fix_map - lib2to3.fixes.fix_filter This eliminates the pylint warnings: - xrange-builtin - range-builtin-not-iterating - map-builtin-not-iterating - zip-builtin-not-iterating - filter-builtin-not-iterating - reduce-builtin - deprecated-itertools-function Testing: - Ran core job Change-Id: Ic7c082711f8eff451a1b5c085e97461c327edb5f Reviewed-on: http://gerrit.cloudera.org:8080/19589 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-03-09 17:17:57 +00:00
Joe McDonnell	82bd087fb1	IMPALA-11973: Add absolute_import, division to all eligible Python files This takes steps to make Python 2 behave like Python 3 as a way to flush out issues with running on Python 3. Specifically, it handles two main differences: 1. Python 3 requires absolute imports within packages. This can be emulated via "from __future__ import absolute_import" 2. Python 3 changed division to "true" division that doesn't round to an integer. This can be emulated via "from __future__ import division" This changes all Python files to add imports for absolute_import and division. For completeness, this also includes print_function in the import. I scrutinized each old-division location and converted some locations to use the integer division '//' operator if it needed an integer result (e.g. for indices, counts of records, etc). Some code was also using relative imports and needed to be adjusted to handle absolute_import. This fixes all Pylint warnings about no-absolute-import and old-division, and these warnings are now banned. Testing: - Ran core tests Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b Reviewed-on: http://gerrit.cloudera.org:8080/19588 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-03-09 17:17:57 +00:00
Michael Ho	41426077fb	IMPALA-9026: Use resolved IP address for statestore subscriber This change adds a flag (--statestore_subscriber_use_resolved_address) which, if set to true, allows statestore subscribers to use its resolved IP address instead of its hostname as the heartbeat address which statestore sends heartbeats / updates to. This flag is useful in certain situation in which the subscriber's DNS entry may not be present for a valid reason (e.g. a Kubernetes pod whose readiness probe returns false). An example is that there are multiple Impala coordinators but only one of them will be active at a time (for admission control reason) and the rest will serve as backup. In which case, we still want the backup coordinators to receive updates from statestore but not serve any queries. Change-Id: Ieb8302dec0e52beb9f0b88306a51c38ff42a63a2 Reviewed-on: http://gerrit.cloudera.org:8080/14388 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-10-29 23:11:47 +00:00
Zoram Thanga	f97634fcf9	IMPALA-2642: Fix a potential deadlock in statestore The statestored can deadlock if the number of subscribers has reached STATESTORE_MAX_SUBSCRIBERS, because the DoSubscriberUpdate() method calls OfferUpdate(), while holding subscribers_lock_, which also tries to take the same lock in this situation. Fix the issue by moving out the call to acquire subscribers_lock_ from OfferUpdate(), and depend on the callers to take it. We also make the maximum number of statestore subscribers a start-up time tuneable, to allow us to test the limit more easily. Testing: The problem is easily reproduced by lowering the value of STATESTORE_MAX_SUBSCRIBERS to 3, and then launching a mini cluster with 3 impalads. Without the fix, the statestored becomes completely deadlocked. A new EE test has been added to exercise this scenario. The test verifies that statestored correctly rejects new subscription requests when the limit it reached. Change-Id: I5d49dede221ce1f50ec299643b5532c61f93f0c6 Reviewed-on: http://gerrit.cloudera.org:8080/9038 Reviewed-by: Sailesh Mukil <sailesh@cloudera.com> Tested-by: Impala Public Jenkins	2018-01-31 21:27:00 +00:00

10 Commits