This is a follow-up commit to IMPALA-12709. IMPALA-12152 and
IMPALA-12785 are affected when the hierarchical metastore event
processing feature is enabled.
The following changes are incorporated in this patch:
1. Added creationTime_ and dispatchTime_ fields in the MetastoreEvent
class to store the current time in milliseconds. They are used to
calculate:
a) Event dispatch time (the time between a MetastoreEvent object's
creation and when the event is moved to the inProgressLog_ of
EventExecutorService after being dispatched to a DbEventExecutor).
b) Event schedule delays incurred at DbEventExecutors and
TableEventExecutors (the time between an event being moved to
EventExecutorService's inProgressLog_ and the start of its
processing at the appropriate DbEventExecutor and
TableEventExecutor).
c) Event process time from the EventExecutorService's point of
view (the time spent in inProgressLog_ before the event is moved
to processedLog_).
Logs are added to show the event dispatch time, schedule delays and
process time from the EventExecutorService's point of view for each
event. A log is also added to show the time taken by each event's
processIfEnabled().
2. Added an isDelimiter_ field in the MetastoreEvent class to indicate
whether the event is a delimiter event. It is set only when
hierarchical event processing is enabled. A delimiter is a kind of
metastore event that does not require event processing. A delimiter
event can be:
a) A CommitTxnEvent that does not have any write event info for a
given transaction.
b) An AbortTxnEvent that does not have write ids for a given
transaction.
c) An IgnoredEvent.
An event is identified and marked as a delimiter in
EventExecutorService#dispatch(). Delimiter events are not queued to
a DbEventExecutor for processing. They are only maintained in the
inProgressLog_ to preserve continuity and correctness in
synchronization tracking. A delimiter event is removed from
inProgressLog_ when its preceding non-delimiter metastore event is
removed from inProgressLog_.
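The removal rule can be sketched as follows. This is an illustrative
simplification (the real inProgressLog_ is a Java TreeMap and the
eligibility rules may be more involved); in_progress is modeled as a
dict mapping event id to an is-delimiter flag:

```python
def remove_from_in_progress(in_progress, event_id):
    """Remove a processed non-delimiter event from the in-progress log,
    together with the delimiter events that immediately follow it (they
    have no work of their own and were only kept for continuity).

    in_progress: dict mapping event id -> True if the event is a
    delimiter, False otherwise. A simplified stand-in for the
    inProgressLog_ tree map."""
    del in_progress[event_id]
    for eid in sorted(k for k in in_progress if k > event_id):
        if not in_progress[eid]:
            # A later non-delimiter event now guards the remaining
            # delimiters; stop here.
            break
        del in_progress[eid]
```

For example, with event 1 (non-delimiter), events 2 and 3 (delimiters)
and event 4 (non-delimiter) in progress, removing event 1 also removes
2 and 3, leaving only event 4.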
3. The greatest synced event id is computed from the dispatched
events (inProgressLog_) and processed events (processedLog_) tree
maps. The greatest synced event is the latest event such that all
events with an id less than or equal to it are definitely synced.
4. Lag is calculated as the difference between the latest event time
on HMS and the greatest synced event time. It is shown in the log.
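A minimal sketch of these two computations (an illustrative reading,
not Impala's actual code; the tree maps are modeled as plain dicts
keyed by event id):

```python
def greatest_synced_event_id(in_progress, processed):
    """Greatest event id such that every event with an id less than or
    equal to it is definitely synced. Returns -1 when nothing
    qualifies."""
    if not in_progress:
        # No outstanding events: everything processed so far is synced.
        return max(processed) if processed else -1
    oldest_outstanding = min(in_progress)
    # Only processed events strictly older than the oldest outstanding
    # event are definitely synced.
    synced = [eid for eid in processed if eid < oldest_outstanding]
    return max(synced) if synced else -1


def lag_ms(latest_hms_event_time_ms, greatest_synced_event_time_ms):
    """Lag: latest event time on HMS minus greatest synced event time."""
    return latest_hms_event_time_ms - greatest_synced_event_time_ms
```

For example, with events 5 and 6 processed and event 7 still in
progress, the greatest synced event id is 6; once 7 completes, it
becomes 7.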
5. The greatest synced event id is used in the IMPALA-12152 changes.
When the greatest synced event id becomes greater than or equal to
waitForEventId, all the required events are definitely synced.
6. The event processor is paused gracefully when paused with the
command from IMPALA-12785. This ensures that all the events fetched
from HMS in the current batch are processed before the event
processor is fully paused. It is necessary to process the current
batch of events because certain events like AllocWriteIdEvent,
AbortTxnEvent and CommitTxnEvent update table write ids in the
catalog upon metastore event object creation, and the table write
ids are later applied to the appropriate table object when the
event is processed. Pausing abruptly in the middle of the current
batch of event processing can leave the write ids on table objects
in an inconsistent state.
7. Added the greatest synced event id and event time to the events
processor metrics, and updated the descriptions of the lag, pending
events, last synced event id and event time metrics.
8. The event queue update and the outstanding event count increment
are done atomically in the enqueue methods of both DbProcessor and
TableProcessor, so that the respective process methods do not
process an event until it is added to the queue and the outstanding
event count is incremented. Otherwise, an event could be processed
and the outstanding event count decremented before it is
incremented in the enqueue method.
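The atomicity requirement can be sketched like this (a simplified
model, not the actual DbProcessor/TableProcessor; a single lock covers
both the queue append and the counter increment):

```python
import threading
from collections import deque


class ProcessorSketch:
    """Simplified stand-in for DbProcessor/TableProcessor enqueueing."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queue = deque()
        self.outstanding = 0

    def enqueue(self, event):
        # Queue update and count increment happen under one lock, so a
        # concurrent process_one() can never observe the count being
        # decremented before the matching increment.
        with self._lock:
            self._queue.append(event)
            self.outstanding += 1

    def process_one(self):
        with self._lock:
            if not self._queue:
                return None
            event = self._queue.popleft()
        # ... event processing happens outside the lock ...
        with self._lock:
            self.outstanding -= 1
        return event
```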
9. Refactored the DbEventExecutor, DbProcessor, TableEventExecutor
and TableProcessor classes to propagate the exception that occurred
during event processing along with the event. EventProcessException
is a wrapper added to hold references to the event being processed
and the exception that occurred.
10. Added an AcidTableWriteInfo helper class to store the table,
write ids and partitions for the transaction id received in a
CommitTxnEvent.
Testing:
- Added new tests and executed existing end-to-end tests.
- Executed the existing tests with hierarchical event processing
enabled.
Change-Id: I26240f36aaf85125428dc39a66a2a1e4d3197e85
Reviewed-on: http://gerrit.cloudera.org:8080/22997
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
At present, the metastore event processor is single-threaded.
Notification events are processed sequentially, with a maximum of
1000 events fetched and processed in a single batch. Multiple locks
are used to address the concurrency issues that may arise when
catalog DDL operation processing and metastore event processing try
to access/update catalog objects concurrently. Waiting for a lock or
for file metadata loading of a table can slow event processing and
delay the processing of the events that follow, even when those
events do not depend on the previous event. Altogether it can take a
very long time to synchronize all the HMS events.
The existing metastore event processing is turned into multi-level
event processing with the enable_hierarchical_event_processing flag.
It is not enabled by default. The idea is to segregate events based
on their dependencies, maintain the order of events as they occur
within a dependency, and process them independently as much as
possible. The following 3 main classes represent the three-level
threaded event processing:
1. EventExecutorService
It provides the necessary methods to initialize, start, clear, stop
and process metastore events in hierarchical mode. It is
instantiated from MetastoreEventsProcessor and its methods are
invoked from MetastoreEventsProcessor. Upon receiving an event to
process, EventExecutorService queues the event to the appropriate
DbEventExecutor for processing.
2. DbEventExecutor
An instance of this class has an execution thread and manages events
of multiple databases with DbProcessors. A DbProcessor instance is
maintained to store the context of each database within the
DbEventExecutor. On each scheduled execution, input events on a
DbProcessor are segregated to the appropriate TableProcessors for
event processing, and the database events that are eligible for
processing are processed.
Once a DbEventExecutor is assigned to a database, a DbProcessor is
created, and subsequent events belonging to the database are queued
to the same DbEventExecutor thread for further processing. Hence,
linearizability is ensured in dealing with events within the
database. Each instance of DbEventExecutor has a fixed list of
TableEventExecutors.
3. TableEventExecutor
An instance of this class has an execution thread and processes
events of multiple tables with TableProcessors. A TableProcessor
instance is maintained to store the context of each table within a
TableEventExecutor. On each scheduled execution, events from the
TableProcessors are processed.
Once a TableEventExecutor is assigned to a table, a TableProcessor
is created, and subsequent table events are processed by the same
TableEventExecutor thread. Hence, linearizability is guaranteed in
processing the events of a particular table.
- All the events of a table are processed in the same order they
occurred.
- Events of different tables are processed in parallel when those
tables are assigned to different TableEventExecutors.
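The sticky assignment that yields these ordering guarantees can be
sketched as follows. This is an assumed scheme for illustration only;
the real code creates a DbProcessor/TableProcessor on first use and
the assignment details may differ:

```python
class StickyExecutorPool:
    """Routes every event for a given key (database or table name) to
    the same executor queue, preserving per-key event order while
    letting different keys proceed in parallel."""

    def __init__(self, num_executors):
        self.queues = [[] for _ in range(num_executors)]
        self._assignment = {}  # key -> executor index, fixed on first use

    def dispatch(self, key, event):
        # First event for a key picks an executor; all later events for
        # that key reuse it, so per-key ordering is preserved.
        idx = self._assignment.setdefault(
            key, len(self._assignment) % len(self.queues))
        self.queues[idx].append(event)
        return idx
```

Two events of the same database land on the same queue in order, while
a second database may land on a different queue and be drained in
parallel.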
The following new events are added:
1. DbBarrierEvent
This event wraps a database event. It is used to synchronize all
the TableProcessors belonging to the database before processing the
database event. It acts as a barrier that restricts the processing
of table events that occurred after the database event until the
database event is processed on the DbProcessor.
2. RenameTableBarrierEvent
This event wraps an alter table event for a rename. It is used to
synchronize the source and target TableProcessors to process the
rename table event. It ensures that the source TableProcessor
removes the table first and only then allows the target
TableProcessor to create the renamed table.
3. PseudoCommitTxnEvent and PseudoAbortTxnEvent
CommitTxnEvent and AbortTxnEvent can involve multiple tables in a
transaction, and processing these events modifies multiple table
objects. Pseudo events are introduced such that one pseudo event is
created for each table involved in the transaction, and these
pseudo events are processed independently at the respective
TableProcessors.
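The DbBarrierEvent coordination above can be modeled as a countdown
(a simplified model with assumed method names, not the actual class):
the wrapped database event runs only after every TableProcessor of
the database has reached the barrier, and table events queued behind
the barrier stay blocked until then.

```python
class DbBarrierSketch:
    """Countdown model of a barrier wrapping a database event."""

    def __init__(self, db_event, num_table_processors):
        self.db_event = db_event
        self._remaining = num_table_processors
        self.db_event_processed = False

    def table_processor_reached(self):
        """Called by each TableProcessor when it hits the barrier.
        Returns True once the database event has been processed, i.e.
        once table events behind the barrier may proceed."""
        if self._remaining > 0:
            self._remaining -= 1
            if self._remaining == 0:
                # Last TableProcessor arrived: process the db event.
                self.db_event_processed = True
        return self.db_event_processed
```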
The following new flags are introduced:
1. enable_hierarchical_event_processing
To enable the hierarchical event processing on catalogd.
2. num_db_event_executors
To set the number of database level event executors.
3. num_table_event_executors_per_db_event_executor
To set the number of table level event executors within a
database event executor.
4. min_event_processor_idle_ms
To set the minimum time to retain idle db processors and table
processors on the database event executors and table event
executors respectively, when they do not have events to process.
5. max_outstanding_events_on_executors
To set the maximum number of outstanding events to process on the
event executors.
Changed the hms_event_polling_interval_s type from int to double to
support millisecond-precision intervals.
TODOs:
1. We need to redefine the lag in the hierarchical processing mode.
2. Need a mechanism to capture the actual event processing time in
hierarchical processing mode. Currently, with
enable_hierarchical_event_processing set to true, lastSyncedEventId_
and lastSyncedEventTimeSecs_ are updated upon event dispatch to
EventExecutorService for processing on the respective DbEventExecutor
and/or TableEventExecutor. So lastSyncedEventId_ and
lastSyncedEventTimeSecs_ do not actually mean the events are
processed.
3. Hierarchical processing mode currently has a mechanism to show the
total number of outstanding events on all the db and table executors
at a given moment. Observability in this mode needs to be enhanced
further.
Filed a jira (IMPALA-13801) to fix these.
Testing:
- Executed existing end to end tests.
- Added fe and end-to-end tests with enable_hierarchical_event_processing.
- Added event processing performance tests.
- Executed the existing tests with hierarchical processing mode
enabled. lastSyncedEventId_ is now also used in the new
sync_hms_events_wait_time_s feature (IMPALA-12152). Some tests fail
when hierarchical processing mode is enabled because
lastSyncedEventId_ does not actually mean the event is processed in
this mode. This needs to be fixed/verified with the above jira
(IMPALA-13801).
Change-Id: I76d8a739f9db6d40f01028bfd786a85d83f9e5d6
Reviewed-on: http://gerrit.cloudera.org:8080/21031
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This puts all of the thrift-generated python code into the
impala_thrift_gen package. This is similar to what Impyla
does for its thrift-generated python code, except that it
uses the impala_thrift_gen package rather than impala._thrift_gen.
This is a preparatory patch for fixing the absolute import
issues.
This patches all of the thrift files to add the python namespace,
and includes code to apply the same patching to the thirdparty
thrift files (hive_metastore.thrift, fb303.thrift).
Putting all the generated python into a package makes it easier
to understand where the imports are getting code. When the
subsequent change rearranges the shell code, the thrift generated
code can stay in a separate directory.
This uses isort to sort the imports for the affected Python files
with the provided .isort.cfg file. This also adds an impala-isort
shell script to make it easy to run.
Testing:
- Ran a core job
Change-Id: Ie2927f22c7257aa38a78084efe5bd76d566493c0
Reviewed-on: http://gerrit.cloudera.org:8080/20169
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
EventProcessorUtils.wait_for_event_processing() is used in tests to
wait for HMS events to be processed by catalogd and for all impalads
to receive the catalog updates. Currently, the timeout for waiting
on catalog updates is 10s. However, some e2e tests like
test_overlap_min_max_filters run DDL/DMLs longer than 10s, which
could block the catalog update for longer than 10s. When this util
method is used in e2e tests, it can be impacted by other concurrent
tests and time out.
This patch deflakes the issue by bumping the timeout to 20s.
Change-Id: If6a785e6d98572bf1a3fa3efc81d712c7ecc488e
Reviewed-on: http://gerrit.cloudera.org:8080/22547
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Quanlong Huang <huangquanlong@gmail.com>
The event processor goes into the error state before it attempts a
global invalidate. It remains in the error state for a very short
period of time. If wait_for_synced_event_id() obtains the event
processor status during this period, it can see the status as
error. This test was introduced with IMPALA-12832.
Testing:
- Tested manually. Added a sleep in the code for testing so that the
event processor remains in the error state a little longer.
Change-Id: I787cff4cc9f9df345cd715c02b51b8d93a150edf
Reviewed-on: http://gerrit.cloudera.org:8080/21169
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
At present, a failure in event processing needs a manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in processing table events with the new
'invalidate_metadata_on_event_processing_failure' flag. A new
'invalidate_global_metadata_on_event_processing_failure' flag is
added to automatically run a global invalidate metadata when the
event processor goes into a non-active state.
Note: Also introduced a config
'inject_process_event_failure_event_types' for automated tests to
simulate event processor failures. This config specifies which event
types can be intentionally failed. It should only be used for
testing purposes. Needs IMPALA-12851 as a prerequisite.
Testing:
- Added end-to-end tests to mimic failures in event processor and verified
that event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests
Co-Authored-by: Sai Hemanth Gantasala <saihemanth@cloudera.com>
Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Reviewed-on: http://gerrit.cloudera.org:8080/21065
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
When the event processor goes into the ERROR/NEEDS_INVALIDATE state,
we can only check logs to get the detailed information. This is
inconvenient in triaging failures. This patch exposes the error
message in the /events WebUI. It includes the timestamp string and
the stacktrace of the exception.
This patch makes the /events page visible. It also modifies the test
code of EventProcessorUtils.wait_for_synced_event_id() to print the
error message if the event processor is down.
A trivial bug where lastProcessedEvent is not updated (IMPALA-11588)
is also fixed in this patch. The variable is refactored to be a
member of the class so internal methods can update it before
processing each event.
Some metrics were missing from the /events page, e.g.
latest-event-id, latest-event-time-ms, last-synced-event-time-ms.
This patch adds them and also adds an event-processing-delay-ms
metric, which is latest-event-time-ms minus last-synced-event-time-ms.
Tests:
- Manually inject codes to fail the event processor and verified the
WebUI.
- Ran metadata/test_event_processing.py when the event processor is in
ERROR state. Verified the error message is shown up in test output.
Change-Id: I077375422bc3d24eed57c95c6b05ac408228f083
Reviewed-on: http://gerrit.cloudera.org:8080/19916
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
There are some flaky tests where wait_for_event_processing() times
out, e.g. TestEventProcessing.test_insert_events. My theory is that
this is caused by parallel tests with DDL/DML statements that can
also fire HMS events that have to be processed by catalogd.
The change bumps the timeout (default: 10 sec -> 100 sec) in case
there is some progress in event processing. If the same event is
still being processed, the old timeout is used.
An alternative approach would be to mark the related tests as
serial, but I would prefer to avoid this as it would make test jobs
slower.
The event processor status is also checked, to time out earlier if
the event processor has no hope of recovery.
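The timeout policy can be sketched as a pure decision function (the
names and defaults here are illustrative, not the actual test-util
code):

```python
def should_keep_waiting(elapsed_s, made_progress,
                        base_timeout_s=10, extended_timeout_s=100):
    """Keep waiting up to the extended timeout while event processing
    is making progress; fall back to the old, short timeout when the
    same event is still being processed."""
    limit = extended_timeout_s if made_progress else base_timeout_s
    return elapsed_s < limit
```

A caller would additionally bail out immediately when the event
processor status indicates it cannot recover.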
Change-Id: I676854f7df9aea5fa10fb6ecf6381195bc8fa4b8
Reviewed-on: http://gerrit.cloudera.org:8080/19614
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
1. Python 3 requires absolute imports within packages. This
can be emulated via "from __future__ import absolute_import"
2. Python 3 changed division to "true" division that doesn't
round to an integer. This can be emulated via
"from __future__ import division"
This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.
I scrutinized each old-division location and converted some
locations to use the integer division '//' operator where an
integer result was needed (e.g. for indices, counts of records,
etc). Some code was also using relative imports and needed to be
adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
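A small example of the emulated division behavior (safe to run on
both Python 2 and 3):

```python
from __future__ import absolute_import, division, print_function

# With "division" imported, / is true division even on Python 2,
# matching Python 3 semantics...
print(7 / 2)   # 3.5 on both Python 2 and 3
# ...while // keeps integer division for indices, record counts, etc.
print(7 // 2)  # 3
```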
Testing:
- Ran core tests
Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
from Impala
In this patch, we use TUpdateCatalogRequest instead of
TResetMetadataRequest to refresh metadata after 'LOAD DATA', so that
we can reuse the code for 'INSERT' statements. It fires an insert
event just the same as what we do for 'INSERT' statements.
We also fix the inconsistent indentation in event_processor_utils.py.
Testing:
- Run existing test_load.py
- Added test_load_data_from_impala() in test_event_processing.py
Change-Id: I7f1b470f40e0aaf891c9f3f327af393b2f9c74bc
Reviewed-on: http://gerrit.cloudera.org:8080/19052
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
This patch improves the performance of the events processor by
batching together consecutive ALTER_PARTITION or INSERT events.
Currently, without this patch, if the event stream consists of many
consecutive ALTER_PARTITION events which cannot be skipped, the
events processor refreshes the partition from each event one by
one. Similarly, for INSERT events in a partition, the events
processor refreshes one partition at a time.
By batching such consecutive ALTER_PARTITION or INSERT events
together, the events processor needs to take a lock on the table
only once per batch and can refresh all the partitions from the
events using multiple threads. For transactional (ACID) tables this
provides an even more significant performance gain, since currently
we refresh the whole table for ALTER_PARTITION or INSERT partition
events. By batching them together, the events processor refreshes
the table once per batch.
A batch of eligible ALTER_PARTITION or INSERT events is processed
as an ALTER_PARTITIONS or INSERT_PARTITIONS event respectively.
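The grouping logic can be sketched as follows. This is a
simplification of the actual eligibility rules (which also check
that the events cannot be skipped), shown here only to illustrate
run-length batching of the event stream:

```python
BATCHABLE = ('ALTER_PARTITION', 'INSERT')


def batch_events(events):
    """Group consecutive events of the same batchable type on the same
    table, so the table lock is taken once per batch.

    events: list of (event_type, table_name) pairs.
    Returns a list of [event_type, table_name, count] batches."""
    batches = []
    for event_type, table in events:
        if (batches and event_type in BATCHABLE
                and batches[-1][:2] == [event_type, table]):
            batches[-1][2] += 1
        else:
            batches.append([event_type, table, 1])
    return batches
```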
Performance tests:
In order to simulate a bunch of ALTER_PARTITION and INSERT events,
a simple test was performed by running the following query from
Hive:
insert into store_sales_copy partition(ss_sold_date_sk)
select * from store_sales;
This query generates 1824 ALTER_PARTITION and 1824 INSERT events,
and the time taken to process all the generated events was measured
before and after the patch for an external and an ACID table.
Table Type Before After
======================================================
External table 75 sec 25 sec
ACID tables 313 sec 47 sec
Additionally, the patch also fixes a minor bug in
evaluateSelfEvent(), which should return false when the serviceId
does not match.
Testing Done:
1. Added new tests which cover the batching logic of events.
2. Exhaustive tests.
Change-Id: I5d27a68a64436d31731e9a219b1efd6fc842de73
Reviewed-on: http://gerrit.cloudera.org:8080/17848
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Sourabh Goyal <sourabhg@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
apis are accessed from catalog's metastore server.
This patch fixes a scenario where a table/db already exists in the
cache and a user drops it via the catalog metastore server endpoint
(i.e. the drop_table HMS API). When recreating the same table/db
via the Impala shell, the user gets an error that the table/db
already exists. The patch fixes this by dropping the table/db from
the cache in the HMS endpoints so that creating the new table/db
succeeds.
Testing:
Added new unit tests which cover the drop_database and drop_table
HMS APIs.
Change-Id: Ic2e2ad2630e2028b8ad26a6272ee766b27e0935c
Reviewed-on: http://gerrit.cloudera.org:8080/17576
Reviewed-by: <kishen@cloudera.com>
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>
Tested-by: Vihang Karajgaonkar <vihang@cloudera.com>
This commit turns on events processing by default. The default
polling interval is set to 1 second, which can be overridden by
setting hms_event_polling_interval_s to a non-default value.
With event polling turned on by default, this patch also moves
test_event_processing.py to tests/metadata instead of the custom
cluster tests. Some tests within test_event_processing.py that
needed non-default configurations were moved to
tests/custom_cluster/test_events_custom_configs.py.
Additionally, some other tests were modified to take into account
Impala's ability to automatically detect tables newly added from
Hive.
Testing done:
1. Ran exhaustive tests multiple times with events processing
turned on.
2. Ran exhaustive tests by disabling events processing.
3. Ran dockerized tests.
Change-Id: I9a8b1871a98b913d0ad8bb26a104a296b6a06122
Reviewed-on: http://gerrit.cloudera.org:8080/17612
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
This commit redoes some of the self-event detection logic,
specifically for partition events. Before the patch, the self-event
identifiers for a partition were stored at the table level when
generating the partition events. This was problematic since, unlike
ADD_PARTITION and DROP_PARTITION events, an ALTER_PARTITION event
is generated once per partition. Due to this, if multiple
ALTER_PARTITION events are generated, only the first event is
identified as a self-event and the rest of the events are
processed. This patch fixes this by adding the self-event
identifiers to each partition, so that when the event is later
received, each ALTER_PARTITION uses the state stored in the
HdfsPartition to evaluate the self-event. The patch makes sure that
the event processor takes a table lock during self-event evaluation
to avoid races with other parts of the code that try to modify the
table at the same time.
Additionally, this patch changes the event processor to refresh a
loaded table (incomplete tables are not refreshed) when an
ALTER_TABLE event is received, instead of invalidating the table.
This makes the events processor consistent with respect to all the
other event types. In the future, we should add a flag to choose
the preferred behavior (invalidate or refresh).
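The per-partition bookkeeping can be sketched like this
(illustrative names only; the real identifiers are version/service
markers tracked on HdfsPartition in Java):

```python
class PartitionSelfEventSketch:
    """Stores self-event identifiers on the partition itself, so each
    ALTER_PARTITION event (one per partition) is matched
    independently rather than against a single table-level record."""

    def __init__(self):
        self._in_flight_versions = []

    def add_self_event_version(self, version):
        """Record an identifier when this catalogd alters the
        partition."""
        self._in_flight_versions.append(version)

    def is_self_event(self, version):
        """True (skip the event) iff this partition recorded the
        version; consume the identifier so it matches at most once."""
        if version in self._in_flight_versions:
            self._in_flight_versions.remove(version)
            return True
        return False
```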
Also, this patch fixes the following related issues:
1. Self-event logic was not triggered for alter database events
when the user modifies the comment on the database.
2. For queries like "alter table add if not exists partition...",
the partition is not added since it is pre-existing. The self-event
identifiers should not be added in such cases since no event is
expected from such queries.
3. Changed the wait_for_event_processing test util method in
EventProcessorUtils to use a more deterministic way of determining
whether the catalog updates have propagated to the impalads,
instead of waiting for a random duration of time. This also speeds
up the event processing tests significantly.
Testing Done:
1. Added an e2e self-events test which runs multiple Impala queries
and makes sure that the events skip processing.
2. Ran MetastoreEventsProcessorTest
3. Ran core tests on CDH and CDP builds.
Change-Id: I9b4148f6be0f9f946c8ad8f314d64b095731744c
Reviewed-on: http://gerrit.cloudera.org:8080/14799
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Certain Hive queries like "alter table <table> add if not exists
partition (<part_spec>)" generate an add_partition event even if
the partition did not really exist. Such events have an empty
partition list in the event message, which trips the Precondition
check in AddPartitionEvent. This causes the event processor to go
into the error state, and the only way to recover is to issue
invalidate metadata. The patch adds logic to ignore such events.
Testing:
1. Added a test case which reproduces the issue. The test case works
after the patch is applied.
Change-Id: I877ce6233934e7090cd18e497f748bc6479838cb
Reviewed-on: http://gerrit.cloudera.org:8080/14049
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>