impala/tests/custom_cluster/test_metastore_events_cleanup.py
Vihang Karajgaonkar cc6f6d5c91 IMPALA-11028: Table loading can fail when events are cleaned up
IMPALA-10502 introduces a createEventId field of a table which
is updated when Impala creates a table. This is used by
the events processor to determine if the subsequent CREATE_TABLE
event which is received should be skipped or not.

When the table is loaded for the first time, in order to avoid
race conditions, TableLoader updates the createEventId to the
last CREATE_TABLE event id from the metastore. In order to
fetch the latest CREATE_TABLE event id, it fetches all the
events from metastore since the last known createEventId of the
table. However, if there is a significant delay (more than 24 hours)
between the time a table is created or invalidated and the time it
is queried, the metastore cleanup thread may already have deleted
events generated since the table's createEventId. In that case, the
HMS client method getNextNotification() throws an IllegalStateException
because of the missing events. This exception causes the table load to
fail and the query to error out.

The fix is to not rely on the HMS Client method which throws the
IllegalStateException. Instead we use the backing thrift API directly.
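
The failure mode and the fix can be illustrated with a small, self-contained
simulation. This is only a sketch, not Impala or Hive code: ToyEventLog,
checked_fetch, raw_fetch, and MissingEventsError are hypothetical names, and
it assumes the client-side check amounts to detecting a gap in event ids after
the requested id.

```python
class MissingEventsError(Exception):
    """Hypothetical stand-in for the IllegalStateException described above."""


class ToyEventLog(object):
    """Hypothetical in-memory stand-in for the metastore notification log."""

    def __init__(self):
        self._events = []  # (event_id, event_type, table_name), ordered by id
        self._next_id = 1

    def add(self, event_type, table_name):
        event_id = self._next_id
        self._events.append((event_id, event_type, table_name))
        self._next_id += 1
        return event_id

    def cleanup(self, up_to_id):
        # Simulates the cleanup thread deleting events with id <= up_to_id.
        self._events = [e for e in self._events if e[0] > up_to_id]

    def raw_fetch(self, since_id):
        # Analogous to the direct API call: returns whatever events remain.
        return [e for e in self._events if e[0] > since_id]

    def checked_fetch(self, since_id):
        # Analogous to a validating client (assumption): raises when the
        # event ids following since_id are not contiguous.
        events = self.raw_fetch(since_id)
        expected = since_id + 1
        for event_id, _, _ in events:
            if event_id != expected:
                raise MissingEventsError(
                    "expected event id %d but got %d" % (expected, event_id))
            expected += 1
        return events


def latest_create_table_event_id(events, table_name, default):
    # Returns the newest CREATE_TABLE event id for table_name, or default
    # (the table's current createEventId) when no such event remains.
    ids = [event_id for event_id, event_type, name in events
           if event_type == "CREATE_TABLE" and name == table_name]
    return max(ids) if ids else default


# Same sequence as the regression test: create t1 and t2, let the events
# expire, create t3, then load t1.
log = ToyEventLog()
t1_create_event_id = log.add("CREATE_TABLE", "t1")
log.add("CREATE_TABLE", "t2")
log.cleanup(up_to_id=2)
log.add("CREATE_TABLE", "t3")

remaining = log.raw_fetch(t1_create_event_id)
new_id = latest_create_table_event_id(remaining, "t1", t1_create_event_id)
```

In this sketch, checked_fetch(t1_create_event_id) raises because the event
for t2 is gone, while raw_fetch tolerates the gap and the createEventId of t1
is left unchanged.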

Testing:
1. Introduced a custom cluster test which can reproduce this issue.
2. Test works after the patch.
3. Core tests.

Change-Id: I95e5e20e1a2086688a92abdfb28e89177e996a1a
Reviewed-on: http://gerrit.cloudera.org:8080/18038
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-11-23 07:45:47 +00:00


# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
import pytest
import os
from tests.common.custom_cluster_test_suite import CustomClusterTestSuite

IMPALA_HOME = os.getenv('IMPALA_HOME')
HIVE_SITE_EVENTS_CLEANUP = IMPALA_HOME + '/fe/src/test/resources/hive-site-events-cleanup'


class TestTableLoadingWithEventsCleanUp(CustomClusterTestSuite):

  @pytest.mark.execute_serially
  @CustomClusterTestSuite.with_args(hive_conf_dir=HIVE_SITE_EVENTS_CLEANUP)
  def test_table_load_with_events_cleanup(self, unique_database):
    """Regression test for IMPALA-11028"""
    self.execute_query_expect_success(self.client, "create table {}.{}"
                                                   "(id int)".format(unique_database,
                                                                     "t1"))
    self.execute_query_expect_success(self.client, "create table {}.{}"
                                                   "(id int)".format(unique_database,
                                                                     "t2"))
    self.execute_query_expect_success(self.client, "select sleep(120000)")
    self.execute_query_expect_success(self.client, "create table {}.{}"
                                                   "(id int)".format(unique_database,
                                                                     "t3"))
    self.execute_query_expect_success(self.client, "select * from "
                                                   "{}.{}".format(unique_database, "t1"))