Commit Graph

10 Commits

Author SHA1 Message Date
Sourabh Goyal
b25e250d32 IMPALA-10926: Improve catalogd consistency and self events detection
In the current design, catalogd cache gets updated from 2 sources:
1. Impala shell
2. MetastoreEventProcessor

The updates from the Impala shell are applied in place, whereas
MetastoreEventProcessor runs as a background thread which polls HMS
events from the notification log table and applies them asynchronously.
These two streams of updates cause consistency issues. For example,
consider the following sequence of alter table events on a table
t1 as per HMS:

1. alter table t1 from source s1 say other Impala cluster
2. alter table t1 from source s2 say other Hive cluster
3. alter table t1 from local Impala cluster

The alter table DDL operation in #3 would be reflected in the local
cache immediately. However, the event processor would later process
the events from #1 and #2 above and try to alter the table. Ideally,
these alters should have been applied before #3, i.e. in the same
order as they appear in the HMS notification log. This leaves table
t1 in an inconsistent state.

Proposed solution:

The main idea of the solution is to track, in the Db/Table object, the
last event id (eventId) that the catalogd has synced to for a given
table. The events processor ignores any event whose EVENT_ID is less
than or equal to the eventId stored in the table. Once the events
processor successfully processes a given event, it updates the value
of eventId in the table before releasing the table lock. Also, any DDL
or refresh operation on the catalogd (from both the catalog HMS
metastore server and the Impala shell) follows these steps to update
the event id for the table:

1. Acquire write lock on the table
2. Perform ddl operation in HMS
3. Sync the table up to the latest event id (as per HMS) from its last
   synced event id

The above steps ensure that any concurrent updates applied to the same
db/table from multiple sources (Hive, Impala, or even multiple Impala
clusters) are reflected in the local catalogd cache in the same order
as they appear in HMS, thus removing any inconsistencies. The solution
also relies on the existing locking mechanism in the catalogd to
prevent any other concurrent updates to the table (even via the
EventsProcessor). For database objects, we will also keep a similar
eventId representing the database events (CREATE, DROP, ALTER
DATABASE) to which the catalogd has synced.
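
To make the steps concrete, here is a minimal sketch of the DDL-side
flow, assuming hypothetical CatalogTable/applyToCache helpers in place
of the catalogd internals; only the HMS calls (IMetaStoreClient) are
from the real Hive metastore client API:

    // Sketch only: CatalogTable, applyToCache() and the lock methods are
    // placeholders, not the actual catalogd classes. Filtering the event
    // stream down to this table is omitted for brevity.
    void alterTableAndSync(CatalogTable tbl,
        org.apache.hadoop.hive.metastore.api.Table newHmsTable,
        IMetaStoreClient hms) throws Exception {
      tbl.takeWriteLock();                                         // step 1
      try {
        hms.alter_table(tbl.getDbName(), tbl.getTblName(), newHmsTable);  // step 2
        // step 3: replay events in (lastSyncedEventId, latest] in HMS order
        long latest = hms.getCurrentNotificationEventId().getEventId();
        int max = (int) (latest - tbl.getLastSyncedEventId());
        for (NotificationEvent e : hms.getNextNotification(
            tbl.getLastSyncedEventId(), max, null).getEvents()) {
          applyToCache(tbl, e);
          tbl.setLastSyncedEventId(e.getEventId());
        }
      } finally {
        tbl.releaseWriteLock();
      }
    }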

This patch addresses the following:
1. Add a new flag enable_sync_to_latest_event_on_ddls to enable/disable
   this improvement. It is turned off by default.
2. If the flag in #1 is enabled then, apart from the Impala shell and
   MetastoreEventProcessor, the cache is also updated for DDLs
   executed via the catalog HMS endpoints. While executing a DDL, the
   db/table is synced up to the latest event id.
3. The event processor skips processing an event if the db/table is
   already synced up to that event id, and sets that event id on the
   db/table once the event is processed (see the sketch after this
   list).
4. When the EventProcessor detects a self event, it sets the last
   synced event id on the db/table before skipping the processing of
   the event.
5. A full table refresh sets the last processed event id on the table
   in the cache.
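
The skip logic in items 3 and 4 can be sketched as follows (method and
class names are illustrative, not the actual MetastoreEventProcessor
code):

    // Sketch: skip events the table has already synced past; otherwise
    // apply the event and advance the table's last synced event id.
    void process(NotificationEvent event, CatalogTable tbl) {
      if (event.getEventId() <= tbl.getLastSyncedEventId()) {
        return;                            // already covered by a DDL-side sync
      }
      if (isSelfEvent(event, tbl)) {
        tbl.setLastSyncedEventId(event.getEventId());  // item 4: record, then skip
        return;
      }
      applyToCache(tbl, event);                        // item 3: apply, then record
      tbl.setLastSyncedEventId(event.getEventId());
    }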

Future Work:
1. Sync db/table to the latest event id for DDLs executed from the
   Impala shell (execDdlRequest() in CatalogOpExecutor)

Testing:

1. Added new unit tests and modified existing ones
2. Ran exhaustive tests with the flag both turned on and off

Change-Id: I36364e401911352c4666674eb98c8d61bbaae9b9
Reviewed-on: http://gerrit.cloudera.org:8080/17859
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-12-14 02:18:29 +00:00
Fucun Chu
157086cb80 IMPALA-10771: Add Tencent COS support
This patch adds support for COS (Cloud Object Storage). Using
hadoop-cos, the implementation is similar to other remote FileSystems.

New flags for COS:
- num_cos_io_threads: Number of COS I/O threads. Defaults to 16.
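
Since hadoop-cos plugs COS into the standard Hadoop FileSystem API,
access from Java looks the same as for other remote stores; a minimal
sketch (the cosn:// bucket and path are made up for illustration):

    Configuration conf = new Configuration();        // org.apache.hadoop.conf
    FileSystem fs = FileSystem.get(
        java.net.URI.create("cosn://example-bucket"), conf);
    for (FileStatus st : fs.listStatus(new Path("/warehouse/t1"))) {
      System.out.println(st.getPath() + " " + st.getLen());
    }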

Follow-up:
- Support for caching COS file handles will be addressed in
   IMPALA-10772.
- test_concurrent_inserts and test_failing_inserts in
   test_acid_stress.py are skipped due to slow file listing on
   COS (IMPALA-10773).

Tests:
 - Upload hdfs test data to a COS bucket. Modify all locations in HMS
   DB to point to the COS bucket. Remove some hdfs caching params.
   Run CORE tests.

Change-Id: Idce135a7591d1b4c74425e365525be3086a39821
Reviewed-on: http://gerrit.cloudera.org:8080/17503
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-12-08 16:32:02 +00:00
Vihang Karajgaonkar
731bb8029e IMPALA-10885: Deflake test_get_table_req_without_fallback
The test was originally written when events processing
was not turned on by default. However, after
IMPALA-8795, events processing is turned on by
default and this test fails intermittently.

The error was reproduced by adding a simple
sleep statement in the test just before issuing a
get_table API call which was expected to fail.

Testing:
1. Looped the test 25 times with the change and with
the sleep statement that reproduced the failure.

Change-Id: I684ec07cc23617d64355df25420c45b0cbedd5a3
Reviewed-on: http://gerrit.cloudera.org:8080/17817
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-31 04:58:31 +00:00
Sourabh Goyal
fcbb15a5ea IMPALA-10746: Drop table/db from catalog cache when drop table/db HMS
apis are accessed from catalog's metastore server.

This patch fixes a scenario where a table/db already exists in the
cache and a user drops it via the catalog metastore server endpoint
(i.e. the drop_table HMS API). When recreating the same table/db via
the Impala shell, the user gets an error that the table/db already
exists. The patch fixes this by also dropping the table/db from the
cache when it is dropped via the HMS endpoints, so that creating the
new table/db succeeds.
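
A rough sketch of the shape of the fix, with hypothetical handler and
cache names rather than the actual catalogd classes:

    // Sketch: the catalog-side HMS endpoint performs the real drop in the
    // backing HMS and also removes the table from the local catalogd cache,
    // so a later CREATE of the same name does not hit a stale entry.
    public void drop_table(String dbName, String tblName, boolean deleteData)
        throws TException {
      delegateToHms().drop_table(dbName, tblName, deleteData);
      catalogCache_.removeTableIfExists(dbName, tblName);
    }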

Testing:
 Added new unit tests which cover the drop_database and
drop_table HMS APIs

Change-Id: Ic2e2ad2630e2028b8ad26a6272ee766b27e0935c
Reviewed-on: http://gerrit.cloudera.org:8080/17576
Reviewed-by: <kishen@cloudera.com>
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>
Tested-by: Vihang Karajgaonkar <vihang@cloudera.com>
2021-08-11 18:11:56 +00:00
Sourabh Goyal
780a892f57 IMPALA-10813: Invalidate external table from catalog cache for
truncate table HMS api

This patch is a continuation of IMPALA-10648, in which we missed
invalidating external tables for the truncate_table API.

Testing:
Enhanced the existing test to include the truncate_table scenario

Change-Id: I734c2b5f371291fef32badab9efc886b4b067e10
Reviewed-on: http://gerrit.cloudera.org:8080/17705
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>
2021-07-28 19:51:03 +00:00
Sourabh Goyal
975047f43c IMPALA-10648: Invalidate catalogd table metadata cache for HMS DDL apis
For transactional tables, catalogd already guarantees consistent table
metadata reads based on the writeIdList passed in the request. For
non-transactional tables, the reads are eventually consistent, since
the event processor thread processes HMS events for the table in the
background and updates its metadata.
In this patch, to ensure strong consistency guarantees for external
tables, we invalidate the table metadata in the cache if HMS DDL APIs
like alter/drop table/partition are accessed from catalogd's metastore
server. As a result, any subsequent get table request fetches the
table from HMS and loads it into the cache. This ensures that any
get_table/get_partition requests after DDL operations on the same
table return updated table metadata. This behavior has a performance
penalty since loading metadata into the cache takes time, especially
for large tables. The change is behind the catalogd server flag
invalidate_hms_cache_on_ddls, which is enabled by default. The flag
needs to be turned off if it becomes a performance bottleneck.
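
The invalidation itself is simple in shape; a hedged sketch with
hypothetical names (the flag field and cache calls are illustrative,
not the actual catalogd code):

    // Sketch: after an alter arrives through catalogd's metastore endpoint,
    // mark the cached table stale so the next get_table reloads it from HMS.
    public void alter_table(String dbName, String tblName, Table newTable)
        throws TException {
      delegateToHms().alter_table(dbName, tblName, newTable);
      if (invalidateHmsCacheOnDdls_) {              // --invalidate_hms_cache_on_ddls
        catalogCache_.invalidateTable(dbName, tblName);
      }
    }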

Change-Id: Idb9cc22ebfb51948433e4d57f4705ce201acaf98
Reviewed-on: http://gerrit.cloudera.org:8080/17298
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>
2021-06-09 17:41:24 +00:00
Vihang Karajgaonkar
8087c75f62 IMPALA-10613: (addendum) Fix test on S3 builds
test_metastore_service.py fails on S3 builds because it expects
the filemetadata's object dictionary to be present. However, if the
table is located on S3, there are no file-blocks in the returned
file-metadata and hence the length of obj_dict will be 0.

This patch fixes the test by not asserting the check on S3, ADLS,
and GCS builds.

Testing Done:
1. Ran the test locally to confirm it is working.
2. [WIP] Ran the test on an S3 environment where tables are created
on S3.

Change-Id: I6ac291529dc0661abdfc2d4f48924a2c4b807c40
Reviewed-on: http://gerrit.cloudera.org:8080/17483
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>
2021-05-25 17:00:15 +00:00
Vihang Karajgaonkar
5c85bf5c54 Revert "Revert "IMPALA-10613: Standup HMS thrift server in Catalog""
This reverts commit 829d1a6ab4.

Additionally, this patch has a couple of addendums related to the
original change:
1. A bug fix to the original reverted commit, which used
isSetGetFileMetadata instead of isGetFileMetadata
(see https://gerrit.cloudera.org/#/c/17330/ and the sketch below)
2. A fix for intermittent failures in CatalogHmsFileMetadataTest
caused by the catalogd HMS client's requirement that
"hive.metastore.execute.setugi" be set to false.
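
For context on item 1: Thrift-generated Java code exposes both a
presence check and a value getter for a boolean field, and the
addendum switches to the value getter (the helper below is
hypothetical):

    // req.isSetGetFileMetadata() -> has the field been explicitly set?
    // req.isGetFileMetadata()    -> the field's boolean value
    if (req.isGetFileMetadata()) {
      attachFileMetadata(response, req);   // hypothetical helper
    }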

Change-Id: Icbe93f3ae4efd585d4b0092a9ac7081b0b2c1c44
Reviewed-on: http://gerrit.cloudera.org:8080/17429
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
2021-05-18 00:37:53 +00:00
Joe McDonnell
829d1a6ab4 Revert "IMPALA-10613: Standup HMS thrift server in Catalog"
There are issues building this patch against other
Hive versions, so reverting until those can be addressed.

This reverts commit a7eae471b8.

Change-Id: Id952ee063095a9c36c4619b7238b71cfcb7d61f3
Reviewed-on: http://gerrit.cloudera.org:8080/17290
Reviewed-by: Vihang Karajgaonkar <vihang@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-04-09 00:04:00 +00:00
Vihang Karajgaonkar
a7eae471b8 IMPALA-10613: Standup HMS thrift server in Catalog
This change adds the basic infrastructure to start the HMS server in
Catalog. It introduces a new configuration (--start_hms_server) along
with a config for the port, and starts an HMS thrift server in the
CatalogServiceCatalog instance. Currently, all the HMS APIs are
"pass-through" to the backing HMS service, except for the following 3
HMS APIs, which can be used to request a table and its partitions:

1. get_table_req
2. get_partitions_by_expr
3. get_partitions_by_names

Additionally, there is another flag (--enable_catalogd_hms_cache) which
can be used to disable the usage of catalogd for providing the table
and partition metadata. This contribution was done by Kishen Das.

In the case of get_partitions_by_expr, the hive-exec jar must be
present in the classpath, since it is needed to load the
PartitionExpressionProxy that pushes the partition predicates down to
the HMS database. In the case of get_table_req, if column statistics
are requested, we return the table-level statistics.

Additionally, this patch adds a new configuration,
fallback_to_hms_on_errors, for the catalog, which determines whether
the Catalog falls back to the HMS service in case of errors while
executing the API. This is useful for testing purposes.
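
Putting the pass-through path, the cached get_table_req path, and
fallback_to_hms_on_errors together, the handler shape is roughly as
below (a sketch with hypothetical names, not the actual catalogd HMS
server code):

    // Sketch: serve get_table_req from catalogd when the cache is enabled,
    // and fall back to the backing HMS on errors if the flag allows it.
    public GetTableResult get_table_req(GetTableRequest req) throws TException {
      if (!enableCatalogdHmsCache_) {               // --enable_catalogd_hms_cache=false
        return delegateToHms().get_table_req(req);  // plain pass-through
      }
      try {
        return getTableFromCatalogCache(req);       // hypothetical cached path
      } catch (Exception e) {
        if (fallbackToHmsOnErrors_) {               // --fallback_to_hms_on_errors
          return delegateToHms().get_table_req(req);
        }
        throw new TException(e);
      }
    }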

In order to expose the file-metadata for the tables and partitions,
HMS API changes were made to add the filemetadata fields to table
and partitions. In case of transactional tables, the file-metadata
which is returned is consistent with the provided ValidWriteIdList
in the API call.

There are a few TODOs which will be done in follow-up tasks:
1. Add SASL support.
2. Pin the hive_metastore.thrift in the code so that changes to HMS APIs
in the Hive branch don't break Catalog's HMS service.

Testing:
1. Added a new end-to-end test which starts the HMS service in Catalog and runs
some basic HMS APIs against it.
2. Ran a modification of TestRemoteHiveMetastore in the Hive code base and
confirmed most tests are working. There were some test failures but they are
unrelated since the test assumes an empty warehouse whereas we run against the
actual HMS service running in the mini-cluster.

Change-Id: I1b306f91d63cb5137c178e8e72b6e8b578a907b5
Reviewed-on: http://gerrit.cloudera.org:8080/17244
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Vihang Karajgaonkar <vihang@cloudera.com>
2021-04-08 01:01:22 +00:00