impala

mirror of https://github.com/apache/impala.git synced 2025-12-23 03:44:48 -05:00

Author	SHA1	Message	Date
Riza Suminto	1cead45114	IMPALA-13947: Test local catalog mode by default Local catalog mode has been the default and works well in downstream Impala for over 5 years. This patch turn on local catalog mode by default (--catalog_topic_mode=minimal and --use_local_catalog=true) as preferred mode going forward. Implemented LocalCatalog.setIsReady() to facilitate using local catalog mode for FE tests. Some FE tests fail due to behavior differences in local catalog mode like IMPALA-7539. This is probably OK since Impala now largely hand over FileSystem permission check to Apache Ranger. The following custom cluster tests are pinned to evaluate under legacy catalog mode because their behavior changed in local catalog mode: TestCalcitePlanner.test_calcite_frontend TestCoordinators.test_executor_only_lib_cache TestMetadataReplicas TestTupleCacheCluster TestWorkloadManagementSQLDetailsCalcite.test_tpcds_8_decimal At TestHBaseHmsColumnOrder.test_hbase_hms_column_order, set --use_hms_column_order_for_hbase_tables=true flag for both impalad and catalogd to get consistent column order in either local or legacy catalog mode. Changed TestCatalogRpcErrors.test_register_subscriber_rpc_error assertions to be more fine grained by matching individual query id. Move most of test methods from TestRangerLegacyCatalog to TestRangerLocalCatalog, except for some that do need to run in legacy catalog mode. Also renamed TestRangerLocalCatalog to TestRangerDefaultCatalog. Table ownership issue in local catalog mode remains unresolved (see IMPALA-8937). Testing: Pass exhaustive tests. Change-Id: Ie303e294972d12b98f8354bf6bbc6d0cb920060f Reviewed-on: http://gerrit.cloudera.org:8080/23080 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2025-08-06 21:42:24 +00:00
Riza Suminto	f28a32fbc3	IMPALA-13916: Change BaseTestSuite.default_test_protocol to HS2 This is the final patch to move all Impala e2e and custom cluster tests to use HS2 protocol by default. Only beeswax-specific test remains testing against beeswax protocol by default. We can remove them once Impala officially remove beeswax support. HS2 error message formatting in impala-hs2-server.cc is adjusted a bit to match with formatting in impala-beeswax-server.cc. Move TestWebPageAndCloseSession from webserver/test_web_pages.py to custom_cluster/test_web_pages.py to disable glog log buffering. Testing: - Pass exhaustive tests, except for some known and unrelated flaky tests. Change-Id: I42e9ceccbba1e6853f37e68f106265d163ccae28 Reviewed-on: http://gerrit.cloudera.org:8080/22845 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Jason Fehr <jfehr@cloudera.com>	2025-05-20 14:32:10 +00:00
Riza Suminto	00dc79adf6	IMPALA-13907: Remove reference to create_beeswax_client This patch replaces create_beeswax_client() references with create_hs2_client() or vector-based client creation to prepare for the hs2 test migration. test_session_expiration_with_queued_query is changed to use impala.dbapi directly from Impyla due to a limitation in ImpylaHS2Connection. TestAdmissionControllerRawHS2 is migrated to use hs2 as default test protocol. Modify test_query_expiration.py to set query options through the client instead of a SET query. test_query_expiration is slightly modified due to a behavior difference in hs2 ImpylaHS2Connection. Remove remaining references to BeeswaxConnection.QueryState. Fixed a bug in ImpylaHS2Connection.wait_for_finished_timeout(). Fix some easy flake8 issues caught through this command: git show HEAD --name-only | grep '^tests.*py' | xargs -I {} impala-flake8 {} | grep -e U100 -e E111 -e E301 -e E302 -e E303 -e F... Testing: - Pass exhaustive tests. Change-Id: I1d84251835d458cc87fb8fedfc20ee15aae18d51 Reviewed-on: http://gerrit.cloudera.org:8080/22700 Reviewed-by: Riza Suminto <riza.suminto@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2025-03-29 18:37:45 +00:00
Riza Suminto	e73e2d40da	IMPALA-13864: Implement ImpylaHS2ResultSet.exec_summary This patch implement building exec summary table for ImpylaHS2Connection. It adds fetch_exec_summary argument in ImpalaConnection.execute(). If this argument is True, an exec summary table will be added into the returned result object. fetch_exec_summary is also implemented for BeeswaxConnection. Thus, BeeswaxConnection will not fetch exec summary by default all the time. Tests that validate exec summary table is updated to set fetch_exec_summary=True and migrated to test against hs2 protocol. Change TestExecutorGroup._set_query_options() to do query option setting through hs2_client iconfig instead of SET query. Some flake8 issues are addressed as well. Move build_exec_summary_table to separate exec_summary.py file. Tweak it a bit to return early if given TExecSummary is empty. Fixed bug in ImpalaBeeswaxClient.fetch_results() where fetch will not happen at all if discard_result argument is True. Testing: - Run and pass affected tests locally. Change-Id: I7d88f78e58eeda29ce21e7828884c7a129d7efe6 Reviewed-on: http://gerrit.cloudera.org:8080/22626 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2025-03-24 22:34:20 +00:00
Xuebin Su	242095ac8a	IMPALA-13729: Accept error messages not starting with prompt Previously, error_msg_expected() only accepted error messages starting with the following error prompt: ``` Query <query_id> failed:\n ``` However, for some tests using the Beeswax protocol, the error prompt may appear in the middle of the error message instead of at its beginning. Therefore, this patch adapts error_msg_expected() to accept error messages not starting with the error prompt. The error_msg_expected() function is renamed to error_msg_startswith() to better describe its behavior. Change-Id: Iac3e68bcc36776f7fd6cc9c838dd8da9c3ecf58b Reviewed-on: http://gerrit.cloudera.org:8080/22468 Reviewed-by: Daniel Becker <daniel.becker@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>	2025-02-26 15:29:36 +00:00
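A minimal sketch of the matching idea described in IMPALA-13729: accept an expected error text when it directly follows the "Query <query_id> failed:\n" prompt, wherever that prompt sits in the message. The function name prompt_then_text and the query id pattern (two hex strings joined by ':') are assumptions made for illustration; this is not the helper in the Impala test utilities.

```python
import re

# Prompt shape taken from the commit message: "Query <query_id> failed:\n".
# The query id pattern below is an assumption for this sketch.
PROMPT_RE = re.compile(r"Query (?P<query_id>[0-9a-f]+:[0-9a-f]+) failed:\n")


def prompt_then_text(actual_msg, expected_text, query_id=None):
    """Return True if any prompt in actual_msg is directly followed by expected_text.

    Hypothetical helper: the prompt may appear at the start of the message
    or in the middle of it. If query_id is given, only prompts carrying
    that id are considered.
    """
    for match in PROMPT_RE.finditer(actual_msg):
        if query_id is not None and match.group("query_id") != query_id:
            continue
        # Compare the text that starts right after the prompt's newline.
        if actual_msg.startswith(expected_text, match.end()):
            return True
    return False
```

Scanning every prompt occurrence, rather than only checking the start of the string, is what lets the check tolerate a wrapper prefix in front of the prompt.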
Riza Suminto	3005092332	IMPALA-13668: Add default_test_protocol parameter to py.test ImpalaTestSuite.client is always initialized as a beeswax client, and many tests use it directly rather than going through a helper method such as execute_query(). This patch adds a default_test_protocol parameter to conftest.py. It controls whether ImpalaTestSuite.client is initialized to 'beeswax_client', 'hs2_client', or 'hs2_http_client'. This parameter still defaults to 'beeswax'. This patch also adds the helper methods 'default_client_protocol_dimension', 'beeswax_client_protocol_dimension' and 'hs2_client_protocol_dimension' for convenience and traceability. Reduced occurrences where a test method manually overrides ImpalaTestSuite.client. They are replaced by a combination of ImpalaTestSuite.create_impala_clients and ImpalaTestSuite.close_impala_clients. Testing: - Pass core tests. Change-Id: I9165ea220b2c83ca36d6e68ef3b88b128310af23 Reviewed-on: http://gerrit.cloudera.org:8080/22336 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2025-01-24 12:19:02 +00:00
Xuebin Su	ad868b9947	IMPALA-13115: Add query id to error messages This patch adds the query id to the error messages in both - the result of the `get_log()` RPC, and - the error message in an RPC response before they are returned to the client, so that the users can easily figure out the errored queries on the client side. To achieve this, the query id of the thread debug info is set in the RPC handler method, and is retrieved from the thread debug info each time the error reporting function or `get_log()` gets called. Due to the change of the error message format, some checks in the impala-shell.py are adapted to keep them valid. Testing: - Added helper function `error_msg_expected()` to check whether an error message is expected. It is stricter than only using the `in` operator. - Added helper function `error_msg_equal()` to check if two error messages are equal regardless of the query ids. - Various test cases are adapted to match the new error message format. - `ImpalaBeeswaxException`, which is used in tests only, is simplified so that it has the same error message format as the exceptions for HS2. - Added an assertion to the case of killing and restarting a worker in the custom cluster test to ensure that the query id is in the error message in the client log retrieved with `get_log()`. Change-Id: I67e659681e36162cad1d9684189106f8eedbf092 Reviewed-on: http://gerrit.cloudera.org:8080/21587 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2024-08-08 14:11:04 +00:00
Daniel Becker	9071030f7f	IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator On clusters with dedicated coordinators and executors the Iceberg metadata scanner fragment(s) can be scheduled to executors, for example during a join. The fragment in this case will fail a precondition check, because either the 'frontend_' object or the table will not be present. This change forces Iceberg metadata scanner fragments to be scheduled on the coordinator. It is not enough to set the DataPartition type to UNPARTITIONED, because unpartitioned fragments can still be scheduled on executors. This change introduces a new flag in the TPlanFragment thrift struct - if it is true, the fragment is always scheduled on the coordinator. Testing: - Added a regression test in test_coordinators.py. - Added a new planner test with two metadata tables and a regular table joined together. Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e Reviewed-on: http://gerrit.cloudera.org:8080/21138 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2024-03-29 04:40:31 +00:00
Joe McDonnell	82bd087fb1	IMPALA-11973: Add absolute_import, division to all eligible Python files This takes steps to make Python 2 behave like Python 3 as a way to flush out issues with running on Python 3. Specifically, it handles two main differences: 1. Python 3 requires absolute imports within packages. This can be emulated via "from __future__ import absolute_import" 2. Python 3 changed division to "true" division that doesn't round to an integer. This can be emulated via "from __future__ import division" This changes all Python files to add imports for absolute_import and division. For completeness, this also includes print_function in the import. I scrutinized each old-division location and converted some locations to use the integer division '//' operator if it needed an integer result (e.g. for indices, counts of records, etc). Some code was also using relative imports and needed to be adjusted to handle absolute_import. This fixes all Pylint warnings about no-absolute-import and old-division, and these warnings are now banned. Testing: - Ran core tests Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b Reviewed-on: http://gerrit.cloudera.org:8080/19588 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-03-09 17:17:57 +00:00
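The two division behaviors named in IMPALA-11973 can be illustrated with a short sketch. The function names are hypothetical and not taken from the Impala tree; the point is only that '/' yields a float under the __future__ import, while '//' keeps an integer where one is needed.

```python
from __future__ import absolute_import, division, print_function


def midpoint_index(num_rows):
    # Indices and record counts need an integer, so use floor division.
    return num_rows // 2


def average(total, count):
    # With "from __future__ import division" (and on Python 3),
    # '/' is true division and does not truncate to an integer.
    return total / count


print(midpoint_index(7), average(7, 2))
```

The same file behaves identically on Python 2 and Python 3, which is the reason for adding the imports before switching interpreters.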
Joe McDonnell	c71de994b0	IMPALA-11952 (part 1): Fix except syntax Python 3 does not support this old except syntax: except Exception, e: Instead, it needs to be: except Exception as e: This uses impala-futurize to fix all locations of the old syntax. Testing: - The check-python-syntax.sh no longer shows errors for except syntax. Change-Id: I1737281a61fa159c8d91b7d4eea593177c0bd6c9 Reviewed-on: http://gerrit.cloudera.org:8080/19551 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Michael Smith <michael.smith@cloudera.com>	2023-02-28 17:11:50 +00:00
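For reference, the accepted form from IMPALA-11952 in a small self-contained function. parse_int is a made-up example, not code from the patch.

```python
from __future__ import print_function


def parse_int(text, default=0):
    """Parse text as an int, returning default when it is not a number."""
    try:
        return int(text)
    # "except ValueError, e:" is a syntax error on Python 3;
    # the "as" form below works on Python 2.6+ and Python 3.
    except ValueError as e:
        print("falling back to default: {0}".format(e))
        return default
```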
Michael Smith	bbb0b4939d	IMPALA-11476: Support Ozone erasure coding Adds support for identifying erasure coding policy with Ozone. Enables testing Ozone with erasure coding. Omits support for identifying erasure coding policy with the o3fs protocol as that protocol is effectively deprecated and its classes don't provide access to the ObjectStore. Refactors volumeBucketPair to use StringTokenizer. Test updates: - test_exclusive_coordinator_plan: Ozone+EC blocks are 768MB, which is larger than all tables in our test environment. Use tpch_parquet which we rely on having 3 files (by loading from snapshot in this case). - test_new_file_shorter: receives an EOFException when seeking with EC - test_local_read: erasure-coded-bytes-read is also tied to IMPALA-11697 - test_erasure_coding: Ozone doesn't report files as erasure-coded (HDDS-7603) Testing: - Passes core E2E and custom cluster tests with TARGET_FILESYSTEM=ozone and ERASURE_CODING=true. Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1 Reviewed-on: http://gerrit.cloudera.org:8080/19324 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2023-01-25 18:18:28 +00:00
Michael Smith	1eb0510eaa	IMPALA-11456: Collapse filesystem Skip logic Combines all SkipIf* classes for different filesystems into a single SkipIfFS class. Many cases are simplified to 'not IS_HDFS', with the rest as filesystem-specific special cases. The 'jira' option is removed in favor of specific flags for each issue. Change-Id: Ib928a6274baaaec45614887b9e762346a25812a1 Reviewed-on: http://gerrit.cloudera.org:8080/18781 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2022-08-10 22:37:08 +00:00
Michael Smith	830625b104	IMPALA-9442: Add Ozone to minicluster Adds Ozone as an alternative to hdfs in the minicluster. Select by setting `export TARGET_FILESYSTEM=ozone`. With that flag, run-mini-dfs.sh will start Ozone instead of HDFS. Requires a snapshot because Ozone does not support HBase (HDDS-3589); snapshot loading doesn't work yet primarily due to HDDS-5502. Uses the o3fs interface because Ozone puts specific restrictions on bucket names (no underscores, for instance), and it was a lot easier to use an interface where everything is written to a single bucket than to update all Impala's use of HDFS-style paths to make `test-warehouse` a bucket inside a volume. Specifies reduced Ozone client retries during shutdown where Ozone may not be available. Passes tests with FE_TEST=false BE_TEST=false. Change-Id: Ibf8b0f7b2d685d8b011df1926e12bf5434b5a2be Reviewed-on: http://gerrit.cloudera.org:8080/18738 Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>	2022-08-03 16:58:20 +00:00
Qifan Chen	07a3e6e0df	IMPALA-10992 Planner changes for estimate peak memory This patch provides replan support for multiple executor group sets. Each executor group set is associated with a distinct number of nodes and a threshold for estimated memory per host in bytes that can be denoted as [<group_name_prefix>:<#nodes>, <threshold>]. In the patch, a query of type EXPLAIN, QUERY or DML can be compiled more than once. In each attempt, per host memory is estimated and compared with the threshold of an executor group set. If the estimated memory is no more than the threshold, the iteration process terminates and the final plan is determined. The executor group set with the threshold is selected to run the query. A new query option 'enable_replan', default to 1 (enabled), is added. It can be set to 0 to disable this patch and to generate the distributed plan for the default executor group. To avoid long compilation time, the following enhancement is enabled. Note 1) can be disabled when relevant meta-data change is detected. 1. Authorization is performed only for the 1st compilation; 2. openTransaction() is called for transactional queries in 1st compilation and the saved transactional info is used in subsequent compilations. Similar logic is applied to Kudu transactional queries. To facilitate testing, the patch imposes an artificial two executor group setup in FE as follows. 1. [regular:<#nodes>, 64MB] 2. [large:<#nodes>, 8PB] This setup is enabled when a new query option 'test_replan' is set to 1 in backend tests, or RuntimeEnv.INSTANCE.isTestEnv() is true as in most frontend tests. This query option is set to 0 by default. Compilation time increases when a query is compiled in several iterations, as shown below for several TPCDs queries. The increase is mostly due to redundant work in either single node plan creation or recomputing value transfer graph phase. 
For small queries, the increase can be avoided if they can be compiled in single iteration by properly setting the smallest threshold among all executor group sets. For example, for the set of queries listed below, the smallest threshold can be set to 320MB to catch both q15 and q21 in one compilation.

                            Compilation time (ms)
Queries  Estimated Memory  2-iterations  1-iteration  Percentage of increase
q1       408MB             60.14         25.75        133.56%
q11      1.37GB            261.00        109.61       138.11%
q10a     519MB             139.24        54.52        155.39%
q13      339MB             143.82        60.08        139.38%
q14a     3.56GB            762.68        312.92       143.73%
q14b     2.20GB            522.01        245.13       112.95%
q15      314MB             9.73          4.28         127.33%
q21      275MB             16.00         8.18         95.59%
q23a     1.50GB            461.69        231.78       99.19%
q23b     1.34GB            461.31        219.61       110.05%
q4       2.60GB            218.05        105.07       107.52%
q67      5.16GB            694.59        334.24       101.82%

Testing: 1. Almost all FE and BE tests are now run in the artificial two executor setup except a few where a specific cluster configuration is desirable; 2. Ran core tests successfully; 3. Added a new observability test and a new query assignment test; 4. Disabled concurrent insert test (test_concurrent_inserts) and failing inserts (test_failing_inserts) test in local catalog mode due to flakiness. Reported both in IMPALA-11189 and IMPALA-11191. Change-Id: I75cf17290be2c64fd4b732a5505bdac31869712a Reviewed-on: http://gerrit.cloudera.org:8080/18178 Reviewed-by: Qifan Chen <qchen@cloudera.com> Tested-by: Qifan Chen <qchen@cloudera.com>	2022-03-21 20:17:28 +00:00
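The replan loop described in IMPALA-10992 can be sketched roughly as follows. This is an illustration under stated assumptions, not the planner code: ExecutorGroupSet, choose_group_set and the estimate callback are hypothetical names, and trying the group sets in increasing threshold order is an assumption inferred from the remark about tuning the smallest threshold.

```python
from collections import namedtuple

# Hypothetical model of [<group_name_prefix>:<#nodes>, <threshold>].
ExecutorGroupSet = namedtuple(
    "ExecutorGroupSet", ["name_prefix", "num_nodes", "mem_threshold_bytes"])

MB = 1024 ** 2
PB = 1024 ** 5


def choose_group_set(group_sets, estimate_per_host_mem):
    """Return (chosen_group_set, number_of_compilations).

    estimate_per_host_mem(group_set) stands in for one compilation pass
    and returns the estimated per-host memory in bytes.
    """
    # Assumption: smaller thresholds are tried first.
    ordered = sorted(group_sets, key=lambda g: g.mem_threshold_bytes)
    for attempts, group_set in enumerate(ordered, start=1):
        estimate = estimate_per_host_mem(group_set)
        # Stop as soon as the estimate is no more than the threshold.
        if estimate <= group_set.mem_threshold_bytes:
            return group_set, attempts
    raise ValueError("estimated memory exceeds every threshold")


# The artificial two-group test setup from the commit message.
TEST_GROUP_SETS = [
    ExecutorGroupSet("regular", 3, 64 * MB),
    ExecutorGroupSet("large", 3, 8 * PB),
]
```

With this setup, a query estimated at 408MB (q1 in the table) fails the 64MB check and is compiled a second time for the large group, which is the two-iteration cost the table measures.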
Fucun Chu	157086cb80	IMPALA-10771: Add Tencent COS support This patch adds support for COS(Cloud Object Storage). Using the hadoop-cos, the implementation is similar to other remote FileSystems. New flags for COS: - num_cos_io_threads: Number of COS I/O threads. Defaults to be 16. Follow-up: - Support for caching COS file handles will be addressed in IMPALA-10772. - test_concurrent_inserts and test_failing_inserts in test_acid_stress.py are skipped due to slow file listing on COS (IMPALA-10773). Tests: - Upload hdfs test data to a COS bucket. Modify all locations in HMS DB to point to the COS bucket. Remove some hdfs caching params. Run CORE tests. Change-Id: Idce135a7591d1b4c74425e365525be3086a39821 Reviewed-on: http://gerrit.cloudera.org:8080/17503 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-12-08 16:32:02 +00:00
stiga-huang	2dfc68d852	IMPALA-7712: Support Google Cloud Storage This patch adds support for GCS(Google Cloud Storage). Using the gcs-connector, the implementation is similar to other remote FileSystems. New flags for GCS: - num_gcs_io_threads: Number of GCS I/O threads. Defaults to be 16. Follow-up: - Support for spilling to GCS will be addressed in IMPALA-10561. - Support for caching GCS file handles will be addressed in IMPALA-10568. - test_concurrent_inserts and test_failing_inserts in test_acid_stress.py are skipped due to slow file listing on GCS (IMPALA-10562). - Some tests are skipped due to issues introduced by /etc/hosts setting on GCE instances (IMPALA-10563). Tests: - Compile and create hdfs test data on a GCE instance. Upload test data to a GCS bucket. Modify all locations in HMS DB to point to the GCS bucket. Remove some hdfs caching params. Run CORE tests. - Compile and load snapshot data to a GCS bucket. Run CORE tests. Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b Reviewed-on: http://gerrit.cloudera.org:8080/17121 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-03-13 11:20:08 +00:00
Joe McDonnell	97856478ec	IMPALA-10198 (part 1): Unify Java in a single java/ directory This changes all existing Java code to be submodules under a single root pom. The root pom is impala-parent/pom.xml with minor changes to add submodules. This avoids most of the weird CMake/maven interactions, because there is now a single maven invocation for all the Java code. This moves all the Java projects other than fe into a top level java directory. fe is left where it is to avoid disruption (but still is compiled via the java directory's root pom). Various pieces of code that reference the old locations are updated. Based on research, there are two options for dealing with the shaded dependencies. The first is to have an entirely separate Maven project with a separate Maven invocation. In this case, the consumers of the shaded jars will see the reduced set of transitive dependencies. The second is to have the shaded dependencies as modules with a single Maven invocation. The consumer would see all of the original transitive dependencies and need to exclude them all. See MSHADE-206/MNG-5899. This chooses the second. This only moves code around and does not focus on version numbers or making "mvn versions:set" work. Testing: - Ran a core job - Verified existing maven commands from fe/ directory still work - Compared the *-classpath.txt files from fe and executor-deps and verified they are the same except for paths Change-Id: I08773f4f9d7cb269b0491080078d6e6f490d8d7a Reviewed-on: http://gerrit.cloudera.org:8080/16500 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2020-10-15 19:30:13 +00:00
Bikramjeet Vig	004e3c897e	IMPALA-8830: Fix executor group assignment of coordinator only queries With this fix, coordinator only queries are submitted to a pseudo executor group named "empty group (using coordinator only)" which is empty. This allows running coordinator only queries regardless of the presence of any healthy executor groups. Testing: Added a custom cluster test and modified tests that relied on coordinator only queries to be queued in absence of executor groups. Change-Id: I8fe098032744aa20bbbe4faddfc67e7a46ce03d5 Reviewed-on: http://gerrit.cloudera.org:8080/14183 Reviewed-by: Bikramjeet Vig <bikramjeet.vig@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-06-19 01:35:29 +00:00
Csaba Ringhofer	58273fff60	IMPALA-9609: Minimize Frontend activity in executor only impalads Until now the Frontend started fully regardless of flag is_coordinator, e.g. created connections to the HMS, which is both error prone and can DoS the metastore. (note that even coordinators started to connect to HMS only in the recent past, related to local catalog mode and ACID transactions) Executor only impalads still need a JVM as queries can contain java calls (HDFS/Hbase API calls, Hive UDFs), but most of the JNI API provided by JniFrontend shouldn't be called by executors. It seems that the whole Frontend object is needed only by coordinators. Testing: - generally executor only mode doesn't seem to be well covered - ran test_coordinators.py which has some tests with executor only impalads - added new test for HBase tables (Hive UDFs and HDFS were already covered) Change-Id: I4627e5e3520175153cb49e24fd480815dfefdae1 Reviewed-on: http://gerrit.cloudera.org:8080/15793 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2020-04-24 21:43:53 +00:00
Lars Volker	69a9ac102d	IMPALA-9151: Maintain cluster size in ExecutorMembershipSnapshot This change improves the cluster membership snapshot we maintain in the frontend in cases where all executors have been shut down or none have started yet. Prior to this change when configuring Impala with executor groups, the planner might see a ExecutorMembershipSnapshot that has no executors in it. This could happen if the first executor group had not started up yet, or if all executor groups had been shutdown. If this happened, the planner would make sub-optimal decisions, e.g. decide on a broadcast join vs a partitioned hash join. With this change if no executors have been registered so far, the planner will use the expected number of executors which can be set using the -num_expected_executors flag and is 20 by default. After executors come online, the planner will use the size of the largest healthy executor group, and it will hold on to the group's size even if it shuts down or becomes unhealthy. This allows the planner to work on the assumption that a healthy executor group of the same size will eventually come online to execute the query. Change-Id: Ib6b05326c82fb3ca625c015cfcdc38f891f5d4f9 Reviewed-on: http://gerrit.cloudera.org:8080/14756 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-12-04 00:40:04 +00:00
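A rough sketch of the fallback rule in IMPALA-9151, assuming a simplified model in which the planner is only told the sizes of the currently healthy executor groups. ExecutorCountTracker is a hypothetical name; the real logic lives in ExecutorMembershipSnapshot and may differ in detail.

```python
class ExecutorCountTracker(object):
    """Tracks the executor count a planner could assume (illustrative only)."""

    def __init__(self, num_expected_executors=20):
        # 20 mirrors the documented default of -num_expected_executors.
        self._num_expected = num_expected_executors
        self._last_known_size = None

    def update(self, healthy_group_sizes):
        # Remember the largest healthy group; keep the previous value
        # when no healthy group is currently online.
        if healthy_group_sizes:
            self._last_known_size = max(healthy_group_sizes)

    def num_executors(self):
        if self._last_known_size is None:
            # No executor has registered yet: use the expected number.
            return self._num_expected
        return self._last_known_size
```

Holding on to the last known size means plans made during a temporary outage still assume a cluster of realistic size.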
Sahil Takiar	ac87278b16	IMPALA-8950: Add -d, -f options to hdfs copyFromLocal, put, cp Add the -d option and -f option to the following commands: `hdfs dfs -copyFromLocal <localsrc> URI` `hdfs dfs -put [ - | <localsrc1> .. ]. <dst>` `hdfs dfs -cp URI [URI ...] <dest>` The -d option "Skip[s] creation of temporary file with the suffix ._COPYING_." which improves performance of these commands on S3 since S3 does not support metadata only renames. The -f option "Overwrites the destination if it already exists" combined with HADOOP-13884 this improves issues seen with S3 consistency issues by avoiding a HEAD request to check if the destination file exists or not. Added the method 'copy_from_local' to the BaseFilesystem class. Re-factored most usages of the aforementioned HDFS commands to use the filesystem_client. Some usages were not appropriate / worth refactoring, so occasionally this patch just adds the '-d' and '-f' options explicitly. All calls to '-put' were replaced with 'copyFromLocal' because they both copy files from the local fs to a HDFS compatible target fs. Since WebHDFS does not have good support for copying files, this patch removes the copy functionality from the PyWebHdfsClientWithChmod. Re-factored the hdfs_client so that it uses a DelegatingHdfsClient that delegates to either the HadoopFsCommandLineClient or PyWebHdfsClientWithChmod. Testing: * Ran core tests on HDFS and S3 Change-Id: I0d45db1c00554e6fb6bcc0b552596d86d4e30144 Reviewed-on: http://gerrit.cloudera.org:8080/14311 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-10-05 00:04:08 +00:00
Lars Volker	576f205bff	IMPALA-8936: Improve queuing reason for unhealthy executor groups In some situations, users might actually expect not having a healthy executor group around, e.g. when they're starting one and it takes a while to come online. This change makes the queuing reason more generic and drops the "unhealthy" concept from it to reduce confusion. Change-Id: Idceab7fb56335bab9d787b0f351a41e6efd7dd59 Reviewed-on: http://gerrit.cloudera.org:8080/14210 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-09-11 08:04:05 +00:00
Lars Volker	2397ae5590	IMPALA-8484: Run queries on disjoint executor groups This change adds support for running queries inside a single admission control pool on one of several, disjoint sets of executors called "executor groups". Executors can be configured with an executor group through the newly added '--executor_groups' flag. Note that in anticipation of future changes, the flag already uses the plural form, but only a single executor group may be specified for now. Each executor group specification can optionally contain a minimum size, separated by a ':', e.g. --executor_groups default-pool-1:3. Only when the cluster membership contains at least that number of executors for the groups will it be considered for admission. Executor groups are mapped to resource pools by their name: An executor group can service queries from a resource pool if the pool name is a prefix of the group name separated by a '-'. For example, queries in pool poolA can be serviced by executor groups named poolA-1 and poolA-2, but not by groups named foo or poolB-1. During scheduling, executor groups are considered in alphabetical order. This means that one group is filled up entirely before a subsequent group is considered for admission. Groups also need to pass a health check before being considered. In particular, they must contain at least the minimum number of executors specified. If no group is specified during startup, executors are added to the default executor group. If - during admission - no executor group for a pool can be found and the default group is non-empty, then the default group is considered. The default group does not have a minimum size. This change inverts the order of scheduling and admission. Prior to this change, queries were scheduled before submitting them to the admission controller. Now the admission controller computes schedules for all candidate executor groups before each admission attempt. 
If the cluster membership has not changed, then the schedules of the previous attempt will be reused. This means that queries will no longer fail if the cluster membership changes while they are queued in the admission controller. This change also alters the default behavior when using a dedicated coordinator and no executors have registered yet. Prior to this change, a query would fail immediately with an error ("No executors registered in group"). Now a query will get queued and wait until executors show up, or it times out after the pool's queue timeout period. Testing: This change adds a new custom cluster test for executor groups. It makes use of new capabilities added to start-impala-cluster.py to bring up additional executors into an already running cluster. Additionally, this change adds an instructional implementation of executor group based autoscaling, which can be used during development. It also adds a helper to run queries concurrently. Both are used in a new test to exercise the executor group logic and to prevent regressions to these tools. In addition to these tests, the existing tests for the admission controller (both BE and EE tests) thoroughly exercise the changed code. Some of them required changes themselves to reflect the new behavior. I looped the new tests (test_executor_groups and test_auto_scaling) for a night (110 iterations each) without any issues. I also started an autoscaling cluster with a single group and ran TPC-DS, TPC-H, and test_queries on it successfully. Known limitations: When using executor groups, only a single coordinator and a single AC pool (i.e. the default pool) are supported. Executors do not include the number of currently running queries in their statestore updates and so admission controllers are not aware of the number of queries admitted by other controllers per host. 
Change-Id: I8a1d0900f2a82bd2fc0a906cc094e442cffa189b Reviewed-on: http://gerrit.cloudera.org:8080/13550 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-07-21 04:54:03 +00:00
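The naming and health rules from IMPALA-8484 can be sketched as below. All function names are hypothetical, the groups dictionary is a simplified stand-in for cluster membership, and treating a missing minimum size as 1 is an assumption (the commit message only says the minimum size is optional).

```python
def parse_group_spec(spec):
    """Split a spec such as 'default-pool-1:3' into (name, min_size)."""
    name, sep, min_size = spec.partition(":")
    # Assumption: a spec without ':<n>' has a minimum size of 1.
    return name, int(min_size) if sep else 1


def group_serves_pool(group_name, pool_name):
    # The pool name must be a prefix of the group name, followed by '-'.
    return group_name.startswith(pool_name + "-")


def candidate_groups(pool_name, groups):
    """Return names of healthy groups for pool_name in alphabetical order.

    groups maps group name -> (min_size, current_size). A group counts
    as healthy here when it has at least min_size executors.
    """
    return sorted(
        name for name, (min_size, size) in groups.items()
        if group_serves_pool(name, pool_name) and size >= min_size)
```

Requiring the '-' separator, rather than a bare prefix match, keeps a pool named poolA from matching a group that belongs to a pool named poolAB.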
Lars Volker	2dbd7eec81	IMPALA-8758: Improve error message when no executors are online Prior to this change a dedicated coordinator would not create the default executor group when registering its own backend descriptor in the cluster membership. This caused a misleading error message during scheduling when the default executor group could not be found. To improve this, we now always create the default executor group and return an improved error message if it is empty. This change adds a test that validates that a query against a cluster without executors returns the expected error. Change-Id: Ia4428ef833363f52b14dfff253569212427a8e2f Reviewed-on: http://gerrit.cloudera.org:8080/13866 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-07-16 11:11:43 +00:00
Lars Volker	6b3e5fe426	IMPALA-8460: Simplify cluster membership management This change adds a class to track cluster membership called ClusterMembershipMgr. It replaces the logic that was partially duplicated between the ImpalaServer and the Coordinator and makes sure that the local backend descriptor is consistent (IMPALA-8469). The ClusterMembershipMgr maintains a view of the cluster membership and incorporates incoming updates from the statestore. It also registers the local backend with the statestore after startup. Clients can obtain a consistent, immutable snapshot of the current cluster membership from the ClusterMembershipMgr. Additionally, callbacks can be registered to receive notifications of cluster membership changes. The ImpalaServer and Frontend use this mechanism. This change also generalizes the fix for IMPALA-7665: updates from the statestore to the cluster membership topic are only made visible to the rest of the local server after a post-recovery grace period has elapsed. As part of this the flag 'failed_backends_query_cancellation_grace_period_ms' is replaced with 'statestore_subscriber_recovery_grace_period_ms'. To tell the initial startup from post-recovery, a new metric 'statestore-subscriber.num-connection-failures' is exposed by the daemon, which tracks the total number of connection failures to the statestore over the process lifetime. This change also unifies the naming of executor-related classes, in particular it renames "BackendConfig" to "ExecutorGroup". In anticipation of a subsequent change (IMPALA-8484), it adds maps to store multiple executor groups. This change also disables the generation of default operators from the thrift files and instead adds explicit implementations for the ones that we rely on. This forces us to explicitly specify comparators when manipulating containers of thrift structs and will help prevent accidental bugs. Testing: This change adds a backend unit test for the new cluster membership manager. 
The observable behavior of Impala does not change, and the existing scheduler unit test and end to end tests should make sure of that. Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3 Reviewed-on: http://gerrit.cloudera.org:8080/13207 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-06-02 06:38:07 +00:00
Tim Armstrong	d05f73f415	IMPALA-7647: Add HS2/Impyla dimension to TestQueries I used some ideas from Alex Leblang's abandoned patch: https://gerrit.cloudera.org/#/c/137/ in order to run .test files through HS2. The advantage of using Impyla is that much of the code will be reusable for any Python client implementing the standard Python dbapi and does not require us implementing yet another thrift client. This gives us better coverage of non-trivial result sets from HS2, including handling of NULLs, error logs and more interesting result sets than the basic HS2 tests. I added HS2 coverage to TestQueries, which has a reasonable variety of queries and covers the data types in alltypes. I also added TestDecimalQueries, TestStringQuery and TestCharFormats to get coverage of DECIMAL, CHAR and VARCHAR that aren't in alltypes. Coverage of result sets with NULLs was limited so I added a couple of queries. Places where results differ from Beeswax: * Impyla is a Python dbapi client so must convert timestamps into Python datetime objects, which only have microsecond precision. Therefore result timestamps with nanosecond precision are truncated. * The HS2 interface reports the NULL type as BOOLEAN as a workaround for IMPALA-914. * The Beeswax interface reported VARCHAR as STRING, but HS2 reports VARCHAR. I dealt with different results by adding additional result sections so that the expected differences between the clients/protocols were explicit. Limitations: * Not all of the same methods are implemented as for beeswax, so some tests that have more complicated interactions with the client will not work with HS2 yet. * We don't have a way to get the affected row count for inserts. I also simplified the ImpalaConnection API by removing some unnecessary methods and moved some generic methods to the base class. Testing: * Confirmed that it detected IMPALA-7588 by re-applying the buggy patch. * Ran exhaustive and CentOS6 tests. 
Change-Id: I9908ccc4d3df50365be8043b883cacafca52661e Reviewed-on: http://gerrit.cloudera.org:8080/11546 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-10-09 00:45:10 +00:00
poojanilangekar	880011fa1f	IMPALA-6031: Fix executor node count in distributed plans Prior to this change, the planner also considered coordinator-only nodes as executors while estimating the number of scan nodes to be used in the distributed plan. This change ensures that only executor nodes are considered for that estimation. Testing: Added a new custom cluster test to verify the same. Change-Id: I44af6b40099a495e13a0a5dc72c491d486d23aa8 Reviewed-on: http://gerrit.cloudera.org:8080/10873 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2018-07-07 03:33:08 +00:00
Vuk Ercegovac	2894884deb	IMPALA-6670: refresh lib-cache entries from plan When an impalad is in executor-only mode, it receives no catalog updates. As a result, lib-cache entries are never refreshed. A consequence is that UDF queries can return incorrect results or may not run due to resolution issues. Both cases are caused by the executor using a stale copy of the lib file. For incorrect results, an old version of the method may be used. Resolution issues can come up if a method is added to a lib file. The solution in this change is to capture the coordinator's view of the lib file's last modified time when planning. This last modified time is then shipped with the plan to executors. Executors must then use both the lib file path and the last modified time as a key for the lib-cache. If the coordinator's last modified time is more recent than the executor's lib-cache entry, then the entry is refreshed. Brief discussion of alternatives: - lib-cache always checks last modified time + easy/local change to lib-cache - adds an fs lookup always. rejected for this reason - keep the last modified time in the catalog - bound on staleness is too loose. consider the case where fn's f1, f2, f3 are created with last modified times of t1, t2, t3. treat the fn's last modified time as a low-watermark; if the cache entry has a more recent time, use it. Such a scheme would allow the version at t2 to persist. An old fn may keep the state from converging to the latest. This could end up with strange cases where different versions of the lib are used across executors for a single query. In contrast, the change in this patch relies on the statestore to push versions forward at all coordinators, so will push all versions at all caches forward as well. 
Testing: - added an e2e custom cluster test Change-Id: Icf740ea8c6a47e671427d30b4d139cb8507b7ff6 Reviewed-on: http://gerrit.cloudera.org:8080/9697 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Impala Public Jenkins	2018-03-24 04:38:53 +00:00
Vuk Ercegovac	6a2b7a64fb	IMPALA-4704: Turns on client connections when local catalog initialized. Currently, impalad starts beeswax and hs2 servers even if the catalog has not yet been initialized. As a result, client connections see an error message stating that the impalad is not yet ready. This patch changes the impalad startup sequence to wait until the catalog is received before opening beeswax and hs2 ports and starting their servers. Testing: - python e2e tests that start a cluster without a catalog and check that client connections are rejected as expected. Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Reviewed-on: http://gerrit.cloudera.org:8080/8202 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Impala Public Jenkins	2017-11-13 21:14:14 +00:00
Matthew Jacobs	6c12546561	IMPALA-4833: Compute precise per-host reservation size Before this change, the per-host reservation size was computed by the Planner. However, scheduling happens after planning, so the Planner must assume that all fragments run on all hosts, and the reservation size is likely much larger than it needs to be. This moves the computation of the per-host reservation size to the BE where it can be computed more precisely. This also includes a number of plan/profile changes. Change-Id: Idbcd1e9b1be14edc4017b4907e83f9d56059fbac Reviewed-on: http://gerrit.cloudera.org:8080/7630 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: Impala Public Jenkins	2017-08-12 08:10:07 +00:00
Dimitris Tsirogiannis	e2c53a8bdf	IMPALA-5147: Add the ability to exclude hosts from query execution This commit introduces a new startup option, termed 'is_executor', that determines whether an impalad process can execute query fragments. The 'is_executor' option determines if a specific host will be included in the scheduler's backend configuration and hence included in scheduling decisions. Testing: - Added a custom cluster test. - Added a new scheduler test. Change-Id: I5d2ff7f341c9d2b0649e4d14561077e166ad7c4d Reviewed-on: http://gerrit.cloudera.org:8080/6628 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: Impala Public Jenkins	2017-04-26 01:45:40 +00:00
Dimitris Tsirogiannis	296df3c826	IMPALA-4041: Limit catalog and admission control updates to coordinators With this commit we add the ability to restrict catalog updates to a limited set of coordinator nodes. A new startup option, termed 'is_coordinator' is added to indicate if a node is a coordinator. Coordinators accept connections through HS2 and Beeswax interfaces and can also participate in query execution. Non-coordinator nodes do not receive catalog updates from the statestore, do not initialize a query scheduler and cannot accept Beeswax and HS2 client connections. Testing: - Added a custom cluster test that launches a cluster in which the number of coordinators is less than the cluster size and runs a number of smoke queries. - Successfully ran exhaustive tests. Change-Id: I5f2c74abdbcd60ac050efa323616bd41182ceff3 Reviewed-on: http://gerrit.cloudera.org:8080/6344 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: Impala Public Jenkins	2017-03-28 22:27:25 +00:00

32 Commits
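
Note on IMPALA-6670 (2894884deb): the refresh rule described in that commit message can be illustrated with a minimal Python sketch. This is not Impala's lib-cache code (the real implementation lives in the C++ backend); the names LibCache, CacheEntry, fetch and coordinator_mtime are hypothetical and exist only for illustration. The sketch assumes the coordinator ships the library's last modified time with the plan, and the executor reloads its copy only when that time is newer than the one recorded in its cache entry.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CacheEntry:
    mtime: int      # last modified time recorded when this copy was loaded
    payload: bytes  # stand-in for the loaded library contents


class LibCache:
    """Hypothetical executor-side cache, keyed by library file path."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                      # assumed loader, e.g. a filesystem read
        self._entries: Dict[str, CacheEntry] = {}
        self.refreshes = 0                       # counts loads, for illustration only

    def get(self, path: str, coordinator_mtime: int) -> bytes:
        entry = self._entries.get(path)
        # Load on a miss, or when the coordinator has seen a newer version
        # of the file than the copy cached here.
        if entry is None or coordinator_mtime > entry.mtime:
            entry = CacheEntry(coordinator_mtime, self._fetch(path))
            self._entries[path] = entry
            self.refreshes += 1
        return entry.payload
```

Because the comparison uses the time shipped with the plan, no filesystem lookup is needed on a cache hit, which is the reason the commit message gives for rejecting the "always check last modified time" alternative.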