impala

mirror of https://github.com/apache/impala.git synced 2025-12-22 03:18:15 -05:00

Author	SHA1	Message	Date
Joe McDonnell	1913ab46ed	IMPALA-14501: Migrate most scripts from impala-python to impala-python3 To remove the dependency on Python 2, existing scripts need to use python3 rather than python. These commands find those locations (for impala-python and regular python): git grep impala-python \| grep -v impala-python3 \| grep -v impala-python-common \| grep -v init-impala-python git grep bin/python \| grep -v python3 This removes or switches most of these locations by various means: 1. If a python file has a #!/bin/env impala-python (or python) but doesn't have a main function, it removes the hash-bang and makes sure that the file is not executable. 2. Most scripts can simply switch from impala-python to impala-python3 (or python to python3) with minimal changes. 3. The cm-api pypi package (which doesn't support Python 3) has been replaced by the cm-client pypi package and interfaces have changed. Rather than migrating the code (which hasn't been used in years), this deletes the old code and stops installing cm-api into the virtualenv. The code can be restored and revamped if there is any interest in interacting with CM clusters. 4. This switches tests/comparison over to impala-python3, but this code has bit-rotted. Some pieces can be run manually, but it can't be fully verified with Python 3. It shouldn't hold back the migration on its own. 5. This also replaces locations of impala-python in comments / documentation / READMEs. 6. kazoo (used for interacting with HBase) needed to be upgraded to a version that supports Python 3. The newest version of kazoo requires upgrades of other component versions, so this uses kazoo 2.8.0 to avoid needing other upgrades. The two remaining uses of impala-python are: - bin/cmake_aux/create_virtualenv.sh - bin/impala-env-versioned-python These will be removed separately when we drop Python 2 support completely. In particular, these are useful for testing impala-shell with Python 2 until we stop supporting Python 2 for impala-shell. The docker-based tests still use /usr/bin/python, but this can be switched over independently (and doesn't impact impala-python) Testing: - Ran core job - Ran build + dataload on Centos 7, Redhat 8 - Manual testing of individual scripts (except some bitrotted areas like the random query generator) Change-Id: If209b761290bc7e7c716c312ea757da3e3bca6dc Reviewed-on: http://gerrit.cloudera.org:8080/23468 Reviewed-by: Michael Smith <michael.smith@cloudera.com> Tested-by: Michael Smith <michael.smith@cloudera.com>	2025-10-22 16:30:17 +00:00
Csaba Ringhofer	f98b697c7b	IMPALA-13929: Make 'functional-query' the default workload in tests This change adds get_workload() to ImpalaTestSuite and removes it from all test suites that already returned 'functional-query'. get_workload() is also removed from CustomClusterTestSuite which used to return 'tpch'. All other changes besides impala_test_suite.py and custom_cluster_test_suite.py are just mass removals of get_workload() functions. The behavior is only changed in custom cluster tests that didn't override get_workload(). By returning 'functional-query' instead of 'tpch', exploration_strategy() will no longer return 'core' in 'exhaustive' test runs. See IMPALA-3947 on why workload affected exploration_strategy. An example for affected test is TestCatalogHMSFailures which was skipped both in core and exhaustive runs before this change. get_workload() functions that return a different workload than 'functional-query' are not changed - it is possible that some of these also don't handle exploration_strategy() as expected, but individually checking these tests is out of scope in this patch. Change-Id: I9ec6c41ffb3a30e1ea2de773626d1485c69fe115 Reviewed-on: http://gerrit.cloudera.org:8080/22726 Reviewed-by: Riza Suminto <riza.suminto@cloudera.com> Reviewed-by: Daniel Becker <daniel.becker@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2025-04-08 07:12:55 +00:00
Joe McDonnell	eb66d00f9f	IMPALA-11974: Fix lazy list operators for Python 3 compatibility Python 3 changes list operators such as range, map, and filter to be lazy. Some code that expects the list operators to happen immediately will fail. e.g. Python 2: range(0,5) == [0,1,2,3,4] True Python 3: range(0,5) == [0,1,2,3,4] False The fix is to wrap locations with list(). i.e. Python 3: list(range(0,5)) == [0,1,2,3,4] True Since the base operators are now lazy, Python 3 also removes the old lazy versions (e.g. xrange, ifilter, izip, etc). This uses future's builtins package to convert the code to the Python 3 behavior (i.e. xrange -> future's builtins.range). Most of the changes were done via these futurize fixes: - libfuturize.fixes.fix_xrange_with_import - lib2to3.fixes.fix_map - lib2to3.fixes.fix_filter This eliminates the pylint warnings: - xrange-builtin - range-builtin-not-iterating - map-builtin-not-iterating - zip-builtin-not-iterating - filter-builtin-not-iterating - reduce-builtin - deprecated-itertools-function Testing: - Ran core job Change-Id: Ic7c082711f8eff451a1b5c085e97461c327edb5f Reviewed-on: http://gerrit.cloudera.org:8080/19589 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-03-09 17:17:57 +00:00
Joe McDonnell	82bd087fb1	IMPALA-11973: Add absolute_import, division to all eligible Python files This takes steps to make Python 2 behave like Python 3 as a way to flush out issues with running on Python 3. Specifically, it handles two main differences: 1. Python 3 requires absolute imports within packages. This can be emulated via "from __future__ import absolute_import" 2. Python 3 changed division to "true" division that doesn't round to an integer. This can be emulated via "from __future__ import division" This changes all Python files to add imports for absolute_import and division. For completeness, this also includes print_function in the import. I scrutinized each old-division location and converted some locations to use the integer division '//' operator if it needed an integer result (e.g. for indices, counts of records, etc). Some code was also using relative imports and needed to be adjusted to handle absolute_import. This fixes all Pylint warnings about no-absolute-import and old-division, and these warnings are now banned. Testing: - Ran core tests Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b Reviewed-on: http://gerrit.cloudera.org:8080/19588 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-03-09 17:17:57 +00:00
Michael Smith	a870a11e64	IMPALA-7098: Re-enable tests under EC Re-enables tests under erasure coding, or provides more specific exceptions. Erasure coding uses multiple data blocks to construct a block group. Our tests use RS-3-2-1024k, which includes 3 data blocks in a block group. Each of these blocks is sized according to `dfs.block.size`, so block groups by default hold up to 384MB of data. Impala schedules work to executors based on blocks reported by HDFS, which for EC actually represent block groups. So with default block size, a file in EC has 1/3rd the number of schedulable blocks. In the case of tpch.lineitem, this produces 2 parquet files instead of 3 and reduces the number of executors scheduled to read parquet lineitem as 1. lineitem.tbl is loaded via Hive. With EC it uses 2 block groups, without EC it uses 6 blocks. 2. parquet lineitem is created by select/insert from lineitem.tbl. Impala schedules reads to executors based on available blocks, so with EC this gets scheduled across 2 executors instead of 3 and each executor writes a separate parquet file. Change-Id: Ib452024993e35d5a8d2854c6b2085115b26e40df Reviewed-on: http://gerrit.cloudera.org:8080/19172 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2022-11-04 22:13:50 +00:00
Bikramjeet Vig	45fb0fb3e7	IMPALA-10397: De-flake test_single_workload This patch removes a flaky part of the test that relies on query completion rate. Since we are already verifying that number of healthy executor groups increases, this additional check is not adding much to the test. Change-Id: I6f75afdbe676d9dd6922b6ba8aa1919daa161947 Reviewed-on: http://gerrit.cloudera.org:8080/17239 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-03-30 23:46:48 +00:00
Bikramjeet Vig	0b79464d9c	IMPALA-10397: Fix test_single_workload The logs on failed runs indicated that the autoscaler never started another cluster. This can only happen if it never notices a queued query which is possible since this test was only failing in release builds. This patch increases the runtime of the sample query to make execution more predictable. Testing: Looped on my local on a release build Change-Id: Ide3c7fb4509ce9a797b4cbdd141b2a319b923d4e Reviewed-on: http://gerrit.cloudera.org:8080/17218 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-03-25 22:59:16 +00:00
Bikramjeet Vig	f888d36295	IMPALA-10397 : Reduce flakiness in test_single_workload This test failed recently due to a timeout waiting for executors to come up. The logs showed that the executors came up on time but it was not recognized by the coordinator. This patch attempts to reduce flakiness by increasing the timeout and adding more logging in case this happens in the future. Testing: Ran in a loop on my local for a few hours. Change-Id: I73ea5eb663db6d03832b19ed323670590946f514 Reviewed-on: http://gerrit.cloudera.org:8080/17028 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2021-02-10 06:05:51 +00:00
Lars Volker	ddd07e1713	IMPALA-8867: Further deflake test_auto_scaling A previous attempt to deflake this test by lowering the threshold for multi-group throughput had improved things but we still saw another occurrence of test_auto_scaling failing recently. This change lowers the threshold even further to try and eradicate the flakiness. From inspecting the logs of the failed run I could see that the new threshold would have prevented the failure. Change-Id: I29808982cc6226152c544cb99f76961b582975a7 Reviewed-on: http://gerrit.cloudera.org:8080/14740 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-11-26 02:03:41 +00:00
Sahil Takiar	df365a8e30	IMPALA-8803: Coordinator should release admitted memory per-backend Changes the Coordinator to release admitted memory when each Backend completes, rather than waiting for the entire query to complete before releasing admitted memory. When the Coordinator detects that a Backend has completed (via ControlService::ReportExecStatus) it updates the state of the Backend in Coordinator::BackendResourceState. BackendResourceState tracks the state of the admitted resources for each Backend and decides when the resources for a group of Backend states should be released. BackendResourceState defines a state machine to help coordinate the state of the admitted memory for each Backend. It guarantees that by the time the query is shutdown, all Backends release their admitted memory. BackendResourceState implements three rules to control the rate at which the Coordinator releases admitted memory from the AdmissionController: * Resources are released at most once every 1 second, this prevents short lived queries from causing high load on the admission controller * Resources are released at most O(log(num_backends)) times; the BackendResourceStates can release multiple BackendStates from the AdmissionController at a time * All pending resources are released if the only remaining Backend is the Coordinator Backend; this is useful for result spooling where all Backends may complete, except for the Coordinator Backend Exposes the following hidden startup flags to help tune the heuristics above: --batched_release_decay_factor * Defaults to 2 * Controls the base value for the O(log(num_backends)) bound when batching the release of Backends. --release_backend_states_delay_ms * Defaults to 1000 milliseconds * Controls how often Backends can release their resources. Testing: * Ran core tests * Added new tests to test_result_spooling.py and test_admission_controller.py Change-Id: I88bb11e0ede7574568020e0277dd8ac8d2586dc9 Reviewed-on: http://gerrit.cloudera.org:8080/14104 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-09-07 21:15:52 +00:00
Lars Volker	cebed8b88a	IMPALA-8867: Deflake test_auto_scaling, improve logging test_auto_scaling sometimes failed to reach the required query rate within the expected time. This change increases the timeout that we wait for the query rate to increase. It also adds logging for the maximum query rate observed to help with debugging in case the issue still occurs. Testing: I looped this for a day on my desktop box and did not observe any issues. Change-Id: I22c43618a40ff197784add69223359e23fa1bdec Reviewed-on: http://gerrit.cloudera.org:8080/14074 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-08-16 01:50:34 +00:00
Tim Armstrong	4fb8e8e324	IMPALA-8816: reduce custom cluster test runtime in core This includes some optimisations and a bulk move of tests to exhaustive. Move a bunch of custom cluster tests to exhaustive. I selected these partially based on runtime (i.e. I looked most carefully at the tests that ran for over a minute) and the likelihood of them catching a precommit bug. Regression tests for specific edge cases and tests for parts of the code that are very stable were prime candidates. Remove an unnecessary cluster restart in test_breakpad. Merge test_scheduler_error into test_failpoints to avoid an unnecessary cluster restart. Speed up cluster starts by ensuring that the default statestore args are applied even when _start_impala_cluster() is called directly. This shaves a couple of seconds off each restart. We made the default args use a faster update frequency - see IMPALA-7185 - but they did not take effect in all tests. Change-Id: Ib2e3e7ebc9695baec4d69183387259958df10f62 Reviewed-on: http://gerrit.cloudera.org:8080/13967 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-08-06 21:34:26 +00:00
Bikramjeet Vig	b5193f36de	IMPALA-8806: Add metrics to improve observability of executor groups This patch adds 3 metrics under a new metric group called "cluster-membership" that keep track of the number of executor groups that have at least one live executor, number of executor groups that are in a healthy state and the number of backends registered with the statestore. Testing: Modified tests to use these metrics for verification. Change-Id: I7745ea1c7c6778d3fb5e59adbc873697beb0f3b9 Reviewed-on: http://gerrit.cloudera.org:8080/13979 Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-08-06 02:33:59 +00:00
Lars Volker	6dd36d9d1b	IMPALA-8798: Skip TestAutoScaling on EC runs TestAutoScaling uses ConcurrentWorkload, which does not set the required query options to scan erasure-coded files. This change disables the affected test when running with erasure-coding enabled. Change-Id: I96690914f12679619ba96d79edde8af9cccf65fa Reviewed-on: http://gerrit.cloudera.org:8080/13935 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Lars Volker <lv@cloudera.com>	2019-07-27 02:42:28 +00:00
Lars Volker	2397ae5590	IMPALA-8484: Run queries on disjoint executor groups This change adds support for running queries inside a single admission control pool on one of several, disjoint sets of executors called "executor groups". Executors can be configured with an executor group through the newly added '--executor_groups' flag. Note that in anticipation of future changes, the flag already uses the plural form, but only a single executor group may be specified for now. Each executor group specification can optionally contain a minimum size, separated by a ':', e.g. --executor_groups default-pool-1:3. Only when the cluster membership contains at least that number of executors for the groups will it be considered for admission. Executor groups are mapped to resource pools by their name: An executor group can service queries from a resource pool if the pool name is a prefix of the group name separated by a '-'. For example, queries in poll poolA can be serviced by executor groups named poolA-1 and poolA-2, but not by groups name foo or poolB-1. During scheduling, executor groups are considered in alphabetical order. This means that one group is filled up entirely before a subsequent group is considered for admission. Groups also need to pass a health check before considered. In particular, they must contain at least the minimum number of executors specified. If no group is specified during startup, executors are added to the default executor group. If - during admission - no executor group for a pool can be found and the default group is non-empty, then the default group is considered. The default group does not have a minimum size. This change inverts the order of scheduling and admission. Prior to this change, queries were scheduled before submitting them to the admission controller. Now the admission controller computes schedules for all candidate executor groups before each admission attempt. If the cluster membership has not changed, then the schedules of the previous attempt will be reused. This means that queries will no longer fail if the cluster membership changes while they are queued in the admission controller. This change also alters the default behavior when using a dedicated coordinator and no executors have registered yet. Prior to this change, a query would fail immediately with an error ("No executors registered in group"). Now a query will get queued and wait until executors show up, or it times out after the pools queue timeout period. Testing: This change adds a new custom cluster test for executor groups. It makes use of new capabilities added to start-impala-cluster.py to bring up additional executors into an already running cluster. Additionally, this change adds an instructional implementation of executor group based autoscaling, which can be used during development. It also adds a helper to run queries concurrently. Both are used in a new test to exercise the executor group logic and to prevent regressions to these tools. In addition to these tests, the existing tests for the admission controller (both BE and EE tests) thoroughly exercise the changed code. Some of them required changes themselves to reflect the new behavior. I looped the new tests (test_executor_groups and test_auto_scaling) for a night (110 iterations each) without any issues. I also started an autoscaling cluster with a single group and ran TPC-DS, TPC-H, and test_queries on it successfully. Known limitations: When using executor groups, only a single coordinator and a single AC pool (i.e. the default pool) are supported. Executors to not include the number of currently running queries in their statestore updates and so admission controllers are not aware of the number of queries admitted by other controllers per host. Change-Id: I8a1d0900f2a82bd2fc0a906cc094e442cffa189b Reviewed-on: http://gerrit.cloudera.org:8080/13550 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2019-07-21 04:54:03 +00:00

15 Commits