impala

mirror of https://github.com/apache/impala.git synced 2025-12-19 18:12:08 -05:00

Author	SHA1	Message	Date
Csaba Ringhofer	f98b697c7b	IMPALA-13929: Make 'functional-query' the default workload in tests This change adds get_workload() to ImpalaTestSuite and removes it from all test suites that already returned 'functional-query'. get_workload() is also removed from CustomClusterTestSuite which used to return 'tpch'. All other changes besides impala_test_suite.py and custom_cluster_test_suite.py are just mass removals of get_workload() functions. The behavior is only changed in custom cluster tests that didn't override get_workload(). By returning 'functional-query' instead of 'tpch', exploration_strategy() will no longer return 'core' in 'exhaustive' test runs. See IMPALA-3947 on why workload affected exploration_strategy. An example of an affected test is TestCatalogHMSFailures, which was skipped both in core and exhaustive runs before this change. get_workload() functions that return a different workload than 'functional-query' are not changed - it is possible that some of these also don't handle exploration_strategy() as expected, but individually checking these tests is out of scope in this patch. Change-Id: I9ec6c41ffb3a30e1ea2de773626d1485c69fe115 Reviewed-on: http://gerrit.cloudera.org:8080/22726 Reviewed-by: Riza Suminto <riza.suminto@cloudera.com> Reviewed-by: Daniel Becker <daniel.becker@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>	2025-04-08 07:12:55 +00:00
Joe McDonnell	82bd087fb1	IMPALA-11973: Add absolute_import, division to all eligible Python files This takes steps to make Python 2 behave like Python 3 as a way to flush out issues with running on Python 3. Specifically, it handles two main differences: 1. Python 3 requires absolute imports within packages. This can be emulated via "from __future__ import absolute_import" 2. Python 3 changed division to "true" division that doesn't round to an integer. This can be emulated via "from __future__ import division" This changes all Python files to add imports for absolute_import and division. For completeness, this also includes print_function in the import. I scrutinized each old-division location and converted some locations to use the integer division '//' operator if it needed an integer result (e.g. for indices, counts of records, etc). Some code was also using relative imports and needed to be adjusted to handle absolute_import. This fixes all Pylint warnings about no-absolute-import and old-division, and these warnings are now banned. Testing: - Ran core tests Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b Reviewed-on: http://gerrit.cloudera.org:8080/19588 Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com> Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>	2023-03-09 17:17:57 +00:00
Tim Armstrong	c14a090400	IMPALA-5844: use a MemPool for expr result allocations This is also a step towards IMPALA-2399 (remove QueryMaintenance()). "local" allocations containing expression results (either intermediate or final results) have the following properties: * They are usually small allocations * They can be made frequently (e.g. every function call) * They are owned and managed by the Impala runtime * They are freed in bulk at various points in query execution. A MemPool (i.e. bump allocator) is the right mechanism to manage allocations with the above properties. Before this patch FunctionContexts used a FreePool + vector of allocations to emulate the above behaviour. This patch switches to using a MemPool to bring these allocations in line with the rest of the codebase. The steps required to do this conversion: * Use a MemPool for FunctionContext local allocations. * Identify appropriate MemPools for all of the local allocations from function contexts so that the memory lifetime is correct. * Various cleanup and documentation of existing MemPools. * Replace calls to FreeLocalAllocations() with calls to MemPool::Clear() More involved surgery was required in a few places: * Made the Sorter own its comparator, exprs and MemPool. * Remove FunctionContextImpl::ReallocateLocal() and just have StringFunctions::Replace() do the doubling itself to avoid the need for a special interface. Worst-case this doubles the memory requirements for Replace() since n / 2 + n / 4 + n / 8 + ... bytes of memory could be wasted instead of recycled for an n-byte output string. * Provide a way to redirect agg fn Serialize()/Finalize() allocations to come directly from the output RowBatch's MemPool. This is also potentially applicable to other places where we currently copy out strings from local allocations, e.g. AnalyticEvalNode::AddResultTuple() and Tuple::MaterializeExprs(). * --stress_free_pool_alloc was changed to instead intercept at the FunctionContext layer so that it retains the old behaviour even though allocations do not all come from FreePools. The "local" allocation concept was not exposed directly in udf.h so this patch also renames them to better reflect that they're used for expr results. Testing: * ran exhaustive and ASAN Change-Id: I4ba5a7542ed90a49a4b5586c040b5985a7d45b61 Reviewed-on: http://gerrit.cloudera.org:8080/8025 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Impala Public Jenkins	2017-10-06 00:01:08 +00:00
Dan Hecht	741421de09	IMPALA-5252: Fix crash in HiveUdfCall::GetStringVal() when mem_limit exceeded We need to check for AllocateLocal() returning NULL. CopyFrom() takes care of that for us. Also adjust a few other places in the code base that didn't have the check. The new test reproduces the crash, but in order to get this test file to execute, I had to move the xfail to be a function decorator. Apparently xfail as a statement causes the test to not run at all. We should run all of these queries even if they are non-deterministic to at least verify that impalad does not crash. Change-Id: Iafefef24479164cc4d2b99191d2de28eb8b311b6 Reviewed-on: http://gerrit.cloudera.org:8080/6761 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Impala Public Jenkins	2017-04-29 02:23:51 +00:00
Michael Ho	9b80224f9f	IMPALA-2925: Mark test_alloc_update as xfail. test_alloc_update.py is flaky and the expected failure sometimes doesn't occur. Mark this test as xfail for now to unblock the build. Change-Id: If4e86e7b9c064bc78b672814cd3569453ecc268d Reviewed-on: http://gerrit.cloudera.org:8080/5366 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-12-06 12:37:17 +00:00
Dan Hecht	ffa7829b70	IMPALA-3918: Remove Cloudera copyrights and add ASF license header For files that have a Cloudera copyright (and no other copyright notice), make changes to follow the ASF source file header policy here: http://www.apache.org/legal/src-headers.html#headers Specifically: 1) Remove the Cloudera copyright. 2) Modify NOTICE.txt according to http://www.apache.org/legal/src-headers.html#notice to follow that format and add a line for Cloudera. 3) Replace or add the existing ASF license text with the one given on the website. Much of this change was automatically generated via: git grep -li 'Copyright.Cloudera' > modified_files.txt cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.Cloudera#i;' cat modified_files.txt | xargs fix_apache_license.py [1] Some manual fixups were performed following those steps, especially when license text was completely missing from the file. [1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor modification to ORIG_LICENSE to match Impala's license text. Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86 Reviewed-on: http://gerrit.cloudera.org:8080/3779 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-08-09 08:19:41 +00:00
Taras Bobrovytsky	609b80410e	Clean up Python test import statements Many of our test scripts have import statements that look like "from xxx import *". It is a good practice to explicitly name what needs to be imported. This commit implements this practice. Also, unused import statements are removed. Change-Id: I6a33bb66552ae657d1725f765842f648faeb26a8 Reviewed-on: http://gerrit.cloudera.org:8080/3444 Reviewed-by: Michael Brown <mikeb@cloudera.com> Tested-by: Internal Jenkins	2016-07-15 23:26:18 +00:00
Michael Brown	22669e23be	IMPALA-3501: ee tests: detect build type and support different timeouts based on the same Impala compiled with the address sanitizer, or compiled with code coverage, runs through code paths much slower. This can cause end-to-end tests that pass on a non-ASAN or non-code coverage build to fail. Some examples include IMPALA-2721, IMPALA-2973, and IMPALA-3501. These classes of failures tend to involve some time-sensitive condition that fails to succeed under such "slow builds". The workaround in the past has been to simply increase the timeout. The problem with this approach is that it relaxes conditions for tests on builds that see the field--i.e., release builds--for the sake of builds that never will--i.e., ASAN and code coverage. This patch fixes that problem by allowing test authors to set timeout values based on a specific build type. The author may choose timeouts with a default value, and different timeouts for either or both so-called "slow builds": ASAN and code coverage. We detect the so-called "specific build type" by inspecting the binary expected to be at the path under test. This removes the need to make alterations to Impala itself. The inspection done is to read the DWARF information in the binary, specifically the first compile unit's DW_AT_producer and DW_AT_name DIE attributes. We employ a heuristic based on these attributes' values to guess the build type. If we can't determine the build type, we will assume it's a debug build. More information on this is in IMPALA-3501. A quick summary of the changes follows: 1. Move some of the logic in tests.common.skip to tests.common.environ and rework some skip marks to be more precise. 2. Add Pyelftools for convenient deserialization of DWARF. 3. Our Pyelftools usage requires collections.OrderedDict, which isn't in python2.6; also add Monkeypatch to handle this. 4. Add ImpalaBuild and specific_build_type_timeout, the core of the new functionality. 5. Fix the statestore tests that only fail under code coverage (the basis for IMPALA-3501) Testing: The tests that were previously failing reliably under code coverage now pass. I also ran perfunctory tests of debug, release, and ASAN builds to ensure our detection of build type is working. This patch will not turn the code coverage builds green; there are other tests that fail, and fixing all of them here is out of the scope of this patch. Change-Id: I2b675c04c54e36d404fd9e5a6cf085fb8d6d0e47 Reviewed-on: http://gerrit.cloudera.org:8080/3156 Reviewed-by: Michael Brown <mikeb@cloudera.com> Tested-by: Internal Jenkins	2016-05-25 19:41:45 -07:00
Michael Ho	40f75fb1ba	IMPALA-2925: Fix flaky tests in test_alloc_fail_update() test_alloc_fail_update() aims to stress memory allocation failure in the Update(), Serialize() and/or Finalize() functions of UDAs. However, this test included some UDFs which allocated memory in their Init() functions and not during their Update() functions. This change removes those UDFs from the test. Change-Id: I1ecc7e838e34ebc9ea3c878fee8ea2497b5fa23e Reviewed-on: http://gerrit.cloudera.org:8080/2005 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-02-10 00:54:11 +00:00
Michael Ho	7fbf56430a	IMPALA-2903: Skip test_alloc_fail in non-debug builds. Impala.tests.custom_cluster.test_alloc_fail relies on a debug-build-only startup option to stress the allocation failure behavior. This startup option is intentionally compiled out of release builds to avoid unintended misconfiguration. Therefore, we should skip this test in non-debug builds. This change introduces SkipIfNotDebugBuild in pytest so we can skip certain tests in non-debug builds. Change-Id: I46ef258e48cffb13b106a0dcc0490edf5ea50a1c Reviewed-on: http://gerrit.cloudera.org:8080/1951 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Internal Jenkins	2016-02-02 03:44:19 +00:00
Michael Ho	e01ab4f1b2	IMPALA-2620: FunctionContext::Allocate() and friends should check for memory limits. FunctionContext::Allocate(), FunctionContextImpl::AllocateLocal() and FunctionContext::Reallocate() allocate memory without taking memory limits into account. The problem is that these functions invoke FreePool::Allocate(), which may call MemPool::Allocate(), which doesn't check against the memory limits. This patch fixes the problem by making these FunctionContext functions check for memory limits and set an error in the FunctionContext object if memory limits are exceeded. An alternative would be for these functions to call MemPool::TryAllocate() instead and return NULL if memory limits are exceeded. However, this may break some existing external UDAs which don't check for allocation failures, leading to unexpected crashes of Impala. Therefore, we stick with this ad hoc approach until the UDF/UDA interfaces are updated in future releases. Callers of these FunctionContext functions are also updated to handle potential failed allocations instead of operating on NULL pointers. The query status will be polled at various locations and terminate the query. This patch also fixes MemPool to handle the case in which malloc may return NULL. It propagates the failure to the callers instead of continuing to run with NULL pointers. In addition, errors during aggregate functions' initialization are now properly propagated. Change-Id: Icefda795cd685e5d0d8a518cbadd37f02ea5e733 Reviewed-on: http://gerrit.cloudera.org:8080/1445 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Internal Jenkins	2015-12-19 04:45:55 +00:00
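
The IMPALA-11973 entry above describes two Python 2 versus Python 3 differences that the `__future__` imports paper over. A minimal sketch of the division part follows; the function names are hypothetical and are not taken from the Impala tree. It shows why a location that needs an integer result (an index or a count) has to switch to `//` once true division is in effect.

```python
from __future__ import absolute_import, division, print_function


def mean_row_size(total_bytes, num_rows):
    # Hypothetical helper: with true division this yields a float on both
    # Python 2 (because of the __future__ import) and Python 3.
    return total_bytes / num_rows


def midpoint_index(items):
    # Hypothetical helper: an index must be an integer, so floor division
    # is requested explicitly instead of relying on old Python 2 behavior.
    return len(items) // 2
```

Without the `division` import, Python 2 would truncate `7 / 2` to an integer, which is the kind of silent difference the commit set out to flush out.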

11 Commits
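
The IMPALA-3501 entry above describes letting test authors choose a default timeout plus separate timeouts for the "slow builds" (ASAN and code coverage). The sketch below only illustrates that selection idea. `pick_timeout` and its parameters are hypothetical names, the build type is passed in as a plain string rather than detected from DWARF data, and the signature of Impala's own `specific_build_type_timeout` is not reproduced here.

```python
def pick_timeout(build_type, default, asan=None, code_coverage=None):
    """Return the timeout in seconds for the given build type.

    Hypothetical helper: `build_type` is assumed to be a string such as
    "debug", "release", "asan" or "code_coverage".
    """
    overrides = {"asan": asan, "code_coverage": code_coverage}
    override = overrides.get(build_type)
    # Use the default unless an override was supplied for this build type.
    return default if override is None else override
```

A test could then call, for example, `pick_timeout("asan", 10, asan=60)` so that only the ASAN run gets the relaxed limit while release builds keep the strict one.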