11 Commits

Author SHA1 Message Date
Csaba Ringhofer
f98b697c7b IMPALA-13929: Make 'functional-query' the default workload in tests
This change adds get_workload() to ImpalaTestSuite and removes it
from all test suites that already returned 'functional-query'.
get_workload() is also removed from CustomClusterTestSuite which
used to return 'tpch'.

All other changes besides impala_test_suite.py and
custom_cluster_test_suite.py are just mass removals of
get_workload() functions.

The behavior is only changed in custom cluster tests that didn't
override get_workload(). By returning 'functional-query' instead
of 'tpch', exploration_strategy() will no longer return 'core' in
'exhaustive' test runs. See IMPALA-3947 on why workload affected
exploration_strategy. An example for affected test is
TestCatalogHMSFailures which was skipped both in core and exhaustive
runs before this change.

get_workload() functions that return a different workload than
'functional-query' are not changed - it is possible that some of
these also don't handle exploration_strategy() as expected, but
individually checking these tests is out of scope in this patch.

Change-Id: I9ec6c41ffb3a30e1ea2de773626d1485c69fe115
Reviewed-on: http://gerrit.cloudera.org:8080/22726
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-08 07:12:55 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Tim Armstrong
c14a090400 IMPALA-5844: use a MemPool for expr result allocations
This is also a step towards IMPALA-2399 (remove QueryMaintenance()).

"local" allocations containing expression results (either intermediate
or final results) have the following properties:
* They are usually small allocations
* They can be made frequently (e.g. every function call)
* They are owned and managed by the Impala runtime
* They are freed in bulk at various points in query execution.

A MemPool (i.e. bump allocator) is the right mechanism to manage
allocations with the above properties. Before this patch
FunctionContext's used a FreePool + vector of allocations to emulate the
above behaviour. This patch switches to using a MemPool to bring these
allocations in line with the rest of the codebase.

The steps required to do this conversion.
* Use a MemPool for FunctionContext local allocations.
* Identify appropriate MemPools for all of the local allocations from
  function contexts so that the memory lifetime is correct.
* Various cleanup and documentation of existing MemPools.
* Replaces calls to FreeLocalAllocations() with calls to
  MemPool::Clear()

More involved surgery was required in a few places:
* Made the Sorter own its comparator, exprs and MemPool.
* Remove FunctionContextImpl::ReallocateLocal() and just have
  StringFunctions::Replace() do the doubling itself to avoid
  the need for a special interface. Worst-case this doubles
  the memory requirements for Replace() since n / 2 + n / 4
  + n / 8 + .... bytes of memory could be wasted instead of recycled
  for an n-byte output string.
* Provide a way redirect agg fn Serialize()/Finalize() allocations
  to come directly from the output RowBatch's MemPool. This is
  also potentially applicable to other places where we currently
  copy out strings from local allocations, e.g.
  AnalyticEvalNode::AddResultTuple() and Tuple::MaterializeExprs().
* --stress_free_pool_alloc was changed to instead intercept at the
  FunctionContext layer so that it retains the old behaviour even
  though allocations do not all come from FreePools.

The "local" allocation concept was not exposed directly in udf.h so this
patch also renames them to better reflect that they're used for expr
results.

Testing:
* ran exhaustive and ASAN

Change-Id: I4ba5a7542ed90a49a4b5586c040b5985a7d45b61
Reviewed-on: http://gerrit.cloudera.org:8080/8025
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
2017-10-06 00:01:08 +00:00
Dan Hecht
741421de09 IMPALA-5252: Fix crash in HiveUdfCall::GetStringVal() when mem_limit exceeded
We need to check for AllocateLocal() returning NULL. CopyFrom() takes
care of that for us.  Also adjust a few other places in the code base
that didn't have the check.

The new test reproduces the crash, but in order to get this test file to
execute, I had to move the xfail to be a function decorator. Apparently
xfail as a statement causes the test to not run at all. We should run
all of these queries even if they are non-determistic to at least verify
that impalad does not crash.

Change-Id: Iafefef24479164cc4d2b99191d2de28eb8b311b6
Reviewed-on: http://gerrit.cloudera.org:8080/6761
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Impala Public Jenkins
2017-04-29 02:23:51 +00:00
Michael Ho
9b80224f9f IMPALA-2925: Mark test_alloc_update as xfail.
test_alloc_update.py is flaky and the expected failure sometimes
doesn't occur. Mark this test as xfail for now to unblock the build.

Change-Id: If4e86e7b9c064bc78b672814cd3569453ecc268d
Reviewed-on: http://gerrit.cloudera.org:8080/5366
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
2016-12-06 12:37:17 +00:00
Dan Hecht
ffa7829b70 IMPALA-3918: Remove Cloudera copyrights and add ASF license header
For files that have a Cloudera copyright (and no other copyright
notice), make changes to follow the ASF source file header policy here:

http://www.apache.org/legal/src-headers.html#headers

Specifically:
1) Remove the Cloudera copyright.
2) Modify NOTICE.txt according to
   http://www.apache.org/legal/src-headers.html#notice
   to follow that format and add a line for Cloudera.
3) Replace or add the existing ASF license text with the one given
   on the website.

Much of this change was automatically generated via:

git grep -li 'Copyright.*Cloudera' > modified_files.txt
cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.*Cloudera#i;'
cat modified_files_txt | xargs fix_apache_license.py [1]

Some manual fixups were performed following those steps, especially when
license text was completely missing from the file.

[1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor
    modification to ORIG_LICENSE to match Impala's license text.

Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86
Reviewed-on: http://gerrit.cloudera.org:8080/3779
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-08-09 08:19:41 +00:00
Taras Bobrovytsky
609b80410e Clean up Python test import statements
Many of our test scripts have import statements that look like
"from xxx import *". It is a good practice to explicitly name what
needs to be imported. This commit implements this practice. Also,
unused import statements are removed.

Change-Id: I6a33bb66552ae657d1725f765842f648faeb26a8
Reviewed-on: http://gerrit.cloudera.org:8080/3444
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
2016-07-15 23:26:18 +00:00
Michael Brown
22669e23be IMPALA-3501: ee tests: detect build type and support different timeouts based on the same
Impala compiled with the address sanitizer, or compiled with code
coverage, runs through code paths much slower. This can cause end-to-end
tests that pass on a non-ASAN or non-code coverage build to fail. Some
examples include IMPALA-2721, IMPALA-2973, and IMPALA-3501. These
classes of failures tend always to involve some time-sensitive condition
that fails to succeed under such "slow builds".

The works-around in the past have been to simply increase the timeout.
The problem with this approach is that it relaxes conditions for tests
on builds that see the field--i.e., release builds--for builds that
never will--i.e., ASAN and code coverage.

This patch fixes that problem by allowing test authors to set timeout
values based on a *specific* build type. The author may choose timeouts
with a default value, and different timeouts for either or both
so-called "slow builds": ASAN and code coverage.

We detect the so-called "specific build type" by inspecting the binary
expected to be at the path under test. This removes the need to make
alterations to Impala itself. The inspection done is to read the DWARF
information in the binary, specifically the first compile unit's
DW_AT_producer and DW_AT_name DIE attributes. We employ a heuristic
based on these attributes' values to guess the build type. If we can't
determine the build type, we will assume it's a debug build. More
information on this is in IMPALA-3501.

A quick summary of the changes follows:

1. Move some of the logic in tests.common.skip to tests.common.environ
   and rework some skip marks to be more precise.

2. Add Pyelftools for convenient deserialization of DWARF

3. Our Pyelftools usage requires collections.OrderedDict, which isn't in
   python2.6; also add Monkeypatch to handle this.

4. Add ImpalaBuild and specific_build_type_timeout, the core of the new
   functionality

5. Fix the statestore tests that only fail under code coverage (the
   basis for IMPALA-3501)

Testing:

The tests that were previously, reliably failing under code coverage now
pass. I also ran perfunctory tests of debug, release, and ASAN builds to
ensure our detection of build type is working. This patch will *not*
turn the code coverage builds green; there are other tests that fail,
and fixing all of them here is out of the scope of this patch.

Change-Id: I2b675c04c54e36d404fd9e5a6cf085fb8d6d0e47
Reviewed-on: http://gerrit.cloudera.org:8080/3156
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Internal Jenkins
2016-05-25 19:41:45 -07:00
Michael Ho
40f75fb1ba IMPALA-2925: Fix flaky tests in test_alloc_fail_update()
test_alloc_fail_update() aims to stress memory allocation
failure in the Update(), Serialize() and/or Finalize() functions
of UDAs. However, this test included some UDFs which allocated
memory in their Init() functions and not during their Update()
functions. This change removes those UDFs from the test.

Change-Id: I1ecc7e838e34ebc9ea3c878fee8ea2497b5fa23e
Reviewed-on: http://gerrit.cloudera.org:8080/2005
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
2016-02-10 00:54:11 +00:00
Michael Ho
7fbf56430a IMPALA-2903: Skip test_alloc_fail in non-debug builds.
Impala.tests.custom_cluster.test_alloc_fail relies on a
debug-builds only startup option to stress the allocation
failure behavior. This startup option is intentionally
compiled out of release builds to avoid unintended
misconfiguration. Therefore, we should skip this test
in non-debug builds.

This change introduces SkipIfNotDebugBuild in pytest so
we can skip certain tests in non-debug builds.

Change-Id: I46ef258e48cffb13b106a0dcc0490edf5ea50a1c
Reviewed-on: http://gerrit.cloudera.org:8080/1951
Reviewed-by: Michael Ho <kwho@cloudera.com>
Tested-by: Internal Jenkins
2016-02-02 03:44:19 +00:00
Michael Ho
e01ab4f1b2 IMPALA-2620: FunctionContext::Allocate() and friends should check for memory limits.
FunctionContext::Allocate(), FunctionContextImpl::AllocateLocal()
and FunctionContext::Reallocate() allocate memory without taking
memory limits into account. The problem is that these functions
invoke FreePool::Allocate() which may call MemPool::Allocate()
that doesn't check against the memory limits. This patch fixes
the problem by making these FunctionContext functions check for
memory limits and set an error in the FunctionContext object if
memory limits are exceeded.

An alternative would be for these functions to call
MemPool::TryAllocate() instead and return NULL if memory limits
are exceeded. However, this may break some existing external
UDAs which don't check for allocation failures, leading to
unexpected crashes of Impala. Therefore, we stick with this
ad hoc approach until the UDF/UDA interfaces are updated in
the future releases.

Callers of these FunctionContext functions are also updated to
handle potential failed allocations instead of operating on
NULL pointers. The query status will be polled at various
locations and terminate the query.

This patch also fixes MemPool to handle the case in which malloc
may return NULL. It propagates the failure to the callers instead
of continuing to run with NULL pointers. In addition, errors during
aggregate functions' initialization are now properly propagated.

Change-Id: Icefda795cd685e5d0d8a518cbadd37f02ea5e733
Reviewed-on: http://gerrit.cloudera.org:8080/1445
Reviewed-by: Michael Ho <kwho@cloudera.com>
Tested-by: Internal Jenkins
2015-12-19 04:45:55 +00:00