10 Commits

Author SHA1 Message Date
Joe McDonnell
5eea4f6f79 IMPALA-14559: Ship calcite-planner jar in Impala packages
This adds the java/impala-package Maven project to make it easier
to ship / test the Calcite planner. impala-package has a dependency
on impala-frontend and calcite-planner, so its classpath requires
no extra work when constructing the classpath.

An additional cleanup is that this no longer puts the
impala-frontend-*-tests.jar on the classpath by default. This requires
updating the query event hooks test, as it relies on that jar being
present.

This does not change the default value for the use_calcite_planner
query option, so there is no change in behavior.

Testing:
 - Ran a core job
 - Built docker images and OS packages locally

Change-Id: I81dec2a5b59e279229a735c8bb1a23c77111a793
Reviewed-on: http://gerrit.cloudera.org:8080/23497
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-11-21 03:36:12 +00:00
Csaba Ringhofer
f98b697c7b IMPALA-13929: Make 'functional-query' the default workload in tests
This change adds get_workload() to ImpalaTestSuite and removes it
from all test suites that already returned 'functional-query'.
get_workload() is also removed from CustomClusterTestSuite which
used to return 'tpch'.

All other changes besides impala_test_suite.py and
custom_cluster_test_suite.py are just mass removals of
get_workload() functions.

The behavior is only changed in custom cluster tests that didn't
override get_workload(). By returning 'functional-query' instead
of 'tpch', exploration_strategy() will no longer return 'core' in
'exhaustive' test runs. See IMPALA-3947 on why workload affected
exploration_strategy. An example for affected test is
TestCatalogHMSFailures which was skipped both in core and exhaustive
runs before this change.

get_workload() functions that return a different workload than
'functional-query' are not changed - it is possible that some of
these also don't handle exploration_strategy() as expected, but
individually checking these tests is out of scope in this patch.

Change-Id: I9ec6c41ffb3a30e1ea2de773626d1485c69fe115
Reviewed-on: http://gerrit.cloudera.org:8080/22726
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-08 07:12:55 +00:00
Riza Suminto
95f353ac4a IMPALA-13507: Allow disabling glog buffering via with_args fixture
We have plenty of custom_cluster tests that assert against content of
Impala daemon log files while the process is still running using
assert_log_contains() and it's wrappers. The method specifically mention
about disabling glog buffering ('-logbuflevel=-1'), but not all
custom_cluster tests do that. This often result in flaky test that hard
to triage and often neglected if it does not frequently run in core
exploration.

This patch adds boolean param 'disable_log_buffering' into
CustomClusterTestSuite.with_args for test to declare intention to
inspect log files in live minicluster. If it is True, start minicluster
with '-logbuflevel=-1' for all daemons. If it is False, log WARNING on
any calls to assert_log_contains().

There are several complex custom_cluster tests that left unchanged and
print out such WARNING logs, such as:
- TestQueryLive
- TestQueryLogTableBeeswax
- TestQueryLogOtherTable
- TestQueryLogTableHS2
- TestQueryLogTableAll
- TestQueryLogTableBufferPool
- TestStatestoreRpcErrors
- TestWorkloadManagementInitWait
- TestWorkloadManagementSQLDetails

This patch also fixed some small flake8 issues on modified tests.

There is a flakiness sign at test_query_live.py where test query is
submitted to coordinator and fail because sys.impala_query_live table
has not exist yet from coordinator's perspective. This patch modify
test_query_live.py to wait for few seconds until sys.impala_query_live
is queryable.

Testing:
- Pass custom_cluster tests in exhaustive exploration.

Change-Id: I56fb1746b8f3cea9f3db3514a86a526dffb44a61
Reviewed-on: http://gerrit.cloudera.org:8080/22015
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-11-05 04:49:05 +00:00
Riza Suminto
9c87cf41bf IMPALA-13396: Unify tmp dir management in CustomClusterTestSuite
There are many custom cluster tests that require creating temporary
directory. The temporary directory typically live within a scope of test
method and cleaned afterwards. However, some test do create temporary
directory directly and forgot to clean them afterwards, leaving junk
dirs under /tmp/ or $LOG_DIR.

This patch unify the temporary directory management inside
CustomClusterTestSuite. It introduce new 'tmp_dir_placeholders' arg in
CustomClusterTestSuite.with_args() that list tmp dirs to create.
'impalad_args', 'catalogd_args', and 'impala_log_dir' now accept
formatting pattern that is replaceable by a temporary dir path, defined
through 'tmp_dir_placeholders'.

There are few occurrences where mkdtemp is called and not replaceable by
this work, such as tests/comparison/cluster.py. In that case, this patch
change them to supply prefix arg so that developer knows that it comes
from Impala test script.

This patch also addressed several flake8 errors in modified files.

Testing:
- Pass custom cluster tests in exhaustive mode.
- Manually run few modified tests and observe that the temporary dirs
  are created and removed under logs/custom_cluster_tests/ as the tests
  go.

Change-Id: I8dd665e8028b3f03e5e33d572c5e188f85c3bdf5
Reviewed-on: http://gerrit.cloudera.org:8080/21836
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-10-02 01:25:39 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Michael Smith
35dc24fbc8 IMPALA-10148: Cleanup cores in TestHooksStartupFail
Generalizes coredump cleanup and expecting startup failure from
test_provider.py and uses it in test_query_event_hooks.py
TestHooksStartupFail to ensure core dumps are cleaned up.

Testing: ran changed tests, observed core files being created and
cleaned up while they ran. Observed other core files already present
were not cleaned up, as expected.

Change-Id: Iec32e0acbadd65aa78264594c85ffcd574cf3458
Reviewed-on: http://gerrit.cloudera.org:8080/19103
Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-10-13 16:20:34 +00:00
Bharath Vissapragada
4327cc3513 IMPALA-8931: Fix fe trigger for lineage events
Currently, fe generates lineage objects only when --lineage_event_log_dir
is configured. This is a legacy startup param. Lineages should also be
generated when event hooks are configured since they can potentially
consume them.

Added a test that confirms that the hook is invoked when a lineage is
created.

Change-Id: I2d8deb05883cc3ecab27fe4afd031a1e7ccb0829
Reviewed-on: http://gerrit.cloudera.org:8080/14194
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-09-09 19:07:13 +00:00
Tim Armstrong
4fb8e8e324 IMPALA-8816: reduce custom cluster test runtime in core
This includes some optimisations and a bulk move of tests
to exhaustive.

Move a bunch of custom cluster tests to exhaustive. I selected
these partially based on runtime (i.e. I looked most carefully
at the tests that ran for over a minute) and the likelihood
of them catching a precommit bug.  Regression tests for specific
edge cases and tests for parts of the code that are very stable
were prime candidates.

Remove an unnecessary cluster restart in test_breakpad.

Merge test_scheduler_error into test_failpoints to avoid an unnecessary
cluster restart.

Speed up cluster starts by ensuring that the default statestore args are
applied even when _start_impala_cluster() is called directly. This
shaves a couple of seconds off each restart. We made the default args
use a faster update frequency - see IMPALA-7185 - but they did not
take effect in all tests.

Change-Id: Ib2e3e7ebc9695baec4d69183387259958df10f62
Reviewed-on: http://gerrit.cloudera.org:8080/13967
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-08-06 21:34:26 +00:00
Fredy Wijaya
1b531be9be IMPALA-8589: Re-enable flaky test_query_event_hooks.py
This patch fixes the flaky test_query_event_hooks.py. The patch also
cuts down the waiting time for impalad timeout to 5 seconds from the
default 60 seconds especially for those tests that will fail Impala
startup.

Testing:
- Ran test_query_event_hooks.py in a loop.

Change-Id: Ia64550e986b5eba59a1d77657943932bb977d470
Reviewed-on: http://gerrit.cloudera.org:8080/13713
Reviewed-by: Fredy Wijaya <fwijaya@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-06-26 00:13:44 +00:00
Radford Nguyen
31195eb811 IMPALA-8473: Publish lineage info via hook
This commit introduces a hook mechanism for publishing,
lineage data specifically, but query information more
generally, from Impala.

The legacy behavior of writing the lineage file is
being retained but deprecated.

Hooks can be implemented by downstream consumers (i.e.
runtime dependencies) to hook into supported places during
Impala query execution:

- impalad startup
- query completion
    - see IMPALA-8572 for caveat/details

The consumers are to be frontend Java dependencies
intiated at runtime. 2 backend flags configure this
behavior:

- `query_event_hook_classes` specifies a comma-separated
list of hook consumer implementation classes that
are instantiated and registered at impala start up.

- `query_event_hook_nthreads`
specifies the number of threads to use for asynchronous
hook execution.  (Relevant if multiple hooks are
registered.)

Lineage information is passed from the backend after
a query completes (but before it returns) and given
to every hook to execute asynchronously.  In other words,
a query may complete and return to the user before any
or all hooks have completed executing.  An exception
during hook on-query-complete execution will simply be logged
and will not be (directly) fatal to the system.

Tests:
- added unit tests for FE hook execution
- added E2E tests for hook configuration, execution, error
- ran full build, tests

Change-Id: I23a896537a98bfef07fb27c70e9a87c105cd77a1
Reviewed-on: http://gerrit.cloudera.org:8080/13352
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-05-26 23:16:07 +00:00