Commit Graph

7 Commits

Author SHA1 Message Date
Csaba Ringhofer
f98b697c7b IMPALA-13929: Make 'functional-query' the default workload in tests
This change adds get_workload() to ImpalaTestSuite and removes it
from all test suites that already returned 'functional-query'.
get_workload() is also removed from CustomClusterTestSuite which
used to return 'tpch'.

All other changes besides impala_test_suite.py and
custom_cluster_test_suite.py are just mass removals of
get_workload() functions.

The behavior is only changed in custom cluster tests that didn't
override get_workload(). By returning 'functional-query' instead
of 'tpch', exploration_strategy() will no longer return 'core' in
'exhaustive' test runs. See IMPALA-3947 for why the workload affected
exploration_strategy(). An example of an affected test is
TestCatalogHMSFailures, which was skipped in both core and exhaustive
runs before this change.

get_workload() functions that return a different workload than
'functional-query' are not changed - it is possible that some of
these also don't handle exploration_strategy() as expected, but
individually checking these tests is out of scope in this patch.
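
As a rough illustration of the default described above, the following
minimal sketch uses simplified stand-in classes. Only get_workload() is
modeled, and TpcdsLikeSuite is a hypothetical name, not a real suite.

```python
# Simplified stand-ins for the real classes; only get_workload() is modeled.
class ImpalaTestSuite(object):
    @classmethod
    def get_workload(cls):
        # Default provided by the base class after this change.
        return 'functional-query'


class CustomClusterTestSuite(ImpalaTestSuite):
    # No override: inherits 'functional-query' (it used to return 'tpch').
    pass


class TpcdsLikeSuite(ImpalaTestSuite):  # hypothetical suite name
    @classmethod
    def get_workload(cls):
        # Suites that need a different workload keep their own override.
        return 'tpcds'


print(CustomClusterTestSuite.get_workload())
print(TpcdsLikeSuite.get_workload())
```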

Change-Id: I9ec6c41ffb3a30e1ea2de773626d1485c69fe115
Reviewed-on: http://gerrit.cloudera.org:8080/22726
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-04-08 07:12:55 +00:00
Riza Suminto
3005092332 IMPALA-13668: Add default_test_protocol parameter to py.test
ImpalaTestSuite.client is always initialized as a beeswax client, and many
tests use it directly rather than going through a helper method such as
execute_query().

This patch adds a default_test_protocol parameter to conftest.py. It
controls whether ImpalaTestSuite.client is initialized to
'beeswax_client', 'hs2_client', or 'hs2_http_client'. The parameter
still defaults to 'beeswax'.

This patch also adds the helper methods 'default_client_protocol_dimension',
'beeswax_client_protocol_dimension' and 'hs2_client_protocol_dimension'
for convenience and traceability.

It also reduces the number of places where a test method manually overrides
ImpalaTestSuite.client. They are replaced by a combination of
ImpalaTestSuite.create_impala_clients and
ImpalaTestSuite.close_impala_clients.
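
The protocol selection can be pictured with the sketch below. It is not
the actual conftest.py or ImpalaTestSuite code: SuiteSketch and
PROTOCOL_TO_CLIENT_ATTR are hypothetical, strings stand in for client
objects, and the 'hs2' and 'hs2-http' keys are assumed spellings of the
protocol names.

```python
# Assumed mapping from protocol name to the attribute holding that client.
# The attribute names come from the commit message; the keys other than
# 'beeswax' are assumptions.
PROTOCOL_TO_CLIENT_ATTR = {
    'beeswax': 'beeswax_client',
    'hs2': 'hs2_client',
    'hs2-http': 'hs2_http_client',
}


class SuiteSketch(object):
    """Hypothetical stand-in for a suite that owns one client per protocol."""

    def __init__(self, default_test_protocol='beeswax'):
        # Strings stand in for real client objects.
        self.beeswax_client = 'beeswax-connection'
        self.hs2_client = 'hs2-connection'
        self.hs2_http_client = 'hs2-http-connection'
        # 'client' points at whichever client the parameter selects.
        self.client = self._pick_default(default_test_protocol)

    def _pick_default(self, protocol):
        if protocol not in PROTOCOL_TO_CLIENT_ATTR:
            raise ValueError('unknown protocol: %s' % protocol)
        return getattr(self, PROTOCOL_TO_CLIENT_ATTR[protocol])
```

With no argument the sketch keeps the beeswax client as the default,
mirroring the unchanged default described above.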

Testing:
- Pass core tests.

Change-Id: I9165ea220b2c83ca36d6e68ef3b88b128310af23
Reviewed-on: http://gerrit.cloudera.org:8080/22336
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-01-24 12:19:02 +00:00
Riza Suminto
95f353ac4a IMPALA-13507: Allow disabling glog buffering via with_args fixture
We have plenty of custom_cluster tests that assert against the content of
Impala daemon log files while the process is still running, using
assert_log_contains() and its wrappers. The method specifically mentions
disabling glog buffering ('-logbuflevel=-1'), but not all
custom_cluster tests do that. This often results in flaky tests that are
hard to triage and are often neglected if they do not run frequently in
core exploration.

This patch adds a boolean parameter 'disable_log_buffering' to
CustomClusterTestSuite.with_args so that a test can declare its intention
to inspect log files in a live minicluster. If it is True, the minicluster
is started with '-logbuflevel=-1' for all daemons. If it is False, a
WARNING is logged on any call to assert_log_contains().
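
The two branches can be sketched as follows. This is an illustration
only, not the real with_args implementation: build_daemon_args,
log_assertion_is_reliable and the '--example_flag=1' argument are
hypothetical.

```python
import logging

LOG = logging.getLogger('custom_cluster_sketch')


def build_daemon_args(user_args, disable_log_buffering=False):
    """Return startup args, adding '-logbuflevel=-1' when buffering is disabled."""
    args = list(user_args)
    if disable_log_buffering:
        args.append('-logbuflevel=-1')
    return args


def log_assertion_is_reliable(disable_log_buffering):
    """Warn when a log assertion runs against a cluster that may buffer logs."""
    if not disable_log_buffering:
        LOG.warning('Log buffering was not disabled; log assertions may be flaky.')
        return False
    return True


# '--example_flag=1' is a placeholder argument, not a real daemon flag.
print(build_daemon_args(['--example_flag=1'], disable_log_buffering=True))
```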

Several complex custom_cluster tests are left unchanged and still
print such WARNING logs, including:
- TestQueryLive
- TestQueryLogTableBeeswax
- TestQueryLogOtherTable
- TestQueryLogTableHS2
- TestQueryLogTableAll
- TestQueryLogTableBufferPool
- TestStatestoreRpcErrors
- TestWorkloadManagementInitWait
- TestWorkloadManagementSQLDetails

This patch also fixes some small flake8 issues in the modified tests.

test_query_live.py shows a sign of flakiness: a test query submitted to
the coordinator fails because the sys.impala_query_live table does not
yet exist from the coordinator's perspective. This patch modifies
test_query_live.py to wait a few seconds until sys.impala_query_live
is queryable.
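
A generic form of such a wait is a polling loop with a deadline. The
helper below is a sketch under that assumption; wait_until and its
parameters are hypothetical and are not the code used in
test_query_live.py.

```python
import time


def wait_until(predicate, timeout_s=10.0, interval_s=0.5):
    """Poll predicate() until it returns True or timeout_s elapses.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.time() + timeout_s
    while True:
        if predicate():
            return True
        if time.time() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in for "is the table queryable yet?": succeeds on the third attempt.
attempts = []


def table_is_queryable():
    attempts.append(1)
    return len(attempts) >= 3


print(wait_until(table_is_queryable, timeout_s=5.0, interval_s=0.01))
```

In a real test the predicate would run a query against the table and
return False when the query fails.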

Testing:
- Pass custom_cluster tests in exhaustive exploration.

Change-Id: I56fb1746b8f3cea9f3db3514a86a526dffb44a61
Reviewed-on: http://gerrit.cloudera.org:8080/22015
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-11-05 04:49:05 +00:00
stiga-huang
ee21427d26 IMPALA-13009: Fix catalogd not sending deletion updates for some dropped partitions
*Background*

Since IMPALA-3127, catalogd sends incremental partition updates based on
the last sent table snapshot ('maxSentPartitionId_' to be specific).
Dropped partitions since the last catalog update are tracked in
'droppedPartitions_' of HdfsTable. When catalogd collects the next
catalog update, they will be collected. HdfsTable then clears the set.
See details in CatalogServiceCatalog#addHdfsPartitionsToCatalogDelta().

If an HdfsTable is invalidated, it's replaced with an IncompleteTable
which doesn't track any partitions. The HdfsTable object is then added
to the deleteLog so catalogd can send deletion updates for all its
partitions. The same happens if the HdfsTable is dropped. However, the
previously dropped partitions are not collected in this case, which
results in a leak in the catalog topic if the partition name is never
reused. Note that in the catalog topic, the key of a partition
update consists of the table name and the partition name. So if the
partition is added back to the table, the topic key is reused, which
resolves the leak.

The leak will be observed when a coordinator restarts. In the initial
catalog update sent from statestore, coordinator will find some
partition updates that are not referenced by the HdfsTable (assuming the
table is used again after the INVALIDATE). Then a Precondition check
fails and the table is not added to the coordinator.

*Overview of the patch*

This patch fixes the leak by also collecting the dropped partitions when
adding the HdfsTable to the deleteLog. A new field, dropped_partitions,
is added in THdfsTable to collect them. It's only used when catalogd
collects catalog updates.
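
The effect of the fix can be illustrated with a toy model. The real
logic lives in catalogd's Java code; ToyHdfsTable, deletion_keys and the
'table:partition' key format below are hypothetical and only mirror the
description above.

```python
class ToyHdfsTable(object):
    """Toy table tracking live partitions and those dropped since the last update."""

    def __init__(self, name, partitions):
        self.name = name
        self.partitions = set(partitions)
        self.dropped_partitions = set()

    def drop_partition(self, part):
        self.partitions.discard(part)
        self.dropped_partitions.add(part)


def deletion_keys(table, include_dropped):
    """Topic keys to delete when the whole table is invalidated or dropped."""
    parts = set(table.partitions)
    if include_dropped:
        # The fix: also cover partitions dropped since the last catalog update.
        parts |= table.dropped_partitions
    return sorted('%s:%s' % (table.name, p) for p in parts)


table = ToyHdfsTable('db.t', ['p=1', 'p=2'])
table.drop_partition('p=1')
print(deletion_keys(table, include_dropped=False))  # key for p=1 would leak
print(deletion_keys(table, include_dropped=True))   # both keys are deleted
```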

This patch also removes the Precondition check in the coordinator and just
reports the stale partitions, since IMPALA-12831 could also introduce them.

Also adds a log line in CatalogOpExecutor.alterTableDropPartition() to
show the dropped partition names for better diagnostics.

Tests
 - Added e2e tests

Change-Id: I12a68158dca18ee48c9564ea16b7484c9f5b5d21
Reviewed-on: http://gerrit.cloudera.org:8080/21326
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-05-07 01:56:49 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted some locations
to use the integer division '//' operator where an integer result was
needed (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
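
The difference between the two division operators can be seen in a
short example. The variable names are made up for illustration and are
not taken from the patch.

```python
from __future__ import absolute_import, division, print_function

records = 7
batch_size = 2

# With 'division' imported, '/' is true division on Python 2 as well.
average = records / batch_size

# '//' keeps an integer result, as needed for indices and record counts.
full_batches = records // batch_size

print(average)       # 3.5
print(full_batches)  # 3
```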

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Michael Smith
1eb0510eaa IMPALA-11456: Collapse filesystem Skip logic
Combines all SkipIf* classes for different filesystems into a single
SkipIfFS class. Many cases are simplified to 'not IS_HDFS', with the
rest as filesystem-specific special cases. The 'jira' option is removed
in favor of specific flags for each issue.

Change-Id: Ib928a6274baaaec45614887b9e762346a25812a1
Reviewed-on: http://gerrit.cloudera.org:8080/18781
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-08-10 22:37:08 +00:00
wzhou-code
02043744be IMPALA-11445: Fix bug in firing insert event of partitions located in different FS
When adding a partition whose location is in a file system different
from the file system of the table location, Impala accepts
it. But when inserting values into the table, catalogd throws an exception.

This patch fixes the issue by using the right FileSystem object.

Testing:
 - Added new test case with partitions on different file systems.
   Ran the test on S3.
 - Did manual tests in cluster with partitions on HDFS and Ozone.
 - Passed core tests.

Change-Id: I0491ee1bf40c3d5240f9124cef3f3169c44a8267
Reviewed-on: http://gerrit.cloudera.org:8080/18759
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-07-22 04:23:30 +00:00