32 Commits

Author SHA1 Message Date
Riza Suminto
1cead45114 IMPALA-13947: Test local catalog mode by default
Local catalog mode has been the default and has worked well in downstream
Impala for over 5 years. This patch turns on local catalog mode by
default (--catalog_topic_mode=minimal and --use_local_catalog=true) as
the preferred mode going forward.

Implemented LocalCatalog.setIsReady() to facilitate using local catalog
mode for FE tests. Some FE tests fail due to behavior differences in
local catalog mode, such as IMPALA-7539. This is probably OK since Impala
now largely hands over FileSystem permission checks to Apache Ranger.

The following custom cluster tests are pinned to evaluate under legacy
catalog mode because their behavior changed in local catalog mode:

TestCalcitePlanner.test_calcite_frontend
TestCoordinators.test_executor_only_lib_cache
TestMetadataReplicas
TestTupleCacheCluster
TestWorkloadManagementSQLDetailsCalcite.test_tpcds_8_decimal

In TestHBaseHmsColumnOrder.test_hbase_hms_column_order, set the
--use_hms_column_order_for_hbase_tables=true flag for both impalad and
catalogd to get a consistent column order in either local or legacy
catalog mode.

Changed TestCatalogRpcErrors.test_register_subscriber_rpc_error
assertions to be more fine grained by matching individual query id.

Moved most test methods from TestRangerLegacyCatalog to
TestRangerLocalCatalog, except for some that do need to run in legacy
catalog mode. Also renamed TestRangerLocalCatalog to
TestRangerDefaultCatalog. The table ownership issue in local catalog mode
remains unresolved (see IMPALA-8937).

Testing:
Pass exhaustive tests.

Change-Id: Ie303e294972d12b98f8354bf6bbc6d0cb920060f
Reviewed-on: http://gerrit.cloudera.org:8080/23080
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-08-06 21:42:24 +00:00
Riza Suminto
f28a32fbc3 IMPALA-13916: Change BaseTestSuite.default_test_protocol to HS2
This is the final patch to move all Impala e2e and custom cluster tests
to use the HS2 protocol by default. Only beeswax-specific tests still
run against the beeswax protocol by default. We can remove them once
Impala officially removes beeswax support.

HS2 error message formatting in impala-hs2-server.cc is adjusted a bit
to match with formatting in impala-beeswax-server.cc.

Move TestWebPageAndCloseSession from webserver/test_web_pages.py to
custom_cluster/test_web_pages.py to disable glog log buffering.

Testing:
- Pass exhaustive tests, except for some known and unrelated flaky
  tests.

Change-Id: I42e9ceccbba1e6853f37e68f106265d163ccae28
Reviewed-on: http://gerrit.cloudera.org:8080/22845
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Jason Fehr <jfehr@cloudera.com>
2025-05-20 14:32:10 +00:00
Riza Suminto
00dc79adf6 IMPALA-13907: Remove reference to create_beeswax_client
This patch replaces references to create_beeswax_client() with
create_hs2_client() or vector-based client creation to prepare for the
hs2 test migration.

test_session_expiration_with_queued_query is changed to use impala.dbapi
directly from Impyla due to a limitation in ImpylaHS2Connection.

TestAdmissionControllerRawHS2 is migrated to use hs2 as default test
protocol.

Modify test_query_expiration.py to set query options through the client
instead of a SET query. test_query_expiration is slightly modified due to
a behavior difference in hs2 ImpylaHS2Connection.

Remove remaining reference to BeeswaxConnection.QueryState.

Fixed a bug in ImpylaHS2Connection.wait_for_finished_timeout().

Fix some easy flake8 issues caught through this command:
git show HEAD --name-only | grep '^tests.*py' \
  | xargs -I {} impala-flake8 {} \
  | grep -e U100 -e E111 -e E301 -e E302 -e E303 -e F...

Testing:
- Pass exhaustive tests.

Change-Id: I1d84251835d458cc87fb8fedfc20ee15aae18d51
Reviewed-on: http://gerrit.cloudera.org:8080/22700
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-03-29 18:37:45 +00:00
Riza Suminto
e73e2d40da IMPALA-13864: Implement ImpylaHS2ResultSet.exec_summary
This patch implements building the exec summary table for
ImpylaHS2Connection. It adds a fetch_exec_summary argument to
ImpalaConnection.execute(). If this argument is True, an exec summary
table is added to the returned result object.

fetch_exec_summary is also implemented for BeeswaxConnection. Thus,
BeeswaxConnection no longer fetches the exec summary on every query by
default.

Tests that validate the exec summary table are updated to set
fetch_exec_summary=True and migrated to test against the hs2 protocol.
Change TestExecutorGroup._set_query_options() to set query options
through hs2_client iconfig instead of a SET query. Some flake8 issues are
addressed as well.

Move build_exec_summary_table to separate exec_summary.py file. Tweak it
a bit to return early if given TExecSummary is empty.

Fixed bug in ImpalaBeeswaxClient.fetch_results() where fetch will not
happen at all if discard_result argument is True.

Testing:
- Run and pass affected tests locally.

Change-Id: I7d88f78e58eeda29ce21e7828884c7a129d7efe6
Reviewed-on: http://gerrit.cloudera.org:8080/22626
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-03-24 22:34:20 +00:00
Xuebin Su
242095ac8a IMPALA-13729: Accept error messages not starting with prompt
Previously, error_msg_expected() only accepted error messages starting
with the following error prompt:
```
Query <query_id> failed:\n
```
However, for some tests using the Beeswax protocol, the error prompt may
appear in the middle of the error message instead of at its beginning.

Therefore, this patch adapts error_msg_expected() to accept error
messages not starting with the error prompt.

The error_msg_expected() function is renamed to error_msg_startswith()
to better describe its behavior.

Change-Id: Iac3e68bcc36776f7fd6cc9c838dd8da9c3ecf58b
Reviewed-on: http://gerrit.cloudera.org:8080/22468
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
2025-02-26 15:29:36 +00:00
Riza Suminto
3005092332 IMPALA-13668: Add default_test_protocol parameter to py.test
ImpalaTestSuite.client is always initialized as a beeswax client, and many
tests use it directly rather than going through a helper method such as
execute_query().

This patch adds a default_test_protocol parameter to conftest.py. It
controls whether ImpalaTestSuite.client is initialized to
'beeswax_client', 'hs2_client', or 'hs2_http_client'. This parameter
still defaults to 'beeswax'.

This patch also adds the helper methods 'default_client_protocol_dimension',
'beeswax_client_protocol_dimension' and 'hs2_client_protocol_dimension'
for convenience and traceability.

Reduced the number of places where test methods manually override
ImpalaTestSuite.client. They are replaced by a combination of
ImpalaTestSuite.create_impala_clients and
ImpalaTestSuite.close_impala_clients.

Testing:
- Pass core tests.

Change-Id: I9165ea220b2c83ca36d6e68ef3b88b128310af23
Reviewed-on: http://gerrit.cloudera.org:8080/22336
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2025-01-24 12:19:02 +00:00
Xuebin Su
ad868b9947 IMPALA-13115: Add query id to error messages
This patch adds the query id to the error messages in both

- the result of the `get_log()` RPC, and
- the error message in an RPC response

before they are returned to the client, so that the users can easily
figure out the errored queries on the client side.

To achieve this, the query id of the thread debug info is set in the
RPC handler method, and is retrieved from the thread debug info each
time the error reporting function or `get_log()` gets called.

Due to the change of the error message format, some checks in the
impala-shell.py are adapted to keep them valid.

Testing:
- Added helper function `error_msg_expected()` to check whether an
  error message is expected. It is stricter than only using the `in`
  operator.
- Added helper function `error_msg_equal()` to check if two error
  messages are equal regardless of the query ids.
- Various test cases are adapted to match the new error message format.
- `ImpalaBeeswaxException`, which is used in tests only, is simplified
  so that it has the same error message format as the exceptions for
  HS2.
- Added an assertion to the case of killing and restarting a worker
  in the custom cluster test to ensure that the query id is in
  the error message in the client log retrieved with `get_log()`.
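
A minimal sketch of the "equal regardless of the query ids" comparison
described above. It is not the helper added by this patch: the function
names are hypothetical, and the query id pattern (two 16-digit hex numbers
joined by ':') is an assumption about how ids are printed.

```python
import re

# Assumption: a query id prints as two 16-digit hex numbers joined by ':'.
QUERY_ID_RE = re.compile(r"[0-9a-f]{16}:[0-9a-f]{16}")


def strip_query_ids(msg):
    """Replace every query id in msg with a fixed placeholder."""
    return QUERY_ID_RE.sub("<query_id>", msg)


def messages_equal_ignoring_query_id(msg1, msg2):
    """Return True if the two messages differ only in their query ids."""
    return strip_query_ids(msg1) == strip_query_ids(msg2)
```

Normalizing both sides before comparing keeps the check strict about the
rest of the message, unlike a plain substring test.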

Change-Id: I67e659681e36162cad1d9684189106f8eedbf092
Reviewed-on: http://gerrit.cloudera.org:8080/21587
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-08-08 14:11:04 +00:00
Daniel Becker
9071030f7f IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator
On clusters with dedicated coordinators and executors the Iceberg
metadata scanner fragment(s) can be scheduled to executors, for example
during a join. The fragment in this case will fail a precondition check,
because either the 'frontend_' object or the table will not be present.

This change forces Iceberg metadata scanner fragments to be scheduled on
the coordinator. It is not enough to set the DataPartition type to
UNPARTITIONED, because unpartitioned fragments can still be scheduled on
executors. This change introduces a new flag in the TPlanFragment thrift
struct - if it is true, the fragment is always scheduled on the
coordinator.

Testing:
 - Added a regression test in test_coordinators.py.
 - Added a new planner test with two metadata tables and a regular table
   joined together.

Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Reviewed-on: http://gerrit.cloudera.org:8080/21138
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-29 04:40:31 +00:00
Joe McDonnell
82bd087fb1 IMPALA-11973: Add absolute_import, division to all eligible Python files
This takes steps to make Python 2 behave like Python 3 as
a way to flush out issues with running on Python 3. Specifically,
it handles two main differences:
 1. Python 3 requires absolute imports within packages. This
    can be emulated via "from __future__ import absolute_import"
 2. Python 3 changed division to "true" division that doesn't
    round to an integer. This can be emulated via
    "from __future__ import division"

This changes all Python files to add imports for absolute_import
and division. For completeness, this also includes print_function in the
import.

I scrutinized each old-division location and converted some locations
to use the integer division '//' operator if it needed an integer
result (e.g. for indices, counts of records, etc). Some code was also using
relative imports and needed to be adjusted to handle absolute_import.
This fixes all Pylint warnings about no-absolute-import and old-division,
and these warnings are now banned.
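
As an illustration of the two division cases, here is a small sketch. The
function names are hypothetical and are not taken from the Impala tree.

```python
from __future__ import absolute_import, division, print_function


def midpoint_index(num_records):
    """Return the middle index of a sequence with num_records items."""
    # An index must be an integer, so floor division is spelled out.
    return num_records // 2


def average_row_size(total_bytes, num_rows):
    """Return the mean row size as a float."""
    # With "division" imported, "/" is true division on Python 2 as well.
    return total_bytes / num_rows


print(midpoint_index(7))       # 3
print(average_row_size(7, 2))  # 3.5
```

On Python 3 the imports are no-ops, so the same file behaves identically
under both interpreters.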

Testing:
 - Ran core tests

Change-Id: Idb0fcbd11f3e8791f5951c4944be44fb580e576b
Reviewed-on: http://gerrit.cloudera.org:8080/19588
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2023-03-09 17:17:57 +00:00
Joe McDonnell
c71de994b0 IMPALA-11952 (part 1): Fix except syntax
Python 3 does not support this old except syntax:

except Exception, e:

Instead, it needs to be:

except Exception as e:

This uses impala-futurize to fix all locations of
the old syntax.

Testing:
 - The check-python-syntax.sh no longer shows errors
   for except syntax.

Change-Id: I1737281a61fa159c8d91b7d4eea593177c0bd6c9
Reviewed-on: http://gerrit.cloudera.org:8080/19551
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2023-02-28 17:11:50 +00:00
Michael Smith
bbb0b4939d IMPALA-11476: Support Ozone erasure coding
Adds support for identifying erasure coding policy with Ozone. Enables
testing Ozone with erasure coding.

Omits support for identifying erasure coding policy with the o3fs
protocol as that protocol is effectively deprecated and its classes
don't provide access to the ObjectStore.

Refactors volumeBucketPair to use StringTokenizer.

Test updates:
- test_exclusive_coordinator_plan: Ozone+EC blocks are 768MB, which is
  larger than all tables in our test environment. Use tpch_parquet which
  we rely on having 3 files (by loading from snapshot in this case).
- test_new_file_shorter: receives an EOFException when seeking with EC
- test_local_read: erasure-coded-bytes-read is also tied to IMPALA-11697
- test_erasure_coding: Ozone doesn't report files as erasure-coded
  (HDDS-7603)

Testing:
- Passes core E2E and custom cluster tests with TARGET_FILESYSTEM=ozone
  and ERASURE_CODING=true.

Change-Id: I201e2e33ce94bbc1e81631a0a315884bcc8047d1
Reviewed-on: http://gerrit.cloudera.org:8080/19324
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-01-25 18:18:28 +00:00
Michael Smith
1eb0510eaa IMPALA-11456: Collapse filesystem Skip logic
Combines all SkipIf* classes for different filesystems into a single
SkipIfFS class. Many cases are simplified to 'not IS_HDFS', with the
rest as filesystem-specific special cases. The 'jira' option is removed
in favor of specific flags for each issue.

Change-Id: Ib928a6274baaaec45614887b9e762346a25812a1
Reviewed-on: http://gerrit.cloudera.org:8080/18781
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-08-10 22:37:08 +00:00
Michael Smith
830625b104 IMPALA-9442: Add Ozone to minicluster
Adds Ozone as an alternative to hdfs in the minicluster. Select by
setting `export TARGET_FILESYSTEM=ozone`. With that flag,
run-mini-dfs.sh will start Ozone instead of HDFS. Requires a snapshot
because Ozone does not support HBase (HDDS-3589); snapshot loading
doesn't work yet primarily due to HDDS-5502.

Uses the o3fs interface because Ozone puts specific restrictions on
bucket names (no underscores, for instance), and it was a lot easier to
use an interface where everything is written to a single bucket than to
update all Impala's use of HDFS-style paths to make `test-warehouse` a
bucket inside a volume.

Specifies reduced Ozone client retries during shutdown where Ozone may
not be available.

Passes tests with FE_TEST=false BE_TEST=false.

Change-Id: Ibf8b0f7b2d685d8b011df1926e12bf5434b5a2be
Reviewed-on: http://gerrit.cloudera.org:8080/18738
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
2022-08-03 16:58:20 +00:00
Qifan Chen
07a3e6e0df IMPALA-10992 Planner changes for estimate peak memory
This patch provides replan support for multiple executor group sets.
Each executor group set is associated with a distinct number of nodes
and a threshold for estimated memory per host in bytes that can be
denoted as [<group_name_prefix>:<#nodes>, <threshold>].

In the patch, a query of type EXPLAIN, QUERY or DML can be compiled
more than once. In each attempt, per host memory is estimated and
compared with the threshold of an executor group set. If the estimated
memory is no more than the threshold, the iteration process terminates
and the final plan is determined. The executor group set with the
threshold is selected to run the query.
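
The iteration can be sketched roughly as follows. This is an illustration
in Python, not the planner code: the names are hypothetical, the group
sets are assumed to be tried in the order given, and falling back to the
last group set when no threshold is met is an assumption.

```python
from collections import namedtuple

# Hypothetical model of [<group_name_prefix>:<#nodes>, <threshold>].
ExecutorGroupSet = namedtuple(
    "ExecutorGroupSet", ["name", "num_nodes", "mem_threshold"])

MB = 1024 ** 2
PB = 1024 ** 5


def choose_group_set(group_sets, estimate_per_host_mem):
    """Compile once per group set and stop when the estimate fits.

    estimate_per_host_mem stands in for one compilation attempt and
    returns the estimated per-host memory in bytes for that group set.
    Returns the selected group set and the number of compilations.
    """
    attempts = 0
    for group_set in group_sets:
        attempts += 1
        if estimate_per_host_mem(group_set) <= group_set.mem_threshold:
            return group_set, attempts
    # Assumption: if no threshold is met, keep the last plan.
    return group_sets[-1], attempts


group_sets = [
    ExecutorGroupSet("regular", 3, 64 * MB),
    ExecutorGroupSet("large", 3, 8 * PB),
]
chosen, attempts = choose_group_set(group_sets, lambda gs: 408 * MB)
print(chosen.name, attempts)  # large 2
```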

A new query option 'enable_replan', default to 1 (enabled), is added.
It can be set to 0 to disable this patch and to generate the distributed
plan for the default executor group.

To avoid long compilation times, the following enhancements are enabled.
Note that enhancement 1 can be disabled when a relevant metadata change
is detected.

 1. Authorization is performed only for the 1st compilation;
 2. openTransaction() is called for transactional queries in 1st
    compilation and the saved transactional info is used in
    subsequent compilations. Similar logic is applied to Kudu
    transactional queries.

To facilitate testing, the patch imposes an artificial two executor
group setup in FE as follows.

 1. [regular:<#nodes>, 64MB]
 2. [large:<#nodes>, 8PB]

This setup is enabled when a new query option 'test_replan' is set
to 1 in backend tests, or RuntimeEnv.INSTANCE.isTestEnv() is true as
in most frontend tests. This query option is set to 0 by default.

Compilation time increases when a query is compiled in several
iterations, as shown below for several TPC-DS queries. The increase
is mostly due to redundant work in either single node plan creation
or the recomputation of the value transfer graph. For small queries, the
increase can be avoided if they can be compiled in a single iteration
by properly setting the smallest threshold among all executor group
sets. For example, for the set of queries listed below, the smallest
threshold can be set to 320MB to catch both q15 and q21 in one
compilation.

Queries  Estimated  Compilation time (ms)       Percentage
         Memory     2-iterations  1-iteration   of increase
q1       408MB             60.14        25.75       133.56%
q11      1.37GB           261.00       109.61       138.11%
q10a     519MB            139.24        54.52       155.39%
q13      339MB            143.82        60.08       139.38%
q14a     3.56GB           762.68       312.92       143.73%
q14b     2.20GB           522.01       245.13       112.95%
q15      314MB              9.73         4.28       127.33%
q21      275MB             16.00         8.18        95.59%
q23a     1.50GB           461.69       231.78        99.19%
q23b     1.34GB           461.31       219.61       110.05%
q4       2.60GB           218.05       105.07       107.52%
q67      5.16GB           694.59       334.24       101.82%

Testing:
 1. Almost all FE and BE tests are now run in the artificial two
    executor setup except a few where a specific cluster configuration
    is desirable;
 2. Ran core tests successfully;
 3. Added a new observability test and a new query assignment test;
 4. Disabled concurrent insert test (test_concurrent_inserts) and
    failing inserts (test_failing_inserts) test in local catalog mode
    due to flakiness. Reported both in IMPALA-11189 and IMPALA-11191.

Change-Id: I75cf17290be2c64fd4b732a5505bdac31869712a
Reviewed-on: http://gerrit.cloudera.org:8080/18178
Reviewed-by: Qifan Chen <qchen@cloudera.com>
Tested-by: Qifan Chen <qchen@cloudera.com>
2022-03-21 20:17:28 +00:00
Fucun Chu
157086cb80 IMPALA-10771: Add Tencent COS support
This patch adds support for COS (Cloud Object Storage). Using
hadoop-cos, the implementation is similar to other remote FileSystems.

New flags for COS:
- num_cos_io_threads: Number of COS I/O threads. Defaults to 16.

Follow-up:
- Support for caching COS file handles will be addressed in
   IMPALA-10772.
- test_concurrent_inserts and test_failing_inserts in
   test_acid_stress.py are skipped due to slow file listing on
   COS (IMPALA-10773).

Tests:
 - Upload hdfs test data to a COS bucket. Modify all locations in HMS
   DB to point to the COS bucket. Remove some hdfs caching params.
   Run CORE tests.

Change-Id: Idce135a7591d1b4c74425e365525be3086a39821
Reviewed-on: http://gerrit.cloudera.org:8080/17503
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-12-08 16:32:02 +00:00
stiga-huang
2dfc68d852 IMPALA-7712: Support Google Cloud Storage
This patch adds support for GCS (Google Cloud Storage). Using the
gcs-connector, the implementation is similar to other remote
FileSystems.

New flags for GCS:
 - num_gcs_io_threads: Number of GCS I/O threads. Defaults to 16.

Follow-up:
 - Support for spilling to GCS will be addressed in IMPALA-10561.
 - Support for caching GCS file handles will be addressed in
   IMPALA-10568.
 - test_concurrent_inserts and test_failing_inserts in
   test_acid_stress.py are skipped due to slow file listing on
   GCS (IMPALA-10562).
 - Some tests are skipped due to issues introduced by /etc/hosts setting
   on GCE instances (IMPALA-10563).

Tests:
 - Compile and create hdfs test data on a GCE instance. Upload test data
   to a GCS bucket. Modify all locations in HMS DB to point to the GCS
   bucket. Remove some hdfs caching params. Run CORE tests.
 - Compile and load snapshot data to a GCS bucket. Run CORE tests.

Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b
Reviewed-on: http://gerrit.cloudera.org:8080/17121
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-13 11:20:08 +00:00
Joe McDonnell
97856478ec IMPALA-10198 (part 1): Unify Java in a single java/ directory
This changes all existing Java code to be submodules under
a single root pom. The root pom is impala-parent/pom.xml
with minor changes to add submodules.

This avoids most of the weird CMake/maven interactions,
because there is now a single maven invocation for all
the Java code.

This moves all the Java projects other than fe into
a top level java directory. fe is left where it is
to avoid disruption (but still is compiled via the
java directory's root pom). Various pieces of code
that reference the old locations are updated.

Based on research, there are two options for dealing
with the shaded dependencies. The first is to have an
entirely separate Maven project with a separate Maven
invocation. In this case, the consumers of the shaded
jars will see the reduced set of transitive dependencies.
The second is to have the shaded dependencies as modules
with a single Maven invocation. The consumer would see
all of the original transitive dependencies and need to
exclude them all. See MSHADE-206/MNG-5899. This chooses
the second.

This only moves code around and does not focus on version
numbers or making "mvn versions:set" work.

Testing:
 - Ran a core job
 - Verified existing maven commands from fe/ directory still work
 - Compared the *-classpath.txt files from fe and executor-deps
   and verified they are the same except for paths

Change-Id: I08773f4f9d7cb269b0491080078d6e6f490d8d7a
Reviewed-on: http://gerrit.cloudera.org:8080/16500
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2020-10-15 19:30:13 +00:00
Bikramjeet Vig
004e3c897e IMPALA-8830: Fix executor group assignment of coordinator only queries
With this fix, coordinator only queries are submitted to a pseudo
executor group named "empty group (using coordinator only)" which
is empty. This allows running coordinator only queries regardless
of the presence of any healthy executor groups.

Testing:
Added a custom cluster test and modified tests that relied on
coordinator only queries to be queued in absence of executor groups.

Change-Id: I8fe098032744aa20bbbe4faddfc67e7a46ce03d5
Reviewed-on: http://gerrit.cloudera.org:8080/14183
Reviewed-by: Bikramjeet Vig <bikramjeet.vig@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-06-19 01:35:29 +00:00
Csaba Ringhofer
58273fff60 IMPALA-9609: Minimize Frontend activity in executor only impalads
Until now the Frontend started fully regardless of the is_coordinator
flag, e.g. it created connections to the HMS, which is both error prone
and can DoS the metastore. (Note that even coordinators started to
connect to HMS only in the recent past, related to local catalog mode and
ACID transactions.)

Executor only impalads still need a JVM as queries can contain
Java calls (HDFS/HBase API calls, Hive UDFs), but most of the JNI API
provided by JniFrontend shouldn't be called by executors. It seems
that the whole Frontend object is needed only by coordinators.

Testing:
- generally executor only mode doesn't seem to be well covered
- ran test_coordinators.py which has some tests with executor only
  impalads
- added new test for HBase tables (Hive UDFs and HDFS were already
  covered)

Change-Id: I4627e5e3520175153cb49e24fd480815dfefdae1
Reviewed-on: http://gerrit.cloudera.org:8080/15793
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-04-24 21:43:53 +00:00
Lars Volker
69a9ac102d IMPALA-9151: Maintain cluster size in ExecutorMembershipSnapshot
This change improves the cluster membership snapshot we maintain in the
frontend in cases where all executors have been shut down or none have
started yet.

Prior to this change, when configuring Impala with executor groups, the
planner might see an ExecutorMembershipSnapshot that has no executors in
it. This could happen if the first executor group had not started up
yet, or if all executor groups had been shut down. If this happened, the
planner would make sub-optimal decisions, e.g. decide on a broadcast
join vs a partitioned hash join.

With this change if no executors have been registered so far, the
planner will use the expected number of executors which can be set using
the -num_expected_executors flag and is 20 by default. After executors
come online, the planner will use the size of the largest healthy
executor group, and it will hold on to the group's size even if it shuts
down or becomes unhealthy. This allows the planner to work on the
assumption that a healthy executor group of the same size will
eventually come online to execute the query.
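
One reading of this rule, sketched in Python for illustration only. The
class and method names are hypothetical and do not mirror the actual
ExecutorMembershipSnapshot code.

```python
class PlannerClusterSize(object):
    """Tracks the executor count the planner should assume."""

    def __init__(self, num_expected_executors=20):
        # Mirrors the -num_expected_executors flag and its default of 20.
        self._num_expected = num_expected_executors
        self._last_known_size = None

    def update(self, healthy_group_sizes):
        """Record a membership update given sizes of healthy groups."""
        if healthy_group_sizes:
            self._last_known_size = max(healthy_group_sizes)
        # With no healthy group, the previous size is kept on purpose.

    def size(self):
        if self._last_known_size is None:
            return self._num_expected
        return self._last_known_size


sizes = PlannerClusterSize()
print(sizes.size())   # 20, nothing has registered yet
sizes.update([3, 5])
print(sizes.size())   # 5, the largest healthy group
sizes.update([])
print(sizes.size())   # 5, retained after the groups went away
```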

Change-Id: Ib6b05326c82fb3ca625c015cfcdc38f891f5d4f9
Reviewed-on: http://gerrit.cloudera.org:8080/14756
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-12-04 00:40:04 +00:00
Sahil Takiar
ac87278b16 IMPALA-8950: Add -d, -f options to hdfs copyFromLocal, put, cp
Add the -d option and -f option to the following commands:

`hdfs dfs -copyFromLocal <localsrc> URI`
`hdfs dfs -put [ - | <localsrc1> .. ]. <dst>`
`hdfs dfs -cp URI [URI ...] <dest>`

The -d option "Skip[s] creation of temporary file with the suffix
._COPYING_." which improves performance of these commands on S3 since S3
does not support metadata only renames.

The -f option "Overwrites the destination if it already exists".
Combined with HADOOP-13884, this mitigates S3 consistency issues by
avoiding a HEAD request to check whether the destination file exists.

Added the method 'copy_from_local' to the BaseFilesystem class.
Re-factored most usages of the aforementioned HDFS commands to use
the filesystem_client. Some usages were not appropriate / worth
refactoring, so occasionally this patch just adds the '-d' and '-f'
options explicitly. All calls to '-put' were replaced with
'copyFromLocal' because they both copy files from the local fs to a HDFS
compatible target fs.
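
For illustration, a command-line based copy with both options could look
like the sketch below. It is not the BaseFilesystem implementation from
this patch; the function names and paths are hypothetical.

```python
import subprocess


def copy_from_local_cmd(local_src, dst):
    """Build the argv for copying a local file to an HDFS compatible fs."""
    # -d skips the temporary ._COPYING_ file, -f overwrites the destination.
    return ["hdfs", "dfs", "-copyFromLocal", "-d", "-f", local_src, dst]


def copy_from_local(local_src, dst):
    """Run the copy and raise CalledProcessError if it fails."""
    subprocess.check_call(copy_from_local_cmd(local_src, dst))


print(" ".join(copy_from_local_cmd("/tmp/data.parq", "/test-warehouse/t/")))
```

Building the argument list separately from running it makes the chosen
options easy to check without a running filesystem.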

Since WebHDFS does not have good support for copying files, this patch
removes the copy functionality from the PyWebHdfsClientWithChmod.
Re-factored the hdfs_client so that it uses a DelegatingHdfsClient
that delegates to either the HadoopFsCommandLineClient or
PyWebHdfsClientWithChmod.

Testing:
* Ran core tests on HDFS and S3

Change-Id: I0d45db1c00554e6fb6bcc0b552596d86d4e30144
Reviewed-on: http://gerrit.cloudera.org:8080/14311
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-10-05 00:04:08 +00:00
Lars Volker
576f205bff IMPALA-8936: Improve queuing reason for unhealthy executor groups
In some situations, users might actually expect not having a healthy
executor group around, e.g. when they're starting one and it takes a
while to come online. This change makes the queuing reason more generic
and drops the "unhealthy" concept from it to reduce confusion.

Change-Id: Idceab7fb56335bab9d787b0f351a41e6efd7dd59
Reviewed-on: http://gerrit.cloudera.org:8080/14210
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-09-11 08:04:05 +00:00
Lars Volker
2397ae5590 IMPALA-8484: Run queries on disjoint executor groups
This change adds support for running queries inside a single admission
control pool on one of several, disjoint sets of executors called
"executor groups".

Executors can be configured with an executor group through the newly
added '--executor_groups' flag. Note that in anticipation of future
changes, the flag already uses the plural form, but only a single
executor group may be specified for now. Each executor group
specification can optionally contain a minimum size, separated by a
':', e.g. --executor_groups default-pool-1:3. Only when the cluster
membership contains at least that number of executors for the groups
will it be considered for admission.
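
A sketch of how such a specification can be split into its parts. The
function name is hypothetical, and the minimum size of 1 used when none is
given is an assumption.

```python
def parse_executor_group(spec, default_min_size=1):
    """Split '<name>[:<min_size>]' into a (name, min_size) tuple."""
    name, sep, min_size_text = spec.partition(":")
    if not name:
        raise ValueError("executor group name is empty: %r" % spec)
    # Assumption: the minimum size is 1 when the spec does not give one.
    min_size = int(min_size_text) if sep else default_min_size
    return name, min_size


print(parse_executor_group("default-pool-1:3"))  # ('default-pool-1', 3)
print(parse_executor_group("default-pool-1"))    # ('default-pool-1', 1)
```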

Executor groups are mapped to resource pools by their name: An executor
group can service queries from a resource pool if the pool name is a
prefix of the group name separated by a '-'. For example, queries in
pool poolA can be serviced by executor groups named poolA-1 and poolA-2,
but not by groups named foo or poolB-1.

During scheduling, executor groups are considered in alphabetical order.
This means that one group is filled up entirely before a subsequent
group is considered for admission. Groups also need to pass a health
check before being considered. In particular, they must contain at least the
minimum number of executors specified.

If no group is specified during startup, executors are added to the
default executor group. If - during admission - no executor group for a
pool can be found and the default group is non-empty, then the default
group is considered. The default group does not have a minimum size.
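
The name mapping, ordering, health check and default group fallback can be
sketched together as follows. This is an illustration with hypothetical
names, not the scheduler code. It reads "no executor group for a pool can
be found" as "no group name matches the pool", which is an assumption.

```python
from collections import namedtuple

ExecutorGroup = namedtuple(
    "ExecutorGroup", ["name", "min_size", "num_executors"])


def serves_pool(group_name, pool_name):
    """A group serves a pool if the pool name plus '-' prefixes its name."""
    return group_name.startswith(pool_name + "-")


def is_healthy(group):
    return group.num_executors >= group.min_size


def candidate_groups(groups, pool_name, default_group=None):
    """Return the groups to consider for pool_name, in alphabetical order."""
    matching = sorted(
        (g for g in groups if serves_pool(g.name, pool_name)),
        key=lambda g: g.name)
    if not matching:
        # Fall back to the default group only if it has executors.
        if default_group is not None and default_group.num_executors > 0:
            return [default_group]
        return []
    return [g for g in matching if is_healthy(g)]


groups = [
    ExecutorGroup("poolA-2", 2, 2),
    ExecutorGroup("poolA-1", 3, 3),
    ExecutorGroup("poolB-1", 1, 1),
    ExecutorGroup("foo", 1, 1),
]
print([g.name for g in candidate_groups(groups, "poolA")])
# ['poolA-1', 'poolA-2']
```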

This change inverts the order of scheduling and admission. Prior to this
change, queries were scheduled before submitting them to the admission
controller. Now the admission controller computes schedules for all
candidate executor groups before each admission attempt. If the cluster
membership has not changed, then the schedules of the previous attempt
will be reused. This means that queries will no longer fail if the
cluster membership changes while they are queued in the admission
controller.

This change also alters the default behavior when using a dedicated
coordinator and no executors have registered yet. Prior to this change,
a query would fail immediately with an error ("No executors registered
in group"). Now a query will get queued and wait until executors show
up, or it times out after the pool's queue timeout period.

Testing:

This change adds a new custom cluster test for executor groups. It
makes use of new capabilities added to start-impala-cluster.py to bring
up additional executors into an already running cluster.

Additionally, this change adds an instructional implementation of
executor group based autoscaling, which can be used during development.
It also adds a helper to run queries concurrently. Both are used in a
new test to exercise the executor group logic and to prevent regressions
to these tools.

In addition to these tests, the existing tests for the admission
controller (both BE and EE tests) thoroughly exercise the changed code.
Some of them required changes themselves to reflect the new behavior.

I looped the new tests (test_executor_groups and test_auto_scaling) for
a night (110 iterations each) without any issues.

I also started an autoscaling cluster with a single group and ran
TPC-DS, TPC-H, and test_queries on it successfully.

Known limitations:

When using executor groups, only a single coordinator and a single AC
pool (i.e. the default pool) are supported. Executors do not include the
number of currently running queries in their statestore updates and so
admission controllers are not aware of the number of queries admitted by
other controllers per host.

Change-Id: I8a1d0900f2a82bd2fc0a906cc094e442cffa189b
Reviewed-on: http://gerrit.cloudera.org:8080/13550
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-07-21 04:54:03 +00:00
Lars Volker
2dbd7eec81 IMPALA-8758: Improve error message when no executors are online
Prior to this change a dedicated coordinator would not create the
default executor group when registering its own backend descriptor in
the cluster membership. This caused a misleading error message during
scheduling when the default executor group could not be found.

To improve this, we now always create the default executor group and
return an improved error message if it is empty.

This change adds a test that validates that a query against a cluster
without executors returns the expected error.

Change-Id: Ia4428ef833363f52b14dfff253569212427a8e2f
Reviewed-on: http://gerrit.cloudera.org:8080/13866
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-07-16 11:11:43 +00:00
Lars Volker
6b3e5fe426 IMPALA-8460: Simplify cluster membership management
This change adds a class to track cluster membership called
ClusterMembershipMgr. It replaces the logic that was partially
duplicated between the ImpalaServer and the Coordinator and makes sure
that the local backend descriptor is consistent (IMPALA-8469).

The ClusterMembershipMgr maintains a view of the cluster membership and
incorporates incoming updates from the statestore. It also registers the
local backend with the statestore after startup. Clients can obtain a
consistent, immutable snapshot of the current cluster membership from
the ClusterMembershipMgr. Additionally, callbacks can be registered to
receive notifications of cluster membership changes. The ImpalaServer
and Frontend use this mechanism.
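The snapshot and callback pattern described above can be sketched in Python as follows. This is only an illustration of the idea; the real ClusterMembershipMgr is a C++ class, and the class name, method names and backend address used here are hypothetical.

```python
from types import MappingProxyType
from typing import Callable, Dict, Iterable, List, Mapping


class MembershipManager:
    """Hypothetical sketch: immutable snapshots plus change callbacks."""

    def __init__(self) -> None:
        self._snapshot: Mapping[str, str] = MappingProxyType({})
        self._callbacks: List[Callable[[Mapping[str, str]], None]] = []

    def register_callback(self, fn: Callable[[Mapping[str, str]], None]) -> None:
        self._callbacks.append(fn)

    def get_snapshot(self) -> Mapping[str, str]:
        # Callers get a read-only view that is never modified later.
        return self._snapshot

    def apply_update(self, added: Dict[str, str],
                     removed: Iterable[str]) -> None:
        # Build a new snapshot instead of mutating the current one, so
        # snapshots handed out earlier stay consistent.
        current = dict(self._snapshot)
        current.update(added)
        for backend_id in removed:
            current.pop(backend_id, None)
        self._snapshot = MappingProxyType(current)
        for fn in self._callbacks:
            fn(self._snapshot)
```

A snapshot obtained before an update keeps its old contents, while registered callbacks receive the new snapshot.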

This change also generalizes the fix for IMPALA-7665: updates from the
statestore to the cluster membership topic are only made visible to the
rest of the local server after a post-recovery grace period has elapsed.
As part of this the flag
'failed_backends_query_cancellation_grace_period_ms' is replaced with
'statestore_subscriber_recovery_grace_period_ms'. To tell the initial
startup from post-recovery, a new metric
'statestore-subscriber.num-connection-failures' is exposed by the
daemon, which tracks the total number of connection failures to the
statestore over the process lifetime.

This change also unifies the naming of executor-related classes, in
particular it renames "BackendConfig" to "ExecutorGroup". In
anticipation of a subsequent change (IMPALA-8484), it adds maps to store
multiple executor groups.

This change also disables the generation of default operators from the
thrift files and instead adds explicit implementations for the ones that
we rely on. This forces us to explicitly specify comparators when
manipulating containers of thrift structs and will help prevent
accidental bugs.

Testing: This change adds a backend unit test for the new cluster
membership manager. The observable behavior of Impala does not change,
and the existing scheduler unit test and end to end tests should make
sure of that.

Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3
Reviewed-on: http://gerrit.cloudera.org:8080/13207
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2019-06-02 06:38:07 +00:00
Tim Armstrong
d05f73f415 IMPALA-7647: Add HS2/Impyla dimension to TestQueries
I used some ideas from Alex Leblang's abandoned patch:
https://gerrit.cloudera.org/#/c/137/ in order to run .test files through
HS2. The advantage of using Impyla is that much of the code will be
reusable for any Python client implementing the standard Python dbapi
and does not require us to implement yet another thrift client.

This gives us better coverage of non-trivial result sets from HS2,
including handling of NULLs, error logs and more interesting result
sets than the basic HS2 tests.

I added HS2 coverage to TestQueries, which has a reasonable variety of
queries and covers the data types in alltypes. I also added
TestDecimalQueries, TestStringQuery and TestCharFormats to get coverage
of DECIMAL, CHAR and VARCHAR that aren't in alltypes. Coverage of
result sets with NULLs was limited so I added a couple of queries.

Places where results differ from Beeswax:
* Impyla is a Python dbapi client so must convert timestamps into python datetime
  objects, which only have microsecond precision. Therefore result
  timestamps with nanosecond precision are truncated.
* The HS2 interface reports the NULL type as BOOLEAN as a workaround for
  IMPALA-914.
* The Beeswax interface reported VARCHAR as STRING, but HS2 reports
  VARCHAR.
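The timestamp truncation in the first item can be illustrated with a small Python sketch. This is not Impyla's conversion code; the helper name and the assumed input format ('YYYY-MM-DD HH:MM:SS' with an optional fractional part) are hypothetical.

```python
from datetime import datetime


def to_datetime_truncating(ts: str) -> datetime:
    """Parse a timestamp string, keeping at most six fractional digits.

    Python datetime objects only store microseconds, so any digits past
    the sixth fractional digit are dropped (truncated, not rounded).
    """
    head, _, frac = ts.partition(".")
    base = datetime.strptime(head, "%Y-%m-%d %H:%M:%S")
    micros = int(frac[:6].ljust(6, "0")) if frac else 0
    return base.replace(microsecond=micros)
```

For example, a value with nine fractional digits such as '2018-10-09 00:45:10.123456789' keeps only the first six digits of the fraction.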

I dealt with different results by adding additional result sections so
that the expected differences between the clients/protocols were
explicit.

Limitations:
* Not all of the same methods are implemented as for beeswax, so some
  tests that have more complicated interactions with the client will not
  work with HS2 yet.
* We don't have a way to get the affected row count for inserts.

I also simplified the ImpalaConnection API by removing some unnecessary
methods and moved some generic methods to the base class.

Testing:
* Confirmed that it detected IMPALA-7588 by re-applying the buggy patch.
* Ran exhaustive and CentOS6 tests.

Change-Id: I9908ccc4d3df50365be8043b883cacafca52661e
Reviewed-on: http://gerrit.cloudera.org:8080/11546
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-10-09 00:45:10 +00:00
poojanilangekar
880011fa1f IMPALA-6031: Fix executor node count in distributed plans
Prior to this change, the planner also considered coordinator-only
nodes as executors while estimating the number of scan nodes to be
used in the distributed plan. This change ensures that only
executor nodes are considered for that estimation.

Testing:
Added a new custom cluster test to verify the same.

Change-Id: I44af6b40099a495e13a0a5dc72c491d486d23aa8
Reviewed-on: http://gerrit.cloudera.org:8080/10873
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2018-07-07 03:33:08 +00:00
Vuk Ercegovac
2894884deb IMPALA-6670: refresh lib-cache entries from plan
When an impalad is in executor-only mode, it receives no
catalog updates. As a result, lib-cache entries are never
refreshed. A consequence is that udf queries can return
incorrect results or may not run due to resolution issues.
Both cases are caused by the executor using a stale copy
of the lib file. For incorrect results, an old version of
the method may be used. Resolution issues can come up if
a method is added to a lib file.

The solution in this change is to capture the coordinator's
view of the lib file's last modified time when planning.
This last modified time is then shipped with the plan to
executors. Executors must then use both the lib file path
and the last modified time as a key for the lib-cache.
If the coordinator's last modified time is more recent than
the executor's lib-cache entry, then the entry is refreshed.
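The refresh rule can be sketched in Python as follows. This is a simplified illustration, not the backend lib-cache: the LibCache and CacheEntry classes and the fetch callback (which stands in for reading the file and its last modified time from the filesystem) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CacheEntry:
    mtime: int       # last modified time of the cached copy
    contents: bytes  # cached lib file contents


class LibCache:
    """Hypothetical executor-side cache keyed by lib file path."""

    def __init__(self, fetch: Callable[[str], Tuple[int, bytes]]) -> None:
        self._fetch = fetch  # returns (mtime, contents) for a path
        self._entries: Dict[str, CacheEntry] = {}

    def get(self, path: str, coordinator_mtime: int) -> CacheEntry:
        entry = self._entries.get(path)
        # Refresh when there is no entry yet, or when the coordinator saw
        # a newer version of the file than the one cached here.
        if entry is None or coordinator_mtime > entry.mtime:
            mtime, contents = self._fetch(path)
            entry = CacheEntry(mtime, contents)
            self._entries[path] = entry
        return entry
```

With this rule the filesystem is only consulted when the plan carries a newer last modified time than the cached entry, which avoids the per-lookup cost of the first alternative discussed below.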

Brief discussion of alternatives:

- lib-cache always checks last modified time
  + easy/local change to lib-cache
  - adds an fs lookup always. rejected for this reason

- keep the last modified time in the catalog
  - bound on staleness is too loose. consider the case where
    fn's f1, f2, f3 are created with last modified times of
    t1, t2, t3. treat the fn's last modified time as a low-watermark;
    if the cache entry has a more recent time, use it. Such a scheme
    would allow the version at t2 to persist. An old fn may keep the
    state from converging to the latest. This could end up with strange
    cases where different versions of the lib are used across executors
    for a single query.

    In contrast, the change in this patch relies on the statestore to
    push versions forward at all coordinators, so will push all
    versions at all caches forward as well.

Testing:
- added an e2e custom cluster test

Change-Id: Icf740ea8c6a47e671427d30b4d139cb8507b7ff6
Reviewed-on: http://gerrit.cloudera.org:8080/9697
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
2018-03-24 04:38:53 +00:00
Vuk Ercegovac
6a2b7a64fb IMPALA-4704: Turns on client connections when local catalog initialized.
Currently, impalad starts beeswax and hs2 servers even if the
catalog has not yet been initialized. As a result, client
connections see an error message stating that the impalad
is not yet ready.

This patch changes the impalad startup sequence to wait
until the catalog is received before opening beeswax and hs2 ports
and starting their servers.

Testing:
- python e2e tests that start a cluster without a catalog
  and check that client connections are rejected as expected.

Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6
Reviewed-on: http://gerrit.cloudera.org:8080/8202
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins
2017-11-13 21:14:14 +00:00
Matthew Jacobs
6c12546561 IMPALA-4833: Compute precise per-host reservation size
Before this change, the per-host reservation size was computed
by the Planner. However, scheduling happens after planning,
so the Planner must assume that all fragments run on all
hosts, and the reservation size is likely much larger than
it needs to be.

This moves the computation of the per-host reservation size
to the BE where it can be computed more precisely. This also
includes a number of plan/profile changes.

Change-Id: Idbcd1e9b1be14edc4017b4907e83f9d56059fbac
Reviewed-on: http://gerrit.cloudera.org:8080/7630
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Impala Public Jenkins
2017-08-12 08:10:07 +00:00
Dimitris Tsirogiannis
e2c53a8bdf IMPALA-5147: Add the ability to exclude hosts from query execution
This commit introduces a new startup option, termed 'is_executor',
that determines whether an impalad process can execute query fragments.
The 'is_executor' option determines if a specific host will be included
in the scheduler's backend configuration and hence included in
scheduling decisions.

Testing:
- Added a custom cluster test.
- Added a new scheduler test.

Change-Id: I5d2ff7f341c9d2b0649e4d14561077e166ad7c4d
Reviewed-on: http://gerrit.cloudera.org:8080/6628
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Impala Public Jenkins
2017-04-26 01:45:40 +00:00
Dimitris Tsirogiannis
296df3c826 IMPALA-4041: Limit catalog and admission control updates to coordinators
With this commit we add the ability to limit catalog updates to a
limited set of coordinator nodes. A new startup option, termed
'is_coordinator' is added to indicate if a node is a coordinator.
Coordinators accept connections through HS2 and Beeswax interfaces
and can also participate in query execution. Non-coordinator nodes
do not receive catalog updates from the statestore, do not initialize
a query scheduler and cannot accept Beeswax and HS2 client connections.

Testing:
- Added a custom cluster test that launches a cluster in which the
number of coordinators is less than the cluster size and runs a number
of smoke queries.
- Successfully ran exhaustive tests.

Change-Id: I5f2c74abdbcd60ac050efa323616bd41182ceff3
Reviewed-on: http://gerrit.cloudera.org:8080/6344
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Impala Public Jenkins
2017-03-28 22:27:25 +00:00