Also sets dependencyManagement to force using the same version
for jackson-databind, jackson-core and jackson-annotations. This is
needed because datagenerator depends on kitesdk, which would pull in a
very old jackson-core version (2.3.1) and lead to build failures
with the newer jackson.databind.
Change-Id: I8440426da1395045cf149aca0044286015861e5f
Reviewed-on: http://gerrit.cloudera.org:8080/20914
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
HIVE-27114 adds a new property in hive-site.xml for HMS clients to
filter out unwanted partition parameters:
hive.metastore.partitions.parameters.exclude.pattern
It defaults to "impala_intermediate_stats_chunk%". This excludes the
incremental stats of Impala. Impala should set this to an empty string
to get rid of the impact.
Tests:
- Ran CatalogTest#testPullIncrementalStats which failed when running on
higher Hive versions that have HIVE-27114.
Change-Id: I033e811f4e55b3af04f7a68c69b5779c72e4b053
Reviewed-on: http://gerrit.cloudera.org:8080/20937
Reviewed-by: Laszlo Gaal <laszlo.gaal@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The DiskIoMgr starts a large number of threads for each different
type of object store, most of which are idle. For development,
this slows down processing minidumps and debugging with gdb.
This adds an option "reduce_disk_io_threads" to bin/start-impala-cluster.py
that sets the thread count startup parameter to one for any filesystem
that is not the TARGET_FILESYSTEM. On a typical development setup
running against HDFS, this reduces the number of DiskIoMgr threads
by 150 and the HDFS monitoring threads by 150 as well. This option is
enabled by default. It can be disabled by passing
--reduce_disk_io_threads=False to bin/start-impala-cluster.py.
Separately, DiskIoMgr should be modified to reduce the number of
threads it spawns in general.
Testing:
- Hand tested this on my local development system
Change-Id: Ic8ee1fb1f9b9fe65d542d024573562b3bb120b76
Reviewed-on: http://gerrit.cloudera.org:8080/20920
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Some test cases for reading compressed JSON tables were added in
IMPALA-12431, but because the database name was not handled properly,
a test case failed in exhaustive mode. This patch fixes that issue.
Testing:
- Passed TestHdfsJsonScanNodeErrors in exhaustive mode.
Change-Id: I69d56d070b52d33fae37da008df5a7a8a9feca92
Reviewed-on: http://gerrit.cloudera.org:8080/20931
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
We use LOG_AND_RETURN_IF_ERROR to log the non-ok status of executing the
DDL. However, when enable_async_ddl_execution is true (the default), the
status returned by ExecDdlRequest() and ExecLoadDataRequest() only
reflects whether the async thread was created, not whether the
statement succeeded. If the DDL fails, no errors are shown in the
impalad logs.
This patch fixes it by logging the error when UpdateQueryStatus() is
invoked by the async exec thread. A new parameter, 'log_error', is added
to this method to control the logging behavior. It's false by default
and only set to true when used in the async thread.
Tests
- Add e2e test to verify the error in logs
Change-Id: I8f02f22fa8ebbd2dea722d5586899bf57b66cf40
Reviewed-on: http://gerrit.cloudera.org:8080/20925
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Ubuntu 20.04 locked down access to the kernel messages, so a call to
'dmesg' can succeed only when executed with elevated privileges.
This could be a problem during Impala precommit runs, as the finalizer
script uses 'dmesg' to detect potential OOM-kills during the run.
This patch adds an "escalation" step to the dmesg call: if the regular
call fails, it issues a second call via 'sudo'.
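The escalation step can be sketched roughly as follows. This is a Python illustration of the control flow only, not the finalizer's own code; the function name and the injectable 'runner' parameter are assumptions made so the sketch can be exercised without reading the real kernel log.

```python
import subprocess


def run_with_escalation(cmd, runner=subprocess.run):
    """Run cmd; if it exits non-zero, retry the same command via sudo.

    'runner' is a hypothetical hook that defaults to subprocess.run.
    """
    result = runner(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return result
    # The unprivileged call failed (e.g. restricted dmesg): escalate once.
    return runner(["sudo"] + list(cmd), capture_output=True, text=True)
```

A caller would use run_with_escalation(["dmesg"]) and inspect the stdout of the returned result for OOM-kill messages.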
Change-Id: Ic20193740c6e5cb9e8e155c03bede55184875de5
Reviewed-on: http://gerrit.cloudera.org:8080/20763
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This change sets the default value to 'true' for
'iceberg_restrict_data_file_location' and changes the flag name to
'iceberg_allow_datafiles_in_table_location_only'. Tests related to
multiple storage locations in Iceberg tables are moved out to custom
cluster tests. During test data loading, the flag is set to 'false'
to make the creation of 'iceberg_multiple_storage_locations' table
possible.
Change-Id: Ifec84c86132a8a44d7e161006dcf51be2e7c7e57
Reviewed-on: http://gerrit.cloudera.org:8080/20874
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
When using TPC-DS with a large number of iterations, the
results JSON files are enormous. Using Python 2,
report_benchmark_results.py runs out of memory and fails to
produce the report. Python 3 is more efficient in how it
processes Unicode inputs (see Python PEP-0393), so its
memory usage is much lower. It is able to handle generating
reports that Python 2 cannot.
As a general cleanup, this fixes all the flake8 issues for this file.
Testing:
- Processed very large JSON results (4+GB each for both baseline
result and new result). Python 3 completes successfully when
Python 2 failed.
Change-Id: Idbde17f720b18d38dc2c2104ecf3fec807c1839d
Reviewed-on: http://gerrit.cloudera.org:8080/20918
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
test_catalog_operations_with_rpc_retry uses a short timeout which could
lead to failures in the first DESCRIBE statement. What the test expects
is for the next REFRESH statement to fail due to the RPC timeout.
The DESCRIBE statement triggers a catalog RPC of PrioritizeLoad on the
table. The RPC usually finishes in 40ms. This patch bumps the RPC
timeout of the test to be 100ms so it's long enough for DESCRIBE to
succeed. Also bumps the sleep time in the REFRESH statement so its
ExecDdl RPC will always time out in 100ms.
Tests:
- Ran the test till night (2700 times) and all passed.
Change-Id: Ibbfa79d7f7530af4cfbb6f8ebc55e8267e3d3261
Reviewed-on: http://gerrit.cloudera.org:8080/20924
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
When the event processor receives an alter table event that was
generated by a truncate operation, file metadata is currently not
reloaded. This patch addresses the issue by checking whether the alter
table event comes from a truncate operation and, if so, reloading the
file metadata.
Note: An alter table event generated for an external table by a truncate
operation in Impala cannot be identified as a truncate op.
This becomes an issue in multi-cluster Impala environments where events
generated by one Impala cluster are consumed by other Impala clusters.
Truncate operations in Impala on replicated tables will generate an
alter event with the 'isTruncateOp' field set to true.
Testing:
Added an end-to-end test to verify whether file metadata is reloaded
for the above scenario.
Change-Id: I53bb80c294623eec7c79d9f30f410771386c6b75
Reviewed-on: http://gerrit.cloudera.org:8080/20887
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The current implementation of UPDATE creates the delete file(s) and the
new data file(s) for the updated row(s). These files are committed in
one Iceberg transaction, but the transaction adds two snapshots to the
table. The first contains the delete file(s), the second adds the new
data file(s) of the updated row(s). Only the final snapshot (which
holds the consistent table state) is observable by concurrent readers,
but still, the commit history can look strange with these "phantom
snapshots".
So instead of doing a RowDelta and AppendFiles operation in a single
transaction, with this change we are doing a single RowDelta operation
only.
Another issue was that we also committed empty operations (e.g. UPDATEs
with zero records). These created redundant snapshots in the table
history. This patch also fixes that.
Testing:
* added e2e test that checks the table history
Change-Id: I2ceb80b939c644388707b21061bf55451234dcd3
Reviewed-on: http://gerrit.cloudera.org:8080/20903
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Impala has some trouble with Iceberg tables that don't specify
'external.table.purge'='true'. Schema changes like ADD COLUMN do not
work, and table properties that have been set by an ALTER TABLE
SET TBLPROPERTIES statement can be reset by subsequent INSERT
statements.
There was a bug in CatalogOpExecutor.alterIcebergTables(). Its return
value determines whether we need to also update the table definition in
HMS, or it was already done by the Iceberg library. In fact, the HMS
table definition is always updated by the Iceberg library if the table
is handled by the HiveCatalog. In every other case we need to update
the HMS table definition ourselves (unless the change won't affect
HMS).
The issue was that CatalogOpExecutor.alterIcebergTables() returned true
(which means we need to update HMS) in case of Iceberg tables that
didn't specify 'external.table.purge'='true'. This was a problem,
because Iceberg already modified the HMS table and set
'metadata_location' to a new metadata file. But then Impala came and
modified some properties of the HMS table definition, but also reset
'metadata_location' to the original value. Therefore subsequent
operations/refreshes used the earlier state of the Iceberg table and
they reset the modifications.
Testing:
* added e2e tests for Iceberg tables in different catalogs with
'external.table.purge'='false'
Change-Id: I2a82d022534e1e212d542fbd7916ae033c381c20
Reviewed-on: http://gerrit.cloudera.org:8080/20907
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds the functionality to read compressed JSON files for the
JSON scanner. Because the decompression code can largely be reused from
HdfsTextScanner, this patch moves that part of the code from
HdfsTextScanner to HdfsScanner so that HdfsJsonScanner can also call it.
As it reuses the relevant code from the TEXT scanner, the compression
formats supported by the Json scanner are the same as those supported by
the TEXT scanner.
Tests
- Most of the existing end-to-end JSON format tests can run on
compressed JSON format too.
Change-Id: I2471855d97d4cdd51363b321055e6b06aa6d81e8
Reviewed-on: http://gerrit.cloudera.org:8080/20482
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds TpcdsCpuCostPlannerTest, which is a copy of
TpcdsPlannerTest with the COMPUTE_PROCESSING_COST option enabled.
PROCESSING_COST_MIN_THREADS and MAX_FRAGMENT_INSTANCES_PER_NODE are set
to 2 and 16 respectively to match the same options in
PlannerTest#testProcessingCost. PlannerTest#testProcessingCost is
reduced to contain only corner case queries that are not a subset of
TPC-DS. TpcdsCpuCostPlannerTest runs for around 30 seconds.
Testing:
- Pass FE tests.
Change-Id: If04fa584db5f13db0dd656ec9d99f7204c05d75d
Reviewed-on: http://gerrit.cloudera.org:8080/20872
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
On RedHat 8, RpcMgrKerberizedTest cases fail with
Jan 09 14:47:03 msmith.vpc.cloudera.com krb5kdc[609624](info): TGS_REQ
(1 etypes {aes128-cts-hmac-sha1-96(17)}) 127.0.0.1: LOOKING_UP_SERVER:
authtime 0, etypes {rep=UNSUPPORTED:(0)}
impala-test/msmith.vpc.cloudera.com@KRBTEST.COM for
impala-test/msmith@KRBTEST.COM, Server not found in Kerberos database
This happens because bootstrap_system.sh adds an entry to /etc/hosts to
resolve 127.0.0.1 to hostname and puts the short hostname first. During
negotiation, Kudu RPC will call GetFQDN to retrieve the FQDN, which for
our tests running on localhost returns the short hostname.
Fixes RpcMgrKerberizedTest by swapping the order of entries added to
/etc/hosts so the FQDN comes first. This is consistent with the example
provided in https://man7.org/linux/man-pages/man5/hosts.5.html.
Avoids 'hostname -f'; on RedHat it's identical to 'hostname', and on
Ubuntu it causes this test to fail.
Change-Id: I1eb24f9faec766e388d793408aedecdc92107185
Reviewed-on: http://gerrit.cloudera.org:8080/20876
Reviewed-by: Alexey Serbin <alexey@apache.org>
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
This patch uses the JDBC connection string to apply query options to the
Impala server by setting the properties in "jdbc.properties" when
creating a JDBC external DataSource table.
jdbc.properties is specified as a comma-delimited key=value string, like
"MEM_LIMIT=1000000000, ENABLED_RUNTIME_FILTER_TYPES=\"BLOOM,MIN_MAX\"".
Fixed Impala to allow the value of ENABLED_RUNTIME_FILTER_TYPES to have
double quotes at the beginning and end of the string.
jdbc.properties can be used for other databases like Postgres and MySQL
to set additional properties. The test cases will be added in a separate
patch.
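To illustrate the comma-delimited key=value format, here is a minimal Python sketch of a parser that tolerates commas inside double-quoted values. It is not the parsing code used by Impala or the JDBC driver; the function name, the regular expression and the quote-stripping behaviour are assumptions made for the example.

```python
import re

# A value is either a double-quoted string (which may contain commas) or
# a run of characters up to the next comma.
_PAIR = re.compile(
    r'\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*("[^"]*"|[^,]*)\s*(?:,|$)')


def parse_properties(text):
    """Parse 'K1=V1, K2="a,b"' into a dict (hypothetical helper)."""
    props = {}
    pos = 0
    while pos < len(text):
        m = _PAIR.match(text, pos)
        if m is None:
            raise ValueError("malformed properties near offset %d" % pos)
        value = m.group(2).strip()
        # Drop one pair of surrounding double quotes, if present.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        props[m.group(1)] = value
        pos = m.end()
    return props
```

With the example string from this message, the quoted filter list is kept as a single value instead of being split at its comma.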
Testing:
- Added end-to-end tests for setting query options on Impala JDBC
tables.
- Passed core tests.
Change-Id: I47687b7a93e90cea8ebd5f3fc280c9135bd97992
Reviewed-on: http://gerrit.cloudera.org:8080/20837
Reviewed-by: Abhishek Rawat <arawat@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
I usually shut down Kudu in my dev env to save some resources. However,
tests that import skip.py will fail if the Kudu cluster is not running
locally, even if the tests are unrelated to Kudu. The cause is that Kudu
web pages are accessed when the module is imported, and that fails if
the Kudu cluster is not running.
This patch exposes the decorators of SkipIfKudu as methods just like
what we did in SkipIfCatalogV2, so Kudu web pages can be checked lazily
when needed.
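The difference between the eager and the lazy style can be sketched as below. All names here (kudu_feature_enabled, skip_if, SkipIfKuduSketch) are hypothetical stand-ins, not the contents of skip.py; skip_if plays the role of a skip marker such as pytest.mark.skipif.

```python
probe_calls = []


def kudu_feature_enabled():
    """Hypothetical probe; stands in for a check that reads a Kudu web page."""
    probe_calls.append("probe")
    return False


def skip_if(condition, reason):
    """Tiny stand-in for a skip marker such as pytest.mark.skipif."""
    def decorate(func):
        func.skip = bool(condition)
        func.skip_reason = reason
        return func
    return decorate


class SkipIfKuduSketch(object):
    # The eager style would be a class attribute such as
    #   feature_disabled = skip_if(not kudu_feature_enabled(), "...")
    # which runs the probe as soon as the module is imported.

    @classmethod
    def feature_disabled(cls):
        # Lazy style: the probe runs only when a test applies the decorator.
        return skip_if(not kudu_feature_enabled(), "Kudu feature is disabled")
```

A test module that never calls SkipIfKuduSketch.feature_disabled() never triggers the probe, so it can be imported without a running Kudu cluster.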
Tests:
- Ran Kudu tests.
- Ran some Kudu unrelated tests without launching the Kudu cluster.
Change-Id: Ic7a8282b59d72322085c21c70a5019c51b586a52
Reviewed-on: http://gerrit.cloudera.org:8080/20904
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes an issue reported by IMPALA-12582, where enabling
MIN_MAX RuntimeFilter for a specific query would cause the executor to
crash. The direct cause of the crash was an out-of-bounds access to
input_vals in the ScalarFnCall::InterpretEval() function, but the root
cause was actually due to the related ScalarExprEvaluator not invoking
the Open() function.
Testing:
- Added new E2E test case about this issue.
Change-Id: Iba951796d52f109c419587c444840adbb2d44f5d
Reviewed-on: http://gerrit.cloudera.org:8080/20891
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Csaba Ringhofer <csringhofer@cloudera.com>
test_reduced_cardinality_by_filter failed in non-HDFS environments
because it asserts the existence of '00:SCAN HDFS' in ExecSummary. This
patch changes that assertion to ignore the type of scan node in the test
query. Also marked the test with the SkipIfNotHdfsMinicluster.plans
decorator.
Testing:
- Pass test_reduced_cardinality_by_filter
Change-Id: Icbf72687cc3c5a99aa0a0a74e229ed8c88ed06ef
Reviewed-on: http://gerrit.cloudera.org:8080/20902
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Currently, when querying some metadata tables of an empty Iceberg table,
a null pointer exception occurs. This patch fixes the issue and adds
corresponding test cases in test_metadata_tables.
Testing:
- Added E2E test to cover this case
Change-Id: I6b4d4fb81a45214045b8809a4bdd910a1f1f3843
Reviewed-on: http://gerrit.cloudera.org:8080/20890
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The introduction of check_deleted_file_fd() in IMPALA-12681 aimed
to detect a bug related to remote spilling where local temporary file
handles were not being released after deletion. However, the tests
associated with this function seem flaky in exhaustive builds:
occasionally some HDFS files are not promptly released after
deletion. Locally, I observed that these files are eventually
removed from /proc/xx/fd within a few minutes; the reason is not yet
clear.
To fix the flaky build failure, this patch confines the scope of
check_deleted_file_fd() to detect files containing the keyword
"scratch" only. Given that hdfs files eventually get removed, and
it seems not an urgent issue, a separate Jira will be filed to track
and investigate this behavior further.
Testing:
Reran the tests a couple of times and they passed.
Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Reviewed-on: http://gerrit.cloudera.org:8080/20898
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
**IMPALA-12665 Description:**
The issue occurs when scanning Parquet tables with a row size
> 4096 bytes and a row batch size > 1024. A heap-buffer-overflow
was detected by AddressSanitizer, indicating a write operation
beyond the allocated buffer space.
**Root Cause Analysis:**
The error log by AddressSanitizer points to a heap-buffer-overflow,
where memory is accessed beyond the allocated region. This occurs
in the `HdfsParquetScanner` and `ScratchTupleBatch` classes when
handling large rows > 4096 bytes.
**Fault Reproduction:**
The issue can be reproduced by creating a Parquet table with many
columns, inserting data using Hive, then querying with Impala.
Bash and Hive client scripts in IMPALA-12665 create a table and
populate it, triggering the bug.
**Technical Analysis:**
`ScratchTupleBatch::Reset` recalculates `capacity` based on tuple
size and fixed memory limits. When row size > 4096 bytes, `capacity`
is set < 1024. `HdfsParquetScanner` incorrectly assumes
`complete_micro_batch_` length of 1024, leading to overflow.
**Proposed Solution:**
Ensure `complete_micro_batch_` length is updated after
`ScratchTupleBatch::Reset`. This prevents accessing memory outside
allocated buffer, avoiding heap-buffer-overflow.
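The capacity arithmetic can be illustrated with a small Python sketch. The 4096 * 1024 byte limit is an assumption inferred from the thresholds quoted above, not a value read from the Impala source, and the function name is hypothetical.

```python
def scratch_capacity(tuple_size, batch_size, max_bytes=4096 * 1024):
    """Number of rows that fit in the scratch batch.

    max_bytes is an assumed fixed memory limit (see the note above).
    """
    return max(1, min(batch_size, max_bytes // tuple_size))


# Rows of up to 4096 bytes keep the full batch of 1024 rows ...
small_rows = scratch_capacity(tuple_size=4096, batch_size=1024)
# ... while larger rows shrink the capacity, so a micro-batch length that
# is still assumed to be 1024 would index past the end of the buffer.
large_rows = scratch_capacity(tuple_size=8192, batch_size=1024)
```

Deriving the micro-batch length from the recomputed capacity, rather than from the batch size, keeps every write inside the allocated buffer.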
Change-Id: I966ff10ba734ed8b1b61325486de0dfcc7b58e4d
Reviewed-on: http://gerrit.cloudera.org:8080/20834
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This commit fixes a DCHECK failure when querying a struct inside a
struct. The previous field accessor creation logic was trying to find
the ColumnDescriptor for a struct inside a struct and hit a DCHECK
because there are no ColumnDescriptors for struct fields. The logic
has been reworked to only use ColumnDescriptors for top level columns.
Testing:
- Added E2E test to cover this case
Change-Id: Iadd029a4edc500bd8d8fca3f958903c2dbe09e8e
Reviewed-on: http://gerrit.cloudera.org:8080/20883
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
In the query profile, the cardinality reduction from IMPALA-12018 is
highlighted in the Plan section, but missing from the ExecSummary section.
This patch changes the ExecSummary to show the reduced cardinality
estimate if it is set.
Testing:
- Add TestObservability::test_reduced_cardinality_by_filter
Change-Id: If1f51ce585a1cb66e518b725686ab3076ffa8168
Reviewed-on: http://gerrit.cloudera.org:8080/20879
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
In-flight catalog operations are tracked in a map using query id as the
key. It's ok since catalog clients use 0 as the timeout by default (see
--catalog_client_rpc_timeout_ms), i.e. catalog RPCs never time out, which
means each query will have at most one in-flight catalog RPC at a time.
However, in case catalog_client_rpc_timeout_ms is set to non-zero,
impalad could retry the catalog RPC when it's considered timed out. That
causes several in-flight catalog operations coming from the same query
(so using the same query-id as the map key).
To fix the key conflicts, this patch uses the pair of (queryId, threadId)
as the key of the in-flight operations map. 'threadId' comes from the
thrift thread that handles the RPC so it's unique across different
retries.
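The key conflict can be shown with a plain dictionary. This Python sketch is illustrative only; the helper name and the sample ids are made up.

```python
def track(operations, key, description):
    """Record an in-flight operation under the given key."""
    operations[key] = description
    return operations


# Keyed by query id only: the retry overwrites the first attempt.
by_query = {}
track(by_query, "q1", "REFRESH attempt 1")
track(by_query, "q1", "REFRESH attempt 2")

# Keyed by (query id, thread id): both attempts stay visible.
by_query_and_thread = {}
track(by_query_and_thread, ("q1", 101), "REFRESH attempt 1")
track(by_query_and_thread, ("q1", 102), "REFRESH attempt 2")
```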
Tests:
- Add custom-cluster test to verify all retries are shown in the
/operations page.
Change-Id: Icd94ac7532fe7f3d68028c2da82298037be706c4
Reviewed-on: http://gerrit.cloudera.org:8080/20877
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
treated as self event
The self-event check for add partition events has been done only for
transactional tables since IMPALA-10502 (commit id: 7f7a631). But
when a new partition is added (with an insert statement), the catalog
service id and version number are added to the partition params of the
partition irrespective of whether the table is transactional or not.
Thus the version number was added to the partition's inFlightEvents_ and
remained there until the next alter partition event from Hive. This
led to the alter partition event being detected as a self event.
This commit ensures the catalog service id and version number are not
added to partition params if the partition is added to a
non-transactional table.
Also fixed another bug in reload event. Reload event self check
fails due to the above fix as it expects catalog service id and
version number in the partition params. Fixed to use last refreshed
event id to skip the self reload events.
Testing:
- Manually tested in cluster and added testcases
Change-Id: I23c2affa3fe32c0b3843bff5e4c0018dce9060d3
Reviewed-on: http://gerrit.cloudera.org:8080/20486
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
IMPALA-12018 changed the CPU costing formula from using getCardinality()
to getFilteredCardinality() for DataStreamSink, HashJoinNode, JoinNode,
and NestedLoopJoinNode. However, it failed to do the same for
ExchangeNode, which is also eligible for cardinality reduction by
runtime filter. This patch fixes the formula for ExchangeNode.
Testing
- Pass PlannerTest#testProcessingCost.
Change-Id: I62a649b67c75c46bd57d8ceda80265af3321d85b
Reviewed-on: http://gerrit.cloudera.org:8080/20880
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Iceberg metadata tables are virtual tables and their schemata are
predefined by the Iceberg library. This commit extends the DESCRIBE
<table> statement, so the users can print the table description of these
tables.
Metadata tables do not exist in the HMS, therefore
DESCRIBE FORMATTED|EXTENDED statements are not permitted.
Testing:
- Added E2E tests
Change-Id: Ibe22f271a59a6885035991c09b5101193ade6e97
Reviewed-on: http://gerrit.cloudera.org:8080/20695
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch modifies the partition specification updating mechanism to
use an updated Iceberg API. The new API preserves the existing
partition specification by, for instance, generating new field IDs for
modified partitioning terms. Additionally, the new API incorporates
transactional support; as a result, the conditional branching based
on the 'needsTxn' criterion has been eliminated. For Iceberg V1 tables,
the new method correctly adds a VOID(*) partition transform for removed
fields on updates. For V2 tables, the updated fields can be reused,
but they keep the order in which they were first introduced in the spec.
These changes are reflected in the test changes.
Tests:
- modified already existing tests
- added e2e tests for V1 and V2 tables
Change-Id: I958107d08e50d7bd9044a57bd2fc02816414012d
Reviewed-on: http://gerrit.cloudera.org:8080/20823
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes a bug where partially written temporary files are
removed without releasing the file descriptors. The fix adds a call
to Close() of the local file writer in Delete() of the DiskFile class,
which can be called when the local buffer file is being evicted or the
query ends, ensuring proper release of the file handle.
Testing:
Passed core tests.
Additionally, a check has been added in the test
test_scratch_disk.py to verify that there are no deleted
files in the /proc/x/fd/ directory.
Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Reviewed-on: http://gerrit.cloudera.org:8080/20852
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This modifies dump_breakpad_symbols.py to use a ThreadPool
to go parallel when there are multiple binaries or
libraries to process. This is common for Jenkins jobs that
dump symbols for all backend tests. The different binaries
write out to different directories, so the threads don't
interfere with each other.
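The parallel structure can be sketched as follows, assuming one output directory per binary. This is not the code in dump_breakpad_symbols.py; dump_one is a placeholder that writes a dummy file where the real script would dump symbols.

```python
import os
from multiprocessing.pool import ThreadPool


def dump_one(args):
    """Placeholder for the per-binary work; writes into its own directory."""
    binary_name, out_root = args
    out_dir = os.path.join(out_root, binary_name)
    os.makedirs(out_dir, exist_ok=True)
    out_file = os.path.join(out_dir, binary_name + ".sym")
    with open(out_file, "w") as f:
        f.write("MODULE placeholder for %s\n" % binary_name)
    return out_file


def dump_all(binary_names, out_root, num_threads=4):
    """Process all binaries on a thread pool; results keep input order."""
    pool = ThreadPool(num_threads)
    try:
        return pool.map(dump_one, [(name, out_root) for name in binary_names])
    finally:
        pool.close()
        pool.join()
```

Because each task only touches its own directory, no locking is needed between the worker threads.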
Testing:
- Ran locally dumping the symbols for all backend tests
- Ran a Jenkins job that generates a minidump and triggers
the minidump symbol processing. It went parallel and
worked fine.
Change-Id: I93427bb07f1d9718bd6df90acfd247210b54294d
Reviewed-on: http://gerrit.cloudera.org:8080/20802
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
If the bin/jenkins/finalize.sh script is called from a directory
other than $IMPALA_HOME, its call to resolve_minidumps.py will
fail due to the relative path. This changes the call to use
the absolute path so that finalize.sh works in this case.
Testing:
- Ran bin/jenkins/finalize.sh from a directory other than
$IMPALA_HOME
Change-Id: I063843554b52d3e8ed79ee32d9fd4c90d059c482
Reviewed-on: http://gerrit.cloudera.org:8080/20801
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
Since resolve_minidumps.py's call to minidump_stackwalk can go haywire
due to bad symbols in shared libraries, this adds a fallback mechanism
where it tries again with a "safe" list of shared libraries. These are
limited to the ones that make the most difference in resolving minidumps
(libc, libstdc++, and libjvm). The list of safe libraries can be
customized via the --safe_library_list option.
Testing:
- Verified that this uses the fallback on Centos 7 and resolves
the minidumps successfully.
Change-Id: I6bb4c9f65f9c27bb3b86c7ff2f3a6a48e258ef01
Reviewed-on: http://gerrit.cloudera.org:8080/20863
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
On some platforms (Centos 7), resolve_minidumps.py's call to
minidump_stackwalk goes haywire and uses all the system memory
until it gets OOM killed. Some library must have corrupt
symbols, etc. As a workaround, this detects whether the
prlimit utility is present and uses this to run minidump_stackwalk
with a 4GB limit on virtual memory. This kills the process
earlier and avoids using all system memory.
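A minimal sketch of the workaround, assuming the limit is applied through prlimit's --as (address space) option and that 4GB means 4 * 1024^3 bytes. The function name and the prlimit_path parameter are hypothetical; this is not the code in resolve_minidumps.py.

```python
import shutil

FOUR_GB = 4 * 1024 * 1024 * 1024


def with_memory_limit(cmd, limit_bytes=FOUR_GB, prlimit_path=None):
    """Prefix cmd with prlimit when the utility is available.

    Returns the unmodified command when prlimit cannot be found.
    """
    if prlimit_path is None:
        prlimit_path = shutil.which("prlimit")
    if not prlimit_path:
        return list(cmd)
    # --as limits the address space (virtual memory) of the child process.
    return [prlimit_path, "--as=%d" % limit_bytes] + list(cmd)
```

The resulting list can be passed to subprocess.run(); when the limit is hit, the child fails early instead of exhausting system memory.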
Testing:
- Verified that bin/jenkins/finalize.sh uses resolve_minidumps.py
on a Redhat 8 Jenkins job (and it works)
- Verified that bin/jenkins/finalize.sh works properly on
my Ubuntu 20 box
- Ran a Jenkins job on Centos 7 and verified that the prlimit
code kills minidump_stackwalk when it uses 4GB of memory.
Change-Id: I4db8facb8a037327228c3714e047e0d1f0fe1d94
Reviewed-on: http://gerrit.cloudera.org:8080/20862
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
When updating the last-synced-event-id after processing a batch of
partition events, we use the last event id. We should do the same when
updating last-synced-event-time. However, currently BatchPartitionEvent
uses getEventTime() from the parent class. It actually returns the event
time of the first event. We should override it to use the last event.
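The intended behaviour can be sketched in Python as below. The class names are illustrative stand-ins for the Java classes, under the assumption that the parent's fields are initialised from the first event of the batch.

```python
class Event(object):
    def __init__(self, event_id, event_time):
        self.event_id = event_id
        self.event_time = event_time

    def get_event_id(self):
        return self.event_id

    def get_event_time(self):
        return self.event_time


class BatchEvent(Event):
    """Wraps several events; the parent fields come from the first one."""

    def __init__(self, events):
        Event.__init__(self, events[0].event_id, events[0].event_time)
        self.events = list(events)

    def get_event_id(self):
        return self.events[-1].event_id

    def get_event_time(self):
        # Report the last event's time so it matches the last event id
        # when the last-synced markers are updated.
        return self.events[-1].event_time
```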
Tests
- Ran MetastoreEventsProcessorTest.testDisableEventSyncFlag 200 times.
Change-Id: I82efe18dd28fe8af47f8c66cc8c5eb8e6f8dfd2b
Reviewed-on: http://gerrit.cloudera.org:8080/20864
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The new SimultaneousMultipleQueriesOneSession test within
internal-server-test.cc has revealed a data race condition
where the Impala web UI can read TExecRequest in the
QueryDriver while the frontend is updating this object.
Since the fix would require adding locks to the critical
query planning path and the only impact is the UI showing
slightly outdated data, this race condition is being
ignored.
Change-Id: I2c553576f03b7503f77f4aa1d3ea8086fff0e43b
Reviewed-on: http://gerrit.cloudera.org:8080/20842
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
CatalogdMetaProvider maintains a map (a Guava cache) as its local
catalog cache. It has a piggyback mechanism for loading metadata from
catalogd: when concurrent threads want to load the same content
(identified by the same key) from catalogd, only one of them actually
sends the request and loads the result into the cache. Other threads wait
and get the result when the work is done.
The piggyback mechanism is implemented by putting a Future object as the
value when the key doesn't exist in the cache. The Future object handles
the loading. Other threads that want the same value just invoke
Future.get() to wait. See more in the comments in loadWithCaching().
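The piggyback idea can be sketched in Python as follows. This is not the CatalogdMetaProvider code: the class name is made up, a plain dict replaces the Guava cache, and dropping a failed entry so that later calls can retry is a choice of this sketch.

```python
import threading
from concurrent.futures import Future


class PiggybackCache(object):
    def __init__(self, loader):
        self._loader = loader
        self._entries = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            future = self._entries.get(key)
            owner = future is None
            if owner:
                # First thread for this key: publish a Future for the others.
                future = Future()
                self._entries[key] = future
        if owner:
            try:
                future.set_result(self._loader(key))
            except Exception as e:
                # Drop the failed entry so a later call can retry, then
                # hand the same error to every waiting thread.
                with self._lock:
                    self._entries.pop(key, None)
                future.set_exception(e)
        # Python re-raises the loader's original exception here; Java's
        # Future.get() wraps it in an ExecutionException instead.
        return future.result()
```

Only the thread that inserted the Future calls the loader; every other thread blocks in future.result() and receives the same value or the same error.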
If there are any errors thrown in the loading process, Future.get() will
encapsulate the error into an ExecutionException and throw it instead.
The cause could be an InconsistentMetadataFetchException which indicates
FE should retry the planning. It's handled in Frontend#getTExecRequest().
In loadWithCaching(), we try to throw the cause of the exception thrown
from Future.get(). So the InconsistentMetadataFetchException can be
handled as expected. However, in getIfPresent(), the error handling is
inconsistent: it tries to throw the current exception. As a result,
retriable failures can't be retried. Note that this is an existing bug,
but it became easier to hit after IMPALA-11501 because getIfPresent()
is now used in LocalDb#getTableIfCached() which is used in many places.
This patch fixes getIfPresent() to have the same logic of using the
Future object (including error handling) as loadWithCaching(). Also
adds more logging on both the catalogd and impalad sides when the lookup
status is abnormal.
In order to test the loading error more easily, this patch adds a hidden
flag, inject_failure_ratio_in_catalog_fetch, to randomly inject
retriable errors.
Tests
- Ran test_local_catalog_ddls_with_invalidate_metadata 700 times.
- Add e2e test that will easily fail without this fix.
Change-Id: I74268ba2bb700988107780e13ffbdbb4c767d09d
Reviewed-on: http://gerrit.cloudera.org:8080/20853
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
LdapHS2Test.testImpalaExtJdbcTables was added in IMPALA-12502 recently.
The test failed in nightly builds with JDK17. The error happened when
the shell script testdata/bin/download-impala-jdbc-driver.sh was invoked
from Java code to copy the Impala JDBC driver with hadoop fs commands.
This patch changes the code to run download-impala-jdbc-driver.sh
from testdata/bin/copy-ext-data-sources.sh, which is invoked when
loading data and jar files for external data sources.
Testing:
- Passed LdapHS2Test.testImpalaExtJdbcTables for JDK17 and default
build JDK.
- Passed core tests.
Change-Id: If62fe207978301b8ce95f5af30f605e3bb8caa28
Reviewed-on: http://gerrit.cloudera.org:8080/20851
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch removes the extra quote '"' in the
test_events_custom_config.py file. This ensures that tests are not
broken with different versions of Hive.
Change-Id: I1d29bf0bdf68d4da11f02dabff8c6a68d81276c8
Reviewed-on: http://gerrit.cloudera.org:8080/20860
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
Fixes an integer overflow issue in the DatabaseTest class
in the internal-server-test.cc file by switching from a
signed to an unsigned int. Additional comments and
assertions were also added to prevent future
integer overflows.
Change-Id: I3689185aa8676203226cc447585703e784627102
Reviewed-on: http://gerrit.cloudera.org:8080/20856
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
In switch/case statements, one case can fall through to the
next case. Sometimes this is intentional, but it is also a
common source of bugs (e.g. a missing break/return statement).
Clang-Tidy's clang-diagnostic-implicit-fallthrough flags
locations where a case falls through to the next case without
an explicit fallthrough declaration.
This change enables clang-diagnostic-implicit-fallthrough and
fixes failing locations. Since Impala uses C++17, this uses
C++17's [[fallthrough]] to indicate an explicit fallthrough.
This also adjusts clang-tidy's output to suggest [[fallthrough]]
as the preferred way to indicate fallthrough.
Testing:
- Ran core job
- Ran clang tidy
Change-Id: I6d65c92b442fa0317c3af228997571e124a54092
Reviewed-on: http://gerrit.cloudera.org:8080/20847
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zihao Ye <eyizoha@163.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
This patch adds an event-specific metric, "avg-events-process-duration",
to the table-level metrics. The metric is also reported over the last
1, 5 and 15 minutes. It shows the average time spent processing events
on the table, which helps identify whether a particular table is causing
the event processor to lag. As a temporary workaround, event processing
can be disabled on that table.
Another metric, "events-consuming-delay-ms", is added to the event
processor summary page. It is the time difference in milliseconds
between when an event was created in the metastore and when it was
processed by the event processor. This is another useful metric for
gauging how far the event processor is lagging.
Tests:
- Manually verified the metrics on the catalogd UI page while running
some Hive workloads.
Change-Id: I2428029361e610a0fcd8ed11be2ab771f03b00dd
Reviewed-on: http://gerrit.cloudera.org:8080/20473
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This commit removes CatalogdMetaProvider's dependency on
DirectMetaProvider. CatalogdMetaProvider depended on DirectMetaProvider
for two APIs; these are now implemented on the catalog server and used
instead. DirectMetaProvider is no longer referenced anywhere, but it is
retained for future use.
Testing:
- Tested manually; CatalogdMetaProviderTest covers the changes.
Change-Id: I096c1b1d1a52e979c8b2d8173dae9ca2cc6c36d2
Reviewed-on: http://gerrit.cloudera.org:8080/20791
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Due to Iceberg #7612, migrating a table to Iceberg resulted in incorrect
data and stats if some of the string partition fields contained a '/'
character. As a result, we deliberately rejected migrating such tables.
Now that Impala uses an Iceberg version that has the fix, we can allow
migrating such tables too.
Change-Id: I05b4ca44c7edb81cee6747f83a5bd82c5a4b5c44
Reviewed-on: http://gerrit.cloudera.org:8080/20845
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch fixes the issue where ILIKE and IREGEXP did not ignore case
when given non-constant patterns.
For example, "SELECT 'ABC' ILIKE pattern FROM tbl" would return false
when the pattern in tbl is '%b%'.
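To illustrate the expected semantics, here is a small sketch of
case-insensitive LIKE matching where the pattern is only known per row.
It is not Impala's implementation: the class ILikeSketch and its methods
are hypothetical, and escape characters are deliberately not handled.

```java
import java.util.regex.Pattern;

class ILikeSketch {
  /** Translates a SQL LIKE pattern ('%' and '_' wildcards only) to a regex. */
  static String likeToRegex(String like) {
    StringBuilder sb = new StringBuilder();
    for (char c : like.toCharArray()) {
      if (c == '%') {
        sb.append(".*");
      } else if (c == '_') {
        sb.append('.');
      } else {
        // Quote everything else so regex metacharacters match literally.
        sb.append(Pattern.quote(String.valueOf(c)));
      }
    }
    return sb.toString();
  }

  /** Evaluates "value ILIKE pattern"; the pattern is compiled on every call. */
  static boolean ilike(String value, String pattern) {
    int flags = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    return Pattern.compile(likeToRegex(pattern), flags).matcher(value).matches();
  }
}
```

With these semantics, ilike("ABC", "%b%") is true. The case-insensitive flag
is applied on the per-row compilation path, not only when the pattern is a
constant.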
Tests:
- Added TestNonConstPatternILike to test_exprs.py to verify the fix.
Change-Id: I3d66680f5a7660e6a41859754c4230f276e66712
Reviewed-on: http://gerrit.cloudera.org:8080/20785
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
IMPALA-11604 adds a hidden backend flag named query_cpu_count_divisor to
allow oversubscribing CPU cores more than what is available in the
executor group set. This patch adds a query option with the same name
and function so that CPU core matching can be tuned for individual
queries. The query option takes precedence over the flag.
Testing:
- Add test case in test_executor_groups.py and query-options-test.cc
Change-Id: I34ab47bd67509a02790c3caedb3fde4d1b6eaa78
Reviewed-on: http://gerrit.cloudera.org:8080/20819
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Statestore::Topic::DeleteIfVersionsMatch() is used when an impalad is
down. It incorrectly bumps the metrics of total value size and total
topic size. This patch removes that code.
In order to verify the metrics using the /topics page, this patch
changes the /topics URL to also return numeric values, which can be used
in the e2e test to verify the /metrics page.
Tests:
- Added an e2e test.
Change-Id: I3ffcfb45b7cde0b40a87c9ca410ec634cb31cefb
Reviewed-on: http://gerrit.cloudera.org:8080/20841
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
DataSource objects are kept in an in-memory cache in the Catalog server.
They are not persisted to the HMS. The objects are lost after the Catalog
server is restarted, and the user needs to recreate the DataSource
objects before creating new external DataSource tables.
This patch makes DataSource objects persistent by saving them as
DataConnector objects with type "impalaDataSource" in HMS.
Since HMS events for DataConnector are not handled, the Catalog server
has to refresh DataSource objects when the catalogd becomes active.
Note that this feature is not supported for Apache Hive 3.1 and older
versions.
Testing:
- Added two end-to-end tests covering a Catalog server restart and
catalogd HA failover.
These two tests are skipped when USE_APACHE_HIVE is set to true
and the Apache Hive version is 3.x or older.
- Passed all-build-options-ub2004.
- Passed core test.
Change-Id: I500a99142bb62ce873e693d573064ad4ffa153ab
Reviewed-on: http://gerrit.cloudera.org:8080/20768
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Wenzhe Zhou <wzhou@cloudera.com>
external data source
In the current implementation of the external JDBC data source,
the user has to provide both the username and password in
plain text, which is not good practice.
This patch extends the existing implementation so that the user
can provide either:
a) username and password
b) username or key and keystore
If the user provides the password, that password is used.
However, if no password is provided and the user provides only the
key/keystore, the password is fetched from the secure JCEKS
keystore.
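The precedence described above could look roughly like the sketch below.
The property names ("password", "key", "keystore"), the class
JdbcPasswordResolver and the keystoreLookup callback are assumptions made
for illustration; the actual keystore access is left abstract here.

```java
import java.util.Map;
import java.util.function.Function;

class JdbcPasswordResolver {
  /**
   * Returns the explicit password if one is configured; otherwise looks the
   * password up by key through the supplied (hypothetical) keystore accessor.
   */
  static String resolvePassword(
      Map<String, String> props, Function<String, String> keystoreLookup) {
    String password = props.get("password");
    if (password != null && !password.isEmpty()) {
      // An explicitly provided password always wins.
      return password;
    }
    String key = props.get("key");
    if (key == null || props.get("keystore") == null) {
      throw new IllegalArgumentException(
          "Need either a password or both key and keystore");
    }
    return keystoreLookup.apply(key);
  }
}
```

Passing the lookup in as a function keeps the precedence logic testable
without a real keystore.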
Testing:
- Added unit test TestExtDataSourcesWithKeyStore
Change-Id: Iec83a9b6e00456f0a1bbee747bd752b2cf9bf238
Reviewed-on: http://gerrit.cloudera.org:8080/20809
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>