Commit Graph

3381 Commits

Author SHA1 Message Date
Zoltan Borok-Nagy
94ed30d9fa IMPALA-12991: Eliminate unnecessary SORT for Iceberg DELETEs
Since we are using IcebergBufferedDeleteSink, which sorts the data
before flushing, there is no need to add a SORT node before the sink.

Testing:
 * updated planner tests

Change-Id: I94a691e7990228a1ec2de03e6ad90ebb97931581
Reviewed-on: http://gerrit.cloudera.org:8080/21285
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-11 17:35:07 +00:00
Yida Wu
9837637d93 IMPALA-12920: Support ai_generate_text built-in function for OpenAI's chat completion API
Added support for the following built-in functions:
- ai_generate_text_default(prompt)
- ai_generate_text(ai_endpoint, prompt, ai_model,
  ai_api_key_jceks_secret, additional_params)

'ai_endpoint', 'ai_model' and 'ai_api_key_jceks_secret' are flagfile
options. The 'ai_generate_text_default(prompt)' syntax expects all of
these to be set to proper values. The other syntax will try to use the
provided input parameter values, but falls back to the instance-level
values if the inputs are NULL or empty.

Only public OpenAI (api.openai.com) and Azure OpenAI (openai.azure.com)
API endpoints are currently supported.

Exposed these functions in FunctionContext so that they can also be
called from UDFs:
- ai_generate_text_default(context, model)
- ai_generate_text(context, ai_endpoint, prompt, ai_model,
  ai_api_key_jceks_secret, additional_params)

Testing:
- Added unit tests for AiGenerateTextInternal function
- Added fe test for JniFrontend::getSecretFromKeyStore
- Ran manual tests to make sure Impala can talk to OpenAI LLMs using
the 'ai_generate_text' built-in function. Example sql:
select ai_generate_text("https://api.openai.com/v1/chat/completions",
"hello", "gpt-3.5-turbo", "open-ai-key",
'{"temperature": 0.9, "model": "gpt-4"}')
- Tested using standalone UDF SDK and made sure that the UDFs can invoke
  BuiltInFunctions (ai_generate_text and ai_generate_text_default)

Change-Id: Id4446957f6030bab1f985fdd69185c3da07d7c4b
Reviewed-on: http://gerrit.cloudera.org:8080/21168
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-11 07:25:50 +00:00
Gabor Kaszab
df7aac9517 IMPALA-12970: Fix ConcurrentModificationException for Iceberg table scans
When a table is partitioned, IcebergScanNode sorts the file descriptors
for better scheduling. However, the list of file descriptors comes from
IcebergContentFileStore and is shared between different select queries
on the table. When another query iterates the list of file descriptors
while the IcebergScanNode sorts them, we get a
ConcurrentModificationException.
To solve this, IcebergScanNode now creates its own copy of the file
descriptor list so as not to interfere with other queries.

Manual testing:
300-400 SELECT * Iceberg queries were sent to Impala in a loop, which
reliably reproduced the original issue. With the fix the issue is
gone.
The queries used for the repro:
1:
select *
from functional_parquet.iceberg_v2_partitioned_position_deletes_orc a,
functional_parquet.iceberg_partitioned_orc_external b
where a.action = b.action and b.id=3;
2:
select *
from functional_parquet.iceberg_v2_equality_delete_schema_evolution;

Change-Id: Iafe57f05ffa0fa6a0875c141cfafd5ee1607a5c3
Reviewed-on: http://gerrit.cloudera.org:8080/21267
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-10 17:47:32 +00:00
Csaba Ringhofer
8ff51fbf74 IMPALA-5323: Support BINARY columns in Kudu tables
The patch adds read and write support for BINARY columns in Kudu
tables.

Predicate push down is implemented, but is incomplete:
a constant binary argument will only be pushed down if
constant folding never encounters non-ASCII strings.
Examples:
 - cast(unhex(hex("aa")) as binary) can be pushed down
 - cast(hex(unhex("aa")) as binary) can't be pushed
   down as unhex("aa") is not ascii (even though the
   final result is ascii)
See IMPALA-10349 for more details on this limitation.

The patch also changes casting BINARY <-> STRING from a no-op
to calling an actual function. While this may add some small
overhead, it allows the backend to know whether an expression
returns STRING or BINARY.
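
For illustration (table and column names are hypothetical), such casts
now go through a conversion function:

  select cast(bin_col as string) from kudu_tbl;
  select cast(str_col as binary) from kudu_tbl;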

Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Reviewed-on: http://gerrit.cloudera.org:8080/18868
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-10 16:17:15 +00:00
Riza Suminto
4764b91f42 IMPALA-12965: Add debug query option RUNTIME_FILTER_IDS_TO_SKIP
Runtime filters can still have negative effects in certain scenarios,
such as long wait times that delay scans, or cascading runtime filter
chains that prevent parallel execution of fragments. Having a debug
query option to simply skip a runtime filter id from being scheduled
can help us investigate and test a solution early, before implementing
the improvement code.

This patch adds the RUNTIME_FILTER_IDS_TO_SKIP option to do that. This
patch also improves the parsing of multi-value query options to not
split at a ',' char that is within two double quotes, and to ignore
empty/whitespace values if they exist.
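
For illustration (the filter ids are hypothetical), skipping two
runtime filters in a session looks like:

  set runtime_filter_ids_to_skip=1,2;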

Testing:
- Add BE test in query-options-test.cc
- Add FE test in runtime-filter-query-options.test

Change-Id: I897e37685dd1ec279989b55560ec7616a00d2280
Reviewed-on: http://gerrit.cloudera.org:8080/21230
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-09 21:35:53 +00:00
wzhou-code
e50bfa8376 IMPALA-12925: Fix decimal data type for external JDBC table
Decimal is a primitive data type in Impala. The current code returns
wrong values for columns with decimal data type in external JDBC
tables.

This patch fixes the wrong values returned from the JDBC data source,
and supports pushing down decimal-type predicates to the remote
database and to remote Impala.
The decimal precision and scale of the columns in an external JDBC
table must be no less than the decimal precision and scale of the
corresponding columns in the table of the remote database. Otherwise,
Impala fails with an error, since the mismatch may cause truncation of
decimal data.
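
As an illustration (table and column names are hypothetical), a
predicate like the following can now be pushed down to the remote
database:

  select * from jdbc_tbl where price < cast(99.99 as decimal(10,2));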

Testing:
 - Added a Planner test for pushing down decimal-type predicates.
 - Added end-to-end unit-tests for tables with decimal type of columns
   for Postgres, MySQL, and Impala-to-Impala.
 - Passed core-tests.

Change-Id: I8c9d2e0667c42c0e52436b158e3dfe3ec14b9e3b
Reviewed-on: http://gerrit.cloudera.org:8080/21218
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Reviewed-by: Abhishek Rawat <arawat@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-05 09:16:53 +00:00
Gabor Kaszab
da8704f90b IMPALA-12612: SELECT * queries expand complex type columns from Iceberg metadata tables
Similarly to how regular tables behave, nested columns used to be
omitted when doing a SELECT * on Iceberg metadata tables, and the user
needed to turn EXPAND_COMPLEX_TYPES on to include the nested columns in
the result. This patch changes this behaviour to unconditionally
include the nested columns from Iceberg metadata tables.
Note, the behaviour of handling nested columns from regular tables
doesn't change with this patch.
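
For illustration (the base table is one used elsewhere in the test
suite; 'snapshots' is a standard Iceberg metadata table), nested
columns such as the 'summary' map now show up without setting
EXPAND_COMPLEX_TYPES:

  select * from functional_parquet.iceberg_query_metadata.snapshots;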

Testing:
  - Adjusted the SELECT * metadata table queries to add the nested
    columns into the results.
  - Added some new tests where both metadata tables and regular tables
    were queried in the same query.

Change-Id: Ia298705ba54411cc439e99d5cb27184093541f02
Reviewed-on: http://gerrit.cloudera.org:8080/21236
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-04 14:04:51 +00:00
Riza Suminto
97adba5192 IMPALA-12881: Use getFkPkJoinCardinality to reduce scan cardinality
IMPALA-12018 adds reduceCardinalityForScanNode to lower cardinality
estimation when a runtime filter is involved. It calls
JoinNode.computeGenericJoinCardinality(). However, if the originating
join node has FK-PK conjunct, it should be possible to obtain a lower
cardinality estimate by calling JoinNode.getFkPkJoinCardinality()
instead.

This patch adds that analysis and calls
JoinNode.getFkPkJoinCardinality() when possible. It is, however, only
limited to runtime filters that evaluate at the storage layer, such as
partition filter and pushed-down Kudu filter. Row-level runtime filters
that evaluate at scan node will continue using
JoinNode.computeGenericJoinCardinality().

This distinction is because a storage layer filter is applied more
consistently than a row-level filter. For example, a partition filter
evaluates all partition ids and is never disabled regardless of its
precision (see HdfsScanNodeBase::PartitionPassesFilters). On the other
hand, the scan node can disable a row-level filter later on if it is
deemed ineffective / not precise enough (see
HdfsScanner::CheckFiltersEffectiveness,
LocalFilterStats::enabled_for_row, and the min_filter_reject_ratio
flag).
For the pushed-down Kudu filter, Impala will rely on Kudu to evaluate
the filter.

Runtime filters can arrive late as well. But for both storage layer
filters and row-level filters, the scan node can stop waiting and start
scanning after runtime_filter_wait_time_ms has passed. The scan node
will still evaluate a late runtime filter later on if the scan process
is still ongoing.

Also, note that this cardinality reduction algorithm is based only on
highly selective runtime filters to increase its estimate
confidence (see RuntimeFilter.isHighlySelective()).

Testing:
- Update TpcdsCpuCostPlannerTest.
- Pass FE tests.

Change-Id: I6efafffc8f96247a860b88e85d9097b2b4327f32
Reviewed-on: http://gerrit.cloudera.org:8080/21118
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-04 04:46:19 +00:00
Daniel Becker
a623447206 IMPALA-12899: Temporary workaround for BINARY in complex types
The BINARY type is currently not supported inside complex types and a
cross-component decision is probably needed to support it (see
IMPALA-11491). We would like to enable EXPAND_COMPLEX_TYPES for Iceberg
metadata tables (IMPALA-12612), which requires that queries with BINARY
inside complex types don't fail. Enabling EXPAND_COMPLEX_TYPES is a more
prioritised issue than IMPALA-11491, so we have come up with a
temporary solution.

This change NULLs out BINARY values in complex types coming from Iceberg
metadata tables and logs a warning.

BINARYs in complex types from regular tables are not affected by this
change.

Testing:
 - Added test queries in iceberg-metadata-tables.test.

Change-Id: I0d834126c7d702a25e957bb6071ecbf0fda2c203
Reviewed-on: http://gerrit.cloudera.org:8080/21219
Reviewed-by: Gabor Kaszab <gaborkaszab@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-03 14:53:53 +00:00
stiga-huang
effc9df933 IMPALA-12782: Show info of the event processing in /events webUI
The /events page of catalogd shows the metrics and status of the
event-processor. This patch adds more info to this page, including
 - lag info
 - current event batch that's being processed
See the screenshot attached to the JIRA for what it looks like.

Also moves the error message to the top to highlight the error status.
Fixes the issue of not updating the latest event id when the event
processor is stopped. Also fixes the issue of the error message not
being cleared after a global INVALIDATE METADATA.

Adds a debug action, catalogd_event_processing_delay, to inject a sleep
while processing an event, so the web page can be captured more easily.
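
For illustration, the delay could be injected at catalogd startup with
e.g. --debug_actions=catalogd_event_processing_delay:SLEEP@1000 (the
SLEEP@millis value syntax here is an assumption, borrowed from other
Impala debug actions).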

Also adds a missing test for showing the error message of
event-processing in the /events page.

Tests:
 - Add e2e test to verify the content of the page.

Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Reviewed-on: http://gerrit.cloudera.org:8080/20986
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-02 18:40:26 +00:00
Daniel Becker
63f52807f0 IMPALA-12611: Add support to MAP type Iceberg Metadata table columns
This change adds support for querying MAP types from Iceberg Metadata
tables.

The 'IcebergMetadataScanner.ArrayScanner' Java class is renamed to
'CollectionScanner' and extended to be able to handle maps. For arrays
the iteration returns the elements as before; for maps it returns
'Map.Entry' objects.

Note that collections in the FROM clause are still not supported.
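
For illustration ('summary' is a map<string,string> column of Iceberg's
standard 'snapshots' metadata table; the base table is one used in the
test suite), a map column can now be read like:

  select snapshot_id, summary
  from functional_parquet.iceberg_query_metadata.snapshots;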

Testing:
- Added E2E tests in iceberg-metadata-tables.test.

Change-Id: I8a8b3a574ca45c893315c3b41b33ce4e0eff865a
Reviewed-on: http://gerrit.cloudera.org:8080/21125
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-02 18:30:39 +00:00
Daniel Becker
72732da9d8 IMPALA-12609: Implement SHOW METADATA TABLES IN statement to list Iceberg Metadata tables
After this change, the new SHOW METADATA TABLES IN statement can be used
to list all the available metadata tables of an Iceberg table.

Note that in contrast to querying the contents of Iceberg metadata tables,
this does not require fully qualified paths, e.g. both
  SHOW METADATA TABLES IN functional_parquet.iceberg_query_metadata;
and
  USE functional_parquet;
  SHOW METADATA TABLES IN iceberg_query_metadata;
work.

The available metadata tables for all Iceberg tables are the same,
corresponding to the values of the enum
"org.apache.iceberg.MetadataTableType", so there is actually no need to
pass the name of the regular table for which the metadata table list is
requested through Thrift. This change, however, does send the table name
because this way
 - if we add support for metadata tables for other table formats, the
   table name/path will be necessary to determine the correct list of
   metadata tables
 - we could later add support for different authorisation policies for
   individual tables
 - we can also check, at the point of generating the list of metadata
   tables, that the table is an Iceberg table

Testing:
 - added and updated tests in ParserTest, AnalyzeDDLTest, ToSqlTest and
   AuthorizationStmtTest
 - added a custom cluster test in test_authorization.py
 - added functional tests in iceberg-metadata-tables.test

Change-Id: Ide10ccf10fc0abf5c270119ba7092c67e712ec49
Reviewed-on: http://gerrit.cloudera.org:8080/21026
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
2024-04-02 09:58:37 +00:00
jasonmfehr
f55077007b IMPALA-12426: Switches the duration fields to be stored in decimal seconds.
The original implementation of the completed queries table
stored durations in integer nanoseconds. This change
modifies the duration fields to be stored as seconds with
up to three fractional digits (millisecond precision).

Also reduces the default max number of queued queries to a
number that will not consume as much memory.

Existing sys.impala_query_log tables will need to be
dropped.

Testing was accomplished by modifying the python custom
cluster tests.

Change-Id: I842951a132b7b8eadccb09a3674f4c34ac42ff1b
Reviewed-on: http://gerrit.cloudera.org:8080/21203
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-30 03:48:14 +00:00
Michael Smith
c529b855e9 IMPALA-12626: Add Tables Queried to profile/history
Adds "Tables Queried" to the query profile, enumerating a
comma-separated list of tables accessed during a query:

  Tables Queried: tpch.customer,tpch.lineitem

Also adds "tables_queried" to impala_query_log and impala_query_live
with the same content.

Requires 'drop table sys.impala_query_log' to recreate it with the new
column.

Change-Id: I9c9c80b2adf7f3e44225a191fe8eb9df3c4bc5aa
Reviewed-on: http://gerrit.cloudera.org:8080/20886
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-29 11:04:17 +00:00
Daniel Becker
9071030f7f IMPALA-12809: Iceberg metadata table scanner should always be scheduled to the coordinator
On clusters with dedicated coordinators and executors, the Iceberg
metadata scanner fragment(s) can be scheduled on executors, for example
during a join. The fragment in this case will fail a precondition
check, because either the 'frontend_' object or the table will not be
present.

This change forces Iceberg metadata scanner fragments to be scheduled on
the coordinator. It is not enough to set the DataPartition type to
UNPARTITIONED, because unpartitioned fragments can still be scheduled on
executors. This change introduces a new flag in the TPlanFragment thrift
struct - if it is true, the fragment is always scheduled on the
coordinator.

Testing:
 - Added a regression test in test_coordinators.py.
 - Added a new planner test with two metadata tables and a regular table
   joined together.

Change-Id: Ib4397f64e9def42d2b84ffd7bc14ff31df27d58e
Reviewed-on: http://gerrit.cloudera.org:8080/21138
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-29 04:40:31 +00:00
Michael Smith
9ac55828f3 IMPALA-12540: (Fixup) Add EventSequence arg to load
Adds a new argument from IMPALA-12443 to Table#load.

Change-Id: I46185e9c0095cc470178e0d2d45d10a1803bff99
Reviewed-on: http://gerrit.cloudera.org:8080/21222
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Tested-by: Michael Smith <michael.smith@cloudera.com>
2024-03-28 18:41:25 +00:00
Michael Smith
45995e6892 IMPALA-12540: Query Live Table
Defines SystemTables, which are in-memory tables that can provide
access to Impala state. Adds the 'impala_query_live' table to the
database 'sys', which already exists for 'sys.impala_query_log'.

Implements the 'impala_query_live' table to view active queries across
all coordinators sharing the same statestore. SystemTables create new
SystemTableScanNodes for their scan node implementation. When computing
scan range locations, SystemTableScanNode creates a scan range for each
coordinator in the cluster (identified via ClusterMembershipMgr). This
produces a plan that looks like:

Query: explain select * from sys.impala_query_live
+------------------------------------------------------------+
| Explain String                                             |
+------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=4.00MB Threads=2 |
| Per-Host Resource Estimates: Memory=11MB                   |
| WARNING: The following tables are missing relevant table   |
| and/or column statistics.                                  |
| sys.impala_query_live                                      |
|                                                            |
| PLAN-ROOT SINK                                             |
| |                                                          |
| 01:EXCHANGE [UNPARTITIONED]                                |
| |                                                          |
| 00:SCAN SYSTEM_TABLE [sys.impala_query_live]               |
|    row-size=72B cardinality=20                             |
+------------------------------------------------------------+

Impala's scheduler checks whether the query contains fragments that
can be scheduled on coordinators and, if present, includes an
ExecutorGroup containing all coordinators. These are used to schedule
scan ranges that are flagged as 'use_coordinator', allowing
SystemTableScanNodes to be scheduled on dedicated coordinators and
outside the selected executor group.

Execution will pull data from ImpalaServer on the backend via a
SystemTableScanner implementation based on table name.

In the query profile, SYSTEM_TABLE_SCAN_NODE includes
ActiveQueryCollectionTime and PendingQueryCollectionTime to track time
spent collecting QueryState from ImpalaServer.

Grants QueryScanner private access to ImpalaServer, identical to how
ImpalaHttpHandler accesses internal server state.

Adds custom cluster tests for impala_query_live, and unit tests for
changes to planner and scheduler.

Change-Id: Ie2f9a449f0e5502078931e7f1c5df6e0b762c743
Reviewed-on: http://gerrit.cloudera.org:8080/20762
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-28 16:34:48 +00:00
Zoltan Borok-Nagy
b03cfcf2ad IMPALA-12894: (part 2) Fix optimized count(*) for Iceberg tables with dangling delete files
Impala can return incorrect results if a table has dangling delete
files. Dangling delete files are delete files that are part of the
snapshot but are not applicable to any of the data files. We can
have such delete files after Spark's rewrite_data_files action.

During analysis we check the existence of delete files based on the
snapshot summary. If there are no delete files in the table, we just
replace the count(*) expression with NumericLiteral($record_count).
If there are delete files in the table (based on the summary), we set
optimize_count_star_for_iceberg_v2 in the query context.

Without optimize_count_star_for_iceberg_v2 in the query context, the
IcebergScanPlanner would create the following plan.

    AGGREGATE
    COUNT(*)
        |
    UNION ALL
   /         \
  /           \
 /             \
SCAN all    ANTI JOIN
datafiles  /         \
without   /           \
deletes  SCAN         SCAN
         datafiles    deletes
         with deletes

With optimize_count_star_for_iceberg_v2 the final plan looks like
the following:

      ArithmeticExpr(ADD)
      /             \
     /               \
    /                 \
record_count       AGGREGATE
of all             COUNT(*)
datafiles              |
without            ANTI JOIN
deletes           /         \
                 /           \
                SCAN        SCAN
                datafiles   deletes
                with deletes

The ArithmeticExpr(ADD) and its left child (record_count) are created
by the analyzer; IcebergScanPlanner is responsible for creating the
plan under AGGREGATE COUNT(*). And if the table has delete files and
optimize_count_star_for_iceberg_v2 is true, it knows it can omit
the original UNION ALL and its left child.

However, IcebergScanPlanner checks delete file existence based on the
result of planFiles(), in which dangling delete files are already
eliminated. And if there are no delete files, IcebergScanPlanner
assumes that case has already been handled by the Analyzer (i.e. that
it replaced count(*) with NumericLiteral($record_count)). So it
incorrectly creates a normal SCAN plan of the table under COUNT(*),
i.e. we end up with this:

      ArithmeticExpr(ADD)
      /             \
     /               \
    /                 \
record_count       AGGREGATE
of all             COUNT(*)
datafiles              |
without              SCAN
deletes            datafiles
                   without
                   deletes

Which means Impala will yield $record_count * 2 as a result.

This patch fixes the FeIcebergTable.hasDeleteFiles() method, so it
also ignores dangling delete files. Therefore, the analyzer will just
substitute count(*) with NumericLiteral($record_count) if all deletes
are dangling, i.e. no need to involve the IcebergScanPlanner at all.

The patch also introduces a new query option,
"iceberg_disable_count_star_optimization", so users can completely
disable the statistics-based count(*) optimization if necessary.
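
For illustration (the table name is hypothetical), the optimization can
be turned off per session:

  set iceberg_disable_count_star_optimization=true;
  select count(*) from ice_tbl;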

Testing:
 * e2e tests
 * planner tests

Change-Id: Ie3aca0b0a104f9ca4589cde9643f3f341d4ff99f
Reviewed-on: http://gerrit.cloudera.org:8080/21190
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-28 15:17:40 +00:00
Gabor Kaszab
73171cb716 IMPALA-12729: Allow creating primary keys for Iceberg tables
There are writer engines that use Iceberg's identifier-field-ids from
the Iceberg schema to identify the columns to be written into the
equality delete files (Flink, NiFi). So far Impala wasn't able to
populate these identifier-field-ids. This patch introduces support
for not-enforced primary keys for Iceberg tables, where the primary key
is used for setting identifier-field-ids during Iceberg schema
creation.

Example syntax:
CREATE TABLE ice_tbl (
  i int NOT NULL,
  j int,
  s string NOT NULL
  primary key(i, s) not enforced)
PARTITIONED BY SPEC (truncate(10, s))
STORED AS ICEBERG;

There are some constraints with primary keys (PK) following the
behavior of Flink:
 - Only NOT NULL columns can be in the PK.
 - PK is not allowed in the column definition level like
   'i int NOT NULL PRIMARY KEY'.
 - If the table is partitioned then the partition columns have to be
   part of the PK.
 - Float and double columns are not allowed for the PK.
 - Not allowed to drop a column that is used as a PK.

Testing:
 - New E2E tests added for different table creation scenarios.
 - Manual test to use Nifi for writing into a table with PK.

Change-Id: I7bea787acdabd8cb04661f4ddb5c3309af0364a6
Reviewed-on: http://gerrit.cloudera.org:8080/21149
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-28 13:57:07 +00:00
Zoltan Borok-Nagy
580a477e69 IMPALA-12879: Conjunct not referring to table field causes ERROR for Iceberg table
The following query throws an error for Iceberg tables:

 select * from ice_tbl where rand() < 0.001;

It's because the predicate 'rand() < 0.001' doesn't involve any table
columns. Because of a bug in
IcebergScanPlanner.hasPartitionTransformType() the method throws an
IndexOutOfBoundsException. This patch fixes the method to handle
such predicates.

Testing:
 * added e2e tests

Change-Id: Id43a6798df3f4cc3a0e00ac610e25aa3b5781342
Reviewed-on: http://gerrit.cloudera.org:8080/21179
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Gabor Kaszab <gaborkaszab@cloudera.com>
2024-03-27 09:25:36 +00:00
Sai Hemanth Gantasala
52b11ab6aa IMPALA-12487: Skip reloading file metadata for ALTER_TABLE events with
trivial changes in StorageDescriptor

IMPALA-11534 skips reloading file metadata for some trivial ALTER_TABLE
events. However, ALTER_TABLE events that have trivial changes in
StorageDescriptor are not handled in IMPALA-11534. The only changes
that require file metadata reload are: location, rowformat, fileformat,
serde, and storedAsSubDirectories. The file metadata reload can be
skipped for all other changes in SD.

Testing:
1) Manual testing by changing SD parameters in local environment.
2) Added unit tests for the same in MetastoreEventsProcessorTest class.

Change-Id: I6fd9a9504bf93d2529dc7accbf436ad83e51d8ac
Reviewed-on: http://gerrit.cloudera.org:8080/21019
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-27 08:27:29 +00:00
wzhou-code
0a077fe992 IMPALA-12928: Mask JDBC table property dbcp.password for DESC FORMATTED and SHOW CREATE TABLE
The 'desc formatted' and 'show create table' commands show all table
properties in clear text. For external JDBC tables, the dbcp.password
table property value should be masked in the output of these two
commands.

This patch masks the dbcp.password property value in the output of the
'desc formatted' and 'show create table' commands.

The dbcp.password table property could be written into Impala and HMS
log files with JDBC table creation statements. There are generic tools
in production environments with which users can set up regular
expressions to detect and redact sensitive information within SQL
statement text in log files.

Testing:
 - Added end-to-end test cases.
 - Passed core tests.

Change-Id: I83dc32c8d0fec1cdfdfe06e720561b2ae1adf5df
Reviewed-on: http://gerrit.cloudera.org:8080/21187
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-27 05:09:04 +00:00
wzhou-code
c0507c02cd IMPALA-12896 (Part 2): JDBC table must be created as external table
In some deployment environments, the default table type is
transactional. In these scenarios, JDBC tables that are created as
non-external tables are not accepted by HMS due to strict managed
table check failures.

This patch forces JDBC tables to be created as external tables, and
requires at least 1 column for JDBC tables.

Testing:
 - Updated frontend unit tests and end-to-end unit tests to create JDBC
   tables as external tables.
 - Passed core tests

Change-Id: Ib5533b52434cdf1c430e30ac28a0146ab4d9d4b9
Reviewed-on: http://gerrit.cloudera.org:8080/21159
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-23 09:54:30 +00:00
stiga-huang
0d49c9d6cc IMPALA-12929: Skip loading HDFS permissions in local-catalog mode
HDFS file/dir permissions are not used at all in local catalog mode - in
LocalFsTable, hasWriteAccessToBaseDir() always returns true and
getFirstLocationWithoutWriteAccess() always returns null.

However, in catalogd, we still load them (in a single thread per
table!), which could dominate the table loading time when there are
lots of partitions. Note that the table loading process in catalogd is
the same no matter what catalog mode is in use. The difference between
catalog modes is mainly in how coordinators get metadata from catalogd.
Local catalog mode is turned on by setting --catalog_topic_mode=minimal
on catalogd and --use_local_catalog=true on coordinators.

This patch skips loading HDFS permissions on catalogd when running in
local catalog mode. We can revisit it in IMPALA-7539.

Tests:
 - Ran CORE tests

Change-Id: I5baa9f6ab0d3888a78ff161ae5caa19e85bc983a
Reviewed-on: http://gerrit.cloudera.org:8080/21178
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-22 14:17:58 +00:00
stiga-huang
d7b5819f90 IMPALA-12443: Add catalog timeline for all DDL profiles
This is a follow-up to IMPALA-12024, which added the catalog
timeline for CreateTable statements. Using the same mechanism, this
patch adds a catalog timeline for all DDL/DML profiles, including
REFRESH and INSERT.

The goal is to add timeline markers after each step that could be
blocked, e.g. acquiring locks or making external RPCs, so we can better
debug slow DDLs with the catalog timeline in profiles.

Tried to add some constant strings for widely used events, e.g. "Fetched
table from Metastore". Didn't do so for events that only occur once.

Most of the catalog methods now have a new argument for tracking the
execution timeline. To avoid adding null checks everywhere, code
paths that don't need a catalog profile, e.g. EventProcessor, use a
static no-op EventSequence as the argument. We can replace it in future
work, e.g. to expose the execution timeline of slow processing of an
HMS event.

This patch also removes some unused overloads of HdfsTable#load() and
HdfsTable#reloadPartitionsFromNames().

Example timeline for a REFRESH statement on an unloaded table
(IncompleteTable):
Catalog Server Operation: 2s300ms
   - Got catalog version read lock: 26.407us (26.407us)
   - Start loading table: 314.663us (288.256us)
   - Got Metastore client: 629.599us (314.936us)
   - Fetched table from Metastore: 7.248ms (6.618ms)
   - Loaded table schema: 27.947ms (20.699ms)
   - Preloaded permissions cache for 1824 partitions: 1s514ms (1s486ms)
   - Got access level: 1s514ms (588.314us)
   - Created partition builders: 2s103ms (588.270ms)
   - Start loading file metadata: 2s103ms (49.760us)
   - Loaded file metadata for 1824 partitions: 2s282ms (179.839ms)
   - Async loaded table: 2s289ms (6.931ms)
   - Loaded table from scratch: 2s289ms (72.038us)
   - Got table read lock: 2s289ms (2.289us)
   - Finished resetMetadata request: 2s300ms (10.188ms)

Example timeline for an INSERT statement:
Catalog Server Operation: 178.120ms
   - Got catalog version read lock: 4.238us (4.238us)
   - Got catalog version write lock and table write lock: 52.768us (48.530us)
   - Got Metastore client: 15.768ms (15.715ms)
   - Fired Metastore events: 156.650ms (140.882ms)
   - Got Metastore client: 163.317ms (6.666ms)
   - Fetched table from Metastore: 166.561ms (3.244ms)
   - Start refreshing file metadata: 167.961ms (1.399ms)
   - Loaded file metadata for 24 partitions: 177.679ms (9.717ms)
   - Reloaded table metadata: 178.021ms (342.261us)
   - Finished updateCatalog request: 178.120ms (98.929us)

Example timeline for a "COMPUTE STATS tpcds_parquet.store_sales":
Catalog Server Operation: 6s737ms
   - Got catalog version read lock: 19.971us (19.971us)
   - Got catalog version write lock and table write lock: 50.255us (30.284us)
   - Got Metastore client: 171.819us (121.564us)
   - Updated column stats: 25.560ms (25.388ms)
   - Got Metastore client: 69.298ms (43.738ms)
   - Altered 500 partitions in Metastore: 1s894ms (1s825ms)
   - Altered 1000 partitions in Metastore: 3s558ms (1s664ms)
   - Altered 1500 partitions in Metastore: 5s144ms (1s586ms)
   - Altered 1824 partitions in Metastore: 6s205ms (1s060ms)
   - Got Metastore client: 6s205ms (329.481us)
   - Altered table in Metastore: 6s216ms (11.073ms)
   - Got Metastore client: 6s216ms (13.377us)
   - Fetched table from Metastore: 6s219ms (2.419ms)
   - Loaded table schema: 6s223ms (4.130ms)
   - Got current Metastore event id 19017: 6s639ms (415.690ms)
   - Start loading file metadata: 6s639ms (9.591us)
   - Loaded file metadata for 1824 partitions: 6s729ms (90.196ms)
   - Reloaded table metadata: 6s735ms (5.865ms)
   - DDL finished: 6s737ms (2.255ms)

Example timeline for a global INVALIDATE METADATA:
Catalog Server Operation: 301.618ms
   - Got catalog version write lock: 9.908ms (9.908ms)
   - Got Metastore client: 9.922ms (14.013us)
   - Got database list: 11.396ms (1.473ms)
   - Loaded functions of default: 44.919ms (33.523ms)
   - Loaded TableMeta of 82 tables in database default: 47.524ms (2.604ms)
   - Loaded functions of functional: 50.846ms (3.321ms)
   - Loaded TableMeta of 101 tables in database functional: 52.580ms (1.734ms)
   - Loaded functions of functional_avro: 54.861ms (2.281ms)
   - Loaded TableMeta of 35 tables in database functional_avro: 55.789ms (928.120us)
   ...
   - Loaded functions of tpch_text_gzip: 299.503ms (1.710ms)
   - Loaded TableMeta of 8 tables in database tpch_text_gzip: 300.288ms (784.725us)
   - Updated catalog cache: 300.366ms (78.045us)
   - Finished resetMetadata request: 301.618ms (1.251ms)

Tests:
 - Add e2e test to verify the catalog timeline in some DDLs.
 - Ran CORE tests

Change-Id: Ifbceefaeb24c66eb1a064c449d6f56077ea347c5
Reviewed-on: http://gerrit.cloudera.org:8080/20491
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-21 16:04:14 +00:00
wzhou-code
eb5c8d6884 IMPALA-12802: Support ALTER TABLE for JDBC tables
IMPALA-12793 changed the syntax for creating JDBC tables. The
configurations of connection credentials - url, username, password,
jdbc driver, etc. - are set as table properties.

This patch allows users to change these table properties, or edit
columns, via the ALTER TABLE statement.
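
For illustration (table name and value are hypothetical; dbcp.password
is one of the JDBC connection table properties), such a change looks
like:

  alter table jdbc_tbl set tblproperties ('dbcp.password'='new_pass');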

Testing:
 - Added frontend analysis unit-tests.
 - Added end-to-end unit-test.
 - Passed Core tests

Change-Id: I5ebb5de2c686d2015db78641f78299dd5f33621e
Reviewed-on: http://gerrit.cloudera.org:8080/21088
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-20 09:01:30 +00:00
jasonmfehr
711a9f2bad IMPALA-12426: Query History Table
Adds the ability for users to specify that Impala will create and
maintain an internal Iceberg table that contains data about all
completed queries. This table is automatically created at startup by
each coordinator if it does not exist. Then, most completed queries are
queued in memory and flushed to the query history table at a set
interval (either minutes or number of records). Set, use, and show
queries are not written to this table. This commit leverages the
InternalServer class to maintain the query history table.

Ctest unit tests have been added to exercise the various pieces of
code. New custom cluster tests have been added to assert that the query
history table is properly populated with completed queries.

Negative testing consists of attempting sql injection attacks and
syntactically incorrect queries.

Impala built-in string functions benchmarks have been updated to include
the new built-in functions.

Change-Id: I2d2da9d450fba4e789400cfa62927fc25d34f844
Reviewed-on: http://gerrit.cloudera.org:8080/20770
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-19 22:17:16 +00:00
Kurt Deschler
4477398ae4 IMPALA-12818: Intermediate Result Caching plan node framework
This patch adds a plan node framework for caching of intermediate result
tuples within a query. Actual caching of data will be implemented in
subsequent patches.

A new plan node type TupleCacheNode is introduced for brokering caching
decisions at runtime. If the result is in the cache, the TupleCacheNode will
return results from the cache and skip executing its child node. If the
result is not cached, the TupleCacheNode will execute its child node and
mirror the resulting RowBatches to the cache.

The TupleCachePlanner decides where to place the TupleCacheNodes. To
calculate eligibility and cache keys, the plan must be in a stable state
that will not change shape. TupleCachePlanner currently runs at the end
of planning after the DistributedPlanner and ParallelPlanner have run.
As a first cut, TupleCachePlanner places TupleCacheNodes at every
eligible location. Eligibility is currently restricted to immediately
above HdfsScanNodes. This implementation will need to incorporate cost
heuristics and other policies for placement.

Each TupleCacheNode has a hash key that is generated from the logical
plan below it, for the purpose of identifying results that have been
cached by semantically equivalent query subtrees. The initial
implementation of the subtree hash uses the plan Thrift to uniquely
identify the subtree.

Tuple caching is enabled by setting the enable_tuple_cache query option
to true. As a safeguard during development, enable_tuple_cache can only
be set to true if the "allow_tuple_caching" startup option is set to
true. It defaults to false to minimize the impact for production clusters.
bin/start-impala-cluster.py sets allow_tuple_caching=true by default
to enable it in the development environment.

Testing:
 - This adds a frontend test that does basic checks for cache keys and
   eligibility
 - This verifies the presence of the caching information in the explain
   plan output.

Change-Id: Ia1f36a87dcce6efd5d1e1f0bc04009bf009b1961
Reviewed-on: http://gerrit.cloudera.org:8080/21035
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Reviewed-by: Yida Wu <wydbaggio000@gmail.com>
Reviewed-by: Kurt Deschler <kdeschle@cloudera.com>
2024-03-14 20:24:27 +00:00
Csaba Ringhofer
691604b1d1 IMPALA-12835: Fix event processing without hms_event_incremental_refresh_transactional_table
If hms_event_incremental_refresh_transactional_table is false, then
for non-partitioned ACID tables Impala needs to rely on the alter table
event to detect INSERTs in Hive. This patch changes the event processor
to not skip reloading files when processing the alter table event
for this specific type of table (even if the changes in the table
look trivial).

Testing:
- added a simple regression test

Change-Id: I137b289f0e5f7c9c1947e2a3b30258c979f20987
Reviewed-on: http://gerrit.cloudera.org:8080/21116
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-14 14:14:00 +00:00
wzhou-code
6c0c26146d IMPALA-12896: Avoid JDBC table to be set as transactional table
In some deployment environments, JDBC tables are set as transactional
tables by default. This causes catalogd to fail to load the metadata
for JDBC tables. This patch explicitly adds the table property
"transactional=false" for JDBC tables to avoid them being set as
transactional tables.

Operations on JDBC tables are processed only on the coordinator. The
processed rows should be estimated as 0 for DataSourceScanNode by the
planner so that coordinator-only query plans are generated for simple
queries on JDBC tables and queries can be executed without involving
executor nodes. Also adds a Preconditions check to make sure numNodes
equals 1 for DataSourceScanNode.

Updates the FileSystemUtil.copyFileFromUriToLocal() function to write a
log message for all types of exceptions.

Testing:
 - Fixed planner tests for data source tables.
 - Ran end-to-end tests of JDBC tables with query option
   'exec_single_node_rows_threshold' as default value 100.
 - Passed core-tests.

Change-Id: I556faeda923a4a11d4bef8c1250c9616f77e6fa6
Reviewed-on: http://gerrit.cloudera.org:8080/21141
Reviewed-by: Riza Suminto <riza.suminto@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-13 20:40:26 +00:00
Gabor Kaszab
ada4090e09 IMPALA-12894: (part 1) Turn off the count(*) optimisation for V2 Iceberg tables
This part 1 change turns off the count(*) optimisation for
V2 tables, as there is a correctness issue with it. The reason is that
Spark compaction may leave some dangling delete files that mess up
the logic in Impala.

Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Reviewed-on: http://gerrit.cloudera.org:8080/21139
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-13 18:59:07 +00:00
stiga-huang
ab6c9467f6 IMPALA-12831: Fix HdfsTable.toMinimalTCatalogObject() failed by concurrent modification
HdfsTable.toMinimalTCatalogObject() is not always invoked while holding
the table lock, e.g. when invalidating a table, we could replace an
HdfsTable instance with an IncompleteTable instance. We then invoke
HdfsTable.toMinimalTCatalogObject() to get the removed catalog object.
However, the HdfsTable instance could be modified in the meantime by a
concurrent DDL/DML that reloads it, e.g. a REFRESH statement. This
causes HdfsTable.toMinimalTCatalogObject() to fail with a
ConcurrentModificationException on the column/partition list.

This patch fixes the issue by try-acquiring the table read lock in
HdfsTable.toMinimalTCatalogObject(). If that fails, the partition ids
and names won't be returned. Also updates the method to not collect the
column list since it's unused.

Tests
 - Added e2e test
 - Ran CORE tests

Change-Id: Ie9f8e65c0bd24000241eedf8ca765c1e4e14fdb3
Reviewed-on: http://gerrit.cloudera.org:8080/21072
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-12 12:06:40 +00:00
Daniel Becker
b18999fe09 IMPALA-12845: Crash with DESCRIBE on a complex type from an Iceberg table
A DESCRIBE statement on a complex column contained in an Iceberg table
runs into a DCHECK and crashes Impala. An example with an array:

  describe functional_parquet.iceberg_resolution_test_external.phone

Note that this also happens with Iceberg metadata tables, for example:
  describe functional_parquet.iceberg_query_metadata.\
      entries.readable_metrics;

With non-Iceberg tables there is no error.

The problem is that for Iceberg tables, the DESCRIBE statement returns
four columns: "name", "type", "comment" and "nullable" (only Iceberg and
Kudu tables have "nullable"). However, the DESCRIBE statement response
for complex types only contains the first three columns, i.e. no column
for "nullable". But as the table is an Iceberg table, the 'metadata_'
field of HS2ColumnarResultSet is still populated with four columns.

The DCHECK in HS2ColumnarResultSet::AddOneRow() expects the number of
columns to be the same in the DESCRIBE statement response and the
'metadata_' field.

This commit solves the problem by only adding the "nullable" column to
the 'metadata_' field if the target of the DESCRIBE statement is a
table, not a complex type.

Note that Kudu tables do not support complex types so this issue does
not arise there.

This change also addresses a minor issue: DescribeTableStmt::analyze()
did not check whether the statement was already analyzed and did not set
the 'analyzer_' field which would indicate that analysis had already
been done. This is now corrected.

Testing:
 - added tests in describe-path.test for arrays, maps and structs from
   regular Iceberg tables and metadata tables.

Change-Id: I5eda21a41167cc1fda183aa16fd6276a6a16f5d3
Reviewed-on: http://gerrit.cloudera.org:8080/21105
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-08 19:48:33 +00:00
Venu Reddy
b7ddbcad0d IMPALA-12832: Implicit invalidate metadata on event failures
At present, a failure in event processing needs a manual invalidate
metadata. This patch implicitly invalidates the table upon failures
in the processing of table events with the new
'invalidate_metadata_on_event_processing_failure' flag. And a new
'invalidate_global_metadata_on_event_processing_failure' flag is
added to run a global invalidate metadata automatically when the event
processor goes into a non-active state.

Note: Also introduced a config,
'inject_process_event_failure_event_types', for automated tests to
simulate event processor failures. This config is used to specify which
event types can be intentionally failed. It should only be used for
testing purposes. Needs IMPALA-12851 as a prerequisite.

Testing:
- Added end-to-end tests to mimic failures in the event processor and
verified that the event processor is active
- Added unit test to verify the 'auto_global_invalidate_metadata' config
- Passed FE tests

Co-Authored-by: Sai Hemanth Gantasala <saihemanth@cloudera.com>

Change-Id: Ia67fc04c995802d3b6b56f79564bf0954b012c6c
Reviewed-on: http://gerrit.cloudera.org:8080/21065
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-08 14:46:02 +00:00
stiga-huang
b9c2e00a6b IMPALA-12855: Fix NPE in firing RELOAD events when the partition doesn't exist
When --enable_reload_events is set to true, catalogd will fire RELOAD
events for INVALIDATE/REFRESH statements. When the RELOAD event is fired
successfully for a REFRESH statement, we also update lastRefreshEventId
of the table/partition. This part could hit a NullPointerException when
the partition is dropped by concurrent DDLs.

This patch skips updating lastRefreshEventId if the partition doesn't
exist. Note that ideally we should hold the table lock of REFRESH until
we finish firing the RELOAD events and updating lastRefreshEventId, so
no concurrent operations can drop the partition. However, when the
table is loaded from scratch, we don't actually hold the table write
lock. We just load the table and take a read lock to get the thrift
object. The partition could still be dropped concurrently after the
load and before taking the read lock. So ignoring missing partitions is
a simpler solution.

Refactors some code of fireReloadEventAndUpdateRefreshEventId to reduce
indentation and avoid acquiring the table lock if no events are fired.
Adds error messages to some Precondition checks in methods used by this
feature. Also refactors Table.getFullName() to not always construct
the result. Improves the logs for not reloading a partition for an
event.

Tests:
 - Add e2e test

Change-Id: I01af3624bf7cf5cd69935cffa28d54f6a6807504
Reviewed-on: http://gerrit.cloudera.org:8080/21096
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-06 16:41:58 +00:00
Riza Suminto
e666e07110 IMPALA-12678: Deflake test_skipping_batching_events
test_skipping_batching_events is flaky. It expects that the REFRESH
query will arrive before ALTER_PARTITION is polled and processed, but
the opposite can happen too.

This patch deflakes the test by injecting a delay inside
MetastoreEvents.getFilteredEvents() rather than increasing
hms_event_polling_interval_s. The delay injection is specified through
the debug_actions flag. This patch also adds methods in ImpaladProcess
and CatalogdProcess to help change the JVM log level from a pytest
method.

Testing:
- Loop and pass test_skipping_batching_events 100 times.

Change-Id: Ia6e4cd1e9492e3ce75f5089038b90d0af4fbdb0f
Reviewed-on: http://gerrit.cloudera.org:8080/21107
Reviewed-by: Sai Hemanth Gantasala <saihemanth@cloudera.com>
Reviewed-by: Jason Fehr <jfehr@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-06 16:05:53 +00:00
Tamas Mate
f68d91dcee IMPALA-12610: Support reading ARRAY columns for Iceberg Metadata tables
This commit adds support for reading ARRAY columns inside Iceberg
Metadata tables.

The change starts with some refactoring: to consolidate accessing the
JVM through JNI, a new backend class, IcebergMetadataScanner, was
introduced. This class is the C++ counterpart of the Java
IcebergMetadataScanner; it is responsible for managing the Java scanner
object.

In Iceberg, array types do not have accessors, so structs inside arrays
have to be accessed by position; the value-obtaining logic has been
changed to allow access by position.

The IcebergRowReader needed an IcebergMetadataScanner, so that it can
iterate over the arrays returned by the scanner and add them to the
collection.

This change does not cover MAP; these slots are set to NULL. MAP
support will be done in IMPALA-12611.
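
For illustration ('partition_summaries' is an array column of Iceberg's
standard 'manifests' metadata table; the base table is one used in the
test suite), an array column can now be read like:

  select path, partition_summaries
  from functional_parquet.iceberg_query_metadata.manifests;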

Testing:
 - Added E2E tests.

Change-Id: Ieb9bac1822a17bd3cd074b4b98e4d010703cecb1
Reviewed-on: http://gerrit.cloudera.org:8080/21061
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Gabor Kaszab <gaborkaszab@cloudera.com>
2024-03-05 14:56:40 +00:00
Zoltan Borok-Nagy
4428db37b3 IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations
We must invoke validateDataFilesExist for RowDelta operations (DELETE/
UPDATE/MERGE). Without this a concurrent RewriteFiles (compaction) and
RowDelta can corrupt a table.

IcebergBufferedDeleteSink now also collects the filenames of the data
files that are referenced in the position delete files. It adds them to
the DML exec state which is then collected by the Coordinator. The
Coordinator passes the file paths to CatalogD which executes Iceberg's
RowDelta operation and now invokes validateDataFilesExist() with the
file paths. Additionally it also invokes validateDeletedFiles().

This patch set also resolves IMPALA-12640 which is about replacing
IcebergDeleteSink with IcebergBufferedDeleteSink, as from now on
we use the buffered version for all DML operations that write
position delete files.

Testing:
 * adds new stress test with DELETE + UPDATE + OPTIMIZE

Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c
Reviewed-on: http://gerrit.cloudera.org:8080/21099
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Gabor Kaszab <gaborkaszab@cloudera.com>
2024-03-05 09:51:24 +00:00
Venu Reddy
784971c018 IMPALA-12851: Fix AllocWriteIdEvent process issue to add txnId-tableWriteIds mapping
During AllocWriteIdEvent processing, the txnId-to-tableWriteIds mapping
is not added to the catalog in the following cases:
1. When a CREATE_TABLE event is followed by an ALLOC_WRITE_ID_EVENT for
the table in the same batch of MetastoreEventsProcessor.processEvents(),
processing the AllocWriteIdEvent cannot find the catalog table, since
CREATE_TABLE has not been processed by the time the AllocWriteIdEvent
object is constructed.
2. When the catalog table is present but not loaded.

This patch fixes:
1. Removes the usage of getting the table from the catalog in all the
event constructors. Currently, AllocWriteIdEvent, ReloadEvent, and
CommitCompactionEvent get the catalog table in their constructors.
2. Adds the txnId-to-tableWriteIds mapping in the catalog even when the
table is not loaded, and ensures the write ids are not added to the
table if it is a non-partitioned table.
3. Also fixes a bug in TableWriteId's hashCode() implementation that
breaks the hashCode contract: two equal TableWriteId instances produce
different hash codes.
4. Fixes a CatalogHmsSyncToLatestEventIdTest.cleanUp() issue:
flagInvalidateCache and flagSyncToLatestEventId were incorrectly set
in cleanUp().

Testing:
- Added tests in MetastoreEventsProcessorTest

Change-Id: I8b1a918befd4ee694880fd4e3cc04cb55b64955f
Reviewed-on: http://gerrit.cloudera.org:8080/21087
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-03 07:02:50 +00:00
gaurav1086
96964be7a3 IMPALA-12815: Support timestamp for scan predicates
for external data source tables.

Binary SCAN predicates involving timestamp literals are pushed down
to the remote database. The current logic assumes the ISO 8601 (SQL
standard) format for timestamp literals - 'yyyy-mm-dd hh:mm:ss.ms'.
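
For illustration (table and column names are hypothetical), a predicate
in that format can be pushed down:

  select * from jdbc_tbl where ts_col >= '2024-01-01 10:30:00.000';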

Testing:
- Added custom cluster tests for timestamp predicates with operators:
  '=', '>', '<', '>=', '<=', '!=', 'BETWEEN' for postgres, mysql
  and remote impala.
- Added coverage for timestamps with/without a time portion.
- Added coverage for timestamps with/without milliseconds.
- Added Planner tests to check predicate pushdown for date/timestamp
  literals, date/timestamp functions and CASTs

Change-Id: If6ffe672b4027e2cee094cec4f99b9df9308e441
Reviewed-on: http://gerrit.cloudera.org:8080/21015
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Wenzhe Zhou <wzhou@cloudera.com>
2024-03-02 17:15:43 +00:00
Gabor Kaszab
65094a74f1 IMPALA-12598: Allow multiple equality field id lists for Iceberg tables
This patch adds support for reading Iceberg tables that have
different equality field ID lists associated with different equality
delete files. In practice this is the use case where one equality
delete file deletes by e.g. columnA and columnB while another one
deletes by columnB and columnC.

In order to achieve such functionality, the plan tree creation needed
some adjustments so that it can create separate LEFT ANTI JOIN nodes
for the different equality field ID lists.

Testing:
  - Flink and NiFi were used for creating some test tables with the
    desired equality field IDs. Coverage on these tables is added to
    the test suite.

Change-Id: I3e52d7a5800bf1b479f0c234679be92442d09f79
Reviewed-on: http://gerrit.cloudera.org:8080/20951
Reviewed-by: Gabor Kaszab <gaborkaszab@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-02-29 19:58:22 +00:00
Noemi Pap-Takacs
47db4fd1f5 IMPALA-12412: Support partition evolution in OPTIMIZE statement
The OPTIMIZE statement is used to execute table maintenance tasks
on Iceberg tables, such as:
 1. compacting small files,
 2. merging delete deltas,
 3. rewriting the table according to the latest schema
    and partition spec.

OptimizeStmt used to serve as an alias for INSERT OVERWRITE.
After this change it works as follows: it creates a source statement
that contains all columns of the table. All table content will be
rewritten to new data files. After the executors have finished writing,
the Catalog calls the RewriteFiles Iceberg API to commit the changes.
All previous data and delete files will be excluded from,
and all newly written data files will be added to, the next
snapshot. The old files remain accessible via time travel
to older snapshots of the table.

By default, Impala has as many file writers as query fragment instances
and therefore can write too many files for unpartitioned tables.
For smaller tables this can be limited by setting the
MAX_FS_WRITERS Query Option.
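
For illustration (the table name is hypothetical), limiting the number
of writers while compacting a table looks like:

  set max_fs_writers=8;
  optimize table ice_tbl;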

Authorization: OPTIMIZE TABLE requires ALL privileges.

Limitations:
All limitations about writing Iceberg tables apply.

Testing:
 - E2E tests:
     - schema evolution
     - partition evolution
     - UPDATE/DELETE
     - time travel
     - table history
 - negative tests
 - Ranger tests for authorization
 - FE: Planner tests:
     - sorting order
     - MAX_FS_WRITERS
     - partitioned exchange
 - FE: Parser test

Change-Id: I65a0c8529d274afff38ccd582f1b8a857716b1b5
Reviewed-on: http://gerrit.cloudera.org:8080/20866
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-02-29 18:37:16 +00:00
jasonmfehr
9b8e43e9a1 IMPALA-12426: QueryStateRecord Refactor
The QueryStateRecord struct is used to store important information
about a completed query for the Impala web UI page of recently
completed queries. Since significant portions of this struct hold data
that is also needed in workload management, it has been refactored.

The QueryStateRecord struct was a private child struct under the
ImpalaServer class. It is now a top-level struct within the impala
namespace.

A new struct named QueryStateExpanded has also been created. This
struct contains a shared pointer to a QueryStateRecord so the same
QueryStateRecord instance can be used by both the web UI and workload
management. The QueryStateExpanded struct also contains additional data
that is used exclusively by workload management.

New ctests have been added to exercise the added comparators.

Change-Id: I57d470db6fea71ec12e43f86e3fd62ed6259c83a
Reviewed-on: http://gerrit.cloudera.org:8080/21059
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-02-29 03:15:56 +00:00
Riza Suminto
e4fa1b8c8f IMPALA-12847: Expose computeScanRangeLocations and computeStats
After IMPALA-12631, HdfsScanNode.computeScanRangeLocations() needs to
check whether countStarSlot_ is null in order to schedule only the
footer range. HdfsScanNode.computeScanRangeLocations() is called as
part of the HdfsScanNode.init() call.

An external frontend that has its own count star slot analysis and
initialization will need to recompute scan range assignment and stats
after HdfsScanNode.init(). Therefore, computeScanRangeLocations() and
computeStats() should be made idempotent after init() and exposed to
subclasses.

This patch decouples countStarSlot_ initialization from
computeScanRangeLocations() and raises the access level of
computeScanRangeLocations() from private to protected.

Testing:
- Pass core tests.

Change-Id: Ia621309c67455bb599f71bec9efc1f67fc085022
Reviewed-on: http://gerrit.cloudera.org:8080/21077
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-02-28 12:47:37 +00:00
stiga-huang
5250cc14b6 IMPALA-12827: Fix failures in processing AbortTxnEvent due to aborted write ids being cleaned up
HdfsTable tracks the ValidWriteIdList from HMS. When the table is
reloaded, the ValidWriteIdList is updated to the latest state. An
ABORT_TXN event that is lagging behind could refer to aborted write
ids that have already been cleaned up by the HMS housekeeping thread.
Such write ids can't be found in the cached ValidWriteIdList as open
or aborted write ids. This hits a Precondition check and fails the
event processing.

This patch fixes the check to allow this case. Also adds more logs for
dealing with write ids.

Tests
 - Added a custom-cluster test that starts Hive with the housekeeping
   thread turned on and verified that such ABORT_TXN events are
   processed correctly.

Change-Id: I93b6f684d6e4b94961d804a0c022029249873681
Reviewed-on: http://gerrit.cloudera.org:8080/21071
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-02-27 17:54:42 +00:00
Riza Suminto
0c0a3fff39 IMPALA-12840: Exclude THdfsFileDesc in getJsonCatalogObject
TestReusePartitions::test_reuse_partitions_transactional calls the
/catalog_object path of CatalogD's WebUI and decodes the JSON response.
The "json_string" field from the response text often contains Unicode
control characters that come from serialized binary data from
THdfsFileDesc objects. That causes JSON decoding to fail with an error
like this:

ValueError: Invalid control character at: line 1 column 1850 (char 1849)

This patch attempts to deflake the test by tuning the return value of
getJsonCatalogObject() to exclude THdfsFileDesc, lowering the detail
level from ThriftObjectType.FULL to ThriftObjectType.DESCRIPTOR_ONLY.

test_reuse_partitions_transactional is tweaked a bit to print the
response / JSON object if an assertion fails.

Testing:
- Looped the test a hundred times and it passed every time.

Change-Id: I5f6840bf1267d1d99d321c0a6b4a0cab49543182
Reviewed-on: http://gerrit.cloudera.org:8080/21064
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-02-27 04:42:18 +00:00
wzhou-code
edd1e21493 IMPALA-12793: Create JDBC table without data source
This patch changes syntax of creating JDBC table statement as
  CREATE TABLE [IF NOT EXISTS] [db_name.]table_name
  (col_name data_type
    [constraint_specification]
    [COMMENT 'col_comment']
    [, ...]
  )
  [COMMENT 'table_comment']
  STORED BY JDBC
  TBLPROPERTIES ('key1'='value1', 'key2'='value2', ...)

Both "STORED BY JDBC" and "STORED AS JDBC" are acceptable. A table
property '__IMPALA_DATA_SOURCE_NAME' is added to the JDBC table with
value 'impalajdbcdatasource', which is shown in the output of command
'show create table'.
Following required JDBC parameters must be specified as table
properties: database.type, jdbc.url, jdbc.driver, driver.url, and table.
Otherwise, AnalysisException will be thrown.
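
A minimal sketch of the new syntax (all property values below are
illustrative):

  CREATE TABLE jdbc_tbl (
    id INT,
    name STRING
  )
  STORED BY JDBC
  TBLPROPERTIES (
    'database.type'='POSTGRES',
    'jdbc.url'='jdbc:postgresql://127.0.0.1:5432/testdb',
    'jdbc.driver'='org.postgresql.Driver',
    'driver.url'='hdfs:///user/impala/postgresql-jdbc.jar',
    'table'='remote_tbl'
  );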

Testing:
 - Added frontend unit tests for the new syntax of creating JDBC
   tables.
 - Updated end-to-end unit tests to create JDBC tables without data
   source.
 - Passed core tests

Change-Id: I765aa86b430246786ad85ab6857cefaf4332c920
Reviewed-on: http://gerrit.cloudera.org:8080/21016
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-02-27 02:39:59 +00:00
Riza Suminto
15e471563d IMPALA-11123: Reimplement ORC optimized count star
Commit 7ca20b3c94 reverted the original
optimized count(star) for ORC scans from commit
f932d78ad0 (gerrit review
http://gerrit.cloudera.org:8080/18327). The revert was necessary
because the unification of the count star and zero slot functions
into HdfsColumnarScanner caused a significant regression for
non-optimized count star queries in parquet format (over 15% slower
MaterializeTupleTime).

This patch reimplements optimized count(star) for the ORC scan code
path while minimizing the code changes needed for the parquet scan
code path. After this patch, the ORC and parquet code paths will have
only the following new things in common:
- THdfsScanNode.count_star_slot_offset renamed to
  THdfsScanNode.star_slot_offset
- HdfsScanner::IssueFooterRanges will only issue footer ranges if
  IsZeroSlotTableScan() or optimize_count_star() is true (made possible
  for parquet by IMPALA-12631).

The structure of HdfsParquetScanner::GetNextInternal() remains
unchanged. Its zero slot scan code path is still served through the
num_rows metadata from the parquet footer, while the optimized count
star code path still loops over the row group metadata (also from the
parquet footer).
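
The optimized path applies to plain count star scans, e.g. (a sketch;
the table name is hypothetical):

  SELECT count(*) FROM tpcds_orc.store_sales;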

The following table shows single-node benchmark result of 3 count query
variant on TPC-DS scale 10, both in ORC and parquet format, looped 9
times.

+-----------+---------------------------+---------+--------+-------------+------------+
| Workload  | Query                     | Format  | Avg(s) | Base Avg(s) | Delta(Avg) |
+-----------+---------------------------+---------+--------+-------------+------------+
| TPCDS(10) | TPCDS-Q_COUNT_UNOPTIMIZED | orc     | 0.30   | 0.28        |   +6.50%   |
| TPCDS(10) | TPCDS-Q_COUNT_OPTIMIZED   | parquet | 0.14   | 0.14        |   +1.56%   |
| TPCDS(10) | TPCDS-Q_COUNT_ZERO_SLOT   | parquet | 0.27   | 0.27        |   +1.42%   |
| TPCDS(10) | TPCDS-Q_COUNT_ZERO_SLOT   | orc     | 0.28   | 0.29        |   -3.03%   |
| TPCDS(10) | TPCDS-Q_COUNT_UNOPTIMIZED | parquet | 0.21   | 0.22        |   -4.45%   |
| TPCDS(10) | TPCDS-Q_COUNT_OPTIMIZED   | orc     | 0.14   | 0.21        |  -35.92%   |
+-----------+---------------------------+---------+--------+-------------+------------+

Testing:
- Restore PlannerTest.testOrcStatsAgg
- Restore TestAggregationQueriesRunOnce and
  TestAggregationQueriesRunOnce::test_orc_count_star_optimization
- Exercise count(star) in TestOrc::test_misaligned_orc_stripes
- Pass core tests

Change-Id: I5971c8f278e1dee44e2a8dd4d2f043d22ebf5d17
Reviewed-on: http://gerrit.cloudera.org:8080/19927
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-02-23 20:43:22 +00:00
Csaba Ringhofer
2f14fd29c0 IMPALA-12433: Share buffers among channels in KrpcDataStreamSender
Before this patch each KrpcDataStreamSender::Channel had two
OutboundRowBatches, each with its own serialization and compression
buffers.

This patch switches to a single buffer per channel. This is enough to
store the in-flight data in KRPC; the other buffers are only used
during serialization and compression, which happens for only one
channel at a time, so they can be shared among channels.

Memory estimates in the planner are not changed because the existing
calculation has several issues (see IMPALA-12594).

Change-Id: I64854a350a9dae8bf3af11c871882ea4750e60b3
Reviewed-on: http://gerrit.cloudera.org:8080/20719
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Kurt Deschler <kdeschle@cloudera.com>
Reviewed-by: Zihao Ye <eyizoha@163.com>
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
2024-02-21 10:31:51 +00:00
Zoltan Borok-Nagy
87212d791b IMPALA-12811: Exception during re-analyze can be lost
When there is an AnalysisException during re-analysis, we try to print
the re-written and original statements by invoking toSql() on them.
But toSql() fails because the analysis of the statement was incomplete,
so it throws another exception (typically an IllegalStateException
without any relevant information about the original issue).

This patch puts the original LOG.error() in a try-catch, then wraps the
original AnalysisException into a new exception that just mentions that
the error occurred after query rewrite.

Testing:
 * added a column masking test with an invalid masking function

Change-Id: Ie6e36b08703c07a2a8d68a4ec0e8ddd65ba03199
Reviewed-on: http://gerrit.cloudera.org:8080/21037
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-02-20 22:44:03 +00:00