Commit Graph

1542 Commits

stiga-huang
374783c55e IMPALA-10898: Add runtime IN-list filters for ORC tables
ORC files have optional bloom filter indexes for each column. Since
ORC-1.7.0, the C++ reader supports pushing down predicates to skip
unrelated RowGroups. The pushed-down predicates will be evaluated on
file indexes (i.e. statistics and bloom filter indexes). Note that only
EQUALS and IN-list predicates can leverage bloom filter indexes.

Currently Impala has two kinds of runtime filters: bloom filter and
min-max filter. Unfortunately they can't be converted into EQUALS or
IN-list predicates. So they can't leverage the file level bloom filter
indexes.

This patch adds runtime IN-list filters for this purpose. Currently they
are generated for the build side of a broadcast join. They will only be
applied on ORC tables and be pushed down to the ORC reader (i.e. the
ORC lib). To avoid exploding the IN-list, if the number of distinct
values on the build side exceeds a threshold (default 1024), we set the
filter to ALWAYS_TRUE and clear its entries. The threshold can be
configured by a
new query option, RUNTIME_IN_LIST_FILTER_ENTRY_LIMIT.

Evaluating runtime IN-list filters is much slower than evaluating
runtime bloom filters due to the current simple implementation (i.e.
std::unordered_set) and the lack of codegen. So we disable it at row
level.
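
A minimal sketch of that filter shape, assuming a hypothetical
InListFilter class (the names here do not match Impala's actual backend
code): an unordered set with an entry limit that degrades to
ALWAYS_TRUE.

#include <cstddef>
#include <cstdint>
#include <unordered_set>

class InListFilter {
 public:
  explicit InListFilter(size_t entry_limit) : entry_limit_(entry_limit) {}

  void Insert(int64_t v) {
    if (always_true_) return;
    values_.insert(v);
    if (values_.size() > entry_limit_) {
      always_true_ = true;  // Too many distinct values: accept everything.
      values_.clear();      // Clear the entries to free memory.
    }
  }

  bool Find(int64_t v) const {
    return always_true_ || values_.count(v) > 0;
  }

 private:
  size_t entry_limit_;
  bool always_true_ = false;
  std::unordered_set<int64_t> values_;
};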

For visibility, this patch adds two counters in the HdfsScanNode:
 - NumPushedDownPredicates
 - NumPushedDownRuntimeFilters
They reflect the predicates and runtime filters that are pushed down to
the ORC reader.

Currently, runtime IN-list filters are disabled by default. This patch
extends the query option, ENABLED_RUNTIME_FILTER_TYPES, to support a
comma-separated list of filter types. It defaults to "BLOOM,MIN_MAX".
Add "IN_LIST" in it to enable runtime IN-list filters.

Ran perf tests on a 3-instance cluster on my desktop using TPC-DS with
scale factor 20. It shows significant improvements in some queries:

+-----------+-------------+--------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+--------+
| Workload  | Query       | File Format        | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+-----------+-------------+--------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+--------+
| TPCDS(20) | TPCDS-Q67A  | orc / snap / block | 35.07  | 44.01       | I -20.32%  |   0.38%    |   1.38%        | 10    | I -25.69%      | -3.58   | -45.33 |
| TPCDS(20) | TPCDS-Q37   | orc / snap / block | 1.08   | 1.45        | I -25.23%  |   7.14%    |   3.09%        | 10    | I -34.09%      | -3.58   | -12.94 |
| TPCDS(20) | TPCDS-Q70A  | orc / snap / block | 6.30   | 8.60        | I -26.81%  |   5.24%    |   4.21%        | 10    | I -36.67%      | -3.58   | -14.88 |
| TPCDS(20) | TPCDS-Q16   | orc / snap / block | 1.33   | 1.85        | I -28.28%  |   4.98%    |   5.92%        | 10    | I -39.38%      | -3.58   | -12.93 |
| TPCDS(20) | TPCDS-Q18A  | orc / snap / block | 5.70   | 8.06        | I -29.25%  |   3.00%    |   4.12%        | 10    | I -40.30%      | -3.58   | -19.95 |
| TPCDS(20) | TPCDS-Q22A  | orc / snap / block | 2.01   | 2.97        | I -32.21%  |   6.12%    |   5.94%        | 10    | I -47.68%      | -3.58   | -14.05 |
| TPCDS(20) | TPCDS-Q77A  | orc / snap / block | 8.49   | 12.44       | I -31.75%  |   6.44%    |   3.96%        | 10    | I -49.71%      | -3.58   | -16.97 |
| TPCDS(20) | TPCDS-Q75   | orc / snap / block | 7.76   | 12.27       | I -36.76%  |   5.01%    |   3.87%        | 10    | I -59.56%      | -3.58   | -23.26 |
| TPCDS(20) | TPCDS-Q21   | orc / snap / block | 0.71   | 1.27        | I -44.26%  |   4.56%    |   4.24%        | 10    | I -77.31%      | -3.58   | -28.31 |
| TPCDS(20) | TPCDS-Q80A  | orc / snap / block | 9.24   | 20.42       | I -54.77%  |   4.03%    |   3.82%        | 10    | I -123.12%     | -3.58   | -40.90 |
| TPCDS(20) | TPCDS-Q39-1 | orc / snap / block | 1.07   | 2.26        | I -52.74%  | * 23.83% * |   2.60%        | 10    | I -149.68%     | -3.58   | -14.43 |
| TPCDS(20) | TPCDS-Q39-2 | orc / snap / block | 1.00   | 2.33        | I -56.95%  | * 19.53% * |   2.07%        | 10    | I -151.89%     | -3.58   | -20.81 |
+-----------+-------------+--------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+--------+
"Base Avg" is the avg of the original time. "Avg" is the current time.

However, we also see some regressions due to the suboptimal
implementation. The follow-up JIRAs will focus on improvements:
 - IMPALA-11140: Codegen InListFilter::Insert() and InListFilter::Find()
 - IMPALA-11141: Use exact data types in IN-list filters instead of
   casting data to a set of int64_t or a set of string.
 - IMPALA-11142: Consider IN-list filters in partitioned joins.

Tests:
 - Test IN-list filter on string, date and all integer types
 - Test IN-list filter with NULL
 - Test IN-list filter on complex exprs targets

Change-Id: I25080628233799aa0b6be18d5a832f1385414501
Reviewed-on: http://gerrit.cloudera.org:8080/18141
Reviewed-by: Qifan Chen <qchen@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-03-03 00:21:06 +00:00
Riza Suminto
873fe2e524 IMPALA-11135: Deflake LEFT ANTI JOIN test case in test_spilling.py
TestSpillingDebugActionDimensions.test_spilling has been flaky because a
test case from IMPALA-9725 sometimes does not spill its hash join
partition. This patch lowers the buffer_pool_limit of this test from
110MB to 105MB, just slightly above its Max Per-Host Resource
Reservation (104.61MB), to ensure consistent spilling behavior.

Testing:
After lowering the buffer pool limit, I looped the test 1000 times, and
all runs spill consistently in fragment "HASH_JOIN_NODE (id=14)".
To be specific, these are the numbers of SpilledPartitions of the first
instance (ending with "000d") of the "Hash Join Builder (join_node_id=14)"
fragment across 1000 query runs:

+--------------------+----------+
| #SpilledPartitions | #Queries |
+--------------------+----------+
|                  2 |       30 |
|                  3 |       96 |
|                  4 |      674 |
|                  5 |       52 |
|                  6 |      146 |
|                  7 |        2 |
+--------------------+----------+

Change-Id: Idad9fc6ec6a0ba7fc70e0701e567da7165e40e83
Reviewed-on: http://gerrit.cloudera.org:8080/18261
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-24 07:06:50 +00:00
stiga-huang
331ff4647d IMPALA-11137: Enable proleptic Gregorian Calendar for Hive
Since HIVE-22589, Hive still uses the Julian calendar for writing dates
before 1582-10-15, whereas Impala uses the proleptic Gregorian calendar.
This affects the results Impala gets when querying tables written by
Hive. Currently, the Avro and ORC formats of date_tbl are affected by
this issue.

This patch enables proleptic Gregorian Calendar for Hive by default.
It also reverts the two commits of IMPALA-9555, which modified the
tests to accommodate the inconsistent results.

Tests:
 - Ran CORE tests

Change-Id: I6be9c9720dd352d6821cdaa6c64d35ba20473bc0
Reviewed-on: http://gerrit.cloudera.org:8080/18262
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-23 20:09:10 +00:00
Attila Jeges
d3da875684 IMPALA-9498: Allow returning arrays in select list
Until now ARRAYs had to be unnested in queries. This patch adds
support for returning ARRAYs as STRINGs (JSON arrays) in the select list,
for example:
select id, int_array from functional_parquet.complextypestbl where id = 1;
returns: 1, [1,2,3]
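
A rough sketch of the formatting step, assuming a hypothetical
ArrayToJson helper (the real conversion handles all types, nesting and
NULLs):

#include <string>
#include <vector>

// Format an INT array as a JSON array string, e.g. {1,2,3} -> "[1,2,3]".
std::string ArrayToJson(const std::vector<int>& items) {
  std::string out = "[";
  for (size_t i = 0; i < items.size(); ++i) {
    if (i > 0) out += ",";
    out += std::to_string(items[i]);
  }
  return out + "]";
}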

Returning ARRAYs from inline or HMS views is also supported -
these arrays can be used both in the select list and as relative
table references. Using them as a non-relative table reference is
not supported (IMPALA-11052).

Though STRUCTs are already supported, ARRAYs and STRUCTs nested in
each other are not supported yet.

Things intentionally postponed for later commits:
- Add MAP support too - this shouldn't be too tricky after
  ARRAY support, but I don't want to make this patch even more
  complex.
- Unify HS2 / Beeswax logic with the way STRUCTs are handled.
  This could be done in a "final" logic that can handle
  STRUCTS/ARRAYS nested in each other.
- Implement "deep copy" and "deep serialize" for ARRAYs in BE.
  This would enable all operators, e.g. ORDER BY and UNION.

Testing:
- FE tests were added for analysis and authorization
- EE tests were added
- core tests were run

Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Reviewed-on: http://gerrit.cloudera.org:8080/17811
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-17 18:51:06 +00:00
skyyws
345ba685e7 IMPALA-11049: Added expr analyzed check in 'SimplifyCastExprRule.java'
We added a new expr rewrite rule 'SimplifyCastExprRule.java' in
IMPALA-10836. If an expr is not analyzed, this rewrite rule would throw
an 'AnalysisException'; this is due to 'orderByElements_' not being
analyzed. We tried to substitute order by elements when creating
'SortInfo', but that caused some other problems. So we only add an
expr-analyzed check in this rule to solve this problem. When adding other
expr rewrite rules in the future, we should also add this check.

Testing:
- Added test cases in 'explain-level3.test'

Change-Id: I2780e04a6d5a32e224cd0470cf6f166a832363ec
Reviewed-on: http://gerrit.cloudera.org:8080/18099
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-16 13:28:27 +00:00
stiga-huang
35375b3287 IMPALA-2019(part-4): Add UTF-8 support for case conversion functions
There are 3 builtin case conversion string functions: upper(), lower(),
and initcap(). Previously they only converted English alphabetic
characters. This patch adds support to deal with Unicode characters.

There are many corner cases in case conversion depending on the locale
and context. E.g.
1) Case conversion is locale-sensitive.
Turkish has 4 letter "I"s. English has only two, a lowercase dotted i
and an uppercase dotless I. Turkish has lowercase and uppercase forms of
both dotted and dotless I. So simply converting "i" to "I" for upper
case is wrong in Turkish:
    +-------+--------+---------+
    |       | Dotted | Dotless |
    +-------+--------+---------+
    | Upper | İ      | I       |
    +-------+--------+---------+
    | Lower | i      | ı       |
    +-------+--------+---------+

2) Case conversion may change a string's length.
The German word "grüßen" should be converted to "GRÜSSEN" in upper case:
the letter "ß" should be converted to "SS".

3) Case conversion is context-sensitive.
The Greek word "ὈΔΥΣΣΕΎΣ" should be converted to "ὀδυσσεύς", where the
Greek letter "Σ" is converted to "σ" or to "ς", depending on its
position in the word.

The above cases will be the focus of follow-up JIRAs. This patch adds the
initial implementation of UTF-8 aware case conversion functions.

--------
Implementation:
In the UTF-8 mode of these functions (turned on by SET UTF8_MODE=true),
the bytes in strings are converted to wide characters using
std::mbrtowc(). Each wide character (wchar_t) is then converted using
std::towupper or std::towlower correspondingly. We then convert them
back to multibyte sequences using std::wcrtomb().
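
A simplified sketch of this conversion scheme (error handling reduced;
Upper is a hypothetical stand-in for the builtin's implementation). It
assumes a UTF-8 aware locale has been set, as discussed below.

#include <climits>
#include <clocale>
#include <cwchar>
#include <cwctype>
#include <string>

std::string Upper(const std::string& s) {
  std::setlocale(LC_ALL, "C.UTF-8");  // Needs a UTF-8 aware locale.
  std::mbstate_t in{};
  std::mbstate_t out{};
  std::string result;
  const char* p = s.data();
  size_t left = s.size();
  while (left > 0) {
    wchar_t wc;
    size_t len = std::mbrtowc(&wc, p, left, &in);  // Bytes -> wchar_t.
    if (len == 0 || len == (size_t)-1 || len == (size_t)-2) break;
    char buf[MB_LEN_MAX];
    size_t n = std::wcrtomb(buf, (wchar_t)std::towupper(wc), &out);
    if (n != (size_t)-1) result.append(buf, n);  // Back to multibyte.
    p += len;
    left -= len;
  }
  return result;
}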

Note that these builtins are locale-aware. If impalad is launched
without a UTF-8 aware locale, e.g. LC_ALL="C", these builtins can't
recognize non-ASCII characters and will return unexpected results.
Thus we modify our docker images to set LC_ALL="C.UTF-8" instead of "C".
This patch also logs the current locale when launching Impala daemons
for better debugging. We will support customized locales in IMPALA-11080.

Test:
 - Add BE unit tests and e2e tests.

Change-Id: I443e89d46f4638ce85664b021666bc4f03ee8abd
Reviewed-on: http://gerrit.cloudera.org:8080/17785
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-15 18:40:59 +00:00
xqhe
c36ff0cdd7 IMPALA-10982: fix unable to explain the set operation statement
For SetOperationStmt we will replace the query statement with the
rewritten version, but we haven't set the explain flag if the
original is an explain statement.

Tests:
  -- Using impala-shell to test the explain statement of set operation.
  -- Add new test case in the explain_level tests

Change-Id: I19264dfa794ffd5ed7355acfef0ac35f17c809d3
Reviewed-on: http://gerrit.cloudera.org:8080/18179
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-12 06:09:59 +00:00
pranav.lodha
bde995483a IMPALA-955: BYTES built-in function
The Bytes function returns the number of bytes contained
in the specified byte string. There are changes in
4 files. A few test cases are also added in
be/src/exprs/expr-test.cc and an end-to-end test in
testdata/workloads/functional-query/queries/QueryTest/exprs.test.

Change-Id: I0bd06c3d6dba354d71f63c649eaa8f9f74d266ee
Reviewed-on: http://gerrit.cloudera.org:8080/18210
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-11 07:01:58 +00:00
Csaba Ringhofer
97dda2b27d IMPALA-6636: Use async IO in ORC scanner
This patch implements async IO in the ORC scanner. For each ORC stripe,
we begin by iterating the column streams. If a column stream is
eligible for async IO, we create a ColumnRange, register a
ScannerContext::Stream for that ORC stream, and start the stream. We
modify HdfsOrcScanner::ScanRangeInputStream::read to check whether there
is a matching ColumnRange for the given offset and length. If so, the
read continues through HdfsOrcScanner::ColumnRange::read.

We leverage the existing async IO methods from the HdfsParquetScanner
class for initial memory allocations. We moved related methods such as
DivideReservationBetweenColumns and ComputeIdealReservation up to the
HdfsColumnarScanner class.

Planner calculates the memory reservation differently between async
Parquet and async ORC. In async Parquet, the planner calculates the
column memory reservation and relies on the backend to divide them as
needed. In async ORC, the planner needs to split the column's memory
reservation based on the estimated number of streams for that column
type. For example, a string column with a 4MB memory estimate will need
to split that estimate into four 1MB pieces because it might use
dictionary encoding with four streams (PRESENT, DATA, DICTIONARY_DATA,
and LENGTH). This splitting is required because each async IO stream
needs to start with an 8KB (min_buffer_size) initial memory reservation.
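
The split itself is simple arithmetic; a sketch with a hypothetical
helper (the name is illustrative, not Impala's planner code):

#include <algorithm>
#include <cstdint>

// Divide a column's memory reservation across its estimated ORC
// streams, keeping at least the 8KB minimum buffer per stream. E.g. a
// 4MB string-column estimate with 4 streams (PRESENT, DATA,
// DICTIONARY_DATA, LENGTH) yields 1MB per stream.
int64_t PerStreamReservation(int64_t column_reservation, int num_streams) {
  const int64_t kMinBufferSize = 8 * 1024;
  return std::max(kMinBufferSize, column_reservation / num_streams);
}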

To show the improvement from ORC async IO, we contrast the total time
and geomean (in milliseconds) to run the full TPC-DS 10 TB workload on
19 executors, with varying ORC_ASYNC_READ and DISABLE_DATA_CACHE
options as follows:

+----------------------+------------------+------------------+
| Total time           | ORC_ASYNC_READ=0 | ORC_ASYNC_READ=1 |
+----------------------+------------------+------------------+
| DISABLE_DATA_CACHE=0 |          3511075 |          3484736 |
| DISABLE_DATA_CACHE=1 |          5243337 |          4370095 |
+----------------------+------------------+------------------+

+----------------------+------------------+------------------+
| Geomean              | ORC_ASYNC_READ=0 | ORC_ASYNC_READ=1 |
+----------------------+------------------+------------------+
| DISABLE_DATA_CACHE=0 |      12786.58042 |      12454.80365 |
| DISABLE_DATA_CACHE=1 |      23081.10888 |      16692.31512 |
+----------------------+------------------+------------------+

Testing:
- Pass core tests.
- Pass core e2e tests with ORC_ASYNC_READ=1.

Change-Id: I348ad9e55f0cae7dff0d74d941b026dcbf5e4074
Reviewed-on: http://gerrit.cloudera.org:8080/15370
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-09 21:20:23 +00:00
Qifan Chen
63bd6a5aec IMPALA-11047 Preconditions.checkNotNull(statsTuple_) fail in HdfsScanNode.java if PARQUET_READ_STATISTICS=0
This patch addresses the check not-null failure in FE by checking
the query option PARQUET_READ_STATISTICS in the following situations:

  1) When determining whether to apply the min/max overlap predicate;
  2) When modifying query option minmax_filter_threshold and
     minmax_filtering_level for applying min/max filters to sort or
     partition columns.

When PARQUET_READ_STATISTICS is true or 1, either operation will proceed.

Testing:
  1. Add a new test in TestOverlapMinMaxFilters;
  2. Ran the core test successfully.

Change-Id: I52203e73502a35a275decb602b063982b9cad26e
Reviewed-on: http://gerrit.cloudera.org:8080/18071
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-04 21:23:55 +00:00
Zoltan Borok-Nagy
a0e6b6b618 IMPALA-11105: Impala crashes in PhjBuilder::Close() when Prepare() fails
In PhjBuilder::Close() we invoke
'ht_ctx_->StatsCountersAdd(ht_stats_profile_.get())' when 'ht_ctx_' is
not null. But in Prepare() we create 'ht_ctx_' first, then after a
couple of operations which might fail we create 'ht_stats_profile_'.
This means if an operation fails in Prepare(), between the creation of
'ht_ctx_' and 'ht_stats_profile_', then later we'll get a SEGFAULT in
Close().

This patch restructures the code in PhjBuilder::Prepare(), so at first
it creates the counters and profile, then it creates 'ht_ctx_',
similarly to what we do in grouping-aggregator.cc. It also modifies
HashTableCtx::StatsCountersAdd(), so in release mode it is a no-op
if 'profile' is null.
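
The shape of the defensive part of the fix, sketched with hypothetical
simplified types (Impala's real classes carry much more state):

struct RuntimeProfile {
  // Counters live here in the real code.
};

struct HashTableCtx {
  void StatsCountersAdd(RuntimeProfile* profile) {
    if (profile == nullptr) return;  // No-op instead of a SEGFAULT when
                                     // Prepare() failed early.
    // ... add the hash table stats to the profile ...
  }
};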

Testing:
 * added a debug action that fails PhjBuilder::Prepare() after the
   creation of 'ht_ctx_'

Change-Id: Id41b0c45d9693cb3433e02737048cb9f50ba59c1
Reviewed-on: http://gerrit.cloudera.org:8080/18195
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-04 17:55:47 +00:00
Steve Carlin
4734a681f3 IMPALA-10997: Refactor Java Hive UDF code.
In its current form, Impala supports Java UDFs that are derived from
the UDF.class.

The UDF.class is legacy code; Hive now supports implementations based
on the GenericUDF.class.

This rewrite will allow for easier extension to support GenericUDFs.

Among added classes:

UdfExecutor: The entry point class which is directly accessed by the
backend. This is a wrapper class to the UDF class that will handle
the evaluation of rows.

HiveUdfExecutor: Abstract base class that contains code that is common
to the legacy UDF.class and the GenericUDF.class when it is eventually
created.

HiveUdfExecutorLegacy: Implementation of the code that is UDF.class
specific.

HiveUdfLoader: Class responsible for using reflection to instantiate
the UDF class

HiveJavaFunction: Interface for retrieving objects pertaining to the
UDF function class.

HiveLegacyJavaFunction: Class representing the metadata for the legacy
UDF class.

Also added some functionality which captures the error when a user
attempts to create a function and the function doesn't exist. The
unit test checking this uses the UDFRound function, which no longer
exists in hive-exec.jar, so it is now in a load-java-udfs-fail.test
test file.

Change-Id: Idc9572e15fbed1876412159b99dddd3fb4d37174
Reviewed-on: http://gerrit.cloudera.org:8080/18020
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Csaba Ringhofer <csringhofer@cloudera.com>
2022-02-03 09:13:21 +00:00
Gergely Fürnstáhl
bb4903aeb0 IMPALA-10748: Remove enable_orc_scanner flag
Impala has supported reading ORC files by default for quite some time.

Removed the enable_orc_scanner flag and related code and tests;
disabling ORC support is no longer possible.
Removed notes on how to disable ORC support from the docs.

Change-Id: I7ff640afb98cbe3aa46bf03f9bff782574c998a5
Reviewed-on: http://gerrit.cloudera.org:8080/18188
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-03 03:13:41 +00:00
Steve Carlin
504e0d0012 IMPALA-11056: Create option to fail query on Java UDF exceptions
This commit creates a new query option,
"abort_java_udf_on_exception". The current and default behavior
is that when the Java UDF throws an exception, a warning is logged
and the function returns NULL. If the query option is set to
true, the query will fail.

Change-Id: Ifece20cf16a6575f1c498238f754440e870e2ce9
Reviewed-on: http://gerrit.cloudera.org:8080/18080
Reviewed-by: Kurt Deschler <kdeschle@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
2022-01-27 23:06:36 +00:00
Tamas Mate
12118664d8 IMPALA-10910, IMPALA-5509: Runtime filter: dictionary filter support
This commit is based on Csaba Ringhofer's earlier work on IMPALA-5509.

If a runtime filter uses only a single column, then it can be used to
filter Parquet dictionaries and if all dictionary values are filtered
out, the whole row group can be skipped. This is especially useful for
Iceberg tables, as the partition column is in the data file, therefore
this can help eliminate unnecessary reads.

The chance of false positives grows exponentially with the size of the
dictionary, so this optimisation is only useful for small dictionaries.
A new query option has been added to limit the runtime filter evaluation
to smaller dictionaries; the default value has been set to 1024, and
the new option is 'PARQUET_DICTIONARY_RUNTIME_FILTER_ENTRY_LIMIT'.
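A sketch of the elimination check described above; this is a
hypothetical helper, with the runtime filter reduced to a set of
surviving values:

#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// A row group can be skipped when every one of its (small) dictionary
// values is rejected by the runtime filter.
bool CanSkipRowGroupByDictionary(const std::vector<int64_t>& dictionary,
                                 const std::unordered_set<int64_t>& filter,
                                 size_t entry_limit = 1024) {
  if (dictionary.size() > entry_limit) return false;  // Don't evaluate.
  for (int64_t v : dictionary) {
    if (filter.count(v) > 0) return false;  // Some value may survive.
  }
  return true;  // All dictionary values filtered out: skip the group.
}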

Testing:
 - Added e2e test that creates an Iceberg/Parquet table and queries it
 - Ran single node perf test with TPC-H scale 10 on Parquet, there
   were no regressions

Change-Id: Ida0ada8799774be34312eaa4be47336149f637c7
Reviewed-on: http://gerrit.cloudera.org:8080/18017
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-01-23 20:55:00 +00:00
Zoltan Borok-Nagy
3f51a6a761 IMPALA-11051: Add support for 'void' Iceberg partition transform
Iceberg recently added a new partition transform called 'void':
https://iceberg.apache.org/#spec/#partition-transforms

This patch adds support for this transform.

When the user wants to drop a column from the partition spec,
the VOID transform should be used instead of just omitting
the column. Simply omitting the column might cause problems when
the metadata table is being queried (currently only supported
by other engines).

Testing
 * added SHOW CREATE TABLE test
 * added e2e test

Change-Id: Icbe11d56cdeb82aaadedfdb3ad61dd7cc4c2f4d0
Reviewed-on: http://gerrit.cloudera.org:8080/18102
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-01-17 15:42:26 +00:00
Riza Suminto
577fc2ee21 IMPALA-11072: Deflake TestSpillingDebugActionDimensions.test_spilling
The first test case in TestSpillingDebugActionDimensions.test_spilling
has been flaky for not spilling any partitions in its hash join node.
This patch fixes the flakiness by reducing the buffer_pool_limit from
215 MB to 110 MB, which is around double the query's Per Host Min
Memory Reservation.

Testing:
- Manually run the first test case of
  TestSpillingDebugActionDimensions.test_spilling. Verify that both of
  the hash joins are spilling and the test passes.

Change-Id: Ie8802505e0dcae1be5e855107436805bd10e0077
Reviewed-on: http://gerrit.cloudera.org:8080/18138
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-01-13 18:45:08 +00:00
Aman Sinha
6894085e4e IMPALA-11030: Fix incorrect creation of common partition exprs
When there are 2 or more analytic functions in an inline view
and at least one of them does not have a partition-by expr,
we were previously still populating the commonPartitionExprs
list in AnalyticInfo. This common partition expr was then
used during the auxiliary predicate creation when the outer
query has a predicate on a partition-by column. This leads to
wrong results because the auxiliary predicate is pushed down
to the table scan. While pushing down predicate on a
partitioning column is okay if all the analytic functions
contain that partitioning column, it is not correct to do
this when at least one analytic function does not have that
partitioning column.

This patch fixes the wrong result by ensuring that the
AnalyticInfo's commonPartitionExprs is empty if at least
one analytic function does not have partitioning exprs.
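
The fixed rule can be sketched as a set intersection that becomes empty
as soon as one function has no partition-by exprs (a hypothetical
representation, with exprs reduced to strings):

#include <set>
#include <string>
#include <vector>

std::set<std::string> CommonPartitionExprs(
    const std::vector<std::set<std::string>>& per_function_exprs) {
  std::set<std::string> common;
  bool first = true;
  for (const auto& exprs : per_function_exprs) {
    // The fix: one analytic function without partition-by exprs means
    // there can be no common partition exprs at all.
    if (exprs.empty()) return {};
    if (first) {
      common = exprs;
      first = false;
      continue;
    }
    std::set<std::string> kept;
    for (const auto& e : common) {
      if (exprs.count(e) > 0) kept.insert(e);
    }
    common = std::move(kept);
  }
  return common;
}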

Testing:
 - Added new planner test and e2e test for row_num
   analytic function

Change-Id: Iebb51f691e8e5459ffbaf5a49907140f2de212cc
Reviewed-on: http://gerrit.cloudera.org:8080/18072
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Aman Sinha <amsinha@cloudera.com>
2022-01-09 15:39:35 +00:00
Fang-Yu Rao
351e037472 IMPALA-10934 (Part 2): Enable table definition over a single file
This patch adds an end-to-end test to validate and characterize HMS'
behavior with respect to external table creation after HIVE-25569 via
which a user is allowed to create an external table associated with a
single file.

Change-Id: Ia4f57f07a9f543c660b102ebf307a6cf590a6784
Reviewed-on: http://gerrit.cloudera.org:8080/18033
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
2022-01-05 03:32:11 +00:00
wzhou-code
1ed48a542b IMPALA-11005 (part 3): Replace random number generator with mt19937_64
Previous patch upgraded boost library. This patch changes 64-bit random
number generator from ranlux64_3 to mt19937_64 since mt19937_64 has
better performance according to boost benchmark at https://www.boost.org
/doc/libs/1_74_0/doc/html/boost_random/performance.html.
Also fixs an unit-test which is affected by the change of random number
generator.
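
For reference, a minimal example of the replacement generator; shown
here with the standard library's std::mt19937_64, while the patch
itself uses the boost variant:

#include <cstdint>
#include <cstdio>
#include <random>

int main() {
  std::mt19937_64 gen(42);  // 64-bit Mersenne Twister, fixed seed.
  std::uniform_int_distribution<std::uint64_t> dist;
  std::printf("%llu\n", (unsigned long long)dist(gen));
  return 0;
}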

Testing:
 - Passed exhaustive tests.

Change-Id: Iade226fc17442f4d7b9b14e4a9e80a30a3856226
Reviewed-on: http://gerrit.cloudera.org:8080/18022
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-12-16 11:49:15 +00:00
Abhishek Rawat
763acffb74 IMPALA-6590: Disable expr rewrites and codegen for VALUES() statements
Expression rewrites for VALUES() could result in performance regressions
since there is virtually no benefit to a rewrite if the expression will
only ever be evaluated once. The overhead of rewrites in some cases
could be huge, especially if there are several constant expressions.
The regression also seems to increase non-linearly as the number of
columns increases. Similarly, there is no value in doing codegen for
such const expressions.

The rewriteExprs() method of the ValuesStmt class was overridden with
an empty function body. As a result, rewrites for VALUES() are a no-op.

Codegen was disabled for const expressions within a UNION node, if
the UNION node is not within a subplan. This applies to all UNION nodes
with const expressions (and not just limited to UNION nodes associated
with a VALUES clause).

The decision for whether or not to enable codegen for const expressions
in a UNION is made in the planner when a UnionNode is initialized. A new
member 'is_codegen_disabled' was added to the thrift struct TExprNode
for communicating this decision to the backend. The optimizer should
take the decisions it can, so it seemed like the right place to
disable/enable codegen. The infrastructure is generic and could be
extended in the future to selectively disable codegen for any given
expression, if needed.

Testing:
- Added a new e2e test case in tests/query_test/test_codegen.py, which
  tests the different scenarios involving UNION with const expressions.
- Passed exhaustive unit-tests.
- Ran manual tests to validate that the non-linear regression in the
  VALUES clause with an increasing number of columns is no longer seen.
  Results below.

for i in 256 512 1024 2048 4096 8192 16384 32768;
do (echo 'VALUES ('; for x in $(seq $i);
do echo  "cast($x as string),"; done;
echo "NULL); profile;") |
time impala-shell.sh -f /dev/stdin |& grep Analysis; done

Base:
       - Analysis finished: 20.137ms (19.215ms)
       - Analysis finished: 46.275ms (44.597ms)
       - Analysis finished: 119.642ms (116.663ms)
       - Analysis finished: 361.195ms (355.856ms)
       - Analysis finished: 1s277ms (1s266ms)
       - Analysis finished: 5s664ms (5s640ms)
       - Analysis finished: 29s689ms (29s646ms)
       - Analysis finished: 2m (2m)

Test:
       - Analysis finished: 1.868ms (986.520us)
       - Analysis finished: 3.195ms (1.856ms)
       - Analysis finished: 7.332ms (3.484ms)
       - Analysis finished: 13.896ms (8.071ms)
       - Analysis finished: 31.015ms (18.963ms)
       - Analysis finished: 60.157ms (38.125ms)
       - Analysis finished: 113.694ms (67.642ms)
       - Analysis finished: 253.044ms (163.180ms)

Change-Id: I229d67b821968321abd8f97f7c89cf2617000d8d
Reviewed-on: http://gerrit.cloudera.org:8080/13645
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-12-16 05:51:27 +00:00
skyyws
ad29ce70b3 IMPALA-11040: Remove unnecessary reset() method in class 'UnionStmt'
When a query contains multiple nested union stmts (more than twenty or
thirty) and needs re-analysis, e.g. for expr rewrites, the query
executes slowly or even fails due to the 'reset' method called in the
classes 'UnionStmt' and 'SetOperationStmt'.
'SetOperationStmt' was added in IMPALA-9943 and IMPALA-4974. Multiple
nested union stmts cause the number of 'reset' calls to grow
exponentially, since 'operands_' is reset in both classes' reset()
methods, which handle their children recursively. Too many nested
union stmts cause deep nesting.
The content of UnionStmt.reset() is exactly the same as that of
SetOperationStmt.reset(), so this patch removes the method from
'UnionStmt'. After this, the original query executes quickly.
An example is already added in the file 'union.test'; without this
patch, the example query would execute slowly, or even fail.

Testing:
- Added new test case in 'union.test'

Change-Id: I408a396d40d9622f2ae6c459f49cbfcc19affd14
Reviewed-on: http://gerrit.cloudera.org:8080/18061
Reviewed-by: Qifan Chen <qchen@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-12-09 23:17:20 +00:00
guojingfeng
b0b6e11f53 IMPALA-10970: Fix criterion for classifying coordinator only query
This patch fixes a bug in the criterion that decides whether a query
can be considered a coordinator-only query. It did not consider
the possibility of parallel plans and ended up misclassifying some
queries as coordinator-only queries.

This classification was used during scheduling when dedicated
coordinators and executor groups are used and allowed coordinator
queries to be scheduled only on the coordinator even in the absence
of healthy executor groups.

As a result of this bug, queries classified wrongly ended up with
error code: NO_REGISTERED_BACKENDS.

Testing:
- Added a new mt_dop test case for functional_query; it passes
- Ran and passed custom_cluster/test_coordinators, test_executor_groups

Change-Id: Icaaf1f1ba7a976122b4d37bd675e6d8181dc8700
Reviewed-on: http://gerrit.cloudera.org:8080/17937
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Bikramjeet Vig <bikramjeet.vig@cloudera.com>
2021-11-30 20:13:18 +00:00
Gabor Kaszab
df528fe2b1 IMPALA-10920: Zipping unnest for arrays
This patch provides an unnest implementation for arrays where unnesting
multiple arrays in one query results in the items of the arrays being
zipped together instead of joined. There are two different syntaxes
introduced for this purpose:

1: ISO SQL:2016 compliant syntax:
SELECT a1.item, a2.item
FROM complextypes_arrays t, UNNEST(t.arr1, t.arr2) AS (a1, a2);

2: Postgres compatible syntax:
SELECT UNNEST(arr1), UNNEST(arr2) FROM complextypes_arrays;

Let me show the expected behaviour through the following example:
Inputs: arr1: {1,2,3}, arr2: {11, 12}
After running any of the above queries we expect the following output:
===============
| arr1 | arr2 |
===============
| 1    | 11   |
| 2    | 12   |
| 3    | NULL |
===============

Expected behaviour:
 - When unnesting multiple arrays with zipping unnest then the 'i'th
   item of one array will be put next to the 'i'th item of the other
   arrays in the results.
 - If the sizes of the arrays are not the same then the shorter
   arrays will be filled with NULL values up to the size of the longest
   array.

On a sidenote, UNNEST is added to Impala's SQL language as a new
keyword. This might interfere with use cases where a resource (db,
table, column, etc.) is named "UNNEST".

Restrictions:
 - It is not allowed to have WHERE filters on an unnested item of an
   array in the same SELECT query. E.g. this is not allowed:
   SELECT arr1.item
   FROM complextypes_arrays t, UNNEST(t.arr1) WHERE arr1.item < 5;

   Note, that it is allowed to have an outer SELECT around the one
   doing unnests and have a filter there on the unnested items.
 - If there is an outer SELECT filtering on the unnested array's items
   from the inner SELECT then these predicates won't be pushed down to
   the SCAN node. They are rather evaluated in the UNNEST node to
   guarantee result correctness after unnesting.
   Note, this restriction is only active when there are multiple arrays
   being unnested, or in other words when zipping unnest logic is
   required to produce results.
 - It's not allowed to do a zipping and a (traditional) joining unnest
   together in one SELECT query.
 - It's not allowed to perform zipping unnests on arrays from different
   tables.

Testing:
 - Added a bunch of E2E tests to the test suite to cover both syntaxes.
 - Did a manual test run on a table with 1000 rows, 3 array columns
   with size of around 5000 items in each array. I did an unnest on all
   three arrays in one query to see if there are any crashes or
   suspicious slowness when running on this scale.

Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7
Reviewed-on: http://gerrit.cloudera.org:8080/17983
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-11-23 07:03:10 +00:00
Andrew Sherman
ee03727971 IMPALA-11025: Transactional tables should use /test-warehouse/managed/databasename.db
Recent Hive releases seem to be enforcing that data for a managed table
is stored under the hive.metastore.warehouse.dir path property in a
folder path similar to databasename.db/tablename - see
https://cwiki.apache.org/confluence/display/Hive/Managed+vs.+External+Tables
Use this form /test-warehouse/managed/databasename.db in
generate-schema-statements.py when creating transactional tables.

Testing:
- A few small changes to tests that verify filesystem changes for acid
  tables.
- Exhaustive tests pass.

Change-Id: Ib870ca802c9fa180e6be7a6f65bef35b227772db
Reviewed-on: http://gerrit.cloudera.org:8080/18046
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-11-23 03:24:08 +00:00
Daniel Becker
cb0018e679 IMPALA-11011: Impala crashes in OrcStructReader::NumElements()
Running the query

select inner_arr.ITEM
from functional_orc_def.complextypestbl_non_transactional.nested_struct.c.d.ITEM
as inner_arr;

crashes Impala because in OrcStructReader::NumElements() 'vbatch_' is
NULL and we dereference it.

This commit adds a NULL check: if 'vbatch_' is NULL, NumElements()
returns 0.
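
The shape of the fix, sketched with a hypothetical stand-in for the ORC
batch type:

struct OrcBatch {
  int numElements;
};

int NumElements(const OrcBatch* vbatch) {
  if (vbatch == nullptr) return 0;  // Previously dereferenced NULL here.
  return vbatch->numElements;
}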

Testing:
  - added a regression test in
    'testdata/workloads/functional-query/queries/QueryTest/struct-in-select-list.test'
    that runs the above query.

Change-Id: I19cea7afdd1b3542a20a81b9f212fa320f3c1427
Reviewed-on: http://gerrit.cloudera.org:8080/18007
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-11-10 21:11:30 +00:00
Zoltan Borok-Nagy
b02c003138 IMPALA-10974: Impala cannot resolve columns of converted Iceberg table
When a regular Parquet/ORC table is converted to Iceberg via Hive,
only the Iceberg metadata files need to be created. The data files
can stay in place.

This causes problems when the data files don't have field ids for
the schema elements. Currently Impala resolves columns in data
files based on Iceberg field ids, but since they are missing,
Impala raises an error or returns NULLs.

With this patch Impala falls back to the default column resolution
strategy when the data files lack field ids.

Testing:
 * added e2e tests both for Parquet and ORC

Change-Id: I85881b09891c7bd101e7a96e92561b70bbe5af41
Reviewed-on: http://gerrit.cloudera.org:8080/17953
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-11-04 17:55:21 +00:00
Zoltan Borok-Nagy
9ed4b36897 IMPALA-10777: Enable min/max filtering for Iceberg partitions
This patch enables min/max filters for Iceberg columns that
participate in table partitioning. The min/max filters are
evaluated at the Parquet row group level. This means that it
is still slower than dynamic partition pruning (which doesn't
even need to open the files), but much faster than no pruning at all.

Performance

I used the following query to measure perf on a scale 10 TPC-DS
dataset:

 select i_item_id,sum(ss_ext_sales_price) total_sales
 from
         store_sales,
         date_dim,
          customer_address,
          item
 where i_item_id in (select
      i_item_id
 from item
 where i_color in ('orchid','chiffon','lace'))
  and     ss_item_sk              = i_item_sk
  and     ss_sold_date_sk         = d_date_sk
  and     d_year                  = 2000
  and     d_moy                   = 1
  and     ss_addr_sk              = ca_address_sk
  and     ca_gmt_offset           = -8

The above query took the following times to execute:

Regular Parquet table: 1.16s
Iceberg table without min/max filters: 4.39s
Iceberg table with min/max filters: 1.77s

Testing:
 * added e2e test
 * planner test could not be added because Iceberg tables behave
   differently during planner tests (due to some hacks that needs
   refactoring)

Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac
Reviewed-on: http://gerrit.cloudera.org:8080/17960
Reviewed-by: Qifan Chen <qchen@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-11-03 03:26:27 +00:00
Amogh Margoor
cd64271a0c IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Currently, the entire row is materialized before filtering during a
scan. Instead of paying the cost of materialization upfront, for
columnar formats we can avoid doing it for rows that are filtered out.
Columns that are required for filtering are the only ones that need
to be materialized before filtering. For the rest of the columns,
materialization can be delayed and done only for rows that survive.
This patch implements this technique for the Parquet format only.

A new configuration option, 'parquet_materialization_threshold', is
introduced, which is the minimum number of consecutive filtered-out
rows needed to skip materialization. If set to less than 0, it disables
late materialization.
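
The threshold rule itself is a simple run-length check; a sketch with a
hypothetical helper:

// Materialization of the remaining columns is skipped only for runs of
// consecutive filtered-out rows at least as long as the threshold; a
// negative threshold disables late materialization entirely.
bool ShouldSkipMaterialization(int run_length, int threshold) {
  if (threshold < 0) return false;  // Late materialization disabled.
  return run_length >= threshold;
}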

Performance:
Performance was measured for a single-daemon, single-threaded impalad
on the TPC-H scale 42 lineitem table with 252 million rows of
unsorted data. Up to 2.5x improvement for non-page-indexed scans and
up to 4x improvement for page-indexed scans was seen. Queries for the
page index were borrowed from the blog:
https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/
More details:
https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing

Testing:
 1. Ran existing tests
 2. Added UT for 'ScratchTupleBatch::GetMicroBatch'
 3. Added end-to-end test for late materialization.
Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60
Reviewed-on: http://gerrit.cloudera.org:8080/17860
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Qifan Chen <qchen@cloudera.com>
2021-10-28 20:09:46 +00:00
Qifan Chen
975883c470 IMPALA-10811 RPC to submit query getting stuck for AWS NLB forever
This patch addresses an Impala client hang due to the AWS network load
balancer timeout, which is fixed at 350s. When long DDL operations are
executing and the timeout happens, AWS silently drops the connection and
the Impala client enters a hang state.

The fix maintains the current TCLIService protocol between the client
and Impala server and is applicable to the following Impala clients
which issue thrift RPC ExecuteStatement() followed by repeated call to
GetOperationStatus() (HS2, Impyla and HUE) or a variant of it (Beeswax)
to Impala backend.

  1. HS2
  2. Beeswax
  3. Impyla
  4. HUE

In the fix, the backend method ClientRequestState::ExecDdlRequest()
can start a new thread in 'async_exec_thread_' for ExecDdlRequestImpl()
which executes most of the DDLs asynchronously. This thread is waited
for in the wait thread 'wait_thread_'. Since the wait thread also runs
asynchronously, the execution of the DDLs will not cause a wait on the
Impala client. Thus the Impala client can keep checking its execution
status via GetOperationStatus() without long waits of more than
350s.
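
A sketch of the polling pattern this enables, with hypothetical
stand-ins rather than Impala's ClientRequestState API:

#include <chrono>
#include <future>
#include <thread>

void ExecDdlRequestImpl() {
  std::this_thread::sleep_for(std::chrono::seconds(2));  // A long DDL.
}

int main() {
  // The DDL runs on its own thread...
  std::future<void> ddl = std::async(std::launch::async, ExecDdlRequestImpl);
  // ...while the client's GetOperationStatus() loop polls without ever
  // blocking long enough to hit the 350s NLB timeout.
  while (ddl.wait_for(std::chrono::milliseconds(100)) !=
         std::future_status::ready) {
    // Report "still running" to the client.
  }
  return 0;
}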

As an optimization, the above asynchronous mode is not applied to the
execution of certain DDLs that run a very low risk of long execution.

  1. Operations that do not access catalog service;
  2. COMPUTE STATS as the stats computation queries already run
     asynchronously.

External behavior change:
  1. A new field with name "DDL execution mode:" is added to the
     summary section in the runtime profile, next to "DDL Type". This
     field takes either 'asynchronous' or 'synchronous' as value.
  2. A new query option 'enable_async_ddl_execution', default to true,
     is added. It can be set to false to turn off the patch.

Limitations:
  This patch does not handle potential AWS NLB-type time out for LOAD
  DATA (IMPALA-10967).

Testing:
  1. Added new async. DDL unit tests with HS2, HS2-HTTP, Beeswax and
     JDBC clients.
  2. Ran core tests successfully.

Change-Id: Ib57e86926a233ef13d27a9ec8d9c36d33a88a44e
Reviewed-on: http://gerrit.cloudera.org:8080/17872
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-10-23 00:08:14 +00:00
stiga-huang
c127b6b1a7 IMPALA-10873: Push down EQUALS, IS NULL and IN-list predicate to ORC reader
This patch pushes down more kinds of predicates to the ORC reader,
including EQUALS, IN-list, and IS-NULL predicates, for further
improvements:
 - EQUALS and IN-list predicates can be evaluated inside the ORC reader
   with bloom filters in the ORC files.
 - Compared to scanning Parquet, which converts an IN-list predicate
   into two binary predicates (i.e. LE and GE), the ORC reader can
   leverage IN-list predicates to skip ORC RowGroups. E.g. a RowGroup
   with int column 'x' in range [1, 100] will be skipped if we push down
   the predicate "x in (0, 101)" (see the sketch below).
 - IS-NULL predicates (including IS-NOT-NULL) can also be used in the
   ORC reader to skip RowGroups.
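
A sketch of the RowGroup elimination from the second bullet; this is a
hypothetical helper over per-group min/max statistics:

#include <cstdint>
#include <vector>

// An IN-list predicate lets the reader skip a RowGroup whenever none
// of the listed values falls inside the group's [min, max] range.
bool CanSkipRowGroupByInList(int64_t min, int64_t max,
                             const std::vector<int64_t>& in_list) {
  for (int64_t v : in_list) {
    if (v >= min && v <= max) return false;  // Value may be present.
  }
  return true;  // e.g. [1, 100] is skipped for "x in (0, 101)".
}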

Implementation:
FE will collect these kinds of predicates into 'min_max_conjuncts' of
THdfsScanNode. To better reflect the meaning, 'min_max_conjuncts' is
renamed to 'stats_conjuncts'. Same for other related variable names.

Parquet scanner will only pick binary min-max conjuncts (i.e. LT, GT,
LE, and GE) to keep the existing behavior. ORC scanner will build
SearchArgument based on all these conjuncts.

Tests
 * Add a new test table 'alltypessmall_bool_sorted' which has files
   containing sorted bool values.
 * Add test in orc-stats.test

Change-Id: Iaa89f080fe2e87d94fc8ea7f1be83e087fa34225
Reviewed-on: http://gerrit.cloudera.org:8080/17815
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Qifan Chen <qchen@cloudera.com>
2021-10-21 15:45:39 +00:00
Zoltan Borok-Nagy
3e75a17730 IMPALA-10957: test_iceberg_query is flaky
In iceberg-query.test we create an external Iceberg table and
set the table property 'iceberg.file_format' to check
backward-compatibility with earlier versions. At the end we
delete the table. The table deletion makes the test fail
sporadically during GVO.

It seems the bug is caused by the parallel execution of this test.
The test didn't use a unique database, therefore dropping the table
could affect other executions of the same test. This patch puts
the relevant queries to their own .test file using a unique
database.

Change-Id: I16e558ae5add48d8a39bd89277a0256f534ba65f
Reviewed-on: http://gerrit.cloudera.org:8080/17929
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-10-18 19:32:29 +00:00
Zoltan Borok-Nagy
61964882d1 IMPALA-10914: Consistently schedule scan ranges for Iceberg tables
Before this patch Impala inconsistently scheduled scan ranges for
Iceberg tables on HDFS, in local catalog mode. It did so because
LocalIcebergTable reloaded all the files descriptors, and the HDFS
block locations were not consistent across the reloads. Impala's
scheduler uses the block location list for scan range assignment,
hence the assignments were inconsistent between queries. This has
a negative effect on caching and hence hit performance quite badly.

It is redundant and expensive to reload file descriptors for each
query in local catalog mode. This patch extends the GetPartialInfo()
RPC with Iceberg-specific snapshot information. It means that the
coordinator is now able to fetch Iceberg data file descriptors from
the CatalogD. This way scan range assignment becomes consistent
because we reuse the same file descriptors with the same block
location information.

Fixing the above revealed another bug. Before this patch we didn't
handle self-events of Iceberg tables. When an Iceberg table is stored
in the HiveCatalog it means that Iceberg will update the HMS table
on modifications because it needs to update table property
'metadata_location' (this points to the new snapshot file).
Then Catalogd processes these modifications again when they arrive
via the event notification mechanism. I fixed this by creating Iceberg
transactions in which I set the catalog service ID and new catalog
version for the Iceberg table. Since we are using transactions now
Iceberg has to embed all table modifications in a single ALTER TABLE
request to HMS, and detect the corresponding alter event later via the
aforementioned catalog service ID and version.

Testing:
 * added e2e test for the scan range assignment
 * added e2e test for detecting self-events

Change-Id: Ibb8216b37d350469b573dad7fcefdd0ee0599ed5
Reviewed-on: http://gerrit.cloudera.org:8080/17857
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Qifan Chen <qchen@cloudera.com>
2021-10-04 17:34:56 +00:00
Zoltan Borok-Nagy
d2f866f9a1 IMPALA-10935: Impala crashes on old Iceberg table property
With IMPALA-10627 we switched to use standard Iceberg table
properties: https://iceberg.apache.org/configuration/

E.g. we switched from 'iceberg.file_format' to 'write.format.default'.
For backward compatibility we also support 'iceberg.file_format'. Though
the support is not perfect as it causes a crash in some cases.

Impala crashes when the following conditions are met:
* local catalog mode is being used
* Iceberg table is being queried
* the data file format is ORC
* 'iceberg.file_format' is set instead of 'write.format.default' table
  property
* Query is "select count(*) from t;"

Impala wrongly assumes that PARQUET is being used and tries to apply the
count star optimization. It is not implemented for the ORC scanner and
causes it to crash.

This patch fixes the wrong assumption. Also it fixes the HdfsOrcScanner,
so it won't crash in release mode but raise an error.

This patch also enables UNSETting the file format table property for
Iceberg tables. This table property was already enabled for
modifications (changing the value via SET TBLPROPERTIES).

Testing:
 * added e2e test for the above conditions

Change-Id: Iafd9baef1c124d7356a14ba24c571567629a5e50
Reviewed-on: http://gerrit.cloudera.org:8080/17877
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-10-02 03:15:17 +00:00
liuyao
39cc4b6bf4 IMPALA-2581: LIMIT can be propagated down into some aggregations
This patch contains 2 parts:
1. When both conditions below are true, push down the limit to the
pre-aggregation:
     a) the aggregation node has no aggregate function
     b) the aggregation node has no predicate
2. Finish the aggregation when the number of unique keys of the hash
table has exceeded the limit.

Sample queries:
SELECT DISTINCT f FROM t LIMIT n
can pass the LIMIT all the way down to the pre-aggregation, which
leads to a nearly unbounded speedup on these queries over large tables
when n is low.
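
A sketch of part 2, assuming a hypothetical helper with the hash table
reduced to an unordered set of keys:

#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// For "SELECT DISTINCT f FROM t LIMIT n" the pre-aggregation can stop
// as soon as its hash table holds n distinct keys.
std::unordered_set<int64_t> DistinctWithLimit(
    const std::vector<int64_t>& column, size_t limit) {
  std::unordered_set<int64_t> keys;
  for (int64_t v : column) {
    keys.insert(v);
    if (keys.size() >= limit) break;  // LIMIT reached: finish early.
  }
  return keys;
}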

Testing:
Add test targeted-perf/queries/aggregation.test
Pass core test

Change-Id: I930a6cb203615acfc03f23118d1bc1f0ea360995
Reviewed-on: http://gerrit.cloudera.org:8080/17821
Reviewed-by: Qifan Chen <qchen@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-09-22 20:42:10 +00:00
norbert.luksa
35b21083b1 IMPALA-6505: Min-Max predicate push down in ORC scanner
In planning phase, the planner collects and generates min-max predicates
that can be evaluated on parquet file statistics. We can easily extend
this on ORC tables.

This commit implements min/max predicate pushdown for the ORC scanner
leveraging on the external ORC library's search arguments. We build
the search arguments when we open the scanner as we need not to
modify them later.

Also added a new query option orc_read_statistics, similar to
parquet_read_statistics. If the option is set to true (as it is by
default), predicate pushdown takes effect; otherwise it is skipped. The
predicates will be evaluated at the ORC row group level, i.e. by default for
every 10,000 rows.

Limitations:
 - Min-max predicates on CHAR/VARCHAR types are not pushed down due to
   inconsistent behaviors on padding/truncating between Hive and Impala.
   (IMPALA-10882)
 - Min-max predicates on TIMESTAMP are not pushed down (IMPALA-10915).
 - Min-max predicates having different arg types are not pushed down
   (IMPALA-10916).
 - Min-max predicates with non-literal const exprs are not pushed down
   since SearchArgument interfaces only accept literals. This only
   happens when expr rewrites are disabled thus constant folding is
   disabled.

Tests:
 - Add e2e tests similar to test_parquet_stats to verify that
   predicates are pushed down.
 - Run CORE tests
 - Ran the TPC-H benchmark; there is no improvement, nor regression.
   On the other hand, certain selective queries gained significant
   speed-up, e.g. select count(*) from lineitem where l_orderkey = 1.

Change-Id: I136622413db21e0941d238ab6aeea901a6464845
Reviewed-on: http://gerrit.cloudera.org:8080/15403
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Reviewed-by: Qifan Chen <qchen@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-09-17 00:44:15 +00:00
AlexanderSaydakov
c925807b1a IMPALA-10901 cleaner and faster operations with datasketches
- serialize using bytes instead of stream
- avoid unnecessary constructor during deserialization
- simplified code slightly
- added original exception message to re-thrown generic message

Change-Id: I306a2489dac0f4d2d475e8f9987cd58bf95474bb
Reviewed-on: http://gerrit.cloudera.org:8080/17818
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-09-15 13:58:23 +00:00
stiga-huang
3850d49711 IMPALA-9662,IMPALA-2019(part-3): Support UTF-8 mode in mask functions
Mask functions are used in Ranger column masking policies to mask
sensitive data. There are 5 mask functions: mask(), mask_first_n(),
mask_last_n(), mask_show_first_n(), mask_show_last_n(). Take mask() as
an example, by default, it will mask uppercase to 'X', lowercase to 'x',
digits to 'n' and leave other characters unmasked. For masking all
characters to '*', we can use
  mask(my_col, '*', '*', '*', '*');
The current implementations mask strings byte-by-byte, which gives
results inconsistent with Hive when the string contains Unicode
characters:
  mask('中国', '*', '*', '*', '*') => '******'
Each Chinese character is encoded into 3 bytes in UTF-8 so we get the
above result. The result in Hive is '**' since there are two Chinese
characters.

This patch provides consistent masking behavior with Hive for
strings under the UTF-8 mode, i.e., set UTF8_MODE=true. In UTF-8 mode,
the masked unit of a string is a unicode code point.

Implementation
 - Extends the existing MaskTransform function to deal with Unicode code
   points (represented by uint32_t).
 - Extends the existing GetFirstChar function to get the code point of
   the given masked characters in UTF-8 mode.
 - Implements a MaskSubStrUtf8 method as the core functionality (see
   the sketch below).
 - Switches to MaskSubStrUtf8 instead of MaskSubStr in UTF-8 mode.
 - For better testing, this patch also adds an overload for all mask
   functions for only masking other chars while keeping the
   upper/lower/digit chars unmasked. E.g. mask({col}, -1, -1, -1, 'X').
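
A simplified sketch of the per-code-point masking (a stand-in for
MaskSubStrUtf8, with the upper/lower/digit handling reduced to the
defaults described above; otherChar masks everything else):

#include <string>

std::string MaskUtf8(const std::string& s, char otherChar) {
  std::string out;
  for (size_t i = 0; i < s.size();) {
    unsigned char b = s[i];
    // Length of this UTF-8 code point, derived from its first byte.
    size_t len = (b < 0x80) ? 1 : (b < 0xE0) ? 2 : (b < 0xF0) ? 3 : 4;
    if (len == 1 && b >= 'A' && b <= 'Z') out += 'X';
    else if (len == 1 && b >= 'a' && b <= 'z') out += 'x';
    else if (len == 1 && b >= '0' && b <= '9') out += 'n';
    else out += otherChar;  // One masked char per code point.
    i += len;
  }
  return out;
}
// MaskUtf8("中国", '*') yields "**", matching Hive's behavior.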

Tests
 - Add BE tests in expr-test
 - Add e2e tests in utf8-string-functions.test

Change-Id: I1276eccc94c9528507349b155a51e76f338367d5
Reviewed-on: http://gerrit.cloudera.org:8080/17780
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-09-15 05:04:07 +00:00
Gabor Kaszab
1e21aa6b96 IMPALA-9495: Support struct in select list for ORC tables
This patch implements the functionality to allow structs in the select
list of inline views and topmost blocks. When displaying the value of a
struct it is formatted into a JSON value and returned as a string. An
example of such a value:

SELECT struct_col FROM some_table;
'{"int_struct_member":12,"string_struct_member":"string value"}'

Another example where we query a nested struct:
SELECT outer_struct_col FROM some_table;
'{"inner_struct":{"string_member":"string value","int_member":12}}'

Note, the conversion from struct to JSON happens on the server side
before sending out the value in HS2 to the client. However, HS2 is
capable of handling struct values as well so in a later change we might
want to add a functionality to send the struct in thrift to the client
so that the client can use the struct directly.

-- Internal representation of a struct:
When scanning a struct the row batch will hold the values of the
struct's children as if they were queried one by one directly in the
select list.

E.g. Taking the following table:
CREATE TABLE tbl (id int, s struct<a:int,b:string>) STORED AS ORC

And running the following query:
SELECT id, s FROM tbl;

After scanning, a row in a row batch will hold the following values:
(note the biggest size comes first)
 1: The pointer for the string in s.b
 2: The length for the string in s.b
 3: The int value for s.a
 4: The int value of id
 5: A single null byte for all the slots: id, s, s.a, s.b

The size of a struct has an effect on the order of the memory layout of
a row batch. The struct size is calculated by summing the size of its
fields and then the struct gets a place in the row batch to precede all
smaller slots by size. Note, all the fields of a struct are consecutive
to each other in the row batch. Inside a struct the order of the fields
is also based on their size, as in the regular case for primitives.

When evaluating a struct as a SlotRef a newly introduced StructVal will
be used to refer to the actual values of a struct in the row batch.
This StructVal holds a vector of pointers where each pointer represents
a member of the struct. Following the above example the StructVal would
keep two pointers, one to point to an IntVal and one to point to a
StringVal.
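
Roughly, as a sketch (the names AnyValSketch/StructValSketch are
simplified stand-ins, not Impala's exact UDF types):

  // Sketch: the struct's value is just a list of pointers, one per
  // member, each pointing at that member's own value representation.
  #include <vector>

  struct AnyValSketch { bool is_null = false; };

  struct StructValSketch : AnyValSketch {
    // For s struct<a:int, b:string>: ptrs[0] points at an IntVal-like
    // value for s.a, ptrs[1] at a StringVal-like value for s.b.
    std::vector<AnyValSketch*> ptrs;
  };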

-- Changes related to tuple and slot descriptors:
When providing a struct in the select list there is going to be a
SlotDescriptor for the struct slot in the topmost TupleDescriptor.
Additionally, another TupleDescriptor is created to hold SlotDescriptors
for each of the struct's children. The struct SlotDescriptor points to
the newly introduced TupleDescriptor using 'itemTupleId'.
The offsets for the children of the struct are calculated from the
beginning of the topmost TupleDescriptor and not from the
TupleDescriptor that directly holds the struct's children. The null
indicator bytes are likewise stored at the level of the topmost
TupleDescriptor.

-- Changes related to scalar expressions:
A struct in the select list is translated into an expression tree where
the top of this tree is a SlotRef for the struct itself and its
children in the tree are SlotRefs for the members of the struct. When
evaluating a struct SlotRef, after the null checks the evaluation is
delegated to the children SlotRefs.

-- Restrictions:
  - Codegen support is not included in this patch.
  - Only ORC file format is supported by this patch.
  - Only HS2 client supports returning structs. Beeswax support is not
    implemented as it is going to be deprecated anyway. Currently we
    receive an error when trying to query a struct through Beeswax.

-- Tests added:
  - The ORC and Parquet functional databases are extended with 3 new
    tables:
    1: A small table with one-level structs, holding different
    kinds of primitive types as members.
    2: A small table with 2 and 3 level nested structs.
    3: A bigger, partitioned table constructed from alltypes where all
    the columns except the 'id' column are put into a struct.
  - struct-in-select-list.test and nested-struct-in-select-list.test
    use these new tables to query structs directly or through an
    inline view.

Change-Id: I0fbe56bdcd372b72e99c0195d87a818e7fa4bc3a
Reviewed-on: http://gerrit.cloudera.org:8080/17638
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-09-14 21:21:47 +00:00
Zoltan Borok-Nagy
6b4693ddbf IMPALA-10900: Add Iceberg tests that write many files
In earlier versions of Impala we had a bug that affected
insertions into Iceberg tables. When Impala wrote multiple
files during a single INSERT statement, it could crash or,
even worse, silently omit data files from the Iceberg
metadata.

The current master doesn't have this bug, but we don't
really have tests for this case.

This patch adds tests that write many files during inserts
to an Iceberg table. Both non-partitioned and partitioned
Iceberg tables are tested.

We achieve writing lots of files by setting 'parquet_file_size'
to 8 megabytes.

Testing:
 * added an e2e test that writes many data files
 * added an exhaustive e2e test that writes even more data files

Change-Id: Ia2dbc2c5f9574153842af308a61f9d91994d067b
Reviewed-on: http://gerrit.cloudera.org:8080/17831
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-09-08 18:38:37 +00:00
Attila Jeges
c8aa5796d9 IMPALA-10879: Add parquet stats to iceberg manifest
This patch adds parquet stats to iceberg manifest as per-datafile
metrics.

The following metrics are supported:
- column_sizes :
  Map from column id to the total size on disk of all regions that
  store the column. Does not include bytes necessary to read other
  columns, like footers.

- null_value_counts :
  Map from column id to number of null values in the column.

- lower_bounds :
  Map from column id to lower bound in the column serialized as
  binary. Each value must be less than or equal to all non-null,
  non-NaN values in the column for the file.

- upper_bounds :
  Map from column id to upper bound in the column serialized as
  binary. Each value must be greater than or equal to all non-null,
  non-NaN values in the column for the file.

The corresponding parquet stats are collected by 'ColumnStats'
(in 'min_value_', 'max_value_', 'null_count_' members) and
'HdfsParquetTableWriter::BaseColumnWriter' (in
'total_compressed_byte_size_' member).

Testing:
- New e2e test was added to verify that the metrics are written to the
  Iceberg manifest upon inserting data.
- New e2e test was added to verify that lower_bounds/upper_bounds
  metrics are used to prune data files on querying iceberg tables.
- Existing e2e tests were updated to work with the new behavior.
- BE test for single-value serialization.

Relevant Iceberg documentation:
- Manifest:
  https://iceberg.apache.org/spec/#manifests
- Values in lower_bounds and upper_bounds maps should be Single-value
  serialized to binary:
  https://iceberg.apache.org/spec/#appendix-d-single-value-serialization

Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Reviewed-on: http://gerrit.cloudera.org:8080/17806
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Attila Jeges <attilaj@cloudera.com>
2021-09-02 21:34:41 +00:00
Zoltan Borok-Nagy
4f9f8c33ca IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
This patch adds support for the "FOR SYSTEM_TIME AS OF" and
"FOR SYSTEM_VERSION AS OF" clauses for Iceberg tables. The new
clauses are part of the table ref. FOR SYSTEM_TIME AS OF conforms to
the SQL:2011 standard:
https://cs.ulb.ac.be/public/_media/teaching/infoh415/tempfeaturessql2011.pdf

With FOR SYSTEM_TIME AS OF we can query a table as of a specific point
in time, e.g. we can retrieve the table's contents as of 1 day ago.

The timestamp given to "FOR SYSTEM_TIME AS OF" is interpreted in the
local timezone. The local timezone can be set via the query option
TIMEZONE; by default it is the coordinator node's local timezone. The
timestamp is translated to UTC because table snapshots are tagged with
UTC timestamps.

"FOR SYSTEM_VERSION AS OF" is a non-standard extension. It works
similarly to FOR SYSTEM_TIME AS OF, but with this clause we can query
a table via a snapshot ID instead of a timestamp.

HIVE-25344 also added support for these clauses to Hive.

Table snapshot IDs and timestamp information can be queried with the
help of the DESCRIBE HISTORY command.

Sample queries:

 SELECT * FROM t FOR SYSTEM_TIME AS OF now();
 SELECT * FROM t FOR SYSTEM_TIME AS OF '2021-08-10 11:02:34';
 SELECT * FROM t FOR SYSTEM_TIME AS OF now() - interval 10 days + interval 3 hours;

 SELECT * FROM t FOR SYSTEM_VERSION AS OF 7080861547601448759;

 SELECT * FROM t FOR SYSTEM_TIME AS OF now()
 MINUS
 SELECT * FROM t FOR SYSTEM_TIME AS OF now() - interval 1 days;

This patch uses some parts of the in-progress
IMPALA-9773 (https://gerrit.cloudera.org/#/c/13342/) developed by
Todd Lipcon and Grant Henke. This patch also resolves some TODOs of
IMPALA-9773, i.e. after this patch it'll be easier to add
time travel for Kudu tables as well.

Testing:
 * added parser tests (ParserTest.java)
 * added analyzer tests (AnalyzeStmtsTest.java)
 * added e2e tests (test_iceberg.py)

Change-Id: Ib523c5e47b8d9c377bea39a82fe20249177cf824
Reviewed-on: http://gerrit.cloudera.org:8080/17765
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-30 23:09:42 +00:00
Amogh Margoor
2040b2621f IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
The HashTable implementation in Impala consists of a contiguous array
of Buckets, and each Bucket contains either data or a pointer to a
linked list of duplicate entries named DuplicateNode.
These are the structures of Bucket and DuplicateNode:

  struct DuplicateNode {
    bool matched;
    DuplicateNode* next;
    HtData htdata;
  };

  struct Bucket {
    bool filled;
    bool matched;
    bool hasDuplicates;
    uint32_t hash;
    union {
      HtData htdata;
      DuplicateNode* duplicates;
    } bucketData;
  };

The size of a Bucket is currently 16 bytes and the size of a
DuplicateNode is 24 bytes. If we remove the booleans from both
structs, the size of Bucket reduces to 12 bytes and DuplicateNode to
16 bytes. One way to remove the booleans is to fold them into pointers
already part of the struct. Pointers store addresses, and on
architectures like x86 and ARM the linear address is only 48 bits
long. With level-5 paging Intel plans to expand it to 57 bits, which
means the most significant 7 bits, i.e. bits 58 to 64, can be used to
store these booleans. This patch reduces the size of Bucket and
DuplicateNode by implementing this folding. However, there are two
further requirements: the size of a Bucket and the number of buckets
in the hash table both need to be powers of 2.
These requirements exist for the following reasons:
1. The memory allocator allocates memory in powers of 2 to avoid
   internal fragmentation. Hence, num buckets * sizeof(Bucket)
   should be a power of 2.
2. The number of buckets being a power of 2 enables a faster modulo
   operation, i.e. instead of the slow modulo (hash % N), the faster
   (hash & (N-1)) can be used.

Due to this, the 4-byte 'hash' field is removed from Bucket and
stored separately in a new array hash_array_ in HashTable.
This ensures sizeof(Bucket) is 8, which is a power of 2.

New Classes:
------------
As part of this patch, TaggedPointer is introduced: a template class
that stores a pointer and a 7-bit tag together in a 64-bit integer.
This structure owns the pointer and takes care of the allocation and
deallocation of the object being pointed to. However, derived classes
can opt out of ownership of the object and let the client manage it.
Its derived classes for Bucket and DuplicateNode, TaggedBucketData and
TaggedDuplicateNode, do exactly that.
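
A minimal sketch of the folding idea (this is not the actual
TaggedPointer class; the real one also handles the ownership
described above):

  // Sketch: keep the pointer in the low 57 bits of a 64-bit word and
  // up to 7 boolean tags in the top bits (57..63).
  #include <cstdint>

  template <typename T>
  class TaggedPtrSketch {
   public:
    T* GetPtr() const { return reinterpret_cast<T*>(data_ & PTR_MASK); }
    void SetPtr(T* ptr) {
      data_ = (data_ & ~PTR_MASK) |
          (reinterpret_cast<uint64_t>(ptr) & PTR_MASK);
    }
    bool GetTag(int bit) const { return (data_ >> (57 + bit)) & 1; }
    void SetTag(int bit, bool val) {
      const uint64_t mask = uint64_t{1} << (57 + bit);
      data_ = val ? (data_ | mask) : (data_ & ~mask);
    }

   private:
    static constexpr uint64_t PTR_MASK = (uint64_t{1} << 57) - 1;
    uint64_t data_ = 0;  // pointer and tags folded into 8 bytes
  };

With the three flags folded into the tag bits and the hash moved out,
a Bucket fits in 8 bytes as described above.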

Benchmark:
----------
As part of this patch a new micro benchmark for HashTable has
been introduced, which helps in measuring:
1. Runtime for building the hash table and probing it.
2. Memory consumed after building the table.
This helps in measuring the impact of changes to the HashTable's
data structure and algorithm.
We saw a 25-30% reduction in memory consumed and no significant
difference in performance (0.91X-1.2X).

Other Benchmarks:
1. Billion-row synthetic benchmark on a single node, single daemon:
   a. 2-3% improvement in Join GEOMEAN for the Probe benchmark.
   b. 17% and 21% reduction in PeakMemoryUsage and
      CumulativeBytes allocated, respectively.
2. TPCH-42: 0-1.5% improvement in GEOMEAN runtime.

Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b
Reviewed-on: http://gerrit.cloudera.org:8080/17592
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-25 20:05:47 +00:00
Andrew Sherman
b54d0c35ff IMPALA-10849: Ignore escaped wildcards that terminate like predicates.
A like predicate is generally evaluated by converting it into a regex
that is evaluated at execution time. If the predicate of a like clause
is a constant (which is the common case when you say "row
like 'start%'") then there are optimizations where some cases that are
simpler than a regex are spotted, and a simple function rather than a
regex evaluator is used. One example is that a predicate such as
'start%' is evaluated by looking for strings that begin with "start".
Amusingly, the code that spots the potential optimizations uses
regexes to look for patterns in the like predicate. The code that
looks for the optimization where a simple prefix can be searched for
does not deal with the case where the '%' wildcard at the end of the
predicate is escaped. To fix this we add a check for the case where
the predicate ends in an escaped '%'.
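
A sketch of the corrected prefix detection (simplified; Impala's real
detection uses regexes as noted above, and '\' is assumed to be the
escape character):

  // Sketch: a LIKE pattern can use the simple prefix search only if
  // its sole wildcard is a single *unescaped* '%' at the very end.
  #include <optional>
  #include <string>

  std::optional<std::string> GetPrefixIfSimple(const std::string& pat) {
    std::string prefix;
    for (size_t i = 0; i < pat.size(); ++i) {
      char c = pat[i];
      if (c == '\\' && i + 1 < pat.size()) {
        prefix.push_back(pat[++i]);  // escaped char is a literal
      } else if (c == '%') {
        // Qualifies only if the unescaped '%' is the last character.
        if (i == pat.size() - 1) return prefix;
        return std::nullopt;
      } else if (c == '_') {
        return std::nullopt;  // '_' needs the regex path
      } else {
        prefix.push_back(c);
      }
    }
    return std::nullopt;  // no wildcard: equality, not a prefix search
  }
  // GetPrefixIfSimple("start%")   -> "start" (prefix search applies)
  // GetPrefixIfSimple("start\\%") -> nullopt (the '%' is escaped)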

There are some other problems with escaped wildcards discussed in
IMPALA-2422. This change does not fix these problems, which are hard.

New tests for escaped wildcards are added to exprs.test - note that
these tests cannot be part of the LikeTbl tests as the like predicate
optimizations are only applied when the like predicate is a string
literal.

Exhaustive tests ran clean.

Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
Reviewed-on: http://gerrit.cloudera.org:8080/17798
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-25 05:31:42 +00:00
Qifan Chen
cd902d8c22 IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans
This patch enables min/max filtering for non-correlated subqueries
that return one value. In this case, the filters are built from the
results of the subqueries and the filtering target is the scan node to
be qualified by one of the subqueries. Shown below is one such query
that normally gets compiled into a nested loop join. The filtering
limits the values from column store_sales.ss_sales_price to be within
[-infinity, min(ss_wholesale_cost)].

  select count(*) from store_sales
    where ss_sales_price <=
      (select min(ss_wholesale_cost) from store_sales);

In FE, the fact that the above scalar subquery exists is recorded
in a flag in InlineViewRef in the analyzer and later transferred to
the AggregationNode in the planner.

In BE, the min/max filtering infrastructure is integrated with the
nested loop join as follows.

 1. NljBuilderConfig is populated with filter descriptors from nested
    join plan node via NljBuilder::CreateEmbeddedBuilder() (similar
    to hash join), or in NljBuilderConfig::Init() when the sink config
    is created (for separate builder case);
 2. NljBuilder is populated with filter contexts utilizing the filter
    descriptors in NljBuilderConfig. Filter contexts are the interface
    to actual min/max filters;
 3. New insertion methods InsertFor<op>(), where <op> is LE, LT, GE and
    GT, are added to the MinMaxFilter class hierarchy. They are used
    for the join predicate "target <op> src_expr" (see the sketch
    after this list);
 4. RuntimeContext::InsertPerCompareOp() calls one of the new
    insertion methods above based on the comparison op saved in the
    filter descriptor;
 5. NljBuilder::InsertRuntimeFilters() calls the new methods.
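
As a simplified illustration of the new InsertFor<op>() methods (the
type and member names here are invented for the sketch; the real
MinMaxFilter hierarchy covers all comparison ops and types):

  // Sketch: for the predicate "target <= src_expr", only the largest
  // build-side value matters; a probe row can survive iff
  // target <= max(src values).
  #include <cstdint>
  #include <limits>

  struct Int64MinMaxFilterSketch {
    int64_t max_ = std::numeric_limits<int64_t>::min();
    // Build side: widen the filter with one src_expr value.
    void InsertForLE(int64_t v) { if (v > max_) max_ = v; }
    // Probe side (the targeted scan): keep rows that may match.
    bool MayMatch(int64_t target) const { return target <= max_; }
  };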

By default, the feature is turned on only for sorted or partitioned
join columns.

Testing:
 1. Add single range insertion tests in min-max-filter-test.cc;
 2. Add positive and negative plan tests in
    overlap_min_max_filters.test;
 3. Add tests in overlap_min_max_filters_on_partition_columns.test;
 4. Add tests in overlap_min_max_filters_on_sorted_columns.test;
 5. Run core tests.

TODO in follow-up patches:
 1. Extend min/max filter for inequality subquery for other use cases
    (IMPALA-10869).

Change-Id: I7c2bb5baad622051d1002c9c162c672d428e5446
Reviewed-on: http://gerrit.cloudera.org:8080/17706
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-21 14:46:51 +00:00
Amogh Margoor
0bde6b443c IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library
StringToFloatInternal is used to parse strings into floats. It had
logic to ensure it is faster than standard functions like strtod in
many cases, but it was not as accurate. We are replacing it with a
third-party library named fast_double_parser, which is fast and
doesn't sacrifice accuracy for speed. Benchmarking on more than
1 million rows where a string is cast to double found that the new
patch is on par with the earlier algorithm.
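
For reference, a minimal use of the library looks like the following
sketch, based on the project's README (Impala's actual call site
differs):

  #include <cstdio>
  #include "fast_double_parser.h"  // github.com/lemire/fast_double_parser

  int main() {
    double value;
    // parse_number() returns a pointer just past the parsed number,
    // or nullptr if the input does not start with a valid number.
    const char* ok = fast_double_parser::parse_number("1.25e3", &value);
    if (ok == nullptr) return 1;  // parse failure
    std::printf("%f\n", value);   // prints 1250.000000
    return 0;
  }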

Results:
W/O library: Fetched 1222386 row(s) in 32.10s
With library: Fetched 1222386 row(s) in 31.71s

Testing:
1. Added test to check for accuracy improvement.
2. Ran existing Backend tests for correctness.

Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141
Reviewed-on: http://gerrit.cloudera.org:8080/17389
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-15 20:40:39 +00:00
ShikhaAsrani
b1ca089446 IMPALA-10797: Frontend changes to enable 'stored as JSONFILE'
This change allows the usage of commands that do not require reading
the JSON file, like:
- Create Table <Table> stored as JSONFILE
- Show Create Table <Table>
- Describe <Table>

Changes:
- Added JSON as a FileFormat to thrift and HdfsFileFormat.
- Allow the SQL keyword 'jsonfile' and map it to the JSON format.
- Add the JSON SerDe.
- JSON files have the same input format as TextFile, so we need to use
the SerDe library in use to differentiate between the two formats.
Overloaded the functions that query the file format based on the input
format to also consider the SerDe library.
- Added tests for the 'Create Table' and 'Show Create Table' commands

Pending Changes:
- test for Describe command - to be added with backend changes.

Change-Id: I5b8cb2f59df3af09902b49d3bdac16c19954b305
Reviewed-on: http://gerrit.cloudera.org:8080/17727
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-13 16:04:19 +00:00
Zoltan Borok-Nagy
a1d5891c57 IMPALA-10741: Set engine.hive.enabled=true table property for Iceberg tables
Hive relies on the engine.hive.enabled=true table property being set
for Iceberg tables. Without it, Hive overwrites the table metadata
with a different storage handler and SerDe/Input/OutputFormat when it
writes the table, making the table unusable.

With this patch Impala sets this table property during table creation.

Testing:
 * updated show-create-table.test
 * tested Impala/Hive interop manually

Change-Id: I6aa0240829697a27f48d0defcce48920a5d6f49b
Reviewed-on: http://gerrit.cloudera.org:8080/17750
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-05 15:45:24 +00:00
stiga-huang
599c84b4dd IMPALA-10808: (addendum) Abort on illegal decimal parquet schemas
The previous patch added checks for illegal decimal schemas in parquet
files. However, it didn't return a non-ok status in
ParquetMetadataUtils::ValidateColumn if abort_on_error is set to
false, so we continued to use the illegal file schema and hit the
DCHECK.

This patch fixes this and adds test coverage for illegal decimal
schemas.

Tests:
 - Add a bad parquet file with illegal decimal schemas.
 - Add e2e tests on the file.
 - Ran test_fuzz_decimal_tbl 100 times and saw the errors caught as
   expected.

Change-Id: I623f255a7f40be57bfa4ade98827842cee6f1fee
Reviewed-on: http://gerrit.cloudera.org:8080/17748
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-05 07:55:26 +00:00
Attila Jeges
4fc42c379e IMPALA-10739: Support setting new partition spec for Iceberg tables
With this patch Impala will support partition evolution for
Iceberg tables.

The DDL statement to change the default partition spec is:
ALTER TABLE <tbl> SET PARTITION SPEC(<partition-spec>)

Hive uses the same SQL syntax.

Testing:
- Added FE test to exercise parsing various well-formed and ill-formed
  ALTER TABLE SET PARTITION SPEC statements.

- Added e2e tests for:
  - ALTER TABLE SET PARTITION SPEC works for tables with HadoopTables
    and HadoopCatalog catalogs.
  - When evolving partition spec, the old data written with an earlier
    spec remains unchanged. New data is written using the new spec in
    a new layout. Data written with earlier spec and new spec can be
    fetched in a single query.
  - Invalid ALTER TABLE SET PARTITION SPEC statements yield the
    expected analysis error messages.

Change-Id: I9bd935b8a82e977df9ee90d464b5fe2a7acc83f2
Reviewed-on: http://gerrit.cloudera.org:8080/17723
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-03 16:27:07 +00:00