Commit Graph

stiga-huang
77d80aeda6 IMPALA-11812: Deduplicate column schema in hmsPartitions
A list of HMS Partitions will be created in many workloads in catalogd,
e.g. table loading, bulk altering partitions by ComputeStats or
AlterTableRecoverPartitions, etc. Currently, each hmsPartition holds a
unique list of column schemas, i.e. a List<FieldSchema>. This results in
lots of FieldSchema instances if the table is wide and lots of
partitions need to be loaded/operated. Though the strings of column
names and comments are interned, the FieldSchema objects could still
occupy the majority of the heap. See the histogram in JIRA description.

In reality, the hmsPartition instances of a table can share the
table-level column schema since Impala doesn't respect the partition
level schema.

This patch replaces the column list in the StorageDescriptor of
hmsPartitions with the table-level column list to remove the
duplication. It also adds some progress logs in batch HMS operations,
and avoids misleading logs when the event-processor is disabled.
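
For illustration, bulk partition operations of this shape create many
hmsPartition objects in catalogd and benefit from the shared table-level
schema (table name hypothetical):

ALTER TABLE wide_part_tbl RECOVER PARTITIONS;
COMPUTE INCREMENTAL STATS wide_part_tbl;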

Tests:
- Ran exhaustive tests
- Add tests on wide table operations that hit OOM errors without this
  fix.

Change-Id: I511ecca0ace8bea4c24a19a54fb0a75390e50c4d
Reviewed-on: http://gerrit.cloudera.org:8080/19391
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-01-01 04:38:36 +00:00
noemi
4a05eaf988 IMPALA-11807: Fix TestIcebergTable.test_avro_file_format and test_mixed_file_format
Iceberg hardcodes URIs in metadata files. If the table was written
in a certain storage location and then moved to another file system,
the hardcoded URIs will still point to the old location instead of
the current one. Therefore Impala will be unable to read the table.

TestIcebergTable.test_avro_file_format and test_mixed_file_format
use Hive from Impala to write tables. If the tables are created in
a different file system than the one they will be read from, the tests
fail due to the invalid URIs.
Skipping these 2 tests if testing is not done on HDFS.

Updated the data load schema of the 2 test tables created by Hive and
set LOCATION to match the previous test tables. If this later makes it
possible to rewrite the URIs in the metadata so that the tables become
accessible from another file system as well, then the tests can be
re-enabled.

Testing:
 - Testing locally on HDFS minicluster
 - Triggered an Ozone build to verify that it is skipped on a different
   file system

Change-Id: Ie2f126de80c6e7f825d02f6814fcf69ae320a781
Reviewed-on: http://gerrit.cloudera.org:8080/19387
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-12-22 19:45:21 +00:00
noemi
390a932064 IMPALA-11708: Add support for mixed Iceberg tables with AVRO file format
This patch extends support for Iceberg tables containing multiple
file formats: AVRO data files can now also be read in a mixed table,
besides Parquet and ORC.

Impala uses its avro scanner to read AVRO files, therefore all the
avro related limitations apply here as well: writes/metadata
changes are not supported.

testing:
- E2E testing: extending 'iceberg-mixed-file-format.test' to include
  AVRO files as well, in order to test reading all three currently
  supported file formats: avro+orc+parquet

Change-Id: I941adfb659218283eb5fec1b394bb3003f8072a6
Reviewed-on: http://gerrit.cloudera.org:8080/19353
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-12-16 17:37:35 +00:00
Daniel Becker
25b5058ef5 IMPALA-11717: Use rapidjson for printing collections
We have been using rapidjson to print structs but didn't use it to print
collections (arrays and maps).

This change introduces the usage of rapidjson to print collections for
both the HS2 and the Beeswax protocol.

The old code handling the printing of collections in raw-value.{h,cc} is
removed.

Testing:
 - Ran existing EE tests
 - Added EE tests with non-string and NULL map keys in
   nested-map-in-select-list.test and map_null_keys.test.

Change-Id: I08a2d596a498fbbaf1419b18284846b992f49165
Reviewed-on: http://gerrit.cloudera.org:8080/19309
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
2022-12-15 15:04:07 +00:00
noemi
80fc49abe6 IMPALA-11158: Add support for Iceberg tables with AVRO data files
Iceberg tables containing only AVRO files or no AVRO files at all
can now be read by Impala. Mixed file format tables with AVRO are
currently unsupported.
Impala uses its avro scanner to read AVRO files, therefore all the
avro related limitations apply here as well: writes/metadata
changes are not supported.

testing:
- created test tables: 'iceberg_avro_only' contains only AVRO files;
  'iceberg_avro_mixed' contains all file formats: avro+orc+parquet
- added E2E test that reads Avro-only table
- added test case to iceberg-negative.test that tries to read
  mixed file format table

Change-Id: I827e5707e54bebabc614e127daa48255f86f4c4f
Reviewed-on: http://gerrit.cloudera.org:8080/19084
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-12-08 03:03:13 +00:00
Csaba Ringhofer
a983a347a7 IMPALA-11682: Add tests for minor compacted insert only ACID tables
Only test changes. Minor compacted delta dirs are supported in
Impala since IMPALA-9512, but at that time Hive supported minor
compaction only on full ACID tables. Since that time Hive added
support for minor compacting insert only/MM tables (HIVE-22610).

Change-Id: I7159283f3658f2119d38bd3393729535edd0a76f
Reviewed-on: http://gerrit.cloudera.org:8080/19164
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-11-03 00:52:08 +00:00
Daniel Becker
37f44a58f3 IMPALA-10918: Allow map type in SELECT list
Adding support for MAP types in the select list.
An example of how maps are printed:
{"k1":2,"k2":null}

Nested collection types (maps and arrays) are supported in any
combination. However, structs in collections and collections in structs
are not supported.
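
A sketch of the intended usage, assuming the functional Parquet test
schema referenced elsewhere in this log (the output is shown only to
illustrate the JSON formatting):

SELECT id, int_map FROM functional_parquet.complextypestbl WHERE id = 1;
-- e.g. returns: 1, {"k1":1,"k2":100}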

Limitations (other than map support) as described in the commit for
IMPALA-9498 still apply; the following are to be implemented later:
- Unify HS2 / Beeswax logic with the way STRUCTs are handled.
  This could be done in a "final" logic that can handle
  STRUCTS/ARRAYS nested to each other
- Implement "deep copy" and "deep serialize" for collections in BE.
  This would enable all operators, e.g. ORDER BY and UNION.

Testing:
 - modified the FE tests that checked that maps were not allowed in the
   select list - now the tests expect maps to be allowed there
 - added FE and EE tests involving maps based on the array tests

Change-Id: I921c647f1779add36e7f5df4ce6ca237dcfaf001
Reviewed-on: http://gerrit.cloudera.org:8080/18736
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-09-07 19:55:43 +00:00
LPL
cc26f345a4 IMPALA-11507: Use absolute_path when Iceberg data files are outside of the table location
For Iceberg tables, when one of the following properties is used, the
table is considered to possibly have data outside the table location
directory:
- 'write.object-storage.enabled' is true
- 'write.data.path' is not empty
- 'write.location-provider.impl' is configured
- 'write.object-storage.path'(Deprecated) is not empty
- 'write.folder-storage.path'(Deprecated) is not empty

We should tolerate the situation where the relative path of a data file
cannot be derived from the table location path, and use the absolute
path in that case. E.g. an ETL program may write a table so that the
Iceberg metadata is placed in
'hdfs://nameservice_meta/warehouse/hadoop_catalog/ice_tbl/metadata',
the recent data files in
'hdfs://nameservice_data/warehouse/hadoop_catalog/ice_tbl/data', and the
data files older than half a year in
's3a://nameservice_data/warehouse/hadoop_catalog/ice_tbl/data'; such a
table should still be queryable by Impala.

Testing:
 - added e2e tests

Change-Id: I666bed21d20d5895f4332e92eb30a94fa24250be
Reviewed-on: http://gerrit.cloudera.org:8080/18894
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-09-06 18:35:30 +00:00
Zoltan Borok-Nagy
73da4d7ddf IMPALA-11484: Create SCAN plan for Iceberg V2 position delete tables
This patch adds support for reading Iceberg V2 tables that use position
deletes. Equality deletes are still not supported. Position delete
files store the file path and file position of the deleted rows.

When an Iceberg table has position delete files we need to do an
ANTI JOIN between data files and delete files. From the data files
we need to query the virtual columns INPUT__FILE__NAME and
FILE__POSITION, while from the delete files we need the data columns
'file_path' and 'pos'. The latter data columns are not part of the
table schema, so we create a virtual table instance of
'IcebergPositionDeleteTable' that has a table schema corresponding
to the delete files ('file_path', 'pos').

This patch introduces a new class 'IcebergScanPlanner' which is
responsible for creating the scan plan for Iceberg tables. It creates
the aforementioned ANTI JOIN. Also, if there are data files without
corresponding delete files, we can have a separate SCAN node and its
results would be UNIONed to the rows coming from the ANTI JOIN:

              UNION
             /     \
    SCAN data       ANTI JOIN
                     /      \
              SCAN data    SCAN deletes

Some refactorings in the context of this CR:
Predicate pushdown and time travel logic is transferred from
IcebergScanNode to IcebergScanPlanner. Iceberg snapshot summary
retrieval is moved from FeFsTable to FeIcebergTable.

Testing:
 * added planner test
 * added e2e tests

TODO in follow-up Jiras:
 * better cardinality estimates (IMPALA-11516)
 * support unrelative collection columns (select item from t.int_array)
   (IMPALA-11517)
   Currently such queries return error during analysis

Change-Id: I672cfee18d8e131772d90378d5b12ad4d0f7dd48
Reviewed-on: http://gerrit.cloudera.org:8080/18847
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-09-01 16:51:17 +00:00
Csaba Ringhofer
7ca11dfc7f IMPALA-9482: Support for BINARY columns
This patch adds support for BINARY columns for all table formats with
the exception of Kudu.

In Hive the main difference between STRING and BINARY is that STRING is
assumed to be UTF8 encoded, while BINARY can be any byte array.
Some other differences in Hive:
- BINARY can be only cast from/to STRING
- Only a small subset of built-in STRING functions support BINARY.
- In several file formats (e.g. text) BINARY is base64 encoded.
- No NDV is calculated during COMPUTE STATISTICS.

As Impala doesn't treat STRINGs as UTF8, BINARY and STRING become nearly
identical, especially from the backend's perspective. For this reason,
BINARY is implemented a bit differently compared to other types:
while the frontend treats STRING and BINARY as two separate types, most
of the backend uses PrimitiveType::TYPE_STRING for BINARY too, e.g.
in SlotDesc. Only the following parts of backend need to differentiate
between STRING and BINARY:
- table scanners
- table writers
- HS2/Beeswax service
These parts have access to column metadata, which allows adding special
handling for BINARY.

Only a very few builtins are allowed for BINARY at the moment:
- length
- min/max/count
- coalesce and similar "selector" functions
Other STRING functions can be only used by casting to STRING first.
Adding support for more of these functions is easy: the BINARY type
simply has to be "connected" to the already existing STRING function's
signature. Functions whose result depends on utf8_mode need to ensure
that with BINARY they always behave as if utf8_mode=0 (for example
length() is mapped to bytes(), as length() counts utf8 characters if
utf8_mode=1).

All kinds of UDFs (native, Hive legacy, Hive generic) support BINARY,
though in the case of legacy Hive UDFs it is only supported if the
argument and return types are set explicitly to ensure backward
compatibility.
See IMPALA-11340 for details.

The original plan was to behave as closely to Hive as possible, but I
realized that Hive has more relaxed casting rules than Impala, which
led to STRING<->BINARY casts being necessary in more cases in Impala.
This was needed to disallow passing a BINARY to functions that expect
a STRING argument. An example for the difference is that in
INSERT ... VALUES () string literals need to be explicitly cast to
BINARY, while this is not needed in Hive.
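
A minimal sketch of the resulting behavior (table and column names
hypothetical):

CREATE TABLE bin_tbl (id INT, payload BINARY);
INSERT INTO bin_tbl VALUES (1, CAST('abc' AS BINARY));  -- explicit cast needed
SELECT id, length(payload), CAST(payload AS STRING) FROM bin_tbl;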

Testing:
- Added functional.binary_tbl for all file formats (except Kudu)
  to test scanning.
- Removed functional.unsupported_types and related tests, as now
  Impala supports all (non-complex) types that Hive does.
- Added FE/EE tests mainly based on the ones added to the DATE type

Change-Id: I36861a9ca6c2047b0d76862507c86f7f153bc582
Reviewed-on: http://gerrit.cloudera.org:8080/16066
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-08-19 13:55:42 +00:00
Zoltan Borok-Nagy
522ee1fcc0 IMPALA-11350: Add virtual column FILE__POSITION for Parquet tables
Virtual column FILE__POSITION returns the ordinal position of the row
in the data file. It will be useful to add support for Iceberg's
position-based delete files.

This patch only adds FILE__POSITION to Parquet tables. It works
similarly to the handling of collection position slots. I.e. we
add the responsibility of dealing with the file position slot to
an existing column reader. Because of page-filtering and late
materialization we already tracked the file position in member
'current_row_' during scanning.

Querying the FILE__POSITION in other file formats raises an error.
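
A minimal usage sketch on a Parquet table (the table name is taken from
the functional test schema only for illustration):

SELECT file__position, id FROM functional_parquet.alltypes LIMIT 5;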

Testing:
 * added e2e tests

Change-Id: I4ef72c683d0d5ae2898bca36fa87e74b663671f7
Reviewed-on: http://gerrit.cloudera.org:8080/18704
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-08-12 19:21:55 +00:00
Csaba Ringhofer
efc303b71a IMPALA-11434: Fix analysis of multiple more than 1d arrays in select list
Arrays with more than one dimension in the select list tried to
register a CollectionTableRef named "item" for the inner arrays,
leading to a name collision if there was more than one such array.

The logic is changed to always use the full path as implicit alias
in CollectionTableRefs backing arrays in select list.

As a side effect this leads to using the fully qualified names
in expressions in the explain plans of queries that use arrays
from views. This is not an intended change, but I don't consider
it to be critical. Created IMPALA-11452 to deal with more
sophisticated alias handling in collections.
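
A minimal shape of a query that used to fail, assuming a table 'tbl'
with two array<array<int>> columns 'a1' and 'a2' (hypothetical names):

SELECT a1, a2 FROM tbl;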

Testing:
- added a new table to testdata and a regression test

Change-Id: I6f2b6cad51fa25a6f6932420eccf1b0a964d5e4e
Reviewed-on: http://gerrit.cloudera.org:8080/18734
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-07-22 22:59:19 +00:00
stiga-huang
d74cc7319f IMPALA-9670: Fix unloaded views are shown as tables for GET_TABLES requests
At startup, catalogd pulls the table names from HMS and tracks each
table using an IncompleteTable which only contains the table name. The
table types (TABLE/VIEW) and comments are unknown until the table/view
is loaded in catalogd. GET_TABLES is a request of the HS2 protocol. It
fetches all the tables with their types and comments. For unloaded
tables/views, Impala always returns them with TABLE type (the default)
and empty comments.

This patch enables catalogd to always load the table types and comments
along with the table names. This behavior is controlled by a
catalogd-only flag, --pull_table_types_and_comments, which is false by
default. When this flag is enabled, catalogd will load table types and
comments at startup and in executing INVALIDATE METADATA commands. In
other words, an unloaded table (IncompleteTable) now not just contains
the table name, but also contains the correct table type and comment.

This is implemented by using the getTableMetas HMS API when invalidating
a table. The original behavior uses getAllTables to load all table names
and uses tableExists to verify whether a table still exists. When the
flag is set, we'll use getTableMetas instead to also load the table
types and comments.

Implementation:
Add a new table type, UNLOADED_TABLE, in TTableType to identify tables
that we only know are not views, but don't know whether they are Kudu or
HDFS tables since their full set of metadata is not loaded.

When propagating catalog objects from catalogd to coordinators, views
are sent using a catalog key explicitly prefixed by VIEW. So
coordinators can create IncompleteTables/LocalIncompleteTables with the
correct types.

In most cases when creating an IncompleteTable, we have the table type
and comment in the context. For instance, when adding an
IncompleteTable for a CreateTable/CreateView request, we know exactly
whether it's a table or a view. So we can create IncompleteTables with
the correct types.

Test infra changes:
 - Adds get_tables() method for the hs2_client
 - Extends ImpalaTestSuite.create_client_for_nth_impalad() to support
   hs2 and hs2-http protocols. So we can create HS2 clients on all
   impalads.

Tests:
 - Add custom cluster tests on all catalog modes (with/without
   local-catalog or event processor). Verify the table types and
   comments are always correct when pull_table_types_and_comments is
   true.

Change-Id: I528bb20272ebdd66a0118c30efc2b0566f2b0e2f
Reviewed-on: http://gerrit.cloudera.org:8080/18626
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-06-24 04:27:49 +00:00
Gabor Kaszab
5d021ce5a7 IMPALA-9496: Allow struct type in the select list for Parquet tables
This patch is to extend the support of Struct columns in the select
list to Parquet files as well.
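
A usage sketch, assuming a Parquet table 'tbl' with a column
's struct<a:int,b:string>' (hypothetical names); the struct is returned
as a JSON string, as introduced for ORC by IMPALA-9495:

SELECT id, s FROM tbl;
-- e.g. returns: 1, '{"a":12,"b":"some text"}'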

There are some limitations with this patch:
  - Dictionary filtering could work when we have conjuncts on a member
    of a struct; however, if this struct is given in the select list
    then dictionary filtering is disabled. The reason is that in
    this case there would be a mismatch between the slot/tuple IDs in
    the conjunct and the ones in the select list due to the expr
    substitution logic applied when a struct is in the select list.
    Solving this puzzle would be a nice future performance enhancement.
    See IMPALA-11361.
  - When structs are read in a batched manner, the struct reader
    delegates the actual reading of the data to the column readers of
    its children; however, it uses the simple ReadValue() on these
    readers instead of the batched version. The reason is that calling
    the batched reader on the member column readers would in fact read
    in batches, but it wouldn't handle the case when the parent struct
    is NULL: it would set only itself to NULL but not the parent
    struct. This might also be a future performance enhancement. See
    IMPALA-11363.
  - If there is a struct in the select list then late materialization
    is turned off. The reason is that LM expects the column readers to
    be used through the batched reading interface; however, as said in
    the bullet point above, struct column readers currently use the
    non-batched reading interface of their children. As a result, after
    reading, the column readers are not in the state that SkipRows() of
    LM expects, which results in a query failure because it's not able
    to skip the rows for non-filter readers.
    Once IMPALA-11363 is implemented and the struct will also use the
    ReadValueBatch() interface of its children then late
    materialization could be turned on even if structs are in the
    select list. See IMPALA-11364.

Testing:
  - There were already a lot of tests to exercise this functionality,
    but they were only run on ORC tables. I changed these to cover
    Parquet tables too.

Change-Id: I3e8b4cbc2c4d1dd5fbefb7c87dad8d4e6ac2f452
Reviewed-on: http://gerrit.cloudera.org:8080/18596
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-06-22 17:55:07 +00:00
Zoltan Borok-Nagy
e91c7810f0 IMPALA-10850: Interpret timestamp predicates in local timezone in IcebergScanNode
IcebergScanNode interprets the timestamp literals as UTC timestamps
during predicate pushdown to Iceberg. It causes problems when the
Iceberg table uses TIMESTAMPTZ (which corresponds to TIMESTAMP WITH
LOCAL TIME ZONE in SQL) because in the scanners we assume that the
timestamp literals in a query are in local timezone.

Hence, if the Iceberg table is partitioned by HOUR(ts), and Impala is
running in a different timezone than UTC, then the following query
doesn't return any rows:

 SELECT * from t
 WHERE ts = <some ts>;

Because during predicate pushdown the timestamp is interpreted as a
UTC timestamp (no conversion from local to UTC), but during query
execution the timestamp data in the files are converted to local
timezone, then compared to <some ts>. I.e. in the scanner the
assumption is that <some ts> is in local timezone.

On the other hand, when the Iceberg type TIMESTAMP (which corresponds
to TIMESTAMP WITHOUT TIME ZONE in SQL) is used, then we should just
push down the timestamp values without any conversion. In this case
there is no conversion in the scanners either.
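
A sketch of the behavior after the fix, assuming table 't' from the
example above is partitioned by HOUR(ts) and 'ts' is TIMESTAMPTZ (the
timezone value below is just an example):

SET TIMEZONE='America/Los_Angeles';
-- the literal is now interpreted in the local timezone and converted to
-- UTC before being pushed down to Iceberg, matching the scanners
SELECT * FROM t WHERE ts = '2022-04-01 10:00:00';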

Testing:
 * added e2e test with TIMESTAMPTZ
 * added e2e test with TIMESTAMP

Change-Id: I181be5d2fa004f69b457f69ff82dc2f9877f46fa
Reviewed-on: http://gerrit.cloudera.org:8080/18399
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
2022-04-21 12:49:31 +00:00
Aman Sinha
8645ac6db3 IMPALA-11247: Test script changes for materialized views
IMPALA-10723 added support for treating materialized views as tables.
In certain test configurations, the rebuild of the materialized views
(which is done via Hive) was not populating the data in the MV. In
this patch, I have changed the source tables of materialized views
to be full-acid instead of insert-only transactional tables. This
enables the tests to succeed. Insert-only source tables are also
meant to work for the MV rebuild but that is a Hive issue that will
be investigated separately.

Change-Id: I349faa0ad36ec8ca6f574f7f92d9a32fb7d0d344
Reviewed-on: http://gerrit.cloudera.org:8080/18421
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Tested-by: Aman Sinha <amsinha@cloudera.com>
2022-04-17 22:36:57 +00:00
Aman Sinha
e644c99724 IMPALA-10723: Treat materialized view as a table instead of a view
The existing behavior is that materialized views are treated
as views and therefore expanded similar to a view when one
queries the MV directly (SELECT * FROM materialized_view).
This is incorrect since an MV is a regular table with physical
properties such as partitioning, clustering etc. and should be
treated as such even though it has a view definition associated
with it.

This patch focuses on the use case where MVs are created as
HDFS tables and makes the MVs a derived class of HdfsTable,
therefore making it a Table object. It adds support for
collecting and displaying statistics on materialized views
and these statistics could be leveraged by an external frontend
that supports MV based query rewrites (note that such a rewrite
is not supported by Impala with or without this patch). Note
that we are not introducing new syntax for MVs since DDL, DML
operations on MVs are only supported through Hive.

Directly querying an MV is permitted, but inserting into MVs is not,
since MVs are supposed to be modified only through an external
refresh when the source tables have modifications.

If the source tables associated with a materialized view have
column masking or row-filtering Ranger policies, querying the
MV will throw an error. This behavior is consistent with that
of Hive.
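
A sketch of what becomes possible ('mv1' is a hypothetical materialized
view created and rebuilt via Hive):

SELECT * FROM mv1;        -- direct query, no view expansion
COMPUTE STATS mv1;        -- statistics can be collected
SHOW TABLE STATS mv1;
-- INSERT INTO mv1 ...    -- rejected; MVs are only modified via refresh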

Testing:
 - Added transactional tables for alltypes, jointbl and used
   them as source tables to create materialized view.
 - Added tests for compute stats, drop stats, show stats and
   simple select query on a materialized view.
 - Added test for select on a materialized view when the
   source table has a column mask.
 - Modified analyzer tests related to alter, insert, drop of
   materialized view.

Change-Id: If3108996124c6544a97fb0c34b6aff5e324a6cff
Reviewed-on: http://gerrit.cloudera.org:8080/17595
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
2022-04-14 11:56:20 +00:00
Tamas Mate
9cd4823aa9 IMPALA-11023: Raise error when delete file is found in an Iceberg table
Iceberg V2 DeleteFiles are skipped during scans and the whole content of
the DataFiles is returned. This commit adds an extra check to prevent
scanning tables that have delete files, to avoid unexpected results until
merge-on-read is supported. Metadata operations are allowed on tables
with delete files.

Testing:
 - Added e2e test.

Change-Id: I6e9cbf2424b27157883d551f73e728ab4ec6d21e
Reviewed-on: http://gerrit.cloudera.org:8080/18383
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-04-11 19:37:04 +00:00
Zoltan Borok-Nagy
952f2af0ca IMPALA-11210: Impala can only handle lowercase schema elements of Iceberg table
When Impala/Hive creates a table, it lowercases the schema elements.
When Spark creates an Iceberg table it doesn't lowercase the names
of the columns in the Iceberg metadata. This triggers a precondition
check in Impala which makes such Iceberg tables unloadable.

This patch converts column names to lowercase when converting Iceberg
schemas to Hive/Impala schemas.

Testing:
 * added e2e test

Change-Id: Iffd910f76844fbf34db805dda6c3053c5ad1cf79
Reviewed-on: http://gerrit.cloudera.org:8080/18368
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-03-31 11:53:13 +00:00
Zoltan Borok-Nagy
c10e951bcb IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables
When Hive (and probably other engines as well) converts a legacy Hive
table to Iceberg, it doesn't rewrite the data files. This means that the
data files have neither write ids nor partition column data. Currently
Impala expects the partition columns to be present in the data files,
so it is not able to read converted partitioned tables.

With this patch Impala loads partition values from the Iceberg metadata.
The extra metadata information is attached to the file descriptor
objects and propagated to the scanners. This metadata contains the
Iceberg data file format (later it could be used to handle mixed-format
tables), and partition data.

We use the partition data in the HdfsScanner to create the template
tuple that contains the partition values of identity-partitioned
columns. This is not only true for migrated tables, but for all Iceberg
tables with identity partitions, which means we also save some IO
and CPU time for such columns. The partition information could also
be used for Dynamic Partition Pruning later.

We use the (human-readable) string representation of the partition data
when storing them in the flat buffers. This helps debugging, also
it provides the needed flexibility when the partition columns
evolve (e.g. INT -> BIGINT, DECIMAL(4,2) -> DECIMAL(6,2)).

Testing
 * e2e test for all data types that can be used to partition a table
 * e2e test for migrated partitioned table + schema evolution (without
   renaming columns)
 * e2e for table where all columns are used as identity-partitions

Change-Id: Iac11a02de709d43532056f71359c49d20c1be2b8
Reviewed-on: http://gerrit.cloudera.org:8080/18240
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-03-07 20:00:42 +00:00
stiga-huang
374783c55e IMPALA-10898: Add runtime IN-list filters for ORC tables
ORC files have optional bloom filter indexes for each column. Since
ORC-1.7.0, the C++ reader supports pushing down predicates to skip
unreleated RowGroups. The pushed down predicates will be evaludated on
file indexes (i.e. statistics and bloom filter indexes). Note that only
EQUALS and IN-list predicates can leverage bloom filter indexes.

Currently Impala has two kinds of runtime filters: bloom filter and
min-max filter. Unfortunately they can't be converted into EQUALS or
IN-list predicates. So they can't leverage the file level bloom filter
indexes.

This patch adds runtime IN-list filters for this purpose. Currently they
are generated for the build side of a broadcast join. They will only be
applied on ORC tables and be pushed down to the ORC reader(i.e. ORC
lib). To avoid exploding the IN-list, if # of distinct values of the
build side exceeds a threshold (default to 1024), we set the filter to
ALWAYS_TRUE and clear its entry. The threshold can be configured by a
new query option, RUNTIME_IN_LIST_FILTER_ENTRY_LIMIT.

Evaluating runtime IN-list filters is much slower than evaluating
runtime bloom filters due to the current simple implementation (i.e.
std::unordered_set) and the lack of codegen. So we disable it at row
level.

For visibility, this patch adds two counters in the HdfsScanNode:
 - NumPushedDownPredicates
 - NumPushedDownRuntimeFilters
They reflect the predicates and runtime filters that are pushed down to
the ORC reader.

Currently, runtime IN-list filters are disabled by default. This patch
extends the query option, ENABLED_RUNTIME_FILTER_TYPES, to support a
comma separated list of filter types. It defaults to be "BLOOM,MIN_MAX".
Add "IN_LIST" in it to enable runtime IN-list filters.

Ran perf tests on a 3 instances cluster on my desktop using TPC-DS with
scale factor 20. It shows significant improvements in some queries:

+-----------+-------------+--------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+--------+
| Workload  | Query       | File Format        | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+-----------+-------------+--------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+--------+
| TPCDS(20) | TPCDS-Q67A  | orc / snap / block | 35.07  | 44.01       | I -20.32%  |   0.38%    |   1.38%        | 10    | I -25.69%      | -3.58   | -45.33 |
| TPCDS(20) | TPCDS-Q37   | orc / snap / block | 1.08   | 1.45        | I -25.23%  |   7.14%    |   3.09%        | 10    | I -34.09%      | -3.58   | -12.94 |
| TPCDS(20) | TPCDS-Q70A  | orc / snap / block | 6.30   | 8.60        | I -26.81%  |   5.24%    |   4.21%        | 10    | I -36.67%      | -3.58   | -14.88 |
| TPCDS(20) | TPCDS-Q16   | orc / snap / block | 1.33   | 1.85        | I -28.28%  |   4.98%    |   5.92%        | 10    | I -39.38%      | -3.58   | -12.93 |
| TPCDS(20) | TPCDS-Q18A  | orc / snap / block | 5.70   | 8.06        | I -29.25%  |   3.00%    |   4.12%        | 10    | I -40.30%      | -3.58   | -19.95 |
| TPCDS(20) | TPCDS-Q22A  | orc / snap / block | 2.01   | 2.97        | I -32.21%  |   6.12%    |   5.94%        | 10    | I -47.68%      | -3.58   | -14.05 |
| TPCDS(20) | TPCDS-Q77A  | orc / snap / block | 8.49   | 12.44       | I -31.75%  |   6.44%    |   3.96%        | 10    | I -49.71%      | -3.58   | -16.97 |
| TPCDS(20) | TPCDS-Q75   | orc / snap / block | 7.76   | 12.27       | I -36.76%  |   5.01%    |   3.87%        | 10    | I -59.56%      | -3.58   | -23.26 |
| TPCDS(20) | TPCDS-Q21   | orc / snap / block | 0.71   | 1.27        | I -44.26%  |   4.56%    |   4.24%        | 10    | I -77.31%      | -3.58   | -28.31 |
| TPCDS(20) | TPCDS-Q80A  | orc / snap / block | 9.24   | 20.42       | I -54.77%  |   4.03%    |   3.82%        | 10    | I -123.12%     | -3.58   | -40.90 |
| TPCDS(20) | TPCDS-Q39-1 | orc / snap / block | 1.07   | 2.26        | I -52.74%  | * 23.83% * |   2.60%        | 10    | I -149.68%     | -3.58   | -14.43 |
| TPCDS(20) | TPCDS-Q39-2 | orc / snap / block | 1.00   | 2.33        | I -56.95%  | * 19.53% * |   2.07%        | 10    | I -151.89%     | -3.58   | -20.81 |
+-----------+-------------+--------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+--------+
"Base Avg" is the avg of the original time. "Avg" is the current time.

However, we also see some regressions due to the suboptimal
implementation. The follow-up JIRAs will focus on improvements:
 - IMPALA-11140: Codegen InListFilter::Insert() and InListFilter::Find()
 - IMPALA-11141: Use exact data types in IN-list filters instead of
   casting data to a set of int64_t or a set of string.
 - IMPALA-11142: Consider IN-list filters in partitioned joins.

Tests:
 - Test IN-list filter on string, date and all integer types
 - Test IN-list filter with NULL
 - Test IN-list filter on complex exprs targets

Change-Id: I25080628233799aa0b6be18d5a832f1385414501
Reviewed-on: http://gerrit.cloudera.org:8080/18141
Reviewed-by: Qifan Chen <qchen@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-03-03 00:21:06 +00:00
Attila Jeges
d3da875684 IMPALA-9498: Allow returning arrays in select list
Until now ARRAYs had to be unnested in queries. This patch adds
support to return ARRAYs as STRINGs (JSON arrays) in select list,
for example:
select id, int_array from functional_parquet.complextypestbl where id = 1;
returns: 1, [1,2,3]

Returning ARRAYs from inline or HMS views is also supported -
these arrays can be used both in the select list and as relative
table references. Using them as non-relative table references is
not supported (IMPALA-11052).

Though STRUCTs are already supported, ARRAYs and STRUCTs nested in
each other are not supported yet.

Things intentionally postponed for later commits:
- Add MAP support too - this shouldn't be too tricky after
  ARRAY support, but I don't want to make this patch even more
  complex.
- Unify HS2 / Beeswax logic with the way STRUCTs are handled.
  This could be done in a "final" logic that can handle
  STRUCTS/ARRAYS nested to each other
- Implement "deep copy" and "deep serialize" for ARRAYs in BE.
  This would enable all operators, e.g. ORDER BY and UNION.

Testing:
- FE tests were added for analyses and authorization
- EE tests were added
- core tests were ran

Change-Id: Ibb1e42ffb21c7ddc033aba0f754b0108e46f34d0
Reviewed-on: http://gerrit.cloudera.org:8080/17811
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-02-17 18:51:06 +00:00
Gabor Kaszab
df528fe2b1 IMPALA-10920: Zipping unnest for arrays
This patch provides an unnest implementation for arrays where unnesting
multiple arrays in one query results in the items of the arrays being
zipped together instead of joined. There are two different syntaxes
introduced for this purpose:

1: ISO SQL:2016 compliant syntax:
SELECT a1.item, a2.item
FROM complextypes_arrays t, UNNEST(t.arr1, t.arr2) AS (a1, a2);

2: Postgres compatible syntax:
SELECT UNNEST(arr1), UNNEST(arr2) FROM complextypes_arrays;

Let me show the expected behaviour through the following example:
Inputs: arr1: {1,2,3}, arr2: {11, 12}
After running any of the above queries we expect the following output:
===============
| arr1 | arr2 |
===============
| 1    | 11   |
| 2    | 12   |
| 3    | NULL |
===============

Expected behaviour:
 - When unnesting multiple arrays with zipping unnest then the 'i'th
   item of one array will be put next to the 'i'th item of the other
   arrays in the results.
 - In case the size of the arrays is not the same then the shorter
   arrays will be filled with NULL values up to the size of the longest
   array.

On a sidenote, UNNEST is added to Impala's SQL language as a new
keyword. This might interfere with use cases where a resource (db,
table, column, etc.) is named "UNNEST".

Restrictions:
 - It is not allowed to have WHERE filters on an unnested item of an
   array in the same SELECT query. E.g. this is not allowed:
   SELECT arr1.item
   FROM complextypes_arrays t, UNNEST(t.arr1) WHERE arr1.item < 5;

   Note, that it is allowed to have an outer SELECT around the one
   doing unnests and have a filter there on the unnested items.
 - If there is an outer SELECT filtering on the unnested array's items
   from the inner SELECT then these predicates won't be pushed down to
   the SCAN node. They are rather evaluated in the UNNEST node to
   guarantee result correctness after unnesting.
   Note, this restriction is only active when there are multiple arrays
   being unnested, or in other words when zipping unnest logic is
   required to produce results.
 - It's not allowed to do a zipping and a (traditional) joining unnest
   together in one SELECT query.
 - It's not allowed to perform zipping unnests on arrays from different
   tables.

Testing:
 - Added a bunch of E2E tests to the test suite to cover both syntaxes.
 - Did a manual test run on a table with 1000 rows, 3 array columns
   with size of around 5000 items in each array. I did an unnest on all
   three arrays in one query to see if there are any crashes or
   suspicious slowness when running on this scale.

Change-Id: Ic58ff6579ecff03962e7a8698edfbe0684ce6cf7
Reviewed-on: http://gerrit.cloudera.org:8080/17983
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-11-23 07:03:10 +00:00
Zoltan Borok-Nagy
b02c003138 IMPALA-10974: Impala cannot resolve columns of converted Iceberg table
When a regular Parquet/ORC table is converted to Iceberg via Hive,
only the Iceberg metadata files need to be created. The data files
can stay in place.

This causes problems when the data files don't have field ids for
the schema elements. Currently Impala resolves columns in data
files based on Iceberg field ids, but since they are missing,
Impala raises an error or returns NULLs.

With this patch Impala falls back to the default column resolution
strategy when the data files lack field ids.

Testing:
 * added e2e tests both for Parquet and ORC

Change-Id: I85881b09891c7bd101e7a96e92561b70bbe5af41
Reviewed-on: http://gerrit.cloudera.org:8080/17953
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-11-04 17:55:21 +00:00
stiga-huang
c127b6b1a7 IMPALA-10873: Push down EQUALS, IS NULL and IN-list predicate to ORC reader
This patch pushes down more kinds of predicates to the ORC reader,
including EQUALS, IN-list, and IS-NULL predicates, bringing further
improvements:
 - EQUALS and IN-list predicates can be evaluated inside the ORC reader
   with bloom filters in the ORC files.
 - Compared to Parquet scanning, which converts an IN-list predicate
   into two binary predicates (i.e. LE and GE), the ORC reader can
   leverage IN-list predicates to skip ORC RowGroups. E.g. a RowGroup
   with int column 'x' in range [1, 100] will be skipped if we push down
   predicate "x in (0, 101)".
 - IS-NULL predicates (including IS-NOT-NULL) can also be used in the
   ORC reader to skip RowGroups.

Implementation:
FE will collect these kinds of predicates into 'min_max_conjuncts' of
THdfsScanNode. To better reflect the meaning, 'min_max_conjuncts' is
renamed to 'stats_conjuncts'. Same for other related variable names.

Parquet scanner will only pick binary min-max conjuncts (i.e. LT, GT,
LE, and GE) to keep the existing behavior. ORC scanner will build
SearchArgument based on all these conjuncts.

Tests
 * Add a new test table 'alltypessmall_bool_sorted' which has files
   containing sorted bool values.
 * Add test in orc-stats.test

Change-Id: Iaa89f080fe2e87d94fc8ea7f1be83e087fa34225
Reviewed-on: http://gerrit.cloudera.org:8080/17815
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Qifan Chen <qchen@cloudera.com>
2021-10-21 15:45:39 +00:00
Gabor Kaszab
1e21aa6b96 IMPALA-9495: Support struct in select list for ORC tables
This patch implements the functionality to allow structs in the select
list of inline views and topmost blocks. When displaying the value of a
struct it is formatted into a JSON value and returned as a string. An
example of such a value:

SELECT struct_col FROM some_table;
'{"int_struct_member":12,"string_struct_member":"string value"}'

Another example where we query a nested struct:
SELECT outer_struct_col FROM some_table;
'{"inner_struct":{"string_member":"string value","int_member":12}}'

Note, the conversion from struct to JSON happens on the server side
before sending out the value in HS2 to the client. However, HS2 is
capable of handling struct values as well so in a later change we might
want to add a functionality to send the struct in thrift to the client
so that the client can use the struct directly.

-- Internal representation of a struct:
When scanning a struct the rowbatch will hold the values of the
struct's children as if they were queried one by one directly in the
select list.

E.g. Taking the following table:
CREATE TABLE tbl (id int, s struct<a:int,b:string>) STORED AS ORC

And running the following query:
SELECT id, s FROM tbl;

After scanning, a row in a row batch will hold the following values:
(note the biggest size comes first)
 1: The pointer for the string in s.b
 2: The length for the string in s.b
 3: The int value for s.a
 4: The int value of id
 5: A single null byte for all the slots: id, s, s.a, s.b

The size of a struct has an effect on the order of the memory layout of
a row batch. The struct size is calculated by summing the size of its
fields and then the struct gets a place in the row batch to precede all
smaller slots by size. Note, all the fields of a struct are consecutive
to each other in the row batch. Inside a struct the order of the fields
is also based on their size, as in the regular case for primitives.

When evaluating a struct as a SlotRef a newly introduced StructVal will
be used to refer to the actual values of a struct in the row batch.
This StructVal holds a vector of pointers where each pointer represents
a member of the struct. Following the above example the StructVal would
keep two pointers, one to point to an IntVal and one to point to a
StringVal.

-- Changes related to tuple and slot descriptors:
When providing a struct in the select list there is going to be a
SlotDescriptor for the struct slot in the topmost TupleDescriptor.
Additionally, another TupleDesriptor is created to hold SlotDescriptors
for each of the struct's children. The struct SlotDescriptor points to
the newly introduced TupleDescriptor using 'itemTupleId'.
The offsets for the children of the struct are calculated from the
beginning of the topmost TupleDescriptor and not from the
TupleDescriptor that directly holds the struct's children. The null
indicator bytes are also stored at the level of the topmost
TupleDescriptor.

-- Changes related to scalar expressions:
A struct in the select list is translated into an expression tree where
the top of this tree is a SlotRef for the struct itself and its
children in the tree are SlotRefs for the members of the struct. When
evaluating a struct SlotRef, after the null checks, the evaluation is
delegated to the children SlotRefs.

-- Restrictions:
  - Codegen support is not included in this patch.
  - Only ORC file format is supported by this patch.
  - Only HS2 client supports returning structs. Beeswax support is not
    implemented as it is going to be deprecated anyway. Currently we
    receive an error when trying to query a struct through Beeswax.

-- Tests added:
  - The ORC and Parquet functional databases are extended with 3 new
    tables:
    1: A small table with one level structs, holding different
    kind of primitive types as members.
    2: A small table with 2 and 3 level nested structs.
    3: A bigger, partitioned table constructed from alltypes where all
    the columns except the 'id' column are put into a struct.
  - struct-in-select-list.test and nested-struct-in-select-list.test
    use these new tables to query structs directly or through an
    inline view.

Change-Id: I0fbe56bdcd372b72e99c0195d87a818e7fa4bc3a
Reviewed-on: http://gerrit.cloudera.org:8080/17638
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-09-14 21:21:47 +00:00
stiga-huang
599c84b4dd IMPALA-10808: (addendum) Abort on illegal decimal parquet schemas
The previous patch added checks on illegal decimal schemas of parquet
files. However, it doesn't return a non-ok status in
ParquetMetadataUtils::ValidateColumn if abort_on_error is set to false.
So we continue to use the illegal file schema and hit the DCHECK.

This patch fixes this and adds test coverage for illegal decimal
schemas.

Tests:
 - Add a bad parquet file with illegal decimal schemas.
 - Add e2e tests on the file.
 - Ran test_fuzz_decimal_tbl 100 times. Saw the errors are caught as
   expected.

Change-Id: I623f255a7f40be57bfa4ade98827842cee6f1fee
Reviewed-on: http://gerrit.cloudera.org:8080/17748
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-08-05 07:55:26 +00:00
Attila Jeges
fabe994d1f IMPALA-10627: Use standard parquet-related Iceberg table properties
This patch adds support for the following standard Iceberg properties:

write.parquet.compression-codec:
  Parquet compression codec. Supported values are: NONE, GZIP, SNAPPY
  (default value), LZ4, ZSTD. The table property will be ignored if
  COMPRESSION_CODEC query option is set.

write.parquet.compression-level:
  Parquet compression level. Used with ZSTD compression only.
  Supported range is [1, 22]. Default value is 3. The table property
  will be ignored if COMPRESSION_CODEC query option is set.

write.parquet.row-group-size-bytes:
  Parquet row group size in bytes. Supported range is [8388608,
  2146435072] (8MB - 2047MB). The table property will be ignored if
  PARQUET_FILE_SIZE query option is set.
  If neither the table property nor the PARQUET_FILE_SIZE query option
  is set, the way Impala calculates row group size will remain
  unchanged.

write.parquet.page-size-bytes:
  Parquet page size in bytes. Used for PLAIN encoding. Supported range
  is [65536, 1073741824] (64KB - 1GB).
  If the table property is unset, the way Impala calculates page size
  will remain unchanged.

write.parquet.dict-size-bytes:
  Parquet dictionary page size in bytes. Used for dictionary encoding.
  Supported range is [65536, 1073741824] (64KB - 1GB).
  If the table property is unset, the way Impala calculates dictionary
  page size will remain unchanged.

This patch also renames 'iceberg.file_format' table property to
'write.format.default' which is the standard Iceberg name for the
table property.
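
A DDL sketch combining these properties (table and column names are
hypothetical; the values are within the documented ranges):

CREATE TABLE ice_t (i INT, s STRING)
STORED AS ICEBERG
TBLPROPERTIES ('write.format.default'='parquet',
               'write.parquet.compression-codec'='ZSTD',
               'write.parquet.compression-level'='5');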

Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23
Reviewed-on: http://gerrit.cloudera.org:8080/17654
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-07-20 23:58:06 +00:00
Zoltan Borok-Nagy
d0749d59de IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Currently we have a DDL syntax for defining Iceberg partitions that
differs from SparkSQL:
https://iceberg.apache.org/spark-ddl/#partitioned-by

E.g. Impala is using the following syntax:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR)
STORED AS ICEBERG;

The same in Spark is:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
USING ICEBERG
PARTITIONED BY (bucket(5, i), months(ts), years(d))

HIVE-25179 added the following syntax for Hive:

CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d))
STORED BY ICEBERG;

I.e. the same syntax as Spark, but adding the keyword "SPEC".

This patch makes Impala use Hive's syntax, i.e. we will also
use the PARTITIONED BY SPEC clause + the unified partition
transform syntax.

Testing:
 * existing tests has been rewritten with the new syntax

Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Reviewed-on: http://gerrit.cloudera.org:8080/17575
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-07-15 15:15:07 +00:00
Zoltan Borok-Nagy
ced7b7d221 IMPALA-10485: Support Iceberg field-id based column resolution in the ORC scanner
Currently the ORC scanner only supports position-based column
resolution. This patch adds Iceberg field-id based column resolution
which will be the default for Iceberg tables. It is needed to support
schema evolution in the future, i.e. ALTER TABLE DROP/RENAME COLUMNS.
(The Parquet scanner already supports Iceberg field-id based column
resolution)

Testing
 * added e2e test 'iceberg-orc-field-id.test' by copying the contents of
   nested-types-scanner-basic,
   nested-types-scanner-array-materialization,
   nested-types-scanner-position,
   nested-types-scanner-maps,
   and executing the queries on an Iceberg table with ORC data files

Change-Id: Ia2b1abcc25ad2268aa96dff032328e8951dbfb9d
Reviewed-on: http://gerrit.cloudera.org:8080/17398
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-05-20 19:19:50 +00:00
Zoltan Borok-Nagy
f0f083e45e IMPALA-10482, IMPALA-10493: Fix bugs in full ACID collection query rewrites
IMPALA-10482: SELECT * query on unrelative collection column of
transactional ORC table will hit IllegalStateException.

The AcidRewriter will rewrite queries like
"select item from my_complex_orc.int_array" to
"select item from my_complex_orc t, t.int_array"

This causes trouble in star expansion. Because the original query
"select * from my_complex_orc.int_array" is analyzed as
"select item from my_complex_orc.int_array"

But the rewritten query "select * from my_complex_orc t, t.int_array" is
analyzed as "select id, item from my_complex_orc t, t.int_array".

Hidden table refs can also cause issues during regular column
resolution. E.g. when the table has top-level 'pos'/'item'/'key'/'value'
columns.

The workaround is to keep track of the automatically added table refs
during query rewrite. So when we analyze the rewritten query we can
ignore these auxiliary table refs.

IMPALA-10493: Using JOIN ON syntax to join two full ACID collections
produces wrong results.

When AcidRewriter.splitCollectionRef() creates a new collection ref,
it doesn't copy all the information needed to correctly execute the
query. E.g. it dropped the ON clause, turning INNER joins into CROSS
joins.

Testing:
 * added e2e tests

Change-Id: I8fc758d3c1e75c7066936d590aec8bff8d2b00b0
Reviewed-on: http://gerrit.cloudera.org:8080/17038
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-05-03 20:42:30 +00:00
Tamas Mate
6b16df9e9a IMPALA-9732: Improve exceptions of unsupported HdfsTableSink formats
This change updates the exception that is thrown when the user tries to
insert into a partition which has an unsupported format. The information
to make this decision is available during analysis, therefore this
commit also moves the check from the planner to the analyzer to report
the error earlier.

In the analyzer only FeFsTables have to be checked, therefore Kudu
tables are not affected. Also, there is a difference between static and
dynamic partition clauses: for static partition clauses the partition
format is available at compile time, while for dynamic partition clauses
it is only available at runtime.

Testing:
 - Added unit tests
 - Ran exhaustive tests successfully

Change-Id: I7fa2f949336a422acb4d01c9347b9b2e808e4aec
Reviewed-on: http://gerrit.cloudera.org:8080/17300
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-04-13 17:59:21 +00:00
stiga-huang
e8720b40f1 IMPALA-2019(Part-1): Provide UTF-8 support in length, substring and reverse functions
A unicode character can be encoded into 1-4 bytes in UTF-8. String
functions will return undesired results when the input contains unicode
characters, because we deal with a string as a byte array. For instance,
length() returns the length in bytes, not in unicode characters.

UTF-8 is the dominant unicode encoding used in the Hadoop ecosystem.
This patch adds UTF-8 support in some string functions so they can have
UTF-8 aware behavior. For compatibility with the old versions, a new
query option, UTF8_MODE, is added for turning on/off the UTF-8 aware
behavior. Currently, only length(), substring() and reverse() support
it. Support for other functions will be added in later patches.

String functions will check the query option and switch to use the
desired implementation. It's similar to how we use the decimal_v2 query
option in builtin functions.

For easy testing, the UTF-8 aware versions of string functions are
also exposed as builtin functions (named utf8_*, e.g. utf8_length).
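
A usage sketch (the literal is just an example containing two 2-byte
UTF-8 characters):

SET UTF8_MODE=true;
SELECT length('áé');       -- counts unicode characters: 2
SET UTF8_MODE=false;
SELECT length('áé');       -- counts bytes: 4
SELECT utf8_length('áé');  -- UTF-8 aware builtin, independent of the option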

Tests:
 - Add BE tests for utf8 functions.
 - Add e2e tests for the UTF8_MODE query option.

Change-Id: I0aaf3544e89f8a3d531ad6afe056b3658b525b7c
Reviewed-on: http://gerrit.cloudera.org:8080/16908
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-26 00:43:39 +00:00
Zoltan Borok-Nagy
90f3b2f491 IMPALA-10432: INSERT INTO Iceberg tables with partition transforms
This patch adds support for INSERT INTO Iceberg tables that use
partition transforms. Partition
transforms are functions that calculate partition data from row data.

There are the following partition transforms in Iceberg:
https://iceberg.apache.org/spec/#partition-transforms

 * IDENTITY
 * BUCKET
 * TRUNCATE
 * YEAR
 * MONTH
 * DAY
 * HOUR

INSERT INTO identity-partitioned Iceberg tables are already supported.
This patch adds support for the rest of the transforms.

We create the partitioning expressions in InsertStmt. Based on these
expressions data are automatically shuffled and sorted by the backend
executors before rows are given to the table sink operators. The table
sink operator writes the partitions one-by-one and creates a
human-readable partition path for them.

In the end, we will convert the partition path to partition data and
create Iceberg DataFiles with information about the files written.

Testing:
 * added planner test
 * added e2e tests

Change-Id: I3edf02048cea78703837b248c55219c22d512b78
Reviewed-on: http://gerrit.cloudera.org:8080/16939
Reviewed-by: wangsheng <skyyws@163.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-18 18:46:42 +00:00
Zoltan Borok-Nagy
296ed74d6f IMPALA-10380: INSERT INTO Iceberg tables with 'IDENTITY' partitions only
This patch adds support to INSERT INTO identity-partitioned
Iceberg tables.

Identity-partitioned Iceberg tables are similar to regular
partitioned tables, they are even stored in the same directory
structure. The difference is that the data files still store
the partitioning columns.

The INSERT INTO syntax is similar to the syntax for non-partitioned
tables, i.e.:

INSERT INTO <iceberg_tbl> VALUES (<val1>, <val2>, <val3>, ...);
Or,
INSERT INTO <iceberg_tbl> SELECT <val1>, <val2>, ... FROM <source_tbl>
(please note that we don't use the PARTITION keyword)

The values must be in column order corresponding to the table schema.
Impala will automatically create/find the partitions based on the
Iceberg partition spec.

Partitioned Iceberg tables are stored as non-partitioned tables
in the Hive Metastore (similarly to partitioned Kudu tables). However,
the InsertStmt still generates the partition expressions for them.
These partition expressions are used to shuffle and sort the input
data so we don't end up writing too many files. The HdfsTableSink
also uses the partition expressions to write the data files with
the proper partition paths.

Iceberg is able to parse the partition paths to generate the
corresponding metadata for the partitions. This happens at the
end in IcebergCatalogOpExecutor.

Testing:
 * added planner test to verify shuffling and sorting
 * added negative tests for unsupported features like PARTITION clause
   and non-identity partition transforms
 * e2e tests with partitioned inserts

Change-Id: If98797a2bfdc038d0467c8f83aadf1a12e1d69d4
Reviewed-on: http://gerrit.cloudera.org:8080/16825
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-17 08:54:51 +00:00
skyyws
a850cd3cc6 IMPALA-10361: Use field id to resolve columns for Iceberg tables
This patch adds support for resolving columns by field id for Iceberg
tables. Since we now use field ids to resolve columns for Iceberg
tables, the 'PARQUET_FALLBACK_SCHEMA_RESOLUTION' query option has no
effect on them.
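
For illustration (hypothetical table name; behavior as described above):

  SET PARQUET_FALLBACK_SCHEMA_RESOLUTION=NAME;  -- no effect on Iceberg tables
  SELECT * FROM ice_tbl;  -- columns are matched to the data files by field id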

Change-Id: I057bdc6ab2859cc4d40de5ed428d0c20028b8435
Reviewed-on: http://gerrit.cloudera.org:8080/16788
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
2020-12-10 19:01:08 +00:00
Aman Sinha
b5ba793227 IMPALA-10360: Allow simple limit to be treated as sampling hint
As a follow-up to IMPALA-10314, it is sometimes useful to consider
a simple limit as a way to sample from a table if a relevant hint
has been provided. Doing a sample instead of pure limit serves
dual purposes: (a) it still helps with reducing the planning time
since the scan ranges need to be computed only for the sample files,
(b) it allows a sufficient number of files/rows to be read from
the table such that after applying filter conditions or joins with
another table, the query may still produce the N rows needed for
limit.

This functionality is especially useful if the query is against a
view. Note that the TABLESAMPLE clause cannot be applied to a view, and
embedding a TABLESAMPLE explicitly on a table within a view will
not work because we don't want to sample if there's no limit.

In this patch, a new table level hint, 'convert_limit_to_sample(n)'
is added. If this hint is attached to a table either in the main
query block or within a view/subquery and simple limit optimization
conditions are satisfied (according to IMPALA-10314), the limit
is converted to a table sample. The parameter 'n' in parentheses is
required and specifies the sample percentage. It must be an integer
between 1 and 100. For example:

 set optimize_simple_limit = true;
 CREATE VIEW v1 as SELECT * FROM T [convert_limit_to_sample(5)]
    WHERE [always_true] <predicate>;
 SELECT * FROM v1 LIMIT 10;

In this case, the limit 10 is applied on top of a 5 percent sample
of T which is applied after partition pruning.

Testing:
 - Added an alltypes_date_partition_2 table where the date and
   timestamp values match (this helps with setting the
   'always_true' hint).
 - Added views with 'convert_limit_to_sample' and 'always_true'
   hints and added new tests against the views. Modified a few
   existing tests to reference the new table variant.
 - Added an end-to-end test.

Change-Id: Ife05a5343c913006f7659949b327b63d3f10c04b
Reviewed-on: http://gerrit.cloudera.org:8080/16792
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-10 07:15:36 +00:00
skyyws
0c0985a825 IMPALA-10159: Supporting ORC file format for Iceberg table
This patch adds support for querying Iceberg tables with the ORC
file format. We can use the following SQL to create a table with the
ORC file format:
  CREATE TABLE default.iceberg_test (
    level string,
    event_time timestamp,
    message string
  )
  STORED AS ICEBERG
  LOCATION 'hdfs://xxx'
  TBLPROPERTIES ('iceberg.file_format'='orc', 'iceberg.catalog'='hadoop.tables');
Note that there are still some problems when scanning ORC files with
TIMESTAMP columns; for more details please refer to IMPALA-9967. We may
add new tests with the TIMESTAMP type after that JIRA is fixed.

Testing:
- Create table tests in functional_schema_template.sql
- Iceberg table create test in test_iceberg.py
- Iceberg table query test in test_scanners.py

Change-Id: Ib579461aa57348c9893a6d26a003a0d812346c4d
Reviewed-on: http://gerrit.cloudera.org:8080/16568
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-10-14 19:19:19 +00:00
skyyws
5912c47617 IMPALA-10221: Rename 'iceberg_file_format' to 'iceberg.file_format' as Iceberg table property
IMPALA-10164 introduced several new table properties, such as
'iceberg.catalog'. To keep these properties consistent, we rename
'iceberg_file_format' to 'iceberg.file_format'. When creating an
Iceberg table, we should use SQL like this:
  CREATE TABLE default.iceberg_test (
    level string,
    event_time timestamp,
    message string
  )
  STORED AS ICEBERG
  TBLPROPERTIES ('iceberg.file_format'='parquet',
    'iceberg.catalog'='hadoop.tables')

Change-Id: I722303fb765aca0f97a79bd6e4504765d355a623
Reviewed-on: http://gerrit.cloudera.org:8080/16550
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-10-06 16:58:04 +00:00
skyyws
5b720a4d18 IMPALA-10164: Supporting HadoopCatalog for Iceberg table
This patch adds support for creating Iceberg tables with HadoopCatalog.
We only supported the HadoopTables API before this patch, but now we can
also use HadoopCatalog to create Iceberg tables. When creating a managed
table, we can use SQL like this:
  CREATE TABLE default.iceberg_test (
    level string,
    event_time timestamp,
    message string
  )
  STORED AS ICEBERG
  TBLPROPERTIES ('iceberg.catalog'='hadoop.catalog',
    'iceberg.catalog_location'='hdfs://test-warehouse/iceberg_test');
We now support two values ('hadoop.catalog', 'hadoop.tables') for
'iceberg.catalog'. If you don't specify this property in your SQL, the
default catalog type is 'hadoop.catalog'.
As for external Iceberg tables, you can use SQL like this:
  CREATE EXTERNAL TABLE default.iceberg_test_external
  STORED AS ICEBERG
  TBLPROPERTIES ('iceberg.catalog'='hadoop.catalog',
    'iceberg.catalog_location'='hdfs://test-warehouse/iceberg_test',
    'iceberg.table_identifier'='default.iceberg_test');
We cannot set the table location for either managed or external Iceberg
tables with 'hadoop.catalog', and 'SHOW CREATE TABLE' will not display
the table location yet. We need to use 'DESCRIBE FORMATTED/EXTENDED' to
get this location info.
'iceberg.catalog_location' is required for 'hadoop.catalog' tables; it
is the location that stores the Iceberg table metadata and data, and we
use it to load the table metadata from Iceberg.
'iceberg.table_identifier' is used for the Iceberg TableIdentifier. If
this property is not specified in the SQL, Impala will use the database
and table name to load the Iceberg table, which is
'default.iceberg_test_external' in the above SQL. This property value is
split by '.', so you can also set a value like 'org.my_db.my_tbl'. This
property is valid for both managed and external tables.

Testing:
- Create table tests in functional_schema_template.sql
- Iceberg table create test in test_iceberg.py
- Iceberg table query test in test_scanners.py
- Iceberg table show create table test in test_show_create_table.py

Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef
Reviewed-on: http://gerrit.cloudera.org:8080/16446
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-10-01 13:54:48 +00:00
skyyws
fb6d96e001 IMPALA-9741: Support querying Iceberg table by impala
This patch adds support for querying Iceberg tables through Impala;
we can use the following SQL to create an external Iceberg table:
    CREATE EXTERNAL TABLE default.iceberg_test (
        level string,
        event_time timestamp,
        message string
    )
    STORED AS ICEBERG
    LOCATION 'hdfs://xxx'
    TBLPROPERTIES ('iceberg_file_format'='parquet');
Or just including table name and location like this:
    CREATE EXTERNAL TABLE default.iceberg_test
    STORED AS ICEBERG
    LOCATION 'hdfs://xxx'
    TBLPROPERTIES ('iceberg_file_format'='parquet');
'iceberg_file_format' is the file format used in Iceberg; currently only
PARQUET is supported, and other formats will be supported in the future.
If you don't specify this property in your SQL, the default file format
is PARQUET.

We achieve this by treating the Iceberg table as a normal
unpartitioned HDFS table. When querying an Iceberg table, we push down
partition column predicates to Iceberg to decide which data files
need to be scanned, and then transfer this information to the BE to
do the real scan operation.

Testing:
- Unit test for Iceberg in FileMetadataLoaderTest
- Create table tests in functional_schema_template.sql
- Iceberg table query test in test_scanners.py

Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Reviewed-on: http://gerrit.cloudera.org:8080/16143
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-09-06 02:12:07 +00:00
Aman Sinha
5e9f10d34c IMPALA-10064: Support constant propagation for eligible range predicates
This patch adds support for constant propagation of range predicates
involving date and timestamp constants. Previously, only equality
predicates were considered for propagation. The new type of propagation
is shown by the following example:

Before constant propagation:
 WHERE date_col = CAST(timestamp_col as DATE)
  AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01'
After constant propagation:
 WHERE date_col >= '2019-01-01' AND date_col <= '2020-01-01'
  AND timestamp_col >= '2019-01-01' AND timestamp_col <= '2020-01-01'
  AND date_col = CAST(timestamp_col as DATE)

As a consequence, since Impala supports table partitioning by date
columns but not timestamp columns, the above propagation enables
partition pruning based on timestamp ranges.
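
A sketch of the effect on a full query (hypothetical table 'sales',
partitioned by date_col):

  SELECT count(*)
  FROM sales
  WHERE date_col = CAST(timestamp_col AS DATE)
    AND timestamp_col BETWEEN '2019-01-01' AND '2020-01-01';
  -- The derived range date_col >= '2019-01-01' AND date_col <= '2020-01-01'
  -- lets the planner prune partitions outside that date range.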

Existing code for equality based constant propagation was refactored
and consolidated into a new class which handles both equality and
range based constant propagation. Range based propagation is only
applied to date and timestamp columns.

Testing:
 - Added new range constant propagation tests to PlannerTest.
 - Added e2e test for range constant propagation based on a newly
   added date partitioned table.
 - Ran precommit tests.

Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Reviewed-on: http://gerrit.cloudera.org:8080/16346
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-09-02 22:57:55 +00:00
Zoltan Borok-Nagy
da34d34a42 IMPALA-9859: Full ACID Milestone 4: Part 2 Reading modified tables (complex types)
This implements scanning full ACID tables that contain complex types.
The same technique works that we use for primitive types. I.e. we add
a LEFT ANTI JOIN on top of the Hdfs scan node in order to subtract
the deleted rows from the inserted rows.

However, there were some types of queries where we couldn't do that.
These are the queries that scan the nested collection items directly.

E.g.: SELECT item FROM complextypestbl.int_array;

The above query only creates a single tuple descriptor that holds the
collection items. Since this tuple descriptor is not at the table-level,
we cannot add slot references to the hidden ACID columns, which are at the
top level of the table schema.

To resolve this I added a statement rewriter that rewrites the above
statement to the following:

  SELECT item FROM complextypestbl $a$1, $a$1.int_array;

Now in this example we'll have two tuple descriptors, one for the
table-level, and one for the collection item. So we can add the ACID
slot refs to the table-level tuple descriptor. The rewrite is
implemented by the new AcidRewriter class.

Performance

I executed the following query with num_nodes=1 on a non-transactional
table (without the rewrite), and on an ACID table (with the rewrite):

  select count(*) from customer_nested.c_orders.o_lineitems;

Without the rewrite:
Fetched 1 row(s) in 0.41s
+--------------+--------+-------+----------+----------+-------+------------+----------+---------------+---------------------------------------------------+
| Operator     | #Hosts | #Inst | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem | Est. Peak Mem | Detail                                            |
+--------------+--------+-------+----------+----------+-------+------------+----------+---------------+---------------------------------------------------+
| F00:ROOT     | 1      | 1     | 13.61us  | 13.61us  |       |            | 0 B      | 0 B           |                                                   |
| 01:AGGREGATE | 1      | 1     | 3.68ms   | 3.68ms   | 1     | 1          | 16.00 KB | 10.00 MB      | FINALIZE                                          |
| 00:SCAN HDFS | 1      | 1     | 280.47ms | 280.47ms | 6.00M | 15.00M     | 56.98 MB | 8.00 MB       | tpch_nested_orc_def.customer.c_orders.o_lineitems |
+--------------+--------+-------+----------+----------+-------+------------+----------+---------------+---------------------------------------------------+

With the rewrite:
Fetched 1 row(s) in 0.42s
+---------------------------+--------+-------+----------+----------+---------+------------+----------+---------------+---------------------------------------+
| Operator                  | #Hosts | #Inst | Avg Time | Max Time | #Rows   | Est. #Rows | Peak Mem | Est. Peak Mem | Detail                                |
+---------------------------+--------+-------+----------+----------+---------+------------+----------+---------------+---------------------------------------+
| F00:ROOT                  | 1      | 1     | 25.16us  | 25.16us  |         |            | 0 B      | 0 B           |                                       |
| 05:AGGREGATE              | 1      | 1     | 3.44ms   | 3.44ms   | 1       | 1          | 63.00 KB | 10.00 MB      | FINALIZE                              |
| 01:SUBPLAN                | 1      | 1     | 16.52ms  | 16.52ms  | 6.00M   | 125.92M    | 47.00 KB | 0 B           |                                       |
| |--04:NESTED LOOP JOIN    | 1      | 1     | 188.47ms | 188.47ms | 0       | 10         | 24.00 KB | 12 B          | CROSS JOIN                            |
| |  |--02:SINGULAR ROW SRC | 1      | 1     | 0ns      | 0ns      | 0       | 1          | 0 B      | 0 B           |                                       |
| |  03:UNNEST              | 1      | 1     | 25.37ms  | 25.37ms  | 0       | 10         | 0 B      | 0 B           | $a$1.c_orders.o_lineitems o_lineitems |
| 00:SCAN HDFS              | 1      | 1     | 96.26ms  | 96.26ms  | 100.00K | 12.59M     | 38.19 MB | 72.00 MB      | default.customer_nested $a$1          |
+---------------------------+--------+-------+----------+----------+---------+------------+----------+---------------+---------------------------------------+

So the overhead is very small.

Testing
* Added planner tests to PlannerTest/acid-scans.test
* E2E query tests to QueryTest/full-acid-complex-type-scans.test
* E2E tests for rowid-generation: QueryTest/full-acid-rowid.test

Change-Id: I8b2c6cd3d87c452c5b96a913b14c90ada78d4c6f
Reviewed-on: http://gerrit.cloudera.org:8080/16228
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
2020-08-12 17:45:50 +00:00
Zoltan Borok-Nagy
f602c3f80f IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Hive ACID supports row-level DELETE and UPDATE operations on a table.
It achieves this by assigning a unique row-id to each row and by
maintaining two sets of files in a table. The first set is in the
base/delta directories; they contain the INSERTed rows. The second set
of files is in the delete-delta directories; they contain the DELETEd
rows.

(UPDATE operations are implemented via DELETE+INSERT.)

In the filesystem it looks like e.g.:
 * full_acid/delta_0000001_0000001_0000/0000_0
 * full_acid/delta_0000002_0000002_0000/0000_0
 * full_acid/delete_delta_0000003_0000003_0000/0000_0

During scanning we need to return INSERTed rows minus DELETEd rows.
This patch implements it by creating an ANTI JOIN between the INSERT and
DELETE sets. It is a planner-only modification. Every HDFS SCAN
that scans full ACID tables (that also have deleted rows) is converted
to two HDFS SCANs, one for the INSERT deltas, and one for the DELETE
deltas. Then a LEFT ANTI HASH JOIN with BROADCAST distribution mode is
created above them.
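
Conceptually the resulting plan corresponds to the following pseudo-SQL
(the two delta inputs are not user-visible tables; the join keys mirror
the hidden ACID columns):

  SELECT ins.*
  FROM acid_tbl_insert_deltas ins
  LEFT ANTI JOIN acid_tbl_delete_deltas del
    ON  ins.originalTransaction = del.originalTransaction
    AND ins.bucket = del.bucket
    AND ins.rowid = del.rowid;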

Later we can add support for other distribution modes if the performance
requires it. E.g. if we have too many deleted rows then probably we are
better off with PARTITIONED distribution mode. We could estimate the
number of deleted rows by sampling the delete delta files.

The current patch only works for primitive types. I.e. we cannot select
nested data if the table has deleted rows.

Testing:
 * added planner test
 * added e2e tests

Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659
Reviewed-on: http://gerrit.cloudera.org:8080/16082
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-07-14 12:53:51 +00:00
Zoltan Borok-Nagy
930264afbd IMPALA-9515: Full ACID Milestone 3: Read support for "original files"
"Original files" are files that don't have full ACID schema. We can see
such files if we upgrade a non-ACID table to full ACID. Also, the LOAD
DATA statement can load non-ACID files into full ACID tables. So such
files don't store special ACID columns, that means we need
to auto-generate their values. These are (operation,
originalTransaction, bucket, rowid, and currentTransaction).

With the exception of 'rowid', all of them can be calculated based on
the file path, so I add their values to the scanner's template tuple.

'rowid' is the ordinal number of the row inside a bucket inside a
directory. For now Impala only allows one file per bucket per
directory. Therefore we can generate row ids for each file
independently.

Multiple files in a single bucket in a directory can only be present if
the table was non-transactional earlier and we upgraded it to a full
ACID table. After the first compaction we should only see one original file
per bucket per directory.

In HdfsOrcScanner we calculate the first row id for our split then
the OrcStructReader fills the rowid slot with the proper values.

Testing:
 * added e2e tests to check if the generated values are correct
 * added e2e test to reject tables that have multiple files per bucket
 * added unit tests to the new auxiliary functions

Change-Id: I176497ef9873ed7589bd3dee07d048a42dfad953
Reviewed-on: http://gerrit.cloudera.org:8080/16001
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-06-29 21:00:05 +00:00
wzhou-code
c7ce4fa109 IMPALA-9691: Support Kudu Timestamp and Date bloom filter
Impala saves timestamps as the 12-byte TimestampValue structure with
time in nanoseconds. Kudu stores timestamps as 8-byte Unix time in
microseconds. To avoid a data truncation issue in the bloom filter,
this patch adds a FunctionCallExpr with 'utc_to_unix_micros' as the root
of the bloom filter's source expression, converting timestamp values to
microseconds when building a timestamp bloom filter for Kudu.
Generated the functional date_tbl table in Kudu format for unit tests.
Added new test cases for Kudu Timestamp and Date bloom filters.
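
For illustration, the conversion relies on the existing builtin named
above (the timestamp literal is arbitrary):

  SELECT utc_to_unix_micros(CAST('2020-06-26 00:00:00' AS TIMESTAMP));
  -- returns Unix time in microseconds, matching Kudu's 8-byte representation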

Testing:
Passed all core tests.

Change-Id: I3c1e9bcc9fd6d79a39f25eaa3396188fc0a52a48
Reviewed-on: http://gerrit.cloudera.org:8080/16094
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-06-26 06:56:16 +00:00
Joe McDonnell
f15a311065 IMPALA-9709: Remove Impala-lzo from the development environment
This removes Impala-lzo from the Impala development environment.
Impala-lzo is not built as part of the Impala build. The LZO plugin
is no longer loaded. LZO tables are not loaded during dataload,
and LZO is no longer tested.

This removes some obsolete scan APIs that were only used by Impala-lzo.
With this commit, Impala-lzo would require code changes to build
against Impala.

The plugin infrastructure is not removed, and this leaves some
LZO support code in place. If someone were to decide to revive
Impala-lzo, they would still be able to load it as a plugin
and get the same functionality as before. This plugin support
may be removed later.

Testing:
 - Dryrun of GVO
 - Modified TestPartitionMetadataUncompressedTextOnly's
   test_unsupported_text_compression() to add LZO case

Change-Id: I3a4f12247d8872b7e14c9feb4b2c58cfd60d4c0e
Reviewed-on: http://gerrit.cloudera.org:8080/15814
Reviewed-by: Bikramjeet Vig <bikramjeet.vig@cloudera.com>
Tested-by: Joe McDonnell <joemcdonnell@cloudera.com>
2020-06-15 23:42:12 +00:00
xiaomeng
d45e3a50b0 IMPALA-9673: Add external warehouse dir variable in E2E test
Updated CDP build to 7.2.1.0-57 to include new Hive features such as
HIVE-22995.
In the minicluster, hive.create.as.acid and hive.create.as.insert.only
default to false, so by default Hive creates external-type tables
located in the external warehouse directory. Due to HIVE-22995,
'DESC DATABASE' returns the external warehouse directory.

For the above reasons, we need to use the external warehouse directory
in some tests.
Also add a new test for "CREATE DATABASE ... LOCATION".

Tested:
Re-run failed test in minicluster.
Run exhaustive tests.

Change-Id: I57926babf4caebfd365e6be65a399f12ea68687f
Reviewed-on: http://gerrit.cloudera.org:8080/15990
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-06-05 23:48:53 +00:00
Zoltan Borok-Nagy
f8015ff68d IMPALA-9512: Full ACID Milestone 2: Validate rows against the valid write id list
Minor compactions can compact several delta directories into a single
delta directory. The current directory filtering algorithm had to be
modified to handle minor compacted directories and prefer those over
plain delta directories. This happens in the Frontend, mostly in
AcidUtils.java.

Hive Streaming Ingestion writes similar delta directories, but they
might contain rows Impala cannot see based on its valid write id list.

E.g. we can have the following delta directory:

full_acid/delta_0000001_0000010/0000 # minWriteId: 1
                                     # maxWriteId: 10

This delta dir contains rows with write ids between 1 and 10. But maybe
we are only allowed to see write ids less than 5. Therefore we need to
check the ACID write id column (named originalTransaction) to determine
which rows are valid.

Delta directories written by Hive Streaming don't have a visibility txn
id, so we can recognize them based on the directory name. If there's
a visibilityTxnId and it is committed => every row is valid:

full_acid/delta_0000001_0000010_v01234 # has visibilityTxnId
                                       # every row is valid

If there's no visibilityTxnId then it was created via Hive Streaming,
therefore we need to validate rows. Fortunately Hive Streaming writes
rows with different write ids into different ORC stripes, therefore we
don't need to validate the write id per row. If we had statistics,
we could validate per stripe, but since Hive Streaming doesn't write
statistics we validate the write id per ORC row batch (an alternative
could be to do a 2-pass read, first we'd read a single value from each
stripe's 'currentTransaction' field, then we'd read the stripe if the
write id is valid).

Testing
 * the frontend logic is tested in AcidUtilsTest
 * the backend row validation is tested in test_acid_row_validation

Change-Id: I5ed74585a2d73ebbcee763b0545be4412926299d
Reviewed-on: http://gerrit.cloudera.org:8080/15818
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-05-20 21:00:44 +00:00
Adam Tamas
c32849a391 IMPALA-8980: Remove functional*.alltypesinsert from EE tests
-Modified the ‘test_insert.py’ so the tests can run in parallel.
  -Every test will create its own temporary tables for insert testing.
-Swapped out the SETUP tags for TRUNCATE TABLE QUERY statements.
  -Because the SETUP tag is not used anymore, the corresponding
  code was removed.
-Fixed a test query in ‘insert.test’. The test was incorrect, so it was
modified to test for the right behavior.

Testing:
-tests/run-tests.py query_test/test_insert.py
-impala-py.test tests/query_test/test_insert.py
-the same for test_insert_permutation.py and test_load.py

Change-Id: I257e936868917a2fcc6c030f6c855b247e8a0eea
Reviewed-on: http://gerrit.cloudera.org:8080/15529
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-04-14 12:18:21 +00:00