Commit Graph

237 Commits

Author SHA1 Message Date
Daniel Becker
b05b408f17 IMPALA-13247: Support Reading Puffin files for the current snapshot
This change adds support for reading NDV statistics from Puffin files
when they are available for the current snapshot. Puffin files or blobs
that were written for other snapshots than the current one are ignored.
Because this behaviour is different from what we have for HMS stats and
may therefore be unintuitive for users, reading Puffin stats is disabled
by default; set the "--disable_reading_puffin_stats" startup flag to
false to enable it.

When Puffin stats reading is enabled, the NDV values read from Puffin
files take precedence over NDV values stored in the HMS. This is because
we only read Puffin stats for the current snapshot, so these values are
always up-to-date, while the values in the HMS may be stale.

Note that it is currently not possible to drop Puffin stats from Impala.
For this reason, this patch also introduces two ways of disabling the
reading of Puffin stats:
  - globally, with the aforementioned "--disable_reading_puffin_stats"
    startup flag: when it is set to true, Impala will never read Puffin
    stats
  - for specific tables, by setting the
    "impala.iceberg_disable_reading_puffin_stats" table property to
    true.

Note that this change is only about reading Puffin files; Impala does
not yet support writing them.
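
A minimal sketch of the per-table switch (the table name is
illustrative; the property name comes from this change):

  ALTER TABLE ice_tbl SET TBLPROPERTIES (
    'impala.iceberg_disable_reading_puffin_stats'='true');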

Testing:
 - created the PuffinDataGenerator tool which can generate Puffin files
   and metadata.json files for different scenarios (e.g. all stats are
   in the same Puffin file; stats for different columns are in different
   Puffin files; some Puffin files are corrupt etc.). The generated
   files are under the "testdata/ice_puffin/generated" directory.
 - The new custom cluster test class
   'test_iceberg_with_puffin.py::TestIcebergTableWithPuffinStats' uses
   the generated data to test various scenarios.
 - Added custom cluster tests that test the
   'disable_reading_puffin_stats' startup flag.

Change-Id: I50c1228988960a686d08a9b2942e01e366678866
Reviewed-on: http://gerrit.cloudera.org:8080/21605
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-10-19 22:14:59 +00:00
Peter Rozsa
df3a38096e IMPALA-12861: Fix mixed file format listing for Iceberg tables
This change fixes file format information collection for Iceberg
tables. Previously, the file formats of all file descriptors were
collected from getSampledOrRawPartitions() in HdfsScanNode, even for
Iceberg tables; now the collection is extracted into a method that is
overridden in IcebergScanNode. Only the file formats of the file
descriptors that will actually be scanned are recorded, showing the
correct file formats for each SCAN node in the plans.

Tests:
 - Planner tests added for mixed file format table with partitioning.

Change-Id: Ifae900914a0d255f5a4d9b8539361247dfeaad7b
Reviewed-on: http://gerrit.cloudera.org:8080/21871
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-10-18 13:24:42 +00:00
stiga-huang
d2e495e83a IMPALA-13284: Loading test data on Apache Hive3
There are some failures in loading test data on Apache Hive 3.1.3:
 - STORED AS JSONFILE is not supported
 - STORED BY ICEBERG is not supported. Similarly, STORED BY ICEBERG
   STORED AS AVRO is not supported.
 - Missing the jar of iceberg-hive-runtime in CLASSPATH of HMS and Tez
   jobs.
 - Creating a table in Impala is not translated to an EXTERNAL table in HMS
 - Hive INSERT on insert-only tables failed in generating InsertEvents
   (HIVE-20067).

This patch fixes the syntax issues by using the old syntax of Apache Hive
3.1.3:
 - Convert STORED AS JSONFILE to ROW FORMAT SERDE
   'org.apache.hadoop.hive.serde2.JsonSerDe'
 - Convert STORED BY ICEBERG to STORED BY
   'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
 - Convert STORED BY ICEBERG STORED AS AVRO to the above one with
   tblproperties('write.format.default'='avro')
Most of the conversions are done in generate-schema-statements.py. One
exception is testdata/bin/load-dependent-tables.sql, for which we need
to generate a new file with the conversions applied when using it.
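
To illustrate the first conversion, a hypothetical table declared with
the newer shorthand would be emitted with the old-style SerDe clause
instead (the table itself is made up, not one of the dataload tables):

  -- rejected by Apache Hive 3.1.3:
  CREATE TABLE json_tbl (id INT, name STRING) STORED AS JSONFILE;
  -- old-style DDL emitted by the conversion instead:
  CREATE TABLE json_tbl (id INT, name STRING)
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.JsonSerDe';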

The missing jar of iceberg-hive-runtime is added into HIVE_AUX_JARS_PATH
in bin/impala-config.sh. Note that this is only needed by Apache Hive3
since CDP Hive3 has the jar of hive-iceberg-handler in its lib folder.

To fix the failure of InsertEvents, we add the patch of HIVE-20067 and
modify testdata/bin/patch_hive.sh to also recompile the submodule
standalone-metastore.

Modified some statements in
testdata/datasets/functional/functional_schema_template.sql to be more
reliable on retry.

Tests
 - Verified the testdata can be loaded in ubuntu-20.04-from-scratch

Change-Id: I8f52c91602da8822b0f46f19dc4111c7187ce400
Reviewed-on: http://gerrit.cloudera.org:8080/21657
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-08-20 07:01:21 +00:00
Csaba Ringhofer
a99de990b0 IMPALA-12370: Allow converting timestamps to UTC when writing to Kudu
Before this commit, only read support was implemented
(convert_kudu_utc_timestamps=true). This change adds write support:
if write_kudu_utc_timestamps=true, then timestamps are converted
from local time to UTC during DMLs to Kudu. In case of
ambiguous conversions (DST changes) the earlier possible UTC
timestamp is written.

All DMLs supported with Kudu tables are affected:
INSERT, UPSERT, UPDATE, DELETE

To be able to read back Kudu tables written by Impala correctly,
convert_kudu_utc_timestamps and write_kudu_utc_timestamps need to
have the same value. Having the same value in the two query options
is also critical for UPDATE/DELETE if the primary key contains a
timestamp column - these operations do a scan first (affected by
convert_kudu_utc_timestamps) and then use the keys from the scan to
select updated/deleted rows (affected by write_kudu_utc_timestamps).

The conversion is implemented by adding to_utc_timestamp() to inserted
timestamp expressions during planning. This allows doing the same
conversion during the pre-insert sorting and partitioning.
Read support is implemented differently - in that case the plan is not
changed and the scanner does the conversion.
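
A minimal usage sketch (the Kudu table and its columns are
illustrative); with both options set, the local-time literal below is
converted to UTC before being written to Kudu:

  SET convert_kudu_utc_timestamps=true;
  SET write_kudu_utc_timestamps=true;
  INSERT INTO kudu_tbl
  VALUES (1, CAST('2024-06-01 12:00:00' AS TIMESTAMP));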

Other changes:
- Before this change, verification of tests with TIMESTAMP results
  was skipped when the file format is Kudu. This shouldn't be
  necessary, so the skipping was removed.

Change-Id: Ibb4995a64e042e7bb261fcc6e6bf7ffce61e9bd1
Reviewed-on: http://gerrit.cloudera.org:8080/21492
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Peter Rozsa <prozsa@cloudera.com>
2024-06-19 10:51:56 +00:00
Daniel Becker
2e093bbc8a IMPALA-13085: Add warning and NULL out DECIMAL values in Iceberg metadata tables
DECIMAL values are not supported in Iceberg metadata tables, and Impala
hits a DCHECK and crashes if it encounters one.

Until this issue is properly fixed (see IMPALA-13080), this commit
introduces a temporary solution: DECIMAL values coming from Iceberg
metadata tables are NULLed out and a warning is issued.

Testing:
 - added a DECIMAL column to the 'iceberg_metadata_alltypes' test table,
   so querying the `files` metadata table will include a DECIMAL in the
   'readable_metrics' struct.

Change-Id: I0c8791805bc4fa2112e092e65366ca2815f3fa22
Reviewed-on: http://gerrit.cloudera.org:8080/21429
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-05-28 14:36:09 +00:00
Riza Suminto
d0237fbe47 IMPALA-8042: Assign BETWEEN selectivity for discrete-unique column
The Impala frontend cannot evaluate BETWEEN/NOT BETWEEN predicates
directly. It needs to transform a BetweenPredicate into a
CompoundPredicate consisting of an upper bound and a lower bound
BinaryPredicate through BetweenToCompoundRule.java. The BinaryPredicate
can then be pushed down or rewritten into another form by another
expression rewrite rule. However, the selectivity of the
BetweenPredicate or its derivatives remains unassigned and often
collapses with other unknown-selectivity predicates into a collective
selectivity equal to Expr.DEFAULT_SELECTIVITY (0.1).

This patch adds a narrow optimization of BetweenPredicate selectivity
when the following criteria are met:

1. The BetweenPredicate is bound to a slot reference of a single column
   of a table.
2. The column type is discrete, such as INTEGER or DATE.
3. The column stats are available.
4. The column is sufficiently unique based on available stats.
5. The BETWEEN/NOT BETWEEN predicate is in good form (lower bound value
   <= upper bound value).
6. The final calculated selectivity is less than or equal to
   Expr.DEFAULT_SELECTIVITY.

If these criteria are not met, the Planner reverts to the old behavior
of leaving the selectivity unassigned.

Since this patch only targets BetweenPredicate over a unique column,
the following query will still have the default scan selectivity (0.1):

select count(*) from tpch.customer c
where c.c_custkey >= 1234 and c.c_custkey <= 2345;

While the equivalent query written with a BETWEEN predicate will have a
lower scan selectivity:

select count(*) from tpch.customer c
where c.c_custkey between 1234 and 2345;

This patch calculates the BetweenPredicate selectivity during
transformation at BetweenToCompoundRule.java. The selectivity is
piggy-backed into the resulting CompoundPredicate and BinaryPredicate as
betweenSelectivity_ field, separate from the selectivity_ field.
Analyzer.getBoundPredicates() is modified to prioritize the derived
BinaryPredicate over ordinary BinaryPredicate in its return value to
prevent the derived BinaryPredicate from being eliminated by a matching
ordinary BinaryPredicate.

Testing:
- Add table functional_parquet.unique_with_nulls.
- Add FE tests in ExprCardinalityTest#testBetweenSelectivity,
  ExprCardinalityTest#testNotBetweenSelectivity, and
  PlannerTest#testScanCardinality.
- Pass core tests.

Change-Id: Ib349d97349d1ee99788645a66be1b81749684d10
Reviewed-on: http://gerrit.cloudera.org:8080/21377
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-05-24 12:02:28 +00:00
Daniel Becker
bbfba13ed4 IMPALA-13079: Add support for FLOAT/DOUBLE in Iceberg metadata tables
Until now, the float and double data types were not supported in Iceberg
metadata tables. This commit adds support for them.

Testing:
 - added a test table that contains all primitive types (except for
   decimal, which is still not supported), a struct, an array and a map
 - added a test query that queries the `files` metadata table of the
   above table - the 'readable_metrics' struct contains lower and upper
   bounds for all columns in the original table, with the original type

Change-Id: I2171c9aa9b6d2b634b8c511263b1610cb1d7cb29
Reviewed-on: http://gerrit.cloudera.org:8080/21425
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-05-15 20:51:33 +00:00
Daniel Becker
457ab9831a IMPALA-12973,IMPALA-11491,IMPALA-12651: Support BINARY nested in complex types in select list
Binary fields in complex types are currently not supported at all for
regular tables (an error is returned). For Iceberg metadata tables,
IMPALA-12899 added a temporary workaround to allow queries that contain
these fields to succeed by NULLing them out. This change adds support
for displaying them with base64 encoding for both regular and Iceberg
metadata tables.

Complex types are displayed in JSON format, so simply inserting the
bytes of the binary fields is not acceptable as it would produce invalid
JSON. Base64 is a widely used encoding that allows representing
arbitrary binary information using only a limited set of ASCII
characters.
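
A hypothetical illustration (the table and field names are made up): a
BINARY member nested in a struct shows up in the JSON output as a
base64 string.

  SELECT s FROM tbl_with_binary;
  -- a row whose s.b holds the bytes "hello" would be displayed as
  -- {"b":"aGVsbG8="}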

This change also adds support for top level binary columns in Iceberg
metadata tables. However, these are not base64 encoded but are returned
in raw byte format - this is consistent with how top level binary
columns from regular (non-metadata) tables are handled.

Testing:
 - added test queries in iceberg-metadata-tables.test referencing both
   nested and top level binary fields; also updated existing queries
 - moved relevant tests (queries extracting binary fields from within
   complex types) from nested-types-scanner-basic.test to a new
   binary-in-complex-type.test file and also added a query that selects
   the containing complex types; this new test file is run from
   test_scanners.py::TestBinaryInComplexType::\
     test_binary_in_complex_type
 - moved negative tests in AnalyzerTest.TestUnsupportedTypes() to
   AnalyzeStmtsTest.TestComplexTypesInSelectList() and converted them to
   positive tests (expecting success); a negative test already in
   AnalyzeStmtsTest.TestComplexTypesInSelectList() was also converted

Change-Id: I7b1d7fa332a901f05a46e0199e13fb841d2687c2
Reviewed-on: http://gerrit.cloudera.org:8080/21269
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
2024-04-26 13:18:54 +00:00
Zoltan Borok-Nagy
0334f83704 IMPALA-12810: Simplify IcebergDeleteNode and IcebergDeleteBuilder
Now that we have the DIRECTED distribution mode, some parts of
IcebergDeleteNode and IcebergDeleteBuilder became dead code. It is
time to simplify the above classes.

IcebergDeleteBuilder and KrpcDataStreamSender now also tolerate
NULL file paths, which are not an error in the hash join mode either.

Change-Id: I3ba02b33433990950b49628f11e732e01ed8a34d
Reviewed-on: http://gerrit.cloudera.org:8080/21258
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-11 21:27:22 +00:00
Csaba Ringhofer
8ff51fbf74 IMPALA-5323: Support BINARY columns in Kudu tables
The patch adds read and write support for BINARY columns in Kudu
tables.

Predicate push down is implemented, but is incomplete:
a constant binary argument will only be pushed down if
constant folding never encounters non-ASCII strings.
Examples:
 - cast(unhex(hex("aa")) as binary) can be pushed down
 - cast(hex(unhex("aa")) as binary) can't be pushed
   down as unhex("aa") is not ASCII (even though the
   final result is ASCII)
See IMPALA-10349 for more details on this limitation.

The patch also changes casting BINARY <-> STRING from a no-op
to calling an actual function. While this may add some small
overhead, it allows the backend to know whether an expression
returns STRING or BINARY.

Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Reviewed-on: http://gerrit.cloudera.org:8080/18868
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-10 16:17:15 +00:00
Gabor Kaszab
18b9c08c52 IMPALA-12600: Schema evolution with equality delete files
This patch adds test coverage for a table that has equality delete
files and also schema evolution, where the schema changes didn't affect
the primary key columns.
Note that partition evolution on tables with equality deletes is still
not supported.

Testing:
  - Added a new test table for this use-case and some E2E tests on that
    table.

Change-Id: I125f72bade5b79bad5aaa6b676d6afaf3ca98395
Reviewed-on: http://gerrit.cloudera.org:8080/21210
Reviewed-by: Gabor Kaszab <gaborkaszab@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-04-02 13:11:21 +00:00
Daniel Becker
72732da9d8 IMPALA-12609: Implement SHOW METADATA TABLES IN statement to list Iceberg Metadata tables
After this change, the new SHOW METADATA TABLES IN statement can be used
to list all the available metadata tables of an Iceberg table.

Note that in contrast to querying the contents of Iceberg metadata tables,
this does not require fully qualified paths, e.g. both
  SHOW METADATA TABLES IN functional_parquet.iceberg_query_metadata;
and
  USE functional_parquet;
  SHOW METADATA TABLES IN iceberg_query_metadata;
work.

The available metadata tables for all Iceberg tables are the same,
corresponding to the values of the enum
"org.apache.iceberg.MetadataTableType", so there is actually no need to
pass the name of the regular table for which the metadata table list is
requested through Thrift. This change, however, does send the table name
because this way
 - if we add support for metadata tables for other table formats, the
   table name/path will be necessary to determine the correct list of
   metadata tables
 - we could later add support for different authorisation policies for
   individual tables
 - we can check also at the point of generating the list of metadata
   tables that the table is an Iceberg table

Testing:
 - added and updated tests in ParserTest, AnalyzeDDLTest, ToSqlTest and
   AuthorizationStmtTest
 - added a custom cluster test in test_authorization.py
 - added functional tests in iceberg-metadata-tables.test

Change-Id: Ide10ccf10fc0abf5c270119ba7092c67e712ec49
Reviewed-on: http://gerrit.cloudera.org:8080/21026
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
2024-04-02 09:58:37 +00:00
jasonmfehr
5835c9b994 IMPALA-12913: Refactor Workload Management Custom Cluster Tests
The custom cluster tests that verify the workload
management functionality of inserting completed queries into
the impala_query_log table were inefficient because they
created their own database tables and added data to those
tables.

This patch updates these tests to use the existing tables
in the functional database where possible. The few tests
that need their own tables now have those tables set up in
a database created by the pytest unique_database fixture
instead of using the default database.

A new table has also been added to the functional database.
This table is named zipcode_timezones and contains two
columns, the first holding a few zipcodes and the second
their corresponding timezones. This table can be used
to join the zipcode_incomes and alltimezones tables, as
sketched below. The table is populated by a new CSV file in
the testdata directory.
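
A hypothetical example of such a join (the column names are
illustrative, as this message only describes the table's shape):

  SELECT i.zip, i.income, t.timezone
  FROM functional.zipcode_incomes i
  JOIN functional.zipcode_timezones t ON i.zip = t.zip;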

Change-Id: I1e3249a8f306cf43de0d6f6586711c779399e83b
Reviewed-on: http://gerrit.cloudera.org:8080/21153
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-27 04:46:37 +00:00
Gabor Kaszab
ada4090e09 IMPALA-12894: (part 1) Turn off the count(*) optimisation for V2 Iceberg tables
This is a part 1 change that turns off the count(*) optimisations for
V2 tables, as there is a correctness issue with them. The reason is that
Spark compaction may leave some dangling delete files that mess up
the logic in Impala.

Change-Id: Ida9fb04fd076c987b6b5257ad801bf30f5900237
Reviewed-on: http://gerrit.cloudera.org:8080/21139
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-03-13 18:59:07 +00:00
Gabor Kaszab
65094a74f1 IMPALA-12598: Allow multiple equality field id lists for Iceberg tables
This patch adds support for reading Iceberg tables that have
different equality field ID lists associated with different equality
delete files. In practice this is the case when one equality delete
file deletes by e.g. columnA and columnB while another one deletes by
columnB and columnC.

In order to achieve such functionality the plan tree creation needed
some adjustments so that it can create separate LEFT ANTI JOIN nodes
for the different equality field ID lists.

Testing:
  - Flink and NiFi were used for creating some test tables with the
    desired equality field IDs. Coverage for these tables is added to
    the test suite.

Change-Id: I3e52d7a5800bf1b479f0c234679be92442d09f79
Reviewed-on: http://gerrit.cloudera.org:8080/20951
Reviewed-by: Gabor Kaszab <gaborkaszab@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-02-29 19:58:22 +00:00
Daniel Becker
27955a385e IMPALA-12783: Nested struct with varlen data crashes
If a struct ("main") is within an array and contains two child structs
("s1" ans "s2") which both contain strings (or other varlen data),
Impala crashes when this struct is re-materialised (for example in a
sort with limit) if codegen is enabled.

To reproduce:

In Hive:
 create table nested (arr ARRAY<STRUCT<s1: STRUCT<str1: STRING>, s2:
   STRUCT<str2: STRING>>>) stored as parquet;
 insert into nested values (array( named_struct("s1",
   named_struct("str1", "A string that is long"), "s2",
   named_struct("str2", "Another string that is long") )));

In Impala:
 select 1, arr from nested order by 1 limit 1;

This is because in the codegen'd code, when checking if the strings
("str1" and "str2" in the example) are NULL, we incorrectly calculate
the offset of their null indicator bytes from the memory address of
their containing struct, not from the beginning of the "master tuple",
which in this case is the item tuple of the array.

Note that the null indicators of struct members are always at the end of
the tuple containing the struct (recursively), i.e. the master tuple.

This change corrects the behaviour, passing the master tuple to
functions that need it.

Testing:
 - extended the column 'arr_contains_nested_struct' in table
   'collection_struct_mix' to include two nested structs with string
   members. Updated existing queries, which now cover the problem.

Change-Id: Ide2b63f8b18633f38fbe939a17db923606ccb101
Reviewed-on: http://gerrit.cloudera.org:8080/20997
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2024-02-05 19:49:06 +00:00
Gabor Kaszab
012996a06b IMPALA-12597: Basic Equality delete read support for Iceberg tables
In general, applying equality deletes is similar to how position
deletes are applied to data files: using a LEFT ANTI JOIN where the
SCAN for the data rows is on the left side while the SCAN for the
delete rows is on the right side of the JOIN. The difference is the
virtual columns and the conjuncts being used.
For equality deletes the data sequence number of a delete file has to
be greater than the data sequence number of the data file being
investigated. This information is added as a virtual column to the
scans and a conjunct is created in the JOIN node to check the relation.
The equality delete fields from the delete files are checked against
the respective columns of the data SCANs.
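
A conceptual sketch of that join (identifiers are illustrative, not the
actual virtual column names): a data row survives only if no matching,
newer equality delete row exists.

  SELECT d.*
  FROM data_scan d LEFT ANTI JOIN eq_delete_scan del
    ON d.col_a = del.col_a AND d.col_b = del.col_b
    AND del.data_sequence_number > d.data_sequence_number;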

This patch makes it possible for Impala to read Iceberg tables with
basic equality delete files. The Iceberg spec gives engines great
flexibility in writing equality deletes; however, in practice Flink,
one of the engines that write EQ-deletes, supports only a subset of the
use cases. This patch focuses on reading the EQ-deletes written by
Flink.

The restrictions are the following:
- All equality delete files in a table should have the same equality
  field ID list.
- For partitioned Iceberg tables it is expected that the partition
  values are also written into the equality delete files.
- Tables with equality deletes shouldn't have partition or schema
  evolution.
- Floating point equality columns aren't supported.
- If a malformed equality delete file doesn't have some of the equality
  field IDs then the Parquet reader will fill those missing fields with
  NULLs. As a side effect this will drop the rows from the result where
  the corresponding data columns have a null value.
See IMPALA-11388 epic Jira for more details.

Testing:
- Checked if the existing functional_parquet.iceberg_v2_delete_equality
  table can be read successfully.
- Added new test tables so that E2E tests can validate correctness.

Change-Id: I2053e6f321c69f1c82059a84a5d99aeaa9814cad
Reviewed-on: http://gerrit.cloudera.org:8080/20753
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-12-19 22:14:01 +00:00
Zoltan Borok-Nagy
68fe57ff84 IMPALA-12313: (part 3) Add UPDATE support for Iceberg tables.
Part 2 had some limitations, most importantly it could not update
Iceberg tables if any of the following were true:
 * UPDATE value of partitioning column
 * UPDATE table that went through partition evolution
 * Table has SORT BY properties

The problem with partitions is that the delete record and new
data record might belong to different partitions and records
are shuffled across based on the partitions of the delete
records, hence the data files might not get written efficiently.

The problem with SORT BY properties is that we need to write
the position delete files ordered by (file_path, position).

To address the above problems, this patch introduces a new
backend operator: IcebergBufferedDeleteSink. This new operator
extracts and aggregates the delete record information from
the incoming row batches, then in FlushFinal it orders the
position delete records and writes them out to files. This
mechanism is similar to Hive's approach:
https://github.com/apache/hive/pull/3251

IcebergBufferedDeleteSink cannot spill to disk, so it can only
run if there's enough memory to store the delete records. Paths
are stored only once, and the int64_t positions are stored in
a vector, so updating 100 million records per node should require
around 800 MB plus ~100K file paths, i.e. roughly 820 MB of memory per
node.
Spilling could be added later, but currently the need for it is not
too realistic.

Now records can get shuffled around based on the new data records'
partition values, and the SORT operator sorts the records based on
the SORT BY properties.

There's only one case in which we don't allow the UPDATE statement:
 * UPDATE partition column AND
 * Right-hand side of assignment is a non-constant expression AND
 * UPDATE statement has a JOIN

When all of the above conditions are met, it would be possible to
have an incorrect JOIN condition that has multiple matches for the
data records; the duplicated records would then be shuffled
independently (based on the new partition value) to different
backend SINKs, and the different backend SINKs would not be able
to detect the duplicates.

If any of the above conditions were false, then the duplicated records
would be shuffled together to the same SINK, which could do the
duplicate check.
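
A hypothetical example of this one disallowed shape (table and column
names are made up), assuming 'tbl' is partitioned by 'part_col':

  UPDATE tbl SET part_col = src.new_part
  FROM tbl JOIN src ON tbl.id = src.id;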

This patch also moves some code from IcebergDeleteSink to the
newly introduced IcebergDeleteSinkBase.

Testing:
 * planner tests
 * e2e tests
 * Impala/Hive interop tests

Change-Id: I2bb97b4454165a292975d88dc9c23adb22ff7315
Reviewed-on: http://gerrit.cloudera.org:8080/20760
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-12-18 19:14:21 +00:00
Eyizoha
3af1930229 IMPALA-12322: Support converting UTC timestamps read from Kudu to local time
This patch adds a query option 'convert_kudu_utc_timestamps' similar to
'convert_legacy_hive_parquet_utc_timestamps'. When enabled, it converts
UTC timestamps read from Kudu to local timestamps.

The corresponding modifications also cover predicate pushdown and
runtime filters. The ambiguity of timestamps caused by daylight saving
time changes is difficult to resolve in the bloom filter, so this patch
additionally introduces a query option
'disable_kudu_local_timestamp_bloom_filter' that disables the Kudu
timestamp bloom filter by default when time zone conversion is enabled,
in order to avoid erroneously filtering out data. However, for regions
that do not observe daylight saving time, it can be set to false to
re-enable the Kudu local timestamp bloom filter.
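
A minimal usage sketch (the table name is illustrative); UTC timestamps
stored in Kudu are returned as local time:

  SET convert_kudu_utc_timestamps=true;
  SELECT id, ts FROM kudu_tbl;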

Testing:
- Add TestKuduTimestampConvert in query_test/test_kudu.py, which
  performs end-to-end testing in a custom cluster, including basic Kudu
  UTC timestamp conversion testing, as well as checking that related
  predicate pushdown and runtime filters work correctly (even with
  timestamps involving daylight saving time conversions).

Change-Id: I9a1e7a13e617cc18deef14289cf9b958588397d3
Reviewed-on: http://gerrit.cloudera.org:8080/20681
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Csaba Ringhofer <csringhofer@cloudera.com>
2023-12-14 13:19:35 +00:00
Zoltan Borok-Nagy
e326b3cc0d IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
This patch adds limited UPDATE support for Iceberg tables. The
limitations mean users cannot update Iceberg tables if any of
the following is true:
 * UPDATE value of partitioning column
 * UPDATE table that went through partition evolution
 * Table has SORT BY properties

The above limitations will be resolved by part 3. The usual limitations
like writing non-Parquet files, using copy-on-write, or modifying V1 tables
are out of scope of IMPALA-12313.

This patch implements UPDATEs with the merge-on-read technique. This
means the UPDATE statement writes both data files and delete files.
Data files contain the updated records, delete files contain the
position delete records of the old data records that have been
touched.

To achieve the above this patch introduces a new sink: MultiDataSink.
We can configure multiple TableSinks for a single MultiDataSink object.
During execution, the row batches sent to the MultiDataSink will be
forwarded to all the TableSinks that have been registered.

The UPDATE statement for an Iceberg table creates a source select
statement with all table columns and virtual columns INPUT__FILE__NAME
and FILE__POSITION. E.g. imagine we have a table 'tbl' with schema
(i int, s string, k int), and we update the table with:

  UPDATE tbl SET k = 5 WHERE i % 100 = 11;

 The generated source statement will be ==>

  SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION
  FROM tbl WHERE i % 100 = 11;

Then we create two table sinks that refer to expressions from the above
source statement:

  Insert sink (i, s, 5)
  Delete sink (INPUT__FILE__NAME, FILE__POSITION)

The tuples in the rowbatch of MultiDataSink contain slots for all the
above expressions (i, s, 5, INPUT__FILE__NAME, FILE__POSITION).
MultiDataSink forwards each row batch to each registered TableSink.
They will pick their relevant expressions from the tuple and write
data/delete files. The tuples are sorted by INPUT__FILE__NAME and
FILE__POSITION because we need to write the delete records in this
order.

For partitioned tables we need to shuffle and sort the input tuples.
In this case we also add virtual columns "PARTITION__SPEC__ID" and
"ICEBERG__PARTITION__SERIALIZED" to the source statement and shuffle
and sort the rows based on them.

Data files and delete files are now separated in the DmlExecState, so
at the end of the operation we'll have two sets of files. We use these
two sets to create a new Iceberg snapshot.

Why does this patch have the limitations?
 - Because we are shuffling and sorting rows based on the delete
   records and their partitions. This means that the new data files
   might not get written in an efficient way, e.g. there will be
   too many of them, or we will need to keep too many open file
   handles during writing.
   Also, if the table has SORT BY properties, we cannot respect
   it as the input rows are ordered in a way to favor the position
   deletes.
   Part 3 will introduce a buffering writer for position delete
   files. This means we will shuffle and sort records based on
   the data records' partitions and SORT BY properties while
   delete records get buffered and written out at the end (sorted
   by file_path and position). In some edge cases the delete records
   might not get written efficiently, but it is a smaller problem
   than inefficient data files.

Testing:
 * negative tests
 * planner tests
 * update all supported data types
 * partitioned tables
 * Impala/Hive interop tests
 * authz tests
 * concurrent tests

Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08
Reviewed-on: http://gerrit.cloudera.org:8080/20677
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-12-09 03:04:05 +00:00
Daniel Becker
6b47c40e0d IMPALA-12159: Support ORDER BY for collections of variable length types in select list
IMPALA-12019 implemented support for collections of fixed length types
in the sorting tuple. This change implements it for collections of
variable length types.

Note that the limitation that structs that contain any type of
collection are not allowed in the sorting tuple is still in place (see
IMPALA-12160).

Note that it was not and still is not allowed to sort by complex types;
this change only allows them to be present in the select list when
sorting by some other expression.
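
A minimal sketch of what is now allowed (the table name is
illustrative): the var-len array appears in the select list while the
ORDER BY expression is a scalar column.

  SELECT id, string_arr FROM tbl_with_arrays ORDER BY id;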

This change also allows collections of variable length types to be
non-passthrough children of UNION ALL nodes.

Testing:
 - Renamed the 'simple_arrays_big' table to 'arrays_big' and extended it
   with collections containing variable length types. This table is
   mainly used to test that spilling works during sorting.
 - Renamed
   test_sort.py::TestArraySort::{test_simple_arrays,
   test_simple_arrays_with_limit}
   to {test_array_sort,test_array_sort_with_limit}
 - Extended the tests run in test_queries.py::TestQueries::{test_sort,
   test_top_n,test_partitioned_top_n} with collections containing
   var-len types.
 - Added tests in sort-complex.test that assert that it is not allowed
   to sort by collections. For structs we already have such tests in
   struct-in-select-list.test.

Change-Id: Ic15b29393f260b572e11a8dbb9deeb8c02981852
Reviewed-on: http://gerrit.cloudera.org:8080/20108
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-12-06 22:09:05 +00:00
Daniel Becker
6d3317b9a1 IMPALA-12570: Add longer strings to tables containing collections
IMPALA-12373 introduces small string optimisation, after which not all
strings will have a var-len part.

IMPALA-12159 adds support for ORDER BY for collections of variable
length types in the select list, but the test tables it uses only/mostly
contain short strings.

This patch has two modifications:

1. It introduces longer strings in 'collection_tbl' and
'collection_struct_mix'. It also adds two more rows to the existing one
in 'collection_tbl' so that it can be used in sorting tests. These
tables are only used by complex types tests, so the impact is limited.

2. It modifies RandomNestedDataGenerator.java, so that now it takes a
parameter for string length. Some variable names are changed to clearer
names. The references to and uses of RandomNestedDataGenerator are
updated.

Change-Id: Ief770d6bc9258fce159a733d5afa34fe594b96f8
Reviewed-on: http://gerrit.cloudera.org:8080/20718
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-11-22 18:50:22 +00:00
Tamas Mate
dce68e6a3b IMPALA-11996: Scanner change for Iceberg metadata querying
This commit adds a scan node for querying Iceberg metadata tables. The
scan node creates a Java scanner object that creates and scans the
metadata table. The scanner uses the Iceberg API to scan the table;
after that, the scan node fetches the rows one by one and materialises
them into RowBatches. The Iceberg row reader on the backend does the
translation between Iceberg and Impala types.

There is only one fragment created to query the Iceberg metadata table
which is supposed to be executed on the coordinator node that already
has the Iceberg table loaded. This way there is no need for further
table loading on the executor side.

This change will not cover nested column types, these slots are set to
NULL, it will be done in IMPALA-12205.

Testing:
 - Added e2e tests for querying metadata tables
 - Updated planner tests

Performance testing:
Created a table and inserted ~5500 rows one by one; this generated
~270000 ALL_MANIFESTS metadata table records. This table is quite wide
and has a String column as well.

I only mention the count(*) test on ALL_MANIFESTS, because every row is
materialized in every scenario currently:
  - Cold cache: 15.76s
    - IcebergApiScanTime: 124.407ms
    - MaterializeTupleTime: 8s368ms
  - Warm cache: 7.56s
    - IcebergApiScanTime: 3.646ms
    - MaterializeTupleTime: 7s477ms

Change-Id: I0e943cecd77f5ef7af7cd07e2b596f2c5b4331e7
Reviewed-on: http://gerrit.cloudera.org:8080/20010
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-10-26 12:40:22 +00:00
Eyizoha
2f06a7b052 IMPALA-10798: Initial support for reading JSON files
A prototype of HdfsJsonScanner is implemented based on rapidjson, which
supports scanning data from split JSON files.

The scanning of JSON data is mainly completed by two parts working
together. The first part is the JsonParser responsible for parsing the
JSON object, which is implemented based on the SAX-style API of
rapidjson. It reads data from the char stream, parses it, and calls the
corresponding callback function when encountering the corresponding JSON
element. See the comments of the JsonParser class for more details.

The other part is the HdfsJsonScanner, which inherits from HdfsScanner
and provides callback functions for the JsonParser. The callback
functions are responsible for providing data buffers to the Parser and
converting and materializing the Parser's parsing results into RowBatch.
It should be noted that the parser returns numeric values as strings to
the scanner. The scanner uses the TextConverter class to convert the
strings to the desired types, similar to how the HdfsTextScanner works.
This is an advantage compared to using the number values provided by
rapidjson directly, as it eliminates concerns about inconsistencies in
converting decimals (e.g. losing precision).

Added a startup flag, enable_json_scanner, to be able to disable this
feature if we hit critical bugs in production.

Limitations
 - Multiline JSON objects are not fully supported yet. It is OK when
   each file has only one scan range. However, when a file has multiple
   scan ranges, there is a small probability of incomplete scanning of
   multiline JSON objects that span ScanRange boundaries (in such cases,
   parsing errors may be reported). For more details, please refer to
   the comments in the 'multiline_json.test'.
 - Compressed JSON files are not supported yet.
 - Complex types are not supported yet.

Tests
 - Most of the existing end-to-end tests can run on JSON format.
 - Add TestQueriesJsonTables in test_queries.py for testing multiline,
   malformed, and overflow in JSON.

Change-Id: I31309cb8f2d04722a0508b3f9b8f1532ad49a569
Reviewed-on: http://gerrit.cloudera.org:8080/19699
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-09-05 16:55:41 +00:00
Zoltan Borok-Nagy
a34f7ce632 IMPALA-12342: Erasure coding build fails on loading iceberg_lineitem_multiblock
Prior to this patch we tried to load the table
iceberg_lineitem_multiblock with HDFS block size 524288. This failed
in builds that use HDFS erasure coding, which requires a block size of
at least 1048576.

This patch increases the block size to 1048576. This also triggers
the bug that was fixed by IMPALA-12327. But to have more tests with
multiblock tables this patch also adds table iceberg_lineitem_sixblocks
and a few tests with different MT_DOP settings.

Testing:
 * tested in build with HDFS EC

Change-Id: Iad15a335407c12578eb822bb1cb4450647502e50
Reviewed-on: http://gerrit.cloudera.org:8080/20359
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-08-21 20:53:27 +00:00
Zoltan Borok-Nagy
8638255e50 IMPALA-12327: Iceberg V2 operator wrong results in PARTITIONED mode
The Iceberg delete node tries to do mini merge-joins between data
records and delete records. This works in BROADCAST mode, and most of
the time in PARTITIONED mode as well. However, the Iceberg delete node
had the wrong assumption that the rows in a row batch belonging to the
same file come in ascending order: relying on this, the previous delete
updates IcebergDeleteState to the next deleted row id, and the binary
search is skipped if that id is greater than or equal to the current
probe row id.
When PARTITIONED mode is used, we cannot rely on ascending row order,
not even inside row batches, not even when the previous file path is the
same as the current one. This is because files with multiple blocks can
be processed by multiple hosts in parallel, then the rows are getting
hash-exchanged based on their file paths. Then the exchange-receiver at
the LHS coalesces the row batches from multiple senders, hence the row
IDs being unordered.

This patch adds a fix to ignore these presumptions and do a binary
search when the position-based difference between the current row and
the previous row is not one and we are in PARTITIONED mode.

Tests:
 * added e2e tests

Change-Id: Ib89a53e812af8c3b8ec5bc27bca0a50dcac5d924
Reviewed-on: http://gerrit.cloudera.org:8080/20295
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-08-02 21:50:55 +00:00
Joe McDonnell
e62043c1fa IMPALA-12287: Use old INSERT OVERWRITE TABLE syntax for Hive dataload
In dataload, we have some Hive statements that use the "INSERT OVERWRITE"
syntax rather than the "INSERT OVERWRITE TABLE" syntax. Older versions
of Hive do not support this syntax. In order to keep the dataload code
as compatible as possible, this switches the "INSERT OVERWRITE"
statements to "INSERT OVERWRITE TABLE".

Testing:
 - Ran a core job

Change-Id: I455641280166c39dcc42fb4187f728df8148cc70
Reviewed-on: http://gerrit.cloudera.org:8080/20198
Reviewed-by: Michael Smith <michael.smith@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-07-14 23:33:01 +00:00
Gergely Fürnstáhl
d0fe4c604f IMPALA-11619: Improve Iceberg V2 reads with a custom Iceberg Position Delete operator
The IcebergDeleteNode and IcebergDeleteBuild classes are based on their
PartitionedHashJoin counterparts. The actual "join" part of the node is
optimized, while the other parts are kept very similar, to be able to
integrate features of PartitionedHashJoin if needed (partitioning,
spilling).

ICEBERG_DELETE_JOIN is added as a join operator which is used only by
IcebergDeleteNode node.

IcebergDeleteBuild processes the data from the relevant delete files and
stores them in a {file_path: ordered row id vector} hash map.

IcebergDeleteNode tracks the file being processed and advances through
the row id vector in parallel with the probe batch to check whether a
row is deleted, or hashes the probe row's file path and uses binary
search to find the closest row id if that is needed for the check.

Testing:
  - Duplicated related planner tests to run both with new operator and
    hash join
  - Added a dimension for e2e tests to run both with new operator and
    hash join
  - Added new multiblock tests to verify assumptions used in new
    operator to optimize probing
  - Added new test with BATCH_SIZE=2 to verify in/out batch handling
    with new operator

Change-Id: I024a61573c83bda5584f243c879d9ff39dd2dcfa
Reviewed-on: http://gerrit.cloudera.org:8080/19850
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-07-05 20:32:23 +00:00
stiga-huang
e294be7707 IMPALA-12128: Bump ORC C++ version to 1.7.9-p10
This bumps the ORC C++ version from 1.7.0-p14 to 1.7.9-p10 to add the
fixes of ORC-1041 and ORC-1304.

Tests:
 - Add e2e test for ORC-1304.
 - It's hard to add a test for ORC-1041 since it won't cause crashes when
   compiling with gcc-10.

Change-Id: I26c39fe5b15ab0bcbe6b2af6fe7a45e48eaec6eb
Reviewed-on: http://gerrit.cloudera.org:8080/20090
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-06-20 10:24:33 +00:00
Michael Smith
2fc4f74796 IMPALA-10186: Fix writing empty parquet page
Fixes writing an empty parquet page when a page fills (or reaches
parquet_page_row_count_limit) at the same time that its dictionary
fills.

When a page filled (or reached parquet_page_row_count_limit) at the same
time that the dictionary filled, Impala would first detect the page was
full and create a new page. It would then detect the dictionary is full
and create another page, resulting in an empty page.

Parquet readers like Hive error if they encounter an empty page. This
patch attempts to make it impossible to generate an empty page by
reworking AppendRow and adding DCHECKs for empty pages. Dictionary size
is checked on FinalizeCurrentPage so whenever a page is written, we also
flush the dictionary if full.

Addresses clang-tidy by adding override in source files.

Testing:
- new test for full page size reached with full dictionary
- new test for parquet_page_row_count_limit with full dictionary
- new test for parquet_page_row_count_limit followed by large value.
  This seems useful as a theoretical corner-case; it currently writes
  the too-large value to the page anyway, but if we ever start checking
  whether the first value will fit the page this could become an issue.

Change-Id: I90d30d958f07c6289a1beba1b5df1ab3d7213799
Reviewed-on: http://gerrit.cloudera.org:8080/19898
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-05-19 12:02:42 +00:00
Daniel Becker
ff3d0c7984 IMPALA-12019: Support ORDER BY for arrays of fixed length types in select list
As a first stage of IMPALA-10939, this change implements support for
including in the sorting tuple top-level collections that only contain
fixed length types (including fixed length structs). For these types the
implementation is almost the same as the existing handling of strings.

Another limitation is that structs that contain any type of collection
are not yet allowed in the sorting tuple.

Also refactored the RawValue::Write*() functions to have a clearer
interface.

Testing:
 - Added a new test table that contains many rows with arrays. This is
   queried in a new test added in test_sort.py, to ensure that we handle
   spilling correctly.
 - Added tests that have arrays and/or maps in the sorting tuple in
   test_queries.py::TestQueries::{test_sort,
       test_top_n,test_partitioned_top_n}.

Change-Id: Ic7974ef392c1412e8c60231e3420367bd189677a
Reviewed-on: http://gerrit.cloudera.org:8080/19660
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-05-18 09:56:55 +00:00
Csaba Ringhofer
4261225f65 IMPALA-6433: Add read support for PageHeaderV2
Parquet v2 means several changes in Parquet files compared to v1:

1. file version = 2 instead of 1

c185faf0c4/src/main/thrift/parquet.thrift (L1016)
Before this patch Impala rejected Parquet files with version!=1.

2. possible use of DataPageHeaderV2 instead DataPageHeader

c185faf0c4/src/main/thrift/parquet.thrift (L561)

The main differences compared to V1 DataPageHeader:
a. rep/def levels are not compressed, so the compressed part contains
   only the actual encoded values
b. rep/def levels must be RLE encoded (Impala only supports RLE encoded
   levels even for V1 pages)
c. compression can be turned on/off per page (member is_compressed)
d. number of nulls (member num_nulls) is required - in v1 it was
   included in statistics which is optional
e. number of rows is required (member num_rows) which can help with
   matching collection items with the top level collection

The patch adds support for understanding v2 data pages but does not
implement some potential optimizations:

a. would allow an optimization for queries that need only the nullness
of a column but not the actual value: as the values are not needed the
decompression of the page data can be skipped. This optimization is not
implemented - currently Impala materializes both the null bit and the
value for all columns regardless of whether the value is actually
needed.

d. could be also used for optimizations / additional validity checks
but it is not used currently

e. could make skipping rows easier but is not used, as the existing
scanner has to be able to skip rows efficiently also in v1 files so
it can't rely on num_rows

3. possible use of new encodings (e.g. DELTA_BINARY_PACKED)

No new encoding is added - when an unsupported encoding is encountered
Impala returns an error.

parquet-mr uses new encodings (DELTA_BINARY_PACKED, DELTA_BYTE_ARRAY)
for most types if the file version is 2, so with this patch Impala is
not yet able to read all v2 Parquet tables written by Hive.

4. Encoding PLAIN_DICTIONARY is deprecated and RLE_DICTIONARY is used
instead. The semantics of the two encodings are exactly the same.

Additional changes:
Some responsibilities are moved from ParquetColumnReader to
ParquetColumnChunkReader:
- ParquetColumnChunkReader decodes rep/def level sizes to hide v1/v2
  differences (see 2.a.)
- ParquetColumnChunkReader skips empty data pages in
  ReadNextDataPageHeader
- the state machine of ParquetColumnChunkReader is simplified by
  separating data page header reading / reading rest of the page

Testing:
- added 4 v2 Parquet test tables (written by Hive) to cover
  compressed / uncompressed and scalar/complex cases
- added EE and fuzz tests for the test tables above
- manual tested v2 Parquet files written by pyarrow
- ran core tests

Note that no test is added where some pages are compressed while
some are not. It would be tricky to create such files with existing
writers. The code should handle this case and it is very unlikely that
files like this will be encountered.

Change-Id: I282962a6e4611e2b662c04a81592af83ecaf08ca
Reviewed-on: http://gerrit.cloudera.org:8080/19793
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-05-12 18:31:03 +00:00
Gabor Kaszab
7e0feb4a8e IMPALA-11701 Part1: Don't push down predicates to scanner if already applied by Iceberg
We push down predicates to Iceberg, which uses them to filter out files
when getting the results of planFiles(). Using the
FileScanTask.residual() function we can find out whether we have to use
the predicates to further filter the rows of the given files or whether
Iceberg has already performed all the filtering.
Basically, if we only filter on IDENTITY-partition columns then Iceberg
can filter the files, and using these filters in Impala wouldn't filter
any more rows from the output (assuming that no partition evolution was
performed on the table).

An additional benefit of not pushing down no-op predicates to the
scanner is that we can potentially materialize less slots.
For example:

SELECT count(1) from iceberg_tbl where part_col = 10;

Another additional benefit comes with count(*) queries. If all the
predicates are skipped from being pushed to Impala's scanner for a
count(*) query then the Parquet scanner can go to an optimized path
where it uses stats instead of reading actual data to answer the query.

In the above query Iceberg filters the files using the predicate on
a partition column and then there won't be any need to materialize
'part_col' in Impala, nor to push down the 'part_col = 10' predicate.

Note that this is an all-or-nothing approach: given N predicates, we
either push down all of them to the scanner or none of them. There is
room for improvement in identifying a subset of the predicates that we
still have to push down to the scanner. However, for this we'd need a
mapping between Impala predicates and the predicates returned by
Iceberg's FileScanTask.residual() function, which would significantly
increase the complexity of the relevant code.

Testing:
  - Some existing tests needed some extra care as they were checking
    for predicates being pushed down to the scanner, but with this
    patch not all of them are pushed down. For these tests I added some
    extra predicates to achieve that all of the predicates are pushed
    down to the scanner.
  - Added a new planner test suite for checking how predicate push down
    works with Iceberg tables.

Change-Id: Icfa80ce469cecfcfbcd0dcb595a6b04b7027285b
Reviewed-on: http://gerrit.cloudera.org:8080/19534
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-04-21 15:22:17 +00:00
Daniel Becker
b73847f178 IMPALA-10851: Codegen for structs
IMPALA-9495 added support for struct types in SELECT lists but only with
codegen turned off. This commit implements codegen for struct types.

To facilitate this, code generation for reading and writing 'AnyVal's
has been refactored. A new class, 'CodegenAnyValReadWriteInfo' is
introduced. This class is an interface between sources and destinations,
one of which is an 'AnyVal' object: sources generate an instance of this
class and destinations take that instance and use it to write the value.

The other side can for example be tuples from which we read (in the case
of 'SlotRef') or tuples we write into (in case of materialisation, see
Tuple::CodegenMaterializeExprs()). The main advantage is that sources do
not have to know how to write their destinations, only how to read the
values (and vice versa).

Before this change, many tests that involve structs ran only with
codegen turned off. Now that codegen is supported in these cases, these
tests are also run with codegen on.

Testing:
  - enabled tests for structs in the select list with codegen on in
    tests/query_test/test_nested_types.py
  - enabled codegen in other tests where it used to be disabled because
    it was not supported.

Change-Id: I5272c3f095fd9f07877104ee03c8e43d0c4ec0b6
Reviewed-on: http://gerrit.cloudera.org:8080/18526
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-04-14 13:46:59 +00:00
Zoltan Borok-Nagy
f2cb2c9ceb IMPALA-11964: Make sure Impala returns error for Iceberg tables with equality deletes
Impala only supports position deletes currently. It should raise an
error when equality deletes are encountered.

We already had a check for this when the query was planned by Iceberg.
But when we were using cached metadata the check was missing. This means
that Impala could return bogus results in the presence of equality
delete files. This patch adds check for the latter case as well.

Tables with equality delete files are still loadable by Impala, and
users can still query snapshots of it if they don't have equality
deletes.

Testing:
 * added e2e tests

Change-Id: I14d7116692c0e47d0799be650dc323811e2ee0fb
Reviewed-on: http://gerrit.cloudera.org:8080/19601
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-03-22 16:44:05 +00:00
Daniel Becker
2d47306987 IMPALA-9551: Allow mixed complex types in select list
Currently collections and structs are supported in the select list, also
when they are nested (structs in structs and collections in
collections), but mixing different kinds of complex types, i.e. having
structs in collections or vice versa, is not supported.

This patch adds support for mixed complex types in the select list.

Limitation: zipping unnest is not supported for mixed complex types, for
example the following query:

  use functional_parquet;
  select unnest(struct_contains_nested_arr.arr) from
  collection_struct_mix;

Testing:
 - Created a new test table, 'collection_struct_mix', that contains
   mixed complex types.
 - Added tests in mixed-collections-and-structs.test that test having
   mixed complex types in the select list. These tests are called from
   test_nested_types.py::TestMixedCollectionsAndStructsInSelectList.
 - Ran existing tests that test collections and structs in the select
   list; test queries that expected a failure in case of mixed complex
   types have been moved to mixed-collections-and-structs.test and now
   expect success.

Change-Id: I476d98884b5fd192dfcd4feeec7947526aebe993
Reviewed-on: http://gerrit.cloudera.org:8080/19322
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-03-07 13:22:33 +00:00
gaoxq
89cc20717e IMPALA-4052: CREATE TABLE LIKE for Kudu tables
This commit implements cloning between Kudu tables, including cloning
the schema and hash partitions. There is one limitation: cloning of
Kudu tables with range partitions is not supported. Cloning range
partitions is tracked by IMPALA-11912.

Cloning Kudu tables from other types of tables is not implemented,
because the table creation statements are different.
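
A minimal sketch of the supported case (the table names are illustrative):

  -- clones the schema and hash partitioning of an existing Kudu table
  create table kudu_clone like kudu_src;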

Testing:
 - e2e tests
 - AnalyzeDDLTest tests

Change-Id: Ia3d276a6465301dbcfed17bb713aca06367d9a42
Reviewed-on: http://gerrit.cloudera.org:8080/18729
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-02-20 16:38:16 +00:00
stiga-huang
77d80aeda6 IMPALA-11812: Deduplicate column schema in hmsPartitions
A list of HMS Partitions will be created in many workloads in catalogd,
e.g. table loading, bulk altering partitions by ComputeStats or
AlterTableRecoverPartitions, etc. Currently, each hmsPartition holds its
own list of column schemas, i.e. a List<FieldSchema>. This results in
lots of FieldSchema instances if the table is wide and lots of
partitions need to be loaded/operated on. Though the strings of column
names and comments are interned, the FieldSchema objects can still
occupy the majority of the heap. See the histogram in the JIRA description.

In reality, the hmsPartition instances of a table can share the
table-level column schema since Impala doesn't respect the partition
level schema.

This patch replaces the column list in the StorageDescriptor of
hmsPartitions with the table-level column list to remove the
duplication. It also adds some progress logs in batch HMS operations,
and avoids misleading logs when the event-processor is disabled.

Tests:
- Ran exhaustive tests
- Add tests on wide table operations that hit OOM errors without this
  fix.

Change-Id: I511ecca0ace8bea4c24a19a54fb0a75390e50c4d
Reviewed-on: http://gerrit.cloudera.org:8080/19391
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2023-01-01 04:38:36 +00:00
noemi
4a05eaf988 IMPALA-11807: Fix TestIcebergTable.test_avro_file_format and test_mixed_file_format
Iceberg hardcodes URIs in metadata files. If the table was written
in a certain storage location and then moved to another file system,
the hardcoded URIs will still point to the old location instead of
the current one. Therefore Impala will be unable to read the table.

TestIcebergTable.test_avro_file_format and test_mixed_file_format
use Hive from Impala to write tables. If the tables are created in
a different file system than the one they will be read from, the tests
fail due to the invalid URIs.
These 2 tests are now skipped if testing is not done on HDFS.

Updated the data load schema of the 2 test tables created by Hive and
set LOCATION to the same as in the previous test tables. If this
makes it possible to rewrite the URIs in the metadata and makes the
tables accessible from another file system as well later, then the
tests can be enabled again.

Testing:
 - Testing locally on HDFS minicluster
 - Triggered an Ozone build to verify that it is skipped on a different
   file system

Change-Id: Ie2f126de80c6e7f825d02f6814fcf69ae320a781
Reviewed-on: http://gerrit.cloudera.org:8080/19387
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-12-22 19:45:21 +00:00
noemi
390a932064 IMPALA-11708: Add support for mixed Iceberg tables with AVRO file format
This patch extends the support of Iceberg tables containing multiple
file formats. Now AVRO data files can also be read in a mixed table
besides Parquet and ORC.

Impala uses its Avro scanner to read AVRO files; therefore all the
Avro-related limitations apply here as well: writes/metadata
changes are not supported.

testing:
- E2E testing: extending 'iceberg-mixed-file-format.test' to include
  AVRO files as well, in order to test reading all three currently
  supported file formats: avro+orc+parquet

Change-Id: I941adfb659218283eb5fec1b394bb3003f8072a6
Reviewed-on: http://gerrit.cloudera.org:8080/19353
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-12-16 17:37:35 +00:00
Daniel Becker
25b5058ef5 IMPALA-11717: Use rapidjson for printing collections
We have been using rapidjson to print structs but didn't use it to print
collections (arrays and maps).

This change introduces the usage of rapidjson to print collections for
both the HS2 and the Beeswax protocol.

The old code handling the printing of collections in raw-value.{h,cc} is
removed.

Testing:
 - Ran existing EE tests
 - Added EE tests with non-string and NULL map keys in
   nested-map-in-select-list.test and map_null_keys.test.

Change-Id: I08a2d596a498fbbaf1419b18284846b992f49165
Reviewed-on: http://gerrit.cloudera.org:8080/19309
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Daniel Becker <daniel.becker@cloudera.com>
2022-12-15 15:04:07 +00:00
noemi
80fc49abe6 IMPALA-11158: Add support for Iceberg tables with AVRO data files
Iceberg tables containing only AVRO files or no AVRO files at all
can now be read by Impala. Mixed file format tables with AVRO are
currently unsupported.
Impala uses its Avro scanner to read AVRO files; therefore all the
Avro-related limitations apply here as well: writes/metadata
changes are not supported.

testing:
- created test tables: 'iceberg_avro_only' contains only AVRO files;
  'iceberg_avro_mixed' contains all file formats: avro+orc+parquet
- added E2E test that reads Avro-only table
- added test case to iceberg-negative.test that tries to read
  mixed file format table

Change-Id: I827e5707e54bebabc614e127daa48255f86f4c4f
Reviewed-on: http://gerrit.cloudera.org:8080/19084
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-12-08 03:03:13 +00:00
Csaba Ringhofer
a983a347a7 IMPALA-11682: Add tests for minor compacted insert only ACID tables
Only test changes. Minor compacted delta dirs have been supported in
Impala since IMPALA-9512, but at that time Hive supported minor
compaction only on full ACID tables. Since then, Hive has added
support for minor compacting insert-only/MM tables (HIVE-22610).

Change-Id: I7159283f3658f2119d38bd3393729535edd0a76f
Reviewed-on: http://gerrit.cloudera.org:8080/19164
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-11-03 00:52:08 +00:00
Daniel Becker
37f44a58f3 IMPALA-10918: Allow map type in SELECT list
Adding support for MAP types in the select list.
An example of how maps are printed:
{"k1":2,"k2":null}

Nested collection types (maps and arrays) are supported in any
combination. However, structs in collections and collections in structs
are not supported.
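
A hedged example query (the table below is a sketch against Impala's
nested-types test data); each map value is printed in the JSON-like form
shown above:

  select id, int_map from functional_parquet.complextypestbl;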

Limitations (other than map support) as described in the commit for
IMPALA-9498 still apply; the following are to be implemented later:
- Unify HS2 / Beeswax logic with the way STRUCTs are handled.
  This could be done in a "final" logic that can handle
  STRUCTS/ARRAYS nested to each other
- Implement "deep copy" and "deep serialize" for collections in BE.
  This would enable all operators, e.g. ORDER BY and UNION.

Testing:
 - modified the FE tests that checked that maps were not allowed in the
   select list - now the tests expect that maps are allowed there
 - added FE and EE tests involving maps based on the array tests

Change-Id: I921c647f1779add36e7f5df4ce6ca237dcfaf001
Reviewed-on: http://gerrit.cloudera.org:8080/18736
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-09-07 19:55:43 +00:00
LPL
cc26f345a4 IMPALA-11507: Use absolute_path when Iceberg data files are outside of the table location
For Iceberg tables, when one of the following properties is used, the
table is considered to possibly have data files outside the table
location directory:
- 'write.object-storage.enabled' is true
- 'write.data.path' is not empty
- 'write.location-provider.impl' is configured
- 'write.object-storage.path'(Deprecated) is not empty
- 'write.folder-storage.path'(Deprecated) is not empty

We should tolerate the situation where a path relative to the table
location cannot be derived for a data file, and use the absolute path
in that case. E.g. an ETL program may place the metadata of an Iceberg
table in
'hdfs://nameservice_meta/warehouse/hadoop_catalog/ice_tbl/metadata',
the recent data files in
'hdfs://nameservice_data/warehouse/hadoop_catalog/ice_tbl/data', and the
data files from half a year ago in
's3a://nameservice_data/warehouse/hadoop_catalog/ice_tbl/data'; such a
table should still be queryable by Impala.
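
As a hedged sketch (the table name is illustrative and the path is taken
from the example above), such a layout is typically configured through
table properties like:

  alter table ice_tbl set tblproperties (
    'write.data.path'='hdfs://nameservice_data/warehouse/hadoop_catalog/ice_tbl/data');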

Testing:
 - added e2e tests

Change-Id: I666bed21d20d5895f4332e92eb30a94fa24250be
Reviewed-on: http://gerrit.cloudera.org:8080/18894
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-09-06 18:35:30 +00:00
Zoltan Borok-Nagy
73da4d7ddf IMPALA-11484: Create SCAN plan for Iceberg V2 position delete tables
This patch adds support for reading Iceberg V2 tables that use position
deletes. Equality deletes are still not supported. Position delete
files store the file path and file position of the deleted rows.

When an Iceberg table has position delete files we need to do an
ANTI JOIN between data files and delete files. From the data files
we need to query the virtual columns INPUT__FILE__NAME and
FILE__POSITION, while from the delete files we need the data columns
'file_path' and 'pos'. The latter data columns are not part of the
table schema, so we create a virtual table instance of
'IcebergPositionDeleteTable' that has a table schema corresponding
to the delete files ('file_path', 'pos').

This patch introduces a new class, 'IcebergScanPlanner', which is
responsible for planning Iceberg table scans. It creates
the aforementioned ANTI JOIN. Also, if there are data files without
corresponding delete files, we can have a separate SCAN node and its
results would be UNIONed to the rows coming from the ANTI JOIN:

              UNION
             /     \
    SCAN data       ANTI JOIN
                     /      \
              SCAN data    SCAN deletes
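
Even a plain scan (a hedged sketch; the table name is illustrative) can
produce this plan shape once the table contains position delete files:

  select * from ice_v2_tbl;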

Some refactorings in the context of this CR:
Predicate pushdown and time travel logic is transferred from
IcebergScanNode to IcebergScanPlanner. Iceberg snapshot summary
retrieval is moved from FeFsTable to FeIcebergTable.

Testing:
 * added planner test
 * added e2e tests

TODO in follow-up Jiras:
 * better cardinality estimates (IMPALA-11516)
 * support unrelative collection columns (select item from t.int_array)
   (IMPALA-11517)
   Currently such queries return an error during analysis

Change-Id: I672cfee18d8e131772d90378d5b12ad4d0f7dd48
Reviewed-on: http://gerrit.cloudera.org:8080/18847
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-09-01 16:51:17 +00:00
Csaba Ringhofer
7ca11dfc7f IMPALA-9482: Support for BINARY columns
This patch adds support for BINARY columns for all table formats with
the exception of Kudu.

In Hive the main difference between STRING and BINARY is that STRING is
assumed to be UTF8 encoded, while BINARY can be any byte array.
Some other differences in Hive:
- BINARY can be only cast from/to STRING
- Only a small subset of built-in STRING functions support BINARY.
- In several file formats (e.g. text) BINARY is base64 encoded.
- No NDV is calculated during COMPUTE STATISTICS.

As Impala doesn't treat STRINGs as UTF8, BINARY and STRING become nearly
identical, especially from the backend's perspective. For this reason,
BINARY is implemented a bit differently compared to other types:
while the frontend treats STRING and BINARY as two separate types, most
of the backend uses PrimitiveType::TYPE_STRING for BINARY too, e.g.
in SlotDesc. Only the following parts of backend need to differentiate
between STRING and BINARY:
- table scanners
- table writers
- HS2/Beeswax service
These parts have access to column metadata, which allows adding special
handling for BINARY.

Only a very few builtins are allowed for BINARY at the moment:
- length
- min/max/count
- coalesce and similar "selector" functions
Other STRING functions can only be used by casting to STRING first.
Adding support for more of these functions is very easy, as the BINARY
type simply has to be "connected" to the already existing STRING
function's signature. Functions whose result depends on utf8_mode need
to ensure that with BINARY they always behave as if utf8_mode=0 (for
example, length() is mapped to bytes(), as length() counts UTF-8
characters when utf8_mode=1).

All kinds of UDFs (native, Hive legacy, Hive generic) support BINARY,
though in the case of legacy Hive UDFs it is only supported if the
argument and return types are set explicitly to ensure backward
compatibility.
See IMPALA-11340 for details.

The original plan was to behave as closely to Hive as possible, but I
realized that Hive has more relaxed casting rules than Impala, which
led to STRING<->BINARY casts being necessary in more cases in Impala.
This was needed to disallow passing a BINARY to functions that expect
a STRING argument. An example of the difference is that in
INSERT ... VALUES () string literals need to be explicitly cast to
BINARY, while this is not needed in Hive.
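
A hedged sketch of the behaviour described above ('binary_demo' and
'bin_col' are illustrative names for a table with a single BINARY column):

  -- string literals must be cast explicitly when inserting into BINARY
  insert into binary_demo values (cast('abc' as binary));
  -- only a few builtins accept BINARY directly; others need a cast first
  select length(bin_col), upper(cast(bin_col as string)) from binary_demo;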

Testing:
- Added functional.binary_tbl for all file formats (except Kudu)
  to test scanning.
- Removed functional.unsupported_types and related tests, as now
  Impala supports all (non-complex) types that Hive does.
- Added FE/EE tests mainly based on the ones added to the DATE type

Change-Id: I36861a9ca6c2047b0d76862507c86f7f153bc582
Reviewed-on: http://gerrit.cloudera.org:8080/16066
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-08-19 13:55:42 +00:00
Zoltan Borok-Nagy
522ee1fcc0 IMPALA-11350: Add virtual column FILE__POSITION for Parquet tables
Virtual column FILE__POSITION returns the ordinal position of the row
in the data file. It will be useful for adding support for Iceberg's
position-based delete files.

This patch only adds FILE__POSITION to Parquet tables. It works
similarly to the handling of collection position slots. I.e. we
add the responsibility of dealing with the file position slot to
an existing column reader. Because of page-filtering and late
materialization we already tracked the file position in member
'current_row_' during scanning.

Querying the FILE__POSITION in other file formats raises an error.
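
A hedged usage sketch (the table below is only an example of a Parquet
table from Impala's test data):

  select file__position, id from functional_parquet.alltypes limit 5;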

Testing:
 * added e2e tests

Change-Id: I4ef72c683d0d5ae2898bca36fa87e74b663671f7
Reviewed-on: http://gerrit.cloudera.org:8080/18704
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-08-12 19:21:55 +00:00
Csaba Ringhofer
efc303b71a IMPALA-11434: Fix analysis of multiple more than 1d arrays in select list
Arrays of more than one dimension in the select list tried to register
a CollectionTableRef with the name "item" for the inner arrays,
leading to a name collision if there was more than one such array.

The logic is changed to always use the full path as the implicit alias
in CollectionTableRefs backing arrays in the select list.

As a side effect this leads to using the fully qualified names
in expressions in the explain plans of queries that use arrays
from views. This is not an intended change, but I don't consider
it to be critical. Created IMPALA-11452 to deal with more
sophisticated alias handling in collections.
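
A hedged sketch of the kind of query that used to hit the collision (the
table and column names are illustrative, standing in for a table with two
arrays of more than one dimension):

  select arr_2d_a, arr_2d_b from two_nested_arrays_tbl;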

Testing:
- added a new table to testdata and a regression test

Change-Id: I6f2b6cad51fa25a6f6932420eccf1b0a964d5e4e
Reviewed-on: http://gerrit.cloudera.org:8080/18734
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-07-22 22:59:19 +00:00
stiga-huang
d74cc7319f IMPALA-9670: Fix unloaded views are shown as tables for GET_TABLES requests
At startup, catalogd pulls the table names from HMS and tracks each
table using an IncompleteTable which only contains the table name. The
table types (TABLE/VIEW) and comments are unknown until the table/view
is loaded in catalogd. GET_TABLES is a request of the HS2 protocol. It
fetches all the tables with their types and comments. For unloaded
tables/views, Impala always returns them with TABLE type (the default)
and empty comments.

This patch enables catalogd to always load the table types and comments
along with the table names. This behavior is controlled by a
catalogd-only flag, --pull_table_types_and_comments, which is false by
default. When this flag is enabled, catalogd will load table types and
comments at startup and in executing INVALIDATE METADATA commands. In
other words, an unloaded table (IncompleteTable) now not just contains
the table name, but also contains the correct table type and comment.

This is implemented by using the getTableMetas HMS API when invalidating
a table. The original behavior uses getAllTables to load all table names
and uses tableExists to verify whether a table still exists. When the
flag is set, we'll use getTableMetas instead to also load the table
types and comments.

Implementation:
Add a new table type, UNLOADED_TABLE, in TTableType to identify tables
that we only know are not views, but for which we don't know whether
they are Kudu or HDFS tables since their full set of metadata is not
loaded.

When propagating catalog objects from catalogd to coordinators, views
are sent using a catalog key explicitly prefixed by VIEW. So
coordinators can create IncompleteTables/LocalIncompleteTables with the
correct types.

In most cases when creating an IncompleteTable, we have the table
types and comments in the context. For instance, when adding an
IncompleteTable for a CreateTable/CreateView request, we know exactly
whether it's a table or a view. So we can create IncompleteTables with the correct
types.

Test infra changes:
 - Adds get_tables() method for the hs2_client
 - Extends ImpalaTestSuite.create_client_for_nth_impalad() to support
   hs2 and hs2-http protocols. So we can create HS2 clients on all
   impalads.

Tests:
 - Add custom cluster tests on all catalog modes (with/without
   local-catalog or event processor). Verify the table types and
   comments are always correct when pull_table_types_and_comments is
   true.

Change-Id: I528bb20272ebdd66a0118c30efc2b0566f2b0e2f
Reviewed-on: http://gerrit.cloudera.org:8080/18626
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2022-06-24 04:27:49 +00:00