This patch contains 2 parts:
1. When both conditions below are true, push down limit to
pre-aggregation
a) aggregation node has no aggregate function
b) aggregation node has no predicate
2. Finish the aggregation when the number of unique keys in the hash
table has exceeded the limit.
Sample queries:
SELECT DISTINCT f FROM t LIMIT n
The LIMIT can be passed all the way down to the pre-aggregation, which
leads to a nearly unbounded speedup on such queries against large
tables when n is low.
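As a toy illustration of part 2 (hypothetical code, not Impala's
GroupingAggregator): since conditions a) and b) guarantee that rows
beyond the first n distinct keys cannot change the result, the
pre-aggregation can stop consuming input once its hash table holds n
unique keys.

#include <cstdint>
#include <unordered_set>
#include <vector>

// Collect at most 'limit' distinct keys; stop early once the limit is hit.
std::vector<int64_t> DistinctWithLimit(const std::vector<int64_t>& keys,
                                       int64_t limit) {
  std::unordered_set<int64_t> seen;
  for (int64_t k : keys) {
    seen.insert(k);
    // Safe only because there is no aggregate function and no predicate.
    if (static_cast<int64_t>(seen.size()) >= limit) break;
  }
  return std::vector<int64_t>(seen.begin(), seen.end());
}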
Testing:
Add test targeted-perf/queries/aggregation.test
Pass core test
Change-Id: I930a6cb203615acfc03f23118d1bc1f0ea360995
Reviewed-on: http://gerrit.cloudera.org:8080/17821
Reviewed-by: Qifan Chen <qchen@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
In the planning phase, the planner collects and generates min-max
predicates that can be evaluated on Parquet file statistics. We can
easily extend this to ORC tables.
This commit implements min/max predicate pushdown for the ORC scanner
leveraging the external ORC library's search arguments. We build
the search arguments when we open the scanner since we don't need to
modify them later.
Also added a new query option, orc_read_statistics, similar to
parquet_read_statistics. If the option is set to true (it is by default),
predicate pushdown will take effect; otherwise it will be skipped. The
predicates are evaluated at the ORC row group level, i.e. by default for
every 10,000 rows.
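As a rough sketch of what building such a search argument looks like
(assuming the ORC C++ library's SearchArgumentFactory/
SearchArgumentBuilder interface from orc/sargs/SearchArgument.hh; the
column name and literal are made up, and this is not the actual
scanner code):

#include <memory>
#include "orc/sargs/SearchArgument.hh"

std::unique_ptr<orc::SearchArgument> BuildExampleSarg() {
  std::unique_ptr<orc::SearchArgumentBuilder> builder =
      orc::SearchArgumentFactory::newBuilder();
  // Push "col <= 10" so the ORC reader can skip row groups whose min > 10.
  builder->startAnd()
      .lessThanEquals("col", orc::PredicateDataType::LONG,
                      orc::Literal(static_cast<int64_t>(10)))
      .end();
  return builder->build();
}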
Limitations:
- Min-max predicates on CHAR/VARCHAR types are not pushed down due to
inconsistent behaviors on padding/truncating between Hive and Impala.
(IMPALA-10882)
- Min-max predicates on TIMESTAMP are not pushed down (IMPALA-10915).
- Min-max predicates having different arg types are not pushed down
(IMPALA-10916).
- Min-max predicates with non-literal const exprs are not pushed down
since SearchArgument interfaces only accept literals. This only
happens when expr rewrites are disabled thus constant folding is
disabled.
Tests:
- Add e2e tests similar to test_parquet_stats to verify that
predicates are pushed down.
- Run CORE tests
- Run TPCH benchmark; there is neither improvement nor regression.
  On the other hand, certain selective queries gained a significant
  speed-up, e.g. select count(*) from lineitem where l_orderkey = 1.
Change-Id: I136622413db21e0941d238ab6aeea901a6464845
Reviewed-on: http://gerrit.cloudera.org:8080/15403
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Reviewed-by: Qifan Chen <qchen@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Mask functions are used in Ranger column masking policies to mask
sensitive data. There are 5 mask functions: mask(), mask_first_n(),
mask_last_n(), mask_show_first_n(), mask_show_last_n(). Take mask() as
an example, by default, it will mask uppercase to 'X', lowercase to 'x',
digits to 'n' and leave other characters unmasked. For masking all
characters to '*', we can use
mask(my_col, '*', '*', '*', '*');
The current implementations mask strings byte by byte, which gives
results inconsistent with Hive when the string contains multi-byte
unicode characters:
mask('中国', '*', '*', '*', '*') => '******'
Each Chinese character is encoded into 3 bytes in UTF-8 so we get the
above result. The result in Hive is '**' since there are two Chinese
characters.
This patch provides consistent masking behavior with Hive for
strings under the UTF-8 mode, i.e., set UTF8_MODE=true. In UTF-8 mode,
the masked unit of a string is a unicode code point.
Implementation
- Extends the existing MaskTransform function to deal with unicode code
  points (represented by uint32_t).
- Extends the existing GetFirstChar function to get the code point of
  the given masked characters in UTF-8 mode.
- Implements a MaskSubStrUtf8 method as the core functionality.
- Switches to MaskSubStrUtf8 instead of MaskSubStr in UTF-8 mode.
- For better testing, this patch also adds an overload for all mask
  functions that only masks other chars while keeping the
  upper/lower/digit chars unmasked. E.g. mask({col}, -1, -1, -1, 'X').
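A minimal sketch of the per-code-point idea for the all-'*' case (not
the actual MaskSubStrUtf8 implementation): one output character is
emitted per code point by skipping UTF-8 continuation bytes.

#include <string>

// Masks every code point of a UTF-8 string with 'masked_char'.
// Continuation bytes (10xxxxxx) do not start a code point and are
// skipped, so masking '中国' produces "**" rather than "******".
std::string MaskAllUtf8(const std::string& s, char masked_char) {
  std::string out;
  for (unsigned char c : s) {
    if ((c & 0xC0) == 0x80) continue;
    out.push_back(masked_char);
  }
  return out;
}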
Tests
- Add BE tests in expr-test
- Add e2e tests in utf8-string-functions.test
Change-Id: I1276eccc94c9528507349b155a51e76f338367d5
Reviewed-on: http://gerrit.cloudera.org:8080/17780
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch implements the functionality to allow structs in the select
list of inline views and topmost blocks. When displaying the value of a
struct it is formatted into a JSON value and returned as a string. An
example of such a value:
SELECT struct_col FROM some_table;
'{"int_struct_member":12,"string_struct_member":"string value"}'
Another example where we query a nested struct:
SELECT outer_struct_col FROM some_table;
'{"inner_struct":{"string_member":"string value","int_member":12}}'
Note, the conversion from struct to JSON happens on the server side
before sending out the value to the client over HS2. However, HS2 is
capable of handling struct values as well, so in a later change we might
want to add functionality to send the struct in Thrift to the client
so that the client can use the struct directly.
-- Internal representation of a struct:
When scanning a struct, the row batch will hold the values of the
struct's children as if they were queried one by one directly in the
select list.
E.g. Taking the following table:
CREATE TABLE tbl (id int, s struct<a:int,b:string>) STORED AS ORC
And running the following query:
SELECT id, s FROM tbl;
After scanning, a row in a row batch will hold the following values
(note, the biggest size comes first):
1: The pointer for the string in s.b
2: The length for the string in s.b
3: The int value for s.a
4: The int value of id
5: A single null byte for all the slots: id, s, s.a, s.b
The size of a struct has an effect on the order of the memory layout of
a row batch. The struct size is calculated by summing the size of its
fields and then the struct gets a place in the row batch to precede all
smaller slots by size. Note, all the fields of a struct are consecutive
to each other in the row batch. Inside a struct the order of the fields
is also based on their size, as in the regular case for primitives.
When evaluating a struct as a SlotRef a newly introduced StructVal will
be used to refer to the actual values of a struct in the row batch.
This StructVal holds a vector of pointers where each pointer represents
a member of the struct. Following the above example the StructVal would
keep two pointers, one to point to an IntVal and one to point to a
StringVal.
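A minimal sketch of such a value type (hypothetical, not the actual
Impala StructVal definition):

#include <cstdint>
#include <vector>

// One entry per struct member; each pointer refers to the member's
// value (e.g. an IntVal or a StringVal) stored in the row batch.
struct StructValSketch {
  bool is_null = false;
  std::vector<uint8_t*> ptrs;
};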
-- Changes related to tuple and slot descriptors:
When providing a struct in the select list there is going to be a
SlotDescriptor for the struct slot in the topmost TupleDescriptor.
Additionally, another TupleDescriptor is created to hold SlotDescriptors
for each of the struct's children. The struct SlotDescriptor points to
the newly introduced TupleDescriptor using 'itemTupleId'.
The offsets for the children of the struct are calculated from the
beginning of the topmost TupleDescriptor and not from the
TupleDescriptor that directly holds the struct's children. The null
indicator bytes as well are stored on the level of the topmost
TupleDescriptor.
-- Changes related to scalar expressions:
A struct in the select list is translated into an expression tree where
the top of this tree is a SlotRef for the struct itself and its
children in the tree are SlotRefs for the members of the struct. When
evaluating a struct SlotRef, after the null checks the evaluation is
delegated to the children SlotRefs.
-- Restrictions:
- Codegen support is not included in this patch.
- Only ORC file format is supported by this patch.
- Only HS2 client supports returning structs. Beeswax support is not
implemented as it is going to be deprecated anyway. Currently we
receive an error when trying to query a struct through Beeswax.
-- Tests added:
- The ORC and Parquet functional databases are extended with 3 new
tables:
1: A small table with one-level structs, holding different
   kinds of primitive types as members.
2: A small table with 2 and 3 level nested structs.
3: A bigger, partitioned table constructed from alltypes where all
the columns except the 'id' column are put into a struct.
- struct-in-select-list.test and nested-struct-in-select-list.test
  use these new tables to query structs directly or through an
  inline view.
Change-Id: I0fbe56bdcd372b72e99c0195d87a818e7fa4bc3a
Reviewed-on: http://gerrit.cloudera.org:8080/17638
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
In earlier versions of Impala we had a bug that affected
insertions to Iceberg tables. When Impala wrote multiple
files during a single INSERT statement it could crash, or
even worse, it could silently omit data files from the
Iceberg metadata.
The current master doesn't have this bug, but we don't
really have tests for this case.
This patch adds tests that write many files during inserts
to an Iceberg table. Both non-partitioned and partitioned
Iceberg tables are tested.
We achieve writing lots of files by setting 'parquet_file_size'
to 8 megabytes.
Testing:
* added e2e test that writes many data files
* added exhaustive e2e test that writes even more data files
Change-Id: Ia2dbc2c5f9574153842af308a61f9d91994d067b
Reviewed-on: http://gerrit.cloudera.org:8080/17831
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds parquet stats to iceberg manifest as per-datafile
metrics.
The following metrics are supported:
- column_sizes :
Map from column id to the total size on disk of all regions that
store the column. Does not include bytes necessary to read other
columns, like footers.
- null_value_counts :
Map from column id to number of null values in the column.
- lower_bounds :
Map from column id to lower bound in the column serialized as
binary. Each value must be less than or equal to all non-null,
non-NaN values in the column for the file.
- upper_bounds :
Map from column id to upper bound in the column serialized as
binary. Each value must be greater than or equal to all non-null,
non-NaN values in the column for the file.
The corresponding parquet stats are collected by 'ColumnStats'
(in 'min_value_', 'max_value_', 'null_count_' members) and
'HdfsParquetTableWriter::BaseColumnWriter' (in
'total_compressed_byte_size_' member).
Testing:
- New e2e test was added to verify that the metrics are written to the
Iceberg manifest upon inserting data.
- New e2e test was added to verify that lower_bounds/upper_bounds
metrics are used to prune data files on querying iceberg tables.
- Existing e2e tests were updated to work with the new behavior.
- BE test for single-value serialization.
Relevant Iceberg documentation:
- Manifest:
https://iceberg.apache.org/spec/#manifests
- Values in lower_bounds and upper_bounds maps should be Single-value
serialized to binary:
https://iceberg.apache.org/spec/#appendix-d-single-value-serialization
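For illustration, per Appendix D an int bound is stored as its 4-byte
little-endian representation; a minimal sketch (not Impala's writer
code, and assuming a little-endian host):

#include <cstdint>
#include <cstring>
#include <string>

// Serializes an INT lower/upper bound the way Iceberg's single-value
// serialization describes for the 'int' type: 4 bytes, little-endian.
std::string SerializeIntBound(int32_t bound) {
  char buf[sizeof(bound)];
  std::memcpy(buf, &bound, sizeof(bound));
  return std::string(buf, sizeof(buf));
}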
Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Reviewed-on: http://gerrit.cloudera.org:8080/17806
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Attila Jeges <attilaj@cloudera.com>
This patch adds support "FOR SYSTEM_TIME AS OF" and
"FOR SYSTEM_VERSION AS OF" clauses for Iceberg tables. The new
clauses are part of the table ref. FOR SYSTEM_TIME AS OF conforms to the
SQL2011 standard:
https://cs.ulb.ac.be/public/_media/teaching/infoh415/tempfeaturessql2011.pdf
With FOR SYSTEM_TIME AS OF we can query a table at a specific time
point, e.g. we can retrieve what the table content was 1 day ago.
The timestamp given to "FOR SYSTEM_TIME AS OF" is interpreted in the
local timezone. The local timezone can be set via the query option
TIMEZONE. By default the timezone being used is the coordinator node's
local timezone. The timestamp is translated to UTC because table
snapshots are tagged with UTC timestamps.
"FOR SYSTEM_VERSION AS OF" is a non-standard extension. It works
similarly to FOR SYSTEM_TIME AS OF, but with this clause we can query
a table via a snapshot ID instead of a timestamp.
HIVE-25344 also added support for these clauses to Hive.
Table snapshot IDs and timestamp information can be queried with the
help of the DESCRIBE HISTORY command.
Sample queries:
SELECT * FROM t FOR SYSTEM_TIME AS OF now();
SELECT * FROM t FOR SYSTEM_TIME AS OF '2021-08-10 11:02:34';
SELECT * FROM t FOR SYSTEM_TIME AS OF now() - interval 10 days + interval 3 hours;
SELECT * FROM t FOR SYSTEM_VERSION AS OF 7080861547601448759;
SELECT * FROM t FOR SYSTEM_TIME AS OF now()
MINUS
SELECT * FROM t FOR SYSTEM_TIME AS OF now() - interval 1 days;
This patch uses some parts of the in-progress
IMPALA-9773 (https://gerrit.cloudera.org/#/c/13342/) developed by
Todd Lipcon and Grant Henke. This patch also resolves some TODOs of
IMPALA-9773, i.e. after this patch it'll be easier to add
time travel for Kudu tables as well.
Testing:
* added parser tests (ParserTest.java)
* added analyzer tests (AnalyzeStmtsTest.java)
* added e2e tests (test_iceberg.py)
Change-Id: Ib523c5e47b8d9c377bea39a82fe20249177cf824
Reviewed-on: http://gerrit.cloudera.org:8080/17765
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The HashTable implementation in Impala comprises a contiguous array
of Buckets, and each Bucket contains either data or a pointer to a
linked list of duplicate entries named DuplicateNodes.
These are the structures of Bucket and DuplicateNode:
struct DuplicateNode {
  bool matched;
  DuplicateNode* next;
  HtData htdata;
};

struct Bucket {
  bool filled;
  bool matched;
  bool hasDuplicates;
  uint32_t hash;
  union {
    HtData htdata;
    DuplicateNode* duplicates;
  } bucketData;
};
The size of Bucket is currently 16 bytes and the size of DuplicateNode
is 24 bytes. If we can remove the booleans from both structs, the size
of Bucket would reduce to 12 bytes and DuplicateNode to 16 bytes.
One of the ways we can remove the booleans is to fold them into
pointers already part of the structs. Pointers store addresses, and on
architectures like x86 and ARM the linear address is only 48 bits
long. With level 5 paging Intel is planning to expand it to 57 bits,
which means we can use the most significant 7 bits, i.e., bits 58 to
64, to store these booleans. This patch reduces the size of Bucket
and DuplicateNode by implementing this folding. However, there is
another requirement that the size of Bucket be a power of 2 and
also that the number of buckets in the hash table be a power of 2.
These requirements exist for the following reasons:
1. The memory allocator allocates memory in powers of 2 to avoid
   internal fragmentation. Hence, num buckets * sizeof(Bucket)
   should be a power of 2.
2. The number of buckets being a power of 2 enables a faster modulo
   operation, i.e., instead of the slow modulo (hash % N), the faster
   (hash & (N-1)) can be used.
Due to this, the 4-byte 'hash' field is removed from Bucket and
stored separately in a new array hash_array_ in HashTable.
This ensures sizeof(Bucket) is 8, which is a power of 2.
New Classes:
------------
As a part of this patch, TaggedPointer is introduced, which is a
template class that stores a pointer and a 7-bit tag together in a
64-bit integer. This class has ownership of the pointer and takes care
of allocation and deallocation of the object being pointed to.
However, derived classes can opt out of ownership of the object
and let the client manage it. Its derived classes for Bucket and
DuplicateNode do the same. These classes are TaggedBucketData and
TaggedDuplicateNode.
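A minimal sketch of the packing idea (not Impala's actual
TaggedPointer, which also deals with ownership as described above):

#include <cstdint>

// Packs a pointer and a 7-bit tag into one 64-bit word, relying on the
// upper bits of current 48/57-bit canonical addresses being unused.
template <typename T>
class TaggedPtrSketch {
 public:
  static constexpr int kTagShift = 57;  // bits 58..64 (1-based) hold the tag
  static constexpr uint64_t kPtrMask = (1ULL << kTagShift) - 1;

  void Set(T* ptr, uint8_t tag) {
    data_ = (reinterpret_cast<uint64_t>(ptr) & kPtrMask)
        | (static_cast<uint64_t>(tag & 0x7F) << kTagShift);
  }
  T* ptr() const { return reinterpret_cast<T*>(data_ & kPtrMask); }
  uint8_t tag() const { return static_cast<uint8_t>(data_ >> kTagShift); }

 private:
  uint64_t data_ = 0;
};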
Benchmark:
----------
As a part of this patch a new micro benchmark for HashTable has
been introduced, which helps in measuring:
1. Runtime for building the hash table and probing it.
2. Memory consumed after building the table.
This helps measure the impact of changes to the HashTable's
data structure and algorithm.
Saw 25-30% reduction in memory consumed and no significant
difference in performance (0.91X-1.2X).
Other Benchmarks:
1. Billion row Synthetic benchmark on single node, single daemon:
a. 2-3% improvement in Join GEOMEAN for Probe benchmark.
b. 17% and 21% reduction in PeakMemoryUsage and
CumulativeBytes allocated respectively
2. TPCH-42: 0-1.5% improvement in GEOMEAN runtime
Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b
Reviewed-on: http://gerrit.cloudera.org:8080/17592
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
A LIKE predicate is generally evaluated by converting it into a regex
that is evaluated at execution time. If the predicate of a LIKE clause
is a constant (which is the common case when you say "row
like 'start%'") then there are optimizations where some cases that are
simpler than a regex are spotted, and a simpler function than a regex
evaluator is used. One example is that a predicate such as 'start%' is
evaluated by looking for strings that begin with "start". Amusingly,
the code that spots the potential optimizations uses regexes to look
for patterns in the LIKE predicate. The code that looks for the
optimization where a simple prefix can be searched for does not deal
with the case where the '%' wildcard at the end of the predicate is
escaped. To fix this we add a check for the case where the
predicate ends in an escaped '%'.
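A sketch of the kind of check involved, using RE2 (the pattern below is
illustrative, not the one in the Impala source): the prefix
optimization should only fire when the trailing '%' is not escaped,
which can be expressed by letting the prefix part contain only
non-wildcard characters or escaped characters.

#include <re2/re2.h>
#include <string>

// Returns true (and captures the raw, still-escaped prefix) only when
// the pattern is "prefix%" with an unescaped trailing '%', e.g.
// "start%". Patterns like "start\%" or "sta%rt%" do not match.
bool IsSimplePrefixLike(const std::string& pattern, std::string* prefix) {
  static const re2::RE2 kPrefixRe("((?:[^%_\\\\]|\\\\.)*)%");
  return re2::RE2::FullMatch(pattern, kPrefixRe, prefix);
}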
There are some other problems with escaped wildcards discussed in
IMPALA-2422. This change does not fix these problems, which are hard.
New tests for escaped wildcards are added to exprs.test - note that
these tests cannot be part of the LikeTbl tests as the like predicate
optimizations are only applied when the like predicate is a string
literal.
Exhaustive tests ran clean.
Change-Id: I30356c19f4f169d99f7cc6268937653af6b41b70
Reviewed-on: http://gerrit.cloudera.org:8080/17798
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch enables min/max filtering for non-correlated subqueries
that return one value. In this case, the filters are built from the
results of the subqueries and the filtering target is the scan node to
be qualified by one of the subqueries. Shown below is one such query
that normally gets compiled into a nested loop join. The filtering
limits the values from column store_sales.ss_sales_price to be within
[-infinity, min(ss_wholesale_cost)].
select count(*) from store_sales
where ss_sales_price <=
(select min(ss_wholesale_cost) from store_sales);
In FE, the fact that the above scalar subquery exists is recorded
in a flag in InlineViewRef in analyzer and later on transferred to
AggregationNode in planner.
In BE, the min/max filtering infrastructure is integrated with the
nested loop join as follows.
1. NljBuilderConfig is populated with filter descriptors from nested
join plan node via NljBuilder::CreateEmbeddedBuilder() (similar
to hash join), or in NljBuilderConfig::Init() when the sink config
is created (for separate builder case);
2. NljBuilder is populated with filter contexts utilizing the filter
descriptors in NljBuilderConfig. Filter contexts are the interface
to actual min/max filters;
3. New insertion methods InsertFor<op>(), where <op> is LE, LT, GE and
   GT, are added to the MinMaxFilter class hierarchy (a sketch follows
   this list). They are used for join predicates of the form
   target <op> src_expr;
4. RuntimeContext::InsertPerCompareOp() calls one of the new
insertion methods above based on the comparison op saved in the
filter descriptor;
5. NljBuilder::InsertRuntimeFilters() calls the new methods.
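A minimal sketch of the per-op insertion semantics referenced in step 3
(hypothetical filter struct, not Impala's MinMaxFilter hierarchy):

#include <algorithm>
#include <cstdint>
#include <limits>

struct MinMaxFilterSketch {
  bool empty = true;
  int64_t min = 0;
  int64_t max = 0;

  // Join predicate "target <= src": a probe value can match if it is <=
  // the largest build-side value, so only the upper bound is constrained.
  void InsertForLE(int64_t src) {
    max = empty ? src : std::max(max, src);
    min = std::numeric_limits<int64_t>::min();
    empty = false;
  }
  // Join predicate "target >= src": symmetric, only the lower bound matters.
  void InsertForGE(int64_t src) {
    min = empty ? src : std::min(min, src);
    max = std::numeric_limits<int64_t>::max();
    empty = false;
  }
};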
By default, the feature is turned on only for sorted or partitioned
join columns.
Testing:
1. Add single range insertion tests in min-max-filter-test.cc;
2. Add positive and negative plan tests in
overlap_min_max_filters.test;
3. Add tests in overlap_min_max_filters_on_partition_columns.test;
4. Add tests in overlap_min_max_filters_on_sorted_columns.test;
5. Run core tests.
TODO in follow-up patches:
1. Extend min/max filter for inequality subquery for other use cases
(IMPALA-10869).
Change-Id: I7c2bb5baad622051d1002c9c162c672d428e5446
Reviewed-on: http://gerrit.cloudera.org:8080/17706
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
StringToFloatInternal is used to parse strings into floats. It had
logic to ensure it is faster than standard functions like strtod in
many cases, but it was not as accurate. We are replacing it with a
third party library named fast_double_parser, which is both fast and
doesn't sacrifice accuracy for speed. Benchmarking on more than
1 million rows where a string is cast to double found that the new
patch is on par with the earlier algorithm.
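A minimal usage sketch of the library (assuming its parse_number entry
point as documented in the project README; this is not the Impala
integration code):

#include <cstdio>
#include "fast_double_parser.h"

int main() {
  double value = 0.0;
  // parse_number reports failure via a null/false return value.
  if (fast_double_parser::parse_number("3.14159", &value)) {
    std::printf("parsed %.17g\n", value);
  }
  return 0;
}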
Results:
W/O library: Fetched 1222386 row(s) in 32.10s
With library: Fetched 1222386 row(s) in 31.71s
Testing:
1. Added test to check for accuracy improvement.
2. Ran existing Backend tests for correctness.
Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141
Reviewed-on: http://gerrit.cloudera.org:8080/17389
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This change allows usage of commands that do not require reading the
JSON file, like:
- Create Table <Table> stored as JSONFILE
- Show Create Table <Table>
- Describe <Table>
Changes:
- Added JSON as FileFormat to thrift and HdfsFileFormat.
- Allowing Sql keyword 'jsonfile' and mapping it to JSON format.
- Adding JSON serDe.
- JSON files have the same input format as TextFile, so we need to use
  the SerDe library in use to differentiate between the two formats.
  Overloaded the functions querying the file format based on input
  format to consider the SerDe library too.
- Added tests for 'Create Table' and 'Show Create Table' commands
Pending Changes:
- test for Describe command - to be added with backend changes.
Change-Id: I5b8cb2f59df3af09902b49d3bdac16c19954b305
Reviewed-on: http://gerrit.cloudera.org:8080/17727
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Hive relies on the engine.hive.enabled=true table property being set
for Iceberg tables. Without it, Hive overwrites the table metadata with
a different storage handler and SerDe/Input/OutputFormatter when it
writes the table, making it unusable.
With this patch Impala sets this table property during table creation.
Testing:
* updated show-create-table.test
* tested Impala/Hive interop manually
Change-Id: I6aa0240829697a27f48d0defcce48920a5d6f49b
Reviewed-on: http://gerrit.cloudera.org:8080/17750
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The previous patch added checks on illegal decimal schemas of Parquet
files. However, it didn't return a non-OK status in
ParquetMetadataUtils::ValidateColumn if abort_on_error is set to false,
so we continued to use the illegal file schema and hit a DCHECK.
This patch fixes this and adds test coverage for illegal decimal
schemas.
Tests:
- Add a bad parquet file with illegal decimal schemas.
- Add e2e tests on the file.
- Ran test_fuzz_decimal_tbl 100 times. Saw the errors are caught as
expected.
Change-Id: I623f255a7f40be57bfa4ade98827842cee6f1fee
Reviewed-on: http://gerrit.cloudera.org:8080/17748
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
With this patch Impala will support partition evolution for
Iceberg tables.
The DDL statement to change the default partition spec is:
ALTER TABLE <tbl> SET PARTITION SPEC(<partition-spec>)
Hive uses the same SQL syntax.
Testing:
- Added FE test to exercise parsing various well-formed and ill-formed
ALTER TABLE SET PARTITION SPEC statements.
- Added e2e tests for:
- ALTER TABLE SET PARTITION SPEC works for tables with HadoopTables
and HadoopCatalog Catalog.
- When evolving partition spec, the old data written with an earlier
spec remains unchanged. New data is written using the new spec in
a new layout. Data written with earlier spec and new spec can be
fetched in a single query.
- Invalid ALTER TABLE SET PARTITION SPEC statements yield the
expected analysis error messages.
Change-Id: I9bd935b8a82e977df9ee90d464b5fe2a7acc83f2
Reviewed-on: http://gerrit.cloudera.org:8080/17723
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds support for the following standard Iceberg properties:
write.parquet.compression-codec:
Parquet compression codec. Supported values are: NONE, GZIP, SNAPPY
(default value), LZ4, ZSTD. The table property will be ignored if
COMPRESSION_CODEC query option is set.
write.parquet.compression-level:
Parquet compression level. Used with ZSTD compression only.
Supported range is [1, 22]. Default value is 3. The table property
will be ignored if COMPRESSION_CODEC query option is set.
write.parquet.row-group-size-bytes :
Parquet row group size in bytes. Supported range is [8388608,
2146435072] (8MB - 2047MB). The table property will be ignored if
PARQUET_FILE_SIZE query option is set.
If neither the table property nor the PARQUET_FILE_SIZE query option
is set, the way Impala calculates row group size will remain
unchanged.
write.parquet.page-size-bytes:
Parquet page size in bytes. Used for PLAIN encoding. Supported range
is [65536, 1073741824] (64KB - 1GB).
If the table property is unset, the way Impala calculates page size
will remain unchanged.
write.parquet.dict-size-bytes:
Parquet dictionary page size in bytes. Used for dictionary encoding.
Supported range is [65536, 1073741824] (64KB - 1GB).
If the table property is unset, the way Impala calculates dictionary
page size will remain unchanged.
This patch also renames 'iceberg.file_format' table property to
'write.format.default' which is the standard Iceberg name for the
table property.
Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23
Reviewed-on: http://gerrit.cloudera.org:8080/17654
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch addresses a failure in the ubuntu-16.04 dockerised test. The
test involved is found in overlap_min_max_filters_on_sorted_columns.test,
as follows.
set minmax_filter_fast_code_path=on;
set MINMAX_FILTER_THRESHOLD=0.0;
SET RUNTIME_FILTER_WAIT_TIME_MS=$RUNTIME_FILTER_WAIT_TIME_MS;
select straight_join count(a.timestamp_col) from
alltypes_timestamp_col_only a join [SHUFFLE] alltypes_limited b
where a.timestamp_col = b.timestamp_col and b.tinyint_col = 4;
---- RUNTIME_PROFILE
aggregation(SUM, NumRuntimeFilteredPages)> 57
The patch reduces the threshold from 58 to 50.
Testing:
Ran the unit test successfully.
Change-Id: Icb4cc7d533139c4a2b46a872234a47d46cb8a17c
Reviewed-on: http://gerrit.cloudera.org:8080/17696
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Similar to the previous patch, this patch adds UTF-8 support in the
instr() and locate() builtin functions so they have behavior consistent
with Hive's. These two string functions both have optional position
arguments:
INSTR(STRING str, STRING substr[, BIGINT position[, BIGINT occurrence]])
LOCATE(STRING substr, STRING str[, INT pos])
Their return values are positions of the matched substring.
In UTF-8 mode (turned on by set UTF8_MODE=true), these positions are
counted by UTF-8 characters instead of bytes.
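Conceptually (a sketch, not the actual implementation), the character
position of a match found at some byte offset can be computed by
counting the bytes that start a code point:

#include <cstddef>
#include <cstdint>
#include <string>

// Maps a 0-based byte offset in a UTF-8 string to a 1-based character
// position by counting non-continuation bytes before the offset.
int Utf8CharPosition(const std::string& s, size_t byte_offset) {
  int pos = 0;
  for (size_t i = 0; i < byte_offset && i < s.size(); ++i) {
    if ((static_cast<uint8_t>(s[i]) & 0xC0) != 0x80) ++pos;
  }
  return pos + 1;
}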
Error handling:
Malformed UTF-8 characters are counted as one byte per character. This
is consistent with Hive since Hive replaces those bytes to U+FFFD
(REPLACEMENT CHARACTER). E.g. GenericUDFInstr calls Text#toString(),
which performs the replacement. We can provide more behaviors on error
handling like ignoring them or reporting errors. IMPALA-10761 will focus
on this.
Tests:
- Add BE unit tests and e2e tests
- Add random tests to make sure malformed UTF-8 characters won't crash
us.
Change-Id: Ic13c3d04649c1aea56c1aaa464799b5e4674f662
Reviewed-on: http://gerrit.cloudera.org:8080/17580
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Two Iceberg commits got into master branch in parallel. One of
them modified the DDL syntax, the other one added some tests.
They were correct on their own, but mixing the two causes
test failures.
The affected tests have been updated.
Change-Id: Id3cf6ff04b8da5782df2b84a580cdbd4a4a16d06
Reviewed-on: http://gerrit.cloudera.org:8080/17689
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch enables min/max filtering on any Z-order sort-by columns
by default.
Since the column stats for a row group or a page are computed from the
column values stored in the row group or the page, the current
infrastructure for min/max filtering works for Z-order out of the box.
The fact that these column values are ordered by Z-order is
orthogonal to the work of min/max filtering.
By default, the new feature is enabled. Set the existing control knob
minmax_filter_sorted_columns to false to turn it off.
Testing
1. Added new z-order related sort column tests in
overlap_min_max_filters_on_sorted_columns.test;
2. Ran core-test.
Change-Id: I2a528ffbd0e333721ef38b4be7d4ddcdbf188adf
Reviewed-on: http://gerrit.cloudera.org:8080/17635
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Iceberg recently switched to use its Catalogs class to define
catalog and table properties. Catalog information is stored in
a configuration file such as hive-site.xml, and the table properties
contain information about which catalog is being used and what the
Iceberg table id is.
E.g. in the Hive conf we can have the following properties to define
catalogs:
iceberg.catalog.<catalog_name>.type = hadoop
iceberg.catalog.<catalog_name>.warehouse = somelocation
or
iceberg.catalog.<catalog_name>.type = hive
And at the table level we can have the following:
iceberg.catalog = <catalog_name>
name = <table_identifier>
Table property 'iceberg.catalog' refers to a Catalog defined in the
configuration file. This is in contradiction with Impala's current
behavior where we are already using 'iceberg.catalog', and it can
have the following values:
* hive.catalog for HiveCatalog
* hadoop.catalog for HadoopCatalog
* hadoop.tables for HadoopTables
To be backward compatible and also support the new Catalogs properties,
Impala still recognizes the above special values. But from now on,
Impala doesn't define 'iceberg.catalog' by default. 'iceberg.catalog' being
NULL means HiveCatalog for both Impala and Iceberg's Catalogs API,
hence for Hive and Spark as well.
If 'iceberg.catalog' has a different value than the special values it
indicates that Iceberg's Catalogs API is being used, so Impala will
try to look up the catalog configuration from the Hive config file.
Testing:
* added SHOW CREATE TABLE tests
* added e2e tests that create/insert/drop Iceberg tables with Catalogs
* manually tested interop behavior with Hive
Change-Id: I5dfa150986117fc55b28034c4eda38a736460ead
Reviewed-on: http://gerrit.cloudera.org:8080/17466
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Currently we have a DDL syntax for defining Iceberg partitions that
differs from SparkSQL:
https://iceberg.apache.org/spark-ddl/#partitioned-by
E.g. Impala is using the following syntax:
CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITION BY SPEC (i BUCKET 5, ts MONTH, d YEAR)
STORED AS ICEBERG;
The same in Spark is:
CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
USING ICEBERG
PARTITIONED BY (bucket(5, i), months(ts), years(d))
HIVE-25179 added the following syntax for Hive:
CREATE TABLE ice_t (i int, s string, ts timestamp, d date)
PARTITIONED BY SPEC (bucket(5, i), months(ts), years(d))
STORED BY ICEBERG;
I.e. the same syntax as Spark, but adding the keyword "SPEC".
This patch makes Impala use Hive's syntax, i.e. we will also
use the PARTITIONED BY SPEC clause + the unified partition
transform syntax.
Testing:
* existing tests have been rewritten with the new syntax
Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62
Reviewed-on: http://gerrit.cloudera.org:8080/17575
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
IMPALA-7087 is about reading Parquet decimal columns with lower
precision/scale than table metadata.
IMPALA-8131 is about reading Parquet decimal columns with higher
scale than table metadata.
Both are resolved by this patch. It reuses some parts from an
earlier change request from Sahil Takiar:
https://gerrit.cloudera.org/#/c/12163/
A new utility class, ParquetDataConverter, has been introduced which
does the data conversion. It also helps to decide whether data
conversion is needed or not.
NULL values are returned in case of overflows. This behavior is
consistent with Hive.
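A simplified sketch of such a conversion (not the actual
ParquetDataConverter; it truncates rather than rounds, uses int64
storage only, and assumes a target precision of at most 18 digits):

#include <cstdint>

static int64_t Pow10(int n) {
  int64_t p = 1;
  while (n-- > 0) p *= 10;
  return p;
}

// Rescales an unscaled decimal value from 'src_scale' to 'dst_scale'
// and checks that it fits into 'dst_precision' digits. Returns false on
// overflow, in which case the caller would produce NULL.
bool ConvertDecimal(int64_t val, int src_scale, int dst_precision,
                    int dst_scale, int64_t* out) {
  if (dst_scale >= src_scale) {
    int64_t factor = Pow10(dst_scale - src_scale);
    if (val > INT64_MAX / factor || val < INT64_MIN / factor) return false;
    *out = val * factor;
  } else {
    *out = val / Pow10(src_scale - dst_scale);  // truncates extra digits
  }
  int64_t max_abs = Pow10(dst_precision) - 1;
  return *out <= max_abs && *out >= -max_abs;
}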
Parquet column stats reader is also updated to convert the decimal
values. The stats reader is used to evaluate min/max conjuncts. It
works well because later we also evaluate the conjuncts on the
converted values anyway.
The status of different filterings:
* dictionary filtering: disabled for columns that need conversion
* runtime bloom filters: work on the converted values
* runtime min/max filters: work on the converted values
This patch also enables schema evolution of decimal columns of Iceberg
tables.
Testing:
* added e2e tests
Change-Id: Icefa7e545ca9f7df1741a2d1225375ecf54434da
Reviewed-on: http://gerrit.cloudera.org:8080/17678
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
This patch enables min/max filters for partitioned columns to take
advantage of the min/max filter infrastructure already built and to
provide coverage for certain equi-joins in which the stats filters
are not feasible.
The new feature is turned on by default and to turn off the feature,
set the new query option minmax_filter_partition_column to false.
In the patch, the existing query option enabled_runtime_filter_types
is enforced in specifying the types of the filters generated. The
default value ALL generates both the bloom and min/max filters. The
alternative value BLOOM generates only the bloom filters and another
alternative value MIN_MAX generates only the min/max filters.
The normal control knobs minmax_filter_threshold (for threshold) and
minmax_filtering_level (for filtering level) still work. When the
threshold is 0, the patch automatically assigns a reasonable value
for the threshold.
Testing:
1). Added new tests in
overlap_min_max_filters_on_partition_columns.test;
2). Core tests
Change-Id: I89e135ef48b4bb36d70075287b03d1c12496b042
Reviewed-on: http://gerrit.cloudera.org:8080/17568
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch bumps up the GBN to 14842939. This build
includes HIVE-23995 and HIVE-24175, and some of the tests
were modified to take that into account.
Also fixes a minor bug in environ.py.
Testing done:
1. Core tests.
Change-Id: I78f167c1c0d8e90808e387aba0e86b697067ed8f
Reviewed-on: http://gerrit.cloudera.org:8080/17628
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
This patch addresses the following failure in ubuntu 16.04 dockerised
test:
select straight_join count(a.timestamp_col) from
alltypes_timestamp_col_only a join [SHUFFLE] alltypes_limited b
where a.timestamp_col = b.timestamp_col and b.tinyint_col = 4
aggregation(SUM, NumRuntimeFilteredPages): 58
EXPECTED VALUE:
58
ACTUAL VALUE:
59
OP:
:
In the patch, the result expectation is altered from "==58" to ">57".
Testing:
1). Ran test_overlap_min_max_filters_on_sorted_columns multiple
times under regular and local catalog mode.
Change-Id: I4f9eb198dc4e4b0ad1a17696a1d74ff05ac0a436
Reviewed-on: http://gerrit.cloudera.org:8080/17618
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
IMPALA-10166 (part 1) already added the necessary code for
DROP and CHANGE COLUMN, but disabled those stmts because to correctly
support schema evolution we had to wait for column resolution
by Iceberg field id.
Since then IMPALA-10361 and IMPALA-10485 added support for field-id
based column resolution for Parquet and ORC as well.
Hence this patch enables DROP and CHANGE column ALTER TABLE
statements. We still disallow REPLACE COLUMNS because it doesn't
really make sense for Iceberg tables as it basically makes all
existing data inaccessible.
Changing DECIMAL columns is still disabled due to IMPALA-7087.
Testing:
* added e2e tests
Change-Id: I9b0d1a55bf0ed718724a69b51392ed53680ffa90
Reviewed-on: http://gerrit.cloudera.org:8080/17593
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
This patch enables min/max filters for equi-joins on lexical sort-by
columns in a Parquet table created by Impala by default. This is to
take advantage of Impala sorting the min/max values in column index
in each data file for the table. The control knob is query option
minmax_filter_sorted_columns, default to true.
When minmax_filter_sorted_columns is true, the patch will generate
min/max filters only for the leading sort columns. The normal control
knobs minmax_filter_threshold (for threshold) and
minmax_filtering_level (for filtering level) still work. When the
threshold is 0, the patch automatically assigns a reasonable value
for the threshold, and selects PAGE to be the filtering level.
In the backend, the skipped pages are quickly found by taking a
fast code path to identify the corresponding lower and the upper
bounds in the sorted min and max value arrays, given a range in the
filter. The skipped pages are expressed as page ranges which are
translated into row ranges later on.
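The fast code path boils down to binary searches over the sorted
per-page min/max arrays; a minimal sketch (not the actual
implementation):

#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// For a column sorted ascending, both the per-page mins and maxes are
// non-decreasing, so the pages that can overlap [filter_min, filter_max]
// form one contiguous range [first, last).
std::pair<int, int> OverlappingPageRange(const std::vector<int64_t>& page_mins,
                                         const std::vector<int64_t>& page_maxs,
                                         int64_t filter_min,
                                         int64_t filter_max) {
  int first = std::lower_bound(page_maxs.begin(), page_maxs.end(), filter_min) -
              page_maxs.begin();  // first page whose max >= filter_min
  int last = std::upper_bound(page_mins.begin(), page_mins.end(), filter_max) -
             page_mins.begin();   // one past the last page whose min <= filter_max
  return {first, last};
}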
A new query option minmax_filter_fast_code_path is added to control
the work of the fast code path. It can take three values: ON (default),
OFF, or VERIFICATION. The last helps verify that the results
from the fast and the regular code paths are the same.
Preliminary performance testing (joining into a simplified TPCH
lineitem table of 2 sorted BIGINT columns and a total of 6001215
rows) confirms that min/max filtering on leading sort-by columns
improves the performance of scan operators greatly. The best result
is seen with pages containing no more than 24000 rows: 84.62ms
(page-level filtering) vs. 115.27ms (row-group-level filtering)
vs. 137.14ms (no filtering). The query utilized is as follows.
select straight_join a.l_orderkey from
simpflified_lineitem a join [SHUFFLE] tpch_parquet.lineitem b
where a.l_orderkey = b.l_orderkey and b.l_receiptdate = "1998-12-31"
Also fixed in the patch are the abnormal min/max display in the "Final
filter table" section of a profile for the DECIMAL, TIMESTAMP and DATE
data types, and reading the DATE column index in batch without
validation.
Testing:
1). Added a new test overlap_min_max_filters_on_sorted_columns.test
to verify
a) Min/max filters are only created for leading sort by column;
b) Query option minmax_filter_sorted_columns works;
c) Query option minmax_filter_fast_code_path works.
2). Added new tests in parquet-page-index-test.cc to test fast
code path under various conditions;
3). Ran core tests successfully.
Change-Id: I28c19c4b39b01ffa7d275fb245be85c28e9b2963
Reviewed-on: http://gerrit.cloudera.org:8080/17478
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch restores the previous testing environment, in which exchange
node 5 in the test query shown below reliably reports a memory limit
exceeded error.
set mem_limit=171520k;
set num_scanner_threads=1;
select *
from tpch_parquet.lineitem l1
join tpch_parquet.lineitem l2 on l1.l_orderkey = l2.l_orderkey and
l1.l_partkey = l2.l_partkey and l1.l_suppkey = l2.l_suppkey
and l1.l_linenumber = l2.l_linenumber
order by l1.l_orderkey desc, l1.l_partkey, l1.l_suppkey, l1.l_linenumber
limit 5
In the patch, the memory limit for the entire query is lowered to 164MB,
less than 171520k, the minimum for the query to be accepted after
result spooling is enabled by default. This is achieved by
disabling result spooling, i.e. setting spool_query_results to false.
Testing:
1). Ran exchange-mem-scaling.test
Change-Id: Ia4ad4508028645b67de419cfdfa2327d2847cfc4
Reviewed-on: http://gerrit.cloudera.org:8080/17586
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
A built-in function has been added to compute the MD5 128-bit checksum
of a non-null string. If the input string is null, then the output of
the function is null too. In FIPS mode, MD5 is disabled and the
function will throw an error on invocation.
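For reference, the digest itself is the standard MD5 digest; a minimal
sketch (not the Impala built-in's implementation) of producing the hex
digest with OpenSSL, which is assumed here as the crypto library:

#include <openssl/md5.h>
#include <cstdio>
#include <string>

// Returns the 128-bit MD5 digest of 'input' as a 32-character hex string.
std::string Md5Hex(const std::string& input) {
  unsigned char digest[MD5_DIGEST_LENGTH];
  MD5(reinterpret_cast<const unsigned char*>(input.data()), input.size(),
      digest);
  char hex[2 * MD5_DIGEST_LENGTH + 1];
  for (int i = 0; i < MD5_DIGEST_LENGTH; ++i) {
    std::snprintf(hex + 2 * i, 3, "%02x", digest[i]);
  }
  return std::string(hex, 2 * MD5_DIGEST_LENGTH);
}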
Testing:
1. Added expression unit tests.
2. Added end-to-end tests for MD5.
Change-Id: Id406d30a7cc6573212b302fbfec43eb848352ff2
Reviewed-on: http://gerrit.cloudera.org:8080/17567
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds the ability to unset or delete table properties or serde
properties for a table. It supports an 'IF EXISTS' clause in case users
are not sure if the property being unset exists. Without 'IF EXISTS',
trying to unset a property that doesn't exist will fail.
Tests:
1. Added Unit tests and end-to-end tests
2. Covered tables of different storage types like Kudu,
   Iceberg and HDFS tables.
Change-Id: Ife4f6561dcdcd20c76eb299c6661c778e342509d
Reviewed-on: http://gerrit.cloudera.org:8080/17530
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This function receives two strings that are serialized Apache
DataSketches CPC sketches. It unions the two sketches and returns the
resulting sketch of the union.
Example:
select ds_cpc_estimate(ds_cpc_union_f(sketch1, sketch2))
from sketch_tbl;
+---------------------------------------------------+
| ds_cpc_estimate(ds_cpc_union_f(sketch1, sketch2)) |
+---------------------------------------------------+
| 15 |
+---------------------------------------------------+
Change-Id: Ib5c616316bf2bf2ff437678e9a44a15339920150
Reviewed-on: http://gerrit.cloudera.org:8080/17440
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This change adds read support for Parquet Bloom filters for types that
can reasonably be supported in Impala. Other types, such as CHAR(N),
would be very difficult to support because the length may be different
in Parquet and in Impala which results in truncation or padding, and
that changes the hash which makes using the Bloom filter impossible.
Write support will be added in a later change.
The supported Parquet type - Impala type pairs are the following:
 -----------------------------------------
| Parquet type | Impala type              |
|-----------------------------------------|
| INT32        | TINYINT, SMALLINT, INT   |
| INT64        | BIGINT                   |
| FLOAT        | FLOAT                    |
| DOUBLE       | DOUBLE                   |
| BYTE_ARRAY   | STRING                   |
 -----------------------------------------
The following types are not supported for the given reasons:
 -----------------------------------------------------------------
| Impala type | Problem                                           |
|-----------------------------------------------------------------|
| VARCHAR(N)  | truncation can change hash                        |
| CHAR(N)     | padding / truncation can change hash              |
| DECIMAL     | multiple encodings supported                      |
| TIMESTAMP   | multiple encodings supported, timezone conversion |
| DATE        | not considered yet                                |
 -----------------------------------------------------------------
Support may be added for these types later, see IMPALA-10641.
If a Bloom filter is available for a column that is fully dictionary
encoded, the Bloom filter is not used as the dictionary can give exact
results in filtering.
Testing:
- Added tests/query_test/test_parquet_bloom_filter.py that tests
whether Parquet Bloom filtering works for the supported types and
that we do not incorrectly discard row groups for the unsupported
type VARCHAR. The Parquet file used in the test was generated with
an external tool.
- Added unit tests for ParquetBloomFilter in file
be/src/util/parquet-bloom-filter-test.cc
- A minor, unrelated change was done in
be/src/util/bloom-filter-test.cc: the MakeRandom() function had
return type uint64_t, the documentation claimed it returned a 64 bit
random number, but the actual number of random bits is 32, which is
what is intended in the tests. The return type and documentation
have been corrected to use 32 bits.
Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287
Reviewed-on: http://gerrit.cloudera.org:8080/17026
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Csaba Ringhofer <csringhofer@cloudera.com>
When a truncate table command is issued, in the case of
non-transactional tables the table and column statistics for the table
are also deleted by default. This can be an expensive operation,
especially when many truncate table commands are running concurrently.
As the concurrency increases, the response time from the Hive metastore
for the delete table and column statistics RPC calls degrades.
In cases where truncate operation is used to remove the existing
data and then reload new data, it is likely that users will compute
stats again as soon as the new data is reloaded. This would overwrite
the existing statistics and hence the additional time spent by
the truncate operation to delete column and table statistics becomes
unnecessary.
To improve this, this change introduces a new query option:
DELETE_STATS_IN_TRUNCATE. The default value of this option is 1 or
true, which means stats will be deleted as part of the truncate
operation. As the name suggests, when this query option is set to false
or 0, a truncate operation will not delete the table and column
statistics for the table.
This change also makes an improvement to the truncate operation on
tables which are replicated. Previously, if the table was being
replicated, the statistics were not deleted after truncate.
Now the statistics will be deleted after truncate.
Testing:
Modified truncate-table.test to include variations of this query
option, making sure that the statistics are deleted or skipped
from deletion after the truncate operation.
Change-Id: I9400c3586b4bdf46d9b4056ea1023aabae8cc519
Reviewed-on: http://gerrit.cloudera.org:8080/17521
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Currently the ORC scanner only supports position-based column
resolution. This patch adds Iceberg field-id based column resolution
which will be the default for Iceberg tables. It is needed to support
schema evolution in the future, i.e. ALTER TABLE DROP/RENAME COLUMNS.
(The Parquet scanner already supports Iceberg field-id based column
resolution)
Testing
* added e2e test 'iceberg-orc-field-id.test' by copying the contents of
nested-types-scanner-basic,
nested-types-scanner-array-materialization,
nested-types-scanner-position,
nested-types-scanner-maps,
and executing the queries on an Iceberg table with ORC data files
Change-Id: Ia2b1abcc25ad2268aa96dff032328e8951dbfb9d
Reviewed-on: http://gerrit.cloudera.org:8080/17398
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Because of an Iceberg bug Impala didn't push predicates to
Iceberg for dates/timestamps when the predicate referred to a
value before the UNIX epoch.
https://github.com/apache/iceberg/pull/1981 fixed the Iceberg
bug, and lately Impala switched to an Iceberg version that has
the fix, therefore this patch enables predicate pushdown for all
timestamp/date values.
The above Iceberg patch maintains backward compatibility with the
old, wrong behavior. Therefore sometimes we need to read one more
Iceberg partition than necessary.
Testing:
* Updated current e2e tests
Change-Id: Ie67f41a53f21c7bdb8449ca0d27746158be7675a
Reviewed-on: http://gerrit.cloudera.org:8080/17417
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Built-in functions to compute the SHA-1 digest and the SHA-2 family of
digests have been added. Support for SHA-2 digests includes SHA224,
SHA256, SHA384 and SHA512. In FIPS mode SHA1, SHA224 and SHA256 have
been disabled and will throw an error. SHA2 functions will also throw
an error for unsupported bit lengths, i.e., bit lengths apart from 224,
256, 384, 512.
Testing:
1. Added Unit test for expressions.
2. Added end-to-end test for new functions.
Change-Id: If163b7abda17cca3074c86519d59bcfc6ace21be
Reviewed-on: http://gerrit.cloudera.org:8080/17464
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Table alltypes has no statistics, so the cardinality of alltypes
will be estimated based on the HDFS files and the average row size.
In PrintUtils.printMetric a double is divided by a long, which leads
to accuracy problems. In most cases the number of rows calculated is
17.91K, but due to the accuracy problem here the calculated value is
17.90K.
I modified line 221 of stats-extrapolation.test to use row_regex
matching, referring to the cardinality matching method in line 224;
in this case their values are the same.
Testing:
metadata/test_stats_extrapolation.py
Change-Id: I0a1a3809508c90217517705b2b188b2ccba6f23f
Reviewed-on: http://gerrit.cloudera.org:8080/17411
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Jim Apple <jbapple@apache.org>
AVG used to contain a back and forth timezone conversion if
use_local_tz_for_unix_timestamp_conversions is true. This could
affect the results if there were values from different DST rules.
Note that AVG on timestamps has other issues besides this, see
IMPALA-7472 for details.
Testing:
- added a regression test
Change-Id: I999099de8e07269b96b75d473f5753be4479cecd
Reviewed-on: http://gerrit.cloudera.org:8080/17412
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This function receives a set of serialized Apache DataSketches CPC
sketches produced by ds_cpc_sketch() and merges them into a single
sketch.
An example usage is to create a sketch for each partition of a table,
write these sketches to a separate table, and then, based on which
partitions the user is interested in, union the relevant sketches
together to get an estimate. E.g.:
SELECT
ds_cpc_estimate(ds_cpc_union(sketch_col))
FROM sketch_tbl
WHERE partition_col=1 OR partition_col=5;
Testing:
- Apart from the automated tests I added to this patch I also
  tested ds_cpc_union() on a bigger dataset to check that the
  serialization, deserialization and merging steps work well. I
  took TPCH25.lineitem, created a number of sketches grouped
  by l_shipdate and called ds_cpc_union() on those sketches.
Change-Id: Ib94b45ae79efcc11adc077dd9df9b9868ae82cb6
Reviewed-on: http://gerrit.cloudera.org:8080/17372
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
These functions can be used to get cardinality estimates of data
using CPC algorithm from Apache DataSketches. ds_cpc_sketch()
receives a dataset, e.g. a column from a table, and returns a
serialized CPC sketch in string format. This can be written to a
table or be fed directly to ds_cpc_estimate() that returns the
cardinality estimate for that sketch.
Similar to the HLL sketch, the primary use-case for the CPC sketch
is for counting distinct values as a stream, and then merging
multiple sketches together for a total distinct count.
For more details about Apache DataSketches' CPC see:
http://datasketches.apache.org/docs/CPC/CPC.html
Figures-of-Merit Comparison of the HLL and CPC Sketches see:
https://datasketches.apache.org/docs/DistinctCountMeritComparisons.html
Testing:
- Added some tests running estimates for small datasets where the
amount of data is small enough to get the correct results.
- Ran manual tests on tpch_parquet.lineitem to compare performance
  with ndv(). Depending on data characteristics ndv() appears 2x-3x
  faster. CPC gives a closer estimate than the current ndv(), and CPC
  is more accurate than HLL in some cases.
Change-Id: I731e66fbadc74bc339c973f4d9337db9b7dd715a
Reviewed-on: http://gerrit.cloudera.org:8080/16656
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
ORC-189 and ORC-666 added support for a new timestamp type,
'TIMESTAMP WITH LOCAL TIMEZONE', to the ORC library.
This patch adds support for reading such timestamps with Impala.
These are UTC-normalized timestamps, therefore we convert them
to local timezone during scanning.
Testing:
* added test for CREATE TABLE LIKE ORC
* added scanner tests to test_scanners.py
Change-Id: Icb0c6a43ebea21f1cba5b8f304db7c4bd43967d9
Reviewed-on: http://gerrit.cloudera.org:8080/17347
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
IMPALA-10482: SELECT * query on unrelative collection column of
transactional ORC table will hit IllegalStateException.
The AcidRewriter will rewrite queries like
"select item from my_complex_orc.int_array" to
"select item from my_complex_orc t, t.int_array"
This causes trouble in star expansion. Because the original query
"select * from my_complex_orc.int_array" is analyzed as
"select item from my_complex_orc.int_array"
But the rewritten query "select * from my_complex_orc t, t.int_array" is
analyzed as "select id, item from my_complex_orc t, t.int_array".
Hidden table refs can also cause issues during regular column
resolution. E.g. when the table has top-level 'pos'/'item'/'key'/'value'
columns.
The workaround is to keep track of the automatically added table refs
during query rewrite. So when we analyze the rewritten query we can
ignore these auxiliary table refs.
IMPALA-10493: Using JOIN ON syntax to join two full ACID collections
produces wrong results.
When AcidRewriter.splitCollectionRef() creates a new collection ref
it doesn't copy all the information needed to correctly execute the
query. E.g. it dropped the ON clause, turning INNER joins into CROSS
joins.
Testing:
* added e2e tests
Change-Id: I8fc758d3c1e75c7066936d590aec8bff8d2b00b0
Reviewed-on: http://gerrit.cloudera.org:8080/17038
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The issue occurs when a CastFormatExpr is shared among threads and
multiple threads call its OpenEvaluator(). Later calls delete the
DateTimeFormatContext created by earlier calls, which makes
fn_ctx->GetFunctionState() return a dangling pointer.
This only happens when CastFormatExpr is shared among
FragmentInstances - in case of scanner threads OpenEvaluator() is
called with THREAD_LOCAL and returns early without modifying anything.
Testing:
- added a regression test
Change-Id: I501c8a184591b1c836b2ca4cada1f2117f9f5c99
Reviewed-on: http://gerrit.cloudera.org:8080/17374
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The original approach to convert DecimalValue (the internal
representation of decimals) to double was not accurate.
It was:
static_cast<double>(value_) / pow(10.0, scale).
However, only integers from -2^53 to 2^53 can be represented
exactly in double precision without any loss.
Hence, it did not work for numbers like -0.43149576573887316.
For the DecimalValue representing -0.43149576573887316, value_ would be
-43149576573887316 and scale would be 17. As value_ < -2^53, the
result would not be accurate. The new approach uses the third-party
library https://github.com/lemire/fast_double_parser, which
handles the above scenario in a performant manner.
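A small standalone program that reproduces the precision issue (not
Impala's code; strtod stands in here for a correctly rounded parse, while
the patch itself relies on fast_double_parser):
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

int main() {
  // DecimalValue for -0.43149576573887316: value_ = -43149576573887316, scale = 17.
  int64_t value = -43149576573887316LL;
  int scale = 17;

  // Old approach: the cast already rounds because |value_| > 2^53.
  double naive = static_cast<double>(value) / std::pow(10.0, scale);

  // Correctly rounded reference obtained by parsing the decimal string.
  double parsed = strtod("-0.43149576573887316", nullptr);

  // The last digits of the two results may differ on typical platforms.
  printf("naive  = %.17g\nparsed = %.17g\n", naive, parsed);
  return 0;
}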
Testing:
1. Added end-to-end tests covering the following scenarios:
a. Test to show the precision limitation of 16 in the write path
b. DecimalValue's value_ between -2^53 and 2^53
c. value_ outside the above range but abs(value_) < UINT64_MAX
d. abs(value_) > UINT64_MAX - covers DecimalValue<__int128_t>
2. Ran existing backend and end-to-end tests completely
Change-Id: I56f0652cb8f81a491b87d9b108a94c00ae6c99a1
Reviewed-on: http://gerrit.cloudera.org:8080/17303
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
In EE tests, HS2 returned results with lower precision than Beeswax for
FLOAT/DOUBLE/TIMESTAMP types. These differences are not inherent to the
HS2 protocol - the results are returned with full precision in Thrift
and lose precision during conversion in client code.
This patch changes the conversion in HS2 to match Beeswax and removes
the test section DBAPI_RESULTS that was used to handle the differences:
- float/double: print method is changed from str() to ":.16".format()
- timestamp: impyla's cursor is created with convert_types=False to
avoid conversion to datetime.datetime (which has only
microsec precision)
Note that FLOAT/DOUBLE still differ in impala-shell; this change
only deals with EE tests.
Testing:
- ran the changed tests
Change-Id: If69ae90c6333ff245c2b951af5689e3071f85cb2
Reviewed-on: http://gerrit.cloudera.org:8080/17325
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
The change improves how a coordinator behaves when a newly
arrived min/max filter is always true. A new member
'always_true_filter_received_' is introduced to record this
fact. Similarly, the new member 'always_false_flipped_to_false_'
is added to indicate that the always-false flag has flipped from
'true' to 'false'. These two members only influence how the min
and max columns in the "Filter routing table" and "Final filter
table" sections of the profile are displayed, as follows:
1. 'PartialUpdates' - The min and the max are partially updated;
2. 'AlwaysTrue' - One received filter is AlwaysTrue;
3. 'AlwaysFalse' - No filter is received or all received
filters are empty;
4. 'Real values' - The final accumulated min/max from all
received filters.
A second change is to record, in the scan node, the arrival time
of min/max filters (as a timestamp relative to system boot,
obtained by calling MonotonicMillis()). A timestamp of a similar
nature is recorded in HDFS Parquet scanners when a row group is
processed. By comparing these two timestamps, one can easily
diagnose issues related to the late arrival of min/max filters.
This change also addresses a flaw where rows were unexpectedly
filtered out because the always_true_ flag in a min/max filter,
when set, was ignored in the eval code path in
RuntimeFilter::Eval().
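A conceptual sketch of the corrected evaluation order (illustrative only,
not the actual RuntimeFilter::Eval() implementation):
#include <cstdint>

struct MinMaxFilter {
  bool always_true = false;
  bool always_false = false;
  int64_t min = 0;
  int64_t max = 0;
};

// An always-true filter must accept every row; ignoring the flag (as the
// old code path did) can drop rows that should survive.
bool EvalMinMax(const MinMaxFilter& f, int64_t value) {
  if (f.always_true) return true;
  if (f.always_false) return false;
  return value >= f.min && value <= f.max;
}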
Testing:
1. Added three new tests in overlap_min_max_filters.test to
verify that the min/max are displayed correctly when the
min/max filter in hash join builder is set to always true,
always false, or a pair of meaningful min and max values.
2. Ran unit tests;
3. Ran runtime-filter-test;
4. Ran core tests successfully.
Change-Id: I326317833979efcbe02ce6c95ad80133dd5c7964
Reviewed-on: http://gerrit.cloudera.org:8080/17252
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
This patch adds the functionality to compute the minimal and the maximal
value for columns of integer, float/double, date, or decimal type in
Parquet tables, and to use the new stats to discard min/max filters,
in both hash join builders and Parquet scanners, when their coverage
is too close to the actual range defined by the column min and max.
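A sketch of the kind of check this enables; the overlap ratio, threshold,
and function name are assumptions for illustration, not the exact
planner/scanner logic:
#include <algorithm>

// Discard a min/max filter when its [filter_min, filter_max] range covers
// almost all of the column's [col_min, col_max] range known from the new
// stats: such a filter would eliminate very few rows.
bool ShouldDiscardMinMaxFilter(double filter_min, double filter_max,
                               double col_min, double col_max,
                               double threshold /* e.g. 0.9, assumed */) {
  double col_range = col_max - col_min;
  if (col_range <= 0) return true;
  double overlap = std::min(filter_max, col_max) - std::max(filter_min, col_min);
  return overlap / col_range >= threshold;
}
With a threshold near 1.0 only filters covering essentially the whole column
range are dropped; lower thresholds discard filters more aggressively.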
The computation and display of the new column min/max stats can be
controlled by two new Boolean query options (both defaulting to false):
1. compute_column_minmax_stats
2. show_column_minmax_stats
Usage examples:
set compute_column_minmax_stats=true;
compute stats tpcds_parquet.store_sales;
set show_column_minmax_stats=true;
show column stats tpcds_parquet.store_sales;
+-----------------------+--------------+-...-------+---------+---------+
| Column | Type | #Falses | Min | Max |
+-----------------------+--------------+-...-------+---------+---------+
| ss_sold_time_sk | INT | -1 | 28800 | 75599 |
| ss_item_sk | BIGINT | -1 | 1 | 18000 |
| ss_customer_sk | INT | -1 | 1 | 100000 |
| ss_cdemo_sk | INT | -1 | 15 | 1920797 |
| ss_hdemo_sk | INT | -1 | 1 | 7200 |
| ss_addr_sk | INT | -1 | 1 | 50000 |
| ss_store_sk | INT | -1 | 1 | 10 |
| ss_promo_sk | INT | -1 | 1 | 300 |
| ss_ticket_number | BIGINT | -1 | 1 | 240000 |
| ss_quantity | INT | -1 | 1 | 100 |
| ss_wholesale_cost | DECIMAL(7,2) | -1 | -1 | -1 |
| ss_list_price | DECIMAL(7,2) | -1 | -1 | -1 |
| ss_sales_price | DECIMAL(7,2) | -1 | -1 | -1 |
| ss_ext_discount_amt | DECIMAL(7,2) | -1 | -1 | -1 |
| ss_ext_sales_price | DECIMAL(7,2) | -1 | -1 | -1 |
| ss_ext_wholesale_cost | DECIMAL(7,2) | -1 | -1 | -1 |
| ss_ext_list_price | DECIMAL(7,2) | -1 | -1 | -1 |
| ss_ext_tax | DECIMAL(7,2) | -1 | -1 | -1 |
| ss_coupon_amt | DECIMAL(7,2) | -1 | -1 | -1 |
| ss_net_paid | DECIMAL(7,2) | -1 | -1 | -1 |
| ss_net_paid_inc_tax | DECIMAL(7,2) | -1 | -1 | -1 |
| ss_net_profit | DECIMAL(7,2) | -1 | -1 | -1 |
| ss_sold_date_sk | INT | -1 | 2450816 | 2452642 |
+-----------------------+--------------+-...-------+---------+---------+
Only the min/max values for non-partition columns are stored in HMS.
The min/max values for partition columns are computed in the coordinator.
The min-max filters, in both their C++ class and protobuf forms, are
augmented to handle the always-true state better. Once always true is
set, the actual min and max values in the filter are no longer populated.
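A minimal sketch of that behavior on the build side (field and method names
are illustrative, not the actual class or protobuf definitions):
#include <algorithm>
#include <cstdint>
#include <limits>

struct MinMaxFilterState {
  bool always_true = false;
  int64_t min = std::numeric_limits<int64_t>::max();
  int64_t max = std::numeric_limits<int64_t>::min();

  // Once the filter becomes always true, min/max stop being maintained
  // (and would not be serialized), matching the description above.
  void Insert(int64_t v) {
    if (always_true) return;
    min = std::min(min, v);
    max = std::max(max, v);
  }
};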
Testing:
- Added new compute/show stats tests in
compute-stats-column-minmax.test;
- Added new tests in overlap_min_max_filters.test to demonstrate the
usefulness of column stats to quickly disable useless filters in
both hash join builder and Parquet scanner;
- Added tests in min-max-filter-test.cc to demonstrate that the Or()
  method, ToProtobuf(), and the constructor handle the always-true
  flag correctly;
- Tested with TPCDS 3TB to demonstrate the usefulness of the min
and max column stats in disabling min/max filters that are not
useful.
- core tests.
TODO:
1. IMPALA-10602: Intersection of multiple min/max filters when
applying to common equi-join columns;
2. IMPALA-10601: Creating lineitem_orderkey_only table in
tpch_parquet database;
3. IMPALA-10603: Enable min/max overlap filter feature for Iceberg
tables with Parquet data files;
4. IMPALA-10617: Compute min/max column stats beyond parquet tables.
Change-Id: I08581b44419bb8da5940cbf98502132acd1c86df
Reviewed-on: http://gerrit.cloudera.org:8080/17075
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>