Commit Graph

1457 Commits

Author SHA1 Message Date
Zoltan Borok-Nagy
dbc2fc14d8 IMPALA-10597: Enable setting 'iceberg.file_format'
Currently we prohibit setting the following properties:

* iceberg.catalog
* iceberg.catalog_location
* iceberg.file_format
* iceberg.table_identifier

This patch enables setting 'iceberg.file_format'. Therefore, if
a table was created by another engine but uses HiveCatalog, we can
set the data file format to the proper value and make the table
readable by Impala. Setting the other properties is not needed for
HiveCatalog tables.

If the table wasn't created by HiveCatalog, then we cannot load the
table, and therefore cannot invoke any ALTER TABLE statement at all.
In that case we need to create an external table.

If the table already contains data files, then Impala checks if
all of them have the proper file format. If not, the ALTER TABLE
statement fails.

Before this patch a CREATE TABLE statement accepted any string
for 'iceberg.file_format', and for invalid file formats the
frontend silently used Parquet. This patch also adds a check that
only allows valid file formats.
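
For example (a sketch with a hypothetical table name; assumes a
HiveCatalog table whose data files are ORC):

  ALTER TABLE ice_tbl SET TBLPROPERTIES('iceberg.file_format'='orc');
  -- fails if any existing data file has a different format, or if the
  -- value is not a valid file format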

Testing:
 * added e2e test

Change-Id: I4b3506be4562a1ace3e6435867aadb3bdde7a8e2
Reviewed-on: http://gerrit.cloudera.org:8080/17207
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-29 18:32:31 +00:00
Fucun Chu
77d6acd032 IMPALA-10581: Implement ds_theta_intersect_f() function
This function receives two strings that are serialized Apache
DataSketches Theta sketches. It computes the intersection of the
two sketches, which may come from the same or different columns,
and returns the resulting sketch.

Example:
select ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2))
from sketch_tbl;
+-----------------------------------------------------------+
| ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2)) |
+-----------------------------------------------------------+
| 5                                                         |
+-----------------------------------------------------------+

Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Reviewed-on: http://gerrit.cloudera.org:8080/17186
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-29 15:59:49 +00:00
Fucun Chu
622e3c95ad IMPALA-10580: Implement ds_theta_union_f() function
This function receives two strings that are serialized Apache
DataSketches Theta sketches. It unions the two sketches and
returns the resulting sketch.

Example:
select ds_theta_estimate(ds_theta_union_f(sketch1, sketch2))
from sketch_tbl;
+-------------------------------------------------------+
| ds_theta_estimate(ds_theta_union_f(sketch1, sketch2)) |
+-------------------------------------------------------+
| 15                                                    |
+-------------------------------------------------------+

Change-Id: I8329979b81ceeaad739a43fab79768ca9c2916fa
Reviewed-on: http://gerrit.cloudera.org:8080/17179
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-24 15:16:07 +00:00
wzhou-code
410c3e79e4 IMPALA-10564: Return error when inserting an invalid decimal value
When using CTAS statements or INSERT-SELECT statements to insert rows
into a table with decimal columns, Impala inserts NULL for overflowed
decimal values instead of returning an error. This issue happens when
the data expression for the decimal column in the SELECT sub-query
contains at least one alias.
This issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the
cases where the data expressions for the decimal columns are
constants.

This patch fixes the issue by calling RuntimeState::CheckQueryState()
at the end of HdfsTableWriter::AppendRows() and KuduTableSink::Send().
If there is an invalid decimal error, the query fails instead of
inserting NULL for the decimal column.
We did not change the behaviour for decimal_v1: NULL is still inserted
into the table for invalid decimal values, with a warning message.
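
A minimal sketch of the now-failing pattern (hypothetical table and
values; assumes DECIMAL_V2=true):

  create table dst (d decimal(4,2));
  insert into dst
  select a * 100 from (select cast(9.99 as decimal(4,2)) a) v;
  -- a * 100 overflows decimal(4,2); the query now fails instead of
  -- inserting NULL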

Tests:
 - Added unit-tests for INSERT-SELECT and CTAS statements with
   overflowed decimal values to be inserted into tables. The
   overflowed decimal values are expressed as a constant expression,
   or as an expression with aliases.
   Also added cases to verify behaviour of decimal_v1 is unchanged.
 - Passed exhaustive tests.

Change-Id: I64ce4ed194af81ef06401ffc1124e12f05b8da98
Reviewed-on: http://gerrit.cloudera.org:8080/17168
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-23 22:52:38 +00:00
stiga-huang
a0f77680c5 IMPALA-10483: Support subqueries in Ranger masking policies
This patch adds support for using subqueries in Ranger masking policies,
i.e. column-masking/row-filtering policies. The subquery can reference
either the current table or other tables. However, masking policies on
these tables won't be applied recursively. This is consistent with Hive.
One motivation is to avoid infinite masking when a policy references the
same table. Another motivation, I think, is to simplify the masking
behavior: when the admin sets a masking expression, it can be considered
as running from the admin's perspective (i.e. no masking).

Implementation
Before analyzing the query, the coordinator loads the metadata of all
possibly used tables into the query's StmtTableCache. Table masking
takes place after the analysis phase. If the subquery filter introduces
any new tables, the analyzer will fail to resolve them since their
metadata is not loaded in the StmtTableCache. This patch modifies the
StmtMetadataLoader to also load the tables introduced by masking
policies, so they can be resolved correctly.
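
For example, a row-filter expression set in Ranger can now contain a
subquery like the following (hypothetical tables):

  id in (select id from db.allowed_ids)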

Tests
 - Add more complex tests in test_row_filtering

Change-Id: I254df9f684c95c660f402abd99ca12dded7e764f
Reviewed-on: http://gerrit.cloudera.org:8080/17185
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-22 15:52:03 +00:00
stiga-huang
c9d7bcb4a1 IMPALA-9661: Avoid introducing unused columns in table masking view
Previously, if a table has column masking policies, we replace its
unanalyzed TableRef with an analyzed InlineViewRef (table masking view)
in FromClause.analyze(). However, we can't detect which columns are
actually used in the original query at this point. In fact, analyze()
for SelectList, WhereClause, GroupByClause and other clauses containing
SlotRefs happens after FromClause.analyze(). After the whole query block
is analyzed, we can get the exact set of required columns.

This patch refactors the code to do table masking after analyze() to
avoid introducing unused columns. Referenced columns of a TableRef are
registered in analyze(), which helps to figure out what columns are
actually needed.

With this, we don't need to revert table masking in FromClause.reset().
The doTableMasking flag in AST is also removed since now the table mask
is resolved once after analyze().

Tests:
 - Add more e2e tests in test_ranger.py
 - Run CORE tests

Change-Id: Ib015a8ab528065907b27fbdceb8e2818deb814e1
Reviewed-on: http://gerrit.cloudera.org:8080/17199
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-22 08:41:00 +00:00
stiga-huang
98de1c5436 IMPALA-9234: Support Ranger row filtering policies
Ranger row filtering policies provide customized expressions to filter
out rows for specific users when reading from a table. This patch adds
support for this feature. A new feature flag, enable_row_filtering, is
added to disable this experimental feature. It defaults to true, so
the feature is enabled by default. Enabling row filtering requires
--enable_column_masking=true since it depends on the column masking
implementation.

Note that row filtering policies take effect prior to any column
masking policies, because column masking policies apply to result data.

Implementation:
The existing table masking view infrastructure can be extended to
support row filtering. Currently when analyzing a table with column
masking policies, we replace the TableRef with an InlineViewRef which
contains a SelectStmt wrapping the columns with masking expressions.
This patch adds the row filtering expressions to the WhereClause of the
SelectStmt.
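
Conceptually, for a table 't' with a row filter 'id > 0' and a masked
column 'c', a query like "select id, c from t" is now rewritten roughly
as (a sketch with hypothetical names):

  select id, c from (
    select id, mask(c) c from t
    where id > 0   -- row filter, applied before column masking
  ) t;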

Limitations:
 - Expressions using subqueries are not supported (IMPALA-10483).
 - Row filtering policies on nested tables will not be applied when
   nested collection columns are used directly in the FROM clause. This
   would leak data, so we forbid such queries until IMPALA-10484 is
   resolved.

Tests:
 - Add FE test for error message when disabling row filtering.
 - Add e2e test with row filtering policies.
 - Add e2e test where column masking and row filtering policies both
   take effect.
 - Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc
Reviewed-on: http://gerrit.cloudera.org:8080/16976
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-18 21:08:14 +00:00
Zoltan Borok-Nagy
6162343842 IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables
ALTER TABLE ADD PARTITION should bump the write id for ACID tables.
Both for INSERT-only and full ACID tables.

For transactional tables we add partitions in an ACID transaction
in the following sequence:

1. open transaction
2. allocate write id for table
3. add partitions to HMS table
4. commit transaction

However, please note that table metadata modifications are
independent of ACID transactions. I.e., if adding the partitions
succeeds but we cannot commit the transaction, the newly added
partitions won't get removed.

So why are we opening a txn then? We are doing it in order to bump
the write id in a best-effort way. This aids table metadata caching,
so by looking at the table write id we can determine if the cached
table metadata is up-to-date.
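
For example (hypothetical table), after this patch the following
statement bumps the table write id inside a transaction:

  ALTER TABLE acid_tbl ADD PARTITION (p=1);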

Testing:
 * added e2e test

Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Reviewed-on: http://gerrit.cloudera.org:8080/17081
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-18 19:35:58 +00:00
Fucun Chu
3e82501531 IMPALA-10558: Implement ds_theta_exclude() function
This function receives two strings that are serialized Apache
DataSketches Theta sketches. It computes the a-not-b set operation on
the two sketches, which may come from the same or different columns.

Example:
select ds_theta_estimate(ds_theta_exclude(sketch1, sketch2))
from sketch_tbl;
+-------------------------------------------------------+
| ds_theta_estimate(ds_theta_exclude(sketch1, sketch2)) |
+-------------------------------------------------------+
| 5                                                     |
+-------------------------------------------------------+

Change-Id: I05119fd8c652c07ff248a99e44b0da3541e46ca3
Reviewed-on: http://gerrit.cloudera.org:8080/17153
Reviewed-by: Gabor Kaszab <gaborkaszab@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-17 22:14:44 +00:00
Riza Suminto
47219ec366 IMPALA-10565: Adjust result spooling memory based on scratch_limit
IMPALA-9856 enables result spooling by default. Result spooling depends
on the ability to spill its entire BufferedTupleStream to disk once it
hits its maximum memory reservation. However, if the query option
scratch_limit is set lower than max_spilled_result_spooling_mem, the
query might fail in the middle of execution due to insufficient scratch
space. This patch adds a planner change to consider the scratch_limit
and scratch_dirs query options when computing the resources used by
result spooling. The algorithm is as follows:

* If scratch_dirs is empty or scratch_limit < minMemReservationBytes
  required to use BufferedPlanRootSink, we set spool_query_results to
  false and fall back to BlockingPlanRootSink.

* If scratch_limit > minMemReservationBytes but still fairly low, we
  lower the max_result_spooling_mem (default is 100MB) and
  max_spilled_result_spooling_mem (default is 1GB) to fit scratch_limit.

* If scratch_limit > max_spilled_result_spooling_mem, do nothing.
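
For example (a sketch; the actual thresholds depend on the plan's
minimum memory reservation):

  set scratch_limit=64m;
  -- the planner lowers max_result_spooling_mem and
  -- max_spilled_result_spooling_mem to fit in 64MB, or disables
  -- spool_query_results if 64MB is below the minimum reservation of
  -- BufferedPlanRootSink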

Testing:
- Add TestScratchLimit::test_result_spooling_and_varying_scratch_limit
- Verify that spool_query_results query option is disabled in
  TestScratchDir::test_no_dirs
- Pass exhaustive tests.

Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9
Reviewed-on: http://gerrit.cloudera.org:8080/17166
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Tested-by: Aman Sinha <amsinha@cloudera.com>
2021-03-14 03:35:40 +00:00
Zoltan Borok-Nagy
6c6b0ee869 IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
This patch adds support for CREATE TABLE AS SELECT statements
for Iceberg tables.

CTAS statements work like the following in Impala:

1. Analysis of the whole CTAS statement
2. Divide CTAS to CREATE stmt and INSERT stmt
3. Create temporary in-memory target table from the CREATE stmt
4. Analyse the INSERT statement by using the temporary target table
5. If everything is OK so far, create the target table
6. Execute the INSERT query

For Iceberg tables the non-trivial part was to create the temporary
target table without actually creating it via the Iceberg API. I've
created a new class 'IcebergCtasTarget' that mimics an FeIceberg table.
It can be used with both catalog V1 and V2.
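
For example, the following CTAS now works (hypothetical names):

  CREATE TABLE ice_ctas STORED AS ICEBERG
  AS SELECT id, name FROM src;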

Testing
 * e2e CTAS tests in iceberg-ctas.test
 * SHOW CREATE TABLE stmts in show-create-table.test

Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67
Reviewed-on: http://gerrit.cloudera.org:8080/17130
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-12 19:28:19 +00:00
Fucun Chu
0d22e89df4 IMPALA-10520: Implement ds_theta_intersect() function
This function receives a set of serialized Apache DataSketches Theta
sketches produced by ds_theta_sketch() and intersects them into a
single sketch.

An example usage is to create a sketch for each partition of a table,
write these sketches to a separate table, and intersect the sketches
for the partitions the user is interested in to get estimates based
on those partitions. E.g.:
  SELECT
      ds_theta_estimate(ds_theta_intersect(sketch_col))
  FROM sketch_tbl
  WHERE partition_col=1 OR partition_col=5;

Testing:
  - Apart from the automated tests I added to this patch I also
    tested ds_theta_intersect() on a bigger dataset to check that
    serialization, deserialization and merging steps work well. I
    took TPCH25.lineitem, created a number of sketches with grouping
    by l_shipdate and called ds_theta_intersect() on those sketches.

Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97
Reviewed-on: http://gerrit.cloudera.org:8080/17088
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-12 16:13:48 +00:00
liuyao
1a01bfe831 IMPALA-10377: Improve the accuracy of resource estimation
PlanNode does not consider some factors when estimating memory,
which causes a large error rate.

AggregationNode
1. MemoryEstimate = Ndv * (AvgRowSize + SizeOfBucket)
2. When estimating the Ndv of a merge aggregation, Ndv should be
   divided only once.
3. If there are no grouping exprs, MemoryEstimate =
   MIN_PLAIN_AGG_MEM

SortNode
1. MemoryEstimate = Cardinality * AvgRowSize; this is the memory used
   when there is enough memory to avoid spilling.

HashJoinNode
1. MemoryEstimate = DataRows + Buckets + DuplicateNodes,
   DataRows = RightTableCardinality * AvgRowSize,
   Buckets = roundUpToPowerOf2(RightTableCardinality) *
             SizeOfBucket,
   DuplicateNodes = (RightTableCardinality - RightNdv) *
                    SizeOfDuplicateNode

KuduScanNode
1. MemoryEstimate = Columns * BytesPerColumn * MaxScannerThreads,
   where Columns counts only the columns scanned by the query, not
   all the columns of the table
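
A worked HashJoinNode example with illustrative numbers (the 16-byte
sizes below are assumptions for the arithmetic, not the actual
constants):

  RightTableCardinality = 1,000,000; RightNdv = 500,000;
  AvgRowSize = SizeOfBucket = SizeOfDuplicateNode = 16 bytes
  DataRows       = 1,000,000 * 16                      ~= 16.0 MB
  Buckets        = roundUpToPowerOf2(1,000,000) * 16
                 = 1,048,576 * 16                      ~= 16.8 MB
  DuplicateNodes = (1,000,000 - 500,000) * 16          ~=  8.0 MB
  MemoryEstimate = DataRows + Buckets + DuplicateNodes ~= 40.8 MB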

UnitTest
1. CardinalityTest adds test cases to test memory estimation and
   modifies existing test cases related to memory estimation

Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1
Reviewed-on: http://gerrit.cloudera.org:8080/16842
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-12 14:23:04 +00:00
Fang-Yu Rao
2039746ebe IMPALA-10576: Add refresh authorization to make a test case less flaky
We found that a test case in test_grant_revoke_with_role(), which
verifies that a requesting user without the necessary privilege cannot
perform the GRANT operation, could fail since the expected
AuthorizationException is not returned after the query. Since the GRANT
privilege was revoked immediately before this test case, we suspect the
authorization-related metadata had not been updated yet. To make this
test case less flaky, this patch adds a REFRESH AUTHORIZATION after the
query that revokes the GRANT privilege from the requesting user.

Testing:
 - Verified that this patch passes the core tests in an ASAN build.

Change-Id: I7407bac0407e162ab5ba623505bd7ee49bdf3abf
Reviewed-on: http://gerrit.cloudera.org:8080/17165
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-12 00:00:03 +00:00
Riza Suminto
49ac55fb69 IMPALA-9856: Enable result spooling by default.
Result spooling has been relatively stable since it was introduced, and
it has several benefits, described in IMPALA-8656. This patch enables
the result spooling (SPOOL_QUERY_RESULTS) query option by default.
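
The previous behavior is still available per session (a sketch):

  set spool_query_results=false;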

Furthermore, some tests need to be adjusted to account for result
spooling being enabled by default. The following are the adjustment
categories and the tests that fall under each category.

Change in assertions:
PlannerTest#testAcidTableScans
PlannerTest#testBloomFilterAssignment
PlannerTest#testConstantFolding
PlannerTest#testFkPkJoinDetection
PlannerTest#testFkPkJoinDetectionWithHDFSNumRowsEstDisabled
PlannerTest#testKuduSelectivity
PlannerTest#testMaxRowSize
PlannerTest#testMinMaxRuntimeFilters
PlannerTest#testMinMaxRuntimeFiltersWithHDFSNumRowsEstDisabled
PlannerTest#testMtDopValidation
PlannerTest#testParquetFiltering
PlannerTest#testParquetFilteringDisabled
PlannerTest#testPartitionPruning
PlannerTest#testPreaggBytesLimit
PlannerTest#testResourceRequirements
PlannerTest#testRuntimeFilterQueryOptions
PlannerTest#testSortExprMaterialization
PlannerTest#testSpillableBufferSizing
PlannerTest#testTableSample
PlannerTest#testTpch
PlannerTest#testKuduTpch
PlannerTest#testTpchNested
PlannerTest#testUnion
TpcdsPlannerTest
custom_cluster/test_admission_controller.py::TestAdmissionController::test_dedicated_coordinator_planner_estimates
custom_cluster/test_admission_controller.py::TestAdmissionController::test_memory_rejection
custom_cluster/test_admission_controller.py::TestAdmissionController::test_pool_mem_limit_configs
metadata/test_explain.py::TestExplain::test_explain_level2
metadata/test_explain.py::TestExplain::test_explain_level3
metadata/test_stats_extrapolation.py::TestStatsExtrapolation::test_stats_extrapolation

Increase BUFFER_POOL_LIMIT:
query_test/test_queries.py::TestQueries::test_analytic_fns
query_test/test_runtime_filters.py::TestRuntimeRowFilters::test_row_filter_reservation
query_test/test_sort.py::TestQueryFullSort::test_multiple_mem_limits_full_output
query_test/test_spilling.py::TestSpillingBroadcastJoins::test_spilling_broadcast_joins
query_test/test_spilling.py::TestSpillingDebugActionDimensions::test_spilling_aggs
query_test/test_spilling.py::TestSpillingDebugActionDimensions::test_spilling_regression_exhaustive
query_test/test_udfs.py::TestUdfExecution::test_mem_limits

Increase MEM_LIMIT:
query_test/test_mem_usage_scaling.py::TestExchangeMemUsage::test_exchange_mem_usage_scaling
query_test/test_mem_usage_scaling.py::TestScanMemLimit::test_hdfs_scanner_thread_mem_scaling

Increase MAX_ROW_SIZE:
custom_cluster/test_parquet_max_page_header.py::TestParquetMaxPageHeader::test_large_page_header_config
query_test/test_insert.py::TestInsertQueries::test_insert_large_string
query_test/test_query_mem_limit.py::TestQueryMemLimit::test_mem_limit
query_test/test_scanners.py::TestTextSplitDelimiters::test_text_split_across_buffers_delimiter
query_test/test_scanners.py::TestWideRow::test_wide_row

Disable result spooling to maintain assertion:
custom_cluster/test_admission_controller.py::TestAdmissionController::test_set_request_pool
custom_cluster/test_admission_controller.py::TestAdmissionController::test_timeout_reason_host_memory
custom_cluster/test_admission_controller.py::TestAdmissionController::test_timeout_reason_pool_memory
custom_cluster/test_admission_controller.py::TestAdmissionController::test_queue_reasons_memory
custom_cluster/test_admission_controller.py::TestAdmissionController::test_pool_config_change_while_queued
custom_cluster/test_query_retries.py::TestQueryRetries::test_retry_fetched_rows
custom_cluster/test_query_retries.py::TestQueryRetries::test_retry_finished_query
custom_cluster/test_scratch_disk.py::TestScratchDir::test_no_dirs
custom_cluster/test_scratch_disk.py::TestScratchDir::test_non_existing_dirs
custom_cluster/test_scratch_disk.py::TestScratchDir::test_non_writable_dirs
query_test/test_insert.py::TestInsertQueries::test_insert_large_string (the last query only)
query_test/test_kudu.py::TestKuduMemLimits::test_low_mem_limit_low_selectivity_scan
query_test/test_mem_usage_scaling.py::TestScanMemLimit::test_kudu_scan_mem_usage
query_test/test_queries.py::TestQueriesParquetTables::test_very_large_strings
query_test/test_query_mem_limit.py::TestCodegenMemLimit::test_codegen_mem_limit
shell/test_shell_client.py::TestShellClient::test_fetch_size

Testing:
- Pass exhaustive tests.

Change-Id: I9e360c1428676d8f3fab5d95efee18aca085eba4
Reviewed-on: http://gerrit.cloudera.org:8080/16755
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-03-02 04:58:51 +00:00
Qifan Chen
16493c0416 IMPALA-10532 TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky
This patch addresses the flakiness seen with a particular test within
overlap_min_max_filters by allowing the sum of NumRuntimeFilteredPages
to be greater than an expected value. Previously, the sum could only
be equal to the expected value, which is not sufficient for various test
conditions in which the scan of the parquet data files can start
before the arrival of a runtime filter.

The extension in test_result_verifier.py allows '>' and '<' conditions
to be expressed for aggregation(SUM, <counter>), such as
aggregation(SUM, NumRuntimeFilteredPages)> 80.
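
For example, a test can now assert a lower bound like this (a sketch in
the testfile format):

  ---- RUNTIME_PROFILE
  aggregation(SUM, NumRuntimeFilteredPages)> 80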

Testing:
 - Ran TestOverlapMinMaxFilters.

Change-Id: I93940a104bfb2d68cb1d41d7e303348190fd5972
Reviewed-on: http://gerrit.cloudera.org:8080/17111
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-02-26 22:52:16 +00:00
Fucun Chu
7f60990028 IMPALA-10467: Implement ds_theta_union() function
This function receives a set of serialized Apache DataSketches Theta
sketches produced by ds_theta_sketch() and merges them into a single
sketch.

An example usage is to create a sketch for each partition of a table,
write these sketches to a separate table, and, depending on which
partitions the user is interested in, union the relevant sketches
together to get an estimate. E.g.:
  SELECT
      ds_theta_estimate(ds_theta_union(sketch_col))
  FROM sketch_tbl
  WHERE partition_col=1 OR partition_col=5;

Testing:
  - Apart from the automated tests I added to this patch I also
    tested ds_theta_union() on a bigger dataset to check that
    serialization, deserialization and merging steps work well. I
    took TPCH25.lineitem, created a number of sketches with grouping
    by l_shipdate and called ds_theta_union() on those sketches.

Change-Id: I91baf58c76eb43748acd5245047edac8c66761b2
Reviewed-on: http://gerrit.cloudera.org:8080/17048
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-02-19 13:32:09 +00:00
Qifan Chen
ebb2e06639 IMPALA-10325: Parquet scan should use min/max statistics to skip pages based on equi-join predicate
This patch adds a new class of predicates called overlap predicates
to aid in the acceptance or rejection of a row group, a page, or a
row in a Parquet table, utilizing the minimum and maximum values
gathered from an equi hash join and the Parquet column index stats.
When a row group or page is rejected, all rows contained within are
rejected altogether.

For example in the following query, the min and max in the overlap
predicate are computed from the join column from table 'b', and
are compared against the min/max of each row group or page at the
scan node for 'a'.

  select straight_join count(*)
  from lineitem a join [SHUFFLE] lineitem b
  where a.l_shipdate = b.l_receiptdate
  and b.l_commitdate = "1992-01-31";

An overlap predicate associating column type B in the hash table
with scan column type A is formed when both A and B are, or can be
converted to, one of the following:
  1. booleans;
  2. integers (tinyint, smallint, int, or bigint);
  3. approximate numeric (float or double);
  4. decimals with the same precision and scale;
  5. strings;
  6. date; or
  7. timestamps.

The overlap predicate is implemented as a min/max filter and can be
observed in the explain output of a query.

A new query option 'minmax_filter_threshold' is provided to control
the new feature. Setting it to 0.0 disables the feature. Setting it
to a value > 0.0 but less than 1.0 provides a threshold. An overlap
predicate will be evaluated against a row group and possibly the
contained pages/rows, as long as its overlap ratio is less than the
threshold. The overlap ratio is the common area of the row group
and the filter, divided by the area of the row group.

A second query option, minmax_filtering_level, is provided to
specify the filtering scope:
  1. ROW_GROUP: the overlap is only tested for row groups;
  2. PAGE: the overlap is tested for both row groups and pages;
  3. ROW: the overlap is for row groups, pages and rows.
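
For example (a sketch combining the two options):

  set minmax_filter_threshold=0.5;
  set minmax_filtering_level=PAGE;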

Two new run-time profile counters are added to report the number of
row groups or pages filtered out via the overlap predicates
respectively:
  1. NumRuntimeFilteredRowGroups
  2. NumRuntimeFilteredPages

Two new columns, "Min value" and "Max value", are added to the
"Filter routing table" and "Final filter table" in the profile to
display the min and the max values for a min/max filter.

Testing:
1. Unit tested on various column types with TPCH and TPCDS tables.
   Benefits were significant when the join column on the outer table
   is sorted and there exist many row groups or pages not overlapping
   with the min/max filters;
2. Added following new tests:
    a) In overlap_min_max_filters.test to demonstrate the number of
       filtered out pages and row groups with the two new profile
       counters;
    b) In runtime-filter-propagation.test to demonstrate that the
       overlap predicates work with different column types;
3. Core testing;
4. Performance measurement: the overall improvement with 3TB scale
   TPCDS is 1.45% with the filter threshold at 0.5 and the filtering
   level at ROW_GROUP. Good improvements (over 10%) are seen with
   queries 16, 25, 62, 83, 94 and 99, due to the join column
   ship_date_sk being strongly correlated with the partition column
   sold_date_sk.

To do in follow-up JIRAs:
1. Improve filtering efficiency;
2. Apply the overlap predicate on partition columns;
3. IR code-gen for various MinMaxFilter::EvalOverlap methods.
4. Address the current limitation that the "Min value" and
   "Max value" columns may be empty for LOCAL filters.

Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691
Reviewed-on: http://gerrit.cloudera.org:8080/16720
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-02-18 14:31:51 +00:00
Fucun Chu
65c6a81ed9 IMPALA-10463: Implement ds_theta_sketch() and ds_theta_estimate() functions
These functions can be used to get cardinality estimates of data
using Theta algorithm from Apache DataSketches. ds_theta_sketch()
receives a dataset, e.g. a column from a table, and returns a
serialized Theta sketch in string format. This can be written to a
table or be fed directly to ds_theta_estimate() that returns the
cardinality estimate for that sketch.

Similar to the HLL sketch, the primary use-case for the Theta sketch
is for counting distinct values as a stream, and then merging
multiple sketches together for a total distinct count.
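
For example (hypothetical table and column names):

  select ds_theta_estimate(ds_theta_sketch(col)) from tbl;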

For more details about Apache DataSketches' Theta see:
https://datasketches.apache.org/docs/Theta/ThetaSketchFramework.html

Testing:
 - Added some tests running estimates for small datasets where the
   amount of data is small enough to get the correct results.
 - Ran manual tests on tpch25_parquet.lineitem to compare performance
   with ds_hll_*. ds_theta_* is faster than ds_hll_* on the original
   data; the difference is around 1%-10%. ds_hll_estimate() is faster
   than ds_theta_estimate() on an existing sketch. HLL and Theta give
   close estimates except for strings; see IMPALA-10464.

Change-Id: I14f24c16b815eec75cf90bb92c8b8b0363dcbfbc
Reviewed-on: http://gerrit.cloudera.org:8080/17008
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-02-17 17:09:48 +00:00
Tamas Mate
dc133d9513 IMPALA-10499: Fix failing test_misc
This change modifies the result type of the misc test which was failing.

Testing:
 - executed the misc tests with exhaustive exploration strategy

Change-Id: Ibe95f4bc3521f49d19e6da53deb904a25ac30982
Reviewed-on: http://gerrit.cloudera.org:8080/17066
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-02-15 22:25:41 +00:00
Aman Sinha
baf81dea6d IMPALA-9745: Propagate source type when doing constant propagation
When doing constant propagation the source type was not being
propagated to the target expression leading to an analysis failure.
The behavior is most easily reproducible with STRING to TIMESTAMP
conversion in the presence of other predicates.

This patch fixes this by adding an implicit cast if needed for such
cases.
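
A sketch of the affected pattern (hypothetical table; the STRING
constant propagated into the TIMESTAMP comparison now gets an implicit
cast instead of failing analysis):

  select count(*) from t
  where t.ts_col = t.str_col and t.str_col = '2020-01-01 00:00:00';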

Testing:
 - Added planner test and ran other planner tests
 - Added end-to-end test

Change-Id: Ic3853478945229440f733c256ea225639f9178ff
Reviewed-on: http://gerrit.cloudera.org:8080/17064
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Aman Sinha <amsinha@cloudera.com>
2021-02-13 23:02:54 +00:00
Tim Armstrong
b42c64993d IMPALA-9979: part 2: partitioned top-n
Planner changes:
---------------
The planner now identifies predicates that can be converted into
limits in a partitioned or unpartitioned top-n with the following
method:
* Push down predicates that reference the analytic tuple into the
  inline view. These will be evaluated after the analytic plan for the
  inline SelectStmt is generated.
* Identify predicates that reference the analytic tuple and could
  be converted to limits.
* If they can be applied to the last sort group of the analytic
  plan, and the windows are all compatible, then the lowest
  limit gets converted into a limit in the top N.
* Otherwise generate a select node with the conjuncts. We add
  logic to merge SELECT nodes to avoid generating duplicates
  from inside and outside the inline view.
* The pushed predicate is still added to the SELECT node
  because it is necessary for correctness for predicates
  like '=' to filter additional rows and also the limit
  pushdown optimization looks for analytic predicates
  there, so retaining all predicates simplifies that.
  The selectivity of the predicate is adjusted so that
  cardinality estimates remain accurate.

The optimization can be disabled by setting
ANALYTIC_RANK_PUSHDOWN_THRESHOLD=0. By default it is
only enabled for limits of 1000 or less, because the
in-memory Top-N may perform significantly worse than
a full sort for large heaps (since updating the heap
for every input row ends up being more expensive than
doing a traditional sort). We could probably optimize
this more with better tuning so that it can gracefully
fall back to doing the full sort at runtime.

rank() and row_number() are handled. rank() needs support in
the TopN node to include ties for the last place, which is
also added in this patch.
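
A typical query shape that now plans as a partitioned top-n
(hypothetical names):

  select * from (
    select c1, c2, rank() over (partition by c1 order by c2) rnk
    from t) v
  where rnk <= 10;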

If predicates are trivially false, we generate empty nodes.

This interacts with the limit pushdown optimization. The limit
pushdown optimization is applied after the partitioned top-n
is generated, and can sometimes result in more optimal plans,
so it is generalized to handle pushing into partitioned top-n
nodes.

Backend changes:
---------------
The top-n node in the backend is augmented to handle
the partitioned case, for which we use a std::map and a
comparator based on the partition exprs. The partitioned
top-n node has a soft limit of 64MB on the size of the
in-memory heaps and can spill with use of an embedded Sorter.
The current implementation tries to evict heaps that are
less effective at filtering rows.

Limitations:
-----------
There are several possible extensions to this that we did not do:
* dense_rank() is not supported because it would require additional
  backend support - IMPALA-10014.
* ntile() is not supported because it would require additional
  backend support - IMPALA-10174.
* Only one predicate per analytic is pushed.
* Redundant rank()/row_number() predicates are not merged,
  only the lowest is chosen.
* Lower bounds are not converted into OFFSET.
* The analytic operator cannot be eliminated even if the analytic
  expression was only used in the predicate.
* This doesn't push predicates into UNION - IMPALA-10013
* Always false predicates don't result in empty plan - IMPALA-10015

Tests:
-----
* Planner tests - added tests that exercise the interesting code
  paths added in planning.
  - Predicate ordering in SELECT nodes changed in a couple of cases
    because some predicates were pushed into the inline views.
* Modified SORT targeted perf tests to avoid conversion to Top-N
* Added targeted perf test for partitioned top-n.
* End-to-end tests
 - Unpartitioned Top-N end-to-end tests
 - Basic partitioning and duplicate handling tests on functional
 - Similar basic tests on larger inputs from TPC-DS and with
   larger partition counts.
 - I inspected the results and also ran the same tests with
   analytic_rank_pushdown_threshold=0 to confirm that the
   results were the same as with the full sort.
 - Fallback to spilling sort.

Perf:
-----
Added a targeted benchmark that goes from ~2s to ~1s with
mt_dop=8 on TPC-H 30 on my desktop.

Change-Id: Ic638af9495981d889a4cb7455a71e8be0eb1a8e5
Reviewed-on: http://gerrit.cloudera.org:8080/16242
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-02-10 23:52:28 +00:00
Tamas Mate
701714b10a IMPALA-10379: Add missing HiveLexer classes to shared-deps
HIVE-19064 introduced additional lexer classes that are required during
runtime. This commit adds the missing HiveLexer lexer classes to the
shared-deps. Without these classes queries such as 'select 1 as "``"'
would fail with 'NoClassDefFoundError'.

Testing:
 - added a misc.test to verify that the classes are available and that
IMPALA-9641 is fixed by HIVE-19064

Change-Id: I6e3a00335983f26498c1130ab9f109f6e67256f5
Reviewed-on: http://gerrit.cloudera.org:8080/17019
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-02-07 05:20:48 +00:00
Zoltan Borok-Nagy
a3f441193d IMPALA-10223: Implement INSERT OVERWRITE for Iceberg tables
This patch adds support for INSERT OVERWRITE statements for
Iceberg tables. We use Iceberg's ReplacePartitions interface
for this. This interface provides consistent behavior with
INSERT OVERWRITEs against regular tables. It's also consistent
with other engines' dynamic overwrites, e.g. Spark.

INSERT OVERWRITE for partitioned tables replaces the partitions
affected by the INSERT, while keeping the other partitions
untouched.
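
For example (hypothetical names), only the partitions produced by the
SELECT are replaced:

  INSERT OVERWRITE ice_part_tbl SELECT * FROM src WHERE p = 1;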

INSERT OVERWRITE is prohibited for tables that use the BUCKET
partition transform because it would randomly overwrite table
data.

Testing
 * added e2e test

Change-Id: Idf4acfb54cf62a3f3b2e8db9d04044580151299c
Reviewed-on: http://gerrit.cloudera.org:8080/17012
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-02-05 14:46:08 +00:00
stiga-huang
0473e1b973 IMPALA-10473: Fix wrong analytic results on constant partition/order by exprs
When the Partition-by and Order-by expressions of an analytic are all
constants, it should be evaluated in a single unpartitioned fragment
(same as analytics that have no Partition-by/Order-by exprs). Currently,
it's placed within the same fragment as the child node, which causes
it to be computed locally and get incorrect results when the fragment is
partitioned.
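
A sketch of an affected query shape (hypothetical table; the analytic's
partition/order exprs are all constants):

  select row_number() over (partition by 1 order by 1) from t;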

Tests:
 - Added planner tests
 - Added e2e tests

Change-Id: Ibc88a410dab984ff37e27dc635bee5f289003a2a
Reviewed-on: http://gerrit.cloudera.org:8080/17023
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-02-05 11:54:00 +00:00
Zoltan Borok-Nagy
646b0e011c IMPALA-10456: Implement TRUNCATE for Iceberg tables
This patch adds support for the TRUNCATE statement for
Iceberg tables.

The TRUNCATE operation creates a new snapshot for the target
table that doesn't have any data files. Table and column stats
are also cleared. This patch also fixes a bug that caused
table/column stats not to be propagated.
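
For example (hypothetical name):

  TRUNCATE TABLE ice_tbl;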

Testing
 * added e2e tests for both partitioned and unpartitioned tables

Change-Id: I6116c7c36aba871c0be79f499e0ac618072ca7b8
Reviewed-on: http://gerrit.cloudera.org:8080/16987
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: wangsheng <skyyws@163.com>
2021-02-01 11:14:01 +00:00
xqhe
4ae847bf94 IMPALA-10382: fix invalid outer join simplification
When ENABLE_OUTER_JOIN_TO_INNER_TRANSFORMATION is set to true, the
planner will simplify outer joins if a predicate containing a case expr
or a conditional function references both sides of the outer join.
However, such a predicate may not be null-rejecting, and if we simplify
the outer join anyway, the result is incorrect. E.g.
t1.b > coalesce(t1.c, t2.c) can return true if t2.c is null, so it is
not a null-rejecting predicate for t2.

The fix only supports predicates with two operands where the operator
is one of (=, !=, >, <, >=, <=), and where
1. one of the operands, or
2. if an operand is an arithmetic expression, one of its children,
does not contain a conditional builtin function or case expr and has
a tuple id in the outer-joined tuples.
E.g. t1.b > coalesce(t2.c, t1.c) or t1.b + coalesce(t2.c, t1.c) >
coalesce(t2.c, t1.c) is a null-rejecting predicate for t1.
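
A sketch of a query that must not be simplified (hypothetical tables):

  select * from t1 left outer join t2 on t1.a = t2.a
  where t1.b > coalesce(t1.c, t2.c);
  -- can be true when t2's columns are NULL, so the outer join must be
  -- preserved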

Testing:
* Add new plan tests in outer-to-inner-joins.test
* Add new query tests to verify the correctness on transformation

Change-Id: I84a3812f4212fa823f3d1ced6e12f2df05aedb2b
Reviewed-on: http://gerrit.cloudera.org:8080/16845
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
2021-01-27 17:30:37 +00:00
Zoltan Borok-Nagy
08367e91f0 IMPALA-10452: CREATE Iceberg tables with old PARTITIONED BY syntax
For convenience this patch adds support for the old-style
CREATE TABLE ... PARTITIONED BY ...; syntax for Iceberg tables.

So users should be able to write the following:

CREATE TABLE ice_t (i int)
PARTITIONED BY (p int)
STORED AS ICEBERG;

Which should be equivalent to this:

CREATE TABLE ice_t (i int, p int)
PARTITION BY SPEC (p IDENTITY)
STORED AS ICEBERG;

Please note that the old-style CREATE TABLE statement creates
IDENTITY-partitioned tables. For other partition transforms the
users must use the new, more generic syntax.

Hive also supports the old PARTITIONED BY syntax with the same
behavior.

Testing:
 * added e2e tests

Change-Id: I789876c161bc0987820955aa9ae01414e0dcb45d
Reviewed-on: http://gerrit.cloudera.org:8080/16979
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-26 22:12:25 +00:00
stiga-huang
e8720b40f1 IMPALA-2019(Part-1): Provide UTF-8 support in length, substring and reverse functions
A unicode character can be encoded into 1-4 bytes in UTF-8. String
functions will return undesired results when the input contains unicode
characters, because we deal with a string as a byte array. For instance,
length() returns the length in bytes, not in unicode characters.

UTF-8 is the dominant unicode encoding used in the Hadoop ecosystem.
This patch adds UTF-8 support in some string functions so they can have
UTF-8 aware behavior. For compatibility with the old versions, a new
query option, UTF8_MODE, is added for turning on/off the UTF-8 aware
behavior. Currently, only length(), substring() and reverse() support
it. Other function supports will be added in later patches.

String functions will check the query option and switch to use the
desired implementation. It's similar to how we use the decimal_v2 query
option in builtin functions.

For easy testing, the UTF-8 aware version of string functions are
also exposed as builtin functions (named by utf8_*, e.g. utf8_length).
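
For example ('你好' is two unicode characters encoded in six bytes):

  select length('你好');       -- returns 6, the length in bytes
  select utf8_length('你好');  -- returns 2, the length in characters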

Tests:
 - Add BE tests for utf8 functions.
 - Add e2e tests for the UTF8_MODE query option.

Change-Id: I0aaf3544e89f8a3d531ad6afe056b3658b525b7c
Reviewed-on: http://gerrit.cloudera.org:8080/16908
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-26 00:43:39 +00:00
liuyao
18acca92ee IMPALA-10435: Extend 'compute incremental stats' syntax
to support a list of columns

Modified the parser to support a column list in COMPUTE INCREMENTAL
STATS. No need to modify the code of other modules because they
already support this.
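
A sketch of the extended syntax (hypothetical table and columns):

  COMPUTE INCREMENTAL STATS t (col1, col2);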

Change-Id: I4dcc2d4458679c39581446f6d87bb7903803f09b
Reviewed-on: http://gerrit.cloudera.org:8080/16947
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
2021-01-21 19:35:26 +00:00
Zoltan Borok-Nagy
90f3b2f491 IMPALA-10432: INSERT INTO Iceberg tables with partition transforms
This patch implements INSERT INTO for Iceberg tables that use
partition transforms. Partition transforms are functions that
calculate partition data from row data.

There are the following partition transforms in Iceberg:
https://iceberg.apache.org/spec/#partition-transforms

 * IDENTITY
 * BUCKET
 * TRUNCATE
 * YEAR
 * MONTH
 * DAY
 * HOUR

INSERT INTO identity-partitioned Iceberg tables are already supported.
This patch adds support for the rest of the transforms.

We create the partitioning expressions in InsertStmt. Based on these
expressions, data is automatically shuffled and sorted by the backend
executors before rows are given to the table sink operators. The table
sink operator writes the partitions one-by-one and creates a
human-readable partition path for them.

In the end, we will convert the partition path to partition data and
create Iceberg DataFiles with information about the files written.
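
For example, with a DAY transform (a sketch using the column-then-
transform spec style shown in IMPALA-10452; hypothetical names):

  CREATE TABLE ice_t (i INT, ts TIMESTAMP)
  PARTITION BY SPEC (ts DAY)
  STORED AS ICEBERG;
  INSERT INTO ice_t SELECT id, event_ts FROM src;
  -- rows are shuffled and sorted on day(ts) before the table sink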

Testing:
 * added planner test
 * added e2e tests

Change-Id: I3edf02048cea78703837b248c55219c22d512b78
Reviewed-on: http://gerrit.cloudera.org:8080/16939
Reviewed-by: wangsheng <skyyws@163.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-18 18:46:42 +00:00
stiga-huang
9bb7157bf0 IMPALA-10387: Add missing overloads of mask functions used in Ranger default masking policies
The mask functions in Hive are implemented through GenericUDFs which
can accept an infinite number of function signatures. Impala currently
doesn't support GenericUDFs, so we provide builtin mask functions with
limited overloads.

This patch adds some missing overloads that could be used by Ranger
default masking policies, e.g. MASK_HASH, MASK_SHOW_LAST_4,
MASK_DATE_SHOW_YEAR, etc.
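
For example (a sketch, assuming the mask_hash() builtin that backs the
MASK_HASH policy):

  select mask_hash('TestString');
  -- returns a hash of the input, so equal values mask to the same
  -- result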

Tests:
 - Add test coverage on all default masking policies applied on all
   supported types.

Change-Id: Icf3e70fd7aa9f3b6d6b508b776696e61ec1fcc2e
Reviewed-on: http://gerrit.cloudera.org:8080/16930
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-15 13:01:53 +00:00
Zoltan Borok-Nagy
696dafed66 IMPALA-10426: Fix crash when inserting invalid timestamps
Insertion of invalid timestamps causes Impala to crash when it uses
the INT64 Parquet timestamp types.

This patch fixes the error by checking for null values in
Int64TimestampColumnWriterBase::ConvertValue().

Testing:
 * added e2e tests

Change-Id: I74fb754580663c99e1d8c3b73f8d62ea3305ac93
Reviewed-on: http://gerrit.cloudera.org:8080/16951
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-14 19:34:38 +00:00
skyyws
1093a563e6 IMPALA-10368: Support required/optional property when creating Iceberg table
This patch adds support for creating required/optional fields for
Iceberg tables. If the 'NOT NULL' property is set for an Iceberg table
column in SQL, Impala creates a required field via the Iceberg API;
'NULL' or the default creates an optional field.
Besides, 'DESCRIBE XXX' for an Iceberg table will display the
nullability like this:
+------+--------+---------+----------+
| name | type   | comment | nullable |
+------+--------+---------+----------+
| id   | int    |         | false    |
| name | string |         | true     |
| age  | int    |         | true     |
+------+--------+---------+----------+
And 'SHOW CREATE TABLE XXX' will also display the 'NULL'/'NOT NULL'
property for Iceberg tables.
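
For example (hypothetical name), the DESCRIBE output above corresponds
to a table created like this:

  CREATE TABLE ice_t (
    id INT NOT NULL,
    name STRING,
    age INT
  ) STORED AS ICEBERG;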

Tests:
 * added new test in iceberg-create.test
 * added new test in iceberg-negative.test
 * added new test in show-create-table.test
 * modify 'DESCRIBE XXX' result in iceberg-create.test
 * modify 'DESCRIBE XXX' result in iceberg-alter.test
 * modify create table result in show-create-table.test

Change-Id: I70b8014ba99f43df1b05149ff7a15cf06b6cd8d3
Reviewed-on: http://gerrit.cloudera.org:8080/16904
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-11 17:08:21 +00:00
stiga-huang
e7839c4530 IMPALA-10416: Add raw string mode for testfiles to verify non-ascii results
Currently, the result section of the testfile is required to use
escaped strings. Take the following result section as an example:
  --- RESULTS
  'Alice\nBob'
  'Alice\\nBob'
The first line is a string with a newline character. The second line is
a string with a '\' and an 'n' character. When comparing with the actual
query results, we need to escape the special characters in the actual
results, e.g. replace newline characters with '\n'. This is done by
invoking encode('unicode_escape') on the actual result strings. However,
the input type of this method is unicode instead of str. When calling it
on str vars, Python will implicitly convert the input vars to unicode
type. The default encoding, ascii, is used. This causes
UnicodeDecodeError when the str contains non-ascii bytes. To fix this,
this patch explicitly decodes the input str using 'utf-8' encoding.

After fixing the logic of escaping the actual result strings, the next
problem is that it's painful to write unicode-escaped expected results.
Here is an example:
  ---- QUERY
  select "你好\n你好"
  ---- RESULTS
  '\u4f60\u597d\n\u4f60\u597d'
  ---- TYPES
  STRING
It's painful to manually translate the unicode characters.

This patch adds a new comment, RAW_STRING, for the result section to use
raw strings instead of unicode-escaped strings. Here is an example:
  ---- QUERY
  select "你好"
  ---- RESULTS: RAW_STRING
  '你好'
  ---- TYPES
  STRING
If the result contains special characters, it's recommended to use the
default string mode. If the special characters only contain newline
characters, we can use RAW_STRING and the existing MULTI_LINE comment
together.

This patch also fixes the issue that pytest fails to report assertion
failures if any of the compared str values contain non-ascii bytes
(IMPALA-10419). However, pytest works if the compared values are both
in unicode type. So we explicitly converting the actual and expected str
values to unicode type.

Test:
 - Add tests in special-strings.test for raw string mode and the escaped
   string mode (default).
 - Run test_exprs.py::TestExprs::test_special_strings locally.

Change-Id: I7cc2ea3e5849bd3d973f0cb91322633bcc0ffa4b
Reviewed-on: http://gerrit.cloudera.org:8080/16919
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-06 04:39:56 +00:00
Tim Armstrong
1d5fe2771f IMPALA-6434: Add support to decode RLE_DICTIONARY encoded pages
The encoding is identical to the already-supported PLAIN_DICTIONARY
encoding but the PLAIN enum value is used for the dictionary pages
and the RLE_DICTIONARY enum value is used for the data pages.

A hidden option -write_new_parquet_dictionary_encodings is
added to turn on writing too, for test purposes only.

Testing:
* Added an automated test using a pregenerated test file.
* Ran core tests.
* Manually tested by writing out TPC-H lineitem with the new encoding
  and reading back in Impala and Hive.

Parquet-tools output for the generated test file:
$ hadoop jar ~/repos/parquet-mr/parquet-tools/target/parquet-tools-1.12.0-SNAPSHOT.jar meta /test-warehouse/att/824de2afebad009f-6f460ade00000003_643159826_data.0.parq
20/12/21 20:28:36 INFO hadoop.ParquetFileReader: Initiating action with parallelism: 5
20/12/21 20:28:36 INFO hadoop.ParquetFileReader: reading another 1 footers
20/12/21 20:28:36 INFO hadoop.ParquetFileReader: Initiating action with parallelism: 5
file:            hdfs://localhost:20500/test-warehouse/att/824de2afebad009f-6f460ade00000003_643159826_data.0.parq
creator:         impala version 4.0.0-SNAPSHOT (build 7b691c5d4249f0cb1ced8ddf01033fbbe10511d9)

file schema:     schema
--------------------------------------------------------------------------------
id:              OPTIONAL INT32 L:INTEGER(32,true) R:0 D:1
bool_col:        OPTIONAL BOOLEAN R:0 D:1
tinyint_col:     OPTIONAL INT32 L:INTEGER(8,true) R:0 D:1
smallint_col:    OPTIONAL INT32 L:INTEGER(16,true) R:0 D:1
int_col:         OPTIONAL INT32 L:INTEGER(32,true) R:0 D:1
bigint_col:      OPTIONAL INT64 L:INTEGER(64,true) R:0 D:1
float_col:       OPTIONAL FLOAT R:0 D:1
double_col:      OPTIONAL DOUBLE R:0 D:1
date_string_col: OPTIONAL BINARY R:0 D:1
string_col:      OPTIONAL BINARY R:0 D:1
timestamp_col:   OPTIONAL INT96 R:0 D:1
year:            OPTIONAL INT32 L:INTEGER(32,true) R:0 D:1
month:           OPTIONAL INT32 L:INTEGER(32,true) R:0 D:1

row group 1:     RC:8 TS:754 OFFSET:4
--------------------------------------------------------------------------------
id:               INT32 SNAPPY DO:4 FPO:48 SZ:74/73/0.99 VC:8 ENC:RLE,RLE_DICTIONARY ST:[min: 0, max: 7, num_nulls: 0]
bool_col:         BOOLEAN SNAPPY DO:0 FPO:141 SZ:26/24/0.92 VC:8 ENC:RLE,PLAIN ST:[min: false, max: true, num_nulls: 0]
tinyint_col:      INT32 SNAPPY DO:220 FPO:243 SZ:51/47/0.92 VC:8 ENC:RLE,RLE_DICTIONARY ST:[min: 0, max: 1, num_nulls: 0]
smallint_col:     INT32 SNAPPY DO:343 FPO:366 SZ:51/47/0.92 VC:8 ENC:RLE,RLE_DICTIONARY ST:[min: 0, max: 1, num_nulls: 0]
int_col:          INT32 SNAPPY DO:467 FPO:490 SZ:51/47/0.92 VC:8 ENC:RLE,RLE_DICTIONARY ST:[min: 0, max: 1, num_nulls: 0]
bigint_col:       INT64 SNAPPY DO:586 FPO:617 SZ:59/55/0.93 VC:8 ENC:RLE,RLE_DICTIONARY ST:[min: 0, max: 10, num_nulls: 0]
float_col:        FLOAT SNAPPY DO:724 FPO:747 SZ:51/47/0.92 VC:8 ENC:RLE,RLE_DICTIONARY ST:[min: -0.0, max: 1.1, num_nulls: 0]
double_col:       DOUBLE SNAPPY DO:845 FPO:876 SZ:59/55/0.93 VC:8 ENC:RLE,RLE_DICTIONARY ST:[min: -0.0, max: 10.1, num_nulls: 0]
date_string_col:  BINARY SNAPPY DO:983 FPO:1028 SZ:74/88/1.19 VC:8 ENC:RLE,RLE_DICTIONARY ST:[min: 0x30312F30312F3039, max: 0x30342F30312F3039, num_nulls: 0]
string_col:       BINARY SNAPPY DO:1143 FPO:1168 SZ:53/49/0.92 VC:8 ENC:RLE,RLE_DICTIONARY ST:[min: 0x30, max: 0x31, num_nulls: 0]
timestamp_col:    INT96 SNAPPY DO:1261 FPO:1329 SZ:98/138/1.41 VC:8 ENC:RLE,RLE_DICTIONARY ST:[num_nulls: 0, min/max not defined]
year:             INT32 SNAPPY DO:1451 FPO:1470 SZ:47/43/0.91 VC:8 ENC:RLE,RLE_DICTIONARY ST:[min: 2009, max: 2009, num_nulls: 0]
month:            INT32 SNAPPY DO:1563 FPO:1594 SZ:60/56/0.93 VC:8 ENC:RLE,RLE_DICTIONARY ST:[min: 1, max: 4, num_nulls: 0]

Parquet-tools output for one of the lineitem files:
$ hadoop jar ~/repos/parquet-mr/parquet-tools/target/parquet-tools-1.12.0-SNAPSHOT.jar meta /test-warehouse/li2/4b4d9143c575dd71-3f69d3cf00000001_1879643220_data.0.parq
20/12/22 09:39:56 INFO hadoop.ParquetFileReader: Initiating action with parallelism: 5
20/12/22 09:39:56 INFO hadoop.ParquetFileReader: reading another 1 footers
20/12/22 09:39:56 INFO hadoop.ParquetFileReader: Initiating action with parallelism: 5
file:            hdfs://localhost:20500/test-warehouse/li2/4b4d9143c575dd71-3f69d3cf00000001_1879643220_data.0.parq
creator:         impala version 4.0.0-SNAPSHOT (build 7b691c5d4249f0cb1ced8ddf01033fbbe10511d9)

file schema:     schema
--------------------------------------------------------------------------------
l_orderkey:      OPTIONAL INT64 L:INTEGER(64,true) R:0 D:1
l_partkey:       OPTIONAL INT64 L:INTEGER(64,true) R:0 D:1
l_suppkey:       OPTIONAL INT64 L:INTEGER(64,true) R:0 D:1
l_linenumber:    OPTIONAL INT32 L:INTEGER(32,true) R:0 D:1
l_quantity:      OPTIONAL FIXED_LEN_BYTE_ARRAY L:DECIMAL(12,2) R:0 D:1
l_extendedprice: OPTIONAL FIXED_LEN_BYTE_ARRAY L:DECIMAL(12,2) R:0 D:1
l_discount:      OPTIONAL FIXED_LEN_BYTE_ARRAY L:DECIMAL(12,2) R:0 D:1
l_tax:           OPTIONAL FIXED_LEN_BYTE_ARRAY L:DECIMAL(12,2) R:0 D:1
l_returnflag:    OPTIONAL BINARY R:0 D:1
l_linestatus:    OPTIONAL BINARY R:0 D:1
l_shipdate:      OPTIONAL BINARY R:0 D:1
l_commitdate:    OPTIONAL BINARY R:0 D:1
l_receiptdate:   OPTIONAL BINARY R:0 D:1
l_shipinstruct:  OPTIONAL BINARY R:0 D:1
l_shipmode:      OPTIONAL BINARY R:0 D:1
l_comment:       OPTIONAL BINARY R:0 D:1

row group 1:     RC:1724693 TS:58432195 OFFSET:4
--------------------------------------------------------------------------------
l_orderkey:       INT64 SNAPPY DO:4 FPO:159797 SZ:2839537/13147604/4.63 VC:1724693 ENC:RLE,RLE_DICTIONARY,PLAIN ST:[min: 2142211, max: 6000000, num_nulls: 0]
l_partkey:        INT64 SNAPPY DO:2839640 FPO:3028619 SZ:8179566/13852808/1.69 VC:1724693 ENC:RLE,RLE_DICTIONARY,PLAIN ST:[min: 1, max: 200000, num_nulls: 0]
l_suppkey:        INT64 SNAPPY DO:11019308 FPO:11059413 SZ:3063563/3103196/1.01 VC:1724693 ENC:RLE,RLE_DICTIONARY ST:[min: 1, max: 10000, num_nulls: 0]
l_linenumber:     INT32 SNAPPY DO:14082964 FPO:14083007 SZ:412884/650550/1.58 VC:1724693 ENC:RLE,RLE_DICTIONARY ST:[min: 1, max: 7, num_nulls: 0]
l_quantity:       FIXED_LEN_BYTE_ARRAY SNAPPY DO:14495934 FPO:14496204 SZ:1298038/1297963/1.00 VC:1724693 ENC:RLE,RLE_DICTIONARY ST:[min: 1.00, max: 50.00, num_nulls: 0]
l_extendedprice:  FIXED_LEN_BYTE_ARRAY SNAPPY DO:15794062 FPO:16003224 SZ:9087746/10429259/1.15 VC:1724693 ENC:RLE,RLE_DICTIONARY,PLAIN ST:[min: 904.00, max: 104949.50, num_nulls: 0]
l_discount:       FIXED_LEN_BYTE_ARRAY SNAPPY DO:24881912 FPO:24881976 SZ:866406/866338/1.00 VC:1724693 ENC:RLE,RLE_DICTIONARY ST:[min: 0.00, max: 0.10, num_nulls: 0]
l_tax:            FIXED_LEN_BYTE_ARRAY SNAPPY DO:25748406 FPO:25748463 SZ:866399/866325/1.00 VC:1724693 ENC:RLE,RLE_DICTIONARY ST:[min: 0.00, max: 0.08, num_nulls: 0]
l_returnflag:     BINARY SNAPPY DO:26614888 FPO:26614918 SZ:421113/421069/1.00 VC:1724693 ENC:RLE,RLE_DICTIONARY ST:[min: 0x41, max: 0x52, num_nulls: 0]
l_linestatus:     BINARY SNAPPY DO:27036081 FPO:27036106 SZ:262209/270332/1.03 VC:1724693 ENC:RLE,RLE_DICTIONARY ST:[min: 0x46, max: 0x4F, num_nulls: 0]
l_shipdate:       BINARY SNAPPY DO:27298370 FPO:27309301 SZ:2602937/2627148/1.01 VC:1724693 ENC:RLE,RLE_DICTIONARY ST:[min: 0x313939322D30312D3032, max: 0x313939382D31322D3031, num_nulls: 0]
l_commitdate:     BINARY SNAPPY DO:29901405 FPO:29912079 SZ:2602680/2626308/1.01 VC:1724693 ENC:RLE,RLE_DICTIONARY ST:[min: 0x313939322D30312D3331, max: 0x313939382D31302D3331, num_nulls: 0]
l_receiptdate:    BINARY SNAPPY DO:32504185 FPO:32515219 SZ:2603040/2627498/1.01 VC:1724693 ENC:RLE,RLE_DICTIONARY ST:[min: 0x313939322D30312D3036, max: 0x313939382D31322D3330, num_nulls: 0]
l_shipinstruct:   BINARY SNAPPY DO:35107326 FPO:35107408 SZ:434968/434917/1.00 VC:1724693 ENC:RLE,RLE_DICTIONARY ST:[min: 0x434F4C4C45435420434F44, max: 0x54414B45204241434B2052455455524E, num_nulls: 0]
l_shipmode:       BINARY SNAPPY DO:35542401 FPO:35542471 SZ:650639/650580/1.00 VC:1724693 ENC:RLE,RLE_DICTIONARY ST:[min: 0x414952, max: 0x545255434B, num_nulls: 0]
l_comment:        BINARY SNAPPY DO:36193124 FPO:36711343 SZ:22240470/52696671/2.37 VC:1724693 ENC:RLE,RLE_DICTIONARY,PLAIN ST:[min: 0x20546972657369617320, max: 0x7A7A6C653F20626C697468656C792069726F6E69, num_nulls: 0]

Change-Id: I90942022edcd5d96c720a1bde53879e50394660a
Reviewed-on: http://gerrit.cloudera.org:8080/16893
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-05 23:30:35 +00:00
Aman Sinha
49680559b0 IMPALA-10182: Don't add inferred identity predicates to SELECT node
For an inferred equality predicate of the form c1 = c2, if both sides
refer to the same underlying tuple and slot, it is an identity
predicate that must not be evaluated by the SELECT node, since doing
so would incorrectly eliminate NULL rows (NULL = NULL evaluates to
NULL, not TRUE). This patch fixes the behavior; a sketch of the
problem is shown below.

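A minimal sketch of how such a predicate can arise (table and column
names are hypothetical): with an inline view over t2, the planner can
infer v.id = t2.id from the equivalence between the view's output
column and the underlying slot. Both sides resolve to the same slot,
so evaluating the inferred predicate in the SELECT node would wrongly
filter out the rows where t2.id is NULL after the outer join:

select t1.id, v.id
from t1 left outer join (select id from t2) v on t1.id = v.id;
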
Testing:
 - Added planner tests with base table and with outer join
 - Added runtime tests with base table and with outer join
 - Added planner test for IMPALA-9694 (same root cause)
 - Ran PlannerTest .. no other plans changed

Change-Id: I924044f582652dbc50085851cc639f3dee1cd1f4
Reviewed-on: http://gerrit.cloudera.org:8080/16917
Reviewed-by: Aman Sinha <amsinha@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-05 23:04:25 +00:00
Zoltan Borok-Nagy
03af0b2c8c IMPALA-10422: EXPLAIN statements leak ACID transactions and locks
Currently EXPLAIN statements might open ACID transactions and
create locks on ACID tables.

This is unnecessary, since an EXPLAIN does not modify the table. But
the real problem is that these transactions and locks are leaked and
stay open forever. They even keep being heartbeated for as long as
the coordinator is running.

The solution is to not consume any ACID resources for EXPLAIN
statements.

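A sketch of the behavior being tested (assuming hypothetical tables
acid_tbl and src, the former being an ACID table):

EXPLAIN INSERT OVERWRITE acid_tbl SELECT * FROM src; -- must not open a transaction or take locks
INSERT OVERWRITE acid_tbl SELECT * FROM src;         -- the subsequent real INSERT is unaffected
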
Testing:
* Added EXPLAIN INSERT OVERWRITE in front of an actual INSERT OVERWRITE
  in an e2e test

Change-Id: I05113b1fd9a3eb2d0dd6cf723df916457f3fbf39
Reviewed-on: http://gerrit.cloudera.org:8080/16923
Reviewed-by: Csaba Ringhofer <csringhofer@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2021-01-05 21:31:05 +00:00
Fucun Chu
4099a60689 IMPALA-10317: Add query option that limits huge joins at runtime
This patch adds support for limiting the rows produced by a join node
such that runaway join queries can be prevented.

The limit is specified by a query option. Queries exceeding that limit
get terminated. The checking runs periodically, so the actual rows
produced may go somewhat over the limit.

JOIN_ROWS_PRODUCED_LIMIT is exposed as an advanced query option.

The query profile is updated to include query-wide and per-backend
metrics for the rows produced (RowsReturned). Example from:

set JOIN_ROWS_PRODUCED_LIMIT = 10000000;
select count(*) from tpch_parquet.lineitem l1 cross join
(select * from tpch_parquet.lineitem l2 limit 5) l3;

NESTED_LOOP_JOIN_NODE (id=2):
   - InactiveTotalTime: 107.534ms
   - PeakMemoryUsage: 16.00 KB (16384)
   - ProbeRows: 1.02K (1024)
   - ProbeTime: 0.000ns
   - RowsReturned: 10.00M (10002025)
   - RowsReturnedRate: 749.58 K/sec
   - TotalTime: 13s337ms

Testing:
 Added tests for JOIN_ROWS_PRODUCED_LIMIT

Change-Id: Idbca7e053b61b4e31b066edcfb3b0398fa859d02
Reviewed-on: http://gerrit.cloudera.org:8080/16706
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-22 06:10:39 +00:00
Fang-Yu Rao
1b863132c6 IMPALA-10211 (Part 1): Add support for role-related statements
This patch adds support for the following role-related statements
(a usage sketch follows the list).
1. CREATE ROLE <role_name>.
2. DROP ROLE <role_name>.
3. GRANT ROLE <role_name> TO GROUP <group_name>.
4. REVOKE ROLE <role_name> FROM GROUP <group_name>.
5. GRANT <privilege> ON <resource> TO ROLE <role_name>.
6. REVOKE <privilege> ON <resource> FROM ROLE <role_name>.
7. SHOW GRANT ROLE <role_name> ON <resource>.
8. SHOW ROLES.
9. SHOW CURRENT ROLES.
10. SHOW ROLE GRANT GROUP <group_name>.

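As a usage sketch, an end-to-end sequence might look like the
following (role, group, and resource names are hypothetical):

CREATE ROLE analyst_role;
GRANT ROLE analyst_role TO GROUP analysts;
GRANT SELECT ON DATABASE tpch TO ROLE analyst_role;
SHOW GRANT ROLE analyst_role ON DATABASE tpch;
REVOKE SELECT ON DATABASE tpch FROM ROLE analyst_role;
REVOKE ROLE analyst_role FROM GROUP analysts;
DROP ROLE analyst_role;

Note that, as described below, Ranger requires all privileges granted
to a role to be revoked before the role can be dropped, which is why
the REVOKE statements precede DROP ROLE here.
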
To support the first 4 statements, we implemented the methods of
createRole()/dropRole(), and grantRoleToGroup()/revokeRoleFromGroup()
with their respective API calls provided by Ranger. To support the 5th
and 6th statements, we modified createGrantRevokeRequest() so that the
cases in which the grantee or revokee is a role could be processed.
For the 7th statement, we slightly extended getPrivileges() to also
handle the case where the principal is a role. For the last 3 statements, to
make Impala's behavior consistent with that when Sentry was the
authorization provider, we based our implementation on
SentryImpaladAuthorizationManager#getRoles() at
https://gerrit.cloudera.org/c/15833/8/fe/src/main/java/org/apache/impala/authorization/sentry/SentryImpaladAuthorizationManager.java,
which was removed in IMPALA-9708 when we dropped the support for Sentry.

To test the implemented functionalities, we based our test cases on
those at
https://gerrit.cloudera.org/c/15833/8/testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test.
We note that until our tests can be run automatically in a Kerberized
environment (IMPALA-9360), running the statements
CREATE/DROP ROLE <role_name>,
GRANT/REVOKE ROLE <role_name> TO/FROM GROUP <group_name>, and
SHOW ROLES requires a workaround: we revised
security-applicationContext.xml, one of the files needed when the
Ranger server is started, so that the corresponding API calls can be
performed in a non-Kerberized environment.

While adding test cases to grant_revoke.test, we found the following
two major differences in Impala's behavior between Ranger and Sentry
as the authorization provider.
1. Before dropping a role in Ranger, we have to remove all the
privileges granted to the role in advance, which is not the case when
Sentry is the authorization provider.
2. The resource has to be specified in the statement
SHOW GRANT ROLE <role_name> ON <resource>, which was not required when
Sentry was the authorization provider. This could be partly due to the
fact that Ranger provides no API that allows Impala to directly
retrieve the list of all privileges granted to a specified role.
Due to the differences in Impala's behavior described above, we had to
revise the test cases in grant_revoke.test accordingly.

On the other hand, to retain as many test cases from the original
grant_revoke.test as possible, we had to explicitly add the 'USER'
test section to specify the user connecting to Impala for some
queries that require the connecting user to be a Ranger administrator,
e.g., CREATE/DROP ROLE <role_name> and
GRANT/REVOKE ROLE <role_name> TO/FROM GROUP <group_name>. The user has
to be 'admin' in the current grant_revoke.test, whereas it could be
the default user 'getuser()' in the original grant_revoke.test because
previously 'getuser()' was also a Sentry administrator.

Moreover, for some test cases we had to explicitly alter the owner of
a resource from the original grant_revoke.test when we want to prevent
the resource's original owner, e.g., its creator, from accessing it,
since the original grant_revoke.test was written without object
ownership taken into consideration.

We also note that in this patch we added the
@pytest.mark.execute_serially decorator to each test in
test_ranger.py, since we have observed that in some cases, e.g., when
only the E2E tests are run in the Jenkins environment, some tests do
not seem to be executed sequentially.

Testing:
 - Briefly verified that the implemented statements work as expected in
   a Kerberized cluster.
 - Verified that test_ranger.py passes in a local development
   environment.
 - Verified that the patch passes the exhaustive tests in the DEBUG
   build.

Change-Id: Ic2b204e62a1d8ae1932d955b4efc28be22202860
Reviewed-on: http://gerrit.cloudera.org:8080/16837
Reviewed-by: Quanlong Huang <huangquanlong@gmail.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-21 14:29:52 +00:00
Zoltan Borok-Nagy
296ed74d6f IMPALA-10380: INSERT INTO Iceberg tables with 'IDENTITY' partitions only
This patch adds support for INSERT INTO identity-partitioned
Iceberg tables.

Identity-partitioned Iceberg tables are similar to regular
partitioned tables; they are even stored in the same directory
structure. The difference is that the data files still store
the partitioning columns.

The INSERT INTO syntax is similar to the syntax for non-partitioned
tables, i.e.:

INSERT INTO <iceberg_tbl> VALUES (<val1>, <val2>, <val3>, ...);
Or,
INSERT INTO <iceberg_tbl> SELECT <val1>, <val2>, ... FROM <source_tbl>
(please note that we don't use the PARTITION keyword)

The values must be in column order corresponding to the table schema.
Impala will automatically create/find the partitions based on the
Iceberg partition spec.

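For example, with a hypothetical table ice_sales partitioned by an
identity transform on its sale_date column:

INSERT INTO ice_sales SELECT id, amount, sale_date FROM staging_sales;

Each row is written under the partition directory derived from its
sale_date value, and the matching partition metadata is created
automatically.
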
Partitioned Iceberg tables are stored as non-partitioned tables
in the Hive Metastore (similarly to partitioned Kudu tables). However,
the InsertStmt still generates the partition expressions for them.
These partition expressions are used to shuffle and sort the input
data so we don't end up writing too many files. The HdfsTableSink
also uses the partition expressions to write the data files with
the proper partition paths.

Iceberg is able to parse the partition paths to generate the
corresponding metadata for the partitions. This happens at the
end in IcebergCatalogOpExecutor.

Testing:
 * added planner test to verify shuffling and sorting
 * added negative tests for unsupported features like PARTITION clause
   and non-identity partition transforms
 * e2e tests with partitioned inserts

Change-Id: If98797a2bfdc038d0467c8f83aadf1a12e1d69d4
Reviewed-on: http://gerrit.cloudera.org:8080/16825
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-17 08:54:51 +00:00
Zoltan Borok-Nagy
a7e71b4523 IMPALA-10358: Correct Iceberg type mappings
The Iceberg format spec defines what types to use for different file
formats, e.g.: https://iceberg.apache.org/spec/#parquet

Impala should follow the specification, so this patch
 * annotates strings with UTF8 in Parquet metadata
 * removes fixed(L) <-> CHAR(L) mapping
 * forbids INSERTs when the Iceberg schema has a TIMESTAMPTZ column
   (see the sketch below)

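A sketch of this last point (hypothetical table name): with an
Iceberg table ice_events whose schema contains a TIMESTAMPTZ column,

INSERT INTO ice_events VALUES (1, '2020-12-15 19:17:51');

is now rejected, since Impala's TIMESTAMP values are not
timezone-adjusted and therefore cannot faithfully represent
TIMESTAMPTZ.
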
This patch also refactors the type/schema conversions as
Impala => Iceberg conversions were duplicated in
IcebergCatalogOpExecutor and IcebergUtil. I introduced the class
'IcebergSchemaConverter' to contain the code for conversions.

Testing:
 * added test to check CHAR and VARCHAR types are not allowed
 * test that INSERTs are not allowed when the table has TIMESTAMPTZ
 * added test to check that strings are annotated with UTF8

Change-Id: I652565f82708824f5cf7497139153b06f116ccd3
Reviewed-on: http://gerrit.cloudera.org:8080/16851
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-15 19:17:51 +00:00
Zoltan Borok-Nagy
87b95a5568 IMPALA-10386: Don't allow PARTITION BY SPEC for non-Iceberg tables
PARTITION BY SPEC is only valid for Iceberg tables so Impala should
raise an error when it is used for non-Iceberg tables.

Testing:
 * added e2e test

Change-Id: I6b3ec3e84476614cb11e801b6d89d84eb384dd43
Reviewed-on: http://gerrit.cloudera.org:8080/16846
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-14 20:47:10 +00:00
Zoltan Borok-Nagy
eb8b118db5 IMPALA-10384: Make partition names consistent between BE and FE
In the BE we build partition names with a trailing '/', while in the
FE we build them without one (e.g. 'year=2020/month=12/' vs.
'year=2020/month=12'). We should make this consistent, because the
mismatch forces some annoying string adjustments in the FE and can
cause hidden bugs.

This patch creates partition names without the trailing '/' in both
the BE and the FE, following Hive's behavior, which also prints
partition names without a trailing '/'.

Testing:
 * Ran exhaustive tests

Change-Id: I7e40111e2d1148aeb01ebc985bbb15db7d6a6012
Reviewed-on: http://gerrit.cloudera.org:8080/16850
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-11 19:51:28 +00:00
skyyws
a850cd3cc6 IMPALA-10361: Use field id to resolve columns for Iceberg tables
This patch adds support for resolving columns by field id for Iceberg
tables. Impala now always uses the Iceberg field ids to resolve
columns, which means 'PARQUET_FALLBACK_SCHEMA_RESOLUTION' has no
effect for Iceberg tables.

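A minimal sketch (hypothetical table name): the fallback option below
is now ignored when the scanned table is an Iceberg table.

set PARQUET_FALLBACK_SCHEMA_RESOLUTION=name;
select * from ice_tbl;  -- columns are still resolved by field id
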
Change-Id: I057bdc6ab2859cc4d40de5ed428d0c20028b8435
Reviewed-on: http://gerrit.cloudera.org:8080/16788
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
2020-12-10 19:01:08 +00:00
Aman Sinha
b5ba793227 IMPALA-10360: Allow simple limit to be treated as sampling hint
As a follow-up to IMPALA-10314, it is sometimes useful to consider
a simple limit as a way to sample from a table if a relevant hint
has been provided. Doing a sample instead of a pure limit serves
dual purposes: (a) it still helps reduce the planning time,
since the scan ranges need to be computed only for the sample files;
(b) it allows a sufficient number of files/rows to be read from
the table such that after applying filter conditions or joins with
another table, the query may still produce the N rows needed for
the limit.

This functionality is especially useful if the query is against a
view. Note that the TABLESAMPLE clause cannot be applied to a view,
and embedding a TABLESAMPLE explicitly on a table within a view would
not work, because we don't want to sample if there's no limit.

In this patch, a new table level hint, 'convert_limit_to_sample(n)'
is added. If this hint is attached to a table either in the main
query block or within a view/subquery and simple limit optimization
conditions are satisfied (according to IMPALA-10314), the limit
is converted to a table sample. The parameter 'n' in parenthesis is
required and specifies the sample percentage. It must be an integer
between 1 and 100. For example:

 set optimize_simple_limit = true;
 CREATE VIEW v1 as SELECT * FROM T [convert_limit_to_sample(5)]
    WHERE [always_true] <predicate>;
 SELECT * FROM v1 LIMIT 10;

In this case, the limit 10 is applied on top of a 5 percent sample
of T which is applied after partition pruning.

Testing:
 - Added an alltypes_date_partition_2 table where the date and
   timestamp values match (this helps with setting the
   'always_true' hint).
 - Added views with 'convert_limit_to_sample' and 'always_true'
   hints and added new tests against the views. Modified a few
   existing tests to reference the new table variant.
 - Added an end-to-end test.

Change-Id: Ife05a5343c913006f7659949b327b63d3f10c04b
Reviewed-on: http://gerrit.cloudera.org:8080/16792
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-10 07:15:36 +00:00
Tim Armstrong
f684ed72c5 IMPALA-10252: fix invalid runtime filters for outer joins
The planner generates runtime filters for non-join conjuncts
assigned to LEFT OUTER and FULL OUTER JOIN nodes. This is
correct in many cases where NULLs stemming from unmatched rows
would result in the predicate evaluating to false. E.g.
x = y can never evaluate to true if y is NULL.

However, it is incorrect if the NULL returned from the unmatched
row can result in the predicate evaluating to true. E.g.
x = isnull(y, 1) can return true even if y is NULL.

The fix is to detect cases when the source expression from the
left input of the join returns non-NULL for null inputs and then
skip generating the filter.

Examples of expressions that may be affected by this change are
COALESCE and ISNULL.

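A sketch of the problematic pattern (hypothetical tables):

select count(*)
from t1 left outer join t2 on t1.id = t2.id
where t1.c = isnull(t2.c, 1);

For an unmatched row, t2.c is NULL and isnull(t2.c, 1) evaluates to
1, so the predicate can still be true. A runtime filter derived from
this conjunct could therefore incorrectly prune rows at the scan;
with this patch the planner detects that isnull() returns non-NULL
for NULL inputs and skips generating the filter.
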
Testing:
Added regression tests:
* Planner tests for LEFT OUTER and FULL OUTER where the runtime
  filter was incorrectly generated before this patch.
* Enabled end-to-end test that was previously failing.
* Added a new runtime filter test that will execute on both
  Parquet and Kudu (which are subtly different because of nullability of
  slots).

Ran exhaustive tests.

Change-Id: I507af1cc8df15bca21e0d8555019997812087261
Reviewed-on: http://gerrit.cloudera.org:8080/16622
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-08 03:15:44 +00:00
Zoltan Borok-Nagy
579f5c67e0 IMPALA-10364: Set the real location for external Iceberg tables stored in HadoopCatalog
Impala tries to derive the table location of external Iceberg
tables stored in HadoopCatalog. The current method is not correct for
tables that are nested under multiple namespaces (e.g. in
HadoopCatalog a table identified as ns_a.ns_b.tbl lives under
<warehouse>/ns_a/ns_b/tbl).

With this patch Impala loads the Iceberg table and retrieves the
location from it.

Testing:
 * added e2e test in iceberg-create.test

Change-Id: I04b75d219e095ce00b4c48f40b8dee872ba57b78
Reviewed-on: http://gerrit.cloudera.org:8080/16795
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-12-02 22:42:12 +00:00
Qifan Chen
63146103a7 IMPALA-9355: TestExchangeMemUsage.test_exchange_mem_usage_scaling doesn't hit the memory limit
This patch reduces the memory limit for the following query in the
test_exchange_mem_usage_scaling test from 170MB to 164MB, to reduce
the chance that the query completes without hitting a memory
allocation failure.

set mem_limit=<limit_in_mb>;
set num_scanner_threads=1;
select *
from tpch_parquet.lineitem l1
  join tpch_parquet.lineitem l2 on l1.l_orderkey = l2.l_orderkey and
      l1.l_partkey = l2.l_partkey and l1.l_suppkey = l2.l_suppkey
      and l1.l_linenumber = l2.l_linenumber
order by l1.l_orderkey desc, l1.l_partkey, l1.l_suppkey,
l1.l_linenumber limit 5;

In a test with 500 executions of the above query with the memory
limit set to 164MB, there were 500 memory allocation failures in
total (one in each execution), and a total of 266 of them from
Exchange Node #4.

Testing:
  Ran the query in question individually;
  Ran TestExchangeMemUsage.test_exchange_mem_usage_scaling test;
  Ran core tests.

Change-Id: Id945d7e37fac07beb7808e6ccf8530e667cbaad4
Reviewed-on: http://gerrit.cloudera.org:8080/16791
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-11-30 22:33:57 +00:00
Aman Sinha
5530b62539 IMPALA-10314: Optimize planning time for simple limits
This patch optimizes the planning time for simple limit
queries by only considering a minimal set of partitions
whose file descriptors add up to N (the specified limit).
Each file is conservatively estimated to contain 1 row.

This reduces the number of partitions processed by
HdfsScanNode.computeScanRangeLocations() which, according
to query profiling, has been the main contributor to the
planning time, especially for a large number of partitions.
Further, within each partition, we only consider as many
non-empty files as needed to bring the total to N.

This is an opt-in optimization. A new planner option
OPTIMIZE_SIMPLE_LIMIT enables this optimization. Further,
if there's a WHERE clause, it must have an 'always_true'
hint in order for the optimization to be considered. For
example:
  set optimize_simple_limit = true;
  SELECT * FROM T
    WHERE /* +always_true */ <predicate>
  LIMIT 10;

If there are too many empty files in the partitions, the query may
produce fewer than N rows, although the rows returned are still
valid.

Testing:
 - Added planner tests for the optimization
 - Ran query_test.py tests by enabling the optimize_simple_limit
 - Added an e2e test. Since result rows are non-deterministic,
   only simple count(*) query on top of subquery with limit
   was added.

Change-Id: I9d6a79263bc092e0f3e9a1d72da5618f3cc35574
Reviewed-on: http://gerrit.cloudera.org:8080/16723
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
2020-11-28 07:30:06 +00:00