Our NumericLiteral is backed by a BigDecimal which cannot
represent the special float values NaN, infinity or negative zero.
As a result, when evaluating constant expressions from the FE we
hit an exception when trying to create a NumericLiteral from
a NaN or infinity value. Previously, negative zero was silently
converted to zero, which is dangerous.
The fix is to treat the expr evaluation as a failure and not
replace the constant Expr with a LiteralExpr.
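A minimal Java sketch of the check described above. The class and method names are hypothetical and not taken from the Impala source; the empty Optional stands in for "treat evaluation as a failure and keep the original Expr".

```java
import java.math.BigDecimal;
import java.util.Optional;

// Hypothetical helper; names are illustrative, not the actual FE code.
final class ConstantFoldingSketch {
  // True for the float values that a BigDecimal cannot represent faithfully.
  static boolean isSpecialFloat(double v) {
    // -0.0 == 0.0 is true in Java, so compare raw bit patterns instead.
    boolean negativeZero =
        Double.doubleToRawLongBits(v) == Double.doubleToRawLongBits(-0.0d);
    return Double.isNaN(v) || Double.isInfinite(v) || negativeZero;
  }

  // Returns the value for a literal, or empty to signal that the constant
  // expr should be left in place instead of being replaced by a literal.
  static Optional<BigDecimal> tryFold(double v) {
    if (isSpecialFloat(v)) return Optional.empty();
    // Safe here: BigDecimal.valueOf() throws NumberFormatException only for
    // NaN and infinities, which were filtered out above.
    return Optional.of(BigDecimal.valueOf(v));
  }
}
```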
Change-Id: I8243b2ee9fa9c470d078b385583f2f48b606a230
Reviewed-on: http://gerrit.cloudera.org:8080/5050
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
Adds a new ExprRewriteRule to extract common conjuncts from
disjunctions.
Examples:
(a AND b AND c) OR (b AND d) ==> b AND ((a AND c) OR (d))
(a AND b) OR (a AND b) ==> a AND b
(a AND b AND c) OR (c) ==> c
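The three examples above can be reproduced with a small sketch that models each disjunct as a list of conjunct names. This is an illustration under that simplifying assumption, not the actual rule; the class name is hypothetical and the input is assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: each disjunct is a list of conjunct names.
final class CommonConjunctsSketch {
  static String rewrite(List<List<String>> disjuncts) {
    // Conjuncts shared by every disjunct, in order of first appearance.
    Set<String> common = new LinkedHashSet<>(disjuncts.get(0));
    for (List<String> d : disjuncts) common.retainAll(d);

    List<String> remainders = new ArrayList<>();
    boolean someDisjunctFullyCovered = false;
    for (List<String> d : disjuncts) {
      List<String> rest = new ArrayList<>(d);
      rest.removeAll(common);
      // A disjunct made up only of common conjuncts makes the OR redundant.
      if (rest.isEmpty()) someDisjunctFullyCovered = true;
      remainders.add("(" + String.join(" AND ", rest) + ")");
    }
    String orPart = String.join(" OR ", remainders);
    if (common.isEmpty()) return orPart;  // nothing to extract
    String andPart = String.join(" AND ", common);
    if (someDisjunctFullyCovered) return andPart;
    return andPart + " AND (" + orPart + ")";
  }
}
```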
Adds a new query option ENABLE_EXPR_REWRITES to enable/disable
non-essential expr rewrites in the FE. Note that some rewrites
are required, e.g., BetweenToCompoundRule. Disabling the rewrites
is useful for testing, in particular, to make sure that the exprs
specified in expr-test.cc are executed as written.
Testing: Added a new unit test in ExprRewriteRulesTest.
Change-Id: I3cf9b950afaa3fd753d1b09ba5e540b5258940ad
Reviewed-on: http://gerrit.cloudera.org:8080/4877
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
Removes the non-standard IGNORE syntax that was allowed for
DML into Kudu tables to indicate that certain errors should
be ignored, i.e. not fail the query and continue. However,
because there is no way to 'roll back' mutations that
occurred before an error occurs, tables are left in an
inconsistent state and it's difficult to know what rows were
successfully modified vs which rows were not. Instead, this
change makes it so that we always 'ignore' these conflicts,
i.e. a 'best effort' approach. In the future, when Kudu provides
the mechanisms Impala needs to provide a notion of isolation
levels, then Impala will be able to provide options for more
traditional semantics.
After this change, the following errors are ignored:
* INSERT where the PK already exists
* UPDATE/DELETE where the PK doesn't exist
Another follow-up patch will change other violations to be
handled in this way as well, e.g. nulls inserted in
non-nullable cols.
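The 'best effort' semantics can be illustrated with an in-memory stand-in for a Kudu table. This sketch is purely illustrative: the class, its fields and the counters are hypothetical and do not reflect how the Kudu sink is implemented.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a table keyed by primary key; illustrative only.
final class BestEffortDmlSketch {
  final Map<Integer, String> rows = new HashMap<>();
  int rowsModified = 0;
  int conflictsIgnored = 0;

  // INSERT where the PK already exists is skipped, not treated as a failure.
  void insert(int pk, String value) {
    if (rows.containsKey(pk)) { conflictsIgnored++; return; }
    rows.put(pk, value);
    rowsModified++;
  }

  // UPDATE where the PK does not exist is skipped.
  void update(int pk, String value) {
    if (!rows.containsKey(pk)) { conflictsIgnored++; return; }
    rows.put(pk, value);
    rowsModified++;
  }

  // DELETE where the PK does not exist is skipped.
  void delete(int pk) {
    if (!rows.containsKey(pk)) { conflictsIgnored++; return; }
    rows.remove(pk);
    rowsModified++;
  }
}
```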
Reporting:
The number of rows inserted is reported to the coordinator,
which makes the aggregate available to the shell and via the
profile.
TODO: Return rows modified for INSERT via HS2 (IMPALA-1789).
TODO: Return rows modified for other CRUD (beeswax+hs2) (IMPALA-3713).
TODO: Return error counts for specific warnings (IMPALA-4416).
Testing:
Updated tests. Ran all functional tests. More tests will be
needed when other conflicts are handled in the same way.
Change-Id: I83b5beaa982d006da4997a2af061ef7c22cad3f1
Reviewed-on: http://gerrit.cloudera.org:8080/4911
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
This patch introduces a new query statement, UPSERT, for Kudu
tables. It operates like an INSERT and uses the same analysis,
planning, and execution machinery as INSERT, except that on a
primary key collision it performs an update instead of
returning an error.
New syntax:
[with_clause] UPSERT INTO [TABLE] table_name [(column list)]
{
query_stmt
| VALUES (value [, value...]) [, (value [, value...]) ...]
}
where column list must contain all of the key columns in
table_name, if specified, and table_name must be a Kudu table.
This patch also improves the behavior of INSERTing into Kudu
tables without specifying all of the key columns - this now
results in an analysis exception, rather than attempting the
INSERT and receiving an error back from Kudu.
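A rough sketch of the two behaviors described above, using an in-memory map as the table. The names are hypothetical, and IllegalArgumentException stands in for the analysis exception.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the actual analysis or sink code.
final class UpsertSketch {
  // Analysis-time rule: an explicit column list must name every key column.
  static void checkKeyColumns(List<String> columnList, List<String> keyColumns) {
    for (String key : keyColumns) {
      if (!columnList.contains(key)) {
        throw new IllegalArgumentException(
            "Column list is missing key column: " + key);
      }
    }
  }

  // UPSERT semantics: insert the row, or overwrite it on a key collision.
  // Returns true if an existing row was updated. Values are assumed non-null.
  static boolean upsert(Map<Integer, String> table, int pk, String value) {
    return table.put(pk, value) != null;
  }
}
```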
Change-Id: I8df5cea36b642e267f85ff6b163f3dd96b8386e9
Reviewed-on: http://gerrit.cloudera.org:8080/4047
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
This commit adds support for non-covering range partitions in Kudu
tables. The SPLIT ROWS clause is now deprecated and no longer supported.
The following new syntax provides more flexibility in creating range
partitions and it supports bounded and unbounded ranges as well as single value
partitions; multi-column range partitions are supported as well.
The new syntax is:
DISTRIBUTE BY RANGE (col_list)
(
PARTITION lower_1 <[=] VALUES <[=] upper_1,
PARTITION lower_2 <[=] VALUES <[=] upper_2,
....
PARTITION lower_n <[=] VALUES <[=] upper_n,
PARTITION VALUE = val_1,
....
PARTITION VALUE = val_n
)
Multi-column range partitions are specified as follows:
DISTRIBUTE BY RANGE (col1, col2,..., coln)
(
PARTITION VALUE = (col1_val, col2_val, ..., coln_val),
....
PARTITION VALUE = (col1_val, col2_val, ..., coln_val)
)
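To illustrate what "non-covering" means for the bounds above, here is a small single-column sketch with integer keys. The Range class is hypothetical, and a null bound models an unbounded side. A value that falls into no partition yields -1.

```java
import java.util.List;

// Hypothetical single-column model of the range syntax above.
final class RangePartitionSketch {
  static final class Range {
    final Integer lower;  // null means unbounded below
    final boolean lowerInclusive;
    final Integer upper;  // null means unbounded above
    final boolean upperInclusive;

    Range(Integer lower, boolean lowerInclusive,
          Integer upper, boolean upperInclusive) {
      this.lower = lower;
      this.lowerInclusive = lowerInclusive;
      this.upper = upper;
      this.upperInclusive = upperInclusive;
    }

    // PARTITION VALUE = v is a range with both bounds equal and inclusive.
    static Range singleValue(int v) { return new Range(v, true, v, true); }

    boolean contains(int v) {
      if (lower != null && (lowerInclusive ? v < lower : v <= lower)) return false;
      if (upper != null && (upperInclusive ? v > upper : v >= upper)) return false;
      return true;
    }
  }

  // Returns the index of the first matching partition, or -1 if the value
  // is not covered by any range.
  static int findPartition(List<Range> partitions, int v) {
    for (int i = 0; i < partitions.size(); i++) {
      if (partitions.get(i).contains(v)) return i;
    }
    return -1;
  }
}
```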
Change-Id: I6799c01a37003f0f4c068d911a13e3f060110a06
Reviewed-on: http://gerrit.cloudera.org:8080/4856
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Internal Jenkins
The underlying problem was that for trivial/constant [NOT] EXISTS subqueries
we substituted out Subqueries with bool literals using an ExprSubstitutionMap,
but the Subquery.equals() function was not implemented properly, so we ended
up matching Subqueries to the wrong entry in the ExprSubstitutionMap.
This could ultimately lead to wrong plans and results.
Testing: Corrected an existing test and modified an existing test for
extra coverage.
Change-Id: I5562d98ce36507aa5e253323e184fd42b54f27ed
Reviewed-on: http://gerrit.cloudera.org:8080/4923
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
Introduces a new phase for rewriting Exprs after analysis and
before subquery rewriting. The transformed Exprs replace the
original ones in analyzed statements. If Exprs were changed,
the whole statement is reset() and re-analyzed, similar to how
subqueries are rewritten. If both Exprs and subqueries are
rewritten there is only one re-analysis of the changed statement.
The following new classes work together to perform transformations:
1. ExprRewriteRule
- base class for Expr transformation rules
2. ExprRewriter
- drives the transformation of Exprs using a list of
ExprRewriteRules
Statements that have exprs to be rewritten need to implement
a new method rewriteExprs() that accepts an ExprRewriter.
As an example, this patch adds a rule for converting
BetweenPredicates into their equivalent CompoundPredicates.
The BetweenPredicate has been notoriously buggy due to a lack
of such a separate rewrite phase and is now cleaned up.
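The division of labor between a rule and the rewriter can be sketched as follows. This is a simplified model, not the actual classes: the Expr type, the Rule interface and the bottom-up traversal are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for Expr, ExprRewriteRule and ExprRewriter.
final class RewriteSketch {
  static final class Expr {
    final String op;
    final List<Expr> children;

    Expr(String op, Expr... children) {
      this.op = op;
      this.children = Arrays.asList(children);
    }

    @Override public String toString() {
      if (children.isEmpty()) return op;
      List<String> parts = new ArrayList<>();
      for (Expr c : children) parts.add(c.toString());
      return op + "(" + String.join(", ", parts) + ")";
    }
  }

  // A rule returns the same instance when it does not apply.
  interface Rule { Expr apply(Expr e); }

  // BETWEEN(v, lo, hi) becomes AND(>=(v, lo), <=(v, hi)).
  static final Rule BETWEEN_TO_COMPOUND = e -> {
    if (!e.op.equals("BETWEEN") || e.children.size() != 3) return e;
    Expr v = e.children.get(0);
    return new Expr("AND",
        new Expr(">=", v, e.children.get(1)),
        new Expr("<=", v, e.children.get(2)));
  };

  // Rewrites children first, then applies the rules at this node until
  // none of them changes it any more.
  static Expr rewrite(Expr e, List<Rule> rules) {
    Expr[] kids = new Expr[e.children.size()];
    for (int i = 0; i < kids.length; i++) {
      kids[i] = rewrite(e.children.get(i), rules);
    }
    Expr current = new Expr(e.op, kids);
    boolean changed;
    do {
      changed = false;
      for (Rule r : rules) {
        Expr next = r.apply(current);
        if (next != current) { current = next; changed = true; }
      }
    } while (changed);
    return current;
  }
}
```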
Testing:
1. Added a new test for checking that the rewrite framework
covers all relevant statements, clauses and can properly
handle nested statements and subqueries.
2. Added a new test for ExprRewriteRules and implemented
tests for the BetweenPredicate rewrite.
3. There are many existing tests for BetweenPredicates and
they all exercise the new rewrite rule/phase.
4. Ran a private core/hdfs run and it passed.
Change-Id: I2279dc984bcf7742db4fa3b1aa67283ecbb05e6e
Reviewed-on: http://gerrit.cloudera.org:8080/4746
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
This removes the data structures that were "superseded" in
IMPALA-3903 and changes all control flow to utilize the
new data structures. The new data structures are renamed
to remove the "Mt" prefix.
Change-Id: I465d0e15e2cf17cafe4c747d34c8f595d3645151
Reviewed-on: http://gerrit.cloudera.org:8080/4853
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: Tim Armstrong <tarmstrong@cloudera.com>
This change introduces a clustered/noclustered hint for insert
statements. Specifying this hint adds an additional sort node to the
plan, just before the table sink. This has the effect that data will be
clustered by its partition prior to writing partitions, which therefore
can be written sequentially.
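The effect of the extra sort node can be illustrated by counting how often a sequential writer would have to switch partitions. This is a toy model with hypothetical names; rows are reduced to their partition key.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model: each row is represented only by its partition key.
final class ClusteredInsertSketch {
  // Counts how often a sequential writer has to move to a different partition.
  static int partitionSwitches(List<String> partitionKeys) {
    int switches = 0;
    String current = null;
    for (String key : partitionKeys) {
      if (!key.equals(current)) {
        switches++;
        current = key;
      }
    }
    return switches;
  }

  // Models the added sort node: rows are ordered by partition key before
  // they reach the table sink.
  static List<String> clusterByPartition(List<String> partitionKeys) {
    List<String> sorted = new ArrayList<>(partitionKeys);
    sorted.sort(Comparator.naturalOrder());
    return sorted;
  }
}
```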
Change-Id: I412153bd8435d792bd61dea268d7a3b884048f14
Reviewed-on: http://gerrit.cloudera.org:8080/4745
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
With this commit we simplify the syntax and handling of CREATE TABLE
statements for both managed and external Kudu tables.
Syntax example:
CREATE TABLE foo(a INT, b STRING, PRIMARY KEY (a, b))
DISTRIBUTE BY HASH (a) INTO 3 BUCKETS,
RANGE (b) SPLIT ROWS (('abc', 'def'))
STORED AS KUDU
Changes:
1) Remove the requirement to specify table properties such as key
columns in tblproperties.
2) Read table schema (column definitions, primary keys, and distribution
schemes) from Kudu instead of the HMS.
3) For external tables, the Kudu table is now required to exist at the
time of creation in Impala.
4) Disallow table properties that could conflict with an existing
table. Ex: key_columns cannot be specified.
5) Add KUDU as a file format.
6) Add a startup flag to impalad to specify the default Kudu master
addresses. The flag is used as the default value for the table
property kudu_master_addresses but it can still be overridden
using TBLPROPERTIES.
7) Fix a post merge issue (IMPALA-3178) where DROP DATABASE CASCADE
wasn't implemented for Kudu tables and silently ignored. The Kudu
tables wouldn't be removed in Kudu.
8) Remove DDL delegates. There was only one functional delegate (for
Kudu); the existence of the other delegate and the use of delegates in
general has led to confusion. The Kudu delegate only exists to provide
functionality missing from Hive.
9) Add PRIMARY KEY at the column and table level. This syntax is fairly
standard. When used at the column level, only one column can be
marked as a key. When used at the table level, multiple columns can
be used as a key. Only Kudu tables are allowed to use PRIMARY KEY.
The old "kudu.key_columns" table property is no longer accepted
though it is still used internally. "PRIMARY" is now a keyword.
The ident style declaration is used for "KEY" because it is also used
for nested map types.
10) For managed tables, infer a Kudu table name if none was given.
The table property "kudu.table_name" is optional for managed tables
and is required for external tables. If for a managed table a Kudu
table name is not provided, a table name will be generated based
on the HMS database and table name.
11) Use Kudu master as the source of truth for table metadata instead
of HMS when a table is loaded or refreshed. Table/column metadata
are cached in the catalog and are stored in HMS in order to be
able to use table and column statistics.
Change-Id: I7b9d51b2720ab57649abdb7d5c710ea04ff50dc1
Reviewed-on: http://gerrit.cloudera.org:8080/4414
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
StmtRewrite lost the parentheses of a CompoundPredicate in
pushNegationToOperands(), which led to an incorrect toSql() result. Although
this issue does not cause incorrect query results, it can confuse users about
the logical operator precedence of predicates shown in EXPLAIN output.
Change-Id: I79bfc67605206e0e026293bf7032a88227a95623
Reviewed-on: http://gerrit.cloudera.org:8080/4753
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
MT_DOP > 0 is only supported for plans without distributed joins
or table sinks. Adds validation to fail unsupported queries
gracefully in planning.
For scans in queries that are executable with MT_DOP > 0 we either
use the optimized MT scan node BE implementation (only Parquet), or
we use the conventional scan node with num_scanner_threads=1.
TODO: Still need to add end-to-end tests.
Change-Id: I91a60ea7b6e3ae4ee44be856615ddd3cd0af476d
Reviewed-on: http://gerrit.cloudera.org:8080/4677
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
The plan-root fragment instance that runs on the coordinator should be
handled like all others: started via RPC and run asynchronously. Without
this, the fragment requires special-case code throughout the
coordinator, and does not show up in system metrics etc.
This patch adds a new sink type, PlanRootSink, to the root fragment
instance so that the coordinator can pull row batches that are pushed by
the root instance. The coordinator signals completion to the fragment
instance via closing the consumer side of the sink, whereupon the
instance is free to complete.
Since the root instance now runs asynchronously with respect to the coordinator,
we add several coordination methods to allow the coordinator to wait for
a point in the instance's execution to be hit - e.g. to wait until the
instance has been opened.
Done in this patch:
* Add PlanRootSink
* Add coordination to PFE to allow coordinator to observe lifecycle
* Make FragmentMgr a singleton
* Removed dead code from Coordinator::Wait() and elsewhere.
* Moved result output exprs out of QES and into PlanRootSink.
* Remove special-case limit-based teardown of coordinator fragment, and
supporting functions in PlanFragmentExecutor.
* Simplified lifecycle of PlanFragmentExecutor by separating Open() into
Open() and Exec(), the latter of which drives the sink by reading
rows from the plan tree.
* Add child profile to PlanFragmentExecutor to measure time spent in
each lifecycle phase.
* Removed dependency between InitExecProfiles() and starting root
fragment.
* Removed mostly dead-code handling of LIMIT 0 queries.
* Ensured that SET returns a result set in all cases.
* Fix test_get_log() HS2 test. Errors are only guaranteed to be visible
after fetch calls return EOS, but test was assuming this would happen
after first fetch.
Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df
Reviewed-on: http://gerrit.cloudera.org:8080/4402
Tested-by: Internal Jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
This adds a tie-break to make sure that we sort predicates in a
deterministic order on Java 7 and 8. This was suggested by Alex in
IMPALA-3644.
There are still three broken tests when run in Java 8, but it seems best
to address them in a subsequent change.
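A sketch of the tie-break idea: when the primary sort key is equal, a second key makes the result independent of the input order. The Pred class and the choice of SQL text as the secondary key are assumptions for illustration; this message does not state which key the change actually uses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical predicate model with a selectivity and its SQL text.
final class DeterministicSortSketch {
  static final class Pred {
    final String sql;
    final double selectivity;

    Pred(String sql, double selectivity) {
      this.sql = sql;
      this.selectivity = selectivity;
    }
  }

  // Primary key: selectivity. Tie-break: SQL text (an assumed choice).
  static final Comparator<Pred> ORDER =
      Comparator.comparingDouble((Pred p) -> p.selectivity)
          .thenComparing(p -> p.sql);

  static List<String> sortedSql(List<Pred> preds) {
    List<Pred> copy = new ArrayList<>(preds);
    copy.sort(ORDER);
    List<String> out = new ArrayList<>();
    for (Pred p : copy) out.add(p.sql);
    return out;
  }
}
```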
Change-Id: Id11010bfeaff368869e6d430eeb4773ddf41faff
Reviewed-on: http://gerrit.cloudera.org:8080/4671
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
This change removes some of the occurrences of the strings 'CDH'/'cdh'
from the Impala repository. References to Cloudera-internal Jiras have
been replaced with upstream Jira issues on issues.cloudera.org.
For several categories of occurrences (e.g. pom.xml files,
DOWNLOAD_CDH_COMPONENTS) I also created a list of follow-up Jiras to
remove the occurrences left after this change.
Change-Id: Icb37e2ef0cd9fa0e581d359c5dd3db7812b7b2c8
Reviewed-on: http://gerrit.cloudera.org:8080/4187
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
Fixed 2 issues:
- The getSelectivity() method sometimes returned NaN double values which
could not be sorted properly.
- The compare method for sorting runtime filters was switched to use
the builtin Double comparison method.
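The following sketch shows why NaN breaks a hand-written comparison and how Double.compare() behaves instead. naiveCompare() is a hypothetical example of a problematic comparator, not the code that was replaced, and sanitize() is one possible guard rather than the actual fix to getSelectivity().

```java
import java.util.ArrayList;
import java.util.List;

final class SelectivitySortSketch {
  // Every ordered comparison with NaN is false, so this reports NaN as
  // "equal" to any value and violates the Comparator contract.
  static int naiveCompare(double a, double b) {
    return a < b ? -1 : (a > b ? 1 : 0);
  }

  // Hypothetical guard: replace NaN with a caller-supplied fallback.
  static double sanitize(double selectivity, double fallback) {
    return Double.isNaN(selectivity) ? fallback : selectivity;
  }

  // Double.compare() defines a total order in which NaN sorts last.
  static List<Double> sortAscending(List<Double> values) {
    List<Double> out = new ArrayList<>(values);
    out.sort(Double::compare);
    return out;
  }
}
```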
Change-Id: Iad433f2ece423ea29e79e81b68fa53cb0af18378
Reviewed-on: http://gerrit.cloudera.org:8080/4652
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
Folding const exprs where there were implicit casts on the
slot resulted in the predicate not being pushed to Kudu.
Change-Id: I3bab22d90ee00a054c847de6c734b4f24a3f5a85
Reviewed-on: http://gerrit.cloudera.org:8080/4613
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
In case of count(distinct), FunctionCallExpr.analyze() changes type
for "NULL" into "BOOLEAN" to make sure that BE doesn't see any
"NULL_TYPE" exprs. In the meantime, Expr substitution, happening in
Expr.substituteImpl() reverts this change back to original type,
"NULL_TYPE".
This causes an issue when AggregateInfo.checkConsistency() performs
precondition check where slot types from
AggregateInfo.outputTupleDesc_ should be matched with the types from
AggregateInfo.groupingExpr_. The slot type shows "BOOLEAN" while type
from groupingExpr_ is "NULL_TYPE", which makes the precondition fail
and throws an exception.
To resolve the issue, preserveRootType is set to true when
Expr.substituteList() gets called in AggregateInfo.substitute().
Change-Id: Icf3b4511234e473e5b9548fbf3e97f333c9980f1
(cherry picked from commit b17785b4890bedd1c825140ce3c48cd7d9734295)
Reviewed-on: http://gerrit.cloudera.org:8080/4600
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
The underlying issue was already fixed in IMPALA-3940.
This patch adds a new regression test to cover IMPALA-4206.
Change-Id: I5b164000c7b0ce7e2f296d168d75a6860f5963d8
Reviewed-on: http://gerrit.cloudera.org:8080/4556
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
Adds initial support for the functional-query test workload
for Kudu tables.
There are a few issues that make loading the functional
schema difficult on Kudu:
1) Kudu tables must have one or more columns that together
constitute a unique primary key.
a) Primary key columns must currently be the first columns
in the table definition (KUDU-1271).
b) Primary key columns cannot be nullable (KUDU-1570).
2) Kudu tables must be specified with distribution
parameters.
(1) limits the tables that can be loaded without ugly
workarounds. This patch only includes important tables that
are used for relevant tests, most notably the alltypes*
family. In particular, alltypesagg is important but it does
not have a set of columns that are non-nullable and form a unique
primary key. As a result, that table is created in Kudu with
a different name and an additional BIGINT column for a PK
that is a unique index and is generated at data loading time
using the ROW_NUMBER analytic function. A view is then
wrapped around the underlying table that matches the
alltypesagg schema exactly. When KUDU-1570 is resolved, this
can be simplified.
(2) requires some additional considerations and custom
syntax. As a result, the DDL to create the tables is
explicitly specified in CREATE_KUDU sections in the
functional_schema_constraints.csv, and an additional
DEPENDENT_LOAD_KUDU section was added to specify custom data
loading DML that differs from the existing DEPENDENT_LOAD.
TODO: IMPALA-4005: generate_schema_statements.py needs refactoring
Tests that are not relevant or not yet supported have been
marked with xfail and a skip where appropriate.
TODO: Support remaining functional tables/tests when possible.
Change-Id: Iada88e078352e4462745d9a9a1b5111260d21acc
Reviewed-on: http://gerrit.cloudera.org:8080/4175
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
Switches the planner and KuduScanNode to use Kudu's new
ScanToken API instead of explicitly constructing scan ranges
for all tablets of a table, regardless of whether they were
needed. The ScanToken API allows Impala to specify the
projected columns and predicates during planning, and Kudu
returns a set of 'scan tokens' that represent a scanner for
each tablet that needs to be scanned. The scan tokens can
be serialized and distributed to the scan nodes, which can
then deserialize them into Kudu scanner objects. Upon
deserialization, the scan token has all scan parameters
already, including the 'pushed down' predicates. Impala no
longer needs to send the Kudu predicates to the BE and
convert them at the scan node.
This change also fixes:
1) IMPALA-4016: Avoid materializing slots only referenced
by Kudu conjuncts
2) IMPALA-3874: Predicates are not always pushed to Kudu
TODO: Consider additional planning improvements.
Testing: Updated the existing tests, verified everything
works as expected. Some BE tests no longer make sense and
they were removed.
TODO: When KUDU-1065 is resolved, add tests that demonstrate pruning.
Change-Id: I160e5849d372755748ff5ba3c90a4651c804b220
Reviewed-on: http://gerrit.cloudera.org:8080/4120
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
Fixes inserts into partitioned tables that have a shuffle hint and only constant
partition exprs. The rows to be inserted are merged at the coordinator where
the table sink is executed. There is no need to hash exchange rows.
Now accepts insert hints when inserting into unpartitioned tables. The shuffle
hint leads to a plan where all rows are merged at the coordinator where
the table sink is executed.
Change-Id: I1084d49c95b7d867eeac3297fd2016daff0ab687
Reviewed-on: http://gerrit.cloudera.org:8080/4162
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: Internal Jenkins
When deciding between a broadcast or repartition join, Impala calculates
the cost of each join as the total amount of data that is sent over the
network. This ignores some relevant costs, and can lead to bad plans.
One such relevant cost is the work to create the hash table used in the
join. This patch accounts for this by adding the amount of data inserted
into the hash table (the size of the right side of the join) to the
previous cost.
This generally increases the estimated cost of broadcast joins relative
to repartitioning joins, as the broadcast join must build the hash table
on each node the data was broadcast to, so its effect will be to make
repartitioning joins more likely to be chosen, especially in large
clusters.
This patch has not yet been performance tested.
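A back-of-the-envelope sketch of the cost model described above. The formulas are an interpretation of this message (bytes sent over the network, plus bytes inserted into hash tables), and the names are hypothetical; the planner's actual computation may differ.

```java
// Hypothetical cost model; sizes are in bytes.
final class JoinCostSketch {
  // Broadcast: the right side is sent to every node, and every node builds
  // its own hash table over the full right side.
  static long broadcastCost(long rhsBytes, int numNodes, boolean includeBuildCost) {
    long network = rhsBytes * numNodes;
    long hashBuild = includeBuildCost ? rhsBytes * numNodes : 0;
    return network + hashBuild;
  }

  // Repartition: both sides are shuffled once, and the right side is
  // inserted into hash tables once in total across the cluster.
  static long partitionCost(long lhsBytes, long rhsBytes, boolean includeBuildCost) {
    long network = lhsBytes + rhsBytes;
    long hashBuild = includeBuildCost ? rhsBytes : 0;
    return network + hashBuild;
  }

  static boolean preferBroadcast(long lhsBytes, long rhsBytes, int numNodes,
      boolean includeBuildCost) {
    return broadcastCost(rhsBytes, numNodes, includeBuildCost)
        <= partitionCost(lhsBytes, rhsBytes, includeBuildCost);
  }
}
```

With a 1000-byte left side, a 100-byte right side and 10 nodes, the network-only model picks broadcast (1000 vs. 1100), while the model that includes the build cost picks repartitioning (2000 vs. 1200).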
Change-Id: I03a0f56f69c8deae68d48dfdb9dc95b71aec11f1
Reviewed-on: http://gerrit.cloudera.org:8080/4098
Tested-by: Internal Jenkins
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
The bug was also fixed by the fix for IMPALA-3063:
532b1fe118
This patch adds a regression test for IMPALA-2540.
Change-Id: I7c7dececfee90540fe7d5f8a606381ec50a3b241
Reviewed-on: http://gerrit.cloudera.org:8080/4071
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
Testing: Ran the FE planner tests. Examined all the changed plans
to verify that the changes are beneficial according to our
cardinality estimates. Still need to do a real perf run.
Change-Id: I8ba903f1df2446350cca7e71fdb13f550bf9de72
Reviewed-on: http://gerrit.cloudera.org:8080/4035
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
As of Kudu 0.9, DISTRIBUTE BY is now required when creating
a new Kudu table. Create table analysis, data loading, and
tests are updated to reflect this.
This also bumps the Kudu version to 0.10.0.
Change-Id: Ieb15110b10b28ef6dd8ec136c2522b5f44dca43e
Reviewed-on: http://gerrit.cloudera.org:8080/3987
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
Before this change joins were inverted while doing join ordering.
That approach was unnecessarily complex because it required
modifying the global analysis state for correct conjunct
placement, etc. However, join inversion is independent of join
ordering, and the existing approach could lead to generating
invalid plans with distributed non-equi right outer/semi joins,
which we cannot execute in the backend.
After this change joins are inverted in a separate pass over
the single-node plan. This simplifies the inversion
logic and allows us to avoid generating those invalid plans.
Note that this change is not only a separation of functionality
for the following reasons:
1. Our join cardinality estimation is not symmetric, i.e., A JOIN B
may not give the same estimate as B JOIN A due to our FK/PK detection
heuristic. In the context of this patch this means that an inverted
join may have a different cardinality estimate, so plans may change
depending on whether the inversion is done during join ordering or after.
2. We currently only invert outer/semi/anti joins based on the rhs table
ref join op. In this patch I want to preserve the existing behavior as
much as possible, but when doing the join ordering in a separate pass we
may see a join op in a JoinNode that is different from the rhs table ref.
So in some situations the inversion behavior based on the join op could be
different and there are some examples in this patch.
This patch also moves the logic of converting hash joins to
nested-loop joins into a separate pass over the single-node plan.
Change-Id: If86db7753fc585bb4c69612745ec0103278888a4
Reviewed-on: http://gerrit.cloudera.org:8080/3846
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
The bug: During join ordering we rely on the column stats of
join predicates for estimating the join cardinality. We have code
that tries to find the stats of a column through views but there
was a bug in identifying slots that belong to base table scans.
The bug led us to incorrectly accept slots of view references
which do not have stats.
This patch fixes the above issue and adds new test infrastructure
for creating test-local views. It adds a TPCH-equivalent database that
contains views of the form "select * from tpch_basetbl" for all TPCH
tables and adds tests for the plans of all TPCH queries on the view database.
Change-Id: Ie3b62a5e7e7d0e84850749108c13991647cedce6
Reviewed-on: http://gerrit.cloudera.org:8080/3865
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
The bug: Our BetweenPredicate has a complex object structure that is unlike
most other Exprs because we generate an equivalent CompoundPredicate during
analysis and replace the original children. Keeping the various members in
sync and preserving the object structure during clone() and substitute() is
very difficult and error prone. In particular, subquery rewriting is
difficult because we extract and replace correlated BinaryPredicates.
Substituting BinaryPredicates in a BetweenPredicate's children is not
equivalent to a substitution on the BetweenPredicate's original children,
so keeping the various redundant members in sync is quite difficult.
The fix is to replace BetweenPredicates with their equivalent CompoundPredicates
before performing subquery rewrites.
We ultimately still want to fix clone() and substitute() for BetweenPredicates,
but an elegant solution is likely to be more involved.
Change-Id: I0838b30444ed9704ce6a058d30718a24caa7444a
Reviewed-on: http://gerrit.cloudera.org:8080/3804
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
This commit adds a set of planner tests for Kudu tables based on the 22
TPC-H queries.
Change-Id: I6c40534b72b9aa1ee582b9679c2a63cad52df703
Reviewed-on: http://gerrit.cloudera.org:8080/3790
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Internal Jenkins
The bug: For correct predicate assignment we rely on TableRef.getAllTupleIds()
and TableRef.getMaterializedTupleIds(). The implementation of those functions
used to traverse the chain of table refs and collect the appropriate ids.
However, during plan generation we alter the chain of table refs, in particular,
for dealing with nested collections, so those altered TableRefs do not return the
expected list of ids, leading to wrong decisions in predicate assignment.
The fix: Cache the lists of ids during analysis, so we are free to alter the
chain of TableRefs during plan generation.
Change-Id: I298b8695c9f26644a395ca9f0e86040e3f5f3846
Reviewed-on: http://gerrit.cloudera.org:8080/2415
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
Added support for the 'ignore nulls' keyword to the last_value and
first_value analytic functions, eg. 'last_value(col ignore nulls)',
which would return the last value from the window that is not null,
or null if all of the values in the window are null.
We handle 'ignore nulls' in the FE in the same way that we handle
'distinct' - by adding isIgnoreNulls as a field in FunctionParams.
To avoid affecting performance when 'ignore nulls' is not used, and
to avoid having to special case 'ignore nulls' on the backend, this
patch adds 'last_value_ignore_nulls' and 'first_value_ignore_nulls'
builtin analytic functions that wrap 'last_value' and 'first_value'
respectively.
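The semantics of the two new functions can be sketched over a materialized window. This only illustrates the expected results; the class name is hypothetical and the builtin functions do not operate on Java lists.

```java
import java.util.List;

// Illustrates the expected result of 'ignore nulls' over one window.
final class IgnoreNullsSketch {
  // Last non-null value in the window, or null if all values are null.
  static <T> T lastValueIgnoreNulls(List<T> window) {
    for (int i = window.size() - 1; i >= 0; i--) {
      if (window.get(i) != null) return window.get(i);
    }
    return null;
  }

  // First non-null value in the window, or null if all values are null.
  static <T> T firstValueIgnoreNulls(List<T> window) {
    for (T value : window) {
      if (value != null) return value;
    }
    return null;
  }
}
```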
Change-Id: Ic27525e2237fb54318549d2674f1610884208e9b
Reviewed-on: http://gerrit.cloudera.org:8080/3328
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Internal Jenkins
There were two separate issues:
First, the SortNode incorrectly picked up unassigned conjuncts, and expected those to
be empty. In the case where predicates are migrated into union operands, there could
actually be unassigned conjuncts bound by the SortNode's tuple id (and so would be
incorrectly picked up). The fix is to not pick up unassigned conjuncts in the SortNode,
and allow them to be picked up later (into a SelectNode).
Second, when generating the plan for union operands we were missing a call to
graft a SelectNode on top of the operand plan to capture unassigned conjuncts.
Change-Id: I95d105ac15a3dc975e52dfd418890e13f912dfce
Reviewed-on: http://gerrit.cloudera.org:8080/3600
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Alex Behm <alex.behm@cloudera.com>
PlanNode includes a 'capAtLimit()' method that can be used in
'computeStats()' on PlanNodes to ensure they do not estimate their
cardinality to be more than a pushed-down LIMIT clause.
This patch ensures that 'capAtLimit()' is used in all of the relevant
classes descending from PlanNode.
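A minimal sketch of the capping logic. It assumes that -1 encodes both "no limit" and "unknown cardinality"; the real method is an instance method on PlanNode and may differ in detail.

```java
// Hypothetical standalone version of the capping logic.
final class CapAtLimitSketch {
  static final long UNKNOWN = -1;

  static long capAtLimit(long cardinality, long limit) {
    if (limit == UNKNOWN) return cardinality;  // no LIMIT was pushed down
    if (cardinality == UNKNOWN) return limit;  // the limit is the best estimate
    return Math.min(cardinality, limit);
  }
}
```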
Change-Id: Ic06dcb93bbb2510c0d40151302bd817ef340b825
Reviewed-on: http://gerrit.cloudera.org:8080/3127
Reviewed-by: Jim Apple <jbapple@cloudera.com>
Tested-by: Internal Jenkins
This commit fixes an issue where an IllegalStateException is thrown
while generating runtime filters if a target expr of a join conjunct
is wrapped in an IF(TupleIsNull, NULL, e) expr. As this is not a valid
expr to be assigned to a scan node (target of a runtime filter), we
unwrap these exprs and replace exprs of the form IF(TupleIsNull, NULL, e)
with 'e' while producing the target exprs for runtime filters. The original expr
of the join conjunct is not modified.
Change-Id: I2e3e207b4c8522283a1cd0d14be83d42eba58f5a
Reviewed-on: http://gerrit.cloudera.org:8080/3147
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Internal Jenkins
With this commit runtime filters can be assigned to multiple destination
nodes (scans). For each filter, the destination nodes are determined
using equivalence classes during planning. For each filter, all its
destination nodes are in the left subtree rooted at the join node
that constructs this filter. A runtime filter may have both
local and remote targets. The backend determines how to route each
filter depending on the number and type (local, remote) of its destination
nodes.
With this commit, we enable runtime filter propagation in all the
operands of UNION [ALL|DISTINCT] nodes.
Change-Id: Iad2ce4e579a30616c469312a4e658140d317507b
Reviewed-on: http://gerrit.cloudera.org:8080/2932
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Internal Jenkins
Before this patch, correlated exists and not exists subqueries were
rewritten as left semi and anti joins respectively. Uncorrelated
exists subqueries were rewritten as cross joins, and uncorrelated
not-exists subqueries were not supported at all. This patch takes
advantage of the nested loop join that was recently introduced, which
allows us to rewrite both correlated and uncorrelated exists subqueries
as left semi joins and both correlated and uncorrelated not-exists
subqueries as anti joins.
Change-Id: I52ae12f116d026190f3a2a7575cda855317d11e8
Reviewed-on: http://gerrit.cloudera.org:8080/2792
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Internal Jenkins
New classes:
- ParallelPlanner: creates build plans, assigns plans to cohorts
- JoinBuildSink: DataSink for plan fragments that materialize build sides
- ids for plans, hash tables, plan fragments
Tests: this adds a new test file section PARALLELPLANS and augments the tpc-h/-ds tests with
those sections.
In the interest of keeping this patch small I didn't augment other test files with that
section yet (which will happen at a later date, to cover more corner cases).
Change-Id: Ic3c34dd3f9190a131e6f03d901b4bfcd164a5174
Reviewed-on: http://gerrit.cloudera.org:8080/2846
Tested-by: Internal Jenkins
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Added costs to all Exprs, which estimate the relative cost of evaluating
an expression and all of its children. Costs are calculated during
analysis. For now, these costs are intended as a simple way to order
expressions from cheap to expensive, not necessarily to be a precise
reflection of running times.
In general, expressions that deal with variable length types like strings
will have higher cost than those dealing with fixed length types
like numbers and booleans. Additionally, expressions with complicated
subexpressions will have higher cost than simpler expressions.
Also added PlanNode.orderConjunctsByCost, which takes a list of Exprs and
returns a new list sorted according to an estimate of the cheapest order to
evaluate the conjuncts in, based on their cost and selectivity.
The conjuncts are sorted by repeatedly iterating over them and choosing the
conjunct that would result in the least total estimated work were it to be
applied before the remaining conjuncts. Selectivities are exponentially
backed off, and Exprs without selectivity estimates are given a reasonable
default.
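The greedy ordering described above can be sketched roughly as follows.
This is an illustrative Python model, not the actual Java code in
PlanNode.orderConjunctsByCost: the Conjunct class, the 0.1 default
selectivity, and the exact backoff exponent are assumptions made for the
example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed placeholder default; the real default value is not stated in this message.
DEFAULT_SELECTIVITY = 0.1


@dataclass
class Conjunct:
    """Hypothetical stand-in for an analyzed Expr."""
    name: str
    cost: float                   # relative cost of evaluating the expression
    selectivity: Optional[float]  # fraction of rows kept, None if unknown


def order_conjuncts_by_cost(conjuncts: List[Conjunct]) -> List[Conjunct]:
    """Greedily pick the conjunct with the least estimated total work if run next."""
    remaining = list(conjuncts)
    ordered: List[Conjunct] = []
    while remaining:
        total_cost = sum(c.cost for c in remaining)
        # Assumed backoff: each later pick has its selectivity dampened further.
        backoff_exp = 1.0 / (len(ordered) + 1)

        def work_if_first(c: Conjunct) -> float:
            sel = c.selectivity if c.selectivity is not None else DEFAULT_SELECTIVITY
            sel = sel ** backoff_exp
            # Cost of c on every row, plus the other conjuncts on the rows c keeps.
            return c.cost + (total_cost - c.cost) * sel

        best = min(remaining, key=work_if_first)
        remaining.remove(best)
        ordered.append(best)
    return ordered
```

In this model a cheap, selective conjunct is chosen first, and an
expensive conjunct tends to sink to the end even when it is moderately
selective.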
Change-Id: I02279a26fbc6308ac5eb819d78345fc010469034
Reviewed-on: http://gerrit.cloudera.org:8080/2598
Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Tested-by: Internal Jenkins
This follows up on a TODO from the Kudu merge and also fixes a bug:
IMPALA-976 changed the computation of selectivities for a combined
list of conjuncts to better handle expressions with no selectivity
estimate. The Kudu implementation was forked from before this change
and thus did not have an equivalent change.
This refactors the algorithm to a new static method and calls it from
both PlanNode and KuduScanNode so that the selectivity estimate
behavior is the same regardless of whether Kudu can evaluate the
predicate server-side.
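The shape of this refactoring can be sketched as one shared helper that
both node types call. This is a hedged Python illustration only: the
class names mirror the message, but the method names, the 0.1 default,
and the backoff formula inside combined_selectivity are assumptions, not
the actual Impala code.

```python
from typing import List, Optional

# Assumed placeholder default for conjuncts without a selectivity estimate.
DEFAULT_SELECTIVITY = 0.1


def combined_selectivity(selectivities: List[Optional[float]]) -> float:
    """Shared helper (hypothetical): combine per-conjunct selectivities."""
    known = [s for s in selectivities if s is not None]
    if len(known) != len(selectivities):
        # Assumption: all conjuncts lacking an estimate share one default entry.
        known.append(DEFAULT_SELECTIVITY)
    known.sort()
    result = 1.0
    for i, sel in enumerate(known):
        # Assumed exponential backoff: the most selective term counts fully,
        # each following term is dampened by a smaller exponent.
        result *= sel ** (1.0 / (i + 1))
    return max(0.0, min(1.0, result))


class PlanNode:
    def __init__(self, conjunct_selectivities: List[Optional[float]]):
        self.conjunct_selectivities = conjunct_selectivities

    def compute_selectivity(self) -> float:
        return combined_selectivity(self.conjunct_selectivities)


class KuduScanNode(PlanNode):
    def __init__(self, conjunct_selectivities, kudu_selectivities):
        super().__init__(conjunct_selectivities)
        self.kudu_selectivities = kudu_selectivities

    def compute_selectivity(self) -> float:
        # Predicates pushed to Kudu go through the same helper as the rest.
        return combined_selectivity(
            self.conjunct_selectivities + self.kudu_selectivities)
```

Because both classes delegate to the same function, the estimate does
not depend on which side ends up evaluating a given predicate.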
Todd tested this on TPCH 3TB and verified that the plans are reasonable
now where they used to be nonsense.
Change-Id: Id507077b577ed5804fc80517f33ea185f2bff41a
Reviewed-on: http://gerrit.cloudera.org:8080/2628
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Internal Jenkins
This commit unblocks queries materializing only scalar typed
columns on tables backed by RC/sequence files containing complex
typed columns. This worked prior to the 2.3.0 release.
Change-Id: I3a89b211bdc01f7e07497e293fafd75ccf0500fe
Reviewed-on: http://gerrit.cloudera.org:8080/2580
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
Grouping aggregations previously always repartitioned their input,
even if preceding joins or aggs had already partitioned the data on the
required key (or an equivalent key). This patch checks to see if data is
already partitioned on the required exprs (or equivalent ones), and if
so skips the preaggregation and only does a merge aggregation.
The patch also does some refactoring of the aggregation planning in
DistributedPlanner to make it easier to implement the change.
Includes planner tests for the three cases that are affected:
grouping aggregations, non-grouping distinct aggregations and
grouping distinct aggregations.
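The partition check in this patch can be modeled roughly as below. This
is a simplified Python sketch, not the DistributedPlanner code: the
function names and the equivalence-class map are hypothetical, and the
sketch only treats the case where the existing partition exprs match the
grouping exprs exactly, up to equivalence.

```python
from typing import Dict, FrozenSet, Iterable


def canonical(exprs: Iterable[str], equiv_class: Dict[str, str]) -> FrozenSet[str]:
    """Map each expr to its equivalence-class id (or itself if it has none)."""
    return frozenset(equiv_class.get(e, e) for e in exprs)


def needs_repartition(input_partition_exprs: Iterable[str],
                      grouping_exprs: Iterable[str],
                      equiv_class: Dict[str, str]) -> bool:
    """Return True if the grouping aggregation must reshuffle its input."""
    input_partition_exprs = list(input_partition_exprs)
    if not input_partition_exprs:
        # Input is not hash-partitioned on anything useful.
        return True
    return (canonical(input_partition_exprs, equiv_class)
            != canonical(grouping_exprs, equiv_class))


# Example: after a join on t1.id = t2.id, both columns share a class.
equiv = {"t1.id": "class0", "t2.id": "class0"}
skip_exchange = not needs_repartition(["t1.id"], ["t2.id"], equiv)
```

When needs_repartition returns False, the planner in this model would
skip the preaggregation and exchange and run only the merge aggregation.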
Change-Id: Iffdcfd3629b8a69bd23915e1adba3b8323cbbaef
Reviewed-on: http://gerrit.cloudera.org:8080/2414
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
This merges the 'feature/kudu' branch with cdh5-trunk as of commit:
055500cc753f87f6d1c70627321fcc825044e183
This patch is not a pure merge patch in the sense that it goes beyond conflict
resolution to also address reviews to the 'feature/kudu' branch as a whole.
The review items and their resolution can be inspected at:
http://gerrit.cloudera.org:8080/#/c/1403/
Change-Id: I6dd4270cd17a4f5c02811c343726db3504275a92
Marcel spotted that nested TPCH-Q18 can be expressed with
more efficient SQL.
Results on nested TPCH-300:
Before 160s
After 100s
Change-Id: I8b351b7f467e8bef0c256dc43cea325d7f177edf
Reviewed-on: http://gerrit.cloudera.org:8080/2418
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
The bug:
Evaluating !empty() predicates at non-scan nodes interacts
poorly with our BE projection of collection slots. For example,
rows could incorrectly be filtered if a !empty() predicate is
assigned to a plan node that comes after the unnest of the
collection that also performs the projection.
The fix:
This patch reworks the generation of !empty() predicates
introduced in IMPALA-2663 for correctness purposes.
The predicates are generated only in cases where we can ensure that
they will be assigned only to the parent scan, and to no other
plan node.
The conditions are as follows:
- collection table ref is relative and non-correlated
- collection table ref represents the rhs of an inner/cross/semi join
- collection table ref's parent tuple is not outer joined
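The three conditions above combine into a single conjunction, sketched
here in Python. The CollectionRef fields and the join-op labels are
hypothetical names chosen for illustration; they are not the analyzer's
actual API.

```python
from dataclasses import dataclass

# Hypothetical labels for the join kinds that qualify.
ELIGIBLE_JOIN_OPS = {"INNER", "CROSS", "SEMI"}


@dataclass
class CollectionRef:
    """Hypothetical summary of a collection table ref as seen by the analyzer."""
    is_relative: bool
    is_correlated: bool
    join_op: str                     # join in which this ref is the rhs
    parent_tuple_outer_joined: bool


def can_generate_not_empty_predicate(ref: CollectionRef) -> bool:
    """All three listed conditions must hold for a !empty() predicate."""
    return (ref.is_relative
            and not ref.is_correlated
            and ref.join_op in ELIGIBLE_JOIN_OPS
            and not ref.parent_tuple_outer_joined)
```

If any one condition fails, no !empty() predicate is generated for that
ref in this model, which is the conservative choice for correctness.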
Change-Id: Ie975ce139a103285c4e9f93c59ce1f1d2aa71767
Reviewed-on: http://gerrit.cloudera.org:8080/2399
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Reviewed-by: Silvius Rus <srus@cloudera.com>
Tested-by: Internal Jenkins
The bug: On-clause predicates belonging to an inner join were not always assigned
correctly if they referenced an outer-joined tuple. Specifically, our logic
for detecting whether a predicate can be assigned below an outer join, provided
it is also left at the outer-join node, was not correct. As a result, we assigned
the predicate below the join, but did not also leave it at the outer join.
The fix: Assign an inner join On-clause conjunct that references an outer-joined
tuple to the join that the On-clause belongs to.
Change-Id: Iffef7718679d48f866fa90fd3257f182cbb385ae
Reviewed-on: http://gerrit.cloudera.org:8080/2309
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins