Commit Graph

23 Commits

Author SHA1 Message Date
Dimitris Tsirogiannis
5a6f53db16 Add partition pruning tests
The following changes are included in this commit:
1. Modified the alltypesagg table to include an additional partition key
that has nulls.
2. Added a number of tests in hdfs.test that exercise the partition
pruning logic (see IMPALA-887).
3. Modified all the tests that are affected by the change in alltypesagg.

Change-Id: I1a769375aaa71273341522eb94490ba5e4c6f00d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2874
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3236
2014-06-24 02:14:27 -07:00
Alex Behm
ef6705d7e0 Rename MergeNode to UnionNode.
Change-Id: I9e3675a103757db1345b04bd1d102d2719efddd0
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3128
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3154
Tested-by: Alex Behm <alex.behm@cloudera.com>
2014-06-19 12:44:21 -07:00
Alex Behm
677062be3d Rework planning of unions s.t. a UnionStmt produces a single MergeNode.
This patch changes the planning of a UnionStmt s.t. it always produces a single fragment
with a MergeNode connecting all child fragments as its root.
The data partition of the returned fragment and how the child fragments are merged
depends on the data partitions of the child fragments:
- All child fragments are unpartitioned or partitioned: The returned fragment is
  has a UNPARTITIONED or RANDOM data partition, respectively. The MergeNode absorbs
  the plan trees of all child fragments.
- Mixed partitioned/unpartitioned child fragments: The returned fragment is
  RANDOM partitioned. The plan trees of all partitioned child fragments are absorbed
  into the MergeNode. All unpartitioned child fragments are connected to the
  MergeNode via a RANDOM exchange, and remain unchanged otherwise.

Also adds support for random partitioned data exchanges.

Change-Id: I82b2d12c104d98c4e7133234653ee1b67658ef7a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2876
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3143
2014-06-19 00:56:58 -07:00
Nong Li
5d903efca3 ExecSummary
The runtime profile as we present it is not very useful and I think the structure of
it makes it hard to consume. This patch adds a new client facing schemed set of
counters that are collected from the runtime profiles. For example, with this structure
it would be easy to have the shell get the stats of a running query and print a useful
progress report or to check the most relevant metrics for diagnosing issues.

Here's an example of the output for one of the tpch queries:
Operator              #Hosts   Avg Time   Max Time    #Rows  Est. #Rows  Peak Mem  Est. Peak Mem  Detail
------------------------------------------------------------------------------------------------------------------------
09:MERGING-EXCHANGE        1   79.738us   79.738us        5           5         0        -1.00 B  UNPARTITIONED
05:TOP-N                   3   84.693us   88.810us        5           5  12.00 KB       120.00 B
04:AGGREGATE               3    5.263ms    6.432ms        5           5  44.00 KB       10.00 MB  MERGE FINALIZE
08:AGGREGATE               3   16.659ms   27.444ms   52.52K     600.12K   3.20 MB       15.11 MB  MERGE
07:EXCHANGE                3    2.644ms      5.1ms   52.52K     600.12K         0              0  HASH(o_orderpriority)
03:AGGREGATE               3  342.913ms  966.291ms   52.52K     600.12K  10.80 MB       15.11 MB
02:HASH JOIN               3    2s165ms    2s171ms  144.87K     600.12K  13.63 MB      941.01 KB  INNER JOIN, BROADCAST
|--06:EXCHANGE             3    8.296ms    8.692ms   57.22K      15.00K         0              0  BROADCAST
|  01:SCAN HDFS            2    1s412ms    1s978ms   57.22K      15.00K  24.21 MB      176.00 MB  tpch.orders o
00:SCAN HDFS               3    8s032ms    8s558ms    3.79M     600.12K  32.29 MB      264.00 MB  tpch.lineitem l

Change-Id: Iaad4b9dd577c375006313f19442bee6d3e27246a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2964
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-06-11 03:10:11 -07:00
Nong Li
8f4dc0f2f0 IMPALA-974: Switch from FloatLiteral to DecimalLiteral.
Float/Doubles are lossy so using those as the default literal type
is problematic.

Change-Id: I5a619dd931d576e2e6cd7774139e9bafb9452db9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2758
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-05-31 22:19:06 -07:00
Alex Behm
15e05082c0 IMPALA-831: Distributed aggregation and top-n over unions.
Change-Id: I056e8271421008378db93e8b2393861cc9dd4b90
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1840
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1886
2014-03-13 15:42:31 -07:00
Alex Behm
a615ebc549 IMPALA-822,IMP-1271: Binding predicates on an aggregation now properly trigger slot materialization.
The bug was that the number of materialized agg-tuple slots did not correspond to the number
of materialized agg functions, due to binding predicates against an AggNode causing slot
materialization after SelectStmt.materializeRequiredSlots().
This patch fixes the issue by taking binding predicates (bound to a slot in an agg tuple)
into consideration in SelectStmt.materializeRequiredSlots().
I added a new sanity check in AggregationNode.toThrift() surfaced another issue with slot
materialization that is also fixed in this patch. The ordering exprs must be marked before
the agg exprs in SelectStmt.materializeRequiredSlots() because the odering exprs may contain
agg exprs that are only referenced inside the ORDER BY clause.

Change-Id: I1bdc0466f583907bed625ce6608938e59faee83f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1639
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1818
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
2014-03-08 00:25:26 -08:00
Alex Behm
f71767c612 IMPALA-846: Add additional regression test.
This issue has coincidentally been resolved by the fix for IMPALA-820.
This patch adds an additional regression test for explicitly covering
IMPALA-846.

Change-Id: Ib60174676e5bb53de543a1db30adc05cef4d6593
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1719
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1730
2014-03-03 16:50:36 -08:00
Alex Behm
cb8150e8ee IMPALA-817: Check equality of function name in Function.equals().
Change-Id: Ib9b4ee3a21f90fdb0d7ebccd89462dc67040bd1e
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1594
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1611
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
2014-02-19 17:13:51 -08:00
Nong Li
0d2919fe7f Refactor scalar and aggregate function analysis and execution.
This patch cleans up analysis and execution of scalar and aggregate functions
so that there is no difference between how builtins and user functions are
handled. The only difference is that the catalog is populated with the builtins
all the time.

The BE always gets a TFunction object and just executes it (builtins will have
an empty hdfs file location).

This removes the opcode registry and all of the functionality is subsumed by
the catalog, most of which was already duplicated there anyway.

This also introduces the concept of a system database; databases that the
user cannot modify and is populated automatically on startup.

Change-Id: Iaa3f84dad0a1a57691f5c7d8df7305faf01d70ed
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1386
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1577
2014-02-18 18:40:08 -08:00
Alex Behm
6799c93922 Simplified/enhanced explain plans with a total of four explain levels.
There are now 4 explain levels summarized as follows:
- Level 0: MINIMAL
  Non-fragmented parallel plan only showing plan nodes with minimal attributes
- Level 1: STANDARD
  Non-fragmented parallel plan with some details in plan nodes
- Level 2: EXTENDED
  Non-fragmented parallel plan with full details in plan nodes including
  the table/column stats, row size, #hosts, cardinality,
  and estimated per-host memory requirement
- Level 3: VERBOSE
  Fragmented parallel plan with full details (like level 2)

This patch also includes several bugfixes related to plan costing and/or
testing of explain plans.

Change-Id: I622310f01d1b3d53ea1031adaf3b3ffdd94eba30
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1211
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-01-10 19:17:59 -08:00
Marcel Kornacker
fd182201bc Predicate propagation plus join order optimization.
Introduces STRAIGHT_JOIN keyword to prevent join order optimization.

Structural changes to the planning framework:
- slot materialization: the decision whether to materialize a slot now happens *prior* to
  plan generation. This is needed in order to be able to generate accurate cost estimates
  at plan generation time. see QueryStmt.materializeRequiredSlots()
- added PlanNode.init(), which initializes the entire state of a PlanNode; this subsumes
  finalize()
  * computeMemLayout() now happens per-tuple in the corresponding ScanNode's init()
  * init() calls computeStats() by default; also marks slots as materialized and calls
    TupleDescriptor.computeMemLayout()
- added PlanNode.tblRefIds_
- restructured UnionStmt and union plan generation to fit pred propagation model:
  all tuples are created (and equiv predicates registered) prior to plan generation
- added Expr.isAuxExpr

Change-Id: I475c1645bfca9e84ae6e5f529e7781d9532e5c9a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/955
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-01-08 10:54:24 -08:00
Alex Behm
3f54240fed PlannerTest uses explain level 'normal'. Only add stats and costs to explain output in 'verbose' mode.
Change-Id: I827b4c7085b5aa2dc5521f8748d8973178f43f4c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/678
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:23 -08:00
Alex Behm
4bb8b38cde Added stats and cost estimates to explain output.
Change-Id: I1273745a439fd25cefa4e08ecc075c98cc8bfc45
Reviewed-on: http://gerrit.ent.cloudera.com:8080/602
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Alex Behm <alex.behm@cloudera.com>
2014-01-08 10:53:22 -08:00
Nong Li
15db34e356 AggregationNode refactoring
This patch redoes how the aggregation node is implemented. The functionality is
now split between aggregation-node, agg-expr and aggregate-functions. This is a working
progress (there's still a lot of debug stuff I added that needs to be cleaned up) but
it does pass the tests.

Aggregation-node is now very simple and now only deals with the grouping part.
Aggregate-expr serves as the glue between the agg node and the aggregate functions.
The aggregation functions are implemented with the UDA interface. I've reimplemented
our existing aggregate functions with this setup. For true UDAs, the binaries would be
loaded in aggregate-expr.

This also includes some preliminary changes in the FE. We now need to annotate each
AggNode as executing the update vs. merge phase (root aggs execute update, others
execute merge) and if it needs a finalize step (only the root does). This is more
general than our builtins which are too simple to need this structure.

There is a big TODO here to allow the intermediate types between agg nodes to change.
For example, in distinct estimate, the input type is the column type and the output type
is a bigint. We'd like the intermediate type to be CHAR(256). This is different since
currently, the intermediate type and output type have always been the same. We've hacked
around this by having both the intermediate and output type be TYPE_STRING. I've left
this for another patch (changing the BE to support this is trivial).
For aggregates that result in strings, we used to store some additional stuff past the
end of the tuple. The layout was:
<tuple> <length of 1st string buffer>,<length of 2nd string buffer>, etc

The rationale for this is that we want to reuse the buffer for min/max and grow the buffer
more quickly for group_concat. This breaks down the abstraction between agg-expr and
agg-node and is not something UDAs can use in general. Rather than try to hack around
this, I think the proper solution is to the intermediate type not be StringValue and
to contain the buffer length itself.

This patch also resurrects the distinct estimate code. The distinct estimate functions
exercise all of the code paths.

Change-Id: Ic152a2cd03bc1713967673681e1e6204dcd80346
Reviewed-on: http://gerrit.ent.cloudera.com:8080/564
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-01-08 10:53:13 -08:00
Marcel Kornacker
d85b90cb22 SlotDescriptor.label plus repartitioning for inserts when column stats are missing. 2014-01-08 10:51:56 -08:00
Alan Choi
2d25f11ec3 IMPALA-91 new explain plan output 2014-01-08 10:50:10 -08:00
Marcel Kornacker
0c36c7f327 Partitioned merge aggregation. 2014-01-08 10:48:59 -08:00
ishaan
09d6d931f4 Change the way data is loaded 2014-01-08 10:48:09 -08:00
Alan Choi
476a665763 IMP-620: print number of scanned partition and total scaned bytes 2014-01-08 10:46:57 -08:00
Marcel Kornacker
fd77f06f15 Moving functional-newplanner back to functional-planner (and renaming NewPlanner to Planner) 2014-01-08 10:46:20 -08:00
Marcel Kornacker
04d12f03fc cleaning up logging output 2014-01-08 10:44:28 -08:00
Lenni Kuff
04edc8f534 Update benchmark tests to run against generic workload, data loading with scale factor, +more
This change updates the run-benchmark script to enable it to target one or more
workloads. Now benchmarks can be run like:

./run-benchmark --workloads=hive-benchmark,tpch

We lookup the workload in the workloads directory, then read the associated
query .test files and start executing them.

To ensure the queries are not duplicated between benchmark and query tests, I
moved all existing queries (under fe/src/test/resources/* to the workloads
directory. You do NOT need to look through all the .test files, I've just moved
them. The one new file is the 'hive-benchmark.test' which contains the hive
benchmark queries.

Also added support for generating schema for different scale factors as well as
executing against these scale factors. For example, let's say we have a dataset
with a scale factor called "SF1". We would first generate the schema using:

./generate_schema_statements --workload=<workload> --scale_factor="SF3"
This will create tables with a unique names from the other scale factors.

Run the generated .sql file to load the data. Alternatively, the data can loaded
by running a new python script:
./bin/load-data.py -w <workload1>,<workload2> -e <exploration strategy> -s [scale factor]
For example: load-data.sh -w tpch -e core -s SF3

Then run against this:
./run-benchmark --workloads=<workload> --scale_factor=SF3

This changeset also includes a few other minor tweaks to some of the test
scripts.

Change-Id: Ife8a8d91567d75c9612be37bec96c1e7780f50d6
2014-01-08 10:44:22 -08:00