Commit Graph

31 Commits

Author SHA1 Message Date
Alex Behm
2277386d4d IMPALA-225: Compound predicate ranges on partition keys crash impalad. 2014-01-08 10:49:45 -08:00
Marcel Kornacker
398e725a23 make broadcast joins the default join strategy 2014-01-08 10:49:34 -08:00
Marcel Kornacker
d7e22f44bb Partitioned hash joins
- added PlanNode.numNodes, PlanNode.avgRowSize and PlanNode.computeStats()
- fixing up some cardinality estimates
- Planner now tries to do a cost-based decision between broadcast join and join with full repartitioning (both inputs)
- ExchangeNode now distinguishes between its input and output row descriptor: the output potentially contains more tuples
- fixed problem related to cancellation and concurrent hash table builds.

Not included:
- partitioned joins that take advantage of existing partitions of the inputs; those will have to wait for a follow-on change
2014-01-08 10:49:29 -08:00
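Note: the cost-based choice between a broadcast join and a join with full repartitioning, as described in the commit above, can be sketched roughly as follows. This is a minimal illustration and not the Planner's code: the `InputStats` type, the `choose_join_strategy` function, and both cost formulas (bytes sent over the network) are assumptions made for this example. Only the idea of comparing the two strategies using node counts and average row sizes comes from the commit message.

```python
from dataclasses import dataclass


@dataclass
class InputStats:
    """Hypothetical per-input estimates, loosely modeled on cardinality and avgRowSize."""
    cardinality: int     # estimated number of rows
    avg_row_size: float  # estimated bytes per row


def choose_join_strategy(lhs: InputStats, rhs: InputStats, num_nodes: int) -> str:
    """Pick the strategy with the lower estimated network volume (assumed cost model)."""
    # Broadcast: every node receives a full copy of the right-hand (build) input.
    broadcast_cost = rhs.cardinality * rhs.avg_row_size * num_nodes
    # Full repartitioning: each row of both inputs is sent across the network once.
    partitioned_cost = (lhs.cardinality * lhs.avg_row_size
                        + rhs.cardinality * rhs.avg_row_size)
    return "broadcast" if broadcast_cost <= partitioned_cost else "partitioned"
```

Under this assumed model, a small right-hand input favors broadcasting, while two large inputs on a cluster with many nodes favor repartitioning.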
Alan Choi
4a503a4e35 IMP-808 construct runtime state in fe-support to eval now() 2014-01-08 10:49:20 -08:00
Nong Li
20fc700002 Fix precision issue in text table writer. 2014-01-08 10:49:19 -08:00
Alex Behm
0821e2f826 IMPALA-66: Support for UNION with constant SELECT clauses. 2014-01-08 10:49:18 -08:00
Marcel Kornacker
0c36c7f327 Partitioned merge aggregation. 2014-01-08 10:48:59 -08:00
Marcel Kornacker
d7bfe6c68d IMPALA-144: partition pruning for arbitrary predicates that are fully bound by partition columns
This makes partition pruning more effective by extending it to predicates that are fully bound by the partition column,
e.g., '<col> IN (1, 2, 3)' will also be used to prune partitions, in addition to equality and binary comparisons.
2014-01-08 10:48:41 -08:00
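Note: pruning with an arbitrary predicate that is fully bound by the partition columns can be sketched as evaluating the predicate once per partition, using only that partition's key values. The function name, the dict-based partition representation, and the explicit `referenced_cols` argument are assumptions made for this illustration; they are not taken from the change itself.

```python
def prune_partitions(partitions, predicate, referenced_cols, partition_cols):
    """Return the partitions that may contain matching rows.

    partitions:      list of dicts mapping partition column name -> key value
    predicate:       callable taking one such dict and returning True/False
    referenced_cols: columns the predicate refers to
    partition_cols:  the table's partition columns
    """
    # A predicate that touches a non-partition column is not fully bound,
    # so it cannot be evaluated per partition and no pruning is possible.
    if not set(referenced_cols) <= set(partition_cols):
        return list(partitions)
    # Fully bound: evaluate the predicate against each partition's key values.
    return [p for p in partitions if predicate(p)]
```

For a table partitioned by `month`, a predicate such as `month IN (1, 2, 3)` would then keep only three of twelve monthly partitions.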
Marcel Kornacker
c02d25baa8 IMPALA-20: Limit clause in inline view not handled correctly by planner
- this adds a SelectNode that evaluates conjuncts and enforces the limit
- all limits are now distributed: enforced both by the child plan fragment and
  by the merging ExchangeNode
- all limits w/ Order By are now distributed: enforced both by the child plan fragment and
  by the merging TopN node
2014-01-08 10:48:29 -08:00
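Note: the two-level limit enforcement described above can be sketched as follows. Each child fragment applies the limit locally, and the merging step applies it again to the combined output. All function names are hypothetical, and plain Python lists stand in for row streams; this is an illustration of the idea, not the planner's implementation.

```python
import heapq
from itertools import chain, islice


def fragment_limit(rows, limit):
    """Child fragment: stop producing rows once the limit is reached."""
    return list(islice(rows, limit))


def exchange_limit(fragment_outputs, limit):
    """Merging exchange: enforce the same limit again on the combined rows."""
    return list(islice(chain.from_iterable(fragment_outputs), limit))


def fragment_topn(rows, limit):
    """Child fragment with ORDER BY: keep only the local top-N rows."""
    return heapq.nsmallest(limit, rows)


def merge_topn(fragment_outputs, limit):
    """Merging TopN: pick the global top-N from the local top-N results."""
    return heapq.nsmallest(limit, chain.from_iterable(fragment_outputs))
```

Because every fragment sends at most `limit` rows, the merging step sees at most `limit` rows per fragment rather than the full input.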
ishaan
09d6d931f4 Change the way data is loaded 2014-01-08 10:48:09 -08:00
Nong Li
02c329b97a Update RC files to use io mgr and remove scanner support for non-io mgr. 2014-01-08 10:47:11 -08:00
Nong Li
15dfd968fb Disable tpch-q21 and fix plan output for tpch-q22.
We can now generate the temp table for q22 which changes the plan output.
2014-01-08 10:47:03 -08:00
Henry Robinson
15228f945f IMP-503: INSERTS into unpartitioned tables should be checked for union compatibility 2014-01-08 10:46:57 -08:00
Alan Choi
476a665763 IMP-620: print number of scanned partitions and total scanned bytes 2014-01-08 10:46:57 -08:00
Marcel Kornacker
f6af9316d9 Fix for IMP-137: incorrect predicate placement for outer joins
Fixing predicate assignment for outer joins:
- On clause predicates for outer joins are now assigned to the join node
- the exceptions are On clause predicates that can be directly evaluated
  by the outer-joined tables themselves; those are "pushed down"
- Where clause predicates for outer-joined tables are assigned to the join node
  that materializes the outer join
2014-01-08 10:46:50 -08:00
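Note: the assignment rules listed in the commit above can be summarized as a small decision function. This is a simplified sketch: the function name, the string labels, and the table-name-based test for "can be directly evaluated by the outer-joined tables" are assumptions made for illustration. The final fallback case is not described by the commit at all.

```python
def assign_predicate(clause, referenced_tables, outer_joined_tables):
    """Decide where a predicate is evaluated (hypothetical labels).

    clause:              "on" or "where"
    referenced_tables:   tables the predicate refers to
    outer_joined_tables: tables on the outer-joined side of the join
    """
    refs = set(referenced_tables)
    outer = set(outer_joined_tables)
    if clause == "on":
        # On clause predicates that only touch the outer-joined tables
        # are "pushed down" to those tables.
        if refs and refs <= outer:
            return "scan"
        # All other On clause predicates go to the join node.
        return "join"
    if clause == "where" and refs & outer:
        # Where clause predicates on outer-joined tables are assigned to
        # the join node that materializes the outer join.
        return "join"
    # Not covered by the commit message; placeholder for the usual placement.
    return "default"
```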
Lenni Kuff
febdb112f4 Fixed bug in test file section parsing 2014-01-08 10:46:50 -08:00
Alan Choi
dfe7690add IMP-522 Fix null pointer exception in HBase query
The ScanNode.keyRanges is an array list that can contain null. The existing HBase scan node
did not check for that.

keyRanges would contain a null entry if
1. the row-key is a string type and it is referenced in the query and,
2. there is no predicate on the row-key.
2014-01-08 10:46:36 -08:00
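Note: the missing check amounts to handling null entries while iterating over the key ranges. The sketch below uses `None` for a null entry and `(start, stop)` tuples for key ranges; both representations, the function name, and the choice to treat a null entry as an unbounded range are assumptions for illustration, not the actual fix.

```python
def build_scan_ranges(key_ranges):
    """Turn a list of key ranges (which may contain None) into scan bounds."""
    scan_ranges = []
    for key_range in key_ranges:
        if key_range is None:
            # Assumption: no predicate on the row-key means an unbounded scan.
            scan_ranges.append((None, None))
        else:
            start, stop = key_range
            scan_ranges.append((start, stop))
    return scan_ranges
```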
Marcel Kornacker
2fda5d9b99 IMP-491
Fixes bug in Planner.createHashJoinFragment(), which didn't set the left child of the
hj node to the output of the left child fragment.

Also: row descriptor was set incorrectly (too wide; included tuples that weren't materialized)
for roots of plan trees of non-root fragments if those fragments materialized an aggregate
2014-01-08 10:46:33 -08:00
Michael Ubell
0750384b41 IMP-497 Insert with limit, remove extra files from test. 2014-01-08 10:46:33 -08:00
Michael Ubell
116241f1d1 IMP-497 Insert with limit. 2014-01-08 10:46:33 -08:00
Alan Choi
595edaa9d1 Disable all string to numeric and boolean implicit cast 2014-01-08 10:46:24 -08:00
Lenni Kuff
1451650055 Bring online all TPCH planner tests (updated for new planner) and supported query tests 2014-01-08 10:46:21 -08:00
Marcel Kornacker
fd77f06f15 Moving functional-newplanner back to functional-planner (and renaming NewPlanner to Planner) 2014-01-08 10:46:20 -08:00
Alan Choi
0ce8a044e3 Disable RC/Trevni (with option to allow it); remove file_buffer_size
IMP-336: remove file_buffer_size query option
Add "allow_unsupported_formats" query option to allow RC/Trevni in our tests; disabled by
default
2014-01-08 10:46:02 -08:00
Marcel Kornacker
5984c0be52 First cut of partitioned plan generation:
- created new class PlanFragment, which encapsulates everything having to do with a single
  plan fragment, including its partition, output exprs, destination node, etc.
- created new class DataPartition
- explicit classes for fragment and plan node ids, to avoid getting them mixed up, which is easy to do with ints
- Adding IdGenerator class.
- moved PlanNode.ExplainPlanLevel to Types.thrift, so it can also be used for
  PlanFragment.getExplainString()
- Changed planner interface to return scan ranges with a complete list of server locations,
  instead of making a server assignment.

Also included: cleaned up AggregateInfo:
- the 2nd phase of a DISTINCT aggregation is now captured separately from a merge aggregation.
- moved analysis functionality into AggregateInfo

Removing broken test cases from workload functional-planner (they're being handled correctly in functional-newplanner).
2014-01-08 10:45:56 -08:00
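Note: the motivation for explicit id classes, as opposed to plain ints, can be illustrated with a short sketch. The class names mirror the ones mentioned in the commit, but the code is a hypothetical Python rendering for illustration only. The `next_id` method and the constructor argument are assumptions, not the actual IdGenerator interface.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar


@dataclass(frozen=True)
class PlanNodeId:
    value: int


@dataclass(frozen=True)
class PlanFragmentId:
    value: int


T = TypeVar("T")


class IdGenerator(Generic[T]):
    """Hands out consecutive ids of one specific id type."""

    def __init__(self, id_type: Callable[[int], T]) -> None:
        self._id_type = id_type
        self._next = 0

    def next_id(self) -> T:
        new_id = self._id_type(self._next)
        self._next += 1
        return new_id
```

With distinct types, a node id and a fragment id that share the same integer value no longer compare equal, so mixing them up becomes visible instead of silently succeeding.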
Henry Robinson
e7348a209b IMP-232: Parallel INSERT OVERWRITE 2014-01-08 10:45:04 -08:00
Henry Robinson
c472213eeb Parallel INSERT, sink-per-scan-node plan 2014-01-08 10:44:35 -08:00
Lenni Kuff
87d0ed137f Temporarily disabled TPC-H planner tests that require data to be loaded in tmp tables
I am temporarily disabling the TPC-H planner tests that require data to be
pre-loaded in temp tables. This resolves a problem where the TPC-H query tests
need to be run before the TPC-H planner tests.  I have filed "IMP-171" to track
the work to re-enable these tests.
2014-01-08 10:44:30 -08:00
Marcel Kornacker
52bd3ad173 fixing PlannerTest 2014-01-08 10:44:28 -08:00
Marcel Kornacker
04d12f03fc cleaning up logging output 2014-01-08 10:44:28 -08:00
Lenni Kuff
04edc8f534 Update benchmark tests to run against generic workload, data loading with scale factor, +more
This change updates the run-benchmark script to enable it to target one or more
workloads. Now benchmarks can be run like:

./run-benchmark --workloads=hive-benchmark,tpch

We look up the workload in the workloads directory, then read the associated
query .test files and start executing them.

To ensure the queries are not duplicated between benchmark and query tests, I
moved all existing queries (under fe/src/test/resources/*) to the workloads
directory. You do NOT need to look through all the .test files, I've just moved
them. The one new file is the 'hive-benchmark.test' which contains the hive
benchmark queries.

Also added support for generating schema for different scale factors as well as
executing against these scale factors. For example, let's say we have a dataset
with a scale factor called "SF3". We would first generate the schema using:

./generate_schema_statements --workload=<workload> --scale_factor="SF3"
This will create tables with names that are distinct from those of the other scale factors.

Run the generated .sql file to load the data. Alternatively, the data can be loaded
by running a new python script:
./bin/load-data.py -w <workload1>,<workload2> -e <exploration strategy> -s [scale factor]
For example: ./bin/load-data.py -w tpch -e core -s SF3

Then run against this:
./run-benchmark --workloads=<workload> --scale_factor=SF3

This changeset also includes a few other minor tweaks to some of the test
scripts.

Change-Id: Ife8a8d91567d75c9612be37bec96c1e7780f50d6
2014-01-08 10:44:22 -08:00