149 Commits

Author SHA1 Message Date
Nong Li
fbfef4e22e Fix crash in TopN node with null tuples. 2014-01-08 10:46:54 -08:00
Lenni Kuff
837f35eab3 Updated results for more query tests to reflect proper ordering + improved result updating 2014-01-08 10:46:53 -08:00
Lenni Kuff
bed633c1ae Extract config/metastore creation from buildall + script for loading warehouse snapshot 2014-01-08 10:46:53 -08:00
Lenni Kuff
a035cf4e73 Update results of a few TPC-H queries to reflect proper ordering
Change-Id: I41156b506155c846220cfb097f5e8120503f8da8
2014-01-08 10:46:52 -08:00
Lenni Kuff
1b248d067b Add TPC-DS dataset and workload 2014-01-08 10:46:52 -08:00
Marcel Kornacker
f6af9316d9 Fix for IMP-137: incorrect predicate placement for outer joins
Fixing predicate assignment for outer joins:
- On clause predicates for outer joins are now assigned to the join node
- the exceptions are On clause predicates that can be directly evaluated
  by the outer-joined tables themselves; those are "pushed down"
- Where clause predicates for outer-joined tables are assigned to the join node
  that materializes the outer join
2014-01-08 10:46:50 -08:00
Lenni Kuff
febdb112f4 Fixed bug in test file section parsing 2014-01-08 10:46:50 -08:00
Lenni Kuff
ef48f65e76 Add test framework for running Impala query tests via Python
This is the first set of changes required to start getting our functional test
infrastructure moved from JUnit to Python. After investigating a number of
options, I decided to go with a Python test executor named py.test
(http://pytest.org/). It is very flexible, open source (MIT licensed), and will
enable us to do some cool things like parallel test execution.

As part of this change, we now use our "test vectors" for query test execution.
This will be very nice because it means if you load the "core" dataset, you know you
will be able to run the "core" query tests (specified by --exploration_strategy
when running the tests).

You will see that now each combination of table format + query exec options is
treated like an individual test case. This will make it much easier to debug
exactly where something failed.

These new tests can be run using the script at tests/run-tests.sh
2014-01-08 10:46:50 -08:00
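A minimal sketch of the "test vector" idea described in the commit above, assuming each vector is simply one pairing of a table format with a set of query exec options. The dimension values and the helper names (`build_test_vectors`, `case_id`) are hypothetical and are not taken from the Impala test framework; the sketch only shows how every combination can become its own named test case.

```python
from itertools import product

# Hypothetical dimensions; the real values live in the test framework itself.
TABLE_FORMATS = ["text", "seq", "rc"]
EXEC_OPTIONS = [{"batch_size": 0}, {"batch_size": 16}]


def build_test_vectors(table_formats, exec_options):
    """Return one (table_format, exec_option) pair per combination."""
    return list(product(table_formats, exec_options))


def case_id(table_format, exec_option):
    """Build a readable identifier so a failure points at one exact combination."""
    opts = ",".join(f"{key}={value}" for key, value in sorted(exec_option.items()))
    return f"{table_format}[{opts}]"


vectors = build_test_vectors(TABLE_FORMATS, EXEC_OPTIONS)
case_ids = [case_id(fmt, opt) for fmt, opt in vectors]
```

With py.test, a list like `case_ids` could be supplied as the identifiers of a parametrized test, so each table format and option combination is reported separately.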
Lenni Kuff
1e25c98fb4 Test data loading framework improvements
This change includes a number of improvements for the test data loading framework:
* Named sections for schema template definitions
* Removal of unneeded sections from schema template definitions (ex. ANALYZE TABLE)
* More granular data loading via table name filters
* Improved robustness in detecting failed data loads
* Table level constraints for specific file formats
* Re-written compute stats script
2014-01-08 10:46:49 -08:00
Henry Robinson
997df15b69 IMP-581: HBase table loading error / IMP-401: Re-enable tests for structured columns 2014-01-08 10:46:48 -08:00
Nong Li
b4dc3eeb35 Fix IMP-575 2014-01-08 10:46:45 -08:00
Nong Li
adf36b81f9 Fix data errors test. 2014-01-08 10:46:45 -08:00
Nong Li
34879a4ddc Fix IMP-297 2014-01-08 10:46:44 -08:00
Nong Li
b22b565a92 Fix codegen for min/max of bool col. 2014-01-08 10:46:43 -08:00
Alan Choi
a5a9ccf8c2 IMP-550 short-circuit queries with limit 0
The Impala server now examines the plan. If the first fragment's top plan node has a "limit 0",
then the query is set to EOS immediately.
2014-01-08 10:46:41 -08:00
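The check described in the commit above can be sketched roughly as follows. This is an illustration only: the `PlanNode` and `Fragment` classes and the `is_immediately_eos` function are hypothetical stand-ins, not the server's actual data structures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlanNode:
    # None means the node carries no LIMIT clause (assumption for this sketch).
    limit: Optional[int] = None


@dataclass
class Fragment:
    top_node: PlanNode


def is_immediately_eos(fragments: List[Fragment]) -> bool:
    """Return True when the first fragment's top plan node has "limit 0"."""
    if not fragments:
        return False
    return fragments[0].top_node.limit == 0
```

A query whose first fragment is topped by a node with `limit=0` can then be marked end-of-stream without running any fragment.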
Alan Choi
dfe7690add IMP-522 Fix null pointer exception in HBase query
The ScanNode.keyRanges is an array list that can contain null. The existing HBase scan node
did not check for that.

keyRanges contains a null entry if
1. the row-key is a string type and is referenced in the query, and
2. there is no predicate on the row-key.
2014-01-08 10:46:36 -08:00
Michael Ubell
8a5297a526 Add HdfsLzoTextScanner 2014-01-08 10:46:35 -08:00
Marcel Kornacker
2fda5d9b99 IMP-491
Fixes bug in Planner.createHashJoinFragment(), which didn't set the left child of the
hj node to the output of the left child fragment.

Also: row descriptor was set incorrectly (too wide; included tuples that weren't materialized)
for roots of plan trees of non-root fragments if those fragments materialized an aggregate
2014-01-08 10:46:33 -08:00
Michael Ubell
0750384b41 IMP-497 Insert with limit, remove extra files from test. 2014-01-08 10:46:33 -08:00
Michael Ubell
116241f1d1 IMP-497 Insert with limit. 2014-01-08 10:46:33 -08:00
Michael Ubell
7536510b69 IMP-258 Test writing nulls. 2014-01-08 10:46:31 -08:00
ishaan
05c65789bb Change Copyrights from 2011 to 2012 2014-01-08 10:46:29 -08:00
Alan Choi
595edaa9d1 Disable all string to numeric and boolean implicit cast 2014-01-08 10:46:24 -08:00
Lenni Kuff
1451650055 Bring online all TPCH planner tests (updated for new planner) and supported query tests 2014-01-08 10:46:21 -08:00
Lenni Kuff
9f91081183 Modify TPCH tests to always insert into text table so workload can run on all file formats 2014-01-08 10:46:21 -08:00
Marcel Kornacker
fd77f06f15 Moving functional-newplanner back to functional-planner (and renaming NewPlanner to Planner) 2014-01-08 10:46:20 -08:00
Marcel Kornacker
ea050a43ad Switching over backend runtime structures to new planner.
Added container-util.h
2014-01-08 10:46:20 -08:00
Michael Ubell
85807f6169 Start a single impalad to avoid data load race 2014-01-08 10:46:18 -08:00
Michael Ubell
325a2f01ad Add refresh to load script 2014-01-08 10:46:18 -08:00
Michael Ubell
37aaf06f79 IMP-390 Get rid of test dependencies on InProcessQE and Runquery 2014-01-08 10:46:18 -08:00
Michael Ubell
477422beda IMP-380 handle '\r' at end of row. 2014-01-08 10:46:14 -08:00
Alan Choi
0ce8a044e3 Disable RC/Trevni (with option to allow it); remove file_buffer_size
IMP-336: remove file_buffer_size query option
Add "allow_unsupported_formats" query option to allow RC/Trevni in our tests; disabled by
default
2014-01-08 10:46:02 -08:00
Alan Choi
dbf1074066 Fragments report errors to coordinator.
Enable multi-node DataErrorTest (IMP-250 resolved)
Check fragment/coord errors in DataErrorTest
2014-01-08 10:46:00 -08:00
Henry Robinson
3519701529 Support backtick quoting for identifiers 2014-01-08 10:46:00 -08:00
Henry Robinson
91c3b979ca IMP-370: SHOW TABLES IN support and IMP-363: SHOW DATABASES
Change-Id: Ic41c4b0767a0480f0a18e1e985f25de3bc2ca947
2014-01-08 10:45:59 -08:00
Henry Robinson
540673763f Add session key handling to ThriftServer, and session support to the frontend 2014-01-08 10:45:59 -08:00
Marcel Kornacker
927f4c52f8 Adding the remaining pieces of functionality to the new planner:
- HBaseScanNode.getScanRangeLocations()
- new planner creates INSERT plans
- Frontend.createExecRequest2(), which calls NewPlanner.
2014-01-08 10:45:58 -08:00
Michael Ubell
0c4f025a5e Fix loading of nulltable data, remove loading functional-planner data 2014-01-08 10:45:58 -08:00
Michael Ubell
bf57ae27a5 IMP-291 Read sequence file to next sync mark when; ragged columns 2014-01-08 10:45:57 -08:00
Marcel Kornacker
904d8601d4 adding back accidentally deleted bad_seq_snap/bad_file 2014-01-08 10:45:56 -08:00
Marcel Kornacker
5984c0be52 First cut of partitioned plan generation:
- created new class PlanFragment, which encapsulates everything having to do with a single
  plan fragment, including its partition, output exprs, destination node, etc.
- created new class DataPartition
- explicit classes for fragment and plan node ids, to avoid getting them mixed up, which is easy to do with ints
- Adding IdGenerator class.
- moved PlanNode.ExplainPlanLevel to Types.thrift, so it can also be used for
  PlanFragment.getExplainString()
- Changed planner interface to return scan ranges with a complete list of server locations,
  instead of making a server assignment.

Also included: cleaned up AggregateInfo:
- the 2nd phase of a DISTINCT aggregation is now captured separately from a merge aggregation.
- moved analysis functionality into AggregateInfo

Removing broken test cases from workload functional-planner (they're being handled correctly in functional-newplanner).
2014-01-08 10:45:56 -08:00
Nong Li
8763d5768d Fix num_scanner_threads default semantics. 2014-01-08 10:45:13 -08:00
Alan Choi
69fcaadd5f Added all the conversion errors in .test file. The errors come from run-query.
Error message is now more consistent.
Remove useless message from RC file.
2014-01-08 10:45:12 -08:00
Michael Ubell
5f951ffc4a Handle missing columns at the end of a row 2014-01-08 10:45:11 -08:00
Michael Ubell
d0dd13053a Improve string to timestamp performance. 2014-01-08 10:45:08 -08:00
Michael Ubell
0e714f5720 Add error recovery to sequence files. 2014-01-08 10:45:07 -08:00
ishaan
42231b7d86 Annotate queries for better benchmark reporting. 2014-01-08 10:45:05 -08:00
Henry Robinson
e7348a209b IMP-232: Parallel INSERT OVERWRITE 2014-01-08 10:45:04 -08:00
Henry Robinson
afc30baf52 Impalad for Trevni loading shouldn't use a state-store 2014-01-08 10:44:51 -08:00
Lenni Kuff
7d595ba740 Update run-workload result reporting to make reference result comparison more flexible
Now we save Hive results into a separate file (previously everything was stored
in the same file). Also added the ability to do a run-benchmark and skip
Impala, which helps generate Hive reference results.

Updated the reporting script to reflect this change.
2014-01-08 10:44:50 -08:00