129 Commits

Author SHA1 Message Date
Michael Ubell
7536510b69 IMP-258 Test writing nulls. 2014-01-08 10:46:31 -08:00
ishaan
05c65789bb Change Copyrights from 2011 to 2012 2014-01-08 10:46:29 -08:00
Alan Choi
595edaa9d1 Disable all string to numeric and boolean implicit cast 2014-01-08 10:46:24 -08:00
Lenni Kuff
1451650055 Bring online all TPCH planner tests (updated for new planner) and supported query tests 2014-01-08 10:46:21 -08:00
Lenni Kuff
9f91081183 Modify TPCH tests to always insert into text table so workload can run on all file formats 2014-01-08 10:46:21 -08:00
Marcel Kornacker
fd77f06f15 Moving functional-newplanner back to functional-planner (and renaming NewPlanner to Planner) 2014-01-08 10:46:20 -08:00
Marcel Kornacker
ea050a43ad Switching over backend runtime structures to new planner.
Added container-util.h
2014-01-08 10:46:20 -08:00
Michael Ubell
85807f6169 Start a single impalad to avoid data load race 2014-01-08 10:46:18 -08:00
Michael Ubell
325a2f01ad Add refresh to load script 2014-01-08 10:46:18 -08:00
Michael Ubell
37aaf06f79 IMP-390 Get rid of test dependencies on InProcessQE and Runquery 2014-01-08 10:46:18 -08:00
Michael Ubell
477422beda IMP-380 handle '\r' at end of row. 2014-01-08 10:46:14 -08:00
Alan Choi
0ce8a044e3 Disable RC/Trevni (with option to allow it); remove file_buffer_size
IMP-336: remove file_buffer_size query options
Add "allow_unsupported_formats" query options to allow RC/Trevni in our test; disabled by
default
2014-01-08 10:46:02 -08:00
Alan Choi
dbf1074066 Fragments report errors to coordinator.
Enable multi-node DataErrorTest (IMP-250 resolved)
Check fragment/coord errors in DataErrorTest
2014-01-08 10:46:00 -08:00
Henry Robinson
3519701529 Support backtick quoting for identifiers 2014-01-08 10:46:00 -08:00
Henry Robinson
91c3b979ca IMP-370: SHOW TABLES IN support and IMP-363: SHOW DATABASES
Change-Id: Ic41c4b0767a0480f0a18e1e985f25de3bc2ca947
2014-01-08 10:45:59 -08:00
Henry Robinson
540673763f Add session key handling to ThriftServer, and session support to the frontend 2014-01-08 10:45:59 -08:00
Marcel Kornacker
927f4c52f8 Adding the remaining pieces of functionality to the new planner:
- HBaseScanNode.getScanRangeLocations()
- new planner creates INSERT plans
- Frontend.createExecRequest2(), which calls NewPlanner.
2014-01-08 10:45:58 -08:00
Michael Ubell
0c4f025a5e Fix loading of nulltable data, remove loading functional-planner data 2014-01-08 10:45:58 -08:00
Michael Ubell
bf57ae27a5 IMP-291 Read sequence file to next sync mark when; ragged columns 2014-01-08 10:45:57 -08:00
Marcel Kornacker
904d8601d4 adding back accidentally deleted bad_seq_snap/bad_file 2014-01-08 10:45:56 -08:00
Marcel Kornacker
5984c0be52 First cut of partitioned plan generation:
- created new class PlanFragment, which encapsulates everything having to do with a single
  plan fragment, including its partition, output exprs, destination node, etc.
- created new class DataPartition
- explicit classes for fragment and plan node ids, to avoid getting them mixed up, which is easy to do with ints
- Adding IdGenerator class.
- moved PlanNode.ExplainPlanLevel to Types.thrift, so it can also be used for
  PlanFragment.getExplainString()
- Changed planner interface to return scan ranges with a complete list of server locations,
  instead of making a server assignment.

Also included: cleaned up AggregateInfo:
- the 2nd phase of a DISTINCT aggregation is now captured separately from a merge aggregation.
- moved analysis functionality into AggregateInfo

Removing broken test cases from workload functional-planner (they're being handled correctly in functional-newplanner).
2014-01-08 10:45:56 -08:00
Nong Li
8763d5768d Fix num_scanner_threads default semantics. 2014-01-08 10:45:13 -08:00
Alan Choi
69fcaadd5f Added all the conversion errors in .test file. The errors come from run-query.
Error message is now more consistent.
Remove useless message from RC file.
2014-01-08 10:45:12 -08:00
Michael Ubell
5f951ffc4a Handle missing columns at the end of a row 2014-01-08 10:45:11 -08:00
Michael Ubell
d0dd13053a Improve string to timestamp performance. 2014-01-08 10:45:08 -08:00
Michael Ubell
0e714f5720 Add error recovery to sequence files. 2014-01-08 10:45:07 -08:00
ishaan
42231b7d86 Annotate queries for better benchmark reporting. 2014-01-08 10:45:05 -08:00
Henry Robinson
e7348a209b IMP-232: Parallel INSERT OVERWRITE 2014-01-08 10:45:04 -08:00
Henry Robinson
afc30baf52 Impalad for Trevni loading shouldn't use a state-store 2014-01-08 10:44:51 -08:00
Lenni Kuff
7d595ba740 Update run-workload result reporting to make reference result comparison more flexible
Now we save Hive results into a separate file (previously everything was stored
in the same file). Also added the ability to do a run-benchmark and specify to skip
Impala, which will help generate Hive reference results.

Updated the reporting script to reflect this change.
2014-01-08 10:44:50 -08:00
Henry Robinson
30653278b4 run-query should not start a state-store client by default 2014-01-08 10:44:50 -08:00
Henry Robinson
e3e6ba984b Show / describe 2014-01-08 10:44:49 -08:00
Alan Choi
8dae344ceb Do not validate filename in DataErrorTest because it is not deterministic. 2014-01-08 10:44:45 -08:00
Alan Choi
22765fc33a IMP-251: re-enable DataErrorTest
verify that the exception message contains the correct error;
verify that the expected exception is thrown;
verify that no exception is thrown when abort_on_error is set to false
2014-01-08 10:44:45 -08:00
Marcel Kornacker
7725f25ff5 This combines changes related to periodic reporting of plan fragment exec profiles:
- executor takes report callback; passed in by ImpalaServer::FragmentExecState
- the PlanFragmentExecutor invokes profile reporting cb in background thread.
- RuntimeProfile is now thread-safe and has a RuntimeProfile::Update()

Also included:
- a number of bug fixes related to async cancellation of query
  and propagation of errors through PlanFragmentExecutor/Coordinator/ImpalaServer.
- changing COUNTER_SCOPED_TIMER to SCOPED_TIMER
- derived counters: RuntimeProfile now lets you add counters that return a
  value via a function call, which is useful for reporting something like normalized
  ScanNode throughput; retrofitted to ScanNode and all subclasses
- changed coordinator to make cancellation atomic wrt recognition of an error status
  for the overall query.
- Removed InProcessQueryExecutor from data-stream-test.

Added aggregate throughput counters to coordinator:
- all throughput counters are grouped in a sub-profile "AggregateThroughput"
- each scan node gets its own counter
- the value is aggregated across all registered backends which contain that node in
  their plan fragments
2014-01-08 10:44:42 -08:00
Nong Li
7c411da86c Fixed schema template. 2014-01-08 10:44:41 -08:00
Nong Li
5b2621a401 Fix null table creation to work around Hive issue. 2014-01-08 10:44:41 -08:00
Nong Li
4c9c82910a Text parser fix for columns off end. 2014-01-08 10:44:40 -08:00
Nong Li
4d0319d32b Fix null string parsing. 2014-01-08 10:44:40 -08:00
Alan Choi
dd1537d116 IMP-132: collect unique agg expr 2014-01-08 10:44:39 -08:00
Nong Li
cbb06c191c Undo load dependent tables change. 2014-01-08 10:44:39 -08:00
Nong Li
81bba16dac Parallel scanners. 2014-01-08 10:44:38 -08:00
Alan Choi
9ac664f1f7 Fix IMP-239: text_converter_->WriteSlot returns true when it's ok
QueryTest and HBaseQueryTest set AbortOnError to false except the expected error case
2014-01-08 10:44:37 -08:00
Henry Robinson
c472213eeb Parallel INSERT, sink-per-scan-node plan 2014-01-08 10:44:35 -08:00
Nong Li
4fd7bd9606 Updated tpch core workload to include seq/snappy and seq/gzip.
Change-Id: Ifb01ee95542fced2ae8cfa4928ffbc7e357df3a8
2014-01-08 10:44:34 -08:00
Lenni Kuff
6e07e0b8d8 Added support for generating ANALYZE TABLE ... COMPUTE STATISTICS statements during data loading
Add support for generating ANALYZE TABLE ... COMPUTE STATISTICS statements to the data loading
workflow. This allows for capturing simple table stats such as number of rows, number of
partitions, and table size in bytes. These are stored into a new mysql database with the same
name as the metastore except with a '_Stats' suffix. If using Derby, the results are
stored in a new Derby database.
2014-01-08 10:44:34 -08:00
Alan Choi
3d9808d7a6 Upgrade HDFS past the build which contains disk id api 2014-01-08 10:44:31 -08:00
Alexander Behm
ee705e3083 Added timestamp arithmetic expressions. 2014-01-08 10:44:31 -08:00
Alan Choi
f15ef994fb "mvn test" now uses impalad and beeswax api to submit query and fetch, including
insert query.

review issue: 260
2014-01-08 10:44:30 -08:00
Lenni Kuff
87d0ed137f Temporarily disabled TPC-H planner tests that require data to be loaded in tmp tables
I am temporarily disabling the TPC-H planner tests that require data to be
pre-loaded in temp tables. This resolves a problem where the TPC-H query tests
need to be run before the TPC-H planner tests.  I have filed "IMP-171" to track
the work to re-enable these tests.
2014-01-08 10:44:30 -08:00