Commit Graph

73 Commits

Author SHA1 Message Date
Lenni Kuff
e218721386 IMPALA-198: Support setting file format, table comment in CREATE TABLE LIKE statements 2014-01-08 10:49:31 -08:00
Marcel Kornacker
d7e22f44bb Partitioned hash joins
- added PlanNode.numNodes, PlanNode.avgRowSize and PlanNode.computeStats()
- fixing up some cardinality estimates
- Planner now tries to do a cost-based decision between broadcast join and join with full repartitioning (both inputs)
- ExchangeNode now distinguishes between its input and output row descriptor: the output potentially contains more tuples
- fixed problem related to cancellation and concurrent hash table builds.

Not included:
- partitioned joins that take advantage of existing partitions of the inputs; those will have to wait for a follow-on change
2014-01-08 10:49:29 -08:00
Nong Li
1fcfb72bc4 IMPALA-145: Fix order by limit 0 crash. 2014-01-08 10:49:27 -08:00
Alex Behm
673d7b97cf IMPALA-190: Insert with NULL partition keys results in SIGSEGV. 2014-01-08 10:49:22 -08:00
Lenni Kuff
15f0313283 Add analysis checks for length of RowFormat strings, fix escaping of row format values 2014-01-08 10:49:21 -08:00
Lenni Kuff
018a72bfe2 IMPALA-189: Properly support NULL partition key values in ALTER .. PARTITION statements 2014-01-08 10:49:21 -08:00
Alex Behm
0821e2f826 IMPALA-66: Support for UNION with constant SELECT clauses. 2014-01-08 10:49:18 -08:00
Lenni Kuff
5a0b1270c4 Add support for ALTER ... PARTITION (partitionSpec) SET FILEFORMAT/LOCATION
Adds support for:
* ALTER TABLE <table> PARTITION (partitionSpec) SET FILEFORMAT
* ALTER TABLE <table> PARTITION (partitionSpec) SET LOCATION

This enables setting the location and fileformat of specific partitions.
2014-01-08 10:49:17 -08:00
Lenni Kuff
f4a5c0628f Cleanup HDFS directories before and after running ALTER TABLE tests 2014-01-08 10:49:17 -08:00
Lenni Kuff
1fb72fbc73 IMPALA-156: Support core 'ALTER TABLE' DDL command
This patch adds support for
- ALTER TABLE ADD|REPLACE COLUMNS
- ALTER TABLE DROP COLUMN
- ALTER TABLE ADD/DROP PARTITION
- ALTER TABLE SET FILEFORMAT
- ALTER TABLE SET LOCATION
- ALTER TABLE RENAME
2014-01-08 10:49:14 -08:00
Elliott Clark
0e0c02b6bd Add the ability to Select into HBase table.
* Changed frontend analysis for HBase tables
* Changed Thrift messages to allow HBase as a sink type.
* JNI Wrapper around htable
* Create hbase-table-sink
* Create hbase-table-writer
* Static init lots of JNI related code for HBase.
* Cleaned up some cpplint issues.
* Changed junit analysis tests
* Create a new HBase test table.
* Added functional tests for HBase inserts.
2014-01-08 10:49:06 -08:00
Lenni Kuff
5f81becd84 Create tables used by insert tests in a supported insert format 2014-01-08 10:49:00 -08:00
Alan Choi
57c2f828e0 IMP-791 Fix full outer join hang
In full or right outer join, the hash-join-node does not release
the io buffer when calling get next, causing deadlock.
2014-01-08 10:48:58 -08:00
Lenni Kuff
ca0d23a844 IMPALA-157: Support CREATE TABLE LIKE DDL 2014-01-08 10:48:55 -08:00
Henry Robinson
8d87972695 Improve parser coverage
This patch adds support for the following SQL constructs

  - Unary + operator
  - The ALL keyword, in SELECT ALL and SELECT aggregate_func(ALL *)
  - REAL and INTEGER as type synonyms for DOUBLE and INT respectively
  - The AS keyword after a table spec. e.g. SELECT * FROM tbl AS t0
2014-01-08 10:48:54 -08:00
Alex Behm
be03e6c21c IMPALA-138: Error messages for unknown column types are particularly bad. 2014-01-08 10:48:53 -08:00
Alex Behm
a01573af63 IMPALA-65: Add MySQL-style string literals with escaping. 2014-01-08 10:48:51 -08:00
Nong Li
0df9476be1 Parquet data loading. 2014-01-08 10:48:48 -08:00
ishaan
5ed84d7f65 IMP-739 Results for show queries should check for subset, not equality. 2014-01-08 10:48:46 -08:00
Alexander Behm
39e443407b IMPALA-136: GROUP BY float/double. 2014-01-08 10:48:43 -08:00
Nong Li
0385d14d69 Fix pre-hive 9 rc file scanner. 2014-01-08 10:48:41 -08:00
Lenni Kuff
90d7e085fa Update tests to use num_nodes=0, use external impala cluster, add sanity check run mode 2014-01-08 10:48:38 -08:00
Lenni Kuff
d57440e87d Allow column comments for CREATE TABLE and DESCRIBE <table> statements 2014-01-08 10:48:37 -08:00
Lenni Kuff
9f71374875 IMPALA-102: Add support for CREATE TABLE ... PARTITIONED BY (col1, col2) 2014-01-08 10:48:35 -08:00
Lenni Kuff
1cd847c856 IMPALA-81: Add support for CREATE/DROP DATABASE/TABLE
This adds Impala support for CREATE/DROP DATABASE/TABLE. With this change, Impala
supports creating tables in the metastore stored as text, sequence, and rc file format.
It currently only supports creating unpartitioned tables and tables stored in HDFS.
2014-01-08 10:48:30 -08:00
Marcel Kornacker
c02d25baa8 IMPALA-20: Limit clause in inline view not handled correctly by planner
- this adds a SelectNode that evaluates conjuncts and enforces the limit
- all limits are now distributed: enforced both by the child plan fragment and
  by the merging ExchangeNode
- all limits w/ Order By are now distributed: enforced both by the child plan fragment and
  by the merging TopN node
2014-01-08 10:48:29 -08:00
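The "distributed limit" idea in the commit above can be illustrated with a small sketch. It is not Impala's planner code. The function names (`fragment_limit`, `exchange_limit`, `fragment_topn`, `merge_topn`) are hypothetical, and rows are modelled as plain Python values. The point is only that the limit is applied twice: once in each child fragment, and again in the merging node.

```python
import heapq
from itertools import islice


def fragment_limit(rows, limit):
    """Child fragment (hypothetical): emit at most `limit` rows."""
    return list(islice(rows, limit))


def exchange_limit(fragment_outputs, limit):
    """Merging exchange (hypothetical): concatenate the fragment outputs
    and enforce the same limit a second time."""
    merged = (row for out in fragment_outputs for row in out)
    return list(islice(merged, limit))


def fragment_topn(rows, limit):
    """Child fragment with ORDER BY ... LIMIT: keep the `limit` smallest
    rows, returned in ascending order."""
    return heapq.nsmallest(limit, rows)


def merge_topn(fragment_outputs, limit):
    """Merging top-N (hypothetical): merge the already sorted fragment
    outputs and keep the first `limit` rows."""
    return list(islice(heapq.merge(*fragment_outputs), limit))


if __name__ == "__main__":
    partitions = [[5, 1, 9], [4, 8, 2], [7, 3, 6]]
    print(exchange_limit([fragment_limit(p, 2) for p in partitions], 2))
    print(merge_topn([fragment_topn(p, 2) for p in partitions], 2))
```

Applying the limit in the child fragments keeps each fragment from sending more than `limit` rows, while the second application in the merging node is what makes the final result correct.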
Lenni Kuff
5f9cd044ee Add scanner test suite that runs across all file format/compression permutations 2014-01-08 10:48:25 -08:00
ishaan
5138a720bb IMP-768: Enable the python test framework to check for insert results. 2014-01-08 10:48:22 -08:00
Henry Robinson
222d15c6ca IMPALA-72: String partition keys should be URL encoded 2014-01-08 10:48:20 -08:00
ishaan
09d6d931f4 Change the way data is loaded 2014-01-08 10:48:09 -08:00
Lenni Kuff
d2e4776731 Support passing snapshot file to buildall, add script to run all tests, remove old tests 2014-01-08 10:47:59 -08:00
Lenni Kuff
1896701399 IMPALA-44: Database names are case sensitive 2014-01-08 10:47:34 -08:00
Lenni Kuff
9d981984e7 Update expected results of the 'show table/database' test to remove trevni tables 2014-01-08 10:47:10 -08:00
Lenni Kuff
12d18631e3 Test enhancements: dynamic table format data loading, per-workload exploration strategies 2014-01-08 10:47:07 -08:00
Lenni Kuff
c806738af2 Add scan range length tests to Python test framework 2014-01-08 10:47:06 -08:00
Lenni Kuff
30dbf59ef2 Final changes to enable Python test infrastructure and tests
With this change the Python tests will now be called as part of buildall and
the corresponding Java tests have been disabled. The new tests can also be
invoked calling ./tests/run-tests.sh directly.

This includes a fix from Nong for a bug that caused wrong results for limit on
non-io manager formats.
2014-01-08 10:46:57 -08:00
Nong Li
fbfef4e22e Fix crash in TopN node with null tuples. 2014-01-08 10:46:54 -08:00
Lenni Kuff
837f35eab3 Updated results for more query tests to reflect proper ordering + improved result updating 2014-01-08 10:46:53 -08:00
Lenni Kuff
bed633c1ae Extract config/metastore creation from buildall + script for loading warehouse snapshot 2014-01-08 10:46:53 -08:00
Lenni Kuff
ef48f65e76 Add test framework for running Impala query tests via Python
This is the first set of changes required to start getting our functional test
infrastructure moved from JUnit to Python. After investigating a number of
options, I decided to go with a Python test executor named py.test
(http://pytest.org/). It is very flexible, open source (MIT licensed), and will
enable us to do some cool things like parallel test execution.

As part of this change, we now use our "test vectors" for query test execution.
This will be very nice because it means if you load the "core" dataset you know you
will be able to run the "core" query tests (specified by --exploration_strategy
when running the tests).

You will see that now each combination of table format + query exec options is
treated like an individual test case. This will make it much easier to debug
exactly where something failed.

These new tests can be run using the script at tests/run-tests.sh
2014-01-08 10:46:50 -08:00
Lenni Kuff
1e25c98fb4 Test data loading framework improvements
This change includes a number of improvements for the test data loading framework:
* Named sections for schema template definitions
* Removal of unneeded sections from schema template definitions (ex. ANALYZE TABLE)
* More granular data loading via table name filters
* Improved robustness in detecting failed data loads
* Table level constraints for specific file formats
* Re-written compute stats script
2014-01-08 10:46:49 -08:00
Nong Li
b4dc3eeb35 Fix IMP-575 2014-01-08 10:46:45 -08:00
Nong Li
34879a4ddc Fix IMP-297 2014-01-08 10:46:44 -08:00
Nong Li
b22b565a92 Fix codegen for min/max of bool col. 2014-01-08 10:46:43 -08:00
Alan Choi
a5a9ccf8c2 IMP-550 short-circuit queries with limit 0
The Impala server examines the plan. If the first fragment's top plan node has a "limit 0",
then the query is set to EOS immediately.
2014-01-08 10:46:41 -08:00
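A minimal sketch of the short-circuit described in the commit above, under stated assumptions: the classes `PlanNode`, `Fragment`, and `QueryState`, the function `start_query`, and the use of `-1` for "no limit" are all hypothetical and are not taken from the Impala source. The sketch only shows the shape of the check: inspect the top plan node of the first fragment, and mark the query as finished before starting any work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlanNode:
    name: str
    limit: int = -1  # assumption for this sketch: -1 means "no limit"


@dataclass
class Fragment:
    root: PlanNode  # top plan node of the fragment


@dataclass
class QueryState:
    eos: bool = False           # end of stream: no more rows will arrive
    fragments_started: int = 0  # how many fragments were actually launched


def start_query(fragments: List[Fragment]) -> QueryState:
    """Start a query, skipping execution when the first fragment's
    top plan node carries "limit 0"."""
    state = QueryState()
    if fragments and fragments[0].root.limit == 0:
        state.eos = True  # nothing can be returned, so finish immediately
        return state
    state.fragments_started = len(fragments)
    return state


if __name__ == "__main__":
    plan = [Fragment(PlanNode("exchange", limit=0)), Fragment(PlanNode("scan"))]
    print(start_query(plan))
```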
Alan Choi
dfe7690add IMP-522 Fix null pointer exception in HBase query
The ScanNode.keyRanges is an array list that can contain null. The existing HBase scan node
did not check for that.

An entry in keyRanges is null if
1. the row-key is a string type and is referenced in the query, and
2. there is no predicate on the row-key.
2014-01-08 10:46:36 -08:00
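The kind of null check this fix calls for can be sketched as follows. This is an illustration, not the actual HBase scan node code: the `KeyRange` alias, the function `scan_bounds`, and the choice to treat a missing entry as an unbounded range are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (start_key, stop_key); None means unbounded.
KeyRange = Tuple[Optional[str], Optional[str]]


def scan_bounds(key_ranges: List[Optional[KeyRange]]) -> List[KeyRange]:
    """Turn a list of key ranges that may contain None into scan bounds.

    Assumption for this sketch: a None entry means there is no predicate
    on the row-key, so it is treated as an unbounded range instead of
    being dereferenced.
    """
    bounds: List[KeyRange] = []
    for key_range in key_ranges:
        if key_range is None:
            bounds.append((None, None))  # no restriction on the row-key
        else:
            bounds.append(key_range)
    return bounds


if __name__ == "__main__":
    print(scan_bounds([("a", "m"), None]))
```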
Marcel Kornacker
2fda5d9b99 IMP-491
Fixes bug in Planner.createHashJoinFragment(), which didn't set the left child of the
hj node to the output of the left child fragment.

Also: row descriptor was set incorrectly (too wide; included tuples that weren't materialized)
for roots of plan trees of non-root fragments if those fragments materialized an aggregate
2014-01-08 10:46:33 -08:00
Michael Ubell
0750384b41 IMP-497 Insert with limit, remove extra files from test. 2014-01-08 10:46:33 -08:00
Michael Ubell
116241f1d1 IMP-497 Insert with limit. 2014-01-08 10:46:33 -08:00
Michael Ubell
7536510b69 IMP-258 Test writing nulls. 2014-01-08 10:46:31 -08:00