This patch adds support for the following SQL constructs
- Unary + operator
- The ALL keyword, in SELECT ALL and SELECT aggregate_func(ALL *)
- REAL and INTEGER as type synonyms for DOUBLE and INT respectively
- The AS keyword after a table spec. e.g. SELECT * FROM tbl AS t0
This makes partition pruning more effective by extending it to predicates that are fully bound by the partition column,
e.g., '<col> IN (1, 2, 3)' will also be used to prune partitions, in addition to equality and binary comparisons.
This adds Impala support for CREATE/DROP DATABASE/TABLE. With this change, Impala
supports creating tables in the metastore stored as text, sequence, and rc file format.
It currently only supports creating unpartitioned tables and tables stored in HDFS.
- this adds a SelectNode that evaluates conjuncts and enforces the limit
- all limits are now distributed: enforced both by the child plan fragment and
by the merging ExchangeNode
- all limits w/ Order By are now distributed: enforced both by the child plan fragment and
by the merging TopN node
With this change the Python tests will now be called as part of buildall and
the corresponding Java tests have been disabled. The new tests can also be
invoked calling ./tests/run-tests.sh directly.
This includes a fix from Nong that caused wrong results for limit on non-io
manager formats.
Fixing predicate assignment for outer joins:
- On clause predicates for outer joins are now assigned to the join node
- the exception are On clause predicates that can be directly evaluated
by the outer-joined tables themselves; those are "pushed down"
- Where clause predicates for outer-joined tables are assigned to the join node
that materializes the outer join
This is the first set of changes required to start getting our functional test
infrastructure moved from JUnit to Python. After investigating a number of
option, I decided to go with a python test executor named py.test
(http://pytest.org/). It is very flexible, open source (MIT licensed), and will
enable us to do some cool things like parallel test execution.
As part of this change, we now use our "test vectors" for query test execution.
This will be very nice because it means if load the "core" dataset you know you
will be able to run the "core" query tests (specified by --exploration_strategy
when running the tests).
You will see that now each combination of table format + query exec options is
treated like an individual test case. this will make it much easier to debug
exactly where something failed.
These new tests can be run using the script at tests/run-tests.sh
This change includes a number of improvements for the test data loading framework:
* Named sections for schema template definitions
* Removal of uneeded sections from schema template definitions (ex. ANALYZE TABLE)
* More granular data loading via table name filters
* Improved robustness in detecting failed data loads
* Table level constraints for specific file formats
* Re-written compute stats script