Commit Graph

103 Commits

Author SHA1 Message Date
Lenni Kuff
831ee529be Fixed data loading bugs, moved most tables out of load-dependent-tables 2014-01-08 10:48:56 -08:00
Lenni Kuff
ca0d23a844 IMPALA-157: Support CREATE TABLE LIKE DDL 2014-01-08 10:48:55 -08:00
Henry Robinson
8d87972695 Improve parser coverage
This patch adds support for the following SQL constructs

  - Unary + operator
  - The ALL keyword, in SELECT ALL and SELECT aggregate_func(ALL *)
  - REAL and INTEGER as type synonyms for DOUBLE and INT respectively
  - The AS keyword after a table spec. e.g. SELECT * FROM tbl AS t0
2014-01-08 10:48:54 -08:00
Alex Behm
be03e6c21c IMPALA-138: Error messages for unknown column types are particularly bad. 2014-01-08 10:48:53 -08:00
Alex Behm
a01573af63 IMPALA-65: Add MySQL-style string literals with escaping. 2014-01-08 10:48:51 -08:00
Nong Li
0df9476be1 Parquet data loading. 2014-01-08 10:48:48 -08:00
ishaan
5ed84d7f65 IMP-739 Results for show queries should check for subset, not equality. 2014-01-08 10:48:46 -08:00
Skye Wanderman-Milne
461a48df2b Refactor testing framework to generate Avro tables. 2014-01-08 10:48:45 -08:00
Nong Li
6e293090e6 Parquet writer.
Change-Id: I7117b545e3d3a7803a219234ad992040a6c7c4ec
2014-01-08 10:48:44 -08:00
Alexander Behm
39e443407b IMPALA-136: GROUP BY float/double. 2014-01-08 10:48:43 -08:00
Nong Li
0385d14d69 Fix pre-hive 9 rc file scanner. 2014-01-08 10:48:41 -08:00
Marcel Kornacker
d7bfe6c68d IMPALA-144: partition pruning for arbitrary predicates that are fully bound by partition columns
This makes partition pruning more effective by extending it to predicates that are fully bound by the partition column,
e.g., '<col> IN (1, 2, 3)' will also be used to prune partitions, in addition to equality and binary comparisons.
2014-01-08 10:48:41 -08:00
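The pruning rule described in d7bfe6c68d can be sketched as follows. This is a minimal illustration in Python, not the planner's actual code: the names `is_fully_bound` and `prune_partitions`, and the representation of a predicate as a (referenced columns, function) pair, are assumptions made for the example.

```python
from typing import Callable, Dict, List, Set, Tuple

Partition = Dict[str, object]
# Hypothetical predicate representation: the columns it references plus
# a function that evaluates it against a partition's key values.
Predicate = Tuple[Set[str], Callable[[Partition], bool]]


def is_fully_bound(referenced_cols: Set[str], partition_cols: Set[str]) -> bool:
    """A predicate is fully bound if it references only partition columns."""
    return referenced_cols <= partition_cols


def prune_partitions(
    partitions: List[Partition],
    partition_cols: Set[str],
    predicates: List[Predicate],
) -> List[Partition]:
    """Keep only partitions that satisfy every fully bound predicate.

    Predicates that touch non-partition columns are ignored here; they
    still have to be evaluated later against the scanned rows.
    """
    usable = [fn for cols, fn in predicates if is_fully_bound(cols, partition_cols)]
    return [p for p in partitions if all(fn(p) for fn in usable)]


partitions = [{"year": 2012, "month": m} for m in range(1, 13)]
predicates = [
    ({"month"}, lambda p: p["month"] in (1, 2, 3)),  # IN list on a partition column
    ({"id"}, lambda p: p["id"] > 5),                 # not bound by partition columns
]
kept = prune_partitions(partitions, {"year", "month"}, predicates)
```

Here the IN predicate prunes the twelve partitions down to three, while the predicate on `id` is skipped because it is not bound by the partition columns.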
Lenni Kuff
328ceed4e7 Add support for generating lzo compressed text files and running tests against lzo 2014-01-08 10:48:38 -08:00
Lenni Kuff
90d7e085fa Update tests to use num_nodes=0, use external impala cluster, add sanity check run mode 2014-01-08 10:48:38 -08:00
Lenni Kuff
d57440e87d Allow column comments for CREATE TABLE and DESCRIBE <table> statements 2014-01-08 10:48:37 -08:00
Lenni Kuff
9f71374875 IMPALA-102: Add support for CREATE TABLE ... PARTITIONED BY (col1, col2) 2014-01-08 10:48:35 -08:00
Lenni Kuff
1cd847c856 IMPALA-81: Add support for CREATE/DROP DATABASE/TABLE
This adds Impala support for CREATE/DROP DATABASE/TABLE. With this change, Impala
supports creating tables in the metastore stored as text, sequence, and rc file format.
It currently only supports creating unpartitioned tables and tables stored in HDFS.
2014-01-08 10:48:30 -08:00
Marcel Kornacker
c02d25baa8 IMPALA-20: Limit clause in inline view not handled correctly by planner
- this adds a SelectNode that evaluates conjuncts and enforces the limit
- all limits are now distributed: enforced both by the child plan fragment and
  by the merging ExchangeNode
- all limits w/ Order By are now distributed: enforced both by the child plan fragment and
  by the merging TopN node
2014-01-08 10:48:29 -08:00
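The distributed limit handling described in c02d25baa8 can be sketched roughly as below. This is a simplified model in Python, assuming each plan fragment is just an iterable of rows; the function names are hypothetical and do not correspond to Impala classes.

```python
import heapq
from itertools import islice
from typing import Iterable, Iterator, List


def fragment_with_limit(rows: Iterable[int], limit: int) -> Iterator[int]:
    """Child plan fragment: stop producing rows once the limit is reached."""
    return islice(rows, limit)


def distributed_limit(fragments: List[Iterable[int]], limit: int) -> List[int]:
    """Merging exchange: each child enforces the limit, then the merge does too."""
    limited = [fragment_with_limit(f, limit) for f in fragments]
    merged = (row for frag in limited for row in frag)
    return list(islice(merged, limit))


def distributed_top_n(fragments: List[Iterable[int]], limit: int) -> List[int]:
    """Limit with ORDER BY: each child keeps its local top N, the merge keeps the global top N."""
    local_top = [sorted(f)[:limit] for f in fragments]
    return list(islice(heapq.merge(*local_top), limit))


fragments = [[5, 1, 9], [4, 8, 2], [7, 3, 6]]
first_four = distributed_limit(fragments, 4)
top_two = distributed_top_n(fragments, 2)
```

Enforcing the limit in the children bounds how many rows each fragment sends, and enforcing it again at the merge guarantees the final row count.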
Lenni Kuff
5f9cd044ee Add scanner test suite that runs across all file format/compression permutations 2014-01-08 10:48:25 -08:00
ishaan
5138a720bb IMP-768: Enable the python test framework to check for insert results. 2014-01-08 10:48:22 -08:00
Skye Wanderman-Milne
6c08716439 IMPALA-92: Significant performance difference between LIKE = 'x' AND = 'x' 2014-01-08 10:48:21 -08:00
Henry Robinson
222d15c6ca IMPALA-72: String partition keys should be URL encoded 2014-01-08 10:48:20 -08:00
ishaan
09d6d931f4 Change the way data is loaded 2014-01-08 10:48:09 -08:00
Skye Wanderman-Milne
357327b5c0 Fix file offsets in DataErrorsTest 2014-01-08 10:48:06 -08:00
Lenni Kuff
d2e4776731 Support passing snapshot file to buildall, add script to run all tests, remove old tests 2014-01-08 10:47:59 -08:00
Nong Li
a0229cd12e Update tpch schema to use bigint for keys. 2014-01-08 10:47:54 -08:00
Lenni Kuff
409d2ae5d7 Migrate run-test to python and add mini-stress test as part of buildall 2014-01-08 10:47:34 -08:00
Lenni Kuff
1896701399 IMPALA-44: Database names are case sensitive 2014-01-08 10:47:34 -08:00
Lenni Kuff
3fb375cdc4 Add initial set of queries for targeted perf workload
Includes a query that runs a simple "limit 0" as well as queries that perform aggregation
on columns with different numbers of GROUP BY groups.
2014-01-08 10:47:23 -08:00
Nong Li
02c329b97a Update RC files to use io mgr and remove scanner support for non-io mgr. 2014-01-08 10:47:11 -08:00
Lenni Kuff
9d981984e7 Update expected results of the 'show table/database' test to remove trevni tables 2014-01-08 10:47:10 -08:00
Lenni Kuff
e10960b2c9 Disable test execution against Trevni and replace with seq/snap format 2014-01-08 10:47:10 -08:00
Lenni Kuff
12d18631e3 Test enhancements: dynamic table format data loading, per-workload exploration strategies 2014-01-08 10:47:07 -08:00
Lenni Kuff
c806738af2 Add scan range length tests to Python test framework 2014-01-08 10:47:06 -08:00
Nong Li
15dfd968fb Disable tpch-q21 and fix plan output for tpch-q22.
We can now generate the temp table for q22 which changes the plan output.
2014-01-08 10:47:03 -08:00
Nong Li
f46c654e01 Enable tpch-q21 and tpch-q22 in tests. 2014-01-08 10:47:03 -08:00
Lenni Kuff
1fcf094d67 Add support for comparing query test results by column type 2014-01-08 10:47:01 -08:00
Henry Robinson
15228f945f IMP-503: INSERTS into unpartitioned tables should be checked for union compatibility 2014-01-08 10:46:57 -08:00
Lenni Kuff
30dbf59ef2 Final changes to enable Python test infrastructure and tests
With this change the Python tests will now be called as part of buildall and
the corresponding Java tests have been disabled. The new tests can also be
invoked calling ./tests/run-tests.sh directly.

This includes a fix from Nong for a bug that caused wrong results for limit on
non-io manager formats.
2014-01-08 10:46:57 -08:00
Alan Choi
476a665763 IMP-620: print number of scanned partitions and total scanned bytes 2014-01-08 10:46:57 -08:00
Nong Li
fbfef4e22e Fix crash in TopN node with null tuples. 2014-01-08 10:46:54 -08:00
Lenni Kuff
837f35eab3 Updated results for more query tests to reflect proper ordering + improved result updating 2014-01-08 10:46:53 -08:00
Lenni Kuff
bed633c1ae Extract config/metastore creation from buildall + script for loading warehouse snapshot 2014-01-08 10:46:53 -08:00
Lenni Kuff
a035cf4e73 Update results of a few TPC-H queries to reflect proper ordering
Change-Id: I41156b506155c846220cfb097f5e8120503f8da8
2014-01-08 10:46:52 -08:00
Lenni Kuff
1b248d067b Add TPC-DS dataset and workload 2014-01-08 10:46:52 -08:00
Marcel Kornacker
f6af9316d9 Fix for IMP-137: incorrect predicate placement for outer joins
Fixing predicate assignment for outer joins:
- On clause predicates for outer joins are now assigned to the join node
- the exceptions are On clause predicates that can be directly evaluated
  by the outer-joined tables themselves; those are "pushed down"
- Where clause predicates for outer-joined tables are assigned to the join node
  that materializes the outer join
2014-01-08 10:46:50 -08:00
Lenni Kuff
febdb112f4 Fixed bug in test file section parsing 2014-01-08 10:46:50 -08:00
Lenni Kuff
ef48f65e76 Add test framework for running Impala query tests via Python
This is the first set of changes required to start getting our functional test
infrastructure moved from JUnit to Python. After investigating a number of
options, I decided to go with a python test executor named py.test
(http://pytest.org/). It is very flexible, open source (MIT licensed), and will
enable us to do some cool things like parallel test execution.

As part of this change, we now use our "test vectors" for query test execution.
This will be very nice because it means if you load the "core" dataset you know you
will be able to run the "core" query tests (specified by --exploration_strategy
when running the tests).

You will see that now each combination of table format + query exec options is
treated like an individual test case. This will make it much easier to debug
exactly where something failed.

These new tests can be run using the script at tests/run-tests.sh
2014-01-08 10:46:50 -08:00
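The "test vector" idea from ef48f65e76, where every combination of table format and query exec options becomes its own test case, can be sketched as below. This is a minimal illustration only: the format names, option keys, and helper functions are hypothetical and are not taken from the actual framework.

```python
from itertools import product
from typing import Dict, List

# Hypothetical dimensions; the real framework defines its own values.
TABLE_FORMATS = ["text/none", "seq/snap", "rc/none"]
EXEC_OPTIONS = [
    {"num_nodes": 0, "batch_size": 0},
    {"num_nodes": 1, "batch_size": 0},
]


def generate_test_vectors(table_formats: List[str], exec_options: List[Dict]) -> List[Dict]:
    """Build one test vector per (table format, exec option) combination."""
    return [
        {"table_format": fmt, "exec_option": opt}
        for fmt, opt in product(table_formats, exec_options)
    ]


def vector_id(vector: Dict) -> str:
    """Readable test-case name, so a failure points at one exact combination."""
    opts = ",".join(f"{k}={v}" for k, v in sorted(vector["exec_option"].items()))
    return f"{vector['table_format']}[{opts}]"


vectors = generate_test_vectors(TABLE_FORMATS, EXEC_OPTIONS)
ids = [vector_id(v) for v in vectors]
```

A list like `vectors` could be handed to `pytest.mark.parametrize` together with `ids`, so that each combination is reported as a separate test case.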
Lenni Kuff
1e25c98fb4 Test data loading framework improvements
This change includes a number of improvements for the test data loading framework:
* Named sections for schema template definitions
* Removal of unneeded sections from schema template definitions (ex. ANALYZE TABLE)
* More granular data loading via table name filters
* Improved robustness in detecting failed data loads
* Table level constraints for specific file formats
* Re-written compute stats script
2014-01-08 10:46:49 -08:00
Nong Li
b4dc3eeb35 Fix IMP-575 2014-01-08 10:46:45 -08:00