impala

mirror of https://github.com/apache/impala.git synced 2025-12-31 15:00:10 -05:00

Author	SHA1	Message	Date
Dimitris Tsirogiannis	5a6f53db16	Add partition pruning tests The following changes are included in this commit: 1. Modified the alltypesagg table to include an additional partition key that has nulls. 2. Added a number of tests in hdfs.test that exercise the partition pruning logic (see IMPALA-887). 3. Modified all the tests that are affected by the change in alltypesagg. Change-Id: I1a769375aaa71273341522eb94490ba5e4c6f00d Reviewed-on: http://gerrit.ent.cloudera.com:8080/2874 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3236	2014-06-24 02:14:27 -07:00
Alex Behm	a85dacafe8	IMPALA-904: Make TupleIsNullPredicate work on non-nullable tuples. We wrap certain exprs substituted from outer-joined inline view in an expr that evaluates to NULL if the underlying tuple(s) are NULL. We do this for exprs that evaluate to non-NULL values if their slots are NULL, i.e., we must then distinguish tuples that are NULL from slots that are NULL (otherwise evaluating an expr against a tuple that is NULL due to the outer join may incorrectly return a non-NULL value.) The bug: Exprs referring to an outer-joined inline view may appear in various places in the outer query block. For example, they could appear in an On-clause or be placed into scans/aggregates due to predicate propagation. In such cases, the underlying tuples may not be nullable yet because they only become nullable after the outer join. We had a DCHECK in tuple-is-null-predicate.cc requiring the tuples to be nullable. The fix: Remove the DCHECK. The fix is not elegant but practical. It would be rather difficult to fix the inline view expr substitution such that a TupleIsNullPredicate never references a non-nullable tuple, esp. due to predicate propagation. Change-Id: I180f75f14173f356abfeec751e6b2d419378a9a7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2157 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-04-07 14:18:49 -07:00
Nong Li	1a55133f0a	IMPALA-735. Fix codegen bug affecting outer joins. Change-Id: I99ca45b558fb2ed694f261a22e7e91e59f1ad675 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1496 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-02-10 05:00:21 -08:00
Nong Li	e3fdef7839	Fix subexpr elimination IR rewriting. Change-Id: Iabdcc1686951e71136a603ed30f9d16fb1c1ec46 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1056 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:22 -08:00
Alex Behm	1497002013	Added SHOW TABLE/COLUMN STATS command. Fixed the following stats-related bugs: - Per-partition row count was not distributed properly via CatalogService - HBase column stats were not loaded and distributed properly Enhancements to test framework: - Allow regex specification of expected row or column values - Fixed expected results of some tests because the test framework did not catch that they were incorrect Change-Id: I1fa8e710bbcf0ddb62b961fdd26ecd9ce7b75d51 Reviewed-on: http://gerrit.ent.cloudera.com:8080/813 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-01-08 10:53:51 -08:00
Alex Behm	9754f5bf52	IMPALA-504: Right and full outer joins do not return row with NULL value for rhs table. Change-Id: Ia3f8d474fb30189b36fb587b2920d7b9b224ea71 Reviewed-on: http://gerrit.ent.cloudera.com:8080/129 Tested-by: Alex Behm <alex.behm@cloudera.com> Reviewed-by: Alex Behm <alex.behm@cloudera.com>	2014-01-08 10:52:03 -08:00
Alex Behm	861ba05989	IMPALA-197: Outer join on constant expressions returns incorrect results.	2014-01-08 10:50:09 -08:00
Alex Behm	abafcf81ff	IMPALA-287: Full outer join is missing results.	2014-01-08 10:49:54 -08:00
Alex Behm	dbe3127383	IMPALA-285: Multiple outer joins with nesting crash impalad	2014-01-08 10:49:53 -08:00
Marcel Kornacker	7bf87a4b54	fix for IMPALA-90/IMPALA-221	2014-01-08 10:49:50 -08:00
Alan Choi	57c2f828e0	IMP-791 Fix full outer join hang In full or right outer join, the hash-join-node does not release the io buffer when calling get next, causing deadlock.	2014-01-08 10:48:58 -08:00
Lenni Kuff	ef48f65e76	Add test framework for running Impala query tests via Python This is the first set of changes required to start getting our functional test infrastructure moved from JUnit to Python. After investigating a number of options, I decided to go with a Python test executor named py.test (http://pytest.org/). It is very flexible, open source (MIT licensed), and will enable us to do some cool things like parallel test execution. As part of this change, we now use our "test vectors" for query test execution. This will be very nice because it means if you load the "core" dataset you know you will be able to run the "core" query tests (specified by --exploration_strategy when running the tests). You will see that now each combination of table format + query exec options is treated like an individual test case. This will make it much easier to debug exactly where something failed. These new tests can be run using the script at tests/run-tests.sh	2014-01-08 10:46:50 -08:00
Lenni Kuff	04edc8f534	Update benchmark tests to run against generic workload, data loading with scale factor, +more This change updates the run-benchmark script to enable it to target one or more workloads. Now benchmarks can be run like: ./run-benchmark --workloads=hive-benchmark,tpch We look up the workload in the workloads directory, then read the associated query .test files and start executing them. To ensure the queries are not duplicated between benchmark and query tests, I moved all existing queries (under fe/src/test/resources/*) to the workloads directory. You do NOT need to look through all the .test files, I've just moved them. The one new file is the 'hive-benchmark.test' which contains the hive benchmark queries. Also added support for generating schema for different scale factors as well as executing against these scale factors. For example, let's say we have a dataset with a scale factor called "SF1". We would first generate the schema using: ./generate_schema_statements --workload=<workload> --scale_factor="SF3" This will create tables whose names are distinct from those of the other scale factors. Run the generated .sql file to load the data. Alternatively, the data can be loaded by running a new python script: ./bin/load-data.py -w <workload1>,<workload2> -e <exploration strategy> -s [scale factor] For example: load-data.sh -w tpch -e core -s SF3 Then run against this: ./run-benchmark --workloads=<workload> --scale_factor=SF3 This changeset also includes a few other minor tweaks to some of the test scripts. Change-Id: Ife8a8d91567d75c9612be37bec96c1e7780f50d6	2014-01-08 10:44:22 -08:00

13 Commits
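
Commit ef48f65e76 says that each combination of table format and query exec options is treated as an individual test case, selected by --exploration_strategy. The sketch below illustrates that idea only; it is not the Impala test framework's code. The dimension values, the function names `generate_test_vectors` and `vector_id`, and the rule that "core" keeps just the first table format are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical dimension values, chosen only for illustration.
TABLE_FORMATS = ["text/none", "seq/snap", "rc/gzip"]
EXEC_OPTIONS = [{"batch_size": 0}, {"batch_size": 1}]


def generate_test_vectors(table_formats, exec_options, exploration_strategy="core"):
    """Return one (table_format, exec_option) pair per test case.

    Assumption: "exhaustive" means the full cross product, and "core"
    restricts the run to the first table format. The real strategies
    may select vectors differently.
    """
    if exploration_strategy == "exhaustive":
        formats = table_formats
    elif exploration_strategy == "core":
        formats = table_formats[:1]
    else:
        raise ValueError(f"unknown exploration strategy: {exploration_strategy}")
    return list(product(formats, exec_options))


def vector_id(table_format, exec_option):
    """Build a readable test-case name so a failure points at one combination."""
    opts = ",".join(f"{key}={value}" for key, value in sorted(exec_option.items()))
    return f"{table_format}[{opts}]"


if __name__ == "__main__":
    for fmt, opts in generate_test_vectors(TABLE_FORMATS, EXEC_OPTIONS, "exhaustive"):
        print(vector_id(fmt, opts))
```

With py.test, a list like this could be passed to `pytest.mark.parametrize` together with the generated ids, so that each combination is reported as its own test case.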