impala

mirror of https://github.com/apache/impala.git synced 2025-12-31 15:00:10 -05:00

Author	SHA1	Message	Date
Alex Behm	677062be3d	Rework planning of unions s.t. a UnionStmt produces a single MergeNode. This patch changes the planning of a UnionStmt s.t. it always produces a single fragment with a MergeNode connecting all child fragments as its root. The data partition of the returned fragment and how the child fragments are merged depends on the data partitions of the child fragments: - All child fragments are unpartitioned or partitioned: The returned fragment has an UNPARTITIONED or RANDOM data partition, respectively. The MergeNode absorbs the plan trees of all child fragments. - Mixed partitioned/unpartitioned child fragments: The returned fragment is RANDOM partitioned. The plan trees of all partitioned child fragments are absorbed into the MergeNode. All unpartitioned child fragments are connected to the MergeNode via a RANDOM exchange, and remain unchanged otherwise. Also adds support for random partitioned data exchanges. Change-Id: I82b2d12c104d98c4e7133234653ee1b67658ef7a Reviewed-on: http://gerrit.ent.cloudera.com:8080/2876 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3143	2014-06-19 00:56:58 -07:00
Nong Li	8f4dc0f2f0	IMPALA-974: Switch from FloatLiteral to DecimalLiteral. Float/Doubles are lossy so using those as the default literal type is problematic. Change-Id: I5a619dd931d576e2e6cd7774139e9bafb9452db9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2758 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-05-31 22:19:06 -07:00
Alex Behm	121fab8fdf	IMPALA-888: Drop union operands with constant conjuncts evaluating to false. This patch simplifies the complex slot materialization logic for unions by making the materialization independent of conjuncts assigned to MergeNodes. When 'pushing down' predicates into union operands, we drop union operands with constant predicates evaluating to false. Constant predicates that evaluate to true are simply ignored. Change-Id: I0e7ccfb206bed29db2b5d667e2bb61310980e80a Reviewed-on: http://gerrit.ent.cloudera.com:8080/2327 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-04-23 18:25:14 -07:00
Matthew Jacobs	8fa8a0f828	IMPALA-843: Do not close reader contexts until plan fragment close Fixes a crash that occurs in some cases when io buffers are still used and child nodes are closed early. We close child nodes early when all rows have been consumed and resources are transferred, but in some cases io buffers are still in use when a scan node is closed. We avoid this problem by only closing reader contexts when the entire fragment is closed. Change-Id: Ie62cdecdcd530bdc61dd4e83cd9ecfc7d2c93ef6 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1806 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit 66f14a47b953b7b7153c73f4e018d03461dcd5ef) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1859	2014-03-12 14:44:18 -07:00
Alan Choi	468ca0aa5d	IMPALA-723 Fix union with aggregate The problem is that with Union, AggregateInfo.materializeRequiredSlots() is being called more than once. Other "materializeSlots" related calls are idempotent, but this one is not. That's because materializedAggregateSlots_ is an array list and we keep adding the same duplicate value to the array list. We can fix it by making materializeRequiredSlots() idempotent. Change-Id: Ic18f89010c088fe9018b15f0281bc9340b8a2d14 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1195 Tested-by: jenkins Reviewed-by: Alan Choi <alan@cloudera.com> Tested-by: Alan Choi <alan@cloudera.com>	2014-01-08 10:54:40 -08:00
Matthew Jacobs	00bc971d34	IMPALA-531: Allow arithmetic expressions for LIMIT Change-Id: Ic1901e9dbaeee5fb0aef72a278b4aa262a2abcd7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/829 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: Matthew Jacobs <mj@cloudera.com>	2014-01-08 10:53:49 -08:00
ishaan	53cd9eadab	Treat HBase as a file format for functional tests Change-Id: Ia01181a1e10eb108419122d347e9d869a69e8922 Reviewed-on: http://gerrit.ent.cloudera.com:8080/102 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-01-08 10:52:36 -08:00
Alex Behm	937a44f9f8	IMPALA-68: Support Values() statement.	2014-01-08 10:50:31 -08:00
Alex Behm	5db3f2cdf5	IMPALA-227: SELECT * on partitioned table returns columns in different order than Hive.	2014-01-08 10:49:48 -08:00
Alex Behm	1b2e8280d4	Fix NULL issues.	2014-01-08 10:49:32 -08:00
Alex Behm	0821e2f826	IMPALA-66: Support for UNION with constant SELECT clauses.	2014-01-08 10:49:18 -08:00
Lenni Kuff	30dbf59ef2	Final changes to enable Python test infrastructure and tests With this change the Python tests will now be called as part of buildall and the corresponding Java tests have been disabled. The new tests can also be invoked calling ./tests/run-tests.sh directly. This includes a fix from Nong that caused wrong results for limit on non-io manager formats.	2014-01-08 10:46:57 -08:00
Lenni Kuff	ef48f65e76	Add test framework for running Impala query tests via Python This is the first set of changes required to start getting our functional test infrastructure moved from JUnit to Python. After investigating a number of options, I decided to go with a python test executor named py.test (http://pytest.org/). It is very flexible, open source (MIT licensed), and will enable us to do some cool things like parallel test execution. As part of this change, we now use our "test vectors" for query test execution. This will be very nice because it means if you load the "core" dataset you know you will be able to run the "core" query tests (specified by --exploration_strategy when running the tests). You will see that now each combination of table format + query exec options is treated like an individual test case. This will make it much easier to debug exactly where something failed. These new tests can be run using the script at tests/run-tests.sh	2014-01-08 10:46:50 -08:00
Lenni Kuff	04edc8f534	Update benchmark tests to run against generic workload, data loading with scale factor, +more This change updates the run-benchmark script to enable it to target one or more workloads. Now benchmarks can be run like: ./run-benchmark --workloads=hive-benchmark,tpch We look up the workload in the workloads directory, then read the associated query .test files and start executing them. To ensure the queries are not duplicated between benchmark and query tests, I moved all existing queries (under fe/src/test/resources/*) to the workloads directory. You do NOT need to look through all the .test files, I've just moved them. The one new file is the 'hive-benchmark.test' which contains the hive benchmark queries. Also added support for generating schema for different scale factors as well as executing against these scale factors. For example, let's say we have a dataset with a scale factor called "SF1". We would first generate the schema using: ./generate_schema_statements --workload=<workload> --scale_factor="SF3" This will create tables with unique names from the other scale factors. Run the generated .sql file to load the data. Alternatively, the data can be loaded by running a new python script: ./bin/load-data.py -w <workload1>,<workload2> -e <exploration strategy> -s [scale factor] For example: load-data.sh -w tpch -e core -s SF3 Then run against this: ./run-benchmark --workloads=<workload> --scale_factor=SF3 This changeset also includes a few other minor tweaks to some of the test scripts. Change-Id: Ife8a8d91567d75c9612be37bec96c1e7780f50d6	2014-01-08 10:44:22 -08:00

14 Commits