impala

mirror of https://github.com/apache/impala.git synced 2025-12-31 15:00:10 -05:00

Author	SHA1	Message	Date
Taras Bobrovytsky	7faaa65996	Added order by query tests - Added static order by tests to test_queries.py and QueryTest/sort.test - test_order_by.py also contains tests with static queries that are run with multiple memory limits. - Added stress, scratch disk and failpoints tests - Incorporated Srinath's change that copied all order by with limit tests into the top-n.test file Extra time required: Serial: scratch disk: 42 seconds test queries sort : 77 seconds test sort: 56 seconds sort stress: 142 seconds TOTAL: 5 min 17 seconds Parallel(8 threads): scratch disk: 40 seconds test queries sort: 42 seconds test sort: 49 seconds sort stress: 93 seconds TOTAL: 3 min 44 sec Change-Id: Ic5716bcfabb5bb3053c6b9cebc9bfbbb9dc64a7c Reviewed-on: http://gerrit.ent.cloudera.com:8080/2820 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3205	2014-06-20 13:35:10 -07:00
Nong Li	6e691f9500	IMPALA-1010: Remove Close() of build side in blocking join node. This optimization is generally not safe since the probe side is still streaming. The join node could acquire all of the data from the child into its own pool but then there's no real point in doing this (doesn't lead to lower memory footprint and just makes the mem accounting harder to reason about). This is exposed in busy plans. Change-Id: I37b0f6507dc67c79e5ebe8b9242ec86f28ddad41 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2747 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-05-30 11:50:50 -07:00
Alex Behm	b252921363	IMPALA-994: Handle incorrect column metadata in views created by Hive. Change-Id: I3fba08d191c479f37371ce50fd07b8476a73eba2 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2613 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2618 Reviewed-by: Alex Behm <alex.behm@cloudera.com>	2014-05-19 20:17:23 -07:00
Aaron Davidson	cafb7b72f8	External sorting This is an experimental implementation of external sorting. This patch includes the following additions: (1) creation and implementation of the Sorter interface, which can sort Impala Tuples. (2) normalization of Tuples to allow memcmp-able sorting. (3) a testing framework for the Sorter, (4) a benchmark to compare the current state of the Sorter with other sorts, (5) an implementation of a Vector which can store data whose size is only known at runtime, (6) a sorting algorithm (basically a dumbed down STL sort) which can operate over such a vector, (7) implementation of a simple in-memory Merger, and (8) logic to stream blocks of memory in and out of memory for the actual external merging. I have a local branch for experimental optimizations and benchmarking -- this should be considered a "basic", working sort. The following optimizations have been implemented: (i) Optionally extracting keys instead of writing them in place. (ii) Optionally opportunistically parallelize run building (sorting & prepare for output). (iii) Maximize disk IO and minimize buffer recycling by writing buffers out, but also keeping them in memory until right when they're needed. (iv) Prepare auxiliary data backwards so the buffers can be released as we go, and still go out in an order which preserves the first buffers of the run. (v) Always merge maximum number of runs at a time, taking from the next merge level if available. Change-Id: I1d7304d54d73152da929b1efffc1e851e5fb8fd4 Reviewed-on: http://gerrit.ent.cloudera.com:8080/126 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Aaron Davidson <aaron.davidson@cloudera.com>	2014-01-08 10:52:27 -08:00
Alex Behm	8ad15fabcf	IMPALA-372: Added CREATE/DROP/ALTER VIEW.	2014-01-08 10:51:35 -08:00

5 Commits