impala

mirror of https://github.com/apache/impala.git synced 2026-01-06 06:01:03 -05:00

Author	SHA1	Message	Date
Ippokratis Pandis	87502f829c	IMPALA-1471: Bug in spilling of PHJ that was affecting left anti and outer joins. In cases where we had to spill the probe side of PHJs, we were not only appending the probe row to the tuple stream to be spilled, but we were also getting into the regular processing loop with the iterator set to End(). In the case of left anti and left outer joins, the result was to incorrectly output this row, since it did not have a match. This bug had a small perf impact for all spilling joins because we were doing an unnecessary loop for each probe row we had to spill. This patch solves the problem by immediately going to the next probe row if the current row is spilled. Additionally, it fixes a bug in the block mgr where one code path did not correctly count the number of pinned buffers. It also adds tpch-q21 to the set of queries to run in the spilling test. Change-Id: I762f5c41fe468e4485a4b31dabe2e53f6b49ae24 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5313 Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5334	2014-11-20 02:21:14 -08:00
ishaan	23964c19af	[CDH5] Fix bad merge in spilling.test Change-Id: Ia6e30cf5916c737088d8cb969e0167b9d69a599e	2014-10-08 23:19:02 -07:00
Nong Li	de31fa8e21	Disable spilling tests that are too flaky. Change-Id: I4ac877c3fa8297d873c67f219bb0c75f0001562d Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4731 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-10-06 15:18:56 -07:00
Nong Li	e08ffde009	PA/PHJ: Increase fanout to 32 and fix interaction with small buffers. Small buffers introduced an issue that is exacerbated by the large fanout. A stream can only be appended to forever once it has grabbed the initial io sized buffer. With small buffers, we don't grab that at the beginning anymore and, before this patch, it is grabbed when the stream first needs it. This means when one stream needs it, another stream could have already grabbed it (meaning this stream is pinned with multiple buffers). This patch has all the streams grab an IO buffer as soon as the first stream needs an io buffer. This guarantees that all streams get 1 before any get 2. Change-Id: I1be1219fc5f1fa3ceedd4d5e76ae056c8bb8ff3d	2014-10-06 15:16:16 -07:00
Nong Li	3e632ef6ad	Reduce min PA/PHJ mem requirement. Update PA/PHJ to use small (< io sized) buffers initially. Without this we would not be able to run at the QPS that we need just due to the buffering requirements of these operators. Change-Id: Ic8a777d147893567c9590fbab17f561eadb6ee19 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4623 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com>	2014-10-06 15:14:10 -07:00
ishaan	010cc22a2f	[CDH5] Fix test spilling. tpch in cdh5 does not have double columns. Also, remove round calls to test that we get consistent results. Change-Id: Ia45ef08644ed78b05a08c47422733ab38a26b508 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4595 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-09-26 22:57:02 -07:00
Nong Li	d5c948c351	Increase the mem limit for one of the spilling queries. Change-Id: I9b52582b2ded82821ecc446762f07d7702dedabf Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4555 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-09-26 12:27:29 -07:00
Nong Li	f03b05ed50	Fix hash table buckets to allocate memory from the BlockMgr. This was always a TODO. We want memory to come from the block mgr and trigger spilling. Change-Id: I07f1f79fbbb33068fb2df64510a80a9b008ef73d Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4466 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com>	2014-09-26 12:26:09 -07:00
Matthew Jacobs	da5198e615	Add spilling test for an analytic fn Change-Id: Ia93c71c9c2a01f7f04a81593d51f5ca565286b7d Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4447 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-09-23 07:26:09 -07:00
Nong Li	8a661d0787	[CDH5] cherry pick conflicts. Change-Id: Ic11237b7ead4a810b523d6b6095781efbc5bb66b	2014-09-20 19:41:42 -07:00
Nong Li	6b73eec02d	PHJ: Fix block management when spilling. The previous code did not properly handle the case where spilling happens while building the hash table (i.e. partitioning the build rows fit). This starved the probe partition, causing queries that should be able to run to fail with a not enough buffers error. Change-Id: I3a9a84e8800a72ed3ce6f5ab7ff03bc2d6eb7ad8 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4403 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-09-20 16:12:21 -07:00
Skye Wanderman-Milne	2a449651da	Use CRC hash for 0th partition level. Change-Id: Ie845e0edb684f13421eea41327b1571b368db21a Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4370 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-09-20 16:11:40 -07:00
ishaan	c4b4e010ff	Buffered Tuple Stream fixes. This patch fixes two issues: - Add API to buffered block mgr to allow an atomic Unpin and GetNewBlock. This has the semantics of unpinning a block and giving the buffer to the new block. This is necessary for the tuple stream to make sure another thread does not grab the unpinned block in between. - Buffer management when reading an unpinned stream. Before moving on to a new block (and unpinning the current), we need to make sure all the tuples returned from the current block are returned up the operator tree. Change-Id: I95ee58d1019dd971f6a7dc19ecafdfa54cdbf942 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4333 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com>	2014-09-20 16:05:11 -07:00
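
Commit c4b4e010ff describes an atomic Unpin and GetNewBlock whose purpose is to keep another thread from grabbing the freed buffer in between the two steps. The sketch below is a toy Python model of that idea only, assuming a fixed pool of buffers guarded by one lock. `ToyBlockMgr` and all of its method names are hypothetical and do not correspond to Impala's actual C++ block manager API.

```python
import threading


class ToyBlockMgr:
    """Toy fixed-size buffer pool (hypothetical; not Impala's block manager)."""

    def __init__(self, num_buffers):
        self._lock = threading.Lock()
        self._free = num_buffers   # buffers not attached to any pinned block
        self._pinned = set()       # ids of blocks that currently hold a buffer
        self._next_id = 0

    def _new_pinned_block_locked(self):
        # Caller must already hold self._lock.
        block_id = self._next_id
        self._next_id += 1
        self._pinned.add(block_id)
        return block_id

    def get_new_block(self):
        """Return a new pinned block id, or None if no buffer is free."""
        with self._lock:
            if self._free == 0:
                return None
            self._free -= 1
            return self._new_pinned_block_locked()

    def unpin(self, block_id):
        """Unpin a block and return its buffer to the free pool."""
        with self._lock:
            self._pinned.remove(block_id)
            self._free += 1

    def unpin_and_get_new_block(self, old_block_id):
        """Unpin old_block_id and hand its buffer to a new block in one step.

        The buffer never passes through the free pool, so a concurrent
        get_new_block() call cannot take it between the two halves.
        """
        with self._lock:
            self._pinned.remove(old_block_id)
            return self._new_pinned_block_locked()

    def num_pinned(self):
        with self._lock:
            return len(self._pinned)
```

With a single buffer, calling `unpin()` followed by `get_new_block()` leaves a window in which another thread could win the buffer; `unpin_and_get_new_block()` closes that window because the free count is never incremented.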

13 Commits
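
The IMPALA-1471 fix in commit 87502f829c (go to the next probe row as soon as the current row is spilled) can be illustrated with a toy left anti join. This is a simplified Python sketch under stated assumptions: integer join keys, partitioning by `key % num_partitions`, and a plain set standing in for the in-memory hash table. The function name and the `skip_spilled_rows` flag are hypothetical and exist only to contrast the behavior before and after the fix.

```python
def probe_left_anti(probe_keys, build_keys_in_memory, spilled_partitions,
                    num_partitions, skip_spilled_rows=True):
    """Toy left anti join probe phase (hypothetical model, not Impala code).

    Returns (output_rows, spilled_probe_rows).
    """
    output = []
    spilled = []
    for key in probe_keys:
        partition = key % num_partitions
        if partition in spilled_partitions:
            # The build side of this partition is on disk, so the probe row
            # is saved to be joined later.
            spilled.append(key)
            if skip_spilled_rows:
                # The fix: move on to the next probe row immediately.
                continue
            # Without the fix, execution falls through to the regular
            # processing below, where the lookup finds no match.
        if key not in build_keys_in_memory:
            # Left anti join emits probe rows that have no build match.
            output.append(key)
    return output, spilled


# Partition 1 (odd keys) is spilled; only build key 0 is in memory.
fixed = probe_left_anti([0, 1, 2, 3], {0}, {1}, num_partitions=2)
buggy = probe_left_anti([0, 1, 2, 3], {0}, {1}, num_partitions=2,
                        skip_spilled_rows=False)
```

In this model the fixed call emits only key 2, while the call with `skip_spilled_rows=False` also emits the spilled keys 1 and 3 before their partition has been joined, which mirrors the incorrect output described in the commit message.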