This patch changes the Parquet scanner to check whether it can read the
full footer scan range; a short read indicates that the file has been
overwritten by a shorter file without the table metadata being
refreshed. Previously this would hit a DCHECK. This patch also adds a
test for this case, as well as the case
where the new file is longer than the metadata states (which fails
with an existing error).
Change-Id: Ie2031ac2dc90e4f2573bd3ca8a3709db60424f07
Reviewed-on: http://gerrit.cloudera.org:8080/1084
Tested-by: Internal Jenkins
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
We weren't actually returning after detecting the error and would try
to allocate an array larger than INT_MAX. I tested by hand that a very
large array (200M elements) fails to be allocated and that the
appropriate error is returned to the shell, and added an
ArrayValueBuilder unit test.
Change-Id: Iedc3b3ca8c9c07100d6602f8e8cc9cfd57151747
Reviewed-on: http://gerrit.cloudera.org:8080/886
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
Since MemPool::Allocate() takes an int, INT_MAX is the largest
possible requested allocation. This and slightly lower values would
cause the 'num_bytes' variable in the private Allocate() function to
overflow, which would yield a valid pointer to a buffer too small to
hold the requested number of bytes. This patch fixes this problem by
making 'num_bytes' an int64_t, and adds a test for this behavior.
This may help some problems related to IMPALA-1619, although it's not
quite the same since MemPool was never limited to 1GB, only FreePool.
Change-Id: Ia8040d87d3be32896944b5d44ff0c1f2667837e0
Reviewed-on: http://gerrit.cloudera.org:8080/1107
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
The code in resource-broker.cc that makes RPCs to Llama attempts to
retry the RPC a configurable number of times if the RPC returns a
failure. If the RPC throws (which Thrift may do), we try to reset the
connection and then make the RPC again, but this time the call is not
guarded by a try/catch block. If this second RPC throws, the process
crashes.
This fixes the issue by removing the try/catch and instead
using the ClientCache DoRpc function which handles this
already. Some additional Llama RPC calling wrappers were
removed as well.
Change-Id: Iba5add47a77fe9257e73eea5711ef4b948abe76a
Reviewed-on: http://gerrit.cloudera.org:8080/881
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
Due to IMPALA-1619, allocating a StringBuffer larger than 1GB could
cause Impala to crash. Check the requested buffer size in advance and
fail the request if it is larger than 1GB. Once IMPALA-1619 is
fixed, we should revert this change.
Change-Id: Iffd1e701614b520ce58922ada2400386661eedb1
(cherry picked from commit 74ba16770eeade36ab77c86ed99d9248c60b0131)
Reviewed-on: http://gerrit.cloudera.org:8080/869
Reviewed-by: Juan Yu <jyu@cloudera.com>
Tested-by: Internal Jenkins
This commit fixes the issue where the impalad will crash if the sort
node can't get a new block from the buffered block manager. The fix
removes the DCHECKs after the calls to BufferedBlockMgr::GetNewBlock()
and returns a proper OOM error instead. It also ensures that the callers
of Sorter::Run::Init() don't ignore the returned status (IMPALA-2435).
Change-Id: I611f173fac3add770988e9d4aaa48efc4229fbd6
Reviewed-on: http://gerrit.cloudera.org:8080/976
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Internal Jenkins
There were several possible races in DataStreamRecvr where the
RuntimeState for the query was torn down while a thread was waiting in
DataStreamRecvr. This is possible when waiting on a condition variable
because the lock is released. After the thread wakes up from the
condition variable, it can crash by updating a counter on a
RuntimeProfile that no longer exists.
To address the problem this patch adds CANCEL_SAFE_SCOPED_TIMER, a
variant of SCOPED_TIMER that checks a boolean variable for cancellation
before updating the RuntimeProfile. This should be used instead of
SCOPED_TIMER anywhere that it is possible that the RuntimeProfile will
be torn down while the timer is active.
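A minimal sketch of the idea behind such a timer (hypothetical names,
not the actual Impala macro): the destructor consults a cancellation
flag before touching the counter, so a torn-down profile is never
updated.

  #include <atomic>
  #include <chrono>

  // Hypothetical stand-ins for RuntimeProfile::Counter and the macro body.
  struct Counter { std::atomic<long> value{0}; };

  class CancelSafeScopedTimer {
   public:
    CancelSafeScopedTimer(Counter* counter, const bool* is_cancelled)
      : counter_(counter), is_cancelled_(is_cancelled),
        start_(std::chrono::steady_clock::now()) {}
    ~CancelSafeScopedTimer() {
      // Skip the update if the query was cancelled and the profile may be gone.
      if (is_cancelled_ != nullptr && *is_cancelled_) return;
      auto elapsed = std::chrono::steady_clock::now() - start_;
      counter_->value +=
          std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count();
    }
   private:
    Counter* counter_;
    const bool* is_cancelled_;
    std::chrono::steady_clock::time_point start_;
  };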
Change-Id: Ib4339090d3dfb097e4c160a21b470f00b9c44bbf
Reviewed-on: http://gerrit.cloudera.org:8080/1061
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
When the coordinator prints the 'backend number' of
fragments that are finished or result in an error, the
hostname associated with that backend is also printed.
Change-Id: I0b27549bd9155ab9b077933ab6f621f4f0887371
Reviewed-on: http://gerrit.cloudera.org:8080/912
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: Internal Jenkins
When running certain heavy workloads it was observed that a
'RuntimeState' member variable was being accessed in WriteComplete()
on a RequestContext::Cancel() path after that RuntimeState object was
destroyed. This is a temporary fix to ensure that the destroyed member
variable is not accessed if the write status is cancelled.
A test is not included as it is not deterministically reproducible.
Change-Id: I8a55c070d25f0ca5c830a955e84df450061753a3
Reviewed-on: http://gerrit.cloudera.org:8080/897
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Internal Jenkins
Before this patch, a common pattern was:
Status status = Status::MemLimitExceeded();
status.SetErrorMsg(<custom error msg>);
state_->SetMemLimitExceeded();
return status;
This could cause the custom error message to be dropped, since
RuntimeState::SetMemLimitExceeded() sets query_status_ to the generic
"Memory limit exceeded" error, which then prevents query_status_ from
being set to the custom error status. (The custom error message is
often logged in the runtime state, but not always.)
This patch has RuntimeState::SetMemLimitExceeded() take an optional
ErrorMsg argument which is used to construct the new query status, and
changes existing uses of SetErrorMsg() + SetMemLimitExceeded() to use
this new functionality.
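As an illustration only (hypothetical stand-in types, not the actual
RuntimeState/Status classes), the new shape of the call lets the custom
message become the query status instead of being overwritten by the
generic one:

  #include <string>

  // Minimal stand-ins for illustration.
  struct ErrorMsg { std::string msg; };
  struct Status {
    std::string msg;
    static Status MemLimitExceeded(const std::string& m = "Memory limit exceeded") {
      return Status{m};
    }
  };

  struct RuntimeState {
    Status query_status;
    bool has_query_status = false;
    // New overload: an optional custom message constructs the query status.
    Status SetMemLimitExceeded(const ErrorMsg* custom = nullptr) {
      Status s = Status::MemLimitExceeded(
          custom != nullptr ? custom->msg : "Memory limit exceeded");
      if (!has_query_status) { query_status = s; has_query_status = true; }
      return s;
    }
  };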
Change-Id: I9fe20da0bcc2cf01f2fd1fe29ae32c1a00708da1
Reviewed-on: http://gerrit.cloudera.org:8080/885
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
The bug: Several places in the sort assumed that the presence of var-len
slots in the tuple to be sorted implied having var-len blocks.
However, if all var-len slots are NULL or empty, then we can have no
var-len blocks. This case is now properly handled.
Change-Id: Ia3ad3669313e9d494ce2472af7775febfa6f247c
Reviewed-on: http://gerrit.cloudera.org:8080/913
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
In some cases the planner generated plans with rows with no
materialized tuples. Recent changes to the backend caused these to
hit a DCHECK. This patch addresses one case in the planner where it
was possible to create such plans: when the planner generated an
empty node from a select subquery with no from clause. The fix is to
create a materialized tuple based on the select list expressions, in
the same way as we handle these selects when the planner cannot
statically determine they have no result rows.
An example query is included as a test.
It also adds additional checks to the frontend and backend to catch
these invalid rows earlier.
Change-Id: I851f2fb5d389471d0bb764cb85f3c49031a075e4
Reviewed-on: http://gerrit.cloudera.org:8080/911
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
This patch avoids a few unnecessary memory allocations, locks and atomics
that can become a bottleneck when executing subplans.
On the following benchmark the end-to-end runtime of a subplan-heavy
query was improved from 37s to 27s.
Benchmark:
select count(*) from huge_customer c, c.c_orders o, o.o_lineitems
The huge_customer table had 48 files totalling 6.5GB. The table was created
by copying the files of tpch_nested_parquet.customer several times into the
huge_customer table. I ran the benchmark with a single impalad.
There are still several easy opportunities for improving the performance
of subplan execution.
Change-Id: I9fce1c2857a8f8e6ed3f1b4842d07fd80c11296a
Reviewed-on: http://gerrit.cloudera.org:8080/894
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
Fixes some logic in the buffered-block-mgr that we believe
is wrong, and leads to the following DCHECK failing:
Check failed: client->num_tmp_reserved_buffers_ == 0
(buffered-block-mgr.cc:259)
It may be possible for other issues to exist in the
accounting for num_tmp_reserved_buffers_, but this fix seems
to work as evidenced by the stress tests running without
hitting this DCHECK.
Change-Id: Ic6415afc6722461dc57f763a46e3abbc3aa09af6
Reviewed-on: http://gerrit.cloudera.org:8080/880
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
This is the first step to fix issues with large memory allocations. In
this patch, the built-in `group_concat` is no longer allowed to allocate
arbitrarily large strings and crash Impala, but is limited to the upper
bound of possible allocations in Impala.
This patch does not perform any functional change, but rather avoids
unnecessary crashes. However, it changes the parameter type of
FindChunk() in MemPool to be a signed 64-bit integer. This change allows
the MemPool to internally allocate memory of more than 1GB, but the
public interface of Allocate() is not changed, so the general limitation
remains. The reason for this change is as follows:
1) In a UDF FunctionContext::Reallocate() would allocate slightly more
than 512MB from the FreePool.
2) The FreePool tries to double this size to allocate 1GB from the
MemPool.
3) The MemPool doubles the size again and overflows the signed 32-bit
integer in the FindChunk() method. This will then only allocate 1GB
instead of the expected 2GB.
What happens is that one of the callers expected a larger allocation
than actually happened, which will in turn lead to memory corruption as
soon as the memory is accessed.
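For illustration (standalone arithmetic, not the actual MemPool/FreePool
code): the three-step doubling chain described above overflows a signed
32-bit size, which is why FindChunk() now takes an int64_t.

  #include <cstdint>
  #include <cstdio>

  int main() {
    int64_t udf_request = (512LL << 20) + 16;  // slightly more than 512MB (step 1)
    int64_t free_pool = udf_request * 2;       // FreePool doubles: ~1GB (step 2)
    // Step 3: MemPool doubles again. In 32-bit arithmetic the result wraps and
    // no longer matches the ~2GB the caller expects; in int64_t it is correct.
    int32_t chunk32 = static_cast<int32_t>(static_cast<uint32_t>(free_pool) * 2u);
    int64_t chunk64 = free_pool * 2;
    std::printf("32-bit chunk bytes: %d\n", chunk32);
    std::printf("64-bit chunk bytes: %lld\n", static_cast<long long>(chunk64));
    return 0;
  }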
Change-Id: I068835dfa0ac8f7538253d9fa5cfc3fb9d352f6a
Reviewed-on: http://gerrit.cloudera.org:8080/858
Tested-by: Internal Jenkins
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
A recent change that allocated a row batch's tuple pointers with malloc
rather than a MemPool meant that the tuple pointers were no longer
counted towards query or process memory limits. For some workloads the
number of row batches and therefore the aggregate size of the tuple
pointers can be significant, so it is important to correctly account for
the memory.
This change simply pairs malloc() and free() calls with matching calls
to MemTracker::Consume() and MemTracker::Release().
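A sketch of the pairing (hypothetical MemTracker stand-in, not the
Impala class itself): every malloc of the tuple-pointer array is matched
by Consume(), and the free by Release(), so the bytes count against the
query and process limits.

  #include <cstdint>
  #include <cstdlib>

  // Hypothetical stand-in for impala::MemTracker.
  struct MemTracker {
    int64_t consumed = 0;
    void Consume(int64_t bytes) { consumed += bytes; }
    void Release(int64_t bytes) { consumed -= bytes; }
  };

  struct Tuple;  // opaque

  Tuple** AllocateTuplePtrs(MemTracker* tracker, int capacity, int tuples_per_row) {
    int64_t bytes = static_cast<int64_t>(capacity) * tuples_per_row * sizeof(Tuple*);
    tracker->Consume(bytes);  // account for the bytes alongside the malloc
    return static_cast<Tuple**>(std::malloc(bytes));
  }

  void FreeTuplePtrs(MemTracker* tracker, Tuple** ptrs, int capacity, int tuples_per_row) {
    std::free(ptrs);
    tracker->Release(static_cast<int64_t>(capacity) * tuples_per_row * sizeof(Tuple*));
  }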
Change-Id: I7fb5f2844ad0d51f71a3701d8f0897d8f0b24e18
Reviewed-on: http://gerrit.cloudera.org:8080/895
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: Internal Jenkins
These tests were disabled because they relied on blacklisting, which
was disabled. It is still useful to have tests to exercise the error
handling code and ensure that scratch directories are used or not
used when they should be.
Change-Id: I89195a9fdd7ed858ae2addce93413871cffaf29b
Reviewed-on: http://gerrit.cloudera.org:8080/846
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
PAGG and PHJ were using an all-or-nothing approach to spilling. In
particular, they were trying to switch to IO-sized buffers for both
streams (aggregated and unaggregated in PAGG; build and probe in PHJ)
of every partition (currently 16 partitions for a total of 32
streams), even if some of the streams had very few rows, were empty,
or simply would not spill, so there was no need to allocate IO-buffers
for them. That increased the minimum memory needed by those operators
in many queries.
This patch decouples the decision to switch to IO-buffers for each
stream of each partition. Streams will switch to IO-sized buffers
whenever the rows they contain do not fit in the first two small
buffers (64KB and 512KB respectively). When we decide to spill a
partition, we switch both of its streams to IO buffers.
With this change many streams of PAGG and PHJ nodes do not need to
use IO-sized buffers, reducing the minimum memory requirement. For
example, below is the minimum memory needed (in MB) for some of the
TPC-H queries. Some now need half or less of the memory they needed
before:
TPC-H Q3: 645 -> 240
TPC-H Q5: 375 -> 245
TPC-H Q7: 685 -> 265
TPC-H Q8: 740 -> 250
TPC-H Q9: 650 -> 400
TPC-H Q18: 1100 -> 425
TPC-H Q20: 420 -> 250
TPC-H Q21: 975 -> 620
To make this small-buffer optimization work, we had to fix
IMPALA-2352. That is, the AllocateRow() call of
PAGG::ConstructIntermediateTuple() could return unsuccessfully just
because the small buffers of the stream were exhausted. Previously we
would treat that as an indication that there was no memory left, start
spilling a partition, and switch all streams to IO-buffers. Now we
make a best effort: we first try SwitchToIoBuffers() and, if that is
successful, we re-attempt the AllocateRow() call. See IMPALA-2352 for
more details.
Another change is that now SwitchToIoBuffers() will reset the flag
using_small_buffers_ back to false, in case we are in a very low
memory situation and it fails to get a buffer. That allows us to
retry calling SwitchToIoBuffers() once we free up some space. See
IMPALA-2330 for more details.
With the above fixes we should also have fixed IMPALA-2241 and
IMPALA-2271, which are essentially stream::using_small_buffers_-related
DCHECKs.
This patch adds all 22 TPC-H queries to the test_mem_usage_scaling
test and updates the per-query minimum memory limits in it.
Additionally, it adds
a new aggregation test that uses the TPC-H dataset for larger
aggregations (TestTPCHAggregationQueries). It also removes some
dead test code.
Change-Id: Ia8ccd0b76f6d37562be21fd4539aedbc2a864d38
Reviewed-on: http://gerrit.cloudera.org:8080/818
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: Internal Jenkins
Conflicts:
tests/query_test/test_aggregation.py
DiskIoMgr did not correctly check the return code of fread(). As a
result, if fread() returned an error, DiskIoMgr would think it had
succeeded and filled the buffer with valid data, whereas in reality the
buffer would be full of garbage data and an error had occurred.
This change detects and reports such errors correctly.
Added a test for the EOF case that failed before the fix and now passes.
The error case is difficult to test without modifying DiskIoMgr or
injecting faults at the filesystem level.
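A standalone sketch of the kind of check that was missing (names
hypothetical, not the DiskIoMgr code): compare fread's return value
against the requested count and distinguish EOF from a real error
before treating the buffer as valid.

  #include <cstdio>

  // Returns 0 on success, -1 on error; *bytes_read may be short only at EOF.
  int ReadChunk(FILE* file, char* buffer, size_t requested, size_t* bytes_read) {
    *bytes_read = fread(buffer, 1, requested, file);
    if (*bytes_read < requested) {
      if (ferror(file)) return -1;  // real I/O error: buffer contents are not valid
      // feof(file): legitimate short read at end of file.
    }
    return 0;
  }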
Change-Id: Ic4822183a9cd228da670b5fd130e34ca875e8c80
Reviewed-on: http://gerrit.cloudera.org:8080/882
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
The buffered tuple stream tests wrote out the memory for integer tuples
as a contiguous array of 4-byte integers, neglecting null indicators.
The same bad assumption was present in both the read and write code,
so the test passed in many circumstances. However, buffer overruns were
sometimes detected in the ASAN build. This bug was present for a long
time but a recent change made the ASAN build fail consistently.
IMPALA-1688, an unexplained failure in buffered-tuple-stream-test may
have the same cause.
This fix also changes the integers written out so that they are not all
small positive integers with many zero bytes.
Change-Id: I4a158751e8a9c934c912831032e83ec85056f06e
Reviewed-on: http://gerrit.cloudera.org:8080/876
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
RowBatch::DeepCopyTo doesn't deduplicate tuples when copying. If not
used carefully, this could lead to us producing oversized row batches
(similar to the problem solved by dedup in RowBatch::Serialize).
Currently this could only occur if the build-side child of a nested loop
join both produces batches with many duplicate tuples and sets
MarkNeedToReturn.
Also fix error in comment.
Change-Id: Ib59c92f42ee4d491c9a9d55e0b125af90f2b1d48
Reviewed-on: http://gerrit.cloudera.org:8080/874
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
Normally, an error Status logs a backtrace, but this doesn't happen for
MEM_LIMIT_EXCEEDED because these were copied from a global status.
Instead, construct them on the fly so that we get the normal Status
backtrace logging. Also, remove some special case backtrace logging
that is now redundant from buffered-block-mgr.cc.
Primary motivation for this is to help debug IMPALA-2327, where it
appears a MEM_LIMIT_EXCEEDED status is dropped. But I think this will
be generally useful for debugging problems that happen after
MEM_LIMIT_EXCEEDED.
Testing: Run test_mem_scaling.py and see all "Memory limit exceeded"
now have backtraces.
Change-Id: I4cd04426e63397c24d3e16faa33caafc2a608c0c
Reviewed-on: http://gerrit.cloudera.org:8080/872
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
There was a bug with projecting collection-typed slots in the UnnestNode by setting
them to NULL. The problem was that the same tuple/slot could be referenced by multiple
input rows. As a result, all unnests after the first that operate on the same collection
value would incorrectly return an empty row batch because the slot had been set to NULL
by the first unnesting.
The fix is to ignore the null bit when retrieving a collection-typed slot's value in
the UnnestNode. We still set the null bit after retrieving the value for projection.
This solution purposely ignores the conventional NULL semantics of slots. It is a
temporary hack which must be removed eventually.
We rely on the producer of collection-typed slot values (the scan node) to write an empty
array value into such slots when they are NULL, in addition to setting the null bit.
Change-Id: Ie6dc671b3d031f1dfe4d95090b1b6987c2c974da
Reviewed-on: http://gerrit.cloudera.org:8080/859
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
The NLJ node did not follow the expected protocol when need_to_return
is set on a row batch, which means that memory referenced by a rowbatch
can be freed or reused the next time GetNext() is called on the child.
This patch changes the NLJ node to follow the protocol by deep copying
all build side row batches when the need_to_return_ flag is set on the
row batches. This prevents the row batches from referencing memory that
may be freed or reused.
Re-enable the test that was disabled because of IMPALA-2332, since
this was the root cause.
Change-Id: Idcbb8df12c292b9e2b243e1cef5bdfc1366898d1
Reviewed-on: http://gerrit.cloudera.org:8080/810
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
This patch provides a temporary work-around for IMPALA-2344. In case
block->Pin() fails in Sorter::Run::GetNext() we will fail the query
with OOM instead of DCHECK'ing. This work-around is similar to those
we did for IMPALA-1590 and IMPALA-1868. When we rework the buffer
management those failures should not happen.
This patch also has some minor formatting changes. More importantly
it adds a DCHECK in BufferedBlockMgr::FindBufferForBlock() that
ensures we do not call PinBuffer() on an already pinned block.
Change-Id: I5a43302b807972e39f4f6c98ec5ecb1eee5fe056
Reviewed-on: http://gerrit.cloudera.org:8080/861
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: Internal Jenkins
The bug was a simple oversight: we copied the array data but forgot
to update the pointer of the corresponding ArrayValue.
Change-Id: Ib6ec0380f66194efc7ea3eb989535652eb8b526f
Reviewed-on: http://gerrit.cloudera.org:8080/855
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
The sorter does not currently support sorting tuples with collection
slots because the necessary deep copy logic is not implemented.
Fortunately, projection should ensure that all array values that reach
the sorter have been set to null. This patch adds DCHECKs to ensure
that this is the case. Variables are also renamed to reflect that with
nested types string values are a subset of variable-length values.
Change-Id: If617abe678903c69d12d1c65062c8063ae137296
Reviewed-on: http://gerrit.cloudera.org:8080/844
Tested-by: Internal Jenkins
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
The dedup logic in row batch serialisation incorrectly assumed that two
distinct tuples must have two distinct memory addresses. This is not
true if one tuple has zero length.
Update the serialisation logic to check for this case and insert a
NULL.
Adds a unit test that exercises this bug prior to the fix and a query
test that also hit a DCHECK prior to the fix.
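A minimal sketch of one way to express the corrected assumption
(hypothetical helper, not the actual serialisation code): a repeated
pointer only counts as a duplicate if the tuple actually occupies
bytes, since distinct zero-length tuples can legally share an address.

  #include <cstdint>
  #include <unordered_map>

  struct Tuple;  // opaque

  // Returns the previously assigned index if this tuple is a genuine
  // duplicate, or -1 if it must be serialised on its own.
  int FindDuplicate(std::unordered_map<Tuple*, int>* seen, Tuple* tuple,
                    int64_t byte_size, int next_index) {
    if (tuple == nullptr || byte_size == 0) return -1;  // never dedup zero-byte tuples
    auto it = seen->find(tuple);
    if (it != seen->end()) return it->second;
    (*seen)[tuple] = next_index;
    return -1;
  }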
Change-Id: If163274b3a6c10f8ac6b6bc90eee9ec95830b7dd
Reviewed-on: http://gerrit.cloudera.org:8080/849
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: Internal Jenkins
I discovered this while reading through the code to help with an Impala issue. The
problem is that in the cancellation path, EnqueueBuffer() will return the buffer
descriptor to the cache, which is effectively the same as freeing it. The caller
would then try to read some state from it and the state can be corrupt. From a conceptual
point of view, EnqueueBuffer() represents a memory hand-off so this behavior makes
sense. The caller just needs to store the state it needs before EnqueueBuffer().
This fix is similar to that for IMPALA-2101.
Change-Id: I2deab4923d1324fd275b8dd4ad37eedb6d39922c
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/8023
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
Reviewed-on: http://gerrit.cloudera.org:8080/834
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Internal Jenkins
This DCHECK was overly strict - this code path is correct for any
variable-length string type, not just STRING.
Change-Id: I724ea4c9d97056dc1e977124dd9ec0f41cf7a2ce
Reviewed-on: http://gerrit.cloudera.org:8080/840
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
This DCHECK condition was overly strict - a non-nullable tuple pointer
can be NULL if the tuple is zero bytes long (since there is no memory
backing the tuple).
Adds a test query that hit the DCHECK.
Change-Id: I16e8bc0db747b83c239de931a4fc9677d5c85ae6
Reviewed-on: http://gerrit.cloudera.org:8080/836
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
This correctly initialises all null indicators in RowBatchSerialize test
to not-NULL (i.e. 0 bytes). Previously they were not initialised, so the
tests only passed if the memory happened to be filled with zeroes.
Change-Id: I12edad82ebb8cd7a2fb91b6686555b82f9e133cf
Reviewed-on: http://gerrit.cloudera.org:8080/825
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
The hash join and tuple stream code did not correctly handle joins
whose right side had very high cardinality but whose tuples had zero
footprint. Any such join with more than 16M tuples on the right side
would crash. In particular, if the tuple footprint is zero, an
infinite number of rows fit in one block. But with the old way of
iterating over the rows of the stream, we would increment the idx by 1
to get the next "row", eventually overflowing and hitting a DCHECK.
A second problem was the calculation of the size of the hash table
when the footprint of the tuples is zero. In that case a hash table of
minimum size would suffice. Instead we would try to create a very
large hash table to fit the large number of tuples, resulting in OOM
errors.
This patch fixes both problems with specific calculations of the next
idx in the stream and of the size of the hash table when the stream
contains tuples with zero footprint.
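A standalone sketch of the two calculations (hypothetical helpers, not
the BufferedTupleStream/HashTable code): both the per-block row
accounting and the hash-table sizing need an explicit branch for a
zero-byte row footprint.

  #include <algorithm>
  #include <cstdint>

  // Rows addressable in one block: with a zero-byte footprint, dividing the
  // block size by the row size is meaningless; use the recorded row count so
  // the per-block index cannot grow without bound.
  int64_t RowsInBlock(int64_t block_bytes, int64_t row_byte_size,
                      int64_t recorded_num_rows) {
    if (row_byte_size == 0) return recorded_num_rows;
    return std::min(recorded_num_rows, block_bytes / row_byte_size);
  }

  // Hash table sizing: zero-footprint tuples all hash and compare the same, so
  // a minimum-sized table suffices regardless of the row count.
  int64_t HashTableBuckets(int64_t num_rows, int64_t row_byte_size,
                           int64_t min_buckets) {
    if (row_byte_size == 0) return min_buckets;
    return std::max(min_buckets, num_rows * 2);  // e.g. target 50% fill
  }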
Change-Id: I12469b9c63581fcbc78c87200de7797eac3428c9
Reviewed-on: http://gerrit.cloudera.org:8080/811
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: Internal Jenkins
This patch makes the ownership of the memory backing the tuple pointers of
a RowBatch dependent on whether the legacy joins and aggs are enabled:
By default, the memory is malloc'd and owned by the RowBatch:
If enable_partitioned_hash_join=true and enable_partitioned_aggregation=true
then the memory is owned by the RowBatch and is freed upon its destruction.
This mode is more performant especially with SubplanNodes in the ExecNode tree
because the tuple pointers are not transferred and do not have to be re-created
in every Reset().
Memory is allocated from MemPool:
Otherwise, the memory is allocated from the RowBatch's tuple pool. As a result,
the pointer memory is transferred just like tuple data, and must be re-created
in Reset(). This mode is required for the legacy join and agg which rely on the
tuple pointers being allocated from the RowBatch's tuple pool, so they can
acquire ownership of the tuple pointers.
Performance impact for nested types:
Initial cluster runs and profiling on nested TPCH identified excessive
malloc/frees as a major performance bottleneck. This change paves the way
for further optimizations which yielded a 2x improvement in response time
for most nested TPCH queries.
Change-Id: I4ac58b18058ce46b4db89fbe117b0bcad19e9ee7
Reviewed-on: http://gerrit.cloudera.org:8080/807
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
In some cases in the NLJ node eos_ wasn't set even though the limit was
reached. This prevented the limit from being handled correctly before
returning rows to the caller of GetNext(). This could result in either
too many rows being returned, or a crash when the row batch size was
set to an invalid negative number.
The fix is to always check for whether the limit was reached before
returning from GetNext().
Change-Id: I660e774787870213ada9f2d3e6f10953d9937022
Reviewed-on: http://gerrit.cloudera.org:8080/797
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
Rename some functions and consistently skip the variable length code
path when there are no variable length slots.
Change-Id: I2f3405fcc5f545b207fa48e17f37fe968208d94c
Reviewed-on: http://gerrit.cloudera.org:8080/773
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
Collection-typed slots are expensive to copy, e.g., during data
exchanges or when writing into a buffered-tuple-stream. Even worse,
such slots could be duplicated many times after unnesting in a
subplan. To alleviate this problem, this patch implements a
poor man's projection where collection-typed slots are set to NULL
inside the SubplanNode that flattens them.
The FE guarantees that the contents of an array-typed slot are never
referenced outside of the single UnnestNode that accesses them, so when
returning eos in UnnestNode::GetNext() we also set the unnested array
slot to NULL to avoid those expensive copies in downstream exec nodes.
The FE provides that guarantee by creating a new slot in the parent
scan for every relative CollectionTableRef. For example, for a table
't' with a collection-typed column 'c' the following query would have
two separate slots in the tuple of 't', one for 'c1' and one for 'c2':
select * from t, t.c c1, t.c c2
Change-Id: I90e5b86463019c9ed810c299945c831c744ff563
Reviewed-on: http://gerrit.cloudera.org:8080/763
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
The race, described below, can lead to:
DCHECK: query_to_block_mgrs_.find(query_id_) != query_to_block_mgrs_.end()
Another fragment may have called Create() for this query_id_ and
saw that this BufferedBlockMgr is being destructed. That fragment will
overwrite the map entry for query_id_, pointing it to a different
BufferedBlockMgr object. We should let that object's destructor remove the
entry. On the other hand, if the second BufferedBlockMgr is destructed before
this thread acquires the lock, then we'll remove the entry (because we can't
distinguish between the two expired pointers), and when the other
~BufferedBlockMgr() call occurs, it won't find an entry for this query_id_.
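A minimal sketch of the destructor-side check (hypothetical names and a
string key as a stand-in for the query id, not the actual
BufferedBlockMgr code): only erase the map entry if no newer live
manager has taken over the slot.

  #include <map>
  #include <memory>
  #include <mutex>
  #include <string>

  struct BlockMgr;  // stand-in for BufferedBlockMgr

  std::mutex registry_lock;
  std::map<std::string, std::weak_ptr<BlockMgr>> query_to_block_mgrs;

  // Called from ~BlockMgr(): remove our entry only if it does not refer to a
  // newer live BlockMgr registered under the same query id.
  void Unregister(const std::string& query_id) {
    std::lock_guard<std::mutex> l(registry_lock);
    auto it = query_to_block_mgrs.find(query_id);
    if (it == query_to_block_mgrs.end()) return;  // already replaced and erased
    std::shared_ptr<BlockMgr> current = it->second.lock();
    // A non-null result means a newer live BlockMgr owns this query_id; let
    // its destructor clean up the entry instead.
    if (current != nullptr) return;
    query_to_block_mgrs.erase(it);
  }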
Change-Id: I18a9d965e0689adc4ee9b837eef19a21e0683a64
Reviewed-on: http://gerrit.cloudera.org:8080/772
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Internal Jenkins
This was unimplemented and is used on some code paths. Arrays were not
properly copied into the BufferedTupleStream, potentially leaving stray
pointers to invalid or reused memory. Arrays are now correctly deep
copied. Includes a unit test that copies rows containing arrays in and
out of a BufferedTupleStream.
Also implement matching optimisation for deep copy in RowBatch.
Change-Id: I75d91a6b439450c5b47646b73bc528cfb8f7109b
Reviewed-on: http://gerrit.cloudera.org:8080/751
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
Sorter calls std::list::size() in multiple places. This is an O(n)
operation pre-C++11. std::deque is a good alternative here, since all
inserts and removals are at the beginning and end of the list.
This should give a small performance boost.
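For context (not the Sorter code itself), the change is simply the
container choice: pre-C++11 std::list::size() walks the list, while
std::deque::size() is O(1) and still supports cheap push/pop at both
ends, which matches the sorter's access pattern.

  #include <deque>

  struct Run;  // opaque sorted-run handle

  // Runs are added at the back and consumed from the front; size() is O(1).
  std::deque<Run*> sorted_runs;

  void AddRun(Run* run) { sorted_runs.push_back(run); }

  Run* NextRun() {
    if (sorted_runs.empty()) return nullptr;
    Run* run = sorted_runs.front();
    sorted_runs.pop_front();
    return run;
  }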
Change-Id: I1387299796742673c6ed03f6b47a0d41b88c2563
Reviewed-on: http://gerrit.cloudera.org:8080/768
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
Implement Tuple-to-Tuple DeepCopy for collections. Add query test
that uses the TOP-N node, which deep copies tuples in this way.
Confirmed that the query test failed before this fix.
Change-Id: I3fea860d8251038d7b5eb85c77973939abe9dbf8
Reviewed-on: http://gerrit.cloudera.org:8080/757
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
Blacklisting appears to be causing some tests to fail (perhaps because
of a write error encountered by another test). For now, disable
blacklisting.
Change-Id: I06265eac465f58a836b69b01bcb60a8f875a22cc
Reviewed-on: http://gerrit.cloudera.org:8080/760
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
Tmp devices are blacklisted when a write error is encountered for that
device. No more scratch space will be allocated on the blacklisted
device, based on the assumption that the device is likely to be
misconfigured or failing.
This patch does not attempt to recover the query that experienced the
write error. It also does not attempt to remap any existing blocks away
from the temporary device.
This behaviour is unit tested for several failure scenarios.
This patch adds additional test infrastructure required for testing
BufferedBlockMgr behavior in the presence of faults and in
configurations with multiple tmp directories.
Adds metrics tmp-file-mgr.active-scratch-dirs and
tmp-file-mgr.active-scratch-dirs.list that track the number and set of
active scratch dirs, and exposes them in the Impala web UI.
Change-Id: I9d80ed3a7afad6ff8e5d739b6ea2bc0949f16746
Reviewed-on: http://gerrit.cloudera.org:8080/579
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
This patch modifies the Parquet scanner to resolve nested schemas, and
read and materialize collection types. The high-level modification is
to create a CollectionColumnReader that recursively materializes map-
and array-type slots.
This patch also adds many tests, most of which query a new table
called complextypestbl. This table contains hand-generated data that
is meant to expose edge cases in the scanner. The tests mostly test
the scanner, with a few tests of other functionality (e.g. array
serialization).
I ran a local benchmark comparing this scanner code to the original
scanner code on an expanded version of tpch_parquet.lineitem with
48009720 rows. My benchmark involved selecting different numbers of
columns with a single scanner thread, and I looked at the HDFS scan
node time in the query profiles. This code introduces a 10%-20%
regression in single-threaded scan time.
Change-Id: Id27fb728934e8346444f61752c9278d8010e5f3a
Reviewed-on: http://gerrit.cloudera.org:8080/576
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
FnvHash64to32 produces pathologically bad results when hashing zero-byte
input: it always returns 0 regardless of the input hash seed. This
is a result of it xoring the 32-bit hash seed with itself. This patch
adds a DCHECK to verify that the function is never invoked with
zero-byte inputs, and updates all callsites to check for
the zero-length case.
This patch also improves hashing of booleans: false and NULL no longer
hash to the same value.
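An illustrative sketch (the hash body below is generic FNV-1a folded to
32 bits, not the exact codebase routine, and the helper names are
hypothetical): the hash asserts on zero-length input, and callers
special-case empty values so the seed is never folded against itself.

  #include <cassert>
  #include <cstdint>

  // Illustrative FNV-1a hash folded to 32 bits; not the codebase's FnvHash64to32.
  uint32_t FnvFold64to32(const void* data, int len, uint32_t seed) {
    assert(len > 0);  // mirrors the DCHECK added by this patch
    uint64_t hash = 0xcbf29ce484222325ULL ^ seed;
    const uint8_t* bytes = static_cast<const uint8_t*>(data);
    for (int i = 0; i < len; ++i) {
      hash = (hash ^ bytes[i]) * 0x100000001b3ULL;
    }
    return static_cast<uint32_t>(hash >> 32) ^ static_cast<uint32_t>(hash);
  }

  // Caller-side pattern: handle zero-length input before hashing.
  uint32_t HashBytes(const void* data, int len, uint32_t seed) {
    if (len == 0) return seed * 0x9e3779b1u;  // any fixed perturbation of the seed
    return FnvFold64to32(data, len, seed);
  }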
Change-Id: I6706f6ea167e5362d55351f7cc0c637c680a315d
Reviewed-on: http://gerrit.cloudera.org:8080/720
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Internal Jenkins
See comment in Descriptors.thrift for what the materialized path is.
Change-Id: I64d00cf1bc2edcbbed3b6cdd5e934c55fff70a49
Reviewed-on: http://gerrit.cloudera.org:8080/650
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Internal Jenkins
This patch extends the deduplication of tuples in row batches to work on
non-adjacent tuples. This deduplication requires an additional data
structure (a hash table) and adds additional performance overhead (up to
3x serialization time), so it is only enabled for row batches with
compositions that are likely to blow up due to non-adjacent duplication
of large tuples. This avoids performance regression in typical cases,
while preventing size blow-ups in problematic cases, such as joining
three streams of tuples, some of which may contain large
collections.
A test is included that ensures that adjacent deduplication is enabled.
The row batch serialize benchmark shows that deduplication does not regress
performance of serialization or deserialization.
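A minimal sketch of the non-adjacent dedup bookkeeping (hypothetical
names, not the RowBatch code): a hash map from tuple pointer to the
offset of its first serialised copy lets later occurrences be encoded
as back-references.

  #include <cstdint>
  #include <unordered_map>

  struct Tuple;  // opaque

  struct DedupState {
    std::unordered_map<const Tuple*, int64_t> first_offset;

    // Returns -1 if the tuple must be serialised at 'offset_if_new'; otherwise
    // the offset of the already-serialised copy to reference instead.
    int64_t Lookup(const Tuple* tuple, int64_t offset_if_new) {
      if (tuple == nullptr) return -1;
      auto inserted = first_offset.emplace(tuple, offset_if_new);
      return inserted.second ? -1 : inserted.first->second;
    }
  };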
Change-Id: I3c71ad567d1c972a0f417d19919c2b28891fb407
Reviewed-on: http://gerrit.cloudera.org:8080/573
Tested-by: Internal Jenkins
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>