impala

mirror of https://github.com/apache/impala.git synced 2026-01-07 09:02:19 -05:00

Author	SHA1	Message	Date
Ippokratis Pandis	e1ae5fe95a	IMPALA-1068: COMPUTE STATS should place -1 in #NULLs With IMPALA-1033 we disabled the counting of the number of NULLs in each column, and that gave a 2x speed-up in the computation. But erroneously the value 0 was being placed in the number of NULLs, instead of the correct -1 that indicates 'unknown'. Change-Id: Ib882eb2a87e7e2469f606081cb2881461b441a45 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3377 Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3378	2014-07-07 15:13:25 -07:00
Skye Wanderman-Milne	b572fe0af5	Remove unnecessary decimal casts for some builtins. For arithmetic ops, this is an optimization. The Add(decimal,decimal) already handles the cast as part of the operation. For binary predicates, the cast is bad and can lead to overflows. The decimal Compare() function has custom logic to not overflow. Change-Id: I9f5ad74ea89e9dfa5a3a40c1e07f7e9178bf1d52 (cherry picked from commit 6bffaa885542443ca559888d921853ecd194cbcb) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3414 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-07-03 21:32:51 -07:00
Skye Wanderman-Milne	dbae673715	Open and close exprs on partition key exprs in HdfsPartitionDescriptor Change-Id: I954cd54113b4fb0d65423850a3a4145791b36107 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3136 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit bf7af4dc7d5013b5d0f0f0797aba3c37f17c1fb6) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3395	2014-07-03 12:04:25 -07:00
ishaan	f262fcea64	Support utf-8 input and output in the shell Also add --strict_unicode option which controls whether invalid unicode code points should be ignored on input. Change-Id: Ice59d6dd3df4557ab3b1fc91d7ddc0e1bf03f1c7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3218 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-07-02 23:18:27 -07:00
Lenni Kuff	1d3267ef8b	Add NOTICE.txt file to Impala repo Change-Id: Ic1a1304d7425e4bc56daebf4418045889410d6a8 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3227 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com> (cherry picked from commit 8f6c6659883f5baaa2a576ae3163b20d7f11a7a1) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3387	2014-07-02 15:23:24 -07:00
Nong Li	274f97efc5	IMPALA-1066: Fix bad free in Min()/Max() of strings. Change-Id: If66844a88accdc369458ab92f033eef50775d69e Reviewed-on: http://gerrit.ent.cloudera.com:8080/3373 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-07-01 20:45:08 -07:00
Nong Li	f05e2a92af	IMPALA-1066: Build with -no-strict-alias. Change-Id: I2d9684b0d1f352cba27dff92273d93d60d8435c2 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3336 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3375 Reviewed-by: Nong Li <nong@cloudera.com>	2014-07-01 20:44:36 -07:00
Dimitris Tsirogiannis	cf782fe500	IMPALA-1065: Running explain on attached (TPC-DS) query throws IllegalStateException This commit resolves IMPALA-1065 where the explain statement of TPC-DS Q48 resulted in an IllegalStateException due to an overflow in the cardinality estimation of a cross join operator. The fix is to check if an overflow has occurred and reset the cardinality estimation to a valid value. Change-Id: I0e88fde07e7a5d86819af317e98bab7ac08d5a8a Reviewed-on: http://gerrit.ent.cloudera.com:8080/3346 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3366	2014-07-01 19:23:29 -07:00
Henry Robinson	dd4c1c32dc	Add optional RM reservation limit to memtrackers If RM and per-query memory limits were enabled at the same time, the per-query limit would be ignored if RM wanted to expand the memory allocation. This change adds an optional reservation limit to a memtracker. The original limit goes back to being a hard limit - i.e. any attempt to consume more than that amount results in failure. The RM reservation limit is the RM-allocated memory limit. If that is exceeded it triggers the ExpandRmReservation() method, which tries to retrieve more memory as long as the hard limit is observed. The net effect is that per-query memory limits have the intended, hard-limit effect, while the RM limits coexist nicely and can expand with more memory as required. At the same time, we change the precedence of various ways of suggesting an initial reservation size so that the user can change the reservation size via a query option (MEM_RESERVATION_SIZE). Change-Id: I41bfa4eb1336810a8a5946f6be3472111a052144 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3134 Tested-by: jenkins Reviewed-by: Henry Robinson <henry@cloudera.com>	2014-07-01 18:08:47 -07:00
Alex Behm	003da0ec59	IMPALA-1061: Fix resolution of implicit table aliases in views and star expressions. This patch cleans up registration and resolution of implicit table aliases as follows: During analysis, we register all legal aliases of table/view references and remember implicit table aliases that are ambiguous. When resolving table or column references we consider all legal table aliases. A table/view may have either one explicit alias or two implicit aliases. The implicit aliases are the fully-qualified and the unqualified table name. Within a single query, explicit and implicit aliases can be mixed as long as there are no clashes between explicit and fully-qualified implicit aliases, and there are no ambiguous references to implicit unqualified aliases. Change-Id: I5734539aa821d130882491ec628dae8128d22e2f Reviewed-on: http://gerrit.ent.cloudera.com:8080/3258 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3359	2014-07-01 17:50:21 -07:00
Skye Wanderman-Milne	f0fb28158b	FE changes to avoid shipping null-type expressions to the BE. Once the expr refactoring goes in, the BE will not be able to evaluate any TYPE_NULL exprs. This patch ensures that the FE casts all null literals and slot refs before they reach the BE. There are a bunch of places where we know the appropriate type and just weren't using it before. This patch also introduces a few notable hacks: * Serializing null SlotRefs and NullLiterals as boolean NullLiterals in case they weren't cast earlier. * Converting null SlotRefs to NullLiterals in uncheckedCastTo() since we don't need to read from the slot at all. This works, but we should consider adding a final pass that cleans up the plan tree and takes care of this. Change-Id: Ic2ee181139059553d7f2d0e17e9dacaee241df17 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3294 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit a8a67ebcad12956a8260b4ea4189afb7ffab4b68) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3361	2014-07-01 15:48:08 -07:00
Skye Wanderman-Milne	a5c85898e6	Fix StringFunctions::SubString() Without this patch, the returned StringValue's ptr would be before the input pointer if the 'pos' argument was < -input.len Change-Id: I7bd506f5d1119741a94817c34a017215b67cc26e Reviewed-on: http://gerrit.ent.cloudera.com:8080/3351 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins (cherry picked from commit bad40d2beceffaacc409e34041a00d3ffbabf201) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3360 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-07-01 15:24:39 -07:00
Victor Bittorf	3c388cd1dc	CDH-19918: fixed Moscow timezone conversion. Conversion from UTC to Moscow time was incorrect, this has been fixed. Change-Id: Ib2a1720424bffff4f09713bfb06b5046fb38c031 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3311 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins (cherry picked from commit 9ae067013daf5e2e3a1dca3b31758e87f95432d1) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3357	2014-07-01 13:49:53 -07:00
Victor Bittorf	140b1c8b95	Fixed UDF memory leak warning for STDDEV Change-Id: I8df3d28e9dc0f06819f6512c175b5dec4210a329 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3312 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins (cherry picked from commit 7f44fa68e2d06aa0166263a89a4eaecc21baaa25) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3358	2014-07-01 13:45:20 -07:00
Nong Li	9abca8321b	Fix result precision in decimal round/truncate/etc and overflow. Change-Id: I23840734fd5b7ab7404d94f6df05410b153354de Reviewed-on: http://gerrit.ent.cloudera.com:8080/3338 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-07-01 08:05:39 -07:00
Nong Li	3fe082d3c9	Add CASE decimal builtin. Change-Id: I007e7f319acd6a5bce739a08797d1d87ffc64472 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3275 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-07-01 08:05:28 -07:00
Nong Li	d0fe59fe95	Remove unnecessary include from udf dev library. Change-Id: I8bdc9474d817bf63a0908a0c8e4e7f754b4e0b33 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3331 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-07-01 08:05:09 -07:00
Matthew Jacobs	65c1a6f21e	Remove SOURCE keyword by parsing as an identifier and checking the value Reverts "IMPALA-1033: Remove SOURCE keyword; very common identifier" Change-Id: I3fcf6d02786e00287b564cff0a823d0c19504e7a	2014-06-30 16:47:47 -07:00
Dimitris Tsirogiannis	630d90392e	CDH-20089: Query planning failed in HdfsScanNode.evalBinaryPredicate This commit fixes issue CDH-20089 where an error is thrown when we have a binary predicate on a partition key that has no values. Change-Id: I3b5cefb4d7193045fc6fc5e94766589c2299b5b1 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3327 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3335	2014-06-30 15:05:31 -07:00
Alex Behm	7777fbff53	Clean up expr substitution and cloning. Before: The pre- and postconditions of expr substitution and cloning, in particular, their effect on the isAnalyzed_ flag were unclear and sometimes inconsistent e.g., some literal exprs set isAnalyzed_ to true in their c'tor. As a result, several places required ad-hoc solutions like Expr.unsetIsAnalyzed() and Expr.reanalyze(). This patch cleans up expr substitution and cloning, summarized as follows: Expr analysis: All exprs start out with isAnalyzed_ = false. The flag is set to true iff analyze() has been called on the expr. Expr.clone(): Creates a deep copy of an expr including all its analysis state. Expr.equals(): Comparison of expr trees ignores implicit casts. This simplifies expr substitution because un/analyzed exprs can be easily compared/substituted. ExprSubstitutionMap: When adding a mapping, the rhs expr must be analyzed to allow substitution across query blocks. There is no requirement on the lhs expr. Expr substitution: Substitution returns an analyzed clone of the original expr with exprs substituted. While performing the substitution, implicit casts and analysis state are removed such that the returned result has minimal implicit casts and types. There are two versions of substitute functions: one that throws exceptions and one that does not, because the caller may have different expectations on whether a substitution must succeed or not. Numeric literals: This patch combines IntLiteral and DecimalLiteral into a NumericLiteral. Its main benefit is that analyze() always produces the same type, even if the literal was implicitly cast and/or isAnalyzed was unset because of expr substitution. This was not the case before because an implicit cast could permanently turn an IntLiteral into a DecimalLiteral. There is no more need for unsetIsAnalyzed() or reanalyze(). Change-Id: I646110e3714cff8ae8d5a378c25a107dd43334b6 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3228 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3318	2014-06-30 10:18:26 -07:00
Alex Behm	96722da3fe	Fix misplaced comment in testfile. Change-Id: I55dc7d0e8e74a4f8c9a99e9601b2578ef6b0390d Reviewed-on: http://gerrit.ent.cloudera.com:8080/3303 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3317	2014-06-30 10:17:26 -07:00
Lenni Kuff	ad933ec765	Switch terminology of 'impersonated user' to 'delegated user' This is to help ensure naming is consistent across the platform and also avoid confusion with HS2 "impersonation" which is something very different. Change-Id: I48c1b76dff75b92b11ddc7aab0eb9a3a5d20e489 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3315 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins (cherry picked from commit 931f6a66c0d8dff25b746d127dc1f36e96b12f98) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3326	2014-06-28 20:46:06 -07:00
Dimitris Tsirogiannis	2aedf5fab4	Add missing ALTER TABLE statement in alltypesaggmultifiles table. The DDL statements for adding the partitions of alltypesaggmultifiles did not include an ALTER TABLE stmt for one of the partitions, thereby causing the planner tests to fail when test data were loaded from a snapshot. Change-Id: Id4b078cd334d816d6eb8eb15e5856189701a4bca Reviewed-on: http://gerrit.ent.cloudera.com:8080/3305 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3310	2014-06-27 18:00:09 -07:00
Nong Li	163750f170	Fix decimal multiply result precision off by 1. Change-Id: I860e0d13ee9bae7d3e180103a22fe7606a320b13 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3249 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-06-27 11:22:05 -07:00
Nong Li	3e31f81731	Fix index out of bounds with rtrim(). Change-Id: I8c420a45aacdb0ce8f6a83fa8cdf5e91b8ef1f77 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3268 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-06-27 11:22:00 -07:00
Nong Li	67e80b16e3	Add int96 to multiint benchmark. This was one idea to just cast to __int128_t as a poor man's int96. Unfortunately, it seems too slow: ~15x for add, ~10x for multi and 3x for divide compared to __int128_t. Change-Id: I06eb3fa3ac1edc2c174873a73a252a0165911b1c Reviewed-on: http://gerrit.ent.cloudera.com:8080/2433 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-06-27 11:21:54 -07:00
Nong Li	553395928e	Change logging level of thrift plan in plan fragment executor. VLOG(3) includes each row which is much less often useful than the serialized plan. Change-Id: I933188f046dafb51da9d06583697792113a9165a Reviewed-on: http://gerrit.ent.cloudera.com:8080/3289 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins	2014-06-27 11:21:47 -07:00
Skye Wanderman-Milne	5305b17121	IMPALA-1053: only log unsupported type warning once Change-Id: Ibb34e4632f87ac192bb58d4d6616b41e7dac53d2 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3140 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 51a1db1ceaa0a928f364f333c4351abefd90b2f8) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3297	2014-06-26 23:58:40 -07:00
Skye Wanderman-Milne	3a6d6b71cb	Fix NULL handling in ArithmeticExpr Before: if both operands to an arithmetic expression were null literals, we would set the operand types and return type to INT. This isn't correct for operators that don't support ints, e.g. divide (there's a separate integer division function), since the function signature wouldn't match the arithmetic expr's types. I think we didn't run into problems because the BE uses void*s everywhere, but I hit this when I switched the arithmetic functions to the UDF interface. In addition, some of the builtins were registered with the wrong return type. After: set the operand types to a type appropriate for the operator before we set the return type, meaning the return type gets assigned correctly using the existing logic. Change-Id: I39fa147c178d895bdffaf1be676ddaa3af1d42c8 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3255 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 2634932790d1f4a42ce64f73ec3722a8a7be04af) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3298	2014-06-26 23:52:02 -07:00
Skye Wanderman-Milne	6d17b93814	Open and close exprs in tests Change-Id: Ie4abc8e1e56fc77d68d9656260b8f4adcc2a36e9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3135 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit f7eafefa1051ac9f3e5649f45655b80223af5f29) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3296	2014-06-26 23:48:29 -07:00
Skye Wanderman-Milne	3a6600c964	Fix UDF test UDF invocations in udf.test should not specify a database. This is how we switch between testing IR UDFs in the ir_function_test database and native UDFs in the native_function_test database. Change-Id: I09ede18f2b91440ef7a2a76b0daf41a007af2671 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3130 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 4d6160c0b88285aea754f6353cdd02b5e4b15633) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3295	2014-06-26 22:17:56 -07:00
Paden Tomasello	d6a20c2f08	Rowbatch.cc uses LZ4 codec instead of Snappy codec Comparison of Exchange node data for Lz4 and Snappy running query: select * from tpch.lineitem order by l_orderkey Snappy: EXCHANGE_NODE (id=2):(Total: 36s021ms...) BytesReceived: 26.75 MB (28047762) DeserializeRowBatchTimer: 246.561ms Lz4: EXCHANGE_NODE (id=2):(Total: 34s699ms...) BytesReceived: 11.20 MB (11741118) DeserializeRowBatchTimer: 131.379ms Change-Id: Iae8d212ba0fd508542f3ef9ddaf7507426e13253 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3120 Reviewed-by: Paden Tomasello <paden.tomasello@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3252	2014-06-26 12:06:39 -07:00
Dimitris Tsirogiannis	6a795915d6	Fix loading data from snapshot for alltypesagg table. The alltypesagg table was not loaded correctly from a snapshot file due to a missing ALTER TABLE statement, thereby causing some tests to fail. Change-Id: I74066a99529f24fc268bb5779d3fb64fbd4f66b9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3248 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3270 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>	2014-06-25 21:52:11 -07:00
Dimitris Tsirogiannis	5a6f53db16	Add partition pruning tests The following changes are included in this commit: 1. Modified the alltypesagg table to include an additional partition key that has nulls. 2. Added a number of tests in hdfs.test that exercise the partition pruning logic (see IMPALA-887). 3. Modified all the tests that are affected by the change in alltypesagg. Change-Id: I1a769375aaa71273341522eb94490ba5e4c6f00d Reviewed-on: http://gerrit.ent.cloudera.com:8080/2874 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3236	2014-06-24 02:14:27 -07:00
Lenni Kuff	13f487ae31	CDH-19900: Change to make Hive/Impala privilege models consistent This makes two changes to the privilege model: * All CREATE statements now require ALL privileges on the parent object * The user should always be able to perform "use default". Additionally, it enables all of the authorization tests, and fixes a bug with new privilege format from Sentry, and corrects an issue where a role wasn't always being updated during an 'invalidate metadata' operation. Change-Id: I92bab4ee0455574a2785bb5483b6d05611c3dfdc Reviewed-on: http://gerrit.ent.cloudera.com:8080/3225 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-06-23 19:08:37 -07:00
Alex Behm	bf85225911	IMPALA-881: Tests for joins with union inputs. Change-Id: I4be6821ac3938345ca95c542d868c87512ff66da Reviewed-on: http://gerrit.ent.cloudera.com:8080/3229 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-06-23 15:38:06 -07:00
Skye Wanderman-Milne	bf8e1b81a0	Make sure QueryExecState::Wait() completes before fetching rows. We run Wait() asynchronously for API compatibility, but many QueryExecState functions cannot actually be run concurrently with Wait() (e.g., Wait() opens output_exprs_, which are then evaluated in FetchRows()). Change-Id: I708aa23fdb238ee7aede1113790f48da2859cab9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2993 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 47f20b643e80f0f8640be9264d7ee3fc5d14dad0) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3226	2014-06-23 11:40:08 -07:00
Henry Robinson	bac4f6c9c8	Properly account for all finished-with expansions Change-Id: I86819add942d13fcef3a9dab6977fcabe6cfdb4f Reviewed-on: http://gerrit.ent.cloudera.com:8080/3220 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins	2014-06-21 00:40:26 -07:00
Henry Robinson	2a374e5893	Prepare resource broker for cancellation changes This patch anticipates the changes to Llama that allow a client-specified resource ID to be returned with every reservation or expansion request. Doing this allows us to remove the tricky coordination logic between WaitForNotification() and AMNotification() when we don't know which side will access the rendezvous data structures first. Now we can guarantee that the consumer-side will be set-up before the notification is received. Change-Id: I908b1dae8d074a84b0465e3a444d6651f126efd7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3093 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins	2014-06-21 00:08:19 -07:00
Henry Robinson	7992e872c1	[CDH5] Upgrade Llama Change-Id: Ie91ba1bc55e02f7eb70c90ce1ed8ce1242fa553d Reviewed-on: http://gerrit.ent.cloudera.com:8080/3161 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins	2014-06-21 00:03:16 -07:00
Nong Li	a7beb12540	[CDH5] Fix column stats for decimal. Change-Id: I72b31f6431bf6259e759fd290200fd1a755f82c6	2014-06-20 23:03:06 -07:00
Nong Li	b72ef379b6	[CDH5] Update hive thirdparty. Change-Id: Ia13b2b2723ba0aae3e349f47d635e6d925f623eb	2014-06-20 23:03:05 -07:00
Srinath Shankar	7b81a0330c	Change units and naming of some counters used in sort Also differentiates between memory limit and memory used. Change-Id: Ic5534345830b1c3b5109697a93868eb5d40befda Reviewed-on: http://gerrit.ent.cloudera.com:8080/3219 Reviewed-by: Srinath Shankar <sshankar@cloudera.com> Tested-by: jenkins	2014-06-20 20:05:10 -07:00
Alex Behm	881f3a8c33	Re-order union operands descending by their estimated per-host memory. Re-order union operands descending by their estimated per-host memory, s.t. parent nodes can gauge the peak memory consumption of a MergeNode after opening it during execution (a MergeNode opens its first operand in Open()). Scan nodes are always ordered last because they can dynamically scale down their memory usage, whereas many other nodes cannot (e.g., joins, aggregations). One goal is to decrease the likelihood of a SortNode parent claiming too much memory in its Open(), possibly causing the mem limit to be hit when subsequent union operands are executed. Change-Id: Ia51caaffd55305ea3dbd2146cd55acc7da67f382 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3146 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Alex Behm <alex.behm@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/3213 Tested-by: jenkins	2014-06-20 18:46:10 -07:00
Victor Bittorf	2d7f2e19b2	IMPALA-938: Infer schema from Parquet file Syntax is "CREATE TABLE name LIKE fileformat '/path/to/file'". Supports all options that CREATE TABLE does. Currently only PARQUET is supported. Run testdata/bin/create-load-data.sh after pulling this patch. Change-Id: Ibb9fbb89dbde6acceb850b914c48d12f22b33f55 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2720 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3158	2014-06-20 17:38:01 -07:00
Taras Bobrovytsky	7faaa65996	Added order by query tests - Added static order by tests to test_queries.py and QueryTest/sort.test - test_order_by.py also contains tests with static queries that are run with multiple memory limits. - Added stress, scratch disk and failpoints tests - Incorporated Srinath's change that copied all order by with limit tests into the top-n.test file Extra time required: Serial: scratch disk: 42 seconds test queries sort : 77 seconds test sort: 56 seconds sort stress: 142 seconds TOTAL: 5 min 17 seconds Parallel(8 threads): scratch disk: 40 seconds test queries sort: 42 seconds test sort: 49 seconds sort stress: 93 seconds TOTAL: 3 min 44 sec Change-Id: Ic5716bcfabb5bb3053c6b9cebc9bfbbb9dc64a7c Reviewed-on: http://gerrit.ent.cloudera.com:8080/2820 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3205	2014-06-20 13:35:10 -07:00
ishaan	0d0614765d	Only use nproc to determine functional test concurrency when it's available in the os. Some operating systems don't ship with nproc, which causes impala-config.sh to fail. This change alleviates the problem by checking if nproc exists, and setting a reasonable default if it fails. Change-Id: Ic6e4d0fbce57eedc82163cfa17f71bdccbc38b51 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3208 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-06-20 12:52:08 -07:00
Dimitris Tsirogiannis	7dbd3a5860	IMPALA-1040: Reading a decimal partitioned column with invalid values This commit fixes IMPALA-1040 in which when an invalid value is inserted to a decimal partitioned column through hive it results in a non informative error message and in some cases in the associated table to disappear from Impala's catalog. The fix results in a more informative error message to always be thrown by Impala to indicate the insertion of an invalid partition key value. Change-Id: I2855ea69944e269fb7e02b3825f44e64352151e7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3062 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3200	2014-06-20 12:46:52 -07:00
Henry Robinson	df9c13dcbe	Fix memtracker instantiation when using FETCH_FIRST Change-Id: I47b614b3559880f428951b015291bee4f5af6c49 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3038 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins	2014-06-20 12:29:20 -07:00
Srinath Shankar	c4219929f9	Change memory allocation in buffered block manager and sorter The sorter and block manager currently allocate all of their memory up-front. This patch changes that so that memory is allocated as a run is built. Only the minimum number of blocks required are allocated up-front. Added a non-blocking TryExpand() call to the buffered block manager to allocate a new buffer and assign it to a block. The only place where this is invoked is when the sorter tries to extend a run. While there are other ways of doing this, this seemed like a minimally invasive change to make at this point. In the merge phase, the sorter does not try to allocate more buffers, but instead works with the buffers allocated up to that point. This is something that is pretty easy to change. Other changes include: a) There is no longer a max_available_buffers() in the block manager. Replaced by a combination of available_allocated_buffers() and TryExpand(). b) In WriteUnpinnedBlocks(), unallocated memory is taken into account to determine if blocks should be written out. c) The sorter uses a block to copy out sorted var-len data when unpinning the blocks in a run. This block is now allocated up-front. Conflicts: tests/query_test/test_sort.py Change-Id: Ifbb2ffd679a882afe8895f4785ec6d7c49c30b98 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3148 Reviewed-by: Srinath Shankar <sshankar@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3199	2014-06-20 09:57:13 -07:00
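
The union-operand reordering described in commit 881f3a8c33 above (operands descending by estimated per-host memory, scan nodes always last) can be illustrated with a minimal sketch. This is not Impala's planner code: the `Operand` class, its field names, and the sample estimates are hypothetical, and ordering scan nodes among themselves by descending estimate is an assumption the commit message does not state.

```python
from dataclasses import dataclass


@dataclass
class Operand:
    """Hypothetical stand-in for one union operand's plan root."""
    name: str
    per_host_mem_bytes: int  # hypothetical planner estimate
    is_scan: bool            # True if the operand is a plain scan node


def order_union_operands(operands):
    """Return operands with non-scans first, largest estimate first.

    Scan nodes go last because, per the commit message, they can scale
    down their memory usage dynamically. Ordering among the scans
    themselves is an assumption of this sketch.
    """
    # False sorts before True, so non-scan operands come first; the
    # negated estimate gives descending order within each group.
    return sorted(operands, key=lambda op: (op.is_scan, -op.per_host_mem_bytes))


ops = [
    Operand("scan_big", 500, True),
    Operand("agg", 100, False),
    Operand("join", 300, False),
    Operand("scan_small", 10, True),
]
ordered = order_union_operands(ops)
print([op.name for op in ordered])
```

With this ordering the operand opened first is the non-scan operand with the largest estimate, which is what lets a parent node gauge peak memory after Open().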

2565 Commits