impala

mirror of https://github.com/apache/impala.git synced 2025-12-30 21:02:41 -05:00

Author	SHA1	Message	Date
Taras Bobrovytsky	e94de02469	Added execution summary, modified benchmark to handle JSON - Added execution summary to the beeswax client and QueryResult - Modified report-benchmark-results to handle JSON and perform execution summary comparison between runs - Added comments to the new workload runner Change-Id: I9c3c5f2fdc5d8d1e70022c4077334bc44e3a2d1d Reviewed-on: http://gerrit.ent.cloudera.com:8080/3598 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: jenkins (cherry picked from commit fd0b1406be2511c202e02fa63af94fbbe5e18eee) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3618	2014-07-25 21:06:00 -07:00
ishaan	3bed0be1df	Refactor the performance framework and change its execution strategy. This patch introduces new abstractions and changes the way queries are run via the workload runner. A new class 'Workload' is introduced, which represents the notion of a workload in the performance framework (i.e., a set of query names mapped to query strings). The new workflow is: - run-workload acts as a driver. It accepts user parameters for which queries to run and their execution strategy. It generates workload objects and passes them to the workload-runner. - The workload runner takes a workload, its execution parameters and generates a set of test vectors over which the workload is run iteratively. - A workload is executed by initializing a QueryExecutor for each query being run in a test vector. The workload executor is then responsible for execution and gathering results. - The execution details of every query being executed are stored and returned to the driver (run-workload). Change-Id: Ia16360140d65e6733e534e823bc5d5614622ab5f Reviewed-on: http://gerrit.ent.cloudera.com:8080/3616 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: jenkins	2014-07-25 18:17:11 -07:00
Dan Hecht	1fee56cb26	IMPALA-1080: Implement "SET <query_option>" as SQL statement. Also add support for "SET", which returns a table of query options and their respective values. The front-end parses the option into a (key, value) pair and then the existing backend logic is used to set the option, or return the result sets. Change-Id: I40dbd98537e2a73bdd5b27d8b2575a2fe6f8295b Reviewed-on: http://gerrit.ent.cloudera.com:8080/3582 Reviewed-by: Daniel Hecht <dhecht@cloudera.com> Tested-by: jenkins (cherry picked from commit aa0f6a2fc1d3fe21f22cc7bc56887e1fdb02250b) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3614	2014-07-25 10:25:09 -07:00
Nong Li	cfa58a4567	Run test_rows_availability serially. Change-Id: Id87a209a614f889209456f8c0d9aedd8ad0e513f Reviewed-on: http://gerrit.ent.cloudera.com:8080/3565 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3584	2014-07-22 14:35:46 -07:00
Nong Li	7dc57aaa9e	Change buffered block mgr to support multiple clients. This patch does a few things: 1. Moves the buffer block mgr from the sorter to the runtime state. This is now one that is shared across the query fragment. The partitioned hash join and agg will use this as well. 2. Adds a Client interface to the block mgr. Each exec node is a different client and can reserve a minimum number of buffers. This avoids starvation. 3. Updates the BufferedBlockMgr interface for getting pinned blocks to collapse two existing APIs. Change-Id: Ibb31fbe480f3726048457f26e24a9e33f7201d86 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3504 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/3574	2014-07-22 12:45:37 -07:00
Nong Li	a25400c94e	Increase timeout in test_rows_availability to make sure query state is what we expect. Change-Id: Id4feebcc7b7cecb07555009219e6420e48a0c82b Reviewed-on: http://gerrit.ent.cloudera.com:8080/3534 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/3579	2014-07-22 12:12:13 -07:00
Nong Li	202d656ddc	Stop setting query state to EXCEPTION for non-exception cases. We were setting the state to exception on Cancel() all the time. We use the cancellation path as the normal cleanup path so this gets called even when the query went fine (e.g. UnregisterQuery calls Cancel()). We had already plumbed through a 'cause' argument to differentiate. Change-Id: Icf1091c165dec36d3dad7ce308367bbbc9edee4f Reviewed-on: http://gerrit.ent.cloudera.com:8080/3524 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3575	2014-07-22 04:08:28 -07:00
ishaan	c6f49bb8e3	Fix the query generator to work with python 2.6.x Change-Id: Ib7ca870f946d365cb7e026cf753c8f25795dcb06 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3138 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-07-21 20:05:50 -07:00
Abdullah Yousufi	6c1e272ef7	IMPALA-1059: Make backticking -d option argument idempotent There was an issue with the previous fix to IMPALA-1059 if the user tried to reconnect within the shell after having passed in a database via the -d option. The passed database would be doubly backticked. This makes the backticking of the argument idempotent. Change-Id: I6eaed997c2be73d8659a2a12046ce393b97ec82c Reviewed-on: http://gerrit.ent.cloudera.com:8080/3467 Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3502	2014-07-15 18:10:40 -07:00
Nong Li	188a0ea833	Rework structure of hash table. This patch does two things in preparation for external joins. The hash table used to contain a directory structure (buckets and nodes) both of which were contiguous. The nodes contained the tuple ptrs within it. This patch changes it so the nodes are not stored contiguously but allocated in pages. (this structure is dense and does not require random lookups by index). The bucket structure is still contiguous since we rely on the doubling property and random lookup by index. The second change is that the node's no longer store the tuple ptrs within them. This makes it easier to build the hash table ontop of existing data. Here's a quick benchmark doing a self join on tpch lineitem. Both build and probe times decreased a bit. Before: HASH_JOIN_NODE (id=2):(Total: 1s139ms, non-child: 985.939ms, % non-child: 86.50%) - BuildBuckets: 2.10M (2097152) - BuildRows: 6.00M (6001215) - BuildTime: 527.991ms - LeftChildRows: 6.00M (6001215) - LeftChildTime: 451.964ms - LoadFactor: 0.50 - RowsReturned: 30.01M (30012985) - RowsReturnedRate: 26.33 M/sec After: HASH_JOIN_NODE (id=2):(Total: 1s019ms, non-child: 835.350ms, % non-child: 81.97%) - BuildBuckets: 2.10M (2097152) - BuildRows: 6.00M (6001215) - BuildTime: 423.175ms - LeftChildRows: 6.00M (6001215) - LeftChildTime: 406.67ms - LoadFactor: 0.50 - RowsReturned: 30.01M (30012985) - RowsReturnedRate: 29.45 M/sec Change-Id: I79e209a24c24fb4f2f99574bcf187746fddadc06 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3245 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com>	2014-07-15 16:57:09 -07:00
Abdullah Yousufi	864ed53511	IMPALA-1059: Backtick argument passed to USE by shell -d option If not backticked, arguments such as parquet are interpreted as keywords, when it is possible a database by that name exists. This could have been avoided via single quotes around backticks: -d '`parquet`' Otherwise, -d `parquet` throws a commandline error. In interactive mode, backticks alone (ex. use `parquet`) will pass the name as an identifier rather than a keyword. Change-Id: I24b43eeeb6b4bfda5388165856788a20b64bc2ba Reviewed-on: http://gerrit.ent.cloudera.com:8080/3307 Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3500	2014-07-15 15:43:49 -07:00
Taras Bobrovytsky	568e851774	Added option to specify the scale factor for pytest This allows execution of tests on a cluster with multiple scale factors. For example: py.test <test file> --impalad <cluster ip>:21000 --scale_factor 300gb Change-Id: I5230a6ef354def44b984eab2ac8a01989b9a471c Reviewed-on: http://gerrit.ent.cloudera.com:8080/3051 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3215	2014-07-15 14:44:37 -07:00
Taras Bobrovytsky	8d6f8ff01c	run-workload should exit with a non-zero error code if a query fails and abort_on_error is true The exception raised by a child thread did not reach the main thread, so the script exited with 0 instead of 1. Change-Id: I09be9dc824386bf25a64af0323cbf78f6d006b91 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3081 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3214	2014-07-15 14:43:10 -07:00
Abdullah Yousufi	f4d1afe0ce	IMPALA-921: Change EXPLAIN_LEVEL value from 0 to 1 in impala-shell for SET command Change-Id: I2bfcefb5c8143d4cb4d74157c5309cd9445bac02 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3383 Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3499	2014-07-15 12:32:43 -07:00
Henry Robinson	9d0173c647	[CDH5] Disable ACL tests The tests pass every time locally (in a 60 minute run), but fail intermittently on our build machines. Change-Id: I62d5ea0df8c42728a538b29bd16006be3179bfd3 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3489 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-07-14 15:38:11 -07:00
Henry Robinson	ff32821c6b	[CDH5] Test to confirm that ACLs are inherited correctly on INSERT Change-Id: I781a6b7203c2e12b484162954abae51a6443bead Reviewed-on: http://gerrit.ent.cloudera.com:8080/3076 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-07-09 19:04:55 -07:00
ishaan	f262fcea64	Support utf-8 input and out in the shell Also add --strict_unicode option which controls whether invalid unicode code points should be ignored on input. Change-Id: Ice59d6dd3df4557ab3b1fc91d7ddc0e1bf03f1c7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3218 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-07-02 23:18:27 -07:00
Matthew Jacobs	65c1a6f21e	Remove SOURCE keyword by parsing as an identifier and checking the value Reverts "IMPALA-1033: Remove SOURCE keyword; very common identifier" Change-Id: I3fcf6d02786e00287b564cff0a823d0c19504e7a	2014-06-30 16:47:47 -07:00
Alex Behm	7777fbff53	Clean up expr substitution and cloning. Before: The pre- and postconditions of expr substitution and cloning, in particular, their effect on the isAnalyzed_ flag were unclear and sometimes inconsistent e.g., some literal exprs set isAnalyzed_ to true in their c'tor. As a result, several places required ad-hoc solutions like Expr.unsetIsAnalyzed() and Expr.reanalyze(). This patch cleans up expr substitution and cloning, summarized as follows: Expr analysis: All exprs start out with isAnalyzed_ = false. The flag is set to true iff analyze() has been called on the expr. Expr.clone(): Creates a deep copy of an expr including all its analysis state. Expr.equals(): Comparison of expr trees ignores implicit casts. This simplifies expr substitution because un/analyzed exprs can be easily compared/substituted. ExprSubstitutionMap: When adding a mapping, the rhs expr must be analyzed to allow substitution across query blocks. There is no requirement on the lhs expr. Expr substitution: Substitution returns an analyzed clone of the original expr with exprs substituted. While performing the substitution, implicit casts and analysis state are removed such that the returned result has minimal implicit casts and types. There are two versions of substitute functions: one that throws exceptions and one that does not, because the caller may have different expectations on whether a substitution must succeed or not. Numeric literals: This patch combines IntLiteral and DecimalLiteral into a NumericLiteral. Its main benefit is that analyze() always produces the same type, even if the literal was implicitly cast and/or isAnalyzed was unset because of expr substitution. This was not the case before because an implicit cast could permanently turn an IntLiteral into a DecimalLiteral. There is no more need for unsetIsAnalyzed() or reanalyze(). Change-Id: I646110e3714cff8ae8d5a378c25a107dd43334b6 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3228 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3318	2014-06-30 10:18:26 -07:00
Lenni Kuff	ad933ec765	Switch terminology of 'impersonated user' to 'delegated user' This is to help ensure naming is consistent across the platform and also avoid confusion with HS2 "impersonation" which is something very different. Change-Id: I48c1b76dff75b92b11ddc7aab0eb9a3a5d20e489 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3315 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins (cherry picked from commit 931f6a66c0d8dff25b746d127dc1f36e96b12f98) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3326	2014-06-28 20:46:06 -07:00
Dimitris Tsirogiannis	5a6f53db16	Add partition pruning tests The following changes are included in this commit: 1. Modified the alltypesagg table to include an additional partition key that has nulls. 2. Added a number of tests in hdfs.test that exercise the partition pruning logic (see IMPALA-887). 3. Modified all the tests that are affected by the change in alltypesagg. Change-Id: I1a769375aaa71273341522eb94490ba5e4c6f00d Reviewed-on: http://gerrit.ent.cloudera.com:8080/2874 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3236	2014-06-24 02:14:27 -07:00
Nong Li	a7beb12540	[CDH5] Fix column stats for decimal. Change-Id: I72b31f6431bf6259e759fd290200fd1a755f82c6	2014-06-20 23:03:06 -07:00
Alex Behm	881f3a8c33	Re-order union operands descending by their estimated per-host memory. Re-order union operands descending by their estimated per-host memory, s.t. parent nodes can gauge the peak memory consumption of a MergeNode after opening it during execution (a MergeNode opens its first operand in Open()). Scan nodes are always ordered last because they can dynamically scale down their memory usage, whereas many other nodes cannot (e.g., joins, aggregations). One goal is to decrease the likelihood of a SortNode parent claiming too much memory in its Open(), possibly causing the mem limit to be hit when subsequent union operands are executed. Change-Id: Ia51caaffd55305ea3dbd2146cd55acc7da67f382 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3146 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Alex Behm <alex.behm@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/3213 Tested-by: jenkins	2014-06-20 18:46:10 -07:00
Taras Bobrovytsky	7faaa65996	Added order by query tests - Added static order by tests to test_queries.py and QueryTest/sort.test - test_order_by.py also contains tests with static queries that are run with multiple memory limits. - Added stress, scratch disk and failpoints tests - Incorporated Srinath's change that copied all order by with limit tests into the top-n.test file Extra time required: Serial: scratch disk: 42 seconds test queries sort : 77 seconds test sort: 56 seconds sort stress: 142 seconds TOTAL: 5 min 17 seconds Parallel(8 threads): scratch disk: 40 seconds test queries sort: 42 seconds test sort: 49 seconds sort stress: 93 seconds TOTAL: 3 min 44 sec Change-Id: Ic5716bcfabb5bb3053c6b9cebc9bfbbb9dc64a7c Reviewed-on: http://gerrit.ent.cloudera.com:8080/2820 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3205	2014-06-20 13:35:10 -07:00
Dimitris Tsirogiannis	7dbd3a5860	IMPALA-1040: Reading a decimal partitioned column with invalid values This commit fixes IMPALA-1040 in which when an invalid value is inserted to a decimal partitioned column through hive it results in a non informative error message and in some cases in the associated table to disappear from Impala's catalog. The fix results in a more informative error message to always be thrown by Impala to indicate the insertion of an invalid partition key value. Change-Id: I2855ea69944e269fb7e02b3825f44e64352151e7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3062 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3200	2014-06-20 12:46:52 -07:00
Ippokratis Pandis	6026f1ebe1	IMPALA-1055: Compute stats query statements don't quote DB and table names The compute stats statement was not quoting the DB and table names. If those names were aliasing with keywords, then the compute stats would not execute due to a syntax error. Change-Id: Ie08421246bb54a63a44eaf19d0d835da780b7033 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3170 Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3198	2014-06-20 09:32:52 -07:00
Nong Li	52f2b2cb52	Fix overflow in decimal divide. Added warning if overflow happened. Change-Id: I2e9167dbec83b3d1c2cf0e52fae4e09d6b5a38ce Reviewed-on: http://gerrit.ent.cloudera.com:8080/3141 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3191	2014-06-20 02:24:41 -07:00
ishaan	d6042f7780	Disable metric verification for mem-pool.total-bytes. This is to unblock the builds until IMPALA-1057 is resolved. Change-Id: I3d2c861737526c33cf48b444c81c429b9abbe829 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3185 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-06-19 18:18:01 -07:00
Alex Behm	aacd8bcf72	Change UnionNode to open its first child in UnionNode::Open(). This patch ensures that rows are available for clients to fetch after we advance the query to FINISHED if the coordinator fragment is rooted at a UnionNode. Change-Id: I9b4ad3f70b46c7e7720bdd5ca9ad85479c2cb7fd Reviewed-on: http://gerrit.ent.cloudera.com:8080/3139 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3168	2014-06-19 16:44:43 -07:00
ishaan	dc3dc3dc1e	Enable tpch queries to run on text to unblock the full data load build. Some planner tests depend on data being populated in the tpch tmp tables (in text format) . This change re-enables the tpch query tests to run on text so that they pass. Change-Id: I4ed09f55e05cb01978cb6f0808c6395552c0f129 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3176 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-06-19 16:19:13 -07:00
Alex Behm	ef6705d7e0	Rename MergeNode to UnionNode. Change-Id: I9e3675a103757db1345b04bd1d102d2719efddd0 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3128 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3154 Tested-by: Alex Behm <alex.behm@cloudera.com>	2014-06-19 12:44:21 -07:00
Skye Wanderman-Milne	c3c9365c17	Change shell to print WARNINGS instead of ERRORS Change-Id: I8b41a2f4307e31eda970ca891adb4f12fea926bb Reviewed-on: http://gerrit.ent.cloudera.com:8080/3088 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com> (cherry picked from commit 0a655f759d5096def89d2c72be5aa9a0cb2c10b1) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3149	2014-06-19 10:42:58 -07:00
Lenni Kuff	0ac0527643	Reduce test execution time by limiting long running tests to exhaustive exec strategy I looked at the latest run from master and took the tests suites that had long execution times. This cleans those test suites up to either completely disable them on 'core' or add constraints to limit the number of test vectors. It shouldn't impact nightly coverage since we still run the same tests exhaustively. Change-Id: I10c78c35155b00de0c36d9fc0923b2b1fc6b44de Reviewed-on: http://gerrit.ent.cloudera.com:8080/3119 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3125 Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-06-18 16:18:17 -07:00
anusha	6b3689e8c7	IMPALA-973: Fix for invalidate metadata behaviour Change-Id: Ie0c4c458b0919978b03ebaba28bf37950dd34643 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3009 Tested-by: jenkins Reviewed-by: Anusha Dasarakothapalli <anusha.dasarakothapalli@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/3091	2014-06-17 12:18:50 -07:00
Dimitris Tsirogiannis	67eb5eb3a8	IMPALA-1028: Cardinality estimate is wrong for partitioned tables if we filter out all partitions This commit fixes IMPALA-1028 in which the cardinality estimate is not correct when all the partitions of a partitioned table are filtered out. To fix this issue we make sure that the estimated result cardinality of the scan node is zero when all the partitions are filtered out. Change-Id: I225949eb2e8f905a5d0f678d7f199fb95ba4aab0 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3063 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3083	2014-06-16 20:36:13 -07:00
Matthew Jacobs	b3c98cf3c8	Fix occasional admission control test failures The admission control tests could occasionally fail when cancelled queries return OK (IMPALA-1047). Until fixed, we can just treat such queries as if they were cancelled. Change-Id: Id9fc8e9f585e466059d4ffefb4d9ed407206ad1d Reviewed-on: http://gerrit.ent.cloudera.com:8080/3019 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins (cherry picked from commit 2901a8a960076f2aec74cb5a1f5000953359a68f) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3025 Reviewed-by: Matthew Jacobs <mj@cloudera.com>	2014-06-16 15:50:33 -07:00
Matthew Jacobs	dbe1b534ed	IMPALA-1050: NPE error when pool placement policy cannot map user to pool Change-Id: I53ed823ee55bee96269f4119af7da2dab25d4a7c Reviewed-on: http://gerrit.ent.cloudera.com:8080/3028 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit 569bd5d4a8e30a907a33551c58a3ab80849b8dc9) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3061	2014-06-15 13:38:20 -07:00
Srinath Shankar	0df773eed6	Check RuntimeState for cancellation in sorter. Currently, cancellation checking when a SortNode is executing only happens when a batch is being added to the sorter (SortNode::SortInput()) or when a batch is being retrieved from the sorter (SortNode::GetNext()) This fix passes in a RuntimeState into the Sorter instance itself, which checks for cancellation at the following points: i) During an in-memory sort (In Partition() and SortHelper()). In Partition(), the cancellation check may be delayed if the input is completely sorted. ii) During an intermediate merge before each batch of rows from a merge is copied into a run. Change-Id: I5c28c7244ee2e40627cf14542b99f872e3a8c343 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3007 Reviewed-by: Srinath Shankar <sshankar@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3059	2014-06-14 17:48:40 -07:00
Skye Wanderman-Milne	bbb908db1e	Add HS2 GetLog() test Change-Id: I24cc4a1873942cb4d67dcf75ed57ce7becec6f11 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3016 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins (cherry picked from commit 33f332f44c31fea747fadc56c7816c1da3b25b6c) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3040 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-06-13 18:39:07 -07:00
Henry Robinson	d162571211	Fix 'summary' when exch map is not set Change-Id: I66d9987f45f6cee045a300f86de357a2761929d7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/3000 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins (cherry picked from commit 6f82cb296d0b3f0546d4e8a26485b79f20ff8996) Reviewed-on: http://gerrit.ent.cloudera.com:8080/3020 Tested-by: Henry Robinson <henry@cloudera.com>	2014-06-12 22:18:04 -07:00
anusha	ffc334a735	IMPALA-834: Fix for Create Table like Views Change-Id: Ied1f706c48a1106e1d6fc2aa73e57746f52ea333 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2939 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/3014 Reviewed-by: Anusha Dasarakothapalli <anusha.dasarakothapalli@cloudera.com>	2014-06-12 22:13:30 -07:00
Henry Robinson	9a7c6d286f	Add 'summary' to shell Users can now type 'summary' in the Impala shell after a query executes to get a breakdown of the work done by each part of the query plan. Change-Id: Ia6a43429ffc7778f3c2c8fcbf45d83828263c2ab Reviewed-on: http://gerrit.ent.cloudera.com:8080/2963 Tested-by: jenkins Reviewed-by: Henry Robinson <henry@cloudera.com> (cherry picked from commit 9b98d42acb14d43a64832767528ee572eac4979b) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2995	2014-06-12 02:59:58 -07:00
Skye Wanderman-Milne	1cc628d32d	IMPALA-950: Skip computing stats for decimal columns. This patch also adds a mechanism to return analysis warnings to client, which is used to log skipped decimal columns. Change-Id: I30c246044a68ec8861cd5bed072bd54e65a079e6 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2822 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit fc77422acef7e6f93fdeb5448309414b905f0725) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2984	2014-06-11 19:16:34 -07:00
ayousufi	66e90d75ee	IMPALA-286: Display set query options in default section in impala-shell Options displayed with 'set' command. Default values distinguished from set values by square brackets. Change-Id: Iacf0574555aab78aa0ba2008ceb8776d372a57a5 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2913 Reviewed-by: Abdullah Yousufi <abdullah.yousufi@cloudera.com> Tested-by: jenkins	2014-06-11 11:51:19 -07:00
Skye Wanderman-Milne	6ac9a8104b	IMPALA-1009: UDF/UDA leaks should not fail queries With this change, leaky UDFs built with the SDK will still fail when using the test harness, but leaky UDFs running in Impala will only trigger a warning. This change also updates the test infrastructure to always check for non-fatal errors/warnings. Change-Id: I5615349b9d691e4eddea3e03e152ef12e73835e7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2844 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 60ce5190d96add6104aba642d2354d87a26000fa) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2938	2014-06-10 21:46:47 -07:00
Nong Li	5e49150a22	Speed up views compat test. - Use a smaller table so hive runs faster - Don't invalidate the catalog, just the view created in hive - This lets us run it in parallel Change-Id: I8085d8967dc96cbbb20e2d719072b29fe591cd98 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2958 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-06-10 20:53:23 -07:00
Nong Li	ad534429df	[CDH5] Disable flaky hdfs caching test. Change-Id: I19900ae029876d8f74169eda0f08f5be3509fbaf Reviewed-on: http://gerrit.ent.cloudera.com:8080/2946 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-06-10 18:24:42 -07:00
Ippokratis Pandis	fe0646f76b	IMPALA-1022: Handle cases where in Parquet the expected number of rows in metadata is wrong There are cases of Parquet files where the metadata indicates the wrong number of rows for these files. The parquet-scanner until now was not reporting any problem in this case. Instead it was reading as long as there were values for the read columns. But with IMPALA-1016 we are now reading at most as many rows as the rows per metadata. With this patch, the parquet-scanner, right before it finishes scanning, checks whether it read the expected number of rows (taken from metadata). In cases where the actual number of rows read is less than or greater than the expected number, it either aborts or logs an error. Change-Id: Ie6a66a38e8912730bf04762e6526ec1cadb2bcdc Reviewed-on: http://gerrit.ent.cloudera.com:8080/2755 Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2944	2014-06-10 17:27:54 -07:00
Lenni Kuff	892eccc8d0	CDH-19184: Impala should show impersonated user (if there is one) rather than connected user Currently, we always display the 'User' as the connected user in the debug webpage and runtime profiles. This is confusing when impersonation + authorization is enabled because there is not an easy way to find the impersonated user other than looking at the audit log records. This change does the following: * Updates the "User" field in the runtime profile to show the "effective user". The effective user is the connected user if there is no impersonated user, otherwise it is the impersonated user. This should help CM display the correct user as well. * Add two new fields in the runtime profile "Connected User" & "Impersonated User" to make it easier to tell which user is which. * Update the /queries debug webpage to show the effective user rather than the connected user. Change-Id: I639de6738242d2c378e785271a72257301a53ade Reviewed-on: http://gerrit.ent.cloudera.com:8080/2863 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins (cherry picked from commit d4ad768780dfdfe0874f2b3e9c59074f1c3685d7) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2935	2014-06-10 11:08:25 -07:00
Lenni Kuff	b3ebfddadd	Allow tests to access query result column values by col alias or col position For example, you can now do something like: result_set = execute("select * from tbl") result_row = result_set[0] result_row['col_alias'] or result_row[4] to access column values. If the column alias/position does not exist an exception is thrown. Change-Id: Ie4b65619ed17fd90bf39e0966a7fc7e1180dbc5c Reviewed-on: http://gerrit.ent.cloudera.com:8080/2719 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2922	2014-06-09 23:24:26 -07:00
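
The access pattern described in commit b3ebfddadd (column values by alias or by position) could look roughly like the sketch below. This is not the code from that commit: the class name `ResultRow`, its constructor arguments, and the case-insensitive alias matching are all assumptions made for illustration. It only shows how one `__getitem__` can serve both lookups and raise when the alias or position does not exist.

```python
class ResultRow:
    """Hypothetical row wrapper; not the class added by the commit."""

    def __init__(self, column_aliases, values):
        if len(column_aliases) != len(values):
            raise ValueError("column count does not match value count")
        self._values = list(values)
        # Map each alias to its position. Case-insensitive matching is an
        # assumption of this sketch, not something the commit states.
        self._positions = {
            alias.lower(): i for i, alias in enumerate(column_aliases)
        }

    def __getitem__(self, key):
        # Integer keys are treated as zero-based column positions.
        if isinstance(key, int):
            if not 0 <= key < len(self._values):
                raise IndexError("no column at position %d" % key)
            return self._values[key]
        # String keys are treated as column aliases.
        if isinstance(key, str):
            try:
                return self._values[self._positions[key.lower()]]
            except KeyError:
                raise KeyError("no column with alias %r" % key) from None
        raise TypeError("column key must be an int position or a str alias")


# Usage: one row with two columns, read both ways.
result_row = ResultRow(["id", "col_alias"], [7, "x"])
by_alias = result_row["col_alias"]
by_position = result_row[0]
```

Rejecting out-of-range positions explicitly (rather than allowing negative indices) follows the commit's statement that a missing alias or position throws an exception.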

414 Commits