impala

mirror of https://github.com/apache/impala.git synced 2026-01-24 06:00:49 -05:00

Author	SHA1	Message	Date
casey	24ce8cfada	IMPALA-1456: Hive UDFs with String args would crash impalad The wrong buffer was being used. Change-Id: I18bf9040eaeda871d1d0baee2e276749a3a38615 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5185 Reviewed-by: Casey Ching <casey@cloudera.com> Tested-by: jenkins	2014-11-17 15:02:30 -08:00
casey	4915ea4ac9	IMPALA-1134: Use copyBytes() to get value from o.a.h.io.Text This affects java UDFs. Previously it was possible that the length of the string returned from a java udf didn't match the actual data. Per the Text.getBytes() documentation "... only data up to getLength() is valid.". Impala just needs to use copyBytes() which is a convenience function for this situation. The same should be done for BytesWritable. Before: Query: select length(echo('12345678901234567890')) +-------------------------------------------+ \| length(java.echo('12345678901234567890')) \| +-------------------------------------------+ \| 22 \| +-------------------------------------------+ After: Query: select length(echo('12345678901234567890')) +-------------------------------------------------+ \| length(functional.echo('12345678901234567890')) \| +-------------------------------------------------+ \| 20 \| +-------------------------------------------------+ Change-Id: If9671278df8abf7529d3bc470c5f9d037ac3da1b Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4897 Reviewed-by: Casey Ching <casey@cloudera.com> Tested-by: jenkins	2014-11-17 15:02:24 -08:00
Dimitris Tsirogiannis	cb697e40b1	IMPALA-1412: Create view as select produces incorrect results This commit fixes the issue where querying a view produces incorrect results if the view definition statement references the same column multiple times in the select list. In that case, a predicate referencing the same slot is generated when equivalences among view slots are enforced, thereby causing null values to be rejected. Change-Id: I3d13656141fb41d232ddd38562cbde277f2a1264 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5031 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins	2014-11-17 15:02:04 -08:00
Victor Bittorf	3f75bd6735	Reintroduce SEQUENCEFILE writer tests The sequence writer test had an issue with zlib on certain cluster machines, making this a flaky test. This has passed several times locally and in private builds. This re-enables the test because the failures could not be produced in private builds. Change-Id: I0aeea3a2d000e711e5a84427a7b40592e1eef75b Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5077 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins	2014-11-17 11:19:16 -08:00
casey	516d7483dd	IMPALA-1300: Allow subqueries in UNION operands This enables the existing subquery rewrite rules to rewrite UNION statements. UNION rewriting is easily done by simply calling the rewriter for each operand in the UNION. At least one TPC-DS query requires this functionality (IMPALA-1365). The more difficult case of a UNION within a subquery is still not supported. Change-Id: I7f83eed0eb8ae81565e629f09f6918a4ba86ee13 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4859 Reviewed-by: Casey Ching <casey@cloudera.com> Tested-by: jenkins	2014-11-17 11:19:09 -08:00
Alex Behm	7b6ecbeea5	Fix exhaustive test run: Modify test to produce identical results on HBase. Change-Id: I7187f9aca63f61ea1686820b3cbec277240da191 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4866 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins	2014-11-17 11:19:01 -08:00
Martin Grund	f58159d431	[CDH5] IMPALA-1141: HBase Planner Performance This patch improves the performance of the planning phase of queries against HBase tables. It removes an unnecessary second call to compute stats and adds a new version for estimating the row count in a table. This patch adds an incremental version to estimate the number of rows for a set of regions. This incremental version will start querying up to five regions to calculate the average row size and use this value to estimate the row count based on the size of the regions on disk. Only if the standard deviation from the average is larger than 15% will it query additional regions to calculate an average with more confidence. If the data is balanced it will not be necessary to retrieve data from all regions but only from a subset. In the worst case, all regions are queried. Change-Id: Idcb3bea81b11cb08da6d9329ba66c86aca23e170 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5258 Tested-by: jenkins Reviewed-by: Martin Grund <mgrund@cloudera.com>	2014-11-14 13:47:02 -08:00
Dan Hecht	4bf6a21a9e	S3: Qualify DataSource paths Impala qualifies all paths stored in the metastore except for the DataSource jar path. Use a qualified path here as well, which will allow datasources to live on the non-default FS. In CreateDataSrcStmt, use the post-analyzed qualified path rather than the user passed string. Then, fix CreateTableDataSrcStmt so that it doesn't strip out the scheme://authority portion of the URI, but instead uses the qualified path string directly. Note that the metastore may still contain unqualified paths in DataSource tables' properties that were generated by previous versions. That's okay though since the backend won't assume all paths are qualified in case other components generate (or have in the past) metadata with unqualified paths. Change-Id: I905d8f6a7bf1793cfccf720b6ab5dc845d7dd5fa Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5201 Reviewed-by: Daniel Hecht <dhecht@cloudera.com> Tested-by: jenkins (cherry picked from commit 86c75be01d0f5654291acdbc1c68f5a76915028c) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5239	2014-11-13 12:42:32 -08:00
Dimitris Tsirogiannis	5e27746e70	IMPALA-1441: Wrong results with outer joins in inline views This commit fixes the issue where additional predicates that alter the meaning of a query are generated when an inline view contains an outer join. Such predicates are derived from slot equivalences without taking into account the directionality of the value transfer graph. Change-Id: I0a3390d39a4f2039a8b114a7659980aa444d35c0 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5109 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5186	2014-11-07 12:39:51 -08:00
Skye Wanderman-Milne	c693fbc48c	Misc. diagnostic/debugging improvements - Add number of files in table to query plan - Add number of remote scan ranges to runtime profile - Clean up logging in ClientCache Change-Id: I0580fe435ac0a52548aedb4e0ffa875ce9b9dede Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5166 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-11-06 22:04:11 -08:00
Nong Li	e2d7fb6402	Some test case cleanup. Change-Id: Ic29b7c1f5fd714a1e2cc41bf0e55c0d11c782862 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4791 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5090 Reviewed-by: Nong Li <nong@cloudera.com>	2014-11-03 22:33:08 -08:00
Dimitris Tsirogiannis	ef254f08a3	IMPALA-1411: Create table as select produces incorrect results This commit fixes the issue where a CTAS statement inserts a wrong number of rows if the associated select statement contains an inline view with a limit clause. The limit clause of the nested query was not taken into consideration during planning, resulting in the generation of a wrong distributed plan. Change-Id: Ib3ad50199d95d2d6b9ad0aa3b2031a002cbcca44 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5057 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5063	2014-11-03 12:12:00 -08:00
Matthew Jacobs	164687ad81	IMPALA-1357: Analysis of WithClause pollutes global state The analysis of a with clause should have its own global state so the local view(s) can be analyzed without polluting the global state of the parent QueryStmt. This might not always matter, but in a complex query involving a with clause that contained a subquery, re-analysis of the WithClause after the subquery rewrite resulted in an invalid Exists conjunct being registered in the parent analyzer's global state. The Exists conjunct was assigned to a scan node which then failed a pre-condition check. Change-Id: Ib020787b2e1ff202d96fe1b92bd9740897ab32a0 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4825 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit 629a8652c5a290054a8e582cc5cb5768a3ee67a8) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5038	2014-10-30 16:50:00 -07:00
Taras Bobrovytsky	e5e06c307b	[CDH5] Modified TPCH queries to match the specification Change-Id: Ife2c1fae4d774cd8fe188dfe9c98042ff7e45368 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4997 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-10-29 22:07:33 -07:00
Martin Grund	6e0c1c26c9	IMPALA-1424: abs() function retains input type This patch modifies the abs() built-in function so that it retains the type of the input argument for the return type in the same way as Postgres does. Change-Id: I1750237b85bedbc3ce9d52330ac4d458b0aada3a Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4980 Reviewed-by: Martin Grund <mgrund@cloudera.com> Tested-by: jenkins (cherry picked from commit 424b359ab0a4f621f2865844c3293f2c80e0867f) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4996	2014-10-28 08:07:21 -07:00
Skye Wanderman-Milne	4a722980e5	IMPALA-1401: raise MAX_PAGE_HEADER_SIZE and use scanner context to stitch together header buffer Change-Id: I4f33b90e845e9bef1ac929bf4ebb8e98eaff985c Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4961 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins (cherry picked from commit c3a90183b2f03434a9604f3aa2ef6dd08c9ba97c) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4981 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-10-27 16:30:56 -07:00
Matthew Jacobs	56611601a3	IMPALA-1395: Add test case back, but commented out Change-Id: I157db82dd016afd54a55512225e8cd6025ec161d Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4936 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4943	2014-10-24 10:31:48 -07:00
Matthew Jacobs	aedf8e5fb8	IMPALA-1395: Remove slow test for IMPALA-1312 that breaks exhaustive runs Removing the test case for IMPALA-1312 to unblock exhaustive runs. This query was previously hitting a DCHECK failure in the BufferedTupleStream where the number of pinned blocks wasn't being updated properly. With codegen enabled, this query took ~70sec. Without codegen, it took so long that the exhaustive runs would fail- I found it took ~35min on my local machine. IMPALA-1414 tracks investigating why this query is so slow. Change-Id: I2bf8a8c51fc7ded0026e334636f9b2cc859ffdb2 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4931 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit f8b7320e035549da4e4a6a99b87da97bc18be0ad) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4941	2014-10-24 03:47:45 -07:00
Dimitris Tsirogiannis	7ecf10365f	CDH-22383: Impala hangs when querying HBase tables with large number of columns This commit fixes the issue where Impala hangs when querying an HBase table with large (>500) number of columns. The issue was triggered by a large memory allocation of a tuple buffer during the first GetNext call of the HBase scanner that was causing an infinite loop where each iteration was allocating a significant amount of memory. The fix is to dynamically set the mem limit of a row batch based on the corresponding row size and to dynamically set the maximum size of the tuple buffer so that it does not exceed that limit. Change-Id: Ia64f98b229772b50658af952fc641bf00f54f450 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4871 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4933	2014-10-23 15:29:51 -07:00
Martin Grund	e866765213	IMPALA-181: ORDER BY with Ordinals In case of certain queries order by with ordinals would not work properly. This is the case for all "select * " type of queries. Until now, the ordinal substitution was based on the values from the select list. However, these expressions are not expanded in case of "*", rather the list of result expressions and column labels is filled. This patch simply changes the lookup of the expression from the select list to the result list because only ordinals from the result can be used as a sorting field. Change-Id: I21d3c3da837307cae04f8a4be02ca31bdcfbcbdb (cherry picked from commit 1b62c08552c19f1b0c2220d1568804e2eba7efac) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4920 Tested-by: jenkins Reviewed-by: Martin Grund <mgrund@cloudera.com>	2014-10-22 15:19:09 -07:00
Dimitris Tsirogiannis	e672f1c79e	IMPALA-1400: Window function insert issue (LAG() + OVER) This commit fixes an issue where an error is thrown during planning when an insert-select statement contains an analytic function. The issue was caused by a missing mapping step of logical to physical tuples for the case of insert statement. Change-Id: I68d856b1fda4dd0a7345648459e466d90d95201f Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4911 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4915 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>	2014-10-21 22:07:12 -07:00
Nong Li	86aebc7f8f	IMPALA-1348: Fix NAAJ where the null partitions have streams with multiple blocks. Change-Id: I892f3435814bd4fcddeb496017dbb60704f13419 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4728 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com>	2014-10-14 12:01:53 -07:00
Henry Robinson	b6e91905ed	IMPALA-1384: Fix show table stats test on exhaustive test run Change-Id: I2f1033bc078906ce72a19099f214ab4e3cd9a936 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4824 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins (cherry picked from commit 0ead02755b6a65d408bed59df810114e26c0c397) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4830 Reviewed-by: Henry Robinson <henry@cloudera.com>	2014-10-11 22:46:05 -07:00
ishaan	23964c19af	[CDH5] Fix bad merge in spilling.test Change-Id: Ia6e30cf5916c737088d8cb969e0167b9d69a599e	2014-10-08 23:19:02 -07:00
ishaan	10303ed440	Add partition filters to tpcds-q89 and re-enable tpcds-q47 This patch adds partitions filters to tpcds-q89 to account for the lack of dynamic partition pruning. Additionally, it also re-enables running tpcds-q47, which was blocked by IMPALA-1238 Change-Id: Ied05d80565ebb29cd06b3c38d76bd31f0285028e Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4453 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-10-08 16:50:16 -07:00
ishaan	7f576dc41e	Change the result verification for tpcds-q6 to account for the order by. Change-Id: I304788ed5d5b54dc81e23ba192e322967c028c6b Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4711 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-10-08 16:48:34 -07:00
Alex Behm	7e4eb77dcd	IMPALA-1102: Clean up and fix removal of redundant join predicates. Correct elimination of redundant join predicates relies on slot equivalences being enforced at the lowest possible plan node possibly by generating new predicates. Previously, we only enforced such equivalences at scan and aggregation nodes which is insufficient because joins materialize a new tuple combination which may also require construction of new predicates to establish known slot equivalences. This patch generalizes the existing helper function for constructing the minimum spanning tree to cover known slot equivalences for each equivalence class. The function is intended to be called during bottom-up plan generation at nodes that change the tuple composition (scans, joins, aggs, etc.) Change-Id: I73880310553c63296486b2f77a51618738005167 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4781 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4794 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Alex Behm <alex.behm@cloudera.com>	2014-10-08 16:48:16 -07:00
Marcel Kornacker	609c287b17	IMPALA-1243: Incorrect plan in analytic using inline view This fixes the incorrect pushing of predicates into Unions for which at least one operand contains an analytic expr. It also adds a TupleDescriptor.debugName_ member variable that makes it easier to read the output of DescriptorTable.debugString(). Change-Id: Icd50220e711851b8174fdfb53c6b2cd03ca3dcde Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4586 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins	2014-10-07 19:06:25 -07:00
Nong Li	5845a02b6e	IMPALA-1351: Update NAAJ stream to use io sized buffers and better error handling. Since we only make one NULL-aware stream per NAAJ (as opposed to one per partition), we do not care about the memory footprint on this tuple stream. For simplicity, this will always use io-sized buffers. Also, improving error handling in PHJ::ProcessProbeBatch(), as status_ was not being set properly. Disabling the regression test for this bug, as it takes too long to run. Need to find a simpler query. Change-Id: I7572f607199f38b1bc30ae208ece2832522342a1 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4770 Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com> Tested-by: jenkins Conflicts: be/src/exec/partitioned-hash-join-node.cc	2014-10-07 16:52:05 -07:00
Alex Behm	0752b9fa62	IMPALA-1353: Enforce slot equivalences for inline-view tuples. Correct elimination of redundant join predicates relies on slot equivalences being enforced at the lowest possible plan node possibly by generating new predicates. Previously, we only enforced such equivalences at scan and aggregation nodes which is insufficient, explained as follows. Equivalences between slots of an inline view may not be correctly enforced in the scans/aggs of the inline-view plan if those inline-view slots point to complex expressions in the underlying view stmt. We currently cannot reason about equivalences of complex expressions. As a result, it is possible that inline-view slots are known to be equivalent but the underlying expressions are thought to be non-equivalent. This patch adds enforcement of equivalent inline-view slots by generating new predicates that are migrated into the inline-view plan. This way, our existing expression substitution logic can be used to indirectly reason about the equivalences of complex expressions that are part of an inline view. Change-Id: Id38115c90e2c47d65463380a6f8cb1d0f21134b7 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4755 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Conflicts: fe/src/main/java/com/cloudera/impala/planner/Planner.java	2014-10-07 16:49:15 -07:00
Skye Wanderman-Milne	c79cd3aa23	Add targeted-perf query that makes local expr allocations Change-Id: Ida40481cb429227058d78c619820de23f5c4a15e Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4772 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-10-07 15:48:32 -07:00
Nong Li	a2e7b05bb1	IMPALA-1332: Fix memory leak for FULL OUTER/RIGHT OUTER joins. This can happen if not all rows are returned. Change-Id: I4d54641b71c44faa85a2138d16f9dda1052317b5 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4737 Tested-by: jenkins Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-10-06 19:49:56 -07:00
Matthew Jacobs	652d4b4699	IMPALA-1234: Fix bugs when producing EmptySetNode Fixes two issues that can occur when generating the plan for a stmt with an empty result set (e.g. due to limit 0 or constant predicates that evaluate to false): 1) Unions with an inline view that produces an empty result set do not create the EmptySetNode for the correct stmt. 2) An EmptySetNode may contain non-materialized tuples which will fail a precondition check when generating the thrift plan. Change-Id: I1511c755be3a59fdb8934624fd08250323266d27 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4744 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-10-06 19:49:50 -07:00
Alex Behm	6374a21924	IMPALA-1343: Fix chain of left table refs when inverting a join. Change-Id: Ib509f14d65578e3d8e8cccb015d569cb39e3e8a2 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4736 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins	2014-10-06 19:49:25 -07:00
Nong Li	364b826d4d	Agg w/o grouping not estimating/reporting the right number of rows. Change-Id: I63ef99f9c123da1116f94d4f6f413764ff16a518 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4726 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-10-06 19:49:10 -07:00
Alex Behm	2ec3dff824	IMPALA-1342: Update outer-join analysis state to accommodate join inversion. Change-Id: I9f0bb4af78b77a56a144a07ee9ed235711a12655 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4727 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-10-06 19:49:04 -07:00
Skye Wanderman-Milne	b6204dff59	IMPALA-1340: removing implicit casts during expr substitution is not always safe Union statements were sometimes losing necessary casts during expression substitution, causing the backend union node to receive slot refs that did not have the same types as the result tuple. Add a flag to Expr.Substitute() to preserve the root expr types, which adds back the casts after substitution. Currently only the union node sets this flag to true, but there may be other places that are incorrect. Change-Id: I1b4d9846860ef9694ff0c089f79654b1746d687d Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4777 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-10-06 17:47:37 -07:00
Nong Li	de31fa8e21	Disable spilling tests that are too flaky. Change-Id: I4ac877c3fa8297d873c67f219bb0c75f0001562d Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4731 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-10-06 15:18:56 -07:00
Alex Behm	3e7de9f304	IMPALA-1318: Joins should not return semi-joined tuples. Change-Id: I93f5ddb8317af7794b5977e145805f9ff498d722 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4633 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com>	2014-10-06 15:17:22 -07:00
Henry Robinson	6af7c8fe4a	IMPALA-1330: Fix column types for SHOW {table, partition} STATS Because we add 'total' to the last row in SHOW PARTITIONS, we set the partition key columns to be string. At least, that's what the comment said, but we didn't do that in fact. This patch also corrects the column type for max width, which should be INT. Change-Id: I787ab17be27f45107340119017e528c58a3daad3 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4678 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins	2014-10-06 15:16:56 -07:00
Victor Bittorf	7b244d34b6	IMPALA-1344: Fixed analytic aggregations with CHAR The fix is to only register aggregates for string, not for CHAR or VARCHAR. The CHAR and VARCHAR types are implicitly cast to STRING for aggregation. Also, fixed aggregate fn builtins that should not ignore distinct. Change-Id: If4c1a2c6127360c2c8127a5c02949df74fafc85a Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4717 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins	2014-10-06 15:16:50 -07:00
Victor Bittorf	a62500ee28	Changed CHAR & VARCHAR max length to match Hive. Also modified the text of the analysis exception for lengths that are too long or short because John said they were unclear. Change-Id: I9427d5c39298aa8207672e50e10fe527c5076599 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4698 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins	2014-10-06 15:16:45 -07:00
Alex Behm	78508f6f78	IMPALA-1343: Revert in-place state changes to table refs for inverted joins. The bug was that changes to table refs were done in-place for join inversion, and not reverted when a particular join re-ordering attempt was unsuccessful. Subsequent join re-ordering attempts with a different left-most table ref should use the original unmodified table refs. To achieve the above, this patch reverts the state changes made to table refs for unsuccessful join ordering attempts. TODO: The cleaner fix is to clone all table refs for each new join re-ordering attempt. However, implementing a state-preserving clone() for table refs is a more involved change. Change-Id: Ife0121f0e15441a5c0a23f75054c683c05b1ecac Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4715 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-10-06 15:16:40 -07:00
Victor Bittorf	c29ed3761e	IMPALA-1339: NULLs incorrectly hashed in groupby Problem: hash table assumed all raw values were at most 16 bytes. This maximum was increased to support up to 128 bytes for CHARs. Change-Id: I107c58b9a013d5db46ff5586bcdceee3961346e9 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4701 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins	2014-10-06 15:16:36 -07:00
Alex Behm	a78de45bad	IMPALA-1324,IMPALA-1307: Enforce compatible hash exprs for partitioned joins. When generating the plan for a partitioned hash join we place the join into a partition-compatible input fragment, if possible. During this optimization one must ensure that the hash exchange sending to the new join (which was placed into a compatible fragment) is compatible with the hash exprs used for sending to the fragment containing the join node. In particular, we had two bugs: 1. The number of hash exprs could be different, possibly because of redundant exprs on one or both sides 2. The order of the hash exprs could be different, causing two rows with the same hash-expr values to be sent to different nodes The fix is to enforce the above two properties, and revert to exchanging both sides if no compatible hash exprs can be constructed. Change-Id: Id155fb8094ed1694f7bc038ed2f9685f4d645fbe Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4639 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins	2014-10-06 15:16:30 -07:00
Nong Li	e08ffde009	PA/PHJ: Increase fanout to 32 and fix interaction with small buffers. Small buffers introduced an issue that is exacerbated by the large fanout. A stream can only be appended to forever once it has grabbed the initial io sized buffer. With small buffers, we don't grab that at the beginning anymore and, before this patch, it is grabbed when the stream first needs it. This means when one stream needs it, another stream could have already grabbed it (meaning this stream is pinned with multiple buffers). This patch has all the streams grab an IO buffer as soon as the first stream needs an io buffer. This guarantees that all streams get 1 before any get 2. Change-Id: I1be1219fc5f1fa3ceedd4d5e76ae056c8bb8ff3d	2014-10-06 15:16:16 -07:00
Victor Bittorf	d5fd59e2ed	IMPALA-1337: Aggregation failures for VARCHAR The issue is that the aggregation node needed to use IsVarLen; previously it assumed TYPE_STRING was the only variable length type. Change-Id: I9545e8d405937a47b25c9042f97854851a448c6e Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4690 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins	2014-10-06 15:14:51 -07:00
Victor Bittorf	f4626b03e6	IMPALA-1322: Fix related issue There is an issue related to IMPALA-1322. The expression list when laying out memory was being improperly indexed. Change-Id: I2eef84a812b451d87ecb8afd304e765aff1f5a6b Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4675 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins	2014-10-06 15:14:44 -07:00
Dimitris Tsirogiannis	384daae537	IMPALA-1335: Wrong subquery rewrite for correlated scalar subqueries with complex exprs This commit fixes the issue where, for the case of scalar subqueries, complex exprs in a correlated predicate may result in a wrong subquery rewrite. Change-Id: Ib6f14a37ca7a74e25daf3b31f86766ff9032d7fd Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4674 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins	2014-10-06 15:14:39 -07:00
Nong Li	3e632ef6ad	Reduce min PA/PHJ mem requirement. Update PA/PHJ to use small (< io sized buffers) initially. Without this we would not be able to run at the QPS that we need just due to the buffering requirements of these operators. Change-Id: Ic8a777d147893567c9590fbab17f561eadb6ee19 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4623 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com>	2014-10-06 15:14:10 -07:00
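
The incremental row-count estimate described in commit f58159d431 (IMPALA-1141) can be illustrated with a rough sketch. This is not Impala's code: the function name, the `sample_row_size` callback (a stand-in for querying one HBase region), and the reading of "15%" as a relative population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def estimate_row_count(regions, sample_row_size,
                       initial_sample=5, max_rel_stddev=0.15):
    """Estimate the total row count for a set of regions.

    regions: list of (region_id, size_on_disk_bytes) pairs.
    sample_row_size: hypothetical callback returning the average row
        size in bytes observed when querying one region.
    """
    if not regions:
        return 0

    row_sizes = []
    for queried, (region_id, _) in enumerate(regions, start=1):
        row_sizes.append(sample_row_size(region_id))
        # Always look at the first few regions (up to five by default).
        if queried < initial_sample:
            continue
        avg = mean(row_sizes)
        # Assumption: stop once the spread is within 15% of the average.
        if avg > 0 and pstdev(row_sizes) <= max_rel_stddev * avg:
            break
    # Worst case: the loop ran to the end and every region was queried.

    avg = mean(row_sizes)
    total_bytes = sum(size for _, size in regions)
    return int(total_bytes / avg) if avg > 0 else 0
```

With evenly sized rows the loop stops after the initial sample. With strongly skewed row sizes it keeps sampling until every region has been visited, which matches the worst case mentioned in the commit message.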


802 Commits