434 Commits

Author SHA1 Message Date
casey
24ce8cfada IMPALA-1456: Hive UDFs with String args would crash impalad
The wrong buffer was being used.

Change-Id: I18bf9040eaeda871d1d0baee2e276749a3a38615
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5185
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: jenkins
2014-11-17 15:02:30 -08:00
casey
4915ea4ac9 IMPALA-1134: Use copyBytes() to get value from o.a.h.io.Text
This affects Java UDFs. Previously it was possible that the length of
the string returned from a Java UDF didn't match the actual data. Per the
Text.getBytes() documentation "... only data up to getLength() is
valid.". Impala just needs to use copyBytes() which is a convenience
function for this situation. The same should be done for BytesWritable.

Before:

Query: select length(echo('12345678901234567890'))
+-------------------------------------------+
| length(java.echo('12345678901234567890')) |
+-------------------------------------------+
| 22                                        |
+-------------------------------------------+

After:

Query: select length(echo('12345678901234567890'))
+-------------------------------------------------+
| length(functional.echo('12345678901234567890')) |
+-------------------------------------------------+
| 20                                              |
+-------------------------------------------------+
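
The length mismatch can be illustrated without Hadoop. The following is a minimal sketch, assuming a hypothetical `ReusableText` class that mimics a Text-like object whose backing buffer is reused and never shrunk. The class and its method names are illustrative only and are not the Hadoop API.

```python
class ReusableText:
    """Hypothetical stand-in for a Text-like object with a reusable backing buffer."""

    def __init__(self):
        self._buf = bytearray()
        self._length = 0

    def set(self, data: bytes) -> None:
        # Grow the backing buffer only when needed; never shrink it.
        if len(data) > len(self._buf):
            self._buf = bytearray(len(data))
        self._buf[: len(data)] = data
        self._length = len(data)

    def get_length(self) -> int:
        return self._length

    def get_bytes(self) -> bytes:
        # Whole backing buffer: only the first get_length() bytes are valid.
        return bytes(self._buf)

    def copy_bytes(self) -> bytes:
        # Exactly the valid prefix of the buffer.
        return bytes(self._buf[: self._length])


text = ReusableText()
text.set(b"a much longer earlier value")
text.set(b"12345678901234567890")

stale = text.get_bytes()   # may carry leftover bytes from the earlier value
exact = text.copy_bytes()  # only the valid bytes of the current value
```

A caller that measures `stale` sees the capacity of the buffer rather than the length of the current value, which is the kind of mismatch the Before output shows.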

Change-Id: If9671278df8abf7529d3bc470c5f9d037ac3da1b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4897
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: jenkins
2014-11-17 15:02:24 -08:00
Victor Bittorf
3f75bd6735 Reintroduce SEQUENCEFILE writer tests
The sequence writer test had an issue with zlib on certain cluster machines, making
this a flaky test. This has passed several times locally and in private builds. This
re-enables the test because the failures could not be reproduced in private builds.

Change-Id: I0aeea3a2d000e711e5a84427a7b40592e1eef75b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5077
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-11-17 11:19:16 -08:00
casey
516d7483dd IMPALA-1300: Allow subqueries in UNION operands
This enables the existing subquery rewrite rules to rewrite UNION
statements. UNION rewriting is easily done by simply calling the
rewriter for each operand in the UNION. At least one TPC-DS query
requires this functionality (IMPALA-1365).

The more difficult case of a UNION within a subquery is still not
supported.
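
The per-operand approach can be sketched as follows. This is a rough illustration, not the frontend code: `SelectStmt`, `UnionStmt`, `rewrite_select`, and `rewrite` are hypothetical names, and the actual subquery rewrite rules are replaced by a placeholder that only marks the statement.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class SelectStmt:
    sql: str
    rewritten: bool = False


@dataclass
class UnionStmt:
    operands: List["Stmt"] = field(default_factory=list)


Stmt = Union[SelectStmt, UnionStmt]


def rewrite_select(stmt: SelectStmt) -> None:
    # Placeholder for the existing subquery rewrite rules.
    stmt.rewritten = True


def rewrite(stmt: Stmt) -> None:
    # A UNION is rewritten by calling the rewriter on each operand.
    if isinstance(stmt, UnionStmt):
        for operand in stmt.operands:
            rewrite(operand)
    else:
        rewrite_select(stmt)


union = UnionStmt([
    SelectStmt("select a from t1 where a in (select b from t2)"),
    SelectStmt("select a from t3"),
])
rewrite(union)
```

The sketch only covers subqueries inside UNION operands; a UNION nested inside a subquery would need the rewrite rules themselves to understand unions.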

Change-Id: I7f83eed0eb8ae81565e629f09f6918a4ba86ee13
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4859
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: jenkins
2014-11-17 11:19:09 -08:00
Alex Behm
7b6ecbeea5 Fix exhaustive test run: Modify test to produce identical results on HBase.
Change-Id: I7187f9aca63f61ea1686820b3cbec277240da191
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4866
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: jenkins
2014-11-17 11:19:01 -08:00
Dan Hecht
4bf6a21a9e S3: Qualify DataSource paths
Impala qualifies all paths stored in the metastore except for the
DataSource jar path.  Use a qualified path here as well, which will
allow datasources to live on the non-default FS.

In CreateDataSrcStmt, use the post-analyzed qualified path rather than
the user passed string.  Then, fix CreateTableDataSrcStmt so that it
doesn't strip out the scheme://authority portion of the URI, but instead
uses the qualified path string directly.
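
As a rough illustration of what qualifying a path means here, the sketch below prepends a default filesystem URI to paths that lack a scheme and leaves already-qualified paths untouched. The `qualify` helper, the host names, and the jar paths are assumptions made for the example, and only absolute paths are handled.

```python
from urllib.parse import urlsplit


def qualify(path: str, default_fs: str) -> str:
    """Return path with scheme://authority, using default_fs when it has none."""
    if urlsplit(path).scheme:
        # Already qualified: keep the scheme://authority portion as given.
        return path
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


default_fs = "hdfs://namenode:8020"  # hypothetical default filesystem
unqualified = qualify("/jars/datasource.jar", default_fs)
on_other_fs = qualify("s3a://bucket/jars/datasource.jar", default_fs)
```

Storing the qualified form is what allows a jar on a non-default filesystem to be located later, since the scheme and authority travel with the path.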

Note that the metastore may still contain unqualified paths in
DataSource tables' properties that were generated by previous versions.
That's okay though since the backend won't assume all paths are
qualified in case other components generate (or have in the past)
metadata with unqualified paths.

Change-Id: I905d8f6a7bf1793cfccf720b6ab5dc845d7dd5fa
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5201
Reviewed-by: Daniel Hecht <dhecht@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 86c75be01d0f5654291acdbc1c68f5a76915028c)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5239
2014-11-13 12:42:32 -08:00
Skye Wanderman-Milne
c693fbc48c Misc. diagnostic/debugging improvements
- Add number of files in table to query plan
- Add number of remote scan ranges to runtime profile
- Clean up logging in ClientCache

Change-Id: I0580fe435ac0a52548aedb4e0ffa875ce9b9dede
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5166
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-11-06 22:04:11 -08:00
Nong Li
e2d7fb6402 Some test case cleanup.
Change-Id: Ic29b7c1f5fd714a1e2cc41bf0e55c0d11c782862
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4791
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5090
Reviewed-by: Nong Li <nong@cloudera.com>
2014-11-03 22:33:08 -08:00
Matthew Jacobs
164687ad81 IMPALA-1357: Analysis of WithClause pollutes global state
The analysis of a with clause should have its own global state so the
local view(s) can be analyzed without polluting the global state of the
parent QueryStmt. This might not always matter, but in a complex query
involving a with clause that contained a subquery, re-analysis of the
WithClause after the subquery rewrite resulted in an invalid Exists
conjunct being registered in the parent analyzer's global state. The
Exists conjunct was assigned to a scan node which then failed a
pre-condition check.

Change-Id: Ib020787b2e1ff202d96fe1b92bd9740897ab32a0
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4825
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 629a8652c5a290054a8e582cc5cb5768a3ee67a8)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5038
2014-10-30 16:50:00 -07:00
Martin Grund
6e0c1c26c9 IMPALA-1424: abs() function retains input type
This patch modifies the abs() built-in function so that it
retains the type of the input argument for the return type
in the same way as Postgres does.

Change-Id: I1750237b85bedbc3ce9d52330ac4d458b0aada3a
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4980
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 424b359ab0a4f621f2865844c3293f2c80e0867f)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4996
2014-10-28 08:07:21 -07:00
Skye Wanderman-Milne
4a722980e5 IMPALA-1401: raise MAX_PAGE_HEADER_SIZE and use scanner context to
stitch together header buffer

Change-Id: I4f33b90e845e9bef1ac929bf4ebb8e98eaff985c
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4961
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
(cherry picked from commit c3a90183b2f03434a9604f3aa2ef6dd08c9ba97c)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4981
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-10-27 16:30:56 -07:00
Matthew Jacobs
56611601a3 IMPALA-1395: Add test case back, but commented out
Change-Id: I157db82dd016afd54a55512225e8cd6025ec161d
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4936
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4943
2014-10-24 10:31:48 -07:00
Matthew Jacobs
aedf8e5fb8 IMPALA-1395: Remove slow test for IMPALA-1312 that breaks exhaustive runs
Removing the test case for IMPALA-1312 to unblock exhaustive runs. This query was
previously hitting a DCHECK failure in the BufferedTupleStream where the number of
pinned blocks wasn't being updated properly. With codegen enabled, this query took
~70sec. Without codegen, it took so long that the exhaustive runs would fail; I
found it took ~35min on my local machine.

IMPALA-1414 tracks investigating why this query is so slow.

Change-Id: I2bf8a8c51fc7ded0026e334636f9b2cc859ffdb2
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4931
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit f8b7320e035549da4e4a6a99b87da97bc18be0ad)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4941
2014-10-24 03:47:45 -07:00
Martin Grund
e866765213 IMPALA-181: ORDER BY with Ordinals
For certain queries, ORDER BY with ordinals did not work
properly. This is the case for all "select *" type queries. Until
now, the ordinal substitution was based on the values from the select
list. However, these expressions are not expanded in the case of "*";
instead, the list of result expressions and column labels is filled.

This patch simply changes the lookup of the expression from the select
list to the result list because only ordinals from the result can be
used as a sorting field.
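
The difference between the two lookups can be sketched as follows. The helper names and the column list are hypothetical; the point is only that an ordinal indexes into the expanded result list, which can be longer than the select list when "*" is present.

```python
from typing import List


def expand_select_list(select_list: List[str], table_columns: List[str]) -> List[str]:
    """Build the result expression list, expanding "*" into the table's columns."""
    result: List[str] = []
    for item in select_list:
        if item == "*":
            result.extend(table_columns)
        else:
            result.append(item)
    return result


def resolve_ordinal(ordinal: int, result_exprs: List[str]) -> str:
    """Resolve a 1-based ORDER BY ordinal against the result list."""
    if not 1 <= ordinal <= len(result_exprs):
        raise ValueError(f"ORDER BY ordinal {ordinal} is out of range")
    return result_exprs[ordinal - 1]


select_list = ["*"]
result_exprs = expand_select_list(select_list, ["id", "name", "ts"])
order_by_expr = resolve_ordinal(2, result_exprs)
```

Looking up ordinal 2 in `select_list` would fail because that list has a single unexpanded entry, while the result list has one entry per output column.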

Change-Id: I21d3c3da837307cae04f8a4be02ca31bdcfbcbdb
(cherry picked from commit 1b62c08552c19f1b0c2220d1568804e2eba7efac)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4920
Tested-by: jenkins
Reviewed-by: Martin Grund <mgrund@cloudera.com>
2014-10-22 15:19:09 -07:00
Nong Li
86aebc7f8f IMPALA-1348: Fix NAAJ where the null partitions have streams with multiple blocks.
Change-Id: I892f3435814bd4fcddeb496017dbb60704f13419
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4728
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-10-14 12:01:53 -07:00
Henry Robinson
b6e91905ed IMPALA-1384: Fix show table stats test on exhaustive test run
Change-Id: I2f1033bc078906ce72a19099f214ab4e3cd9a936
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4824
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 0ead02755b6a65d408bed59df810114e26c0c397)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4830
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-10-11 22:46:05 -07:00
ishaan
23964c19af [CDH5] Fix bad merge in spilling.test
Change-Id: Ia6e30cf5916c737088d8cb969e0167b9d69a599e
2014-10-08 23:19:02 -07:00
Nong Li
5845a02b6e IMPALA-1351: Update NAAJ stream to use io sized buffers and better error handling.
Since we only make one NULL-aware stream per NAAJ (as opposed to one per partition),
we do not care about the memory footprint on this tuple stream. For simplicity,
this will always use io-sized buffers.
Also, improving error handling in PHJ::ProcessProbeBatch(), as status_ was not being
set properly.

Disabling the regression test for this bug, as it takes too long to run. Need to find
a simpler query.

Change-Id: I7572f607199f38b1bc30ae208ece2832522342a1
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4770
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: jenkins

Conflicts:

	be/src/exec/partitioned-hash-join-node.cc
2014-10-07 16:52:05 -07:00
Nong Li
a2e7b05bb1 IMPALA-1332: Fix memory leak for FULL OUTER/RIGHT OUTER joins.
This can happen if not all rows are returned.

Change-Id: I4d54641b71c44faa85a2138d16f9dda1052317b5
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4737
Tested-by: jenkins
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
2014-10-06 19:49:56 -07:00
Matthew Jacobs
652d4b4699 IMPALA-1234: Fix bugs when producing EmptySetNode
Fixes two issues that can occur when generating the plan for a
stmt with an empty result set (e.g. due to limit 0 or constant
predicates that evaluate to false):
 1) Unions with an inline view that produces an empty result set
    do not create the EmptySetNode for the correct stmt.
 2) An EmptySetNode may contain non-materialized tuples which
    will fail a precondition check when generating the thrift
    plan.

Change-Id: I1511c755be3a59fdb8934624fd08250323266d27
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4744
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-10-06 19:49:50 -07:00
Skye Wanderman-Milne
b6204dff59 IMPALA-1340: removing implicit casts during expr substitution is not always safe
Union statements were sometimes losing necessary casts during
expression substitution, causing the backend union node to receive
slot refs that did not have the same types as the result tuple. Add a
flag to Expr.Substitute() to preserve the root expr types, which adds
back the casts after substitution.

Currently only the union node sets this flag to true, but there may be
other places that are incorrect.
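
The effect of the flag can be sketched like this. The `Expr` and `Cast` classes, the `substitute` function, and its `preserve_root_type` parameter are simplified, hypothetical stand-ins for the frontend classes; substitution here drops the implicit root cast and the flag adds it back when the type changed.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Expr:
    name: str
    type: str


@dataclass(frozen=True)
class Cast:
    child: Expr
    type: str


def substitute(expr: Union[Expr, Cast], smap: Dict[str, Expr],
               preserve_root_type: bool = False) -> Union[Expr, Cast]:
    original_type = expr.type
    # The implicit cast at the root is removed before substitution.
    inner = expr.child if isinstance(expr, Cast) else expr
    result: Union[Expr, Cast] = smap.get(inner.name, inner)
    # Optionally add the cast back so the root keeps its original type.
    if preserve_root_type and result.type != original_type:
        result = Cast(result, original_type)
    return result


union_operand = Cast(Expr("a", "TINYINT"), "BIGINT")
smap = {"a": Expr("t.a", "TINYINT")}

lossy = substitute(union_operand, smap)
safe = substitute(union_operand, smap, preserve_root_type=True)
```

Without the flag the substituted expression comes back as TINYINT, which no longer matches a BIGINT slot in the union's result tuple.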

Change-Id: I1b4d9846860ef9694ff0c089f79654b1746d687d
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4777
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-10-06 17:47:37 -07:00
Nong Li
de31fa8e21 Disable spilling tests that are too flaky.
Change-Id: I4ac877c3fa8297d873c67f219bb0c75f0001562d
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4731
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-10-06 15:18:56 -07:00
Alex Behm
3e7de9f304 IMPALA-1318: Joins should not return semi-joined tuples.
Change-Id: I93f5ddb8317af7794b5977e145805f9ff498d722
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4633
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-10-06 15:17:22 -07:00
Henry Robinson
6af7c8fe4a IMPALA-1330: Fix column types for SHOW {table, partition} STATS
Because we add 'total' to the last row in SHOW PARTITIONS, we set the
partition key columns to be string. At least, that's what the comment
said, but we didn't do that in fact.

This patch also corrects the column type for max width, which should be INT.

Change-Id: I787ab17be27f45107340119017e528c58a3daad3
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4678
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-10-06 15:16:56 -07:00
Victor Bittorf
7b244d34b6 IMPALA-1344: Fixed analytic aggregations with CHAR
The fix is to only register aggregates for STRING, not for CHAR or VARCHAR. The CHAR and
VARCHAR types are implicitly cast to STRING for aggregation.

Also, fixed aggregate fn builtins that should not ignore distinct.

Change-Id: If4c1a2c6127360c2c8127a5c02949df74fafc85a
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4717
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:16:50 -07:00
Victor Bittorf
a62500ee28 Changed CHAR & VARCHAR max length to match Hive.
Also modified the text of the analysis exception for lengths that are too long or
short because John said they were unclear.

Change-Id: I9427d5c39298aa8207672e50e10fe527c5076599
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4698
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:16:45 -07:00
Victor Bittorf
c29ed3761e IMPALA-1339: NULLs incorrectly hashed in groupby
Problem: hash table assumed all raw values were at most 16 bytes. This maximum was
increased to support up to 128 bytes for CHARs.

Change-Id: I107c58b9a013d5db46ff5586bcdceee3961346e9
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4701
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:16:36 -07:00
Nong Li
e08ffde009 PA/PHJ: Increase fanout to 32 and fix interaction with small buffers.
Small buffers introduced an issue that is exacerbated by the large fanout. A stream can
only be appended to forever once it has grabbed the initial io sized buffer. With small
buffers, we don't grab that at the beginning anymore and, before this patch, it is
grabbed when the stream first needs it. This means when one stream needs it, another
stream could have already grabbed it (meaning this stream is pinned with multiple
buffers).

This patch has all the streams grab an IO buffer as soon as the first stream needs an
io buffer. This guarantees that all streams get 1 before any get 2.
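
A simplified sketch of that policy follows. `Stream`, `BufferPool`, and `grab_initial_io_buffers` are hypothetical names, and the pool size is arbitrary; the real buffer management is more involved.

```python
class Stream:
    def __init__(self, name: str):
        self.name = name
        self.io_buffers = 0


class BufferPool:
    def __init__(self, available: int):
        self.available = available

    def try_acquire(self, stream: Stream) -> bool:
        if self.available == 0:
            return False
        self.available -= 1
        stream.io_buffers += 1
        return True


def grab_initial_io_buffers(streams, pool: BufferPool) -> bool:
    """Called when the first stream needs an IO buffer: every stream gets one."""
    for stream in streams:
        if stream.io_buffers == 0 and not pool.try_acquire(stream):
            return False
    return True


streams = [Stream(f"partition-{i}") for i in range(32)]  # fanout of 32
pool = BufferPool(available=40)
ok = grab_initial_io_buffers(streams, pool)
```

Because the buffers are handed out in one pass, no stream can hold a second IO buffer while another stream is still waiting for its first.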

Change-Id: I1be1219fc5f1fa3ceedd4d5e76ae056c8bb8ff3d
2014-10-06 15:16:16 -07:00
Victor Bittorf
d5fd59e2ed IMPALA-1337: Aggregation failures for VARCHAR
The issue is that the aggregation node needed to use IsVarLen; previously
it assumed TYPE_STRING was the only variable length type.

Change-Id: I9545e8d405937a47b25c9042f97854851a448c6e
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4690
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:14:51 -07:00
Victor Bittorf
f4626b03e6 IMPALA-1322: Fix related issue
There is an issue related to IMPALA-1322. The expression list was being improperly
indexed when laying out memory.

Change-Id: I2eef84a812b451d87ecb8afd304e765aff1f5a6b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4675
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:14:44 -07:00
Nong Li
3e632ef6ad Reduce min PA/PHJ mem requirement.
Update PA/PHJ to use small (< io sized) buffers initially. Without this we would
not be able to run at the QPS that we need just due to the buffering requirements
of these operators.

Change-Id: Ic8a777d147893567c9590fbab17f561eadb6ee19
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4623
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-10-06 15:14:10 -07:00
Victor Bittorf
794e70b0bd Fix CHAR/VARCHAR Aggregation
This fixes an issue where VARCHAR and CHAR could error in some aggregations.
The cause of the problem is that the BE currently does not support CHAR/VARCHAR as
arguments to aggregates; they require an implicit cast to string first.
The resolution is to have these operators return STRING instead of CHAR(*) or VARCHAR(*).
Note that the CHAR(*) comparisons still ignore spaces for min/max.

This takes advantage of the fact that STRING, VARCHAR(*), and CHAR(*) values are all
handled as a StringVal for exprs. The STRING aggregates are registered as CHAR(*) and
VARCHAR(*) aggregates and the front end converts the return type to a STRING in all cases.

Also includes a fix for a TODO about casting between CHAR and VARCHAR.

Change-Id: I1d3a9cc48e426286ce63677324a8c680e67b005a
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4573
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:13:17 -07:00
Victor Bittorf
fa502f973a IMPALA-1319: Fixed CHAR padding for numeric casts
IMPALA-1322: Crash on VARCHAR/CHAR join

Fixed 3 issues:
  (1) Disabled codegen for CHAR in hash join equality
  (2) Fixed memory layout for CHAR
  (3) Fixed a regression where space padding could be dropped for numeric casts.

Change-Id: I6475fd527ca0d67c7d4d5ec7e561549e43fbc336
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4640
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:12:44 -07:00
Skye Wanderman-Milne
0db2181d97 IMPALA-1326: fix bug in BufferedTupleStream::GetTupleRow()
Change-Id: If133a2041e0bae0c327fe83b114e36b9320784bb
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4658
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-10-06 15:12:32 -07:00
Henry Robinson
080299730c IMPALA-1298: Add var_{pop,samp} as aliases for variance_{pop,samp}
Change-Id: I5880ad7ebf0775704ee7fa08685928224e316458
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4656
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-10-06 15:12:25 -07:00
Nong Li
a1b2de9c95 Update distinctpc/pcsa to return bigint.
Change-Id: Iac3414aa0151f52ba9ec028da152b09fc09af264
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4637
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-10-06 15:12:12 -07:00
Matthew Jacobs
8b1b8f5780 IMPALA-1302: Incorrect result of FIRST_VALUE query
FIRST_VALUE with row offsets preceding did not produce the correct
results. This fix changes the rewrite for FIRST_VALUE and adds
additional handling for NULLs in the backend.

Change-Id: I03d54c05f63f46e9adb467008fa876ab33812c7b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4648
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
2014-10-06 15:12:03 -07:00
Matthew Jacobs
86b9f8282f Move aggregation tests on decimal tables to decimal.test
Fixes test failures in exhaustive mode when aggregation tests
are run on table formats that do not support decimal.

Change-Id: Ic5dfb398575770cf318ffcc0ce3a20737bb2f5cd
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4636
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-10-06 15:11:58 -07:00
Matthew Jacobs
8f0c206bdd IMPALA-1087: Fix error handling loading libraries in LibCache
If an error occurred loading a library in LibCache (e.g. by using
CREATE FUNCTION) an error is returned but a cache entry may still
exist which may result in strange errors later when the cache
entry is accessed by subsequent queries.

This changes LibCache::GetCacheEntry to ensure cache entries do
not exist if errors occur. Because GetCacheEntry needs to take
the global lock and then the cache entry lock, but needs to
unlock the global lock before performing slow HDFS operations,
we set the error status on the cache entry so that all locks
can be released when an error occurs. Other threads that attempt
to access the cache entry check the status and return if it is
not OK. The first thread (the thread that got the error) can
then remove the cache entry whenever it is able to again acquire
the global lock_.
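
The locking protocol described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the backend code: `LibCache`, `CacheEntry`, `get_cache_entry`, and the `loader` callback are hypothetical Python stand-ins, and the slow HDFS operation is represented by the callback.

```python
import threading


class CacheEntry:
    def __init__(self):
        self.lock = threading.Lock()
        self.status_ok = True


class LibCache:
    def __init__(self, loader):
        self._lock = threading.Lock()  # global lock
        self._entries = {}
        self._loader = loader          # slow operation, e.g. copying a library

    def contains(self, path: str) -> bool:
        with self._lock:
            return path in self._entries

    def get_cache_entry(self, path: str) -> CacheEntry:
        with self._lock:
            entry = self._entries.get(path)
            is_new = entry is None
            if is_new:
                entry = CacheEntry()
                entry.lock.acquire()   # uncontended: not yet visible to others
                self._entries[path] = entry
        if not is_new:
            with entry.lock:
                # Other threads check the status and bail out if it is not OK.
                if not entry.status_ok:
                    raise RuntimeError("library failed to load: " + path)
                return entry
        # New entry: global lock released, entry lock held during the slow load.
        error = None
        try:
            self._loader(path)
        except Exception as exc:
            entry.status_ok = False    # record the error on the entry
            error = exc
        finally:
            entry.lock.release()
        if error is not None:
            # The failing thread removes the entry once it holds the global lock.
            with self._lock:
                if self._entries.get(path) is entry:
                    del self._entries[path]
            raise error
        return entry
```

Recording the failure on the entry lets the loading thread release every lock before cleanup, so no thread ever waits on the global lock while a slow load is in progress.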

Change-Id: I00fd0e2a4611b06fa72ffe0aaaa7d077b7a0c36e
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4642
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
2014-10-06 15:11:43 -07:00
Matthew Jacobs
928907905b Fix appx_median to return correct result type
Change-Id: Ifc54e1069e2f7a46242229d710943e921633a920
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4625
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
2014-10-06 15:11:27 -07:00
Victor Bittorf
658f05f63c IMPALA-1316: crash on VARCHAR join
Fixed codegen issue causing some VARCHAR joins to crash.

Change-Id: Ib2674199a3b2c3c5a5fd63cfae0b64e3b1ca158b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4616
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:11:10 -07:00
Ippokratis Pandis
5c4486a2b2 Proper handling of NULL tuples by buffered-tuple-stream.
Adding a bitstring at the head of each block in the TupleStream that indicates which
tuples of the appended rows in the block are NULLs. When reading the stream, through
GetNext() or GetTupleRow() calls, the NULL tuples are stitched back to their correct
position.

This fixes crashes in PHJ of bushy plans with NULLs on the build side(s) as well as
similar crashes in PAGG and the analytic node.
For example, it fixes IMPALA-1204, IMPALA-1223, and IMPALA-1249.
Also, adds regression tests for IMPALA-1175, IMPALA-1204, IMPALA-1223, IMPALA-1249
and IMPALA-1306.
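
The null-indicator scheme can be sketched as follows. This is a minimal illustration with hypothetical `write_block` and `read_block` helpers; a Python list of booleans stands in for the packed bitstring, and plain Python tuples stand in for serialized tuple data.

```python
from typing import List, Optional, Tuple

Row = Tuple[Optional[tuple], ...]


def write_block(rows: List[Row]):
    """Store only non-NULL tuples, plus one indicator bit per tuple slot."""
    null_bits: List[bool] = []
    data: List[tuple] = []
    for row in rows:
        for tup in row:
            null_bits.append(tup is None)
            if tup is not None:
                data.append(tup)
    return null_bits, data


def read_block(null_bits: List[bool], data: List[tuple], tuples_per_row: int) -> List[Row]:
    """Rebuild rows, stitching NULL tuples back into their original positions."""
    stored = iter(data)
    rows: List[Row] = []
    for start in range(0, len(null_bits), tuples_per_row):
        slots = null_bits[start:start + tuples_per_row]
        rows.append(tuple(None if is_null else next(stored) for is_null in slots))
    return rows


rows = [
    (("a", 1), None),
    (None, ("b", 2)),
    (("c", 3), ("d", 4)),
]
null_bits, data = write_block(rows)
restored = read_block(null_bits, data, tuples_per_row=2)
```

Only four tuples are stored for the six slots, and the indicator bits are enough to put each NULL back where it was when the block is read.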

Change-Id: I30ad0dbd4dfeabcda8fae444d1c6ec9291f38398
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4596
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: jenkins
2014-10-06 15:10:58 -07:00
Dimitris Tsirogiannis
5db0f877cb Fix subqueries test for HBase
Change-Id: I8d3c10d29a198135e87ab848ba206c2662166760
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4597
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: jenkins
2014-10-06 15:09:37 -07:00
Dimitris Tsirogiannis
b201c7a7d1 IMPALA-1299: Analytic should be allowed in correlated EXISTS subquery
With this commit we enable correlated and uncorrelated EXISTS
subqueries with grouping and/or aggregation including analytic
functions. Furthermore, we enable correlated EXISTS subqueries
with a LIMIT clause.

Change-Id: I36c33f80b152b7f175bf803cbe920ce1983d7162
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4583
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: jenkins
2014-10-06 15:09:25 -07:00
Nong Li
4d2da72698 IMPALA-1312: Fix num_pinned tracking in BufferedTupleStream.
Change-Id: I04264ef25ba8d43826e65f98d34135e7b3593f8b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4582
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-10-06 15:09:20 -07:00
ishaan
010cc22a2f [CDH5] Fix test spilling.
tpch in cdh5 does not have double columns. Also, remove round calls to test that we get
consistent results.

Change-Id: Ia45ef08644ed78b05a08c47422733ab38a26b508
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4595
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-09-26 22:57:02 -07:00
ishaan
a7c87bb250 [CDH5] Fix tpcds analytical functions test.
There was a new test file in cdh4 which had the wrong datatypes for tpcds.

Change-Id: Ide1300d0a539d1f40a4c0763b44d06fd81c96204
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4590
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-09-26 16:56:40 -07:00
Victor Bittorf
dbaf718221 IMPALA-1185: Make Avro and Seq writers unsupported
Avro and Sequence writers are only available if query option
ALLOW_UNSUPPORTED_FORMATS is set to true; otherwise an error is printed.

Change-Id: I597039f7c68f708fda10f848531eb557d6910f92
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4539
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-09-26 12:28:03 -07:00
Nong Li
d5c948c351 Increase the mem limit for one of the spilling queries.
Change-Id: I9b52582b2ded82821ecc446762f07d7702dedabf
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4555
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-09-26 12:27:29 -07:00
casey
9a72c28832 Add DECODE builtin
This adds DECODE functionality into the existing CaseExpr class. There
will be no separate backend implementation for DECODE; it will be sent to
the backend as a CASE expr so the existing codegen function can be used.

Because Oracle does cast checking during execution and Impala does cast
checking during analysis, some uses of DECODE that are valid in Oracle
are invalid in Impala.

Ex:

  SELECT DECODE(foo, bar, int_col, baz, string_col_containing_only_ints)
  FROM ...

  would be run on Oracle. If string_col_containing_only_ints actually
  contained non-INTs, an error would be thrown during execution and no
  results would be returned. In Impala an error is thrown during analysis.
  If a CAST was added to the STRING column, a cast failure would result in
  NULL.
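
The mapping from DECODE to CASE can be sketched as a text rewrite. The `decode_to_case` helper is hypothetical and works on SQL fragments as strings; it uses plain equality and does not model type checking or any special NULL handling.

```python
from typing import Optional


def decode_to_case(expr: str, *args: str) -> str:
    """Rewrite DECODE(expr, search1, result1, ..., [default]) as a CASE expression."""
    if len(args) < 2:
        raise ValueError("DECODE needs at least one search/result pair")
    default: Optional[str] = None
    pairs = args
    if len(args) % 2 == 1:
        # An odd number of trailing arguments means the last one is the default.
        pairs, default = args[:-1], args[-1]
    parts = ["CASE"]
    for i in range(0, len(pairs), 2):
        parts.append(f"WHEN {expr} = {pairs[i]} THEN {pairs[i + 1]}")
    if default is not None:
        parts.append(f"ELSE {default}")
    parts.append("END")
    return " ".join(parts)


case_sql = decode_to_case("foo", "bar", "int_col", "baz", "string_col")
with_default = decode_to_case("foo", "bar", "1", "0")
```

Since every THEN branch of the resulting CASE must agree on a type at analysis time, mixing an INT column with a STRING column is rejected up front unless an explicit CAST is added.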

Change-Id: Ia08cc2389abb6f843bba117e7091c659ad25ff41
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4334
Tested-by: jenkins
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Casey Ching <casey@cloudera.com>
2014-09-26 12:26:46 -07:00