749 Commits

Author SHA1 Message Date
Henry Robinson
080299730c IMPALA-1298: Add var_{pop,samp} as aliases for variance_{pop,samp}
Change-Id: I5880ad7ebf0775704ee7fa08685928224e316458
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4656
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-10-06 15:12:25 -07:00
Nong Li
a1b2de9c95 Update distinctpc/pcsa to return bigint.
Change-Id: Iac3414aa0151f52ba9ec028da152b09fc09af264
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4637
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-10-06 15:12:12 -07:00
Matthew Jacobs
8b1b8f5780 IMPALA-1302: Incorrect result of FIRST_VALUE query
FIRST_VALUE with row offsets preceding did not produce the correct
results. This fix changes the rewrite for FIRST_VALUE and adds
additional handling for NULLs in the backend.

Change-Id: I03d54c05f63f46e9adb467008fa876ab33812c7b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4648
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
2014-10-06 15:12:03 -07:00
Matthew Jacobs
86b9f8282f Move aggregation tests on decimal tables to decimal.test
Fixes test failures in exhaustive mode when aggregation tests
are run on table formats that do not support decimal.

Change-Id: Ic5dfb398575770cf318ffcc0ce3a20737bb2f5cd
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4636
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
2014-10-06 15:11:58 -07:00
Matthew Jacobs
8f0c206bdd IMPALA-1087: Fix error handling loading libraries in LibCache
If an error occurred loading a library in LibCache (e.g. by using
CREATE FUNCTION) an error is returned but a cache entry may still
exist which may result in strange errors later when the cache
entry is accessed by subsequent queries.

This changes LibCache::GetCacheEntry to ensure cache entries do
not exist if errors occur. Because GetCacheEntry needs to take
the global lock and then the cache entry lock, but needs to
unlock the global lock before performing slow HDFS operations,
we set the error status on the cache entry so that all locks
can be released when an error occurs. Other threads that attempt
to access the cache entry check the status and return if it is
not OK. The first thread (the thread that got the error) can
then remove the cache entry whenever it is able to again acquire
the global lock_.
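
A minimal sketch of the locking scheme described above, not the actual LibCache code. The class names, the `loader` callback, and the `ok`/`loaded` fields are hypothetical stand-ins. The failing thread marks the entry as bad while holding only the entry lock, then removes the entry once it can take the global lock again.

```python
import threading


class CacheEntry:
    def __init__(self):
        self.lock = threading.Lock()
        self.ok = True        # stand-in for the entry's error status
        self.loaded = False
        self.value = None     # loaded library handle, once available


class LibCacheSketch:
    def __init__(self, loader):
        self._global_lock = threading.Lock()
        self._entries = {}
        self._loader = loader  # slow call, e.g. a copy from HDFS

    def get_cache_entry(self, key):
        # Take the global lock, then the entry lock, then drop the global lock
        # so the slow load below does not block unrelated lookups.
        with self._global_lock:
            entry = self._entries.get(key)
            if entry is None:
                entry = CacheEntry()
                self._entries[key] = entry
            entry.lock.acquire()
        failed_here = False
        try:
            if not entry.ok:
                # Another thread already hit an error on this entry.
                raise RuntimeError("library %r failed to load" % (key,))
            if not entry.loaded:
                try:
                    entry.value = self._loader(key)
                    entry.loaded = True
                except Exception:
                    entry.ok = False   # visible to threads waiting on the entry
                    failed_here = True
                    raise
            return entry.value
        finally:
            entry.lock.release()
            if failed_here:
                # Only the thread that saw the error removes the entry.
                with self._global_lock:
                    if self._entries.get(key) is entry:
                        del self._entries[key]
```

After a failed load no entry is left behind, so a later call for the same key starts from a fresh entry.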

Change-Id: I00fd0e2a4611b06fa72ffe0aaaa7d077b7a0c36e
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4642
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
2014-10-06 15:11:43 -07:00
Matthew Jacobs
928907905b Fix appx_median to return correct result type
Change-Id: Ifc54e1069e2f7a46242229d710943e921633a920
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4625
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
2014-10-06 15:11:27 -07:00
Alex Behm
39fef93425 IMPALA-1278: Basic cardinality estimation for semi joins.
Change-Id: I353c9c581f7a54e0b42bdd1e89cf99b93d6e18de
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4634
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-10-06 15:11:15 -07:00
Victor Bittorf
658f05f63c IMPALA-1316: crash on VARCHAR join
Fixed codegen issue causing some VARCHAR joins to crash.

Change-Id: Ib2674199a3b2c3c5a5fd63cfae0b64e3b1ca158b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4616
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:11:10 -07:00
Ippokratis Pandis
5c4486a2b2 Proper handling of NULL tuples by buffered-tuple-stream.
Adding a bitstring at the head of each block in the TupleStream that indicates which
tuples of the appended rows in the block are NULLs. When reading the stream, through
GetNext() or GetTupleRow() calls, the NULL tuples are stitched back to their correct
position.

This fixes crashes in PHJ of bushy plans with NULLs on the build side(s) as well as
similar crashes in PAGG and the analytic node.
For example, it fixes IMPALA-1204, IMPALA-1223, and IMPALA-1249.
Also, adds regression tests for IMPALA-1175, IMPALA-1204, IMPALA-1223, IMPALA-1249
and IMPALA-1306.
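
A rough illustration of the NULL-indicator idea, assuming a block is modelled as a Python list. The function names and the in-memory layout are hypothetical and do not reflect the real BufferedTupleStream format. Only non-NULL tuples are stored, and a leading list of bits records where the NULLs were.

```python
def write_block(rows):
    """Return (null_bits, payload) for a list of rows of tuples.

    null_bits has one flag per tuple slot, in row order; payload holds
    only the non-NULL tuples.
    """
    null_bits = []
    payload = []
    for row in rows:
        for tup in row:
            null_bits.append(tup is None)
            if tup is not None:
                payload.append(tup)
    return null_bits, payload


def read_block(null_bits, payload, tuples_per_row):
    """Stitch NULL tuples back into their original positions."""
    stored = iter(payload)
    flat = [None if is_null else next(stored) for is_null in null_bits]
    return [flat[i:i + tuples_per_row]
            for i in range(0, len(flat), tuples_per_row)]
```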

Change-Id: I30ad0dbd4dfeabcda8fae444d1c6ec9291f38398
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4596
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: jenkins
2014-10-06 15:10:58 -07:00
Lenni Kuff
758ba08bbb Silence most of data loading spew by redirecting it to log files
Change-Id: I256a3970ce52bbcac816178029f703095fec388f
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4610
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-10-06 15:09:42 -07:00
Dimitris Tsirogiannis
5db0f877cb Fix subqueries test for HBase
Change-Id: I8d3c10d29a198135e87ab848ba206c2662166760
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4597
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: jenkins
2014-10-06 15:09:37 -07:00
Dimitris Tsirogiannis
b201c7a7d1 IMPALA-1299: Analytic should be allowed in correlated EXISTS subquery
With this commit we enable correlated and uncorrelated EXISTS
subqueries with grouping and/or aggregation including analytic
functions. Furthermore, we enable correlated EXISTS subqueries
with a LIMIT clause.

Change-Id: I36c33f80b152b7f175bf803cbe920ce1983d7162
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4583
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: jenkins
2014-10-06 15:09:25 -07:00
Nong Li
4d2da72698 IMPALA-1312: Fix num_pinned tracking in BufferedTupleStream.
Change-Id: I04264ef25ba8d43826e65f98d34135e7b3593f8b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4582
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-10-06 15:09:20 -07:00
Nong Li
8e48068d6b BufferedBlockMgr: bug fixes for stress.
Change-Id: I084569e10595fd359c7c83be731c4378156185d4
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4576
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-10-06 15:09:13 -07:00
ishaan
e126a3c8b5 Enable more tpcds queries that use correlated subqueries and analytic functions.
This patch only operates on queries that use store_sales as the fact table.

Change-Id: I763245ef5f68bb1519bcb4d4b26ede96913a1d57
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4312
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4106
2014-09-27 01:15:41 -07:00
ishaan
010cc22a2f [CDH5] Fix test spilling.
tpch in cdh5 does not have double columns. Also, remove round calls to test that we get
consistent results.

Change-Id: Ia45ef08644ed78b05a08c47422733ab38a26b508
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4595
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-09-26 22:57:02 -07:00
ishaan
a7c87bb250 [CDH5] Fix tpcds analytical functions test.
There was a new test file in cdh4 which had the wrong datatypes for tpcds.

Change-Id: Ide1300d0a539d1f40a4c0763b44d06fd81c96204
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4590
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
2014-09-26 16:56:40 -07:00
Victor Bittorf
dbaf718221 IMPALA-1185: Make Avro and Seq writers unsupported
Avro and Sequence writers are only available if query option
ALLOW_UNSUPPORTED_FORMATS is set to true; otherwise an error is printed.

Change-Id: I597039f7c68f708fda10f848531eb557d6910f92
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4539
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-09-26 12:28:03 -07:00
Nong Li
d5c948c351 Increase the mem limit for one of the spilling queries.
Change-Id: I9b52582b2ded82821ecc446762f07d7702dedabf
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4555
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-09-26 12:27:29 -07:00
Taras Bobrovytsky
fec685e075 Added stress tests for Agg and Join with spilling
Change-Id: I07999f0902886e16646f30fe2981074b6c683264
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4527
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-09-26 12:27:12 -07:00
casey
9a72c28832 Add DECODE builtin
This adds DECODE functionality into the existing CaseExpr class. There
will be no separate backend implementation for DECODE; it will be sent to
the backend as a CASE expr so the existing codegen function can be used.

Because Oracle does cast checking during execution and Impala does cast
checking during analysis, some uses of DECODE that are valid in Oracle
are invalid in Impala.

Ex:

  SELECT DECODE(foo, bar, int_col, baz, string_col_containing_only_ints)
  FROM ...

  would be run on Oracle. If string_col_containing_only_ints actually
  contained non-INTs, an error would be thrown during execution and no
  results would be returned. In Impala an error is thrown during analysis.
  If a CAST was added to the STRING column, a cast failure would result in
  NULL.
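
The DECODE-to-CASE mapping can be pictured with a small string-level sketch. This is an illustration only: `decode_to_case` is a hypothetical helper, the real change works on CaseExpr objects in the frontend, and the sketch ignores how DECODE matches NULL against NULL.

```python
def decode_to_case(args):
    """Rewrite DECODE(expr, search1, result1, ..., [default]) as CASE text."""
    if len(args) < 3:
        raise ValueError(
            "DECODE needs an expression and at least one search/result pair")
    expr, rest = args[0], args[1:]
    # An odd number of trailing arguments means the last one is the default.
    has_default = len(rest) % 2 == 1
    pairs = rest[:-1] if has_default else rest
    whens = " ".join(
        "WHEN %s = %s THEN %s" % (expr, pairs[i], pairs[i + 1])
        for i in range(0, len(pairs), 2))
    else_part = " ELSE %s" % rest[-1] if has_default else ""
    return "CASE %s%s END" % (whens, else_part)
```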

Change-Id: Ia08cc2389abb6f843bba117e7091c659ad25ff41
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4334
Tested-by: jenkins
Reviewed-by: Casey Ching <casey@cloudera.com>
Tested-by: Casey Ching <casey@cloudera.com>
2014-09-26 12:26:46 -07:00
Victor Bittorf
a3767c9f2b Fix data loading to unblock gvm
Change-Id: I5e145f1e8497d340cb72a8112c247e63b1c79362
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4537
Reviewed-by: Nong Li <nong@cloudera.com>
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: Victor Bittorf <victor.bittorf@cloudera.com>
2014-09-26 12:26:37 -07:00
Nong Li
f03b05ed50 Fix hash table buckets to allocate memory from the BlockMgr.
This was always a TODO. We want memory to come from the block mgr and trigger spilling.

Change-Id: I07f1f79fbbb33068fb2df64510a80a9b008ef73d
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4466
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-09-26 12:26:09 -07:00
Dimitris Tsirogiannis
cb53f14087 IMPALA-1301: Invalid rewrite of uncorrelated scalar subquery when outer
expr is in both sides of binary predicate

This commit fixes the issue where an invalid rewrite is performed for
scalar uncorrelated subqueries that participate in a binary predicate
in which columns from the outer query block appear in both sides. We fix
this issue by rewriting the subquery using a cross join and placing the
binary predicate in the outer query's WHERE clause.

Change-Id: Ic5234bd2ee704d5ddead9217a636259e694e3eda
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4512
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
2014-09-26 12:25:06 -07:00
Victor Bittorf
af4b2086dc Char PARQUET, AVRO, and TEXT tests
Adds fixes and tests for Hive CHAR & VARCHAR compatibility.
Also fixes a bug in tuple materialization for VARCHAR and non in-lined CHAR.

Change-Id: I400b089cb8ddba2e264ef9f2e37956b2ceaaf9fb
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4054
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-09-26 12:24:07 -07:00
Matthew Jacobs
b99fe95b7c IMPALA-1293: Fix DCHECK failure with window ROWS BETWEEN UNBOUNDED PRECEDING
Change-Id: I4e92e9593402f4341826c6940e23e493c7d23641
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4487
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
2014-09-26 12:03:40 -07:00
Matthew Jacobs
f75fc4337a IMPALA-1296: Fix DCHECK failure for unnecessary buffered tuple in AnalyticNode
The AnalyticEvalNode had a DCHECK that expected the buffered tuple to
only be set when it was needed (i.e. when there are partition exprs or
order by exprs). However, the FE creates a buffered tuple for an entire
sort group when any AnalyticEvalNodes in the sort group need it and that
tuple is set for all nodes. This reverses the logic so that we DCHECK
the buffered tuple is set when it is needed.

Change-Id: If54b303bc439f235da06a542b46a35c61da9e1bd
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4489
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
2014-09-26 12:03:06 -07:00
Matthew Jacobs
e3c8d2fe7e Analytic fn query test for IMPALA-1280 needs VERIFY_IS_EQUAL_SORTED
Change-Id: Iefb232a501a97a0f0351fd0794c7a3bfc279f98c
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4513
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: jenkins
2014-09-26 12:03:00 -07:00
Victor Bittorf
afbc2c28a3 Char Partition Fix
Fixed a bug with CHAR and VARCHAR partition columns. Also, disables CHAR and VARCHAR
for UDAs and UDFs.

Change-Id: I67ccd746cb4c063f8a7a984df9564fa9122fdf43
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4493
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-09-26 12:02:54 -07:00
Dimitris Tsirogiannis
5046a47dc3 IMPALA-1297: Results of NOT IN may not be correct if subquery results in
NULL

This commit fixes a bug in the implementation of the null-aware anti
join that resulted in wrong results being returned from NOT IN correlated
subqueries in the presence of nulls.
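
For reference, the standard SQL three-valued result of NOT IN that a null-aware anti join has to reproduce can be sketched as below. `not_in` is a hypothetical helper, with Python's None standing in for both SQL NULL and the UNKNOWN result. A WHERE clause keeps a row only when the result is True.

```python
def not_in(value, subquery_values):
    """Return True, False, or None (UNKNOWN) for `value NOT IN (subquery)`."""
    if not subquery_values:
        return True          # empty subquery: TRUE even for a NULL value
    if value is None:
        return None          # NULL compared against a non-empty set
    saw_null = False
    for candidate in subquery_values:
        if candidate is None:
            saw_null = True
        elif candidate == value:
            return False     # a definite match
    # No match; a NULL in the subquery leaves the answer UNKNOWN.
    return None if saw_null else True
```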

Change-Id: I6f2eb326ec7e40d80ec8da94ba33946b9ac9b115
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4506
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: jenkins
2014-09-26 12:02:47 -07:00
Taras Bobrovytsky
9e67b4a401 Analytic functions tests
Added several tests for analytical functions

Tests for the following have not been added because they are not implemented yet:
- Lag, Lead functions
- Window clauses

Change-Id: I34546c967a6d29c97327f4cba405006a50867dcb
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4307
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: jenkins
2014-09-26 11:26:05 -07:00
Alex Behm
5b4e8b79bf IMPALA-1118: Retain join predicates that reference a slot of an outer-joined tuple.
The bug: When gathering join predicates, we eliminate redundant ones based on
equivalence classes. However, predicates that reference an outer-joined tuple cannot
be safely removed even if the corresponding equivalence class is already covered by
another predicate because such predicates could imply 'slot IS NOT NULL' and removing
them would allow NULL tuples from outer joins to be incorrectly returned.

The fix: Retain otherwise redundant predicates if they reference a slot of an
outer-joined tuple to maintain that the output of a join satisfies 'slot IS NOT NULL'
(otherwise NULL tuples from preceding outer joins could incorrectly survive).
For each outer-joined slot we only need to retain a single predicate.
TODO: Consider better fixes for outer-joined slots: (1) Create IS NOT NULL
predicates and place them at the lowest possible plan node. (2) Convert outer
joins into inner joins (or full outer joins into left/right outer joins).
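
A simplified model of the retention rule, assuming each predicate arrives with a precomputed equivalence class id and the slots it references. The function and its input shape are hypothetical, not the planner's API. A predicate is kept if its class is not yet covered, or if it references an outer-joined slot that no retained predicate references yet.

```python
def select_join_predicates(predicates, outer_joined_slots):
    """predicates: iterable of (text, class_id, slots) triples."""
    covered_classes = set()
    protected_slots = set()   # outer-joined slots with a retained predicate
    kept = []
    for text, class_id, slots in predicates:
        outer = set(slots) & set(outer_joined_slots)
        needs_guard = bool(outer - protected_slots)
        if class_id not in covered_classes or needs_guard:
            kept.append(text)
            covered_classes.add(class_id)
            protected_slots |= outer
    return kept
```

With t2 outer-joined, `t2.id = t3.id` survives even though `t1.id = t3.id` already covers the same class, and any later predicate on `t2.id` is dropped because one guard per slot is enough.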

Change-Id: Ie4dc20c1db3f3822d4b60e5dfbc00c024a1d3db7
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4485
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-09-25 19:37:03 -07:00
Dimitris Tsirogiannis
30e7c450c5 IMPALA-1165: Predicate dropped: Inline view plus distinct aggregate in
outer query

This commit fixes a bug with marking propagated predicates as assigned
during slot materialization. When gathering predicates for the purpose
of slot materialization, no predicates should be marked as assigned.

Change-Id: I5277dc622990b1731db1ecb3ea646e7b72d4e3db
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4496
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: jenkins
2014-09-25 19:37:03 -07:00
Alex Behm
875417d817 Remove obsolete data compaction flag from FE.
Change-Id: I0413d87d07fc07c14dc6c415c503065fc4944613
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4499
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-09-25 19:37:03 -07:00
Marcel Kornacker
3edeef53d8 Fix for bug in AnalyticExpr.resetAnalysisState()
Fixes:
IMPALA-1256: Nested analytic: AnalysisException: select list expression not produced by aggregation output
IMPALA-1280: Crash running analytic with LEFT SEMI JOIN

Change-Id: I98b8f90de0079afad5b2d547abc27bcee57651f3
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4500
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-09-25 19:37:02 -07:00
Alex Behm
7d7127c25d IMPALA-1289: Fix predicate assignment for inverted joins.
Change-Id: I017c39f77df0264a7ee35dca28ee263aa55ca517
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4491
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
2014-09-25 19:37:02 -07:00
Skye Wanderman-Milne
f2b01997df Allow UDA intermediates to use CHAR. Update stddev/var to use it.
Change-Id: I791c6389978f4994cba33f01273e94343a163916
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4368
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-09-25 19:37:02 -07:00
Skye Wanderman-Milne
7f87e7e5b5 IMPALA-1111: Fix alignment in ReservoirSample aggregate functions
Change-Id: Iac7aa96eb19079715a7e8152a5edfeafa0d50bc7
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4478
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-09-25 19:37:02 -07:00
Alex Behm
88ae4c9080 Fix HBase region splitting for tests.
It appears that HBase sometimes ignores an admin.splitRegion() RPC,
which made our region splitting fail. As a workaround, this patch adds
another retry loop such that the split/wait sequence is attempted
multiple times.
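
The shape of the workaround, sketched with hypothetical callbacks rather than the HBase admin API: `split_region` issues the request and `region_count` reports how many regions currently exist. The outer loop re-issues the split if the wait loop never sees the expected count.

```python
import time


def split_with_retries(split_region, region_count, expected_regions,
                       max_attempts=5, polls_per_attempt=10,
                       poll_interval_s=1.0):
    """Return the number of attempts needed, or raise if none succeeded."""
    for attempt in range(1, max_attempts + 1):
        split_region()  # the request may be silently ignored
        for _ in range(polls_per_attempt):
            if region_count() >= expected_regions:
                return attempt
            time.sleep(poll_interval_s)
    raise RuntimeError(
        "region split did not complete after %d attempts" % max_attempts)
```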

Change-Id: I9aa8ab87bba79ea11b79c50f15328b8be844924d
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4557
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Alex Behm <alex.behm@cloudera.com>
2014-09-25 18:44:28 -07:00
Dimitris Tsirogiannis
f21aed16fd Bug fixes in null-aware anti-join
This commit fixes issue IMPALA-1215 where NOT IN subqueries return wrong
results in the presence of null values.

Change-Id: I97e41c8df8ba864d0189595d670b3f0349fcad36
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4467
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-09-23 07:33:23 -07:00
Dan Hecht
47a11578d4 IMPALA-1272: fix crash when compression codec is invalid for parquet
Defer resizing the columns_ vector until we are sure we will initialize it.
Downstream code doesn't expect any NULLs.

Change-Id: I250cceee5181428fcd3cd1a8b021edb7187ae888
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4465
Reviewed-by: Daniel Hecht <dhecht@cloudera.com>
Tested-by: jenkins
2014-09-23 07:33:13 -07:00
Matthew Jacobs
28fc8ddf60 IMPALA-1292: Incorrect result in analytic SUM when ORDER BY column is null
The 'less than' predicate created by AnalyticPlanner used to check if the
previous row was less than the current row is not exactly what we want
to determine when rows in RANGE windows (the default window in this case)
share the same result values. Rows get the same results when the order by
exprs evaluate equally or both null, so it's easiest (and more efficient)
to use a predicate that simply checks equality or both null. We already
create such predicates for checking for partition boundaries, so this is
a trivial change.

When we support arbitrary RANGE window offsets we will likely want to
add similar predicates that compare two tuples plus/minus the offset,
but those will be simpler because there can be only one order by expr
when specifying RANGE offsets with PRECEDING/FOLLOWING.
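
To make the "equal or both null" check concrete, here is a small sketch of a running SUM over the default RANGE window, assuming rows are dicts already sorted by the ORDER BY column and values are non-NULL. The helper names are hypothetical and this is not the AnalyticPlanner code. Rows whose ORDER BY values are equal, or both NULL, are peers and receive the same result.

```python
def same_or_both_null(a, b):
    if a is None or b is None:
        return a is None and b is None
    return a == b


def running_sum(rows, order_col, value_col):
    """SUM with RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW."""
    out = [None] * len(rows)
    total = 0
    group_start = 0
    for i, row in enumerate(rows):
        total += row[value_col]
        is_last_peer = (i + 1 == len(rows) or not same_or_both_null(
            row[order_col], rows[i + 1][order_col]))
        if is_last_peer:
            # Every row in the peer group sees the total through its last row.
            for j in range(group_start, i + 1):
                out[j] = total
            group_start = i + 1
    return out
```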

Change-Id: I52ff6203686832852430e498eca6ad2cc2daee98
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4474
Tested-by: jenkins
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Matthew Jacobs <mj@cloudera.com>
2014-09-23 07:32:43 -07:00
Matthew Jacobs
08a5204594 Analytic Fns: BE support for range unbounded on both sides and range offsets fail analysis
1) Adds BE support for RANGE windows between UNBOUNDED PRECEDING to
   UNBOUNDED FOLLOWING.
2) RANGE windows with offset boundaries fail analysis because they're
   not supported by the BE yet.

Change-Id: I734575eb87c909d09d24c4df028023f3b50d3cb5
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4442
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Matthew Jacobs <mj@cloudera.com>
2014-09-23 07:32:21 -07:00
Marcel Kornacker
0b3124ab35 Analytic plan optimization: taking advantage of the hash partitioning of the preceding aggregation.
- determine the partition group that has maximal intersection of its partition exprs with the
  preceding grouping exprs
- if that intersection's expected ndv > #nodes, make that partition group the first one in the sequence
  to be computed and reduce the hash partition of the preceding aggregation to that intersection
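
The selection step above might look roughly like this. It is a sketch under stated assumptions: expressions are plain strings, `ndv` maps each expression to an estimated distinct-value count, and the expected NDV of the intersection is approximated as the product of those counts, which is an assumption of this example rather than something the commit states.

```python
def pick_first_partition_group(partition_groups, grouping_exprs, ndv,
                               num_nodes):
    """Return (group_index, shared_exprs), or (None, None) if not worthwhile."""
    grouping = set(grouping_exprs)
    best_idx, best_common = None, set()
    for idx, group in enumerate(partition_groups):
        common = set(group) & grouping
        if len(common) > len(best_common):
            best_idx, best_common = idx, common
    if best_idx is None:
        return None, None
    expected_ndv = 1
    for expr in best_common:
        expected_ndv *= ndv[expr]
    if expected_ndv > num_nodes:
        # Compute this group first and hash-partition the preceding
        # aggregation on the shared expressions only.
        return best_idx, sorted(best_common)
    return None, None
```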

Change-Id: I612b4a260a8975deb495e5d34c32f03db4a7cca7
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4451
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
2014-09-23 07:32:04 -07:00
Victor Bittorf
9939c9d009 Bugfix and tests for CHAR(N) and VARCHAR(N)
Fixed a bug when setting the length in reading/writing text files for CHAR(N).
Also added chars_tiny table for testing CHAR(N) and VARCHAR(N).

Change-Id: If5d5db30afa4b00cf03c68c6a845f182970329f4
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4415
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-09-23 07:30:07 -07:00
Matthew Jacobs
8a75e759cb Move analytic fns test case for decimal to decimal.test
Change-Id: Ic6e02484f47f9a9c47924850c8cf12daf8574c8c
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4449
Reviewed-by: Ippokratis Pandis <ipandis@cloudera.com>
Tested-by: jenkins
2014-09-23 07:26:32 -07:00
Matthew Jacobs
57addd34ac Analysis error for min()/max() w/ analytic windows without UNBOUNDED PRECEDING
min()/max() do not currently support windows without UNBOUNDED PRECEDING,
so this changes AnalyticExpr to detect this during analysis and throw an
AnalysisException.

Also removed some stale TODOs in the BE

Change-Id: I734b0a5d5399f9bb9d4db6ab1ddc079237b0ac03
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4431
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-09-23 07:26:21 -07:00
Matthew Jacobs
da5198e615 Add spilling test for an analytic fn
Change-Id: Ia93c71c9c2a01f7f04a81593d51f5ca565286b7d
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4447
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-09-23 07:26:09 -07:00
Alex Behm
8345494fb1 IMPALA-1249: Anti joins have a uni-directional value transfer.
Like left/right outer joins, anti joins have a uni-directional value transfer.
Predicates could be pushed into anti joined plan subtrees if the condition
was inverted, but this patch does not implement this optimization.

No special consideration must be made to prevent predicate assignment
into anti-joined branches because anti-joined tuples are invisible outside
of the On-clause, and therefore, all unassigned conjuncts referencing the
invisible tuple must come from the original join's On-clause. The assignment
of such predicates is already handled correctly.

Change-Id: Ic2b94f6eb57e000ea51e253035e713288b205298
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4425
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-09-23 07:25:51 -07:00
Alex Behm
0791eb2ee5 IMPALA-1281: Restrict re-ordering of cross joins.
This patch restricts the leftmost table ref candidates of cross joins to the
very first join (like we already do for outer/semi joins). Join inversion
is still considered for cross joins.

While conceptually possible, it is tricky to reason about allowing the rhs of
arbitrary cross-join table refs as the leftmost candidate during join
re-ordering. We would have to carefully change the joinOps of all table refs
in between, and ensure to not make those changes in place to avoid "polluting"
the table refs for the next round of join re-ordering (considering a new
leftmost table ref). The safer fix is to restrict the considered orders.

Change-Id: I5fdc323e4a9c2dada06d9aec81769057f7076299
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4438
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-09-23 07:25:37 -07:00