17 Commits

Author SHA1 Message Date
Dimitris Tsirogiannis
c88d179413 IMPALA-1636: Generalize index-based partition pruning to allow constant
expressions

This commit enables fast partition pruning for cases where constant
expressions appear in binary or IN predicates. During partition pruning,
the constant expressions are evaluated in the BE and are replaced by the
computed results as LiteralExprs.

Change-Id: Ie8a2accf260391117559dc6c0a565f907c516478
Reviewed-on: http://gerrit.cloudera.org:8080/144
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: Internal Jenkins
2015-03-07 09:51:27 +00:00
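
The pruning idea described in c88d179413 can be illustrated with a small sketch. This is not Impala's code: the partition index, the function names (`fold_constant`, `prune_eq`, `prune_in`) and the use of Python's `ast` module as the constant evaluator are all assumptions made for illustration. A constant expression is first folded to a literal, and the literal is then looked up in an index keyed by partition value.

```python
import ast
import operator

# Hypothetical partition index: partition-key value -> partition ids.
PARTITION_INDEX = {2013: ["p0"], 2014: ["p1", "p2"], 2015: ["p3"]}

# Arithmetic operators the toy evaluator understands.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def fold_constant(expr: str) -> int:
    """Evaluate a constant integer expression and return the literal value."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("not a constant expression")
    return ev(ast.parse(expr, mode="eval").body)


def prune_eq(const_expr: str) -> list:
    """Partitions matching `key = <const_expr>` after folding the constant."""
    return PARTITION_INDEX.get(fold_constant(const_expr), [])


def prune_in(const_exprs: list) -> list:
    """Partitions matching `key IN (<const_exprs>)` after folding each one."""
    matched = []
    for expr in const_exprs:
        matched.extend(PARTITION_INDEX.get(fold_constant(expr), []))
    return matched
```

For example, `prune_eq("2010 + 4")` folds the expression to 2014 and returns only the partitions registered under that key, without scanning the others.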
Alex Behm
c0f2e043b4 Fix exhaustive test runs: Preserve types when substituting root output exprs.
A recent change (3ccee71) to fix resetAnalysisState() of NullLiterals
exposed another bug during exhaustive test runs.
For insert queries into Parquet, the types in the schema of the generated
Parquet files are based on the insert exprs, correctly assuming that
the FE handles all the necessary casting to make sure the Parquet file
schema and the table schema match.
Since we apply an smap on the output exprs towards the end of planning,
NullLiterals were reset to the NULL_TYPE, causing the Parquet schema
to incorrectly have BOOLEAN columns (we cast naked NULL_LITERALS to
BOOLEAN in toThrift()), leading to a mismatch of the Parquet schema
and the table schema. Subsequent queries on such a table failed,
correctly reporting a type mismatch.

The fix is to preserve types when doing the substitution on the output exprs.

Change-Id: I135f1b826b06a6a200df7b73343d2eb1fb4b7b80
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5453
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5455
2014-11-30 01:08:08 -08:00
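
A toy model of the problem fixed in c0f2e043b4, offered only as a sketch: the `Expr` class, the `substitute` function and its `preserve_types` flag are hypothetical names, not the planner's API. It shows how substituting an output expr with an untyped NULL literal changes the output type unless the original type is kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    name: str
    type: str


def substitute(exprs, smap, preserve_types):
    """Apply a substitution map (expr name -> replacement expr) to output exprs.

    With preserve_types=True, a replacement whose type differs from the
    original is wrapped in a cast back to the original type.
    """
    result = []
    for e in exprs:
        repl = smap.get(e.name, e)
        if preserve_types and repl.type != e.type:
            repl = Expr(f"CAST({repl.name} AS {e.type})", e.type)
        result.append(repl)
    return result


# Hypothetical insert exprs and a substitution that introduces an untyped NULL.
output_exprs = [Expr("a", "INT"), Expr("b", "STRING")]
smap = {"b": Expr("NULL", "NULL_TYPE")}

lossy = [e.type for e in substitute(output_exprs, smap, preserve_types=False)]
kept = [e.type for e in substitute(output_exprs, smap, preserve_types=True)]
```

In this model `lossy` ends with `NULL_TYPE`, which is the situation the commit message describes as leading to a wrong file schema, while `kept` still matches the declared column types.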
Nong Li
e2d7fb6402 Some test case cleanup.
Change-Id: Ic29b7c1f5fd714a1e2cc41bf0e55c0d11c782862
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4791
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5090
Reviewed-by: Nong Li <nong@cloudera.com>
2014-11-03 22:33:08 -08:00
Victor Bittorf
7b244d34b6 IMPALA-1344: Fixed analytic aggregations with CHAR
The fix is to register aggregates only for STRING, not for CHAR or VARCHAR. The CHAR
and VARCHAR types are implicitly cast to STRING for aggregation.

Also, fixed aggregate fn builtins that should not ignore distinct.

Change-Id: If4c1a2c6127360c2c8127a5c02949df74fafc85a
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4717
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:16:50 -07:00
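
The registration scheme described in 7b244d34b6 can be sketched as a lookup that maps CHAR(N) and VARCHAR(N) arguments onto the STRING aggregate. The registry contents and the name `resolve_aggregate` are assumptions for illustration and do not reflect the actual function catalog.

```python
import re

# Hypothetical registry: aggregates are registered only for STRING arguments.
AGGREGATES = {("min", "STRING"): min, ("max", "STRING"): max}

# Matches parameterized types such as CHAR(5) or VARCHAR(32).
_CHAR_LIKE = re.compile(r"^(CHAR|VARCHAR)\(\d+\)$")


def resolve_aggregate(fn_name, arg_type):
    """Return (function, resolved argument type, whether an implicit cast is needed)."""
    cast_needed = bool(_CHAR_LIKE.match(arg_type))
    lookup_type = "STRING" if cast_needed else arg_type
    fn = AGGREGATES.get((fn_name, lookup_type))
    if fn is None:
        raise LookupError(f"no aggregate {fn_name}({arg_type})")
    return fn, lookup_type, cast_needed
```

Calling `resolve_aggregate("min", "VARCHAR(10)")` resolves to the STRING version of `min` and reports that an implicit cast is required, while a STRING argument resolves directly.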
Victor Bittorf
a62500ee28 Changed CHAR & VARCHAR max length to match Hive.
Also modified the text of the analysis exception for lengths that are too long or
short because John said they were unclear.

Change-Id: I9427d5c39298aa8207672e50e10fe527c5076599
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4698
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:16:45 -07:00
Victor Bittorf
c29ed3761e IMPALA-1339: NULLs incorrectly hashed in groupby
Problem: hash table assumed all raw values were at most 16 bytes. This maximum was
increased to support up to 128 bytes for CHARs.

Change-Id: I107c58b9a013d5db46ff5586bcdceee3961346e9
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4701
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:16:36 -07:00
Victor Bittorf
d5fd59e2ed IMPALA-1337: Aggregation failures for VARCHAR
The issue is that the aggregation node needed to use IsVarLen; previously
it assumed TYPE_STRING was the only variable length type.

Change-Id: I9545e8d405937a47b25c9042f97854851a448c6e
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4690
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:14:51 -07:00
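
The difference between checking for one specific type and asking whether a type is variable length (d5fd59e2ed) can be sketched as follows. The `ColType` enum, `is_var_len` and `var_len_bytes` are illustrative names; which types count as variable length here (STRING and VARCHAR) is an assumption for the sketch, and CHAR is deliberately left out.

```python
from enum import Enum


class ColType(Enum):
    INT = "INT"
    STRING = "STRING"
    VARCHAR = "VARCHAR"


# Assumption for this sketch: STRING and VARCHAR are the variable-length types.
VAR_LEN_TYPES = frozenset({ColType.STRING, ColType.VARCHAR})


def is_var_len(col_type):
    """True if values of this type are stored outside the fixed-size row."""
    return col_type in VAR_LEN_TYPES


def var_len_bytes(row, schema):
    """Total bytes of variable-length data in a row, skipping NULLs."""
    return sum(
        len(value.encode("utf-8"))
        for value, col_type in zip(row, schema)
        if value is not None and is_var_len(col_type)
    )


schema = [ColType.INT, ColType.STRING, ColType.VARCHAR]
row = [1, "ab", "cde"]
total = var_len_bytes(row, schema)
```

A check written as `col_type == ColType.STRING` would count only the "ab" value in this row and miss the VARCHAR data.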
Victor Bittorf
f4626b03e6 IMPALA-1322: Fix related issue
There is an issue related to IMPALA-1322. The expression list was being improperly
indexed when laying out memory.

Change-Id: I2eef84a812b451d87ecb8afd304e765aff1f5a6b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4675
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:14:44 -07:00
Victor Bittorf
794e70b0bd Fix CHAR/VARCHAR Aggregation
This fixes an issue where VARCHAR and CHAR could error in some aggregations.
The cause of the problem is that the BE currently does not support CHAR/VARCHAR as
arguments to aggregates; they require an implicit cast to STRING first.
The resolution is to have these operators return STRING instead of CHAR(*) or VARCHAR(*).
Note that the CHAR(*) comparisons still ignore spaces for min/max.

This takes advantage of the fact that STRING, VARCHAR(*), and CHAR(*) values are all
handled as a StringVal for exprs. The STRING aggregates are registered as CHAR(*) and
VARCHAR(*) aggregates and the front end converts the return type to a STRING in all cases.

Also includes a fix for a TODO about casting between CHAR and VARCHAR.

Change-Id: I1d3a9cc48e426286ce63677324a8c680e67b005a
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4573
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:13:17 -07:00
Victor Bittorf
fa502f973a IMPALA-1319: Fixed CHAR padding for numeric casts
IMPALA-1322: Crash on VARCHAR/CHAR join

Fixed 3 issues:
  (1) Disabled codegen for CHAR in hash join equality
  (2) fixed memory layout for CHAR
  (3) Fixed a regression where space padding could be dropped for numeric casts.

Change-Id: I6475fd527ca0d67c7d4d5ec7e561549e43fbc336
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4640
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:12:44 -07:00
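
As a rough illustration of the padding behavior mentioned in fa502f973a, the sketch below assumes the usual SQL semantics for CHAR(N): values are right-padded with spaces to exactly N characters and longer values are truncated. The helper name `cast_to_char` is hypothetical and this is not the backend implementation.

```python
def cast_to_char(value, n):
    """Cast a value to a fixed-width CHAR(n) string.

    Assumes standard CHAR semantics: truncate to n characters, then
    right-pad with spaces so the result is always exactly n characters.
    """
    text = str(value)
    return text[:n].ljust(n)


padded = cast_to_char(123, 5)
truncated = cast_to_char(1234567, 3)
```

Here `padded` keeps its two trailing spaces; the regression described in the commit is the case where such padding is dropped for numeric inputs.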
Victor Bittorf
658f05f63c IMPALA-1316: crash on VARCHAR join
Fixed codegen issue causing some VARCHAR joins to crash.

Change-Id: Ib2674199a3b2c3c5a5fd63cfae0b64e3b1ca158b
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4616
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:11:10 -07:00
Victor Bittorf
afbc2c28a3 Char Partition Fix
Fixed bug with CHAR and VARCHAR partition columns. Also, disables CHAR and VARCHAR for UDAs
and UDFs.

Change-Id: I67ccd746cb4c063f8a7a984df9564fa9122fdf43
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4493
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-09-26 12:02:54 -07:00
Victor Bittorf
9939c9d009 Bugfix and tests for CHAR(N) and VARCHAR(N)
Fixed a bug when setting the length in reading/writing text files for CHAR(N).
Also added chars_tiny table for testing CHAR(N) and VARCHAR(N).

Change-Id: If5d5db30afa4b00cf03c68c6a845f182970329f4
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4415
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-09-23 07:30:07 -07:00
Victor Bittorf
6289121261 CHAR(N) Followup Patch
This patch addresses:
  1. Char doesn't use codegen
  2. Not in-lining large CHAR(N) for N > 128
  3. Parquet reader/writer for CHAR(N) and VARCHAR(N)

Change-Id: I83a29a8bd312841a3e29bfe2243884074570f247
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4280
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-09-20 16:12:03 -07:00
Victor Bittorf
a1892a17d5 IMPALA-1248: Fixed CHAR(N) in VALUES clause.
Queries like:
INSERT INTO table VALUES (CAST("..." AS CHAR(N)))
used the codegen path and failed; changed to use the interpreted path.

Change-Id: Id80274580df268b3f828dec19a2e0b0578061ca8
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4362
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-09-20 16:07:16 -07:00
Victor Bittorf
8bebf2b196 CHAR: adding support for CHAR(N)
Support for CHAR is implemented as a StringVal in the backend.

TODO:
  1. Parquet Reader/writer
  2. Codegen slot ref
  3. Codegen text reader
  4. Don't inline large chars
  5. update impala-hs2-server.cc with CHAR support

Change-Id: Ibba2c89cea971cb740001ea7975bf3e929150471
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4075
Reviewed-by: Nong Li <nong@cloudera.com>
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-09-13 00:19:20 -07:00
Victor Bittorf
2dce31f6c2 Adding VARCHAR front & backend.
VARCHAR is treated as StringVal in the backend. All UDAs and UDFs which accept STRING
will also accept VARCHAR(N).

TODO: Reverted Avro codegen to fix Jenkins; needs separate patch.

Change-Id: Ifc120b6f0fe1f996b11a48b134d339ad3719331e
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/2527
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 3fcbf4f677b8e26c37eded4d8bb628e6fc53c1e9)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4058
2014-08-27 13:52:58 -07:00