1393 Commits

Author SHA1 Message Date
Dan Hecht
1fee56cb26 IMPALA-1080: Implement "SET <query_option>" as SQL statement.
Also add support for "SET", which returns a table of query options and
their respective values.

The front-end parses the option into a (key, value) pair and then the
existing backend logic is used to set the option, or return the result
sets.

Change-Id: I40dbd98537e2a73bdd5b27d8b2575a2fe6f8295b
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3582
Reviewed-by: Daniel Hecht <dhecht@cloudera.com>
Tested-by: jenkins
(cherry picked from commit aa0f6a2fc1d3fe21f22cc7bc56887e1fdb02250b)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3614
2014-07-25 10:25:09 -07:00
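Illustration for 1fee56cb26 (not part of the commit): a minimal sketch of how a front end might split a SET statement into a (key, value) pair and hand it to existing option-setting logic. The grammar, the function names `parse_set` and `execute_set`, and the upper-casing of keys are assumptions made for this example; the commit does not describe Impala's actual parser.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical grammar: a bare "SET", or "SET <key>=<value>", optional ';'.
_SET_RE = re.compile(r"^\s*SET(?:\s+(\w+)\s*=\s*(.*?))?\s*;?\s*$", re.IGNORECASE)


def parse_set(stmt: str) -> Optional[Tuple[str, str]]:
    """Return (key, value) for 'SET key=value', or None for a bare 'SET'."""
    m = _SET_RE.match(stmt)
    if m is None:
        raise ValueError(f"not a SET statement: {stmt!r}")
    key, value = m.group(1), m.group(2)
    if key is None:
        return None
    # Assumption: option names are case-insensitive, so normalize them.
    return key.upper(), value.strip()


def execute_set(stmt: str, options: Dict[str, str]) -> List[Tuple[str, str]]:
    """Set one option, or return all options as (name, value) rows."""
    parsed = parse_set(stmt)
    if parsed is None:
        return sorted(options.items())  # bare SET: table of options
    key, value = parsed
    options[key] = value
    return []
```

Keeping the parse step separate from the option store mirrors the split the commit message describes: the front end only produces the pair, and the existing logic applies it.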
Matthew Jacobs
b83aa4984b Add compute histograms aggregate function
Adds an aggregate function to compute equi-depth histograms. The UDA
creates a sample of the column values using weighted reservoir sampling
and computes the histogram from the sorted sample.

TODO:
* Extract highly frequent values into separate buckets (i.e. 'compressed
  histogram').
* Expose separate finalize fn to produce samples and histogram data for stats

Change-Id: I314ce5fb8c73b935c4d61ea5bbd6816c59b3b41e
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3552
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit c5c475712f88244e15160befaf4e99d6e165a148)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3608
2014-07-25 00:21:10 -07:00
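Illustration for b83aa4984b (not part of the commit): a rough sketch of the two steps the message names, sampling with a weighted reservoir and then building an equi-depth histogram from the sorted sample. The commit does not say which weighted reservoir algorithm the UDA uses; this sketch assumes the Efraimidis-Spirakis "A-Res" scheme, and all function names are hypothetical.

```python
import heapq
import random


def weighted_reservoir_sample(items, k, rng):
    """Assumed A-Res scheme: keep the k items with the largest u ** (1 / w).

    `items` yields (value, weight) pairs; weights must be positive.
    """
    heap = []  # min-heap of (key, seq, value); seq breaks ties between keys
    for seq, (value, weight) in enumerate(items):
        if weight <= 0:
            continue
        key = rng.random() ** (1.0 / weight)
        if len(heap) < k:
            heapq.heappush(heap, (key, seq, value))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, seq, value))
    return [value for _, _, value in heap]


def equi_depth_histogram(sample, num_buckets):
    """Return each bucket's upper bound; buckets hold about len/num_buckets values."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0 or num_buckets <= 0:
        return []
    num_buckets = min(num_buckets, n)
    return [ordered[((i + 1) * n) // num_buckets - 1] for i in range(num_buckets)]
```

Because bucket bounds are taken at equal ranks in the sorted sample, a heavily repeated value can span several buckets, which is the case the first TODO item (a "compressed histogram") is meant to address.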
Alex Behm
c001be18d4 IMPALA-1103: Fix cancellation check in FetchInternal() to use the query status.
We recently changed user-initiated cancellation to not set the query state
to EXCEPTION. In FetchInternal() we relied on the previous behavior for
detecting cancellations/errors after BlockOnWait().
This patch fixes the cancellation/error check to use the query status
instead of the query state.

Change-Id: I48b4834e77b6e692fb6722637fb9fd5d8c8d9d97
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3597
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3600
2014-07-24 10:57:29 -07:00
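Illustration for c001be18d4 and the related 202d656ddc (not part of either commit): a toy model of why a check on the query state misses a user-initiated cancel once cancellation stops setting EXCEPTION, while a check on the query status still catches it. The class, its fields, and the rule used to tell a user cancel from an error are assumptions for this sketch only.

```python
from dataclasses import dataclass
from enum import Enum


class QueryState(Enum):
    RUNNING = 1
    FINISHED = 2
    EXCEPTION = 3


@dataclass
class Status:
    ok: bool = True
    msg: str = ""


class CancellableQuery:
    """Hypothetical stand-in for a query's execution state."""

    def __init__(self):
        self.state = QueryState.RUNNING
        self.status = Status()

    def cancel(self, cause=None):
        if cause is not None:
            # An error triggered the cancel: record it and flag EXCEPTION.
            self.status = Status(False, cause)
            self.state = QueryState.EXCEPTION
        elif self.state == QueryState.RUNNING:
            # User-initiated cancel: the status changes, the state does not.
            self.status = Status(False, "Cancelled")

    def fetch_allowed_by_state(self):
        # Old-style check: only notices EXCEPTION.
        return self.state != QueryState.EXCEPTION

    def fetch_allowed_by_status(self):
        # Fixed check: any non-OK status stops the fetch.
        return self.status.ok
```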
Skye Wanderman-Milne
f9bad0530a Cache codegen'd expr functions
Change-Id: Ie0d5ab2a21cc7b0f3c7f7d239f1129f2bc18ba9e
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3475
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit c030b678425b83c42e074d45d4a245adccb6e0ae)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3482
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-07-24 01:56:41 -07:00
Nong Li
045d69a6c6 Add per client mem tracking to buffered block mgr.
This also means clients of the block mgr need to delete all blocks in close.
This is less important for sorting since it's typically at the end but will
be useful very soon.

Change-Id: Ia4ee188ad845540039ede5fe410a6048abe2bf5a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3540
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3588
2014-07-22 20:23:37 -07:00
Nong Li
629d351ae1 Augment internal queue interface. Update BufferedBlockMgr to use it more.
Change-Id: I662fde6165726767787b722f5b74d10f94fe158c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3543
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3585
Reviewed-by: Nong Li <nong@cloudera.com>
2014-07-22 14:55:28 -07:00
Nong Li
7dc57aaa9e Change buffered block mgr to support multiple clients.
This patch does a few things:
1. Moves the buffered block mgr from the sorter to the runtime state. It is now
   shared across the query fragment. The partitioned hash join and agg
   will use it as well.
2. Adds a Client interface to the block mgr. Each exec node is a different client
   and can reserve a minimum number of buffers. This avoids starvation.
3. Updates the BufferedBlockMgr interface for getting pinned blocks to collapse
   two existing APIs.

Change-Id: Ibb31fbe480f3726048457f26e24a9e33f7201d86
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3504
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3574
2014-07-22 12:45:37 -07:00
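Illustration for 7dc57aaa9e (not part of the commit): a small sketch of how per-client minimum reservations in a shared buffer pool can prevent starvation. The class name, method names, and accounting rule are assumptions; the real BufferedBlockMgr interface is not shown in the commit message.

```python
class BufferPool:
    """Hypothetical shared pool; each client reserves a minimum buffer count."""

    def __init__(self, total_buffers):
        self.total = total_buffers
        self.clients = {}  # name -> [reserved, pinned]

    def _in_use(self):
        return sum(pinned for _, pinned in self.clients.values())

    def _held_back(self):
        # Reserved buffers that their owners have not pinned yet.
        return sum(max(r - p, 0) for r, p in self.clients.values())

    def register_client(self, name, min_reserved):
        available = self.total - self._in_use() - self._held_back()
        if min_reserved > available:
            raise ValueError("cannot satisfy reservation")
        self.clients[name] = [min_reserved, 0]

    def try_pin(self, name):
        reserved, pinned = self.clients[name]
        if pinned < reserved:
            # Within the reservation: always succeeds.
            self.clients[name][1] += 1
            return True
        # Beyond the reservation: only take buffers nobody has reserved.
        if self._in_use() + self._held_back() < self.total:
            self.clients[name][1] += 1
            return True
        return False

    def unpin(self, name):
        if self.clients[name][1] == 0:
            raise ValueError("nothing pinned")
        self.clients[name][1] -= 1
```

With 4 buffers, a sorter reserving 1 and a join reserving 2, the sorter can pin at most 2 buffers while the join's reservation is unused, so the join can always pin its 2.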
Nong Li
1d5b9440d7 Speed up codegen compile time by moving unnecessary files out of cross compiled module.
This gets us about a 10x speedup (700ms to 70ms) and back to where we were before.

Change-Id: I76d9f73b0b74ba7f45e3590e22d6541c560e9a58
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3570
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3580
2014-07-22 11:26:43 -07:00
Nong Li
202d656ddc Stop setting query state to EXCEPTION for non-exception cases.
We were setting the state to exception on Cancel() all the time.
We use the cancellation path as the normal cleanup path so this
gets called even when the query went fine (e.g. UnregisterQuery
calls Cancel()). We had already plumbed through a 'cause' argument
to differentiate.

Change-Id: Icf1091c165dec36d3dad7ce308367bbbc9edee4f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3524
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3575
2014-07-22 04:08:28 -07:00
Nong Li
dacdee6317 Fix long expr-test time.
This disables optimizations while running expr-test based on an env var. Most of
our jenkins jobs will run with it disabled.

Change-Id: I680734a354e3ef4899cc626efed643ba2c9b5051
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3545
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3576
2014-07-22 03:46:31 -07:00
Nong Li
9a2f7d3bbe Add fragment start up query timeline.
Change-Id: Icf015904d91f8e3a043c39b50a6c9eb1e1576c20
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3519
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3573
2014-07-22 02:54:51 -07:00
Paden Tomasello
d8e76cc43e Fixed warning 'passing NULL to non-pointer argument' in expr.cc
Change-Id: I836c873281cf415b2d952a4b46d6eb3ac5a12bdd
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3523
Reviewed-by: Paden Tomasello <paden.tomasello@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3569
2014-07-21 23:29:46 -07:00
Alex Behm
e9864d5f78 Introduce type hierarchy and add complex types.
This patch replaces ColumnType with a hierarchy of types that models
the existing scalar types as well as the new complex types ARRAY, MAP,
and STRUCT.

Change-Id: Ia895f41153e99febb0c35412acac12689c3c2064
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3491
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3538
2014-07-21 20:00:46 -07:00
Paden Tomasello
879a40913c Implemented UDFs for timestamp functions.
FromUtc and ToUtc use thirdparty libraries which use inline asm which
isn't currently supported with JIT. The UDFs are included in this
commit, but the function symbols were not changed in
impala_functions.py

Change-Id: I0824a434d4a26a39abf29bc6e47d51b5ad7991d6
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3390
Reviewed-by: Paden Tomasello <paden.tomasello@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 8e149ccd78010b7a22d6fff1b0de5614848b02ac)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3548
2014-07-21 15:27:46 -07:00
Paden Tomasello
3d173e65d2 Adding Codegen function and tests for CASE expressions.
Change-Id: Ib52b3e3f12b35e2c0a60ef94501c20ef83abdfe5
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3187
Reviewed-by: Paden Tomasello <paden.tomasello@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3498
2014-07-18 12:03:58 -07:00
Mike Yoder
798dcd3a3c Adding warning messages to insecure LDAP configurations and added --ldap_passwords_in_clear_ok
Change-Id: Id7c7006269c11b4cd7aea51789b7af9aeffea2c3
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3501
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
(cherry picked from commit 4a86031ee960fe1996eaab1344b46cab5d61f02e)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3528
Tested-by: jenkins
2014-07-16 22:38:35 -07:00
Nong Li
0e6b8ecfcd More logging in free pool/function context.
Change-Id: I4264212359ba46e31cf42a7e4f531a34ca2e07df
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3288
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-07-15 16:57:09 -07:00
Nong Li
207e3f8b95 Change how we do right/full outer joins to maintain a bit in hash table.
We used to maintain a separate hash table (in the form of a boost
unordered set) to keep track of the build rows that have been matched.
This patch changes it by just keeping a bit in the hash table. It is not
possible to use boost::unordered_set for tables that are large.

Change-Id: Ie36e609bf79e5e7e403417a3c02a0817d37acc60
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3478
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-07-15 16:57:09 -07:00
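Illustration for 207e3f8b95 (not part of the commit): a sketch of a right outer hash join that tracks matched build rows with a flag stored next to each row in the hash table, rather than in a separate set. The function name and row layout are hypothetical, and the build side is assumed to be the right input.

```python
from collections import defaultdict


def right_outer_join(probe_rows, build_rows, key):
    """Return (probe, build) pairs; unmatched build rows pair with None."""
    table = defaultdict(list)
    for row in build_rows:
        table[key(row)].append([row, False])  # [row, matched bit]

    out = []
    for p in probe_rows:
        # .get() avoids creating empty buckets for probe-only keys.
        for entry in table.get(key(p), ()):
            entry[1] = True  # flip the bit in place, no side table needed
            out.append((p, entry[0]))

    # Final pass: emit build rows whose bit was never set.
    for entries in table.values():
        for row, matched in entries:
            if not matched:
                out.append((None, row))
    return out
```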
Nong Li
188a0ea833 Rework structure of hash table.
This patch does two things in preparation for external joins. The
hash table used to contain a directory structure (buckets and nodes)
both of which were contiguous. The nodes contained the tuple ptrs
within it.

This patch changes it so the nodes are not stored contiguously but
allocated in pages. (this structure is dense and does not require
random lookups by index). The bucket structure is still contiguous
since we rely on the doubling property and random lookup by index.

The second change is that the nodes no longer store the tuple ptrs
within them. This makes it easier to build the hash table on top of
existing data.

Here's a quick benchmark doing a self join on tpch lineitem. Both
build and probe times decreased a bit.

Before:
 HASH_JOIN_NODE (id=2):(Total: 1s139ms, non-child: 985.939ms, % non-child: 86.50%)
         - BuildBuckets: 2.10M (2097152)
         - BuildRows: 6.00M (6001215)
         - BuildTime: 527.991ms
         - LeftChildRows: 6.00M (6001215)
         - LeftChildTime: 451.964ms
         - LoadFactor: 0.50
         - RowsReturned: 30.01M (30012985)
         - RowsReturnedRate: 26.33 M/sec
After:
HASH_JOIN_NODE (id=2):(Total: 1s019ms, non-child: 835.350ms, % non-child: 81.97%)
         - BuildBuckets: 2.10M (2097152)
         - BuildRows: 6.00M (6001215)
         - BuildTime: 423.175ms
         - LeftChildRows: 6.00M (6001215)
         - LeftChildTime: 406.67ms
         - LoadFactor: 0.50
         - RowsReturned: 30.01M (30012985)
         - RowsReturnedRate: 29.45 M/sec

Change-Id: I79e209a24c24fb4f2f99574bcf187746fddadc06
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3245
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-07-15 16:57:09 -07:00
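Illustration for 188a0ea833 (not part of the commit): the "decreased a bit" claim can be quantified from the profile counters quoted in the message. The helper name `pct_change` is made up for this example; the input numbers are copied from the Before/After profiles.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Values taken from the Before/After profiles in the commit message.
build_time = pct_change(527.991, 423.175)   # BuildTime in ms, about -19.9%
total_time = pct_change(1139.0, 1019.0)     # Total in ms, about -10.5%
rows_rate = pct_change(26.33, 29.45)        # RowsReturnedRate in M/sec, about +11.8%
```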
Nong Li
6ca2eb4944 Fix reading past the end of probe tuple.
Change-Id: I5c1a53e3bdc95e42257d614b1dff1f6e81a04003
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3465
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
2014-07-15 16:57:09 -07:00
Nong Li
1ce1c47184 Don't propagate parent tuple ids to child nodes.
I'm not sure when we added this but it does not have any benefit. The join nodes
combine the tuple*'s from the LHS and RHS anyway and the extra Tuple* reserved in
the LHS row batch is never written to or read.

Change-Id: I40f88f417161ef72185e995b6c5b8f56f31fbfc4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3438
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-07-15 16:57:09 -07:00
Nong Li
f5d4280045 Fix/suppress some compiler warnings.
Change-Id: I5ee900a062b30404e6a0b88fe373fba06d92699e
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3447
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-07-15 16:57:09 -07:00
Henry Robinson
79d64ad7ba Don't log HS2 passwords, even though they shouldn't be set
Change-Id: Ibf275bbf595c043452f05485fdb28f2800b0747a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3484
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 4a6283097ea925bef357bbfca7a0d6f87ceb0a9a)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3486
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
(cherry picked from commit e801bd8c0d134e783c2313c7dd422a5ad06591af)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3487
(cherry picked from commit 08fa3466dd8914356494919534641842ff3953e0)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3488
2014-07-11 23:06:36 -07:00
Alex Behm
ebc70d921d Fixes for sporadic build failure in compute stats cancellation.
The root cause of the problem was that columns of a Table were not
added to the colsByName_ map with lower case keys on the Table.load() path
that is only exercised by the catalog server (the Impalads "load" tables
via Table.loadFromThrift() which did the right thing).

The above led to an empty column stats object being sent to the HMS
after an otherwise successful compute stats.

The problem was sporadic for the following reasons:
1. Only certain file formats like avro/snap/block have uppercase
   column names in the HMS because the table was created by Hive
2. Some of our tests executed via run-tests.py, notably the
   cancellation tests, aren't deterministic in which test vectors
   are executed in a particular run. As a result, we only see the
   cancellation test run compute stats on an avro/snap/block
   once in a while (this behavior is unaffected by this patch).

This patch includes other minor bugfixes and simplifications
related to compute stats.

Change-Id: I7cb5fe69404e35133eda314d9f7d072c78416ff1
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3468
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3479
2014-07-10 19:09:08 -07:00
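Illustration for ebc70d921d (not part of the commit): a sketch of the root cause, a name-to-column map that must be keyed consistently in lower case on every load path. The Python class and method names are hypothetical stand-ins for the Java `Table` and its `colsByName_` map.

```python
class Table:
    """Hypothetical table with a case-insensitive column lookup."""

    def __init__(self):
        self.cols_by_name = {}

    def add_column(self, name, col_type):
        # Always key by lower case, whatever case the metastore returned.
        self.cols_by_name[name.lower()] = (name, col_type)

    def get_column(self, name):
        return self.cols_by_name.get(name.lower())
```

If one load path skipped the `lower()` call, a lookup for a column that Hive stored with upper-case letters would return nothing, which matches the empty column stats object described above.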
Skye Wanderman-Milne
17fb6e758f Make sure generated functions returning DecimalVals adhere to x86 ABI
I hit this in the expr refactoring. This makes sure we never expose a
function that returns a DecimalVal directly (rather than through an
extra return parameter as specified by the ABI), which will crash if
called from precompiled native code.

Change-Id: Ifb249086c221b53553d3e7fb39af065f4cca2bac
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3425
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 429448935555b098e324bcb97ab43a7c90e0b918)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3473
2014-07-10 15:33:46 -07:00
Henry Robinson
0874316975 Log full exec request at VLOG(2)
Change-Id: I0009b6f2642658f6bc32b2fb1a65f9d445dca596
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3308
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 112c451c5466b38a048182cd37f9b0eb9589ab4b)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3466
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-07-09 17:57:05 -07:00
Henry Robinson
84195eb1b0 Fix compilation with ASAN
Change-Id: I90c7413a73e868253bc91c647bd6a01ae04c0919
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3436
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
(cherry picked from commit 9bdd73f9526fcb7348ab686e2c05777886028ba2)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3458
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-07-09 14:19:11 -07:00
Alex Behm
21c9eb68b1 Restore casts stripped from grouping exprs by substitution.
Change-Id: I2a317025f9a8549beed7cf79b463239e11a6a2d0
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3352
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3432
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
2014-07-08 10:45:43 -07:00
Skye Wanderman-Milne
b572fe0af5 Remove unnecessary decimal casts for some builtins.
For arithmetic ops, this is an optimization. The Add(decimal,decimal) already handles
the cast as part of the operation.

For binary predicates, the cast is bad and can lead to overflows. The decimal Compare()
function has custom logic to not overflow.

Change-Id: I9f5ad74ea89e9dfa5a3a40c1e07f7e9178bf1d52
(cherry picked from commit 6bffaa885542443ca559888d921853ecd194cbcb)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3414
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-07-03 21:32:51 -07:00
Skye Wanderman-Milne
dbae673715 Open and close exprs on partition key exprs in HdfsPartitionDescriptor
Change-Id: I954cd54113b4fb0d65423850a3a4145791b36107
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3136
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit bf7af4dc7d5013b5d0f0f0797aba3c37f17c1fb6)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3395
2014-07-03 12:04:25 -07:00
Nong Li
274f97efc5 IMPALA-1066: Fix bad free in Min()/Max() of strings.
Change-Id: If66844a88accdc369458ab92f033eef50775d69e
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3373
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-07-01 20:45:08 -07:00
Nong Li
f05e2a92af IMPALA-1066: Build with -no-strict-alias.
Change-Id: I2d9684b0d1f352cba27dff92273d93d60d8435c2
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3336
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3375
Reviewed-by: Nong Li <nong@cloudera.com>
2014-07-01 20:44:36 -07:00
Henry Robinson
dd4c1c32dc Add optional RM reservation limit to memtrackers
If RM and per-query memory limits were enabled at the same time, the
per-query limit would be ignored if RM wanted to expand the memory
allocation. This change adds an optional reservation limit to a
memtracker. The original limit goes back to being a hard limit -
i.e. any attempt to consume more than that amount results in
failure. The RM reservation limit is the RM-allocated memory limit. If
that is exceeded it triggers the ExpandRmReservation() method, which tries
to retrieve more memory as long as the hard limit is observed.

The net effect is that per-query memory limits have the intended,
hard-limit effect, while the RM limits coexist nicely and can expand
with more memory as required.

At the same time, we change the precedence of various ways of suggesting
an initial reservation size so that the user can change the reservation
size via a query option (MEM_RESERVATION_SIZE).

Change-Id: I41bfa4eb1336810a8a5946f6be3472111a052144
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3134
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-07-01 18:08:47 -07:00
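Illustration for dd4c1c32dc (not part of the commit): a sketch of a tracker with a hard limit and a separate, expandable RM reservation limit. The method name `expand_rm_reservation` loosely follows the ExpandRmReservation() method named in the message; the `request_more` callback and everything else are assumptions.

```python
class MemTracker:
    """Hypothetical tracker: hard per-query limit plus an RM reservation."""

    def __init__(self, hard_limit, rm_reservation, request_more):
        self.hard_limit = hard_limit
        self.rm_reservation = rm_reservation
        self.request_more = request_more  # callable(nbytes) -> bytes granted
        self.consumed = 0

    def expand_rm_reservation(self, needed):
        # Ask the resource manager for the shortfall only.
        shortfall = needed - self.rm_reservation
        self.rm_reservation += self.request_more(shortfall)
        return self.rm_reservation >= needed

    def try_consume(self, nbytes):
        new_total = self.consumed + nbytes
        if new_total > self.hard_limit:
            return False  # hard limit: never exceeded, never expanded
        if new_total > self.rm_reservation and not self.expand_rm_reservation(new_total):
            return False  # RM would not grant enough memory
        self.consumed = new_total
        return True
```

Checking the hard limit first means an expansion request never asks for more than the hard limit allows, which is the coexistence the message describes.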
Skye Wanderman-Milne
f0fb28158b FE changes to avoid shipping null-type expressions to the BE.
Once the expr refactoring goes in, the BE will not be able to evaluate
any TYPE_NULL exprs. This patch ensures that the FE casts all null
literals and slot refs before they reach the BE.

There are a bunch of places where we know the appropriate type and
just weren't using it before. This patch also introduces a few notable
hacks:

* Serializing null SlotRefs and NullLiterals as boolean NullLiterals
  in case they weren't cast earlier.
* Converting null SlotRefs to NullLiterals in uncheckedCastTo() since
  we don't need to read from the slot at all.

This works, but we should consider adding a final pass that cleans up
the plan tree and takes care of this.

Change-Id: Ic2ee181139059553d7f2d0e17e9dacaee241df17
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3294
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit a8a67ebcad12956a8260b4ea4189afb7ffab4b68)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3361
2014-07-01 15:48:08 -07:00
Skye Wanderman-Milne
a5c85898e6 Fix StringFunctions::SubString()
Without this patch, the returned StringValue's ptr would be before the
input pointer if the 'pos' argument was < -input.len

Change-Id: I7bd506f5d1119741a94817c34a017215b67cc26e
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3351
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit bad40d2beceffaacc409e34041a00d3ffbabf201)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3360
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-07-01 15:24:39 -07:00
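Illustration for a5c85898e6 (not part of the commit): a sketch of a 1-based substring with negative positions counted from the end, guarding the case where 'pos' is less than -len. Returning an empty string in that case is an assumption; the commit message only says the pointer under-run is fixed, not what is returned.

```python
def substring(s, pos, length=None):
    """1-based substring; a negative pos counts back from the end of s."""
    if pos < 0:
        pos = len(s) + pos + 1  # -1 refers to the last character
    if pos <= 0 or pos > len(s):
        # Covers an original pos < -len(s), which would otherwise start
        # before the beginning of the input. Assumed result: empty string.
        return ""
    max_len = len(s) - pos + 1
    n = max_len if length is None else min(length, max_len)
    return s[pos - 1:pos - 1 + n] if n > 0 else ""
```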
Victor Bittorf
3c388cd1dc CDH-19918: fixed Moscow timezone conversion.
Conversion from UTC to Moscow time was incorrect; this has been fixed.

Change-Id: Ib2a1720424bffff4f09713bfb06b5046fb38c031
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3311
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 9ae067013daf5e2e3a1dca3b31758e87f95432d1)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3357
2014-07-01 13:49:53 -07:00
Victor Bittorf
140b1c8b95 Fixed UDF memory leak warning for STDDEV
Change-Id: I8df3d28e9dc0f06819f6512c175b5dec4210a329
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3312
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 7f44fa68e2d06aa0166263a89a4eaecc21baaa25)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3358
2014-07-01 13:45:20 -07:00
Nong Li
9abca8321b Fix result precision in decimal round/truncate/etc and overflow.
Change-Id: I23840734fd5b7ab7404d94f6df05410b153354de
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3338
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-07-01 08:05:39 -07:00
Nong Li
3fe082d3c9 Add CASE decimal builtin.
Change-Id: I007e7f319acd6a5bce739a08797d1d87ffc64472
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3275
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-07-01 08:05:28 -07:00
Nong Li
d0fe59fe95 Remove unnecessary include from udf dev library.
Change-Id: I8bdc9474d817bf63a0908a0c8e4e7f754b4e0b33
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3331
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-07-01 08:05:09 -07:00
Lenni Kuff
ad933ec765 Switch terminology of 'impersonated user' to 'delegated user'
This is to help ensure naming is consistent across the platform and
also avoid confusion with HS2 "impersonation" which is something very
different.

Change-Id: I48c1b76dff75b92b11ddc7aab0eb9a3a5d20e489
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3315
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 931f6a66c0d8dff25b746d127dc1f36e96b12f98)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3326
2014-06-28 20:46:06 -07:00
Nong Li
163750f170 Fix decimal multiply result precision off by 1.
Change-Id: I860e0d13ee9bae7d3e180103a22fe7606a320b13
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3249
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-06-27 11:22:05 -07:00
Nong Li
3e31f81731 Fix index out of bounds with rtrim().
Change-Id: I8c420a45aacdb0ce8f6a83fa8cdf5e91b8ef1f77
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3268
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-06-27 11:22:00 -07:00
Nong Li
67e80b16e3 Add int96 to multiint benchmark.
This was one idea: just cast to __int128_t as a poor man's int96.
Unfortunately, it seems too slow: ~15x for add, ~10x for multiply and
3x for divide compared to __int128_t.

Change-Id: I06eb3fa3ac1edc2c174873a73a252a0165911b1c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2433
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-06-27 11:21:54 -07:00
Nong Li
553395928e Change logging level of thrift plan in plan fragment executor.
VLOG(3) includes each row, which is useful much less often than the serialized
plan.

Change-Id: I933188f046dafb51da9d06583697792113a9165a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3289
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
2014-06-27 11:21:47 -07:00
Skye Wanderman-Milne
3a6d6b71cb Fix NULL handling in ArithmeticExpr
Before: if both operands to an arithmetic expression were null
literals, we would set the operand types and return type to INT. This
isn't correct for operators that don't support ints, e.g. divide
(there's a separate integer division function), since the function
signature wouldn't match the arithmetic expr's types. I think we
didn't run into problems because the BE uses void*s everywhere, but I
hit this when I switched the arithmetic functions to the UDF
interface.

In addition, some of the builtins were registered with the wrong
return type.

After: set the operand types to a type appropriate for the operator
before we set the return type, meaning the return type gets assigned
correctly using the existing logic.

Change-Id: I39fa147c178d895bdffaf1be676ddaa3af1d42c8
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3255
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 2634932790d1f4a42ce64f73ec3722a8a7be04af)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3298
2014-06-26 23:52:02 -07:00
Skye Wanderman-Milne
6d17b93814 Open and close exprs in tests
Change-Id: Ie4abc8e1e56fc77d68d9656260b8f4adcc2a36e9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3135
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit f7eafefa1051ac9f3e5649f45655b80223af5f29)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3296
2014-06-26 23:48:29 -07:00
Paden Tomasello
d6a20c2f08 Rowbatch.cc uses LZ4 codec instead of Snappy codec
Comparison of Exchange node data for Lz4 and Snappy
running query: select * from tpch.lineitem order by l_orderkey

Snappy:
EXCHANGE_NODE (id=2):(Total: 36s021ms...)
BytesReceived: 26.75 MB (28047762)
DeserializeRowBatchTimer: 246.561ms

Lz4:
EXCHANGE_NODE (id=2):(Total: 34s699ms...)
BytesReceived: 11.20 MB (11741118)
DeserializeRowBatchTimer: 131.379ms

Change-Id: Iae8d212ba0fd508542f3ef9ddaf7507426e13253
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3120
Reviewed-by: Paden Tomasello <paden.tomasello@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3252
2014-06-26 12:06:39 -07:00
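Illustration for d6a20c2f08 (not part of the commit): the ratios implied by the exchange node counters quoted in the message. The variable names are made up for this example; the numbers are copied from the Snappy and Lz4 profiles.

```python
# Counters copied from the commit message.
snappy_bytes, lz4_bytes = 28047762, 11741118
snappy_deser_ms, lz4_deser_ms = 246.561, 131.379

# How many times more bytes the Snappy run received than the Lz4 run.
bytes_ratio = snappy_bytes / lz4_bytes          # about 2.39

# How many times longer row batch deserialization took with Snappy.
deser_ratio = snappy_deser_ms / lz4_deser_ms    # about 1.88
```

These figures come from a single query on a single run, so they indicate direction rather than a general compression ratio.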
Skye Wanderman-Milne
bf8e1b81a0 Make sure QueryExecState::Wait() completes before fetching rows.
We run Wait() asynchronously for API compatibility, but many
QueryExecState functions cannot actually be run concurrently with
Wait() (e.g., Wait() opens output_exprs_, which are then evaluated in
FetchRows()).

Change-Id: I708aa23fdb238ee7aede1113790f48da2859cab9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2993
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 47f20b643e80f0f8640be9264d7ee3fc5d14dad0)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3226
2014-06-23 11:40:08 -07:00
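Illustration for bf8e1b81a0 (not part of the commit): a sketch of running a wait step asynchronously while making the fetch path block until that step has finished. The class, its fields, and the use of `threading.Event` are assumptions for this example and say nothing about how the actual fix synchronizes.

```python
import threading


class AsyncQuery:
    """Hypothetical query whose wait step runs on a background thread."""

    def __init__(self):
        self._wait_done = threading.Event()
        self._output_exprs_open = False
        self._rows = []

    def _wait(self):
        # Stand-in for Wait(): opens the output exprs, then signals.
        self._output_exprs_open = True
        self._rows = [1, 2, 3]
        self._wait_done.set()

    def wait_async(self):
        thread = threading.Thread(target=self._wait)
        thread.start()
        return thread

    def fetch_rows(self):
        # Block until the wait step has completed; only then is it safe
        # to evaluate the output exprs.
        self._wait_done.wait()
        if not self._output_exprs_open:
            raise RuntimeError("output exprs not open")
        return list(self._rows)
```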
Henry Robinson
bac4f6c9c8 Properly account for all finished-with expansions
Change-Id: I86819add942d13fcef3a9dab6977fcabe6cfdb4f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3220
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-06-21 00:40:26 -07:00