Commit Graph

22 Commits

Author SHA1 Message Date
Skye Wanderman-Milne
3a6600c964 Fix UDF test
UDF invocations in udf.test should not specify a database. This is how
we switch between testing IR UDFs in the ir_function_test database and
native UDFs in the native_function_test database.

Change-Id: I09ede18f2b91440ef7a2a76b0daf41a007af2671
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3130
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 4d6160c0b88285aea754f6353cdd02b5e4b15633)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3295
2014-06-26 22:17:56 -07:00
Skye Wanderman-Milne
6ac9a8104b IMPALA-1009: UDF/UDA leaks should not fail queries
With this change, leaky UDFs built with the SDK will still fail when
using the test harness, but leaky UDFs running in Impala will only
trigger a warning. This change also updates the test infrastructure to
always check for non-fatal errors/warnings.

Change-Id: I5615349b9d691e4eddea3e03e152ef12e73835e7
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2844
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 60ce5190d96add6104aba642d2354d87a26000fa)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2938
2014-06-10 21:46:47 -07:00
Nong Li
8f4dc0f2f0 IMPALA-974: Switch from FloatLiteral to DecimalLiteral.
Float/Doubles are lossy so using those as the default literal type
is problematic.

Change-Id: I5a619dd931d576e2e6cd7774139e9bafb9452db9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2758
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-05-31 22:19:06 -07:00
Skye Wanderman-Milne
c8b2017093 Add decimal UDF/UDA support.
Change-Id: Ie48c1cb8e978c7282593b7f602dd68added6d3fd
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2625
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 5048f04b332c13b1bff32fb257272b0fea4b8584)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2739
2014-05-29 20:49:53 -07:00
Dimitris Tsirogiannis
ca86e470de IMPALA-887: Improve partition pruning time
This commit is the first step in improving the performance of partition
pruning. Currently, Impala can prune approximately 10K partitions per
sec, thereby introducing significant overhead for huge tables with a
large number of partitions. With this commit we reduce that overhead by
3X by batching the partition pruning calls to the backend.
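
The batching idea above can be sketched as follows. This is a minimal illustration, not Impala's code: `prune_partitions`, `evaluate_batch`, and the default `batch_size` are hypothetical names and values, and `evaluate_batch` stands in for whatever call the frontend makes to the backend.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batches(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def prune_partitions(
    partitions: Sequence[T],
    evaluate_batch: Callable[[Sequence[T]], List[bool]],
    batch_size: int = 1024,  # hypothetical default, not taken from the commit
) -> List[T]:
    """Keep the partitions for which the predicate evaluates to True.

    `evaluate_batch` is a stand-in for the backend call: it receives a whole
    batch and returns one keep/drop flag per partition, so the per-call
    overhead is paid once per batch instead of once per partition.
    """
    kept: List[T] = []
    for batch in batches(partitions, batch_size):
        keep_flags = evaluate_batch(batch)  # one round trip per batch
        kept.extend(p for p, keep in zip(batch, keep_flags) if keep)
    return kept
```

With N partitions and a batch size of B, the number of backend calls drops from N to ceil(N / B); the speedup actually observed depends on how much of the cost is per-call overhead.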

Change-Id: I3303bfc7fb6fe014790f58a5263adeea94d0fe7d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2608
Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2687
2014-05-26 13:10:12 -07:00
Alex Behm
66a6c1f312 Fix UDF query test files.
Change-Id: Idea277ea2d20c47b2a81b0f2f06c48455de2ea45
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1780
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-03-06 07:37:14 -08:00
Skye Wanderman-Milne
6ceed1e632 UDF API additions
This patch introduces the ability to specify a prepare and close
function for a UDF, as well as FunctionContext methods for maintaining
state across UDF invocations within a query. Many of the changes are
related to adding an Expr::Open() function which calls the UDF's
prepare function, if specified (it has to be called in Open() since
the LLVM module must be compiled first).

Change-Id: I581d90d03dff71f7ff5d4a6bef839ba6bc46b443
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1693
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 8e2ed7fb9051d98f89327715fdebd6f5ed22d6ee)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1757
2014-03-05 07:32:34 -08:00
Skye Wanderman-Milne
203fc66456 Add GetTypeDesc() method to FunctionContext.
This is currently only implemented for NativeUdfExpr.

Change-Id: I81b442c5668dff43d0486d1cfc445bca2af66606
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1664
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit e1087c3a78e6e12938b583c302907bd32c59f524)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1720
2014-03-01 20:24:30 -08:00
Nong Li
d5d4b4785b Fix broken udf test case. Should not specify DB.
Change-Id: I5f6343cbef9f52d349130360e029b38b23d0187a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1505
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-02-10 11:34:56 -08:00
Nong Li
7d578a9e54 Cleanup for IMPALA-774 fix.
Change-Id: I47bce71c482b3576957e88980f764c30f45229a9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1454
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1470
2014-02-05 22:58:51 -08:00
Nong Li
ccd8c0338f IMPALA-774: Fix runtimestate setup when evaluating expr from FE.
We weren't initializing the UDF mem pool, causing UDFs that return strings
to crash if used as part of a constant expression.

Change-Id: Ic3a0e556aec8ce03a9e59f3ccf6980c682046b50
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1447
Reviewed-by: Nong Li <nong@cloudera.com>
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-02-05 11:02:27 -08:00
Skye Wanderman-Milne
9d05d6d03a Allow UDF tests to run in parallel.
Change-Id: I9512d4a6920c4a71383d9374eb5feb303c3db85d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/727
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-01-08 10:53:47 -08:00
Skye Wanderman-Milne
7e8e184acf Allow UDFs in conjunct expressions.
This patch refactors HDFSScanNode to copy and prepare all conjunct
exprs in Prepare(), rather than in the scanner threads. This is
necessary so the UDF exprs get codegen'd. Prepare() also only codegens
the functions for the necessary file formats now, rather than for all
file formats regardless of what's actually being scanned.

Change-Id: Ic3220cbd0cba9a3baa138b1f50ecdc6889ed0cd1
Reviewed-on: http://gerrit.ent.cloudera.com:8080/710
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-01-08 10:53:39 -08:00
Skye Wanderman-Milne
97a6b12e37 Fix UDFs used in partition pruning exprs.
Exprs used for partition pruning are prepared/evaluated with a
separate RuntimeState. If these exprs use UDFs, the runtime state
needs access to the process's ExecEnv so we can use the LibCache, and
the IR produced by the UDF exprs needs to be optimized and jit'd.

Change-Id: If7c1d6ebc0015ef3c21a0421c1a36cad4be66625
Reviewed-on: http://gerrit.ent.cloudera.com:8080/695
Tested-by: jenkins
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-01-08 10:53:39 -08:00
Skye Wanderman-Milne
b41ff0c8cd Modify test-udfs.cc so there are no undefined symbols in shared library.
AnalyzeDDLTest was failing because the fesupport binary couldn't
resolve a function used in libTestUdfs.so (the function was defined in
udf.cc, rather than udf.h). I couldn't figure out how to cleanly build
udf.cc into the libTestUdfs.so, so instead I removed the use of the
function in test-udfs.cc.

Change-Id: I81243547584a5b49a5f9265d0d17e035e18d6110
Reviewed-on: http://gerrit.ent.cloudera.com:8080/694
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-01-08 10:53:27 -08:00
Nong Li
911cfc1bb9 Fix vararg UDFs.
Change-Id: I0e202b984ece7de3d220b6ce89b0c0a4c9edcb45
Reviewed-on: http://gerrit.ent.cloudera.com:8080/688
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-01-08 10:53:26 -08:00
Skye Wanderman-Milne
8692e7df8d Add timestamp support to CodegenAnyVal
Change-Id: I2bbeae16660709c2c15d545e6d1c791912e880db
Reviewed-on: http://gerrit.ent.cloudera.com:8080/655
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-01-08 10:53:21 -08:00
Nong Li
1eb2b7a964 Add execution for vararg UDFs.
Change-Id: I46e5670c09ac0b8e62f39dfc832fe880dd1dc995
Reviewed-on: http://gerrit.ent.cloudera.com:8080/572
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-01-08 10:53:09 -08:00
Skye Wanderman-Milne
b7f83bcd73 Add support for LLVM IR UDFs.
This patch also adds a number of improvements to NativeUdfExpr. Highlights include:

* Correctly handling the lowering of AnyVal struct types (required for ABI compatibility)
* A rudimentary library cache for reusing handles produced by dlopen
* More complicated test cases
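
A library cache of the kind mentioned in the list above might look roughly like this. It is only a sketch under stated assumptions: `LibraryCache` and its `get` method are hypothetical names, the real cache lives in Impala's C++ backend, and `ctypes.CDLL` is used here as a Python stand-in for `dlopen`.

```python
import ctypes
import threading
from typing import Any, Callable, Dict


class LibraryCache:
    """Cache shared-library handles so each path is loaded at most once."""

    def __init__(self, loader: Callable[[str], Any] = ctypes.CDLL) -> None:
        # `loader` is injectable so the cache can be exercised without a
        # real shared library; by default it loads the library via ctypes.
        self._loader = loader
        self._handles: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def get(self, path: str) -> Any:
        """Return the cached handle for `path`, loading it on first use."""
        with self._lock:
            handle = self._handles.get(path)
            if handle is None:
                handle = self._loader(path)
                self._handles[path] = handle
            return handle
```

Keying on the path alone is the simplest choice; a cache that must notice a library being replaced on disk would need extra invalidation logic, which this sketch leaves out.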

Change-Id: Iab9acdd7d7c4308e5d7ee3210f21b033fda5a195
Reviewed-on: http://gerrit.ent.cloudera.com:8080/540
Tested-by: jenkins
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-01-08 10:53:03 -08:00
Nong Li
8963d79f51 Fix build break from UdfContext rename.
Change-Id: Ia3df23fcba7d3812ae90565daab89916cbb50861
Reviewed-on: http://gerrit.ent.cloudera.com:8080/549
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:01 -08:00
Nong Li
e39de94316 Add parser/analysis to support UDAs.
I looked around some and I think having create/drop/show [aggregate] function
seems reasonable and extends nicely for UDTs.

The create aggregate function statement can accept a lot of arguments. For the
non-essential ones, I went with resolving them by name rather than position (i.e.
argName="value"). I think this is better for the user than specifying them by position.

The grammar is:
CREATE AGGREGATE FUNCTION <name>(<arg_types>) RETURNS <type> [INTERMEDIATE <type>]
LOCATION '/path' UpdateFn='Fn' [comment='comment']
[SerializeFn='symbol'] [MergeFn='symbol'] [InitFn='symbol'] [FinalizeFn='symbol']

The optional args at the end can be in any order. If the other symbols are not
specified, we derive them from the required UpdateFn symbol. The analyzer tries
to resolve each derived symbol and fails if it can't find one in the binary.

The simplest example would be:
CREATE AGGREGATE FUNCTION count(float) RETURNS BIGINT LOCATION '/path'
UpdateFn='CountUpdateFn';

In which case we assume the intermediate type is the return type and the other functions
are called 'CountInitFn', 'CountSerializeFn', 'CountMergeFn', and 'CountFinalizeFn'.
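
The name derivation in the example above can be sketched as follows. This is an illustration only: `derive_uda_symbols` is a hypothetical helper, and the rule it uses (swap the last occurrence of "Update" in the UpdateFn symbol for each other role) is an assumption inferred from the 'CountUpdateFn' example, not necessarily what the analyzer does. It also skips the check that each derived symbol exists in the binary.

```python
from typing import Dict, Optional

# The five UDA roles named in the grammar above.
UDA_ROLES = ("Init", "Update", "Serialize", "Merge", "Finalize")


def derive_uda_symbols(
    update_fn: str, overrides: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Map each UDA role to a symbol name derived from the UpdateFn symbol.

    `overrides` holds symbols the user specified explicitly, keyed by role
    (for example {"Merge": "MyMerge"}); they win over derived names.
    """
    marker = "Update"
    idx = update_fn.rfind(marker)
    if idx < 0:
        raise ValueError(
            f"cannot derive symbols: {update_fn!r} does not contain {marker!r}"
        )
    prefix = update_fn[:idx]
    suffix = update_fn[idx + len(marker):]
    symbols = {role: f"{prefix}{role}{suffix}" for role in UDA_ROLES}
    symbols.update(overrides or {})
    return symbols
```

Calling `derive_uda_symbols("CountUpdateFn")` yields the four derived names listed above plus the original 'CountUpdateFn' under the "Update" role.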

Change-Id: Iefc5741293050f5b295df28e9d1a7d039ead8675
Reviewed-on: http://gerrit.ent.cloudera.com:8080/513
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-01-08 10:52:59 -08:00
Skye Wanderman-Milne
fd99db0300 First pass at UdfExpr.
Change-Id: I517bf56541749b5c2459554821c7bf838239fdf0
Reviewed-on: http://gerrit.ent.cloudera.com:8080/439
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-01-08 10:52:50 -08:00