Commit Graph

13 Commits

Author SHA1 Message Date
Nong Li
1cab95066d Add the return type as a column for SHOW FUNCTIONS.
Also includes some misc pattern matching cleanup.

Change-Id: I6c9ec78b094a73864b4d669afbd75a48c9bf9585
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2199
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2271
2014-04-17 17:58:13 -07:00
Lenni Kuff
6bba0c8ffe Fix bug cleaning up removed Functions and fix test_ddl to create all test dbs
When dropping functions, we neeed to remove the function from the list
of Functions with that name AND remove the list from the Function map if
the list is empty. The second part wasn't happening.

Also fixes the test_ddl to properly create all test databases.

Change-Id: Id85af7d5db74a31161f48bea3816bdf734063133
Reviewed-on: http://gerrit.ent.cloudera.com:8080/952
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-01-08 10:54:00 -08:00
Lenni Kuff
39f77b8b8f Add support for cluster-synchronized catalog operations
This change adds support for cluster-synchronized catalog operations. This provides the
guaranteethat after a catalog op completes, all other subscribers to the catalog topic have
also processed that update. This is useful when load balancing, because a common workflow
is to target a different impalad for each statement executed.
For example if each of the following were executed sequentially, but targeting
a different node:
1) CREATE TABLE Foo
2) INSERT INTO Foo
3) SELECT * FROM Foo
4) INSERT INTO Foo ....

Since both the INSERT and the CREATE update the catalog, it would not work as expected
without this patch. The user might either get a "table not found" error or would be
missing partition information from the INSERT.

The downside is that this approach to DDL takes a bit longer because we need to wait
until all subscribers have processed an update. If all nodes are healthy, this overhead
should not be significantly longer than the current DDL time. However, a single bad node
might slow down or completely block the completion of all DDL operations. By default
this feature is disabled, but it can be enabled using a new query option: SYNCED_DDL=1

To test this, the base test suite was updated to support selecting a random impalad
to execute each query section in a query test file. This is currently only enabled
for the insert and DDL tests, but could be leveraged by more tests in the future.

TODO: Add additional failure tests around this functionality.
TODO: Add an explicit "sync" statement so users do not need to run all their DDL
in this mode (since it is slower).

Change-Id: I45e757a931bf2a4740cc0cdd1e76ce49a1e22b83
Reviewed-on: http://gerrit.ent.cloudera.com:8080/899
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:58 -08:00
Nong Li
601f24a198 UDA execution loose ends.
Unfortunately, the BE does not have the codegen path to execute UDAs.
This puts some restrictions on the UDAs we can run.

- No IR UDAs
- No varargs
- Must have 8 arguments or less.

The code to do this is almost all there for UDFs but I'm not sure I'll get to it.

Change-Id: I8a06e635a9138397c8474a5704c3e588bb92347b
Reviewed-on: http://gerrit.ent.cloudera.com:8080/703
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-01-08 10:53:38 -08:00
Nong Li
a944a1fe52 'Invalidate metadata' no longer clears user functions.
Change-Id: I36de18fefa1d515a7960c2bf8c116d5217c388d6
Reviewed-on: http://gerrit.ent.cloudera.com:8080/726
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-01-08 10:53:36 -08:00
Lenni Kuff
01c8c43fec Uniquify FUNCTION catalog topic entry keys by including parent database name
Change-Id: I6aa49520f548ddfcd557e2f908a09be454765e8c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/698
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-01-08 10:53:29 -08:00
Nong Li
6b9a7de02e Add symbol resolution during analysis for create function stmts.
Before this, we had to specify the entire mangled symbol. This can be quite
long and quite tedious (take a look at some of the create UDA test cases that
specify all the symbols).

This patch adds some code to convert from the user function signature to the
mangled name. This means the user can specify the unmangled name and we can
do the symbol lookup. The mangling rules are pretty convoluted but if it is
messed up, the user can always specify the full symbol.

Some other minor cleanup in:
  - JNI from FE to BE
  - UDFs/UDAs that are loaded as test data

Change-Id: I733dbf3a72cb7b06221c27e622d161bcca0d74a8
Reviewed-on: http://gerrit.ent.cloudera.com:8080/624
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-01-08 10:53:20 -08:00
Nong Li
4bb1e8c854 Add varargs to UDF/UDA parser/analyzer.
Change-Id: I4c3f2e74f6c29cee4b0b787c058b0455b16a11fd
Reviewed-on: http://gerrit.ent.cloudera.com:8080/548
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
2014-01-08 10:53:05 -08:00
Nong Li
e39de94316 Add parser/analysis to support UDAs.
I looked around some and I think having create/drop/show [aggregate] function
seems reasonable and extends nicely for UDTs.

The create aggregate function can accept a lot of arguments. The non-essential one, I
went with resolving them by name rather than position (i.e. argName="value"). I think
this is better for the user than specifying it by position.

The grammar is:
CREATE AGGREGATE <name>(<arg_types>) RETURNS <type> [INTERMEDIATE <type>]
LOCATION '/path' UpdateFn='Fn' [comment='comment']
[SerializeFn='symbol'] [MergeFn='symbol'] [InitFn='symbol'] [FinalizeFn='symbol']

The optional args at the end can be in any order. If the other symbols are not
specified, we derive them from the UpdateFn symbol that's required. The analyzer
would try to figure it out and fail if we can't find the derived symbol in the binary.

The simplest example would be:
CREATE AGGREGATE FUNCTION count(float) RETURNS BIGINT LOCATION '/path'
UpdateFn='CountUpdateFn';

In which case we assume the intermediate type is the return type and the other functions
are called 'CountInitFn', 'CountSerializeFn', 'CountMergeFn' 'CountFinalizeFn'.

Change-Id: Iefc5741293050f5b295df28e9d1a7d039ead8675
Reviewed-on: http://gerrit.ent.cloudera.com:8080/513
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-01-08 10:52:59 -08:00
Nong Li
a0bf45a0b4 Add udf type.
Change-Id: Ic5f52c127750cc9c847a3e34d3fdcfc78bee5a8a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/454
Tested-by: jenkins
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
2014-01-08 10:52:48 -08:00
Nong Li
308650f208 Fix create function ddl test setup issue.
Change-Id: I30c9a4342efbdb17bd53fb14bdcee172506cdadb
Reviewed-on: http://gerrit.ent.cloudera.com:8080/447
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-01-08 10:52:44 -08:00
Nong Li
8eb727b585 UDF ddl cleanup
Change-Id: I381fed277b5809727d2d8bf430258c01d2d0ae1f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/436
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
2014-01-08 10:52:43 -08:00
Nong Li
2394ae2e66 UDF parsing and analysis.
Change-Id: If8058c1cb66bf5e9c7049d4b78f5882b46c03fc1
Reviewed-on: http://gerrit.ent.cloudera.com:8080/318
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
2014-01-08 10:52:32 -08:00