impala

mirror of https://github.com/apache/impala.git synced 2025-12-31 15:00:10 -05:00

Author	SHA1	Message	Date
Nong Li	1cab95066d	Add the return type as a column for SHOW FUNCTIONS. Also includes some misc pattern matching cleanup. Change-Id: I6c9ec78b094a73864b4d669afbd75a48c9bf9585 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2199 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/2271	2014-04-17 17:58:13 -07:00
Lenni Kuff	6bba0c8ffe	Fix bug cleaning up removed Functions and fix test_ddl to create all test dbs. When dropping functions, we need to remove the function from the list of Functions with that name AND remove the list from the Function map if the list is empty. The second part wasn't happening. Also fixes the test_ddl to properly create all test databases. Change-Id: Id85af7d5db74a31161f48bea3816bdf734063133 Reviewed-on: http://gerrit.ent.cloudera.com:8080/952 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:00 -08:00
Lenni Kuff	39f77b8b8f	Add support for cluster-synchronized catalog operations. This change adds support for cluster-synchronized catalog operations. This provides the guarantee that after a catalog op completes, all other subscribers to the catalog topic have also processed that update. This is useful when load balancing, because a common workflow is to target a different impalad for each statement executed. For example, if each of the following were executed sequentially, but targeting a different node: 1) CREATE TABLE Foo 2) INSERT INTO Foo 3) SELECT * FROM Foo 4) INSERT INTO Foo .... Since both the INSERT and the CREATE update the catalog, it would not work as expected without this patch. The user might either get a "table not found" error or would be missing partition information from the INSERT. The downside is that this approach to DDL takes a bit longer because we need to wait until all subscribers have processed an update. If all nodes are healthy, this overhead should not be significantly longer than the current DDL time. However, a single bad node might slow down or completely block the completion of all DDL operations. By default this feature is disabled, but it can be enabled using a new query option: SYNCED_DDL=1. To test this, the base test suite was updated to support selecting a random impalad to execute each query section in a query test file. This is currently only enabled for the insert and DDL tests, but could be leveraged by more tests in the future. TODO: Add additional failure tests around this functionality. TODO: Add an explicit "sync" statement so users do not need to run all their DDL in this mode (since it is slower). Change-Id: I45e757a931bf2a4740cc0cdd1e76ce49a1e22b83 Reviewed-on: http://gerrit.ent.cloudera.com:8080/899 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-08 10:53:58 -08:00
Nong Li	601f24a198	UDA execution loose ends. Unfortunately, the BE does not have the codegen path to execute UDAs. This puts some restrictions on the UDAs we can run. - No IR UDAs - No varargs - Must have 8 arguments or less. The code to do this is almost all there for UDFs but I'm not sure I'll get to it. Change-Id: I8a06e635a9138397c8474a5704c3e588bb92347b Reviewed-on: http://gerrit.ent.cloudera.com:8080/703 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:53:38 -08:00
Nong Li	a944a1fe52	'Invalidate metadata' no longer clears user functions. Change-Id: I36de18fefa1d515a7960c2bf8c116d5217c388d6 Reviewed-on: http://gerrit.ent.cloudera.com:8080/726 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:53:36 -08:00
Lenni Kuff	01c8c43fec	Uniquify FUNCTION catalog topic entry keys by including parent database name. Change-Id: I6aa49520f548ddfcd557e2f908a09be454765e8c Reviewed-on: http://gerrit.ent.cloudera.com:8080/698 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:53:29 -08:00
Nong Li	6b9a7de02e	Add symbol resolution during analysis for create function stmts. Before this, we had to specify the entire mangled symbol. This can be quite long and quite tedious (take a look at some of the create UDA test cases that specify all the symbols). This patch adds some code to convert from the user function signature to the mangled name. This means the user can specify the unmangled name and we can do the symbol lookup. The mangling rules are pretty convoluted but if it is messed up, the user can always specify the full symbol. Some other minor cleanup in: - JNI from FE to BE - UDFs/UDAs that are loaded as test data Change-Id: I733dbf3a72cb7b06221c27e622d161bcca0d74a8 Reviewed-on: http://gerrit.ent.cloudera.com:8080/624 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:53:20 -08:00
Nong Li	4bb1e8c854	Add varargs to UDF/UDA parser/analyzer. Change-Id: I4c3f2e74f6c29cee4b0b787c058b0455b16a11fd Reviewed-on: http://gerrit.ent.cloudera.com:8080/548 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins	2014-01-08 10:53:05 -08:00
Nong Li	e39de94316	Add parser/analysis to support UDAs. I looked around some and I think having create/drop/show [aggregate] function seems reasonable and extends nicely for UDTs. The create aggregate function can accept a lot of arguments. For the non-essential ones, I went with resolving them by name rather than position (i.e. argName="value"). I think this is better for the user than specifying them by position. The grammar is: CREATE AGGREGATE FUNCTION <name>(<arg_types>) RETURNS <type> [INTERMEDIATE <type>] LOCATION '/path' UpdateFn='Fn' [comment='comment'] [SerializeFn='symbol'] [MergeFn='symbol'] [InitFn='symbol'] [FinalizeFn='symbol'] The optional args at the end can be in any order. If the other symbols are not specified, we derive them from the UpdateFn symbol that's required. The analyzer would try to figure it out and fail if we can't find the derived symbol in the binary. The simplest example would be: CREATE AGGREGATE FUNCTION count(float) RETURNS BIGINT LOCATION '/path' UpdateFn='CountUpdateFn'; In which case we assume the intermediate type is the return type and the other functions are called 'CountInitFn', 'CountSerializeFn', 'CountMergeFn', 'CountFinalizeFn'. Change-Id: Iefc5741293050f5b295df28e9d1a7d039ead8675 Reviewed-on: http://gerrit.ent.cloudera.com:8080/513 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:52:59 -08:00
Nong Li	a0bf45a0b4	Add udf type. Change-Id: Ic5f52c127750cc9c847a3e34d3fdcfc78bee5a8a Reviewed-on: http://gerrit.ent.cloudera.com:8080/454 Tested-by: jenkins Reviewed-by: Alex Behm <alex.behm@cloudera.com>	2014-01-08 10:52:48 -08:00
Nong Li	308650f208	Fix create function ddl test setup issue. Change-Id: I30c9a4342efbdb17bd53fb14bdcee172506cdadb Reviewed-on: http://gerrit.ent.cloudera.com:8080/447 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:52:44 -08:00
Nong Li	8eb727b585	UDF ddl cleanup. Change-Id: I381fed277b5809727d2d8bf430258c01d2d0ae1f Reviewed-on: http://gerrit.ent.cloudera.com:8080/436 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com>	2014-01-08 10:52:43 -08:00
Nong Li	2394ae2e66	UDF parsing and analysis. Change-Id: If8058c1cb66bf5e9c7049d4b78f5882b46c03fc1 Reviewed-on: http://gerrit.ent.cloudera.com:8080/318 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:52:32 -08:00

13 Commits
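
Commit e39de94316 above says that when only UpdateFn is given, the remaining UDA symbols are derived from it. The rule can be sketched roughly as follows. This is an illustration in Python, not Impala's analyzer code: the helper name `derive_uda_symbols` is hypothetical, and swapping the last occurrence of "Update" in the symbol is an assumption inferred from the `CountUpdateFn` example in that commit message. The real analyzer also checks that each derived symbol exists in the binary, which this sketch does not attempt.

```python
# Hypothetical helper for illustration only; not part of Impala.
ROLES = ("Init", "Update", "Serialize", "Merge", "Finalize")


def derive_uda_symbols(update_fn, overrides=None):
    """Return a dict mapping 'InitFn', 'UpdateFn', ... to symbol names.

    Assumption: sibling symbols are formed by replacing the last
    occurrence of 'Update' in the UpdateFn symbol with the other role.
    Explicitly supplied symbols (overrides) win over derived ones.
    """
    prefix, sep, suffix = update_fn.rpartition("Update")
    if not sep:
        raise ValueError(
            "cannot derive symbols: %r does not contain 'Update'" % update_fn
        )

    # Derive every role from the required UpdateFn symbol.
    symbols = {role + "Fn": prefix + role + suffix for role in ROLES}

    # Apply user-specified symbols, which may appear in any order.
    for key, value in (overrides or {}).items():
        if key not in symbols:
            raise KeyError("unknown UDA argument: %r" % key)
        symbols[key] = value
    return symbols
```

Under these assumptions, `derive_uda_symbols("CountUpdateFn")` yields the names listed in the commit message, and passing `overrides={"MergeFn": "MyMerge"}` replaces only that one entry.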