impala

mirror of https://github.com/apache/impala.git synced 2026-01-03 15:00:52 -05:00

Author	SHA1	Message	Date
Skye Wanderman-Milne	6ac9a8104b	IMPALA-1009: UDF/UDA leaks should not fail queries With this change, leaky UDFs built with the SDK will still fail when using the test harness, but leaky UDFs running in Impala will only trigger a warning. This change also updates the test infrastructure to always check for non-fatal errors/warnings. Change-Id: I5615349b9d691e4eddea3e03e152ef12e73835e7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2844 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 60ce5190d96add6104aba642d2354d87a26000fa) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2938	2014-06-10 21:46:47 -07:00
Victor Bittorf	09aff77a6c	IMPALA-943: removed database udf_test from front-end tests Added CATCH section to test files. Change-Id: I28ba3a6e5ae4c53df5b86505573793d7b150863b Reviewed-on: http://gerrit.ent.cloudera.com:8080/2782 Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com> Tested-by: jenkins (cherry picked from commit 5b616715958f3ebfdc45b8dc0e4baa82bd55f1d2) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2912	2014-06-09 19:06:15 -07:00
Skye Wanderman-Milne	c8b2017093	Add decimal UDF/UDA support. Change-Id: Ie48c1cb8e978c7282593b7f602dd68added6d3fd Reviewed-on: http://gerrit.ent.cloudera.com:8080/2625 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 5048f04b332c13b1bff32fb257272b0fea4b8584) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2739	2014-05-29 20:49:53 -07:00
Skye Wanderman-Milne	bd2fc2d1d4	IMPALA-934: Refresh cached UDF library when creating a new function This change adds the ability to refresh a local cache entry, causing the old cache entry to be dropped and the library to be reloaded from HDFS. This is used in ResolveSymbolLookup(), which is called by the frontend when creating a new function, and in ImpalaServer when receiving a "create function" heartbeat. This change also makes sure the FE calls into the backend for jars, so jars get refreshed as well. Change-Id: I5fd61c1bc2e04838449335d5a68b61af8b101b01 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2286 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit e8587794b3b82438190c91b2ebe9d1e12db73981) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2348	2014-04-24 19:39:16 -07:00
Lenni Kuff	bb09b5270f	IMPALA-839: Update tests to be more thorough when run exhaustively Some tests have constraints that were there only to help reduce runtime which reduces coverage when running in exhaustive mode. The majority of the constraints are because it adds no value to run the test across additional dimensions (or it is invalid to run with those dimensions). Updates the tests that have legitimate constraints to use two new helper methods for constraining the table format dimension: create_uncompressed_text_dimension() create_parquet_dimension() These will create a dimension that will produce a single test vector, either uncompressed text or parquet respectively. Change-Id: Id85387c1efd5d192f8059ef89934933389bfe247 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2149 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins (cherry picked from commit e02acbd469bc48c684b2089405b4a20552802481) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2290	2014-04-18 20:11:31 -07:00
Alex Behm	2fff51d9e9	IMP-1329,IMPALA-924: Make ExchangeNode::Open() block until rows are available. The bug: Coordinator::Wait() is supposed to block until rows become available for consumption by the client. We rely on Wait() to determine when to advance the query status to a 'ready' state and signal to the client that rows can be fetched. Long fetch times can trigger client timeouts at various levels (socket, app, etc.). Coordinator::Wait() simply opens the coordinator fragment's plan tree. For most plan nodes, Open() does work to prepare the plan tree, s.t., GetNext() returns quickly. However, for ExchangeNodes Open() used to not wait until rows are obtained from the underlying stream receiver. The fix: Make ExchangeNode::Open() block until rows are available. Change-Id: I7b197eea11d21fd732414d96c899a17b2d99631c Reviewed-on: http://gerrit.ent.cloudera.com:8080/2128 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2185	2014-04-10 23:49:38 -07:00
Skye Wanderman-Milne	8e9776b824	Mark TestUdfs.test_mem_limits to run serially This was causing other tests to fail with process mem limit exceeded. Change-Id: I1407b0896052aece691c681827994961b09d8103 (cherry picked from commit 2bcc46117f504f50ded724fddf74f24bd829c6c6) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2003 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-03-19 14:18:11 -07:00
Skye Wanderman-Milne	3e728f3180	Symbol mangling for UDF prepare/close functions Change-Id: If8f1386073f467e66ada74e606fc98f3344f0733 (cherry picked from commit 32df8b3f963a2b46ec33aad86a151d4c7ecda39c) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1993 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-03-19 02:15:07 -07:00
Skye Wanderman-Milne	44125729dc	UDF/UDA memory management improvements * AggFnEvaluator now uses the UDF mem pool (I'm planning to change this to per-exec node pools in the expr refactoring) * FunctionContext::TrackAllocation()/Free() actually use the UDF's mem tracker * Added FunctionContextImpl::Close() which sets warnings for leaked allocations Change-Id: I792ffd49102a92b57e34df18d8ff5f5d0fd27370 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1792 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com> (cherry picked from commit 41a5f7cfa718789fa3b2de3a31f085411fb5000c) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1954 Tested-by: jenkins	2014-03-17 20:38:25 -07:00
Lenni Kuff	23c619f794	Limit test_udfs to always run with a single exec_option test vector Change-Id: If3ff1f5f17a95cce88282f9dc165fe5ce85200b9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1781 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1811 Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-03-07 18:44:11 -08:00
Skye Wanderman-Milne	6ceed1e632	UDF API additions This patch introduces the ability to specify a prepare and close function for a UDF, as well as FunctionContext methods for maintaining state across UDF invocations within a query. Many of the changes are related to adding an Expr::Open() function which calls the UDF's prepare function, if specified (it has to be called in Open() since the LLVM module must be compiled first). Change-Id: I581d90d03dff71f7ff5d4a6bef839ba6bc46b443 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1693 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 8e2ed7fb9051d98f89327715fdebd6f5ed22d6ee) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1757	2014-03-05 07:32:34 -08:00
Skye Wanderman-Milne	203fc66456	Add GetTypeDesc() method to FunctionContext. This is currently only implemented for NativeUdfExpr. Change-Id: I81b442c5668dff43d0486d1cfc445bca2af66606 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1664 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit e1087c3a78e6e12938b583c302907bd32c59f524) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1720	2014-03-01 20:24:30 -08:00
Nong Li	904ae86e82	IMPALA-626: Allow dropping functions while they are running. Change-Id: Ia9d6fa1daadddbd05961696d13b9ff43fef2da61 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1621 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-02-20 13:12:10 -08:00
Lenni Kuff	5f027f61c5	IMPALA-800 / IMPALA-795: Check catalog version before removing entries from the lib cache There was an issue with the lib cache cleanup code where if a function were dropped then re-created we might incorrectly remove the new function's library from the cache. Consider these statements executed in quick succession: 1) create function fn() 2) drop function fn() 3) create function fn() 4) select fn() ... Since we perform direct-DDL and immediately apply the result of a DDL operation to the local impalad catalog, steps 1-4 may complete before a statestore catalog update with the drop from step 2) is received. When the statestore heartbeat with the drop is received, we incorrectly removed the new function's lib cache entry while the select statement was executing, causing the crash. The fix for this problem is to verify the catalog versions to ensure we only drop items that have a catalog version <= the catalog version the drop corresponds to. Change-Id: I7dd1886bf24740cb41f1315ecbb540e38d9ad363 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1552 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1576	2014-02-17 17:56:49 -08:00
Skye Wanderman-Milne	3598395290	Set sync_ddl=true for tests that drop functions. This is a temporary "fix" for IMPALA-795 to unblock the build. The actual fix should prevent a dropped and re-created function from being re-dropped by an old catalog update. Change-Id: Id9dc36a8ecd5e7d1a1146ad0ac092ae12cb33529 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1547 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 80439d638a4ac02cedfe1490556b176cd818429f) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1559 Tested-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-02-14 10:44:54 -08:00
Nong Li	3722711a06	IMPALA-800 workaround. Mark test_libs_with_same_filename as serial. This test will drop functions in a binary used by the other UDF tests. That triggers IMPALA-800. Change-Id: I8e6f1ad5b4a7ece2d891559751142f0c12e07c3c Reviewed-on: http://gerrit.ent.cloudera.com:8080/1556 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com> (cherry picked from commit 95100e0bdfd9472183fcc7cd8636666d5b654a37) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1558 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-02-14 00:22:08 -08:00
Nong Li	80d4fd958e	IMPALA-786: Drop function should clear library cache. We were previously only clearing the cache in the catalog service update loop so the impalad the drop was issued to was not doing the right thing. Change-Id: I6bee228e8c0d565cea4ea61cbf64240d83a45a7d Reviewed-on: http://gerrit.ent.cloudera.com:8080/1511 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-02-10 18:51:39 -08:00
Skye Wanderman-Milne	b54d16dabd	IMPALA-679: Append hash of HDFS path to filename in CopyHdfsFile() to avoid collisions. Change-Id: Ia84fa81fe043a9604248d66ed963ef3f91b0601e Reviewed-on: http://gerrit.ent.cloudera.com:8080/1018 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-01-08 10:54:22 -08:00
Lenni Kuff	39f77b8b8f	Add support for cluster-synchronized catalog operations This change adds support for cluster-synchronized catalog operations. This provides the guarantee that after a catalog op completes, all other subscribers to the catalog topic have also processed that update. This is useful when load balancing, because a common workflow is to target a different impalad for each statement executed. For example if each of the following were executed sequentially, but targeting a different node: 1) CREATE TABLE Foo 2) INSERT INTO Foo 3) SELECT * FROM Foo 4) INSERT INTO Foo .... Since both the INSERT and the CREATE update the catalog, it would not work as expected without this patch. The user might either get a "table not found" error or would be missing partition information from the INSERT. The downside is that this approach to DDL takes a bit longer because we need to wait until all subscribers have processed an update. If all nodes are healthy, this overhead should not be significantly longer than the current DDL time. However, a single bad node might slow down or completely block the completion of all DDL operations. By default this feature is disabled, but it can be enabled using a new query option: SYNCED_DDL=1 To test this, the base test suite was updated to support selecting a random impalad to execute each query section in a query test file. This is currently only enabled for the insert and DDL tests, but could be leveraged by more tests in the future. TODO: Add additional failure tests around this functionality. TODO: Add an explicit "sync" statement so users do not need to run all their DDL in this mode (since it is slower). Change-Id: I45e757a931bf2a4740cc0cdd1e76ce49a1e22b83 Reviewed-on: http://gerrit.ent.cloudera.com:8080/899 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-01-08 10:53:58 -08:00
Skye Wanderman-Milne	9d05d6d03a	Allow UDF tests to run in parallel. Change-Id: I9512d4a6920c4a71383d9374eb5feb303c3db85d Reviewed-on: http://gerrit.ent.cloudera.com:8080/727 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-01-08 10:53:47 -08:00
Nong Li	4800995d44	Add execution for Hive UDFs. Change-Id: I6a5ad96fed77e2b8a2701f21a917a8eb7a11d500 Reviewed-on: http://gerrit.ent.cloudera.com:8080/458 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:53:25 -08:00
Nong Li	904289d168	Add UDA execution. Change-Id: Ie5aab79742675fc62ed731c13abe83304df80991 Reviewed-on: http://gerrit.ent.cloudera.com:8080/642 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com>	2014-01-08 10:53:24 -08:00
Nong Li	6b9a7de02e	Add symbol resolution during analysis for create function stmts. Before this, we had to specify the entire mangled symbol. This can be quite long and quite tedious (take a look at some of the create UDA test cases that specify all the symbols). This patch adds some code to convert from the user function signature to the mangled name. This means the user can specify the unmangled name and we can do the symbol lookup. The mangling rules are pretty convoluted but if it is messed up, the user can always specify the full symbol. Some other minor cleanup in: - JNI from FE to BE - UDFs/UDAs that are loaded as test data Change-Id: I733dbf3a72cb7b06221c27e622d161bcca0d74a8 Reviewed-on: http://gerrit.ent.cloudera.com:8080/624 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:53:20 -08:00
Skye Wanderman-Milne	b7f83bcd73	Add support for LLVM IR UDFs. This patch also adds a number of improvements to NativeUdfExpr. Highlights include: * Correctly handling the lowering of AnyVal struct types (required for ABI compatibility) * A rudimentary library cache for reusing handles produced by dlopen * More complicated test cases Change-Id: Iab9acdd7d7c4308e5d7ee3210f21b033fda5a195 Reviewed-on: http://gerrit.ent.cloudera.com:8080/540 Tested-by: jenkins Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-01-08 10:53:03 -08:00
Skye Wanderman-Milne	cf7ed25377	Fix UDF test, take two Change-Id: I817389d94dab665199d2c1b7365e8ce0d1495c41 Reviewed-on: http://gerrit.ent.cloudera.com:8080/504 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-01-08 10:52:53 -08:00
Skye Wanderman-Milne	fd99db0300	First pass at UdfExpr. Change-Id: I517bf56541749b5c2459554821c7bf838239fdf0 Reviewed-on: http://gerrit.ent.cloudera.com:8080/439 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com>	2014-01-08 10:52:50 -08:00

26 Commits
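
The IMPALA-800 / IMPALA-795 entry in the log describes guarding lib cache cleanup with a catalog version comparison. The sketch below is a toy model of that idea only, not Impala's implementation: the `LibCache` class, its method names, the library path, and the version numbers are all hypothetical and chosen for illustration. It replays the create / drop / re-create sequence from that commit message and shows that a late drop notification leaves the newer entry alone.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LibCache:
    """Toy library cache keyed by path (hypothetical, not Impala's API)."""

    # Maps library path -> catalog version of the DDL that created the entry.
    entries: Dict[str, int] = field(default_factory=dict)

    def add(self, path: str, catalog_version: int) -> None:
        # Record (or refresh) the entry with the version of the creating DDL.
        self.entries[path] = catalog_version

    def remove_if_not_newer(self, path: str, drop_version: int) -> bool:
        """Remove the entry only if its version is <= the drop's version."""
        version = self.entries.get(path)
        if version is None or version > drop_version:
            # Nothing cached, or the entry belongs to a later re-create.
            return False
        del self.entries[path]
        return True


cache = LibCache()
path = "/udfs/libfn.so"  # assumed path, for illustration only

cache.add(path, catalog_version=1)  # 1) create function fn()
cache.entries.pop(path)             # 2) drop fn(), applied locally (version 2)
cache.add(path, catalog_version=3)  # 3) create function fn() again

# 4) The delayed catalog update carrying the drop from step 2 arrives.
stale_drop_removed = cache.remove_if_not_newer(path, drop_version=2)
```

With the comparison in place the stale drop is ignored, while a later drop carrying version 3 or higher would still remove the entry.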