Also add support for "SET", which returns a table of query options and
their respective values.
The front-end parses the option into a (key, value) pair, and the existing
backend logic is then used to set the option or to return the result set.
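For example (EXPLAIN_LEVEL here is just an illustrative option name):
SET;                  -- returns the table of query options and their values
SET EXPLAIN_LEVEL=2;  -- front-end parses this into ("EXPLAIN_LEVEL", "2")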
Change-Id: I40dbd98537e2a73bdd5b27d8b2575a2fe6f8295b
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3582
Reviewed-by: Daniel Hecht <dhecht@cloudera.com>
Tested-by: jenkins
(cherry picked from commit aa0f6a2fc1d3fe21f22cc7bc56887e1fdb02250b)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3614
This patch introduces the ability to specify a prepare and close
function for a UDF, as well as FunctionContext methods for maintaining
state across UDF invocations within a query. Many of the changes are
related to adding an Expr::Open() function, which calls the UDF's
prepare function if one is specified (it has to be called in Open()
because the LLVM module must be compiled first).
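As a sketch of the resulting interface (the function names are hypothetical,
and the signatures follow the udf.h header as I understand it, so details may
differ from this patch):

#include "udf/udf.h"
using namespace impala_udf;

// Prepare: runs once before any rows; allocates per-thread state.
void CountedAddPrepare(FunctionContext* ctx,
                       FunctionContext::FunctionStateScope scope) {
  if (scope != FunctionContext::THREAD_LOCAL) return;
  int64_t* calls = reinterpret_cast<int64_t*>(ctx->Allocate(sizeof(int64_t)));
  *calls = 0;
  ctx->SetFunctionState(scope, calls);
}

// The UDF itself: reads and updates the state stashed in the FunctionContext.
IntVal CountedAdd(FunctionContext* ctx, const IntVal& x, const IntVal& y) {
  int64_t* calls = reinterpret_cast<int64_t*>(
      ctx->GetFunctionState(FunctionContext::THREAD_LOCAL));
  ++*calls;  // state persists across invocations within the query
  if (x.is_null || y.is_null) return IntVal::null();
  return IntVal(x.val + y.val);
}

// Close: runs once after the last row; frees the state.
void CountedAddClose(FunctionContext* ctx,
                     FunctionContext::FunctionStateScope scope) {
  if (scope != FunctionContext::THREAD_LOCAL) return;
  ctx->Free(reinterpret_cast<uint8_t*>(ctx->GetFunctionState(scope)));
}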
Change-Id: I581d90d03dff71f7ff5d4a6bef839ba6bc46b443
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1693
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 8e2ed7fb9051d98f89327715fdebd6f5ed22d6ee)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1757
This patch cleans up analysis and execution of scalar and aggregate functions
so that builtins and user functions are handled identically. The only
difference is that the catalog is always populated with the builtins.
The BE always gets a TFunction object and simply executes it (builtins have
an empty HDFS file location).
This removes the opcode registry; all of its functionality is subsumed by
the catalog, where most of it was already duplicated anyway.
This also introduces the concept of a system database: a database that the
user cannot modify and that is populated automatically on startup.
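As an illustration, assuming the system database holding the builtins is
exposed under the name used in released versions (_impala_builtins; the exact
name in this patch is an assumption):

SHOW FUNCTIONS IN _impala_builtins;
-- lists the builtins the catalog is always populated with; CREATE/DROP
-- statements against this database are rejected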
Change-Id: Iaa3f84dad0a1a57691f5c7d8df7305faf01d70ed
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1386
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1577
PrimitiveType is an enum and cannot be used for more complex types. The change
touches a lot of files, but very mechanically.
A similar change needs to be made in the BE, which will be a subsequent patch.
As written, this breaks rolling upgrade due to the Thrift changes. If
that is not okay, we can work around it, but it will be annoying.
Change-Id: If3838bb27377bfc436afd6d90a327de2ead0af54
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1287
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1304
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
There are now four explain levels, summarized as follows:
- Level 0: MINIMAL
Non-fragmented parallel plan only showing plan nodes with minimal attributes
- Level 1: STANDARD
Non-fragmented parallel plan with some details in plan nodes
- Level 2: EXTENDED
Non-fragmented parallel plan with full details in plan nodes including
the table/column stats, row size, #hosts, cardinality,
and estimated per-host memory requirement
- Level 3: VERBOSE
Fragmented parallel plan with full details (like level 2)
This patch also includes several bugfixes related to plan costing and/or
testing of explain plans.
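As a usage sketch, combined with the SET support described at the top of this
log (assuming the query option is named EXPLAIN_LEVEL):

SET EXPLAIN_LEVEL=2;
EXPLAIN SELECT c1 FROM t;  -- prints the EXTENDED plan, one row per line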
Change-Id: I622310f01d1b3d53ea1031adaf3b3ffdd94eba30
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1211
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
The FE was creating class loaders with the HDFS locations of Hive UDF
libs, rather than the local locations created by the BE. Our tests
still passed since we only used UDFs already on the classpath
(e.g. Hive builtins).
Change-Id: Idbe9c98ad6adb84b70cb44efbf9ad0afc53366ca
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1081
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
Unfortunately, the BE does not have the codegen path to execute UDAs.
This puts some restrictions on the UDAs we can run.
- No IR UDAs
- No varargs
- Must have 8 arguments or fewer.
Most of the code needed to add the codegen path already exists for UDFs, but
I'm not sure I'll get to it.
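A sketch of a UDA that fits within these restrictions (a hand-written COUNT
over ints; the names are illustrative and the value types follow udf.h):

#include "udf/udf.h"
using namespace impala_udf;

// Init: zero the intermediate value.
void CountInit(FunctionContext* ctx, BigIntVal* val) {
  val->is_null = false;
  val->val = 0;
}

// Update: called once per input row.
void CountUpdate(FunctionContext* ctx, const IntVal& input, BigIntVal* val) {
  if (!input.is_null) ++val->val;
}

// Merge: combines intermediate values from different nodes.
void CountMerge(FunctionContext* ctx, const BigIntVal& src, BigIntVal* dst) {
  dst->val += src.val;
}

// Finalize: produces the result from the intermediate value.
BigIntVal CountFinalize(FunctionContext* ctx, const BigIntVal& val) {
  return val;
}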
Change-Id: I8a06e635a9138397c8474a5704c3e588bb92347b
Reviewed-on: http://gerrit.ent.cloudera.com:8080/703
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
There's a bigger change to migrate the rest of the builtins, but I think this
is how builtins should be run when not cross-compiled. This mode is still
useful when developing a builtin.
When run as cross-compiled IR, we wouldn't do anything to distinguish between
a builtin and an external UDF.
Change-Id: I6aa336b22aa19b00507bad33c9df3978baa576cc
Reviewed-on: http://gerrit.ent.cloudera.com:8080/542
Tested-by: jenkins
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
I looked around some, and I think having create/drop/show [aggregate] function
seems reasonable and extends nicely to UDTs.
CREATE AGGREGATE FUNCTION can accept a lot of arguments. For the non-essential
ones, I went with resolving them by name rather than by position
(i.e. argName="value"), which I think is better for the user.
The grammar is:
CREATE AGGREGATE FUNCTION <name>(<arg_types>) RETURNS <type> [INTERMEDIATE <type>]
LOCATION '/path' UpdateFn='Fn' [comment='comment']
[SerializeFn='symbol'] [MergeFn='symbol'] [InitFn='symbol'] [FinalizeFn='symbol']
The optional args at the end can be in any order. If the other symbols are not
specified, we derive them from the required UpdateFn symbol. The analyzer
tries to resolve each derived symbol and fails if it cannot be found in the
binary.
The simplest example would be:
CREATE AGGREGATE FUNCTION count(float) RETURNS BIGINT LOCATION '/path'
UpdateFn='CountUpdateFn';
In that case we assume the intermediate type is the return type and the other
functions are called 'CountInitFn', 'CountSerializeFn', 'CountMergeFn', and
'CountFinalizeFn'.
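A fuller illustrative example spelling out the optional named arguments (the
path and symbols are hypothetical):

CREATE AGGREGATE FUNCTION my_count(int) RETURNS BIGINT INTERMEDIATE BIGINT
LOCATION '/udfs/libudas.so' UpdateFn='CountUpdate' comment='hand-rolled count'
InitFn='CountInit' MergeFn='CountMerge' FinalizeFn='CountFinalize';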
Change-Id: Iefc5741293050f5b295df28e9d1a7d039ead8675
Reviewed-on: http://gerrit.ent.cloudera.com:8080/513
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
This change adds Impala support for LOAD DATA statements, which let the user
load one or more files into a table or partition from a given HDFS location.
The load operation only moves files; it does not convert data to match the
target table/partition's file format.
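An illustrative statement (the path, table, and partition are hypothetical):

LOAD DATA INPATH '/staging/events' INTO TABLE events PARTITION (year=2013);
-- the files under /staging/events are moved, not converted, into the partition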
Hue is moving to HiveServer2, but HiveServer2 does not have an "explain" RPC
call. To support "explain", I added it to the query language.
An "explain" statement returns a result set: one row per explain line.
- made the coordinator asynchronous
- renamed ImpalaBackendService to ImpalaInternalService
- new class ImpalaServer implements ImpalaService and ImpalaInternalService
- renamed ImpalaInternalService fields to conform to C++ style
- merged impala-service.{cc,h} and backend-service.{cc,h} into impala-server.{cc,h}
- added a TStatusCode field to Status.ErrorDetail
- removed ImpalaInternalService.CloseChannel
- also removed JdbcDriverTest.java
- added flag --backends="host:port,host:port,...", which TestEnv uses to create
clients for the ImpalaBackendServices running on those nodes; this is just a
hack to make runquery usable for multi-node execution
- impalad-main.cc: main() of the Impala daemon, which will export both
ImpalaService and ImpalaBackendService (but at the moment only does the
latter; everything related to ImpalaService is commented out)
- com.cloudera.impala.service.Frontend: API to the frontend functionality,
invoked by impalad via JNI; ignore for now