Before this, we had to specify the entire mangled symbol. These symbols can be
quite long and tedious to write out (take a look at some of the create UDA test
cases that specify all the symbols).
This patch adds some code to convert from the user function signature to the
mangled name. This means the user can specify the unmangled name and we can
do the symbol lookup. The mangling rules are pretty convoluted, but if the
derivation gets something wrong, the user can always specify the full symbol.
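For a flavor of what the conversion produces (the snippet and the mangled
string below are illustrative, not lifted from the patch):

    // Hypothetical UDF, registered by the user as "AddUdf(INT, INT)".
    // Under the Itanium C++ ABI (what gcc uses), this signature mangles to
    // roughly:
    //   _Z6AddUdfPN10impala_udf15FunctionContextERKNS_6IntValES4_
    // which is the kind of symbol the new code derives for the lookup.
    impala_udf::IntVal AddUdf(impala_udf::FunctionContext* ctx,
                              const impala_udf::IntVal& a,
                              const impala_udf::IntVal& b);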
Some other minor cleanup in:
- JNI from FE to BE
- UDFs/UDAs that are loaded as test data
Change-Id: I733dbf3a72cb7b06221c27e622d161bcca0d74a8
Reviewed-on: http://gerrit.ent.cloudera.com:8080/624
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
This patch redoes how the aggregation node is implemented. The functionality is
now split between aggregation-node, agg-expr and aggregate-functions. This is a
work in progress (there's still a lot of debug stuff I added that needs to be
cleaned up), but it does pass the tests.
Aggregation-node is now very simple and only deals with the grouping part.
Aggregate-expr serves as the glue between the agg node and the aggregate functions.
The aggregation functions are implemented with the UDA interface. I've reimplemented
our existing aggregate functions with this setup. For true UDAs, the binaries would be
loaded in aggregate-expr.
This also includes some preliminary changes in the FE. We now need to annotate each
AggNode with whether it executes the update or the merge phase (root aggs execute
update, others execute merge) and whether it needs a finalize step (only the root
does). This is more general than our builtins need; they are simple enough not to
require this structure.
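As a concrete illustration of the phase split, a count-like UDA under this
interface might look like the following sketch (header path per the UDF SDK;
error and context handling elided):

    #include "udf/udf.h"
    using namespace impala_udf;

    void CountInit(FunctionContext* ctx, BigIntVal* val) {
      val->is_null = false;
      val->val = 0;
    }
    // Update phase: runs over raw input rows.
    void CountUpdate(FunctionContext* ctx, const IntVal& input, BigIntVal* val) {
      if (!input.is_null) ++val->val;
    }
    // Merge phase: combines partial counts from other agg nodes.
    void CountMerge(FunctionContext* ctx, const BigIntVal& src, BigIntVal* dst) {
      dst->val += src.val;
    }
    // Finalize step: produces the result; only run where finalize is annotated.
    BigIntVal CountFinalize(FunctionContext* ctx, const BigIntVal& val) {
      return val;
    }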
There is a big TODO here to allow the intermediate types between agg nodes to change.
For example, in distinct estimate, the input type is the column type and the output
type is a bigint; we'd like the intermediate type to be CHAR(256). This is a departure
from the current model, where the intermediate type and output type have always been
the same. We've hacked around this by having both the intermediate and output type be
TYPE_STRING. I've left this for another patch (changing the BE to support this is
trivial).
For aggregates that result in strings, we used to store some additional state past the
end of the tuple. The layout was:
<tuple> <length of 1st string buffer>,<length of 2nd string buffer>, etc.
The rationale for this is that we want to reuse the buffer for min/max and grow the
buffer more quickly for group_concat. This breaks the abstraction between agg-expr and
agg-node and is not something UDAs can use in general. Rather than try to hack around
this, I think the proper solution is for the intermediate type not to be StringValue
but to carry the buffer length itself.
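A sketch of that direction (hypothetical layout, not part of this patch):

    #include <cstdint>

    // The intermediate value carries its own capacity, so the agg node needs
    // no side storage past the end of the tuple.
    struct StringIntermediate {
      uint8_t* ptr;     // buffer
      int32_t len;      // logical length of the current value
      int32_t buf_len;  // allocated capacity, owned by the value itself
    };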
This patch also resurrects the distinct estimate code. The distinct estimate functions
exercise all of the code paths.
Change-Id: Ic152a2cd03bc1713967673681e1e6204dcd80346
Reviewed-on: http://gerrit.ent.cloudera.com:8080/564
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
The Impala CatalogService manages the caching and dissemination of cluster-wide metadata.
The CatalogService combines the metadata from the Hive Metastore, the NameNode,
and potentially additional sources in the future. The CatalogService uses the
StateStore to broadcast metadata updates across the cluster.
The CatalogService also directly handles metadata update requests from
impalad servers (DDL requests). It exposes a Thrift interface that allows impalads to
connect directly and execute their DDL operations.
The CatalogService has two main components. The first is a C++ server that implements
the StateStore integration, the Thrift service implementation, and the exporting of
the debug webpage/metrics. The second is the Java Catalog that manages the caching and
updating of all the metadata. For each StateStore heartbeat, a delta of all metadata
updates is broadcast to the rest of the cluster.
Some Notes On the Changes
---
* The metadata is all sent as Thrift structs. To do this, all catalog objects
(Tables/Views, Databases, UDFs) have a Thrift struct to represent them. These are sent
with each statestore delta update.
* The existing Catalog class has been separated into two separate subclasses: an
ImpaladCatalog and a CatalogServiceCatalog. See the comments on those classes for more
details.
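Roughly, each heartbeat's delta carries versioned catalog objects; in C++ terms the
shape is something like this (illustrative only; the real definitions are
Thrift-generated):

    #include <cstdint>
    #include <vector>

    struct TCatalogObject {
      int32_t type;             // TABLE, VIEW, DATABASE, FUNCTION, ...
      int64_t catalog_version;  // version at which this object changed
      // plus a union-style payload holding the object's metadata
    };

    struct TCatalogUpdate {
      std::vector<TCatalogObject> updated_objects;
      std::vector<TCatalogObject> deleted_objects;
    };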
What is working:
* New CatalogService created
* Working with statestore delta updates and latest UDF changes
* DDL performed on Node 1 is now visible on all other nodes without a "refresh".
* Each DDL operation against the Catalog Service returns the catalog version that
contains the change. An impalad will wait for the statestore heartbeat that contains
this version before returning from the DDL command (see the sketch after this list).
* All table types (HBase, HDFS, Views) getting their metadata propagated properly
* Block location information included in CS updates and used by Impalads
* Column and table stats included in CS updates and used by Impalads
* Query tests are all passing
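The DDL wait mentioned above amounts to blocking until the local catalog catches up
to the version the Catalog Service returned; a minimal sketch (names hypothetical):

    #include <condition_variable>
    #include <cstdint>
    #include <mutex>

    struct CatalogVersionWaiter {
      std::mutex lock;
      std::condition_variable cv;
      int64_t current_version = 0;

      // Called from the statestore-update path after applying a delta.
      void UpdateVersion(int64_t v) {
        { std::lock_guard<std::mutex> l(lock); current_version = v; }
        cv.notify_all();
      }

      // Called from the DDL path: block until 'min_version' has been applied.
      void WaitForVersion(int64_t min_version) {
        std::unique_lock<std::mutex> l(lock);
        cv.wait(l, [&] { return current_version >= min_version; });
      }
    };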
Still TODO:
* Directly return catalog object metadata from DDL requests
* Poll the Hive Metastore to detect new/dropped/modified tables
* Reorganize the FE code for the Catalog Service. I don't think we want everything in the
same JAR.
Change-Id: I8c61296dac28fb98bcfdc17361f4f141d3977eda
Reviewed-on: http://gerrit.ent.cloudera.com:8080/601
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
This patch also adds a number of improvements to NativeUdfExpr. Highlights include:
* Correctly handling the lowering of AnyVal struct types (required for ABI compatibility)
* A rudimentary library cache for reusing handles produced by dlopen (sketched below)
* More complicated test cases
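In its simplest form, the library cache just memoizes dlopen() by path; a minimal
sketch (not the actual implementation -- eviction and refcounting are elided):

    #include <dlfcn.h>
    #include <map>
    #include <string>

    class LibCache {
     public:
      // Returns a dlopen() handle for 'path', reusing a cached one if present.
      void* GetHandle(const std::string& path) {
        auto it = handles_.find(path);
        if (it != handles_.end()) return it->second;
        void* handle = dlopen(path.c_str(), RTLD_NOW);
        if (handle != nullptr) handles_[path] = handle;  // nullptr => load failed
        return handle;
      }

     private:
      std::map<std::string, void*> handles_;
    };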
Change-Id: Iab9acdd7d7c4308e5d7ee3210f21b033fda5a195
Reviewed-on: http://gerrit.ent.cloudera.com:8080/540
Tested-by: jenkins
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
I looked around some, and I think having create/drop/show [aggregate] function
seems reasonable and extends nicely to UDTs.
The create aggregate function statement can accept a lot of arguments. For the
non-essential ones, I went with resolving them by name rather than by position
(i.e. argName="value"). I think this is better for the user than specifying them
positionally.
The grammar is:
CREATE AGGREGATE FUNCTION <name>(<arg_types>) RETURNS <type> [INTERMEDIATE <type>]
LOCATION '/path' UpdateFn='Fn' [comment='comment']
[SerializeFn='symbol'] [MergeFn='symbol'] [InitFn='symbol'] [FinalizeFn='symbol']
The optional args at the end can be in any order. If the other symbols are not
specified, we derive them from the UpdateFn symbol, which is required. The analyzer
tries to figure them out and fails if a derived symbol can't be found in the binary.
The simplest example would be:
CREATE AGGREGATE FUNCTION count(float) RETURNS BIGINT LOCATION '/path'
UpdateFn='CountUpdateFn';
In this case, we assume the intermediate type is the return type and that the other
functions are called 'CountInitFn', 'CountSerializeFn', 'CountMergeFn', and
'CountFinalizeFn'.
Change-Id: Iefc5741293050f5b295df28e9d1a7d039ead8675
Reviewed-on: http://gerrit.ent.cloudera.com:8080/513
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
Doing it this way makes sure we don't bail out early on the Close() path, which is
rarely the right thing to do. This change surfaced a few places where we were not
doing proper cleanup for exactly that reason.
Change-Id: Ie663c68398c14589b5cbc1bd980644b0b10fd865
Reviewed-on: http://gerrit.ent.cloudera.com:8080/373
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
Changes MemLimit to MemTracker:
- the limit is optional
- it also records a label and an optional parent
- Consume() and Release() update the ancestors too, and there's a new
AnyLimitExceeded(), which checks the ancestors' limits as well
- the consumption counter is a HighwaterMarkCounter and can optionally be created
as part of a profile
Each fragment instance now has a MemTracker that is part of a 3-level
hierarchy: process, query, fragment instance.
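In sketch form (illustrative; the highwater-mark counter and profile integration
are elided), the hierarchy works like this:

    #include <cstdint>
    #include <string>
    #include <utility>

    class MemTracker {
     public:
      MemTracker(int64_t limit, std::string label, MemTracker* parent)
          : limit_(limit), label_(std::move(label)), parent_(parent) {}

      // Consumption propagates to every ancestor.
      void Consume(int64_t bytes) {
        for (MemTracker* t = this; t != nullptr; t = t->parent_) {
          t->consumption_ += bytes;
        }
      }
      void Release(int64_t bytes) { Consume(-bytes); }

      // True if this tracker or any ancestor has exceeded its (optional) limit.
      bool AnyLimitExceeded() const {
        for (const MemTracker* t = this; t != nullptr; t = t->parent_) {
          if (t->limit_ >= 0 && t->consumption_ > t->limit_) return true;
        }
        return false;
      }

     private:
      int64_t limit_;        // negative means "no limit"
      std::string label_;
      MemTracker* parent_;   // nullptr at the process root
      int64_t consumption_ = 0;
    };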
Change-Id: I5f580f4956fdf07d70bd9a6531032439aaf0fd07
Reviewed-on: http://gerrit.ent.cloudera.com:8080/339
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
This is an experimental implementation of external sorting. This patch includes the following additions:
(1) creation and implementation of the Sorter interface, which can sort Impala Tuples,
(2) normalization of Tuples to allow memcmp-able sorting (sketched below),
(3) a testing framework for the Sorter,
(4) a benchmark to compare the current state of the Sorter with other sorts,
(5) an implementation of a Vector which can store data whose size is only known at runtime,
(6) a sorting algorithm (basically a dumbed-down STL sort) which can operate over such a vector,
(7) implementation of a simple in-memory Merger, and
(8) logic to stream blocks of memory in and out of memory for the actual external merging.
I have a local branch for experimental optimizations and benchmarking -- this should be
considered a "basic", working sort.
The following optimizations have been implemented:
(i) Optionally extracting keys instead of writing them in place.
(ii) Optionally and opportunistically parallelizing run building (sorting & preparing
for output).
(iii) Maximizing disk IO and minimizing buffer recycling by writing buffers out while
also keeping them in memory until right when they're needed.
(iv) Preparing auxiliary data backwards so the buffers can be released as we go and
still go out in an order which preserves the first buffers of the run.
(v) Always merging the maximum number of runs at a time, taking from the next merge
level if available.
Change-Id: I1d7304d54d73152da929b1efffc1e851e5fb8fd4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/126
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Aaron Davidson <aaron.davidson@cloudera.com>
Implements a group_concat() function, which concatenates all the values in a group.
The format is group_concat(str_col [, separator]). The default separator is ', '.
NULLs are ignored.
Change-Id: If152df6f528401117dba81d66ef691bfb548cc7d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/117
Reviewed-by: Aaron Davidson <aaron.davidson@cloudera.com>
Tested-by: Aaron Davidson <aaron.davidson@cloudera.com>
This adds support for CREATE TABLE AS SELECT to Impala. It supports all the
functionality a regular CREATE TABLE statement includes, except it does not allow
specifying partition columns. Hive also has this limitation, and it wouldn't be too
hard to support in the future.
Change-Id: I4ca3c3b8f1576441b8bb5ed9dc521d7dfa96ab74
Reviewed-on: http://gerrit.ent.cloudera.com:8080/157
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
Split out the encoder/type handling for the Parquet reader/writer. I think this puts
us in a better place to support future encodings.
On the TPC-H lineitem table, the results are:
Before:
BytesWritten: 236.45 MB
Per Column Sizes:
l_comment: 75.71 MB
l_commitdate: 8.64 MB
l_discount: 11.19 MB
l_extendedprice: 33.02 MB
l_linenumber: 4.56 MB
l_linestatus: 869.98 KB
l_orderkey: 8.99 MB
l_partkey: 27.02 MB
l_quantity: 11.58 MB
l_receiptdate: 8.65 MB
l_returnflag: 1.40 MB
l_shipdate: 8.65 MB
l_shipinstruct: 1.45 MB
l_shipmode: 2.17 MB
l_suppkey: 21.91 MB
l_tax: 10.68 MB
After:
BytesWritten: 198.63 MB (84%)
Per Column Sizes:
l_comment: 75.71 MB (100%)
l_commitdate: 8.64 MB (100%)
l_discount: 2.89 MB (25.8%)
l_extendedprice: 33.13 MB (100.33%)
l_linenumber: 1.50 MB (32.89%)
l_linestatus: 870.26 KB (100.032%)
l_orderkey: 9.18 MB (102.11%)
l_partkey: 27.10 MB (100.29%)
l_quantity: 4.32 MB (37.31%)
l_receiptdate: 8.65 MB (100%)
l_returnflag: 1.40 MB (100%)
l_shipdate: 8.65 MB (100%)
l_shipinstruct: 1.45 MB (100%)
l_shipmode: 2.17 MB (100%)
l_suppkey: 10.11 MB (46.14%)
l_tax: 2.89 MB (27.06%)
The table is overall 84% as big (i.e. 16% smaller). A few columns got marginally
bigger. If the file filled the full 1 GB, I'd expect the overhead to shrink even
more.
The restructuring to use a virtual call doesn't seem to change things much and
will go away when we codegen the scanner.
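For reference, the core of the dictionary-encoding idea in sketch form
(illustrative; not Impala's actual encoder):

    #include <cstdint>
    #include <map>
    #include <string>
    #include <vector>

    // Each distinct value gets a small integer code; the data pages store
    // codes, and the dictionary itself is written once per column chunk.
    class DictEncoder {
     public:
      // Returns the code for 'value', adding it to the dictionary if new.
      int32_t Put(const std::string& value) {
        auto it = codes_.find(value);
        if (it != codes_.end()) return it->second;
        int32_t code = static_cast<int32_t>(dict_.size());
        codes_[value] = code;
        dict_.push_back(value);
        return code;
      }

     private:
      std::map<std::string, int32_t> codes_;
      std::vector<std::string> dict_;
    };

This is why the low-cardinality columns (l_discount, l_quantity, l_tax) shrink
dramatically while high-entropy ones (l_comment) don't.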
Here's what the query times look like with this patch (note this is on the before
data files, so only string cols are dictionary encoded).
Before query times:
Insert Time: 8.5 sec
select *: 2.3 sec
select avg(l_orderkey): 0.33 sec
After query times:
Insert Time: 9.5 sec <-- longer due to doing dictionary encoding
select *: 2.4 sec <-- kind of noisy, possibly a slight slowdown
select avg(l_orderkey): 0.33 sec
Change-Id: I213fdca1bb972cc200dc0cd9fb14b77a8d36d9e6
Reviewed-on: http://gerrit.ent.cloudera.com:8080/238
Tested-by: jenkins <kitchen-build@cloudera.com>
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
This change adds Impala support for LOAD DATA statements. This allows the user
to load one or more files into a table or partition from a given HDFS location. The
load operation only moves files; it does not convert data to match the target
table/partition's file format.