Adds basic support for catalogd to our ImpalaCluster test library/object model.
This will allow us to write more programmatic tests targeting the catalogd process,
including process failure tests and metric check validators.
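As a rough illustration of the kind of test this enables, here is a hedged Python sketch; the ImpalaCluster import path, the catalogd accessor, and the metric name are assumptions, not the actual API:

    # Hypothetical sketch only: restart catalogd and validate a metric.
    from tests.common.impala_cluster import ImpalaCluster  # assumed module path

    def test_catalogd_restart_and_metric():
        cluster = ImpalaCluster()
        catalogd = cluster.catalogd              # assumed accessor for the catalogd process
        catalogd.kill()                          # process failure scenario
        catalogd.start()
        # Assumed metric name; a real test would poll a documented catalogd metric.
        assert catalogd.service.get_metric_value('catalog.num-databases') > 0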
Change-Id: I8e5f7bc73f999f105437c6d3d52c6d436a354d2d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/617
Tested-by: jenkins
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
Using an external Hive Metastore Service for local test runs has a number of benefits.
Among other things, it separates the metastore logs from the Impala logs and is more
representative of real cluster environments.
It may also help with some of the concurrency issues we have been seeing when
running directly against the backend database, since we no longer spin up an in-process
metastore server for each client connection.
The metastore is started by running "run-hive-server.sh" which is invoked as part of
"run-all.sh".
Change-Id: If60fa97aa38e4ad5cf578b9b409eeea1e0e29375
Reviewed-on: http://gerrit.ent.cloudera.com:8080/628
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
The Hive HBase spec allows the key column mapping to either be defined explicitly
(using the :key syntax) or left out completely, in which case a mapping to the
first table column is implied. This change updates Impala to support implicit key
mappings and also adds checks to our ALTER TABLE DDL to ensure we cannot get into
this state by dropping a column from an HBase table (a restriction similar to the
one Hive puts in place).
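For illustration, a small Python sketch of the implied-key rule described above (a hypothetical helper, not the actual FE analysis code):

    def hbase_key_column(columns, mapping_entries):
        """Pick the table column that acts as the HBase row key.

        columns: table column names in declaration order.
        mapping_entries: entries parsed from hbase.columns.mapping.
        """
        if ':key' in mapping_entries:
            # Explicit mapping: the column in the same position as ':key' is the key.
            return columns[mapping_entries.index(':key')]
        # Implicit mapping: the first table column is implied to be the key.
        return columns[0]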
Change-Id: I920d642261659ee3e881da2553ffe83300923af8
Reviewed-on: http://gerrit.ent.cloudera.com:8080/554
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
This patch redoes how the aggregation node is implemented. The functionality is
now split between aggregation-node, agg-expr and aggregate-functions. This is a work in
progress (there's still a lot of debug code I added that needs to be cleaned up), but
it does pass the tests.
Aggregation-node is now very simple and only deals with the grouping part.
Aggregate-expr serves as the glue between the agg node and the aggregate functions.
The aggregation functions are implemented with the UDA interface. I've reimplemented
our existing aggregate functions with this setup. For true UDAs, the binaries would be
loaded in aggregate-expr.
This also includes some preliminary changes in the FE. We now need to annotate each
AggNode as executing the update vs. merge phase (root aggs execute update, others
execute merge) and whether it needs a finalize step (only the root does). This is more
general than our builtins, which are too simple to need this structure.
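As a rough illustration of the update/merge/finalize split (a generic sketch, not Impala's actual UDA interface):

    class AvgUda(object):
        """Toy average aggregate showing the UDA-style phases."""

        def init(self):
            return (0.0, 0)               # intermediate state: (sum, count)

        def update(self, state, value):
            s, c = state                  # consume raw input rows
            return (s + value, c + 1)

        def merge(self, state, other):
            s1, c1 = state                # combine partial states from other nodes
            s2, c2 = other
            return (s1 + s2, c1 + c2)

        def finalize(self, state):
            s, c = state                  # produce the final output value once
            return s / c if c else None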
There is a big TODO here to allow the intermediate types between agg nodes to change.
For example, in distinct estimate, the input type is the column type and the output type
is a bigint. We'd like the intermediate type to be CHAR(256). This differs from the
current behavior, where the intermediate type and output type have always been the same.
We've hacked around this by having both the intermediate and output type be TYPE_STRING. I've left
this for another patch (changing the BE to support this is trivial).
For aggregates that result in strings, we used to store some additional stuff past the
end of the tuple. The layout was:
<tuple> <length of 1st string buffer>,<length of 2nd string buffer>, etc
The rationale for this is that we want to reuse the buffer for min/max and grow the buffer
more quickly for group_concat. This breaks down the abstraction between agg-expr and
agg-node and is not something UDAs can use in general. Rather than try to hack around
this, I think the proper solution is for the intermediate type to not be StringValue
and to contain the buffer length itself.
This patch also resurrects the distinct estimate code. The distinct estimate functions
exercise all of the code paths.
Change-Id: Ic152a2cd03bc1713967673681e1e6204dcd80346
Reviewed-on: http://gerrit.ent.cloudera.com:8080/564
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
When codegen is enabled, UDF builtins will be loaded from the IR
module rather than using the native functions. Since we cannot yet run
UDFs without codegen, UDF builtins can only be run this way for now, but
once we add support for running UDFs without codegen this will allow us
to switch back to the native functions for development/debugging.
Change-Id: I948b113c61603801b84f80982384bbc07596f119
Reviewed-on: http://gerrit.ent.cloudera.com:8080/605
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
There is a Hive Metastore concurrency bug (HIVE-5457) which causes concurrent
calls to getTable() to sometimes fail with DataNucleus exceptions. This
causes catalogd to fail to load ALL metadata for all tables. The fix is to
serialize our calls to getTable(). Additionally, tweaked the logging a bit and
improved start-impala-cluster to do a better job of reporting the status of catalog
initialization. It's too bad we have to serialize these calls, but we seem to be able
to run everything else in parallel with no problems (get col stats, block md, etc).
Also added a couple of changes in our hive-site to match the defaults for our cluster
metastore deployments.
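A minimal sketch of the workaround, assuming a hypothetical metastore client wrapper (the names below are illustrative):

    import threading

    class SerializedMetastoreClient(object):
        """Illustrative only: serialize getTable() while leaving other calls parallel."""

        def __init__(self, client):
            self._client = client
            self._get_table_lock = threading.Lock()  # shared across all loading threads

        def get_table(self, db, tbl):
            # Work around HIVE-5457 by allowing only one getTable() call at a time.
            with self._get_table_lock:
                return self._client.get_table(db, tbl)

        def get_column_stats(self, db, tbl, cols):
            # Other metadata calls remain unserialized and can run in parallel.
            return self._client.get_column_stats(db, tbl, cols)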
Change-Id: Ic70e2a9b8190a56510e430d8da3942dca252eb4c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/609
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
The Impala CatalogService manages the caching and dissemination of cluster-wide metadata.
The CatalogService combines the metadata from the Hive Metastore, the NameNode,
and potentially additional sources in the future. The CatalogService uses the
StateStore to broadcast metadata updates across the cluster.
The CatalogService also directly handles metadata update requests from
impalad servers (DDL requests). It exposes a Thrift interface that allows impalads to
connect directly and execute their DDL operations.
The CatalogService has two main components. The first is a C++ server that implements
StateStore integration, the Thrift service implementation, and the export of the debug
webpage/metrics. The second is the Java Catalog, which manages caching and updating of all
the metadata. For each StateStore heartbeat, a delta of all metadata updates is broadcast
to the rest of the cluster.
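A sketch of the delta idea, assuming catalog objects carry a monotonically increasing catalog version (illustrative only, not the actual implementation):

    def compute_delta(catalog_objects, last_sent_version):
        """Return the objects modified since the previous heartbeat."""
        delta = [o for o in catalog_objects if o.catalog_version > last_sent_version]
        new_version = max((o.catalog_version for o in delta), default=last_sent_version)
        return delta, new_version

    # On each StateStore heartbeat only `delta` is shipped; subscribers apply it
    # on top of the catalog snapshot they already hold.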
Some Notes On the Changes
---
* The metadata is all sent as thrift structs. To do this, all catalog objects (Tables/Views,
Databases, UDFs) have a thrift struct to represent them. These are sent with each statestore
delta update.
* The existing Catalog class has been separated into two separate subclasses: an
ImpaladCatalog and a CatalogServiceCatalog. See the comments on those classes for more
details.
What is working:
* New CatalogService created
* Working with statestore delta updates and latest UDF changes
* DDL performed on Node 1 is now visible on all other nodes without a "refresh".
* Each DDL operation against the Catalog Service will return the catalog version that
contains the change. An impalad will wait for the statestore heartbeat that contains this
version before returning from the DDL command.
* All table types (HBase, HDFS, Views) getting their metadata propagated properly
* Block location information included in CS updates and used by Impalads
* Column and table stats included in CS updates and used by Impalads
* Query tests are all passing
Still TODO:
* Directly return catalog object metadata from DDL requests
* Poll the Hive Metastore to detect new/dropped/modified tables
* Reorganize the FE code for the Catalog Service. I don't think we want everything in the
same JAR.
Change-Id: I8c61296dac28fb98bcfdc17361f4f141d3977eda
Reviewed-on: http://gerrit.ent.cloudera.com:8080/601
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
The issue, from looking at the core dump, is that tcmalloc crashes trying to
walk the call stack. This is why it is happening on the heap checker build,
which stores the stack of many calls. I can see from GDB that there is
something goofy with the stack frames. Tcmalloc is able to identify 4 frames
(same values as GDB) and then crashes on the next one, where GDB shows ???. GDB
can continue to show the rest of the stack, so I don't think this is stack
corruption. The frames are in libstdc++ and I believe the issue is a
compiler optimization that omits stack frame pointers. Tcmalloc's stack walking
is not tolerant of this. Debuggers have lots of logic to look around where the
stack should have started and recover.
We have a few options to handle this. Here's one proposed solution. I think we
can also consider setting NO_TCMALLOC_SAMPLE, which will cause it to never collect
stacks. We can also try different options for different build types.
Change-Id: I98dfb5bccd5fe485ac50b56c6f0fe3f3ded9ff76
Reviewed-on: http://gerrit.ent.cloudera.com:8080/600
Tested-by: jenkins
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
This reverts commit b375f3d7f4961def4ef930273420d447f99d093f.
Reverting for now to fix build.
Change-Id: Icaa790d44ab47825f855d8a123cad3130948934a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/586
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
execute_using_jdbc used to expect a query string. Its interface was recently changed to
accept a query object. Additionally, this change updates the interface of the Query()
class so that it can accept raw (qualified) query strings.
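A hedged sketch of what the updated interfaces might look like; the fields and helper names are assumptions, not the actual test library:

    class Query(object):
        """Illustrative only: a query object that can also wrap a raw (qualified) query string."""

        def __init__(self, name=None, query_str=None, db=None, table_format=None):
            self.name = name
            self.query_str = query_str      # raw (qualified) query string
            self.db = db
            self.table_format = table_format

    def execute_using_jdbc(query, jdbc_client):
        # The runner now receives a Query object rather than a bare string.
        return jdbc_client.execute(query.query_str)

    # Callers that only have a raw string can still wrap it:
    # execute_using_jdbc(Query(query_str='select 1'), client)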
Change-Id: I44693cd2cccf1041cab32a9821fb76b12d148375
Reviewed-on: http://gerrit.ent.cloudera.com:8080/577
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
Fixed cost estimation of union queries and exchange nodes.
Fixed propagation of stats through cloning of exprs and plan nodes.
Fixed propagation of expr stats to slots they are materialized into (e.g., grouping columns in multi-level aggs).
Improved explain output for constant selects.
Change-Id: I96d1652c00d48e4093b85ae7fc8bad28d74b8b81
Reviewed-on: http://gerrit.ent.cloudera.com:8080/547
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Alex Behm <alex.behm@cloudera.com>
At the moment, a query is the default unit of execution and parallelism in the Impala
performance suite. With this change, we now have the ability to treat a workload as the
unit of execution. A workload is defined as a unique combination of the dataset, scale
factor, a subset (or all) of the queries in the dataset, and a table format (file format,
compression codec and compression scheme).
It introduces two new command line options in bin/run-workload.py:
* --execution_scope
The default scope is 'query', and it maintains previous semantics. The
new scope is 'workload', which toggles the unit of execution to a workload.
* --shuffle_query_exec_order
Shuffles the order in which queries are executed (only applicable when the
execution_scope is 'workload'); defaults to False. A sketch of both scopes follows below.
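A rough sketch of how the two scopes differ (hypothetical helpers, not the actual run-workload.py code):

    import random

    def run_with_scope(workload, execution_scope, shuffle=False, iterations=3):
        """'query' scope iterates each query in place; 'workload' scope treats
        the whole query set as the unit of execution."""
        queries = list(workload.queries)
        if execution_scope == 'query':
            for query in queries:
                for _ in range(iterations):
                    workload.execute(query)
        else:  # 'workload' scope
            for _ in range(iterations):
                if shuffle:
                    random.shuffle(queries)   # --shuffle_query_exec_order
                for query in queries:
                    workload.execute(query)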
Change-Id: I790d75f0896210cda8eb999015b0be04246e4c45
Reviewed-on: http://gerrit.ent.cloudera.com:8080/503
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
There's a bigger change to migrate the rest of them, but I think this is how
the builtins should be run when not running as cross-compiled IR. This mode
is still useful when developing a builtin.
When run as cross-compiled IR, we wouldn't do anything to distinguish between
a builtin and an external UDF.
Change-Id: I6aa336b22aa19b00507bad33c9df3978baa576cc
Reviewed-on: http://gerrit.ent.cloudera.com:8080/542
Tested-by: jenkins
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
This change splits the log4j/glog forwarding code out from libfesupport
into its own shared library - libloggingsupport.so. This allows the log
forwarding to be used in other places than the impalad FE, such as the
CatalogService.
Change-Id: I669e5b913b913488b4b7d5b7ed4b8be271850c6e
Reviewed-on: http://gerrit.ent.cloudera.com:8080/559
Tested-by: jenkins
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
It's going to be pretty suboptimal to implement UDAs without CHAR(N)
support, so I implemented the bare minimum to support it. We need to change
all uses of PrimitiveType in the future.
This is a bit hard to test right now since we currently don't expose this in the
language except for UDAs. I can combine this with the UDA patch, but that patch is
pretty big.
Change-Id: I799dd2c905b41194e92cc01728727546294b0a02
Reviewed-on: http://gerrit.ent.cloudera.com:8080/562
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
This patch also adds a number of improvements to NativeUdfExpr. Highlights include:
* Correctly handling the lowering of AnyVal struct types (required for ABI compatibility)
* A rudimentary library cache for reusing handles produced by dlopen
* More complicated test cases
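As a rough illustration of the library cache mentioned above, a Python sketch that uses ctypes as a stand-in for dlopen (illustrative only, not the actual C++ code):

    import ctypes

    class LibCache(object):
        """Illustrative only: hand back one shared handle per library path."""

        def __init__(self):
            self._handles = {}

        def get_handle(self, lib_path):
            # Loading a shared object is relatively expensive, so cache the handle
            # and reuse it for every UDF that lives in the same library.
            if lib_path not in self._handles:
                self._handles[lib_path] = ctypes.CDLL(lib_path)
            return self._handles[lib_path]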
Change-Id: Iab9acdd7d7c4308e5d7ee3210f21b033fda5a195
Reviewed-on: http://gerrit.ent.cloudera.com:8080/540
Tested-by: jenkins
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
Don't warn or error when the shell's command history file does not exist or is uneditable.
Currently, the shell warns the user that it's unable to load the command history if the
command history file (~/.impalahistory) is not found. Moreover, if the file is not
editable, then an error is thrown after the execution of each command. This change
disables readline if the history file is not editable instead of throwing repeated
errors, and removes the warning if the history file does not exist.
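A minimal sketch of the new behavior (the os/readline usage here is illustrative, not the shell's exact code):

    import os
    import readline

    HISTORY_FILE = os.path.expanduser('~/.impalahistory')

    def load_history():
        if not os.path.exists(HISTORY_FILE):
            return True     # nothing to load; stay quiet instead of warning
        if not os.access(HISTORY_FILE, os.W_OK):
            return False    # uneditable: disable history instead of erroring per command
        readline.read_history_file(HISTORY_FILE)
        return True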
Change-Id: Ie4c94629431f2407b0679a7721a6bdf28907437f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/532
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
This change has the following additions:
- If the user is connecting to a kerberized impalad, the Impala shell will check
whether a valid ticket exists by running 'klist -s'. If a valid ticket is not found,
then the shell will exit with an appropriate error message on the commandline.
- If the user is connecting to a kerberized impalad without the '-k' option, the Impala
Shell will issue a 'klist -s' to check if there are valid kerberos tickets in the
credentials cache. If a valid ticket is found, it will retry the connection with
kerberos enabled.
- The Impala shell encodes strings entered on the commandline as unicode, but the sasl
module expects ascii strings as arguments. Explicitly encode any string sent to the
sasl module to ascii, as sketched below.
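A sketch of the two checks (illustrative; the shell's actual code may differ):

    import subprocess

    def has_valid_kerberos_ticket():
        # 'klist -s' exits 0 only when the credentials cache holds a valid ticket.
        return subprocess.call(['klist', '-s']) == 0

    def to_sasl_arg(value):
        # Command-line input arrives as unicode, but the sasl module expects ascii,
        # so encode explicitly before handing the value over.
        return value.encode('ascii', 'ignore')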
Change-Id: I1799b1e7988a19fa513b683afe1e3b66b68c1ffc
Reviewed-on: http://gerrit.ent.cloudera.com:8080/535
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: Ishaan Joshi <ishaan@cloudera.com>
Python modules on Red Hat systems might be in lib or in lib64, unlike Debian systems,
which symlink one to the other.
Change-Id: Ia1e2d362e3d7e13b87c70e7578644827a5234a91
Reviewed-on: http://gerrit.ent.cloudera.com:8080/544
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
This patch allows Impala to start either Beeswax or HS2 on an
SSL-secured port. SSL is a certificate-based authentication scheme,
where the server provides a certificate to the client as part of the
handshake process. The client verifies that certificate, either by
contacting a trusted third-party certificate authority (CA), or by
accepting a 'self-signed' certificate from the server that is also
provided to the client out-of-band; the client simply compares the two
certificate copies.
Once the certificate is verified, the client and server negotiate an
encryption key for the session, using a public key provided by the
server to encrypt that negotiation. Therefore the server has to have
access to a private key in order to decrypt the encryption key.
Both the certificate and the key are stored in industry-standard PEM
format. Impala uses the same certificate and key for both Beeswax and
HS2, and the files containing the certificate and key are provided via
--ssl_server_certificate and --ssl_private_key. If either is non-blank,
SSL is enabled for Beeswax and HS2.
The Python shell supports SSL as of this patch via new --ssl and
--ca_cert flags.
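A minimal client-side sketch of what the shell's --ssl/--ca_cert options imply, using Python's ssl module (illustrative, not the shell's implementation):

    import socket
    import ssl

    def connect_with_ssl(host, port, ca_cert_path):
        # Verify the server's certificate against the provided CA (or self-signed)
        # certificate, then negotiate an encrypted session over the wrapped socket.
        context = ssl.create_default_context(cafile=ca_cert_path)
        raw_sock = socket.create_connection((host, port))
        return context.wrap_socket(raw_sock, server_hostname=host)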
Finally, this patch also adds support for Impala's ThriftClients to use
SSL, paving the way for having the backend service use encryption on the
wire as well (although such a configuration is not used by this
patch). The client SSL support is currently used only for the new test
case.
This patch does not enable 'mutual' authentication, where clients
provide certificates to the server in order to authenticate
themselves. Impala has other authentication mechanisms for that purpose.
Change-Id: I3942aa0d21b34b7cda748292f04a9523f35ee6d4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/514
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
We have ~60 files in Util, which is a bit unwieldy. The Thrift / RPC code
is some of the easiest to move out, and doesn't really belong in a
'Util' library.
Change-Id: I7a188ab69459b019a643b192d51879bc8ead88a7
Reviewed-on: http://gerrit.ent.cloudera.com:8080/528
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>