impala

mirror of https://github.com/apache/impala.git synced 2025-12-22 03:18:15 -05:00

Author	SHA1	Message	Date
Skye Wanderman-Milne	09018a3756	Nested types: initialize repetition level decoders in HdfsParquetScanner The decoders aren't used yet, but will be when we materialize arrays. In addition, setting up the repetition level decoder makes sure the definition level decoder is initialized to the right place in the data buffer. Change-Id: Ic85ae812b10c747b36d884794d8dcf5976dfe74f Reviewed-on: http://gerrit.cloudera.org:8080/405 Reviewed-by: Casey Ching <casey@cloudera.com> Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2015-05-22 05:19:25 +00:00
Skye Wanderman-Milne	7801aa499f	Use codegen to inject runtime constants in exprs This patch introduces the function GetConstant(), which is used by expr compute functions and UDFs to access query constants. There is a corresponding GetIrConstant() function that returns the IR versions of the same constants. Currently the only implemented constants are the expr's return type and argument types, but other constants can easily be added to these functions. Interpreted expr functions run normally, but cross-compiled functions can be passed to InlineConstants(), which looks for calls to GetConstant() and replaces them with the result of calling GetIrConstant(). I used this technique in the decimal functions that previously were not switching on the type at all. The performance of LeastGreatest() after this patch is the same as it was before it switched on the type. Change-Id: I8b55744551830d894318a7bab6b6f045fb8bed41 Reviewed-on: http://gerrit.cloudera.org:8080/352 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Internal Jenkins	2015-05-15 02:24:04 +00:00
Henry Robinson	8b4a748d5e	IMPALA-1726: Statestore to timeout hung RPCs If a Heartbeat() RPC appears hung, the statestore should abort that RPC so as not to hold on to a sender thread, and to trigger the failure detector to evict the hung node. We could just add a TCP timeout to the client cache used by the statestore, but doing so would mean that all RPCs were subject to the timeout, and UpdateState() typically takes much longer than Heartbeat() by design, so setting a reasonable timeout would be impossible. Instead, this patch adds a second client cache designed only for Heartbeat() RPCs, with an aggressive timeout of 3s by default (Heartbeat() usually takes ~1-2ms). A timeout for UpdateState() is also set to avoid thread starvation, but this is much less aggressive at 300s. This patch also adds ClientConnection::DoRpc(), which calls an RPC and handles various failure modes, including timeout. If DoRpc() returns an error, the statestore handles it in the usual way, including updating the failure detector if the failed RPC is Heartbeat(). Change-Id: I2f2462278e59581937c9c10910625d2724a11efa Reviewed-on: http://gerrit.cloudera.org:8080/206 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Internal Jenkins	2015-03-20 14:37:14 -07:00
ishaan	e02a38fa32	Use try/finally instead of the with context manager in generate_error_codes.py This will make the code compatible with Python 2.4. Change-Id: I39c23256907520183f5f7797097f2fb1ad0e5cfc	2015-03-01 09:21:14 -08:00
Martin Grund	b582cdc22b	IMPALA-1598: Adding Error Codes to Log Messages This patch introduces the concept of error codes for errors that are recorded in Impala and are going to be presented to the client. These error codes are used to aggregate and group incoming error / warning messages to reduce the noise in the shell and increase the usefulness of the messages. By splitting the message string from the implementation, it becomes possible to edit the string independently of the code and pave the way for internationalization. Error messages are defined as a combination of an enum value and a string. Both are defined in the Error.thrift file that is automatically generated using the script in common/thrift/generate_error_codes.py. The goal of the script is to have a central understandable repository of error messages. Adding new messages to this file will require rebuilding the thrift part. The proxy class ErrorMessage is responsible for representing an error and capturing the parameters that are used to format the error message string. When error messages are recorded they are recorded based on the following algorithm: - If an error message is of type GENERAL, do not aggregate this message and simply add it to the total number of messages - If an error message is of a specific type, record the first error message as a sample and for all other occurrences increment the count. - The coordinator will merge all error messages except the ones of type GENERAL and display a count. For example, in the case of the parquet file spanning multiple blocks the output will look like: Parquet files should not be split into multiple hdfs-blocks. file=hdfs://localhost:20500/fid.parq (1 of 321 similar) All messages are always logged to VLOG. In the coordinator error messages are merged across all backends to retain readability in the case of large clusters. The current version of this patch adds these new error codes to some of the most important error messages as a reference implementation. Change-Id: I1f1811631836d2dd6048035ad33f7194fb71d6b8 Reviewed-on: http://gerrit.cloudera.org:8080/39 Reviewed-by: Martin Grund <mgrund@cloudera.com> Tested-by: Internal Jenkins	2015-03-01 03:37:32 +00:00
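
The recording and merging rules described in the IMPALA-1598 message can be illustrated with a small Python sketch. This is not Impala's implementation (which lives in C++ and Thrift); the class ErrorLog, its methods, and the string error codes are hypothetical names chosen for illustration, and the "(1 of N similar)" suffix simply mirrors the sample output quoted in the commit message.

```python
from collections import OrderedDict

GENERAL = "GENERAL"  # assumed code name for messages that are never aggregated


class ErrorLog(object):
    """Hypothetical per-backend error log, not Impala's actual class."""

    def __init__(self):
        self.general = []             # every GENERAL message, in arrival order
        self.by_code = OrderedDict()  # code -> [first sample message, count]

    def record(self, code, message):
        if code == GENERAL:
            # GENERAL messages are kept individually and never grouped.
            self.general.append(message)
        elif code in self.by_code:
            # Later occurrences of a specific code only bump the count.
            self.by_code[code][1] += 1
        else:
            # The first occurrence of a specific code is kept as the sample.
            self.by_code[code] = [message, 1]

    def merge(self, other):
        """Coordinator-side merge of another backend's log into this one."""
        self.general.extend(other.general)
        for code, (sample, count) in other.by_code.items():
            if code in self.by_code:
                self.by_code[code][1] += count
            else:
                self.by_code[code] = [sample, count]

    def lines(self):
        """Render GENERAL messages as-is and one line per specific code."""
        out = list(self.general)
        for sample, count in self.by_code.values():
            if count > 1:
                out.append("%s (1 of %d similar)" % (sample, count))
            else:
                out.append(sample)
        return out
```

Merging one such log per backend keeps the coordinator's output at one line per specific error code regardless of cluster size, which is the readability property the commit message describes.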

5 Commits
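
For the generate_error_codes.py commit listed above, the change from a with block to try/finally can be sketched as follows. This is a minimal illustration, not the actual script contents; the function write_lines and its arguments are hypothetical.

```python
def write_lines(path, lines):
    """Write each string in `lines` to `path`, one per line."""
    # Same effect as `with open(path, "w") as f:`, spelled out so it also
    # runs on interpreters that predate the with statement.
    f = open(path, "w")
    try:
        for line in lines:
            f.write(line + "\n")
    finally:
        # Runs whether or not a write raised, so the file is always closed.
        f.close()
```

The try/finally form gives the same cleanup guarantee for the file handle as the context manager, at the cost of an explicit close() call.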