- created new class PlanFragment, which encapsulates everything having to do with a single
plan fragment, including its partition, output exprs, destination node, etc.
- created new class DataPartition
- added explicit classes for plan-fragment and plan-node ids, to avoid mixing them up, which is easy to do with plain ints
- added IdGenerator class
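The typed-id idea above can be sketched as follows. This is a hypothetical illustration (the tag structs and member names are assumptions, and the real classes live in the Java frontend; C++ is used here for consistency with the backend examples): a distinct wrapper type per id space turns an accidental swap of a fragment id and a node id into a compile error.

```cpp
#include <cassert>

// Hypothetical sketch of the typed-id pattern: one wrapper type per id space.
// Passing a PlanNodeId where a PlanFragmentId is expected no longer compiles,
// which plain ints cannot guarantee.
template <typename Tag>
class Id {
 public:
  explicit Id(int val) : val_(val) {}
  int AsInt() const { return val_; }
  bool operator==(const Id& other) const { return val_ == other.val_; }

 private:
  int val_;
};

struct PlanNodeIdTag {};
struct PlanFragmentIdTag {};
using PlanNodeId = Id<PlanNodeIdTag>;
using PlanFragmentId = Id<PlanFragmentIdTag>;

// Hands out sequential ids for a single id space.
template <typename IdType>
class IdGenerator {
 public:
  IdType GetNextId() { return IdType(next_id_++); }

 private:
  int next_id_ = 0;
};
```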
- moved PlanNode.ExplainPlanLevel to Types.thrift, so it can also be used for
PlanFragment.getExplainString()
- Changed planner interface to return scan ranges with a complete list of server locations,
instead of making a server assignment.
Also included a cleanup of AggregateInfo:
- the 2nd phase of a DISTINCT aggregation is now captured separately from a merge aggregation.
- moved analysis functionality into AggregateInfo
Removing broken test cases from workload functional-planner (they're being handled correctly in functional-newplanner).
- the executor takes a report callback, passed in by ImpalaServer::FragmentExecState
- the PlanFragmentExecutor invokes the profile-reporting callback in a background thread
- RuntimeProfile is now thread-safe and has a new RuntimeProfile::Update() method
Also included:
- a number of bug fixes related to async query cancellation and the propagation
of errors through PlanFragmentExecutor/Coordinator/ImpalaServer
- changed COUNTER_SCOPED_TIMER to SCOPED_TIMER
- derived counters: RuntimeProfile now lets you add counters that return a
value via a function call, which is useful for reporting something like normalized
ScanNode throughput; retrofitted to ScanNode and all subclasses
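The derived-counter mechanism can be sketched as below. This is a minimal illustration, not Impala's actual RuntimeProfile API (the class and member names here are assumptions): the counter stores a function instead of a value and evaluates it on every read, which is what makes computed metrics like normalized scan throughput possible.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical sketch of a derived counter: rather than holding a value, it
// evaluates a supplied function each time it is read, so it can report a
// computed quantity such as bytes_read / elapsed_seconds.
class DerivedCounter {
 public:
  explicit DerivedCounter(std::function<long()> fn) : fn_(std::move(fn)) {}
  long value() const { return fn_(); }

 private:
  std::function<long()> fn_;
};
```

A usage sketch: wire the counter to other counters that the scan node already updates, e.g. `DerivedCounter throughput([&] { return bytes_read.load() / elapsed_secs; });`, and the profile reads a fresh value each time it is printed.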
- changed the coordinator to make cancellation atomic with respect to recognizing
an error status for the overall query
- Removed InProcessQueryExecutor from data-stream-test.
Added aggregate throughput counters to coordinator:
- all throughput counters are grouped in a sub-profile "AggregateThroughput"
- each scan node gets its own counter
- the value is aggregated across all registered backends whose plan fragments
contain that node
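The cross-backend aggregation described above can be sketched as a simple sum over per-backend counter maps. This is a hypothetical illustration (the function and the map-based representation are assumptions, not the coordinator's real data structures): backends that don't run the node in any of their fragments simply contribute nothing.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical sketch: sum a scan node's throughput counter across all
// registered backends. Each backend reports a map from plan-node id to its
// local counter value; a backend whose fragments lack the node is skipped.
long AggregateThroughput(
    const std::vector<std::map<int, long>>& backend_counters, int node_id) {
  long total = 0;
  for (const auto& counters : backend_counters) {
    auto it = counters.find(node_id);
    if (it != counters.end()) total += it->second;  // node present on this backend
  }
  return total;
}
```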
"distinctpc" and "distinctpcsa".
We gathered statistics on an internal dataset (all columns) that is
part of our regression data: roughly 400 MB, ~100 columns of
int/bigint/string types.
On Hive, this took roughly 64s; on this Impala implementation, 35s.
Marking the functions in hash-util.h as inline (which we currently don't)
brings that down to 24-26s.
Change-Id: Ibcba3c9512b49e8b9eb0c2fec59dfd27f14f84c3
[Submitting on behalf of Marcel]
- fragment ids weren't assigned correctly (they need to be unique across all
nodes on which they're executing)
- some of the execution logic that I checked in yesterday was flawed
- added DataStreamMgr::Cancel(), which is used to propagate cancellation from the
coordinator to all (possibly blocked) ExchangeNodes
- all exec nodes now check for cancellation before they do anything that might block for a while
- fixed up logic related to async cancellation
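The "check for cancellation before anything that might block" rule above can be sketched as follows. This is a hypothetical illustration (the struct and function names are assumptions, not Impala's actual interfaces): each exec node reads a shared cancellation flag before entering a potentially long-blocking step, so a coordinator-initiated cancel can unwind the whole plan tree.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical sketch of per-node cancellation checks. The runtime state is
// shared by all nodes of a fragment; Cancel() flips the flag, and every node
// tests it before blocking on network or disk.
struct RuntimeState {
  std::atomic<bool> is_cancelled{false};
};

enum class StatusCode { OK, CANCELLED };

StatusCode GetNextBatch(RuntimeState* state) {
  // Check for cancellation *before* any step that might block for a while.
  if (state->is_cancelled.load()) return StatusCode::CANCELLED;
  // ... only now block waiting for an incoming row batch ...
  return StatusCode::OK;
}
```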
Added support for async query execution via beeswax interface:
- implemented ImpalaServer::query()
- QueryExecState now tracks beeswax's idea of the query state
- ImpalaServer::get_state() now returns the actual state
Fixed handling of ExecNode::Close():
- it needs to be called for the entire plan tree, regardless of what fails (so
RETURN_IF_ERROR() can't be used inside it)
- it needs to be called for every Open() call by the coordinator/ImpalaServer
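The Close() contract above can be sketched like this. This is a hypothetical illustration (the enum and field names are assumptions): instead of returning early on the first error, Close() records the first failure and still visits every child, so no node is left unclosed.

```cpp
#include <cassert>
#include <vector>

enum class Status { OK, ERROR };

// Hypothetical sketch: Close() must run for the whole plan tree even when a
// node fails, so errors are accumulated rather than short-circuited
// (i.e. no RETURN_IF_ERROR-style early return).
struct ExecNode {
  std::vector<ExecNode*> children;
  bool close_fails = false;  // simulate a node whose Close() errors out
  bool closed = false;

  Status Close() {
    Status result = close_fails ? Status::ERROR : Status::OK;
    closed = true;
    for (ExecNode* child : children) {
      Status child_status = child->Close();
      if (result == Status::OK) result = child_status;  // keep the first error
    }
    return result;
  }
};
```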
- made the coordinator asynchronous
- renamed ImpalaBackendService to ImpalaInternalService
- new class ImpalaServer implements ImpalaService and ImpalaInternalService
- renamed ImpalaInternalService fields to conform to C++ style
- merged impala-service.{cc,h} and backend-service.{cc,h} into impala-server.{cc,h}
- added TStatusCode field to Status.ErrorDetail
- removed ImpalaInternalService.CloseChannel
Also removed JdbcDriverTest.java
Currently, subscriptions are per-subscriber. However, we don't want
to be stuck with this decision later on, and if subscriptions are instead
numbered uniquely across subscribers, a 32-bit integer may not provide
as much id space as we'd like.