This version is a custom build of Hadoop cdh4b2, including:
1. the BlockReader direct-read API;
2. a forward-port of MiniHadoopClusterManager from cdh3;
3. removal of the local-read security check;
4. larger BlockReaderLocal slow-read / checksum buffers, to handle reads of up to 1MB at a time.
It is built from github.sf.cloudera.com/Henry/hadoop-common, branch cdh4-23-direct-read, commit 8f8c63.
- added flag --backends="host:port,host:port,..." , which TestEnv uses to create clients for the ImpalaBackendServices
running on those nodes; this is just a stopgap that lets runquery drive multi-node execution
- impalad-main.cc: main() of the impala daemon, which will export both ImpalaService and
ImpalaBackendService (currently only the latter; everything related to ImpalaService is commented out)
- com.cloudera.impala.service.Frontend: API to the frontend functionality; invoked by impalad via JNI; ignore for now
- replaced hdfs-text-scan-node boundary strings with std::string
- removed the bzero() call in RowBatch::AddTuple()
Improved the benchmark-running script to compare against previous results when available.
- changed block assignment to plan fragments when numPartitions != all
- added class HdfsFsCache, which caches HDFS connections for the lifetime of the process.
This fixes a bug in the HDFS text scan node, which inadvertently obtained shared connections
via hdfsConnect() and then closed them process-wide via a call to hdfsDisconnect().
- adapted tests to eliminate random output
Also fixed handling of empty tables (or empty scan ranges) (JIRA IMP-28)
* new class ExchangeNode: ExecNode for incoming data stream
* new class Coordinator: coordinates execution of all plan fragments
* reorganized classes PlanExecutor and QueryExecutor
* renamed PlanExecutorAdaptor to JniCoordinator
* backend-service: creates thrift server that exports ImpalaBackendService
* added --num_backends flag for runquery
mvn -Pload_testdata ... from fe/ will run a three-node,
single-process HDFS cluster loaded with data from
AllTypesAgg in /impala-dist-test.
A Hive table over this data is created as AllTypesAggMini.
- Fixed an issue with the SSE file parse.
- Moved build scripts to impala/bin. Rebuilding from just the BE directory does not work.
- Cleaned up a few compiler warnings.
- Added an option to disable automatic counters for profilers.
- DataStreamSender: sender side (1:n) for a single stream
- DataStreamMgr: receiver side; singleton class for all incoming streams active at a node
Changed ExecNode::GetNext() to return an end-of-stream (eos) indicator explicitly; this allows us to pass incoming TRowBatches (which may not be full) upstream without copying the data.
Added data-stream-test.
1) partitioning of scans along hdfs file splits (ScanNode.getScanParams());
2) rudimentary distributed plan generator, which adds a merge phase to what are essentially single-node plans, and partitions the plan based on the partitioning of the leftmost scan in the plan tree
There is no test coverage for 1) yet, because with the current HDFS setup (all paths point to the local fs) there are no file splits: every file is a single block. I will change our test setup in a forthcoming CL to use the Hadoop minicluster environment, which should allow us to create files with splits.
Adding MemPool::GetOffset()/GetDataPtr().
Fixed planner bug (it wouldn't generate TScanParams for more than one scan).
Fixed bug in the Java test harness (which made it ignore the fact that the join tests had been broken for a while).
cases in executor code.
Adding MemPool::Release(), which allows passing data between pools.
Changed the semantics of GetNext() so that tuple data is not overwritten, even beyond the next call;
the previous semantics (data valid only until the next call) would have required joins
to create copies.
Adding mem-pool-test.