impala

mirror of https://github.com/apache/impala.git synced 2026-01-04 09:00:56 -05:00

Author	SHA1	Message	Date
Bharath Vissapragada	3f2f008ac4	IMPALA-3552: Make incremental stats max serialized size configurable The fix "IMPALA-2648/IMPALA-2664" introduced a conservative limitation on the maximum serialized size of incremental stats. As a side effect, some users with very large tables are experiencing regressions especially when they upgrade impala and the serialized size goes beyond 200MB. To mitigate the issue, the change introduces a new gflag, 'inc_stats_size_limit_bytes' to make the max serialized size configurable, which allows impala users to specify their own maximum serialized size. Default value for inc_stats_size_limit_bytes is 200MB. The change introduces a TBackendGflags class to pass the gflags from backend to the Frontend and the Catalog via thrift. This also revamps existing query options to use the TBackendConfig. Change-Id: I33684725a61eabc67237503e61178305d37d3cb5 Reviewed-on: http://gerrit.cloudera.org:8080/4867 Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com> Tested-by: Internal Jenkins	2016-11-15 03:22:11 +00:00
Henry Robinson	19de09ab7d	IMPALA-4160: Remove Llama support. Alas, poor Llama! I knew him, Impala: a system of infinite jest, of most excellent fancy: we hath borne him on our back a thousand times; and now, how abhorred in my imagination it is! Done: * Removed QueryResourceMgr, ResourceBroker, CGroupsMgr * Removed untested 'offline' mode and NM failure detection from ImpalaServer * Removed all Llama-related Thrift files * Removed RM-related arguments to MemTracker constructors * Deprecated all RM-related flags, printing a warning if enable_rm is set * Removed expansion logic from MemTracker * Removed VCore logic from QuerySchedule * Removed all reservation-related logic from Scheduler * Removed RM metric descriptions * Various misc. small class changes Not done: * Remove RM flags (--enable_rm etc.) * Remove RM query options * Changes to RequestPoolService (see IMPALA-4159) * Remove estimates of VCores / memory from plan Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef Reviewed-on: http://gerrit.cloudera.org:8080/4445 Tested-by: Internal Jenkins Reviewed-by: Henry Robinson <henry@cloudera.com>	2016-09-20 23:50:43 +00:00
Dan Hecht	ffa7829b70	IMPALA-3918: Remove Cloudera copyrights and add ASF license header For files that have a Cloudera copyright (and no other copyright notice), make changes to follow the ASF source file header policy here: http://www.apache.org/legal/src-headers.html#headers Specifically: 1) Remove the Cloudera copyright. 2) Modify NOTICE.txt according to http://www.apache.org/legal/src-headers.html#notice to follow that format and add a line for Cloudera. 3) Replace or add the existing ASF license text with the one given on the website. Much of this change was automatically generated via: git grep -li 'Copyright.Cloudera' > modified_files.txt cat modified_files.txt | xargs perl -n -i -e 'print unless m#Copyright.Cloudera#i;' cat modified_files.txt | xargs fix_apache_license.py [1] Some manual fixups were performed following those steps, especially when license text was completely missing from the file. [1] https://gist.github.com/anonymous/ff71292094362fc5c594 with minor modification to ORIG_LICENSE to match Impala's license text. Change-Id: I2e0bd8420945b953e1b806041bea4d72a3943d86 Reviewed-on: http://gerrit.cloudera.org:8080/3779 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-08-09 08:19:41 +00:00
Bharath Vissapragada	084b9b1692	IMPALA-2432: Add query endtime to impalad's lineage This commit adds query endtime to impalad's lineage log entries consumed by navigator. The lineage graph is constructed in the frontend and is then passed to the backend as a serialized thrift object. When the query terminates (includes cancellations and aborts), the backend appends the query endtime ("endTime") to the lineage graph and generates the lineage log entry in JSON format. Change-Id: I2236e98895ae9a159ad6e78b0e18e3622fdc3306 Reviewed-on: http://gerrit.cloudera.org:8080/934 Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com> Tested-by: Internal Jenkins	2015-11-04 08:39:12 +00:00
Martin Grund	384ae3ab08	Fixes for Toolchain Issues If a static version of zlib and bzip2 is picked up we assumed that it would be compiled with -fPIC. However, this is not always the case. Thus in the non-toolchain case we specifically link dynamically with zlib and bzip2 for the dynamic targets. In addition, this patch removes static linking of libgcc in the toolchain case as LLVM is not able to find the exception handling symbols even if they are present in the binary. Static linking of libgcc is postponed. Next, if Impala is built with -notests the external data source thrift files would not be generated. This patch makes sure the dependencies are expressed correctly. Finally, if a user has google perftools installed on the system we would accidentally pick up the system libraries and the thirdparty headers, which ends in linker errors. This patch fixes the path issues. Change-Id: Ic000101c33da26d75a0cd733f7ef02f1bd694937 Reviewed-on: http://gerrit.cloudera.org:8080/460 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins	2015-06-15 23:14:32 +00:00
Matthew Jacobs	fe87bb1563	Add MetricDefs, static definitions of metric metadata generated from json Adds a static definition of the metric metadata used by Impala. The metric names, descriptions, and other properties are defined in common/thrift/metrics.json file, and the generate_metrics.py script creates a thrift representation. The metric definitions are then available in a constant map which is used at runtime to instantiate metrics, looking them up in the map by the metric key. New metrics should be defined by adding an entry to the list of metrics in metrics.json with the following properties: key: The unique string identifying the metric. If the metric can be templated, e.g. rpc call duration, it may be a format string (in the format used by strings::Substitute()). description: A text description of the metric. May also be a format string. label: A brief title for the metric, not currently used by Impala but provided for external tools. units: The unit of the metric. Must be a valid value of TUnit. kind: The kind of metric, e.g. GAUGE or COUNTER. Must be a valid value of TMetricKind. contexts: The context in which this metric may be instantiated. Usually "IMPALAD", "STATESTORED", "CATALOGD", but may be a different kind of 'entity'. Not currently used by Impala but provided for modeling purposes for external tools. For example, adding the counter for the total number of queries run over the lifetime of the impalad process might look like: { "key": "impala-server.num-queries", "description": "The total number of queries processed.", "label": "Queries", "units": "UNIT", "kind": "COUNTER", "contexts": [ "IMPALAD" ] } TODO: Incorporate 'label' into the metrics debug page. TODO: Verify the context at runtime, e.g. verify 'contexts' contains, e.g. a DCHECK. After the metric definition is added, the generate_metrics.py script will generate the TMetricDefs.thrift that contains a TMetricDef for the metric definition. 
At runtime, the metric can be instantiated using the key defined in metrics.json. Gauges, Counters, and Properties are instantiated using static methods on MetricGroup. Other metric types are instantiated using static CreateAndRegister methods on their associated classes. TODO: Generate a thrift enum used to lookup metric defs. TODO: Consolidate the instantiation of metrics that are created outside of metrics.h (i.e. collection metrics, memory metrics). TODO: Need a better way to verify if metric definitions are missing. Change-Id: Iba7f94144d0c34f273c502ce6b9a2130ea8fedaa Reviewed-on: http://gerrit.cloudera.org:8080/330 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: Internal Jenkins	2015-05-14 21:27:28 +00:00
Martin Grund	b582cdc22b	IMPALA-1598: Adding Error Codes to Log Messages This patch introduces the concept of error codes for errors that are recorded in Impala and are going to be presented to the client. These error codes are used to aggregate and group incoming error / warning messages to reduce the spill on the shell and increase the usefulness of the messages. By splitting the message string from the implementation, it becomes possible to edit the string independently of the code and pave the way for internationalization. Error messages are defined as a combination of an enum value and a string. Both are defined in the Error.thrift file that is automatically generated using the script in common/thrift/generate_error_codes.py. The goal of the script is to have a central understandable repository of error messages. Adding new messages to this file will require rebuilding the thrift part. The proxy class ErrorMessage is responsible for representing an error and capturing the parameters that are used to format the error message string. When error messages are recorded they are recorded based on the following algorithm: - If an error message is of type GENERAL, do not aggregate this message and simply add it to the total number of messages - If an error message is of a specific type, record the first error message as a sample and for all other occurrences increment the count. - The coordinator will merge all error messages except the ones of type GENERAL and display a count. For example, in the case of the parquet file spanning multiple blocks the output will look like: Parquet files should not be split into multiple hdfs-blocks. file=hdfs://localhost:20500/fid.parq (1 of 321 similar) All messages are always logged to VLOG. In the coordinator error messages are merged across all backends to retain readability in the case of large clusters. 
The current version of this patch adds these new error codes to some of the most important error messages as a reference implementation. Change-Id: I1f1811631836d2dd6048035ad33f7194fb71d6b8 Reviewed-on: http://gerrit.cloudera.org:8080/39 Reviewed-by: Martin Grund <mgrund@cloudera.com> Tested-by: Internal Jenkins	2015-03-01 03:37:32 +00:00
Henry Robinson	6bc411c890	Add support for HS2 protocol V6 This patch adds support for V6 of the HS2 protocol, which notably includes columnar organisation of result sets. Clients that set their protocol version to < V6 will receive result sets in the traditional row orientation. The performance of fetches over HS2 goes up significantly as a result, since the V1 protocol had some pathologies in its deserialisation performance. Beeswax Row materialisation: 455ms, client processing time: 523ms HS2 V6: Row materialisation: 444ms, client processing time: 1.8s HS2 V1: Row materialisation: 585ms, client processing time: 15.9s (!) TODO: Add support for the CHAR datatype The following patch is also included: Fix wait-for-hiveserver2.py when Impala moves to HS2 V6 Due to HIVE-6050, older versions of Hive are not compatible with newer clients (even those that try to use old protocol versions). wait-for-hiveserver2.py uses HS2 to talk to the HiveServer2 service, but picks up the newer version from V6, and fails. This patch temporarily re-adds cli_service.thrift (renaming the Thrift service as LegacyTCLIService) only for wait-for-hiveserver2.py to use. As soon as Impala's thirdparty Hive moves to HS2 V6, we can get rid of this change. Change-Id: I2cbe884345ae7e772620b80a29b6574bd6532940 Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4402 Tested-by: jenkins Reviewed-by: Henry Robinson <henry@cloudera.com>	2014-09-18 20:17:18 -07:00
Henry Robinson	8a33b1861b	Optionally dynamically link Impala executables This patch adds two new flags to make_impala.sh: -build_shared_libs: Impala libraries (excluding thirdparty ones) will be built as shared objects (.so), and linked dynamically. -build_static_libs: Impala libraries will be built as archive files (.a), and linked statically. This was the behaviour before this patch, and is still the default for make_impala.sh. The speedup from dynamic linking for a clean build is significant: make_impala.sh -clean -build_static_libs: 11m48.676s make_impala.sh -clean -build_shared_libs: 5m46.943s make_debug.sh now passes -build_shared_libs by default. make_[asan|release].sh still builds with statically linked libraries. All automated builds will be statically linked for now; we can move them to dynamic linking on a case-by-case basis. Change-Id: Icfd8101bf8e85cadd61d8995ae8864f8730297ea Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3828 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins	2014-08-17 12:44:05 -07:00
Matthew Jacobs	64f55f32fe	Refactor thrift for ext-data-source to generate only necessary structs ext-data-source only needs a small subset of the thrift structures, so this separates the dependencies between files so that just the necessary structs are generated for ext-data-source. Afterwards, we can remove extra maven dependencies which were using environment variables to get versions. While the environment variables work when building the pom, they are not propagated to dependencies so building fe/pom.xml ended up producing lots of warnings which are now gone. Change-Id: I267fe7bc7a54c3c21aad8c1ffce07cf1a1e07c5e Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3748 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit 1f738962ccb7a34834decfe6cb27307ed4548870) Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3767	2014-08-05 11:33:46 -07:00
Nong Li	5d903efca3	ExecSummary The runtime profile as we present it is not very useful and I think the structure of it makes it hard to consume. This patch adds a new client facing schemed set of counters that are collected from the runtime profiles. For example, with this structure it would be easy to have the shell get the stats of a running query and print a useful progress report or to check the most relevant metrics for diagnosing issues. Here's an example of the output for one of the tpch queries:
Operator             #Hosts  Avg Time   Max Time   #Rows    Est. #Rows  Peak Mem  Est. Peak Mem  Detail
----------------------------------------------------------------------------------------------------------------------
09:MERGING-EXCHANGE  1       79.738us   79.738us   5        5           0         -1.00 B        UNPARTITIONED
05:TOP-N             3       84.693us   88.810us   5        5           12.00 KB  120.00 B
04:AGGREGATE         3       5.263ms    6.432ms    5        5           44.00 KB  10.00 MB       MERGE FINALIZE
08:AGGREGATE         3       16.659ms   27.444ms   52.52K   600.12K     3.20 MB   15.11 MB       MERGE
07:EXCHANGE          3       2.644ms    5.1ms      52.52K   600.12K     0         0              HASH(o_orderpriority)
03:AGGREGATE         3       342.913ms  966.291ms  52.52K   600.12K     10.80 MB  15.11 MB
02:HASH JOIN         3       2s165ms    2s171ms    144.87K  600.12K     13.63 MB  941.01 KB      INNER JOIN, BROADCAST
|--06:EXCHANGE       3       8.296ms    8.692ms    57.22K   15.00K      0         0              BROADCAST
|  01:SCAN HDFS      2       1s412ms    1s978ms    57.22K   15.00K      24.21 MB  176.00 MB      tpch.orders o
00:SCAN HDFS         3       8s032ms    8s558ms    3.79M    600.12K     32.29 MB  264.00 MB      tpch.lineitem l
Change-Id: Iaad4b9dd577c375006313f19442bee6d3e27246a Reviewed-on: http://gerrit.ent.cloudera.com:8080/2964 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-06-11 03:10:11 -07:00
Matthew Jacobs	25c0ebf58c	External Data Source: Public API Adds the thrift structures for the public external data source API and a new maven project containing the Java ExternalDataSource interface and the generated Java thrift classes. The ExternalDataSource.thrift structures can evolve in a backward compatible way. The ExternalDataSource Java interface will always contain a version number in the namespace (e.g. com.cloudera.impala.extdatasource.v1 for V1) so we can potentially make breaking changes to the interface in the future but still support older versions. A trivial implementation of the ExternalDataSource API is also added for testing purposes. TODO: Make the sample data source implementation realistic. Change-Id: I827d6420a87ed7a2bce34e050362ca98ddc5dbcc Reviewed-on: http://gerrit.ent.cloudera.com:8080/2241 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit f29814e9ede9d4c889f2648606fcf511feeb47ae) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2313	2014-04-22 18:34:48 -07:00
Nong Li	0d2919fe7f	Refactor scalar and aggregate function analysis and execution. This patch cleans up analysis and execution of scalar and aggregate functions so that there is no difference between how builtins and user functions are handled. The only difference is that the catalog is populated with the builtins all the time. The BE always gets a TFunction object and just executes it (builtins will have an empty hdfs file location). This removes the opcode registry and all of the functionality is subsumed by the catalog, most of which was already duplicated there anyway. This also introduces the concept of a system database; databases that the user cannot modify and is populated automatically on startup. Change-Id: Iaa3f84dad0a1a57691f5c7d8df7305faf01d70ed Reviewed-on: http://gerrit.ent.cloudera.com:8080/1386 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1577	2014-02-18 18:40:08 -08:00
Alex Behm	dc7b398bd3	Impala reserves resources from YARN via Llama. Impala reserves resources from YARN via Llama and handles resource preemptions by cancelling affected queries. Adds the Impala Resource Broker for interacting with Llama. Refactors scheduler and coordinator to move fragment-to-host assignment logic into scheduler. Local test setup uses MiniLLama. Change-Id: Ic7b0fe43de52d30f4207b4e65cce7e6a294e54e1	2014-01-15 15:12:04 -08:00
Henry Robinson	51e58e1f3c	Statestore aesthetic cleanup * Statestore is now one word, without camelcase, everywhere. Previous names included StateStore, state-store and state_store, variously. The only exception is a couple of flags that have 'state_store', and can't be changed for compatibility reasons. * File names are also changed to reflect the standard naming. * Most comments are now 90 chars wide (from 80 before) Change-Id: I83b666c87991537f9b1b80c2f0ea70c2e0c07dcf Reviewed-on: http://gerrit.ent.cloudera.com:8080/1225 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins	2014-01-09 09:56:04 -08:00
Lenni Kuff	9d5b94baa5	CatalogServer follow-on code review changes Changes to address follow-on code review comments. This change consists mainly of: * Comment cleanup / clarification * Thrift struct consolidation * Minor naming changes * Small code fixes/changes, etc Change-Id: Idd03cc8adeb9c0d99744688a02f81a08135966de Reviewed-on: http://gerrit.ent.cloudera.com:8080/667 Tested-by: jenkins Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:53:42 -08:00
Lenni Kuff	bf139d1eba	Update catalogd to forward log4j log messages to glog Change-Id: I4620b77ba731e134a3e48883e8ae7ee3820ed584 Reviewed-on: http://gerrit.ent.cloudera.com:8080/612 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins	2014-01-08 10:53:12 -08:00
Lenni Kuff	a2cbd2820e	Add Catalog Service and support for automatic metadata refresh The Impala CatalogService manages the caching and dissemination of cluster-wide metadata. The CatalogService combines the metadata from the Hive Metastore, the NameNode, and potentially additional sources in the future. The CatalogService uses the StateStore to broadcast metadata updates across the cluster. The CatalogService also directly handles executing metadata update requests from impalad servers (DDL requests). It exposes a Thrift interface to allow impalads to connect directly and execute their DDL operations. The CatalogService has two main components - a C++ server that implements StateStore integration, Thrift service implementation, and exporting of the debug webpage/metrics. The other main component is the Java Catalog that manages caching and updating of all the metadata. For each StateStore heartbeat, a delta of all metadata updates is broadcast to the rest of the cluster. Some Notes On the Changes --- * The metadata is all sent as thrift structs. To do this all catalog objects (Tables/Views, Databases, UDFs) have a thrift struct to represent them. These are sent with each statestore delta update. * The existing Catalog class has been separated into two separate sub-classes. An ImpaladCatalog and a CatalogServiceCatalog. See the comments on those classes for more details. What is working: * New CatalogService created * Working with statestore delta updates and latest UDF changes * DDL performed on Node 1 is now visible on all other nodes without a "refresh". * Each DDL operation against the Catalog Service will return the catalog version that contains the change. An impalad will wait for the statestore heartbeat that contains this version before returning from the DDL command. 
* All table types (Hbase, Hdfs, Views) getting their metadata propagated properly * Block location information included in CS updates and used by Impalads * Column and table stats included in CS updates and used by Impalads * Query tests are all passing Still TODO: * Directly return catalog object metadata from DDL requests * Poll the Hive Metastore to detect new/dropped/modified tables * Reorganize the FE code for the Catalog Service. I don't think we want everything in the same JAR. Change-Id: I8c61296dac28fb98bcfdc17361f4f141d3977eda Reviewed-on: http://gerrit.ent.cloudera.com:8080/601 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-01-08 10:53:11 -08:00
Nong Li	af90c8a133	Fix memory usage tracking. Changes MemLimit to MemTracker: - the limit is optional - it also records a label and an optional parent - Consume() and Release() also update the ancestors and there's also a new AnyLimitExceeded(), which also checks the ancestors - the consumption counter is a HighwaterMarkCounter and can optionally be created as part of a profile Each fragment instance now has a MemTracker that is part of a 3-level hierarchy: process, query, fragment instance. Change-Id: I5f580f4956fdf07d70bd9a6531032439aaf0fd07 Reviewed-on: http://gerrit.ent.cloudera.com:8080/339 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com>	2014-01-08 10:52:36 -08:00
Henry Robinson	90ed9f0ab8	Remove planservice	2014-01-08 10:50:20 -08:00
Henry Robinson	2ae20cbbb7	Statestore-2.0: New state-store implementation * API simplified to deal only with 'topics', not services and objects * Scalability improved: heartbeat loop is now multi-threaded * State-store can store arbitrary objects * State-store may send either deltas or complete topic state (delta computation to come)	2014-01-08 10:49:23 -08:00
Nong Li	6e293090e6	Parquet writer. Change-Id: I7117b545e3d3a7803a219234ad992040a6c7c4ec	2014-01-08 10:48:44 -08:00
Nong Li	868a99135a	Add network benchmark	2014-01-08 10:47:56 -08:00
Alan Choi	be98df19c8	HiveServer2 This patch implements the HiveServer2 API. We have tested it with Lenni's patch against the tpch workload. It has also been tested manually against Hive's beeline with queries and metadata operations. All of the HiveServer2 code is implemented in impala-hs2-server.cc. Beeswax code is refactored to impala-beeswax-server.cc. HiveServer2 has a few more metadata operations. These operations go through impala-hs2-server to ddl-executor and then to FE. The logic is implemented in fe/src/main/java/com/cloudera/impala/service/MetadataOp.java. Because of the Thrift union issue, I have to modify the generated c++ file. Therefore, all the HiveServer2 thrift generated c++ code is checked into be/src/service/hiveserver2/. Once the thrift issue is resolved, I'll remove these files. Change-Id: I9a8fe5a09bf250ddc43584249bdc87b6da5a5881	2014-01-08 10:47:24 -08:00
Henry Robinson	7ba437a52e	Code changes to build against thrift 0.9.0 in thirdparty/	2014-01-08 10:47:22 -08:00
Henry Robinson	986f3cddf6	Move sparrow/ to statestore/ and remove sparrow namespace	2014-01-08 10:47:12 -08:00
Nong Li	2289906a5a	Fix linker dependencies.	2014-01-08 10:46:56 -08:00
Henry Robinson	2f339f2ed8	Add ASL license to all public files	2014-01-08 10:46:32 -08:00
ishaan	05c65789bb	Change Copyrights from 2011 to 2012	2014-01-08 10:46:29 -08:00
Michael Ubell	ad46b98366	Add Kerberos authentication.	2014-01-08 10:45:10 -08:00
Marcel Kornacker	c004cdaa1c	Thrift structures for the new planner interface.	2014-01-08 10:44:47 -08:00
Marcel Kornacker	fb32d40b03	Switching to an asynchronous plan fragment exec interface; this entails: - making the coordinator asynchronous - renamed ImpalaBackendService to ImpalaInternalService; - new class ImpalaServer implements ImpalaService and ImpalaInternalService - renaming ImpalaInternalService fields to conform to c++ style - merged impala-service.{cc,h} and backend-service.{cc,h} into impala-server.{cc,h} - added TStatusCode field to Status.ErrorDetail - removed ImpalaInternalService.CloseChannel Also removed JdbcDriverTest.java	2014-01-08 10:44:15 -08:00
Kay Ousterhout	073e38d6c2	Added the StateStore, a centralized repository for soft state. The commit also adds the StateStoreSubscriber, a component that runs alongside each impalad and handles communication with the state store.	2012-07-13 09:26:16 -07:00
Alan Choi	f52286f72c	This completes the Beeswax implementation for ODBC. All the ODBC tests (CDH/hive-odbc-test) pass (except those with "create table" and "show table"). We should have a nightly regression run of the ODBC tests against impalad. There are still a few issues: 1. running with num_node > 0 crashes the coordinator; 2. workarounds for a few ODBC jiras; 3. no test for bool/timestamp because ODBC doesn't support them. review: issue 110	2012-06-18 14:46:46 -07:00
Alan Choi	ef10afa439	This changes the Thrift from 0.6.1 to 0.7.0. Please uninstall the old thrift and download/install Thrift 0.7.0. Beeswax service now depends on Hive metastore; fix buildall.sh to clean generated-source in FE; fix .gitignore to clean generated-source in BE;	2012-06-14 18:21:08 -07:00
Alan Choi	7af87c7dea	Beeswax Service for Impala (partial implementation) review id: 82	2012-06-06 10:08:06 -07:00
Henry Robinson	3ff3559805	Add support for per-partition file formats to front end and backend. At the same time, this patch removes the partitionKeyRegex in favour of explicitly sending a list of literal expressions for each file path from the front end.	2012-06-05 12:00:09 -07:00
Marcel Kornacker	4a4a07fde7	A number of changes for the Jenkins build: - added option to run with derby metastore, based on whether env var METASTORE_IS_DERBY is set - removed hardwired file locations from planner tests - switching to linking statically against libthrift.a Also added script rebuild.sh, which contains the build steps of buildall.sh (against impala sources).	2012-03-08 16:19:47 -08:00
Nong Li	b410b62716	Add distributed profile counter for the BE.	2012-03-01 13:59:17 -08:00
Nong Li	88237350f0	Change the build to allow debug and release builds to coexist.	2012-02-17 18:14:04 -08:00
Nong Li	94db70c9fd	Fix build. Dependencies don't propagate right on first build.	2011-12-30 21:28:18 -08:00
Nong Li	c84fec38d3	- Move thrift out of FE src and into impala/common - Thrift files now build using cmake instead of mvn - Added cmake build to impala/ which drives the build process	2011-12-30 19:35:20 -08:00
Marcel Kornacker	c056445612	Added m:n data streams: - DataStreamSender: sender side (1:n) for a single stream - DataStreamMgr: receiver side; singleton class for all incoming streams active at a node Changed ExecNode::GetNext() to return eos indicator explicitly; this allows us to pass incoming TRowBatches (which may not be full) up w/o copying the data. Added data-stream-test.	2012-01-10 18:00:20 -08:00
Alexander Behm	c7f7382c31	Added planner changes and data sinks for INSERT statements.	2011-12-12 15:14:49 -08:00
Nong Li	b1833d4de8	Implemented opcode registry. Added substr() and pi() functions. Added backend testing to buildall.sh	2011-11-20 13:44:41 -08:00
Marcel Kornacker	0914fedea9	Defining Impala backend service, which is exported by backend processes to service plan fragment execution requests. Changing thrift plan-related structs to pull out runtime parameters in preparation for parallel execution.	2011-11-02 14:47:32 -07:00
Marcel Kornacker	c534062c20	fixing build failure introduced by 7d9b7e2	2011-08-03 15:37:58 -07:00
Marcel Kornacker	cc141953de	Adding plan service for be test driver Adding mock implementation of libhdfs (only what's needed for text-scan-node) in order to avoid having to make any jni calls. Some bug fixes.	2011-07-22 12:09:55 -07:00
marcel	08e8a5db4c	fixed jni and linker problems some bug fixes and missing functions some cleanup	2011-07-15 13:17:20 -07:00
Marcel Kornacker	c23616a30c	deserializing plan request in c++ Coordinator.main(): util function to execute single query against test schema removed dead code from TestSchemaUtils	2011-07-13 13:48:54 -07:00
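
The aggregation rules described in commit b582cdc22b (IMPALA-1598) can be illustrated with a small Python sketch. This is not Impala's C++ implementation: the `ErrorLog` class, its method names, and the `PARQUET_MULTIPLE_BLOCKS` code used in the usage lines are assumptions made for illustration, and keeping GENERAL messages verbatim during the merge is one reading of "do not aggregate". Only the three rules and the "(1 of N similar)" output format come from the commit message.

```python
from collections import OrderedDict

GENERAL = "GENERAL"  # catch-all code; messages of this type are never aggregated


class ErrorLog:
    """Hypothetical per-backend error log that groups messages by error code."""

    def __init__(self):
        self.general = []              # GENERAL messages, kept one by one
        self.by_code = OrderedDict()   # code -> [first sample message, count]

    def add(self, code, message):
        if code == GENERAL:
            self.general.append(message)
        elif code in self.by_code:
            self.by_code[code][1] += 1          # later occurrences only bump the count
        else:
            self.by_code[code] = [message, 1]   # first occurrence is kept as the sample

    def merge(self, other):
        """Coordinator-side merge of another backend's log into this one."""
        self.general.extend(other.general)
        for code, (sample, count) in other.by_code.items():
            if code in self.by_code:
                self.by_code[code][1] += count
            else:
                self.by_code[code] = [sample, count]

    def lines(self):
        """Render the log: GENERAL messages first, then one line per error code."""
        out = list(self.general)
        for sample, count in self.by_code.values():
            out.append(sample if count == 1 else f"{sample} (1 of {count} similar)")
        return out


# Usage: two backends report the same Parquet warning 200 and 121 times.
msg = ("Parquet files should not be split into multiple hdfs-blocks. "
       "file=hdfs://localhost:20500/fid.parq")
backend_a, backend_b = ErrorLog(), ErrorLog()
for _ in range(200):
    backend_a.add("PARQUET_MULTIPLE_BLOCKS", msg)
for _ in range(121):
    backend_b.add("PARQUET_MULTIPLE_BLOCKS", msg)
backend_b.add(GENERAL, "unclassified warning")

coordinator = ErrorLog()
coordinator.merge(backend_a)
coordinator.merge(backend_b)
for line in coordinator.lines():
    print(line)
```

Merging counts rather than message lists keeps the coordinator's output size independent of the number of backends, which is the readability goal the commit message states for large clusters.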

51 Commits
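
The MemTracker behaviour summarised in commit af90c8a133 (optional limit, label, optional parent, consumption propagated to ancestors, high-water mark) can be sketched in Python as follows. This is a simplified model rather than the backend's C++ class: the Python method names, the plain integer peak counter standing in for HighwaterMarkCounter, and the byte values in the usage lines are assumptions for illustration.

```python
class MemTracker:
    """Simplified model of a hierarchical memory tracker."""

    def __init__(self, label, limit=None, parent=None):
        self.label = label
        self.limit = limit        # None means "no limit" (the limit is optional)
        self.parent = parent      # None for the root (process-level) tracker
        self.consumption = 0
        self.peak = 0             # high-water mark of consumption

    def _self_and_ancestors(self):
        tracker = self
        while tracker is not None:
            yield tracker
            tracker = tracker.parent

    def consume(self, num_bytes):
        # Consumption is charged to this tracker and every ancestor.
        for tracker in self._self_and_ancestors():
            tracker.consumption += num_bytes
            tracker.peak = max(tracker.peak, tracker.consumption)

    def release(self, num_bytes):
        for tracker in self._self_and_ancestors():
            tracker.consumption -= num_bytes

    def limit_exceeded(self):
        return self.limit is not None and self.consumption > self.limit

    def any_limit_exceeded(self):
        # True if this tracker or any ancestor is over its limit.
        return any(t.limit_exceeded() for t in self._self_and_ancestors())


# Usage: the 3-level hierarchy process -> query -> fragment instance.
process = MemTracker("process", limit=1000)
query = MemTracker("query", limit=600, parent=process)
fragment = MemTracker("fragment instance", parent=query)  # no limit of its own

fragment.consume(500)
exceeded_at_500 = fragment.any_limit_exceeded()   # query is at 500 of 600
fragment.consume(200)
exceeded_at_700 = fragment.any_limit_exceeded()   # query is at 700 of 600
fragment.release(300)
print(exceeded_at_500, exceeded_at_700, process.consumption, query.peak)
```

Because the fragment-instance tracker has no limit of its own, the over-limit condition is only visible through the ancestor check, which is why the commit adds AnyLimitExceeded() alongside the per-tracker limit.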