impala

mirror of https://github.com/apache/impala.git synced 2026-01-05 21:00:54 -05:00

Author	SHA1	Message	Date
Matthew Jacobs	2f9b2ae785	Fix SHOW DATA SOURCE test; must execute setup/cleanup serially The SHOW DATA SOURCE tests were run as part of the other SHOW * tests in test_show(), but the setup/cleanup for data sources can't be run in parallel. This change moves the SHOW DATA SOURCE tests into a separate test method and the setup/cleanup code is only run for this test (i.e. not using setup_method() and teardown_method()). The test is then only executed serially. Change-Id: I221145f49cfe7290e132c6a87a5295b747c1fcc7 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2864 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit 5bcd769eae3a694d7f6f42d093f9197e8a4e8b77) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2870	2014-06-05 20:07:57 -07:00
Henry Robinson	d264ab90fe	Add support for client SSL to Python Beeswax client Change-Id: I0d9352471067bfe19e25221e0ecbbb08f945b962 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2810 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins (cherry picked from commit 545bd30d5cf3cae9a3581d7bc942a909a1a98806) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2850 Tested-by: Henry Robinson <henry@cloudera.com>	2014-06-05 10:48:23 -07:00
Nong Li	5d80942d42	[CDH5] IMPALA-1019: Fix cancellation path in io mgr for cached reads. Change-Id: I11efd65d1efa900f79afe88b781262a44ac5006a Reviewed-on: http://gerrit.ent.cloudera.com:8080/2703 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-05-30 19:14:39 -07:00
Nong Li	84f851b5a5	IMPALA-959: Fix ASAN decimal crashes. Not quite sure what the underlying issue is but these fixes seem to work. Change-Id: I759804eb8338ba86969c0214a1e6e35588c94297 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2726 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com>	2014-05-30 16:47:07 -07:00
Skye Wanderman-Milne	c8b2017093	Add decimal UDF/UDA support. Change-Id: Ie48c1cb8e978c7282593b7f602dd68added6d3fd Reviewed-on: http://gerrit.ent.cloudera.com:8080/2625 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit 5048f04b332c13b1bff32fb257272b0fea4b8584) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2739	2014-05-29 20:49:53 -07:00
ishaan	c5c58c6bce	The workload runner should abort execution if a query fails in a multi-user run. Currently, we coalesce the results and do not properly catch a failure if one of the threads has a failed query and exit_on_error is set to True. This patch ensures that we exit before the next query is run. Change-Id: Ie650e0f547874386c79c78982ea9916f33e18cda Reviewed-on: http://gerrit.ent.cloudera.com:8080/2654 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins	2014-05-27 20:46:21 -07:00
Lenni Kuff	c45e9a70d9	[CDH5] Add DDL support for HDFS caching This change adds DDL support for HDFS caching. The DDL allows the user to indicate a table or partition should be cached and which pool to cache the data into: * Create a cached table: CREATE TABLE ... CACHED IN 'poolName' * Cache a table/partition: ALTER TABLE ... [partitionSpec] SET CACHED IN 'poolName' * Uncache a table/partition: ALTER TABLE ... [partitionSpec] SET UNCACHED When a table/partition is marked as cached, a new HDFS caching request is submitted to cache the location (HDFS path) of the table/partition and the ID of that request is stored within the table metadata (in the table properties). This is stored as: 'cache_directive_id'='<requestId>'. The cache requests and IDs are managed by HDFS and persisted across HDFS restarts. When a cached table or partition is dropped it is important to uncache the cached data (drop the associated cache request). For partitioned tables, this means dropping all cache requests from all cached partitions in the table. Likewise, if a partitioned table is created as cached, new partitions should be marked as cached by default. It is desirable to know which cache pools exist early on (in analysis) so the query will fail without hitting HDFS/CatalogServer if a non-existent pool is specified. To support this, a new cache pool catalog object type was introduced. The catalog server caches the known pools (periodically refreshing the cache) and sends the known pools out in catalog updates. This allows impalads to perform analysis checks on cache pool existence without going to HDFS. It would be easy to use this to add basic cache pool management in the future (ADD/DROP/SHOW CACHE POOL). Waiting for the table/partition to become cached may take a long time. Instead of blocking the user from accessing the table during this period, we will wait for the cache requests to complete in the background and once they have finished the table metadata will be automatically refreshed. Change-Id: I1de9c6e25b2a3bdc09edebda5510206eda3dd89b Reviewed-on: http://gerrit.ent.cloudera.com:8080/2310 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-05-27 16:47:15 -07:00
Dimitris Tsirogiannis	ca86e470de	IMPALA-887: Improve partition pruning time This commit is the first step in improving the performance of partition pruning. Currently, Impala can prune approximately 10K partitions per sec, thereby introducing significant overhead for huge tables with a large number of partitions. With this commit we reduce that overhead by 3X by batching the partition pruning calls to the backend. Change-Id: I3303bfc7fb6fe014790f58a5263adeea94d0fe7d Reviewed-on: http://gerrit.ent.cloudera.com:8080/2608 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2687	2014-05-26 13:10:12 -07:00
Taras Bobrovytsky	46aba6149d	CDH-18512: Modification to allow spaces around the = sign in SET in impala-shell Change-Id: I3c149e9a27962ed1130b1ddbb02952f4254bd4c9 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2609 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2645	2014-05-21 15:34:24 -07:00
Henry Robinson	93a3d65492	Support for LDAP tests * Allow Beeswax connections to optionally use LDAP * Run custom cluster tests from the aux repo, if it exists Change-Id: I054af64e030ad0cd722ae8dd75afda9c58ea2913 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2547 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2640	2014-05-21 05:52:55 -07:00
Matthew Jacobs	f9c9a7ca13	Add SHOW DATA SOURCES Change-Id: Ieeb0df107f45a58b8a99f717e96453da93ee7270 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2529 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit b2392c5bfe9fc928ad19af6ff6737e6dc6324e63) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2614	2014-05-19 17:52:27 -07:00
Srinath Shankar	d193a1e8a5	IMPALA-963: Impala crash in ClearResultCache() The issue is that Impala crashes in ClearResultCache() with result caching on for parallel inserts. The reason is that ClearResultCache() accesses the coordinator RuntimeState to update the query mem tracker. However, there is no coordinator fragment (or RuntimeState) for parallel inserts. The fix is to initialize a query mem tracker to track memory usage in the coordinator instance even if there is no coordinator fragment. Change-Id: I3a2ef14860f683910c29ae19b931202ca6867b9f Reviewed-on: http://gerrit.ent.cloudera.com:8080/2501 Reviewed-by: Srinath Shankar <sshankar@cloudera.com> Tested-by: jenkins	2014-05-19 12:40:12 -07:00
Skye Wanderman-Milne	edbbe6035e	Decimal: read from Avro Allows reading decimal columns with or without codegen. Includes tests based on a data file posted on HIVE-5823. Change-Id: Ie541c6b98bd24543691850cb45a434af60b5a5a6 (cherry picked from commit 6983dcefdf70cce14724e17d03bc061ffb8f671c) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2596 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-05-16 22:26:11 -07:00
ishaan	0298e8b6ab	Fix the ASAN build by xfailing test_decimal when ASAN_OPTIONS is set. Adding decimal columns crashes an ASAN built impalad. This change skips the test. Change-Id: Ic94055a3f0d00f89354177de18bc27d2f4cecec2 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2532 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2594	2014-05-16 18:14:30 -07:00
Nong Li	4b883ac7eb	Fix decimal bugs. Fix overflow handling in a few cases and add decimal as hs2 type. Change-Id: Ifde1988365f6be961e7eb7404ed37d7bbaab875c Reviewed-on: http://gerrit.ent.cloudera.com:8080/2531 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2564	2014-05-16 00:17:38 -07:00
Matthew Jacobs	19f34d9187	test_data_source_tables should only run for 'text/none' Change-Id: I784fb4305f8cff92c2582b0a7008f836c7aa9fa4 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2504 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: jenkins (cherry picked from commit 9f3621ec5d270c60258e93e8a2a596329c31f4e6) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2508	2014-05-09 19:32:18 -07:00
Matthew Jacobs	0c533bb152	External Data Source: Backend changes Change-Id: Ifa62b4ea231da47facb31c3f8d43e5e3ac73591f Reviewed-on: http://gerrit.ent.cloudera.com:8080/2284 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins (cherry picked from commit f1e5db2853135c4346788192e2dbc632d4fe1dfb) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2497 Reviewed-by: Matthew Jacobs <mj@cloudera.com>	2014-05-09 02:24:41 -07:00
Henry Robinson	38befd2126	IMPALA-724: Support infinite / nan values in text files This patch allows the text scanner to read 'inf' or 'Infinity' from a row and correctly translate it into floating-point infinity. It also adds is_inf() and is_nan() builtins. Finally, we change the text table writer to write Infinity and NaN for compatibility with Hive. In the future, we might consider adding nan / inf literals to our grammar (postgres has this, see: http://www.postgresql.org/docs/9.3/static/datatype-numeric.html). Change-Id: I796f2852b3c6c3b72e9aae9dd5ad228d188a6ea3 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2393 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins (cherry picked from commit 58091355142cadd2b74874d9aa7c8ab6bf3efe2f) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2483	2014-05-08 12:28:53 -07:00
Lenni Kuff	13c794db91	[CDH5] Update dependency versions to CDH5.1.0 This just updates the versions, it doesn't touch anything in /thirdparty. Change parquet version to append SNAPSHOT Added hadoop-hbase-compat jar in AUX_CLASSPATH and mapreduce/*.jar to HDFS Change-Id: I4471ef4476997371cf49a9d54cfa63f2fda126e4	2014-05-07 15:10:40 -07:00
Nong Li	03e5665e56	Decimal: Read/Write to parquet. This adds support for the FIXED_LENGTH_BYTE_ARRAY parquet type and encoding for decimals. Change-Id: I9d5780feb4530989b568ec8d168cbdc32b7039bd Reviewed-on: http://gerrit.ent.cloudera.com:8080/1727 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2432	2014-05-02 16:38:35 -07:00
Nong Li	5adbcbbce5	Update decimal tests to only run on text/none. Change-Id: I9a35f9e1687171fc3f06c17516bca2ea4b9af9e1 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2217 Tested-by: jenkins Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/2431 Reviewed-by: Nong Li <nong@cloudera.com>	2014-05-02 12:18:37 -07:00
Nong Li	bb3feb675e	Dynamically scale down mem usage in scanners and io mgr. This patch scales down the amount of buffering in the io mgr and the number of scanner threads if the query is close to mem limits. Change-Id: I68ef247a68642939b98ec7c429dfd393b23a20d2 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1906 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2417	2014-05-01 15:04:07 -07:00
casey	192d52c258	Testing: Generate queries and compare results against other databases This is the initial commit and is a work in progress. See the README for a list of possible improvements. As an overview of how the files are related: model.py: This is the base upon which the other files are built. It contains something like a grammar for queries. query_generator.py: Generates random permutations of the model. model_translator.py: Produces SQL based on the model. discrepancy_searcher.py: Uses the above to generate, run, and compare query results. Change-Id: Iaca6277766f5a86568eaa3f05b99c832942ab38b Reviewed-on: http://gerrit.ent.cloudera.com:8080/1648 Reviewed-by: Casey Ching <casey@cloudera.com> Tested-by: Casey Ching <casey@cloudera.com>	2014-05-01 14:20:35 -07:00
Skye Wanderman-Milne	60db4d4d82	CDH-18416: Don't inline ReadWriteUtil::ReadZLong() For wide Avro tables, ReadZLong() would get inlined many times into a single function body, causing LLVM to crash. Not inlining doesn't seem to have a performance impact on narrow tables, and helps with wide tables. This change also adds tests over wide (i.e. many-column) tables. The test tables are produced by specifying shell commands to generate test tables in functional_schema_template.sql, which are executed in generate-schema-statements.py. In the SQL templates, sections starting with a ` are treated as shell commands. The output of the shell command is then used as the section text. This is only a starting point; it isn't currently implemented for all sections, and may have to be tweaked if we use this mechanism for all tables. Change-Id: Ife0d857d19b21534167a34c8bc06bc70bef34910 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2206 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com> (cherry picked from commit 1c5951e3cce25a048208ab9bb3a3aed95e41cf67) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2353 Tested-by: jenkins	2014-04-28 15:58:15 -07:00
Skye Wanderman-Milne	bd2fc2d1d4	IMPALA-934: Refresh cached UDF library when creating a new function This change adds the ability to refresh a local cache entry, causing the old cache entry to be dropped and the library to be reloaded from HDFS. This is used in ResolveSymbolLookup(), which is called by the frontend when creating a new function, and in ImpalaServer when receiving a "create function" heartbeat. This change also makes sure the FE calls into the backend for jars, so jars get refreshed as well. Change-Id: I5fd61c1bc2e04838449335d5a68b61af8b101b01 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2286 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins (cherry picked from commit e8587794b3b82438190c91b2ebe9d1e12db73981) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2348	2014-04-24 19:39:16 -07:00
casey	2351266d0e	Replace single process mini-dfs with multiple processes This should allow individual service components, such as a single nodemanager, to be shutdown for failure testing. The mini-cluster bundled with hadoop is a single process that does not expose the ability to control individual roles. Now each role can be controlled and configured independently of the others. Change-Id: Ic1d42e024226c6867e79916464d184fce886d783 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1432 Tested-by: Casey Ching <casey@cloudera.com> Reviewed-by: Casey Ching <casey@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/2297 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: Ishaan Joshi <ishaan@cloudera.com>	2014-04-23 18:24:05 -07:00
Lenni Kuff	bb09b5270f	IMPALA-839: Update tests to be more thorough when run exhaustively Some tests have constraints that were there only to help reduce runtime which reduces coverage when running in exhaustive mode. The majority of the constraints are because it adds no value to run the test across additional dimensions (or it is invalid to run with those dimensions). Updates the tests that have legitimate constraints to use two new helper methods for constraining the table format dimension: create_uncompressed_text_dimension() create_parquet_dimension() These will create a dimension that will produce a single test vector, either uncompressed text or parquet respectively. Change-Id: Id85387c1efd5d192f8059ef89934933389bfe247 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2149 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins (cherry picked from commit e02acbd469bc48c684b2089405b4a20552802481) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2290	2014-04-18 20:11:31 -07:00
Lenni Kuff	15327e8136	Migrate DataErrors tests to Python test framework, re-enable subset of tests This re-enables a subset of the stable data errors tests and updates them to work in our test framework. This includes support for updating results via --update_results. This also lets us remove a lot of old code that was there only to support these disabled tests. Change-Id: I4c40c3976d00dfc710d59f3f96c99c1ed33e7e9b Reviewed-on: http://gerrit.ent.cloudera.com:8080/1952 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2277	2014-04-18 02:25:11 -07:00
Nong Li	1cab95066d	Add the return type as a column for SHOW FUNCTIONS. Also includes some misc pattern matching cleanup. Change-Id: I6c9ec78b094a73864b4d669afbd75a48c9bf9585 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2199 Tested-by: jenkins Reviewed-by: Nong Li <nong@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/2271	2014-04-17 17:58:13 -07:00
Nong Li	87295a4e06	Decimal implementation. This patch implements decimal support for text based formats. Change-Id: I8e2c9e512ed149fe965216a72cb21fffd4f18e75 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1669 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: Nong Li <nong@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/2238 Tested-by: jenkins	2014-04-14 21:07:32 -07:00
ishaan	5803e6883e	Cleanup and re-enable some tests in TestPartitionMetadata Partition metadata tests were marked as xfail because of IMPALA-624. Additionally, we had to invoke hive to insert into two partitions pointing to the same location (this limitation is now removed). This patch changes the test to use Impala exclusively, removes the xfail tag and adds a teardown method to the test class. Change-Id: I15fa97bef4f8714d0873a9c713627a198f3388ad Reviewed-on: http://gerrit.ent.cloudera.com:8080/2086 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2215	2014-04-13 17:55:43 -07:00
ishaan	0e0c480262	Re-enable some tests in test_describe_formatted A few tests which dealt with running queries via hs2 and impala were marked as xfail as hiveserver2 would occasionally not come up. Given that we now have a script that checks whether hiveserver2 is up before continuing the build, it should be safe to remove the xfail. Change-Id: I2b5063e7259c01fc0ef8ffda86d85514c9cf959c Reviewed-on: http://gerrit.ent.cloudera.com:8080/2082 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2214 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>	2014-04-13 17:51:45 -07:00
ishaan	6f416dd2c2	Close all queries in test_cancellation The queries in test_cancellation are currently cancelled but not closed, causing some test queries to eventually time out because the admission controller limits are passed. This patch ensures that all queries issued in test_cancellation are closed. Change-Id: I65b26672155e31889bb6f43d3ac87be0f7b4eb72 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2187 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2213	2014-04-13 17:45:51 -07:00
Nong Li	1a3caca8c4	[CDH5] Update execution engine to take advantage of DN caching. This finishes up the support to use HDFS caching. The scheduler will prefer replicas that are cached and the scan node plumbs the metadata to the io mgr. This is a bit hard to test without a cluster and some perf benchmarking. I've added a basic test to make sure the path is being exercised. Change-Id: I8762ca9ef2f88c3637113d3c5ee82f4c0ea7f1be Reviewed-on: http://gerrit.ent.cloudera.com:8080/2212 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins	2014-04-13 17:11:21 -07:00
Skye Wanderman-Milne	e60bf29a96	IMPALA-13: Use SSE string functions that take an explicit length This patch modifies DelimitedTextParser and StringValue to work with data containing null characters by using SSE instructions that take a length, rather than expecting null-terminated strings. It also adds some other minor changes to correctly handle data with nulls and to facilitate testing. I checked the execution time of a count() and a select() limit 1 query locally, and saw no difference for either text or sequence files. Change-Id: Ia920b35bea7048aa286f39ec83e313c2a39251d1 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2110 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/2181	2014-04-11 11:16:24 -07:00
Alex Behm	2fff51d9e9	IMP-1329,IMPALA-924: Make ExchangeNode::Open() block until rows are available. The bug: Coordinator::Wait() is supposed to block until rows become available for consumption by the client. We rely on Wait() to determine when to advance the query status to a 'ready' state and signal to the client that rows can be fetched. Long fetch times can trigger client timeouts at various levels (socket, app, etc.). Coordinator::Wait() simply opens the coordinator fragment's plan tree. For most plan nodes, Open() does work to prepare the plan tree, s.t. GetNext() returns quickly. However, for ExchangeNodes, Open() used to not wait until rows are obtained from the underlying stream receiver. The fix: Make ExchangeNode::Open() block until rows are available. Change-Id: I7b197eea11d21fd732414d96c899a17b2d99631c Reviewed-on: http://gerrit.ent.cloudera.com:8080/2128 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2185	2014-04-10 23:49:38 -07:00
Alex Behm	91db96d903	IMPALA-762: Add the query status to Beeswax::get_log() and pick it up in the Impala shell. COMPUTE STATS is an async DDL command. When COMPUTE STATS fails it will set the query status of the QueryExecState properly, but the original Beeswax::query() RPC won't throw. The Impala shell sometimes did not pick up and display the query status because no RPC actually threw. To fix this, I modified Beeswax::get_log() to include the query status if it is not ok. The shell looks for a special prefix to distinguish the query status from the runtime state error log. Change-Id: I0d9dbf0801629a37de22ea4ebb6d2e5d53b836ef Reviewed-on: http://gerrit.ent.cloudera.com:8080/1899 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/2063	2014-04-10 15:47:06 -07:00
Henry Robinson	37236845b1	Mark test_non_codegen_tinyint_grouping as execute_serially The test contains an INSERT and some DDL, which is racy if performed in parallel. Change-Id: I2b88533f45756fcf6372d6ee4eb7edd474087048 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2167 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Reviewed-by: Alex Behm <alex.behm@cloudera.com> Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: Henry Robinson <henry@cloudera.com> (cherry picked from commit 8b103c029cc341bacea4746c369bb58e6af5ed29) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2182 Tested-by: jenkins	2014-04-10 15:17:25 -07:00
Henry Robinson	415540d789	IMPALA-901: Fix grouping with NULLs when codegen is disabled The standard implementation of HashTable::Equals() did not correctly check the NULL bit when the argument row did not evaluate to NULL for a given probe expr. In the rare circumstance that this gave rise to a false positive (more on that below), two rows with different grouping values would be considered equal, and one would be excluded from the final aggregation output. HashTable::EvalRow() fills an expression value buffer with the values of either probe or build exprs evaluated for the argument row. These cached values are used to determine row equality in Equals(). In order to avoid a lot of false collisions, an 'unlikely' value is written to that buffer for NULL values, chosen to be HashUtil::FNV_SEED. So without correct NULL-bit checking in Equals(), two single-slot rows are considered to be equal if one of them has NULL for its slot, and the other has a value equal to HashUtil::FNV_SEED truncated to the size of the slot. For tinyint columns, this value is -59. As it happens, our random generator happened to create a table with one tinyint column and which contained NULL and -59 as values. In order to trigger this bug, the rows must also have been written to disk in order such that the scanners returned -59 first, and then NULL to the aggregation node; the bug is not symmetric and works in the opposite case. Change-Id: I17d43eaeee62b2ac01b67dd599bc4346b012a074 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2130 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Tested-by: jenkins (cherry picked from commit 6e8098254280a9d5ead0b607263ca6728a3222a7) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2161 Reviewed-by: Henry Robinson <henry@cloudera.com>	2014-04-07 17:30:52 -07:00
Henry Robinson	99c37aac37	IMPALA-827: Add an option for directories created by INSERT to inherit their parent's permissions This patch adds --insert_inherit_permissions. If true, all new partition directories created by INSERT will inherit their permissions from their parent. When false, the directories are created with the default permissions. Change-Id: Ib2b4c251e51ea5048387169678e8dde34ecfe5f6 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1917 Tested-by: jenkins Reviewed-by: Henry Robinson <henry@cloudera.com>	2014-04-04 10:25:20 -07:00
Matthew Jacobs	cd2dc3e2bd	Fix test_failpoints to close queries after cancel Change-Id: I4f272ccec84030d8b4f85d0e1554a042ee26be30 Reviewed-on: http://gerrit.ent.cloudera.com:8080/2092 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins (cherry picked from commit d42aad459a68991fc489caf1edbca10ea599d28a) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2116 Reviewed-by: Matthew Jacobs <mj@cloudera.com>	2014-03-28 18:47:25 -07:00
Skye Wanderman-Milne	8e9776b824	Mark TestUdfs.test_mem_limits to run serially This was causing other tests to fail with process mem limit exceeded. Change-Id: I1407b0896052aece691c681827994961b09d8103 (cherry picked from commit 2bcc46117f504f50ded724fddf74f24bd829c6c6) Reviewed-on: http://gerrit.ent.cloudera.com:8080/2003 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-03-19 14:18:11 -07:00
Skye Wanderman-Milne	3e728f3180	Symbol mangling for UDF prepare/close functions Change-Id: If8f1386073f467e66ada74e606fc98f3344f0733 (cherry picked from commit 32df8b3f963a2b46ec33aad86a151d4c7ecda39c) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1993 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: jenkins	2014-03-19 02:15:07 -07:00
Lenni Kuff	70c05d4caa	IMPALA-897: shell does not close queries after completion when running from a script The problem was that we were setting a flag marking the last_query_handle as closed, but were not resetting the flag before the next query. This caused the first query to be closed properly, but subsequent queries would not be closed. The fix is to change where the flag is reset to the same place as where we assign last_query_handle. Added a test case. Change-Id: I870a96789489bfe4f388910b808409cd0584af8a (cherry picked from commit 1439151af5b63112b0dd631fac9c7ab4d43bba37) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1976 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins	2014-03-18 18:46:54 -07:00
Lenni Kuff	9c3b318112	Fix test_compressed_formats to properly pull in tbl created in Hive Change-Id: I4e143826e5900ebfa6f77023ae4cf0d2c71db190 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1960 Reviewed-by: Ishaan Joshi <ishaan@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1967 Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-03-18 13:24:10 -07:00
Lenni Kuff	b7432cd68a	Constrain test_explain to run only on text/none table format The tests expect to be run against text/none tables which causes failures on exhaustive test runs. I don't think it adds any extra coverage to run these tests against lzo format so added a constraint. Change-Id: Ib0878e2ba84107c9df4499def304fe45ba4fe4b4 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1884 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: Lenni Kuff <lskuff@cloudera.com> Reviewed-on: http://gerrit.ent.cloudera.com:8080/1964 Tested-by: jenkins	2014-03-18 11:51:19 -07:00
Skye Wanderman-Milne	44125729dc	UDF/UDA memory management improvements * AggFnEvaluator now uses the UDF mem pool (I'm planning to change this to per-exec node pools in the expr refactoring) * FunctionContext::TrackAllocation()/Free() actually use the UDF's mem tracker * Added FunctionContextImpl::Close() which sets warnings for leaked allocations Change-Id: I792ffd49102a92b57e34df18d8ff5f5d0fd27370 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1792 Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com> Tested-by: Skye Wanderman-Milne <skye@cloudera.com> (cherry picked from commit 41a5f7cfa718789fa3b2de3a31f085411fb5000c) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1954 Tested-by: jenkins	2014-03-17 20:38:25 -07:00
Lenni Kuff	d7c06486e1	Disable flaky explain tests due to inconsistent per-host mem requirements Change-Id: Ie372696c4986dc7f7c8f7fc074c41b89bd65f456 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1939 Reviewed-by: Lenni Kuff <lskuff@cloudera.com> Tested-by: jenkins (cherry picked from commit ed4cb660b7a60d9b9248df525c477bab4d218c4b) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1953 Tested-by: Lenni Kuff <lskuff@cloudera.com>	2014-03-17 17:42:21 -07:00
Henry Robinson	635dd7d289	IMPALA-875: Respect isAnalyzed_ in IntLiteral expressions Partition column expressions are analysed twice for INSERT statements - once to infer the type and so to add a possible cast, and once to compute stats on the resulting expr. However, this process resulted in a partition column expr that was an IntLiteral getting the smallest type that would contain its value, rather than retaining the column-compatible type that had been assigned to it. This patch does the minimum thing, which is make IntLiteral.analyze() idempotent. Doing the same thing to Expr and LiteralExpr unearths some other bugs, which we will have to fix in a follow-on patch (see IMPALA-884). Change-Id: Ie22fc5d3f4832c735a1ebc0ef78f50d736f597fd Reviewed-on: http://gerrit.ent.cloudera.com:8080/1931 Reviewed-by: Henry Robinson <henry@cloudera.com> Tested-by: jenkins (cherry picked from commit 1912d65ea21a5025d385948642f0d4aadad91abf) Reviewed-on: http://gerrit.ent.cloudera.com:8080/1947	2014-03-17 17:35:12 -07:00
Lenni Kuff	dd20958e5d	Minor test cleanup * Prefer 'refresh <table name>' over 'invalidate metadata' * Remove the 'RELOAD' test setup option that was used by only 1 test. * Delete a .py test file that seems to be a duplicate Change-Id: I890546635840bb8f4d55789a89f8c8f33e40d001 Reviewed-on: http://gerrit.ent.cloudera.com:8080/1933 Reviewed-by: Nong Li <nong@cloudera.com> Tested-by: jenkins Reviewed-on: http://gerrit.ent.cloudera.com:8080/1946 Reviewed-by: Lenni Kuff <lskuff@cloudera.com>	2014-03-17 17:30:15 -07:00
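
The NULL-handling bug described in the IMPALA-901 entry (415540d789) can be illustrated with a small model. This is only a sketch in Python, not Impala's C++ code: the function names are hypothetical, and the seed value 0x811C9DC5 (the standard 32-bit FNV offset basis) is an assumption that happens to agree with the -59 quoted in the commit message. The model is symmetric, so it does not reproduce the order dependence the message mentions.

```python
# Simplified model of the IMPALA-901 false positive; names are hypothetical.

# Assumption: HashUtil::FNV_SEED is the standard 32-bit FNV offset basis.
FNV_SEED = 0x811C9DC5


def truncate_to_tinyint(value):
    """Keep the low byte and reinterpret it as a signed 8-bit integer."""
    low_byte = value & 0xFF
    return low_byte - 256 if low_byte >= 128 else low_byte


# The 'unlikely' value written to the buffer in place of a NULL tinyint.
NULL_SENTINEL = truncate_to_tinyint(FNV_SEED)


def cached_value(slot):
    """Value stored in the expression buffer; None stands for SQL NULL."""
    return NULL_SENTINEL if slot is None else slot


def buggy_equals(a, b):
    """Compares cached values only, ignoring the NULL bit."""
    return cached_value(a) == cached_value(b)


def fixed_equals(a, b):
    """Checks the NULL bit first, then compares cached values."""
    if (a is None) != (b is None):
        return False
    return cached_value(a) == cached_value(b)


print(NULL_SENTINEL)            # -59
print(buggy_equals(None, -59))  # True: NULL and -59 collapse into one group
print(fixed_equals(None, -59))  # False
```

With the NULL check in place, a NULL row and a -59 row land in separate groups, while two NULL rows still compare equal.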


357 Commits