impala

mirror of https://github.com/apache/impala.git synced 2026-01-06 06:01:03 -05:00

Author	SHA1	Message	Date
Hayabusa-intel	4e7172f6f5	IMPALA-2459: Implement next_day date/time UDF Returns the date of the weekday that follows a particular date. The weekday argument is a string literal indicating the day of the week. Also this argument is case-insensitive. Available values are: "Sunday"/"SUN", "Monday"/"MON", "Tuesday"/"TUE", "Wednesday"/"WED", "Thursday"/"THU", "Friday"/"FRI", "Saturday"/"SAT". For example, the first Saturday after Wednesday, 25 December 2013 is on 28 December 2013. select next_day('2013-12-25','Saturday') returns '2013-12-28 00:00:00' select next_day(to_timestamp('08-1987-21', 'MM-yyyy-dd'), 'FRIDAY') returns '1987-08-28 00:00:00' Change-Id: I2721d236c096639a9e7d2df8a45ca888c6b3e83e Reviewed-on: http://gerrit.cloudera.org:8080/1943 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Lars Volker <lv@cloudera.com>	2016-06-09 04:30:48 -07:00
Alex Behm	025fd3bd7f	IMPALA-3646: Handle corrupt RLE literal or repeat counts of 0. Adds handling and testing for a specific Parquet data corruption scenario with plain dictionary encoded values. The problematic scenario is when the repeat or literal count of the RLE-encoded dictionary indexes is decoded as 0 - an invalid value. There are several other cases of data corruption that are not yet handled gracefully. This patch only handles one specific case. Change-Id: Ibf406c82cdded37966f09c81e4cc1446d2b60d63 Reviewed-on: http://gerrit.cloudera.org:8080/3299 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Alex Behm <alex.behm@cloudera.com>	2016-06-07 17:29:59 -07:00
Michael Ho	86ff18eee9	IMPALA-3223: Removal of non-toolchain builds. This change removes the option to build without specifying the environment variable $IMPALA_TOOLCHAIN. By default, if it's not set, sourcing impala-config.sh will set it to $IMPALA_HOME/toolchain. A user can override it by setting $IMPALA_TOOLCHAIN to his/her own toolchain directory. The user can also set $SKIP_TOOLCHAIN_BOOTSTRAP to true to avoid running the toolchain bootstrapping script (e.g. a particular component in toolchain is at a version not checked into S3). $IMPALA_TOOLCHAIN holds some third party binaries which Impala relies on. They can be compiled from source in the native toolchain which is public. This commit also removes build_thirdparty.sh as it's no longer used. By default, Impala will be built with the compiler in $IMPALA_TOOLCHAIN but this option can be overridden by setting environment variable $USE_SYSTEM_GCC to 1. Change-Id: I42b60e99fb9caf1294be7ab242856ca3b9a5ab73 Reviewed-on: http://gerrit.cloudera.org:8080/3259 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Michael Ho <kwho@cloudera.com>	2016-06-07 17:29:59 -07:00
Tim Armstrong	d23e5505c8	IMPALA-3670: fix sorter buffer mgmt bugs Also make test_scratch_disk.py more deterministic, by using max_block_mgr_memory, which doesn't include scanner memory. The fixed test_scratch_disk.py exercises the other sorter bug that occurs when scratch cannot be written. Testing: Added a test that does a sort with various memory limits and consumes the whole output of the sorter (we have many tests of sorts with limits but limited coverage of sorts without limits). Ran an exhaustive test run before posting for review. This added test reproduced one of the sorter bugs, where var-len blocks were not always attached to the output batch. The other bug was reproduced by the test change in IMPALA-3669: test_scratch_disk fix. Change-Id: Ia1a0ddffa0a5b157ab86a376b7b7360a923698d6 Reviewed-on: http://gerrit.cloudera.org:8080/3315 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Tim Armstrong <tarmstrong@cloudera.com>	2016-06-06 22:34:19 -07:00
Tim Armstrong	ee53ddb389	IMPALA-1346/1590/2344: fix sorter buffer mgmt when spilling The Sorter's memory management logic failed to correctly manage buffers when spilling. It would try to make use of all buffers in the system, neglecting to account for other operators' buffer usage. This patch adjusts the logic so that it handles contention for buffers so long as it can get enough buffers to make progress. Instead of precalculating the number of buffers it thinks it should be able to pin, it just makes a best-effort attempt to pin the initial buffers of as many runs as possible, up to a limit. As long as it can pin three runs, it can make progress. Testing: Added an additional test that failed before the patch without OOM. An analytic function test that was meant to fail also started succeeding so I had to adjust the limit there too. Change-Id: Idfe55cc13c7f2b54cba1d05ade44cbcf6bb573c0 Reviewed-on: http://gerrit.cloudera.org:8080/2908 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Tim Armstrong <tarmstrong@cloudera.com>	2016-06-06 17:34:07 -07:00
Tim Armstrong	37ec25396f	IMPALA-3344: Simplify sorter and document/enforce invariants. Clarify relationships between classes, clean up the previous mess where every class was friends with the other so there's an actual distinction between public and private members. TupleIterator is now no longer tied to TupleSorter, just Run. Document and enforce invariants in many cases. Factor out some functions from large functions. Simplify and document iterator logic. Make management of buffers when iterating over output stream more explicitly correct: either use MarkNeedToReturn() or attach block to the batch as appropriate. The SortedRunMerger didn't handle resource transfer correctly, except if all the memory came from the batch's MemPool. This patch fixes the cases when resources are attached to the batches, but not the 'need_to_return' case. Document that SortedRunMerger requires 'deep_copy_input' to be true if batches can have the 'need_to_return' flag set. Also use the atomic block exchange operation when moving between blocks in unpinned runs to prevent pin failures at that point. I explicitly have avoided changing the hairy block management logic when allocating buffers for merging; that will need addressing in a follow-up patch. Add a SpilledRuns counter so that it's more explicit that spilling occurred. Testing: Added some tests for corner cases with empty and NULL strings. Fixed a test that previously failed with OOM but now succeeds. Performance: Benchmarking against old code initially revealed some regressions from changes in inlining. Force inlining the TupleComparator::operator() and iterator Next()/Prev() functions helped and performance seems similar or slightly better on the targeted orderby benchmarks. Change-Id: I9c619e81fd1b8ac50e257172c8bce101a112b52a Reviewed-on: http://gerrit.cloudera.org:8080/2826 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Tim Armstrong <tarmstrong@cloudera.com>	2016-06-02 21:33:08 -07:00
Michael Ho	b14ca6d09f	IMPALA-3645: Free probe expressions' local allocations in ConstructBuildSide() With the prefetching changes, the probe expressions' local allocations are no longer freed via QueryMaintenance() in PHJ. Instead, they are freed explicitly in GetNext() after an entire probe batch has been processed. Due to this change in how we handle local allocations of probe expressions, a DCHECK was added to verify that there is no local allocation from the probe expression in ProcessBuildInput(). Turns out that Expr::Open() called in ConstructBuildSide() on the probe expressions may have caused local allocations to occur for certain UDFs (e.g. extract()). This change handles the situation above by freeing local allocations of the probe expressions once before calling ProcessBuildInput() in ConstructBuildSide(). A new regression test is also added for this specific case. Change-Id: I2096ca3e2093c5ab0ecc0e7ca4cd1b5f3c1ed1ed Reviewed-on: http://gerrit.cloudera.org:8080/3253 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Internal Jenkins	2016-06-02 09:32:54 -07:00
Thomas Tauber-Marshall	710fa06b7c	IMPALA-3639: expr-test fails on ASAN In ExprTest::GetValue, we create a local string and then end up returning a reference to that string, resulting in a memory error. The mistake wasn't obvious from looking at the code due to the convoluted way that GetValue and ConvertValue work. This patch modifies GetValue and ConvertValue to be simpler and eliminates the memory error. Change-Id: I040179ee44782a22c88b810ff97612aaa89839f4 Reviewed-on: http://gerrit.cloudera.org:8080/3278 Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com> Tested-by: Internal Jenkins	2016-06-02 09:32:54 -07:00
Michael Ho	5f3996e6d1	IMPALA-3181: Add noexcept to some functions This commit adds noexcept specifier to some cross-compiled functions which are known to not throw exceptions. This helps avoid some exception related instructions (e.g. invoke, landingpad) in the IR. Change-Id: I96bd2fec6c14771acae1e700bed958951368ee77 Reviewed-on: http://gerrit.cloudera.org:8080/3256 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-06-02 09:32:54 -07:00
Thomas Tauber-Marshall	5231301084	IMPALA-1633: GetOperationStatus should set errorMessage and sqlState Currently, we never populate the errorMessage or sqlState fields of TGetOperationStatusResp when the GetOperationStatus HiveServer2 rpc is called. This patch checks if the query has an error status and if so sets errorMessage and sqlState. GetOperationStatus also now takes the QueryExecState lock since QueryExecState::query_state_ and QueryExecState::query_status_ are supposed to be protected by it. Additionally, this patch performs some cleanup and adds some documentation around our behavior for updating QueryExecState::query_state_/query_status_. This also addresses IMPALA-3298: TGetOperationStatusResp missing error message when data is expired Change-Id: Icb792f88286779fcf2ce409828de818bc4e80bed Reviewed-on: http://gerrit.cloudera.org:8080/3094 Reviewed-by: Thomas Tauber-Marshall <tmarshall@cloudera.com> Tested-by: Internal Jenkins	2016-06-01 19:32:39 -07:00
Tim Armstrong	585ee48dc7	IMPALA-3647: track runtime filter memory in separate tracker This change breaks out runtime filter memory consumption from the query-wide tracker to improve debuggability of memory limit exceeded errors. Testing: ran exhaustive tests, ran local and cluster stress tests. Change-Id: I9f28f3b55b5c62e6f0f9838c5947c9446d444d20 Reviewed-on: http://gerrit.cloudera.org:8080/3247 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Internal Jenkins	2016-05-31 23:32:12 -07:00
Tim Armstrong	4edb8bb60d	IMPALA-3633: cancel fragment if coordinator is gone The bug is that return_val.status is an optional field, so setting the status without __isset is equivalent to Status::OK(). This meant that the fragment did not get notified when reporting status if the coordinator had gone away. This means that if a cancel RPC was lost, we could be left with zombie fragments with no coordinator that kept on running until completion. Testing: I couldn't see a way to replicate this reliably with our existing test setup, since it requires some RPCs to be dropped to get into this state. I manually tested by commenting out CancelRemoteFragments(), starting a long-running query then cancelling it. Before the patch, perf top showed that the fragments continue to execute the query. After the patch, the fragments stopped executing quickly. Change-Id: I62ab6f4df7c0ee60c6aa6291513f9f0cbfac3fe7 Reviewed-on: http://gerrit.cloudera.org:8080/3238 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-31 23:32:12 -07:00
Lars Volker	5be7c68ed8	IMPALA-3627: Clean up RPC structures in ImpalaInternalService This change is a pre-requisite for IMPALA-2550. Change-Id: I0659c94f6b80bd7bbe0bd150ce243f9efa9a41ad TODO: Write commit message Reviewed-on: http://gerrit.cloudera.org:8080/3202 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Internal Jenkins	2016-05-31 23:32:12 -07:00
Taras Bobrovytsky	98d7b8a90d	IMPALA-3163: Fix Decimal to Timestamp casting Before this patch, we would first convert the Decimal to Double, then Double to Timestamp. This resulted in imprecise results. I ran a benchmark where we read decimal values from a large parquet table and cast them to timestamp. The new correct implementation is slightly slower than the old one (101 seconds vs 70 seconds). Change-Id: Iabeea9f4ab4880b2f814408add63c77916e2dba9 Reviewed-on: http://gerrit.cloudera.org:8080/3154 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-31 23:32:11 -07:00
Tim Armstrong	4896895988	IMPALA-3619: disable IR symbols by default These come with significant memory overhead, meaning that the memory usage of the debug build diverges significantly from the release build. We should disable them by default. They can be enabled by setting ENABLE_IMPALA_IR_DEBUG_INFO=true. Change-Id: Ia5426fe3f8be0b7a100c0c3683c8ef1eaf507146 Reviewed-on: http://gerrit.cloudera.org:8080/3223 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-31 23:32:11 -07:00
Lars Volker	d16e83214a	IMPALA-3581: Change location of minidump folders to log_dir Currently the default minidump location is /tmp/impala-minidumps, which can be wiped on reboot on various distributions. This change moves the default location to FLAGS_log_dir/minidumps/$daemon. The additional trailing $daemon folder is kept to prevent name collisions in case of local test clusters and strangely configured installations. For local test clusters the minidumps will be written to $IMPALA_HOME/logs/cluster/minidumps/{catalogd,impalad,statestored}. Change-Id: Idecf5a314bfb8b0870e8aa4819c4fb39a107702f Reviewed-on: http://gerrit.cloudera.org:8080/3171 Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com> Tested-by: Internal Jenkins	2016-05-31 23:32:11 -07:00
Sailesh Mukil	6f1fe4ebe7	IMPALA-3577, IMPALA-3486: Partitions on multiple filesystems breaks with S3_SKIP_INSERT_STAGING The HdfsTableSink usually creates an HDFS connection to the filesystem that the base table resides in. However, if we create a partition in a FS different than that of the base table and set S3_SKIP_INSERT_STAGING to "true", the table sink will try to write to a different filesystem with the wrong filesystem connector. This patch allows the table sink itself to work with different filesystems by getting rid of a single FS connector and getting a connector per partition. This also reenables the multiple_filesystems test and modifies it to use the unique_database fixture so that parallel runs on the same bucket do not clash and end up in failures. This patch also introduces a SECONDARY_FILESYSTEM environment variable which will be set by the test to allow S3, Isilon and the localFS to be used as the secondary filesystems. All jobs with HDFS as the default filesystem need to set the appropriate environment for S3 and Isilon, i.e. the following: - export AWS_SECRET_ACCESS_KEY - export AWS_ACCESS_KEY_ID - export SECONDARY_FILESYSTEM (to whatever filesystem needs to be tested) TODO: SECONDARY_FILESYSTEM and FILESYSTEM_PREFIX and NAMENODE have a lot of similarities. Need to clean them up in a following patch. Change-Id: Ib13b610eb9efb68c83894786cea862d7eae43aa7 Reviewed-on: http://gerrit.cloudera.org:8080/3146 Reviewed-by: Sailesh Mukil <sailesh@cloudera.com> Tested-by: Internal Jenkins	2016-05-31 23:32:11 -07:00
Tim Armstrong	8d2320df26	IMPALA-3597: mislabelled cache levels on debug webpage Change-Id: I638f518b6f460bea6724c1b1efd4c4aefecf5219 Reviewed-on: http://gerrit.cloudera.org:8080/3210 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-31 23:32:10 -07:00
Bharath Vissapragada	e26dc85684	IMPALA-3554: Use kerberos principal in SentryProxy class For kerberized clusters, users expect the Catalog service to use the kerberos principal instead of the operating system user that runs the Catalog process. This patch fixes that. Change-Id: I842e558e59023c7d937796a4cac51a013d948e02 Reviewed-on: http://gerrit.cloudera.org:8080/3165 Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com> Tested-by: Internal Jenkins	2016-05-31 23:32:10 -07:00
Michael Ho	0b7ae6e4eb	IMPALA-3223: Relocate squeasel and mustache directories This change moves the source and header files of squeasel and mustache to be/src/thirdparty. This is a step towards removing thirdparty as a preparation to move to ASF. There is also corresponding change to Impala-lzo to update its include path. Change-Id: I782e493bc28086a1587274b3c474ea6b6f201855 Reviewed-on: http://gerrit.cloudera.org:8080/3206 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Michael Ho <kwho@cloudera.com>	2016-05-31 23:31:41 -07:00
Tim Armstrong	6198d9262e	Refactor RuntimeState and ExecEnv dependencies Previously including runtime-state.h or exec-env.h pulled in a huge number of headers. By replacing all of those includes with forward declarations, we can reduce the number of headers included when building each source file. This required various changes, including splitting header files, and in one case extracting the nested DiskIoMgr::RequestContext class so that the RequestContext can be instantiated without the full DiskIoMgr header. The payoff is that touching many header files results in significantly smaller incremental builds. E.g. changes to bloom-filter.h only require recompiling a handful of files, instead of 100+. Build time of individual files should also be slightly quicker, since they pull in fewer headers. Change-Id: I3b246ad9c3681d649e7bfc969c7fa885c6242d84 Reviewed-on: http://gerrit.cloudera.org:8080/3108 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-25 19:41:45 -07:00
Matthew Jacobs	f413e236a8	IMPALA-3579: Strict handling of numeric overflow in text parsing Adds a query option 'strict_mode' which treats integer and floating pt overflows as parse errors. In the past, overflows were ignored and the max value was returned. When this query option is set, overflowing values are treated as if they were completely invalid data, i.e. NULL is returned. When abort_on_error is enabled, this means the query is aborted. Notes: * DECIMAL overflow/underflow is already treated as an error. * The handling in text-converter treats underflows the same as overflows, so they would result in the same behavior. However, floating point parsing never returns an underflow today. * We may also want to handle numeric values that are truncated when parsing to integer types, e.g. 10.5 -> 10. Change-Id: I7409c31ec0cb6fe0b2d9842b9f58fe1670914836 Reviewed-on: http://gerrit.cloudera.org:8080/3150 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Tested-by: Internal Jenkins	2016-05-23 08:40:20 -07:00
Bharath Vissapragada	49610e2cfa	IMPALA-3314/IMPALA-3513: Fix querying tables/partitions altered to Avro format Bug: Impalads crash if we query an Avro table with stale metadata Cause: This happens because avroSchema_ is not set in HdfsTable, which is not propagated to the avro scanner and it doesn't have appropriate checks to make sure the schema is non-null. The patch fixes the following. 1. Avro scanner should gracefully handle the case where the avro schema is not set. Appropriate null checks and a meaningful error message have been added. 2. This is a special case with multi-fileformat partitioned tables. avroSchema_ should be set in HdfsTable even if any subset of the partitions are backed by avro. Without this patch, we only set it if the base table file format is Avro. Change-Id: I09262d3a7b85a2263c721f3beafd0cab2a1bdf4b Reviewed-on: http://gerrit.cloudera.org:8080/3136 Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com> Tested-by: Internal Jenkins	2016-05-23 08:40:20 -07:00
Michael Ho	0243a21da8	IMPALA-3242: Remove most usages of RuntimeState::SetMemLimitExceeded() There are multiple places in the code which call RuntimeState::SetMemLimitExceeded(). Most of them are unnecessary as the error status constructed will eventually be propagated up the tree of exec nodes. There is no obvious reason to treat query memory limit exceeded differently. In some cases such as scan-node, calling SetMemLimitExceeded() is actually confusing as all scanner threads may pick up error status when any thread exceeds query memory limit, causing a lot of noise in the log. This change replaces most calls to RuntimeState::SetMemLimitExceeded() with MemTracker::MemLimitExceeded(). The remaining places are: the old hash table code, the UDF framework and QueryMaintenance() which checks for memory limit periodically. The query maintenance case will be removed eventually once IMPALA-2399 is fixed. Change-Id: Ic0ca128c768d1e73713866e8c513a1b75e6b4b59 Reviewed-on: http://gerrit.cloudera.org:8080/3140 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Internal Jenkins	2016-05-23 08:40:19 -07:00
Tim Armstrong	7d5d36a6e4	Use MemTracker::MemLimitExceeded() where appropriate This is an incremental improvement towards IMPALA-3090. Where possible we use MemTracker::MemLimitExceeded() instead of directly constructing the Status object. The remaining cases where we directly construct the state are related to the BufferedBlockMgr, which will be deprecated: either they are produced by the BufferedBlockMgr, or produced when a Pin() unexpectedly fails. Both of these will go away anyway. Change-Id: I77c37f86dd15ace39e28b5cc72d37bc8d4109041 Reviewed-on: http://gerrit.cloudera.org:8080/3148 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-23 08:40:19 -07:00
Tim Armstrong	1ccfc45d41	IMPALA-3569: handle errors in timezone db initialization Change-Id: I6b4d5e6b992ea023f801edb7b487e57f39920c03 Reviewed-on: http://gerrit.cloudera.org:8080/3125 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-23 08:40:19 -07:00
Thomas Tauber-Marshall	51869eac56	IMPALA-3542: do_as_user empty check missing Fixes a typo in ImpalaServer::AuthorizeProxyUser where we check that the 'user' parameter isn't empty twice instead of also checking the 'do_as_user' parameter. Change-Id: I8e3962f6f397804e37d4f2c667e97b55bd3ca2bf Reviewed-on: http://gerrit.cloudera.org:8080/3120 Reviewed-by: Matthew Jacobs <mj@cloudera.com> Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-23 08:40:19 -07:00
Michael Ho	f7501d2ec1	IMPALA-3332: Free local allocations in sorter. Sorter can have runaway memory consumption as it never frees local allocations made in comparator_.Less(). In addition, it doesn't check for errors generated during expression evaluation so it may keep sorting even after failures have occurred. This change fixes the problem by freeing local allocations for every n invocations of comparator_.Less() where n is the row batch size specified in the query options. Various error checks are also added to return early if any error is encountered. Change-Id: I941729b4836e5dbb827d4313a0b45bc5df2fa8e1 Reviewed-on: http://gerrit.cloudera.org:8080/3116 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Internal Jenkins	2016-05-23 08:40:18 -07:00
Tim Armstrong	38416eeeb9	IMPALA-3546: don't die if sysconf() reports bogus cache info With RHEL5 on AWS EC2 for example, sysconf() returns bad info about cache line sizes. We should tolerate this instead of bringing down impalad. Change-Id: Id4d61b05fe213028a7e9aaabe98adc2792b90e07 Reviewed-on: http://gerrit.cloudera.org:8080/3111 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-23 08:40:15 -07:00
Tim Armstrong	cb2a3aacd6	Turn on C++14 in cross-compiled code Enabling this revealed a latent bug where a #include was wrapped in the impala namespace, resulting in the functions being defined in the wrong namespace. Change-Id: If723167b2d03da7592b64a204e31e81ea868e4f2 Reviewed-on: http://gerrit.cloudera.org:8080/3024 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-18 14:40:34 -07:00
Dimitris Tsirogiannis	f992dc7f88	IMPALA-2956: Filters should be able to target multiple scan nodes With this commit runtime filters can be assigned to multiple destination nodes (scans). For each filter, the destination nodes are determined using equivalence classes during planning. For each filter, all its destination nodes are in the left subtree rooted at the join node that constructs this filter. A runtime filter may have both local and remote targets. The backend determines how to route each filter depending on the number and type (local, remote) of its destination nodes. With this commit, we enable runtime filter propagation in all the operands of UNION [ALL|DISTINCT] nodes. Change-Id: Iad2ce4e579a30616c469312a4e658140d317507b Reviewed-on: http://gerrit.cloudera.org:8080/2932 Reviewed-by: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com> Tested-by: Internal Jenkins	2016-05-18 01:40:22 -07:00
Tim Armstrong	265e39f89a	IMPALA-3168: replace HashTable parameters with constants This addresses the regression for small-ndv aggs resulting from prefetching. The idea is that for small-ndv aggs prefetching increases the # of instructions and memory references, but doesn't provide any compensating benefit. This change replaces constant values in the hash-table code, which reduces the instruction count and # of memory references in aggregations. Preliminary perf results show that a low-NDV decimal agg is around 20% faster (2.1s -> 1.7s) and a high-NDV decimal agg is around 7% faster (15s -> 14s). I haven't investigated how much of the speedup is reduced codegen time. Change-Id: I483a19662c90ca54bc21d60fd6ba97dbed93eaef Reviewed-on: http://gerrit.cloudera.org:8080/3088 Tested-by: Internal Jenkins Reviewed-by: Dan Hecht <dhecht@cloudera.com>	2016-05-17 10:09:06 -07:00
Tim Armstrong	9f4276eea8	IMPALA-3286: prefetching for PartitionedAggregationNode This patch builds on top of the prefetching infrastructure to add prefetching to PartitionedAggregationNode. Input batches are evaluated in prefetch groups and hash table buckets are prefetched if the prefetch_mode query option is set to HT_BUCKET. We avoid some pointer indirections on the critical path by caching hash tables in a 'hash_tbls_' array. There is also a bit of cleanup to directly instantiate the templated ProcessBatch() method to remove the ProcessBatch_true() and ProcessBatch_false() hack, and also to separate out ProcessBatchNoGrouping() so that it doesn't have to have the same argument list as ProcessBatch(). Co-author: Michael Ho <kwho@cloudera.com> Change-Id: I7726454efb416d61080c4e11db0ee7ada18c149b Reviewed-on: http://gerrit.cloudera.org:8080/3070 Reviewed-by: Michael Ho <kwho@cloudera.com> Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-17 10:09:06 -07:00
Matthew Jacobs	9172f4b824	IMPALA-1928: Fix Thrift client transport wrapping order The thrift client incorrectly wraps the TSaslTransport around the TBufferedTransport which leads to significant performance issues. (Note that the server-side wraps the transports in the correct order already.) Currently: TSaslTransport(TBufferedTransport(socket)) Should be: TBufferedTransport(TSaslTransport(socket)) As a result, when we write a structure, we end up doing lots of write calls which hit the TSaslTransport which does no buffering. So it ends up producing output that looks like: [0, 0, 0, 1], <one char>, [0, 0, 0, 1], <one char>, etc. for each individual write call. These end up buffered so we don't get lots of tiny packets on the send side. However, on the receiver side we are doing one recv call per Sasl frame. This patch reorders the wrapping of transports in the thrift client, so that it matches the order on the thrift server, which improves exchange performance, making it within 10% of non-kerberos. Change-Id: I81d30b3d8d10fe6dcd8eb88cca49734af09f9d91 Reviewed-on: http://gerrit.cloudera.org:8080/3093 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-17 10:09:06 -07:00
Matthew Jacobs	f067929f3a	IMPALA-3535: Ignore invalid per-pool default query options In 2.5 we added the ability to set per-pool default query options. A string of key-value pairs can be specified with a pool configuration. However, if any options fail to parse, then all the options are ignored. We want that behavior (and returning an error) when parsing the process-wide default query options on startup and when parsing the options sent from a client (e.g. in beeswax server) because an error can be returned immediately for the triggering action at that time (i.e. starting the impalad or submitting a query with the options set). This behavior is bad for the pool default query options because (a) the configuration is set by the administrator and there's nothing we can do until a query is submitted and (b) one invalid option shouldn't mean that other valid options aren't set. Change-Id: If04733b775963091b0314c65286df126fd812358 Reviewed-on: http://gerrit.cloudera.org:8080/3056 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-17 10:09:05 -07:00
Michael Ho	a59408b575	IMPALA-3286: Prefetching for PHJ probing. This change pipelines the code which probes the hash tables. This is based on the idea which Mostafa presented earlier. Essentially, all rows in a row batch will be evaluated and hashed first before being probed against the hash tables. Hash table buckets are prefetched as hash values of rows are computed. To avoid re-evaluating the rows again during probing (as the rows have been evaluated once to compute the hash values), hash table context has been updated to cache the evaluated expression values, null bits and hash values of some number of rows. Hash table context provides a new iterator-like interface to iterate through the cached values. A PREFETCH_MODE query option has also been added to disable prefetching if necessary. The default mode is 1 which means hash table buckets will be prefetched. In the future, this mode may be extended to support hash table buckets' data prefetching too. Combined with the build side prefetching, a self join of table lineitem improves by 40% on a single node run on average: select count(*) from lineitem o1, lineitem o2 where o1.l_orderkey = o2.l_orderkey and o1.l_linenumber = o2.l_linenumber; Change-Id: Ib42b93d99d09c833571e39d20d58c11ef73f3cc0 Reviewed-on: http://gerrit.cloudera.org:8080/2959 Reviewed-by: Michael Ho <kwho@cloudera.com> Tested-by: Internal Jenkins	2016-05-17 01:30:12 -07:00
Casey Ching	b634a55b92	Kudu: Fix warnings from clang Changes: 1) Several places in the tests didn't check return statuses. KUDU_ASSERT_OK can only be used in functions that return void, KUDU_CHECK_OK is used otherwise. 2) The forward declared "class ColumnType" should have actually been a struct. Now there aren't any more Kudu related warnings from clang. Change-Id: Id3e2f5ec9925c3cf81c7f4048decc6a5f97eee66 Reviewed-on: http://gerrit.cloudera.org:8080/3062 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-17 01:30:12 -07:00
Alex Behm	b4558d384e	IMPALA-3539: Return error status if def/rep level caches failed to allocate. The information in the JIRA is consistent with a failure to allocate memory for the def level cache. There was a bug where this failure status was not properly propagated, so eventually a DCHECK was hit that expected the cache memory to be allocated. Change-Id: I38856e6e1f5fbdbf5327cf31a2a109e6c930901d Reviewed-on: http://gerrit.cloudera.org:8080/3065 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins	2016-05-14 01:30:01 -07:00
Skye Wanderman-Milne	1aeda141aa	IMPALA-3533: fix Tuple::CodegenMaterializeExprs() The args[] buffer was too small. I also reverted to the usual style of commenting out each unused argument Value*, which makes it slightly easier to spot this kind of bug. Change-Id: Ic2546b8f42ac0a4e0715b134c384ccf311f663c2 Reviewed-on: http://gerrit.cloudera.org:8080/3051 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-13 15:52:53 -07:00
Lars Volker	3649ff89e1	IMPALA-3019: Fix unnecessary resets of iterator In order to perform round-robin backend selection, the simple scheduler uses an iterator to the next backend entry to be selected. This iterator needs to be reset whenever it is invalidated by changes to the underlying map. The current behavior resets the pointer on every message of the statestored, even if the message was empty and thus did not result in any changes to the map. After every reset of the iterator round-robin selection starts from the first backend in the scheduler's backend map. As the statestored sends empty keepalive messages every couple of seconds, this effectively limits scheduling of remote reads to only a few backends. This change introduces a check to prevent those unnecessary iterator resets, which will spread remote reads more evenly over all backends. Change-Id: I831d485b46c7d9460fb014a302a26864b6bd573e Reviewed-on: http://gerrit.cloudera.org:8080/2330 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Internal Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/3031	2016-05-13 15:52:53 -07:00
Youwei Wang	0306dd576d	IMPALA-2809: Improve scalar ByteSwap(). This patch improves our ByteSwap() function by handling more byte sizes in the fast path, as opposed to the loop-based slow path. ByteSwap() is used heavily when scanning Parquet decimals. Before this patch, VTune showed ByteSwap() among the top three worst cycle offenders when running TPCH-Q6 on my local setup with a large lineitem table. After this patch, ByteSwap() shows no significant contribution to the overall cycles spent. There was a measurable improvement of a few percent for TPCH-Q6. Change-Id: I4f462e6bdb022db46b48889a6a7426120a80d9b4 Reviewed-on: http://gerrit.cloudera.org:8080/3033 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-13 15:52:53 -07:00
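The fast-path versus slow-path split described in the IMPALA-2809 entry above can be sketched roughly as follows. This is not Impala's actual ByteSwap(); the name ByteSwapSketch, the helpers Swap16/Swap32/Swap64, and the choice of fast-path sizes (2, 4 and 8 bytes) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helpers: reverse the byte order of fixed-width integers.
inline std::uint16_t Swap16(std::uint16_t v) {
  return static_cast<std::uint16_t>((v >> 8) | (v << 8));
}

inline std::uint32_t Swap32(std::uint32_t v) {
  return ((v & 0xFF000000u) >> 24) | ((v & 0x00FF0000u) >> 8) |
         ((v & 0x0000FF00u) << 8) | ((v & 0x000000FFu) << 24);
}

inline std::uint64_t Swap64(std::uint64_t v) {
  const std::uint32_t lo = static_cast<std::uint32_t>(v);
  const std::uint32_t hi = static_cast<std::uint32_t>(v >> 32);
  return (static_cast<std::uint64_t>(Swap32(lo)) << 32) | Swap32(hi);
}

// Hypothetical sketch: writes the `len` bytes at `src` to `dst` in reversed
// order. `dst` and `src` must not overlap.
inline void ByteSwapSketch(void* dst, const void* src, std::size_t len) {
  assert(dst != nullptr && src != nullptr);
  switch (len) {
    case 2: {  // fast path: one 16-bit swap
      std::uint16_t v;
      std::memcpy(&v, src, sizeof(v));
      v = Swap16(v);
      std::memcpy(dst, &v, sizeof(v));
      return;
    }
    case 4: {  // fast path: one 32-bit swap
      std::uint32_t v;
      std::memcpy(&v, src, sizeof(v));
      v = Swap32(v);
      std::memcpy(dst, &v, sizeof(v));
      return;
    }
    case 8: {  // fast path: one 64-bit swap
      std::uint64_t v;
      std::memcpy(&v, src, sizeof(v));
      v = Swap64(v);
      std::memcpy(dst, &v, sizeof(v));
      return;
    }
    default: {  // slow path: byte-by-byte loop for every other length
      const unsigned char* s = static_cast<const unsigned char*>(src);
      unsigned char* d = static_cast<unsigned char*>(dst);
      for (std::size_t i = 0; i < len; ++i) d[i] = s[len - 1 - i];
      return;
    }
  }
}
```

The fixed-size cases let a compiler emit a handful of instructions instead of a loop; which sizes the real patch moved into the fast path is not stated in the commit message.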
Skye Wanderman-Milne	7767d300a3	IMPALA-3311: fix string data coming out of aggs in subplans The problem: varlen data (e.g. strings) produced by aggregations is freed by FreeLocalAllocations() after passing up the output batch. This works for streaming operators or blocking operators that copy their input, but results in memory corruption when the output reaches non-copying blocking operators, e.g. SubplanNode and NestedLoopJoinNode. The fix: this patch makes the PartitionedAggregationNode copy out produced string data if the node is in a subplan. Otherwise it calls MarkNeedsToReturn() on the output batch. Marking the batch would work in the subplan case as well, but would likely be less efficient since it would result in many small batches coming out of the subplan. The patch includes a test case. However, this test only exposes the problem with an ASAN build and the --disable_mem_pools flag, which we don't currently have automated testing for. Change-Id: Iada891504c261ba54f4eb8c9d7e4e5223668d7b9 Reviewed-on: http://gerrit.cloudera.org:8080/2929 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 23:06:36 -07:00
Lars Volker	cb377741ec	Remove replica_preference query option Change-Id: I5a3134b874a53241706d850d186acbfed768f5ee Reviewed-on: http://gerrit.cloudera.org:8080/2323 Reviewed-by: Marcel Kornacker <marcel@cloudera.com> Reviewed-by: Silvius Rus <srus@cloudera.com> Tested-by: Internal Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/3030 Reviewed-by: Lars Volker <lv@cloudera.com>	2016-05-12 23:06:36 -07:00
Skye Wanderman-Milne	9174dee395	IMPALA-1578: fix text scanner to handle "\r\n" delimiters split across blocks This patch modifies HdfsTextScanner to specifically check for split "\r\n" delimiters when the scan range ends with '\r'. If there does turn out to be a split delimiter, the next tuple is considered the responsibility of the next scan range's scanner, as if the delimiter appeared fully in the second scan range. This should not affect the overall performance characteristics of the text scanner since it already must do a remote read past the end of the scan range to read the last tuple. Change-Id: Id42b441674bb21517ad2788b99942a4b5dc55420 Reviewed-on: http://gerrit.cloudera.org:8080/2803 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 23:06:36 -07:00
Alex Behm	e96b463587	IMPALA-3528: Transfer scratch tuple memory in Close() of Parquet scanner. The lifetime of a scanner thread is decoupled from that of row batches that it produces. That means that all resources associated with row batches produced by the scanner thread should be transferred to those batches. The bug was that we were not transferring the ownership of memory from the scratch batch to the final row batch returned in HdfsParquetScanner::Close(). Triggering an event that would cause the freed memory to be dereferenced is possible, but very difficult. My understanding is that it is only possible in exceptional non-deterministic scenarios, e.g., a query is cancelled just at the right time, or the scanner hits a parse/decoding error. Testing: I tested this change locally by running the scanner and nested types test as well as TPCH, nested TPCH, and TPC-DS. Change-Id: Ic34d32c9a41ea66b2b2d8f5e187cc84d4cb569b2 Reviewed-on: http://gerrit.cloudera.org:8080/3041 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 23:06:36 -07:00
Tim Armstrong	6910f4975a	IMPALA-3527: use codegen'd ProcessProbeBatch() when spilling. Change-Id: I92ebfb01e370d0a842270771c9e5f1a4610dc16a Reviewed-on: http://gerrit.cloudera.org:8080/3035 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 23:06:35 -07:00
Tim Armstrong	a2e88f0e6c	IMPALA-3495: incorrect join result due to implicit cast in Murmur hash We observed that some spilling joins started returning incorrect results. The behaviour seems to happen when a codegen'd insert and a non-codegen'd probe function is used (or vice-versa). This only seems to happen in a subset of cases. The bug appears to be a result of the implicit cast of the uint32_t seed value to the int32_t hash argument to HashTable::Hash(). The behaviour is unspecified if the uint32_t does not fit in the int32_t. In Murmur hash, this value is subsequently cast to a uint64_t, so we have a chain of uint32_t->int32_t->uint64_t conversions. It would require a very careful reading of the C++ standard to understand what the expected result is, and whether we're seeing a compiler bug or just unspecified behaviour, but we can avoid it entirely by keeping the values unsigned. Testing: I was able to reproduce the issue under a very specific set of circumstances, listed below. Before this change it consistently returned 0 rows. After the change it consistently returned the correct results. I haven't had much luck creating a suitable regression test. 
* 1 impalad * --disable_mem_pools=true * use tpch_20_parquet; * set mem_limit=1275mb; * TPC-H query 7: select supp_nation, cust_nation, l_year, sum(volume) as revenue from ( select n1.n_name as supp_nation, n2.n_name as cust_nation, year(l_shipdate) as l_year, l_extendedprice * (1 - l_discount) as volume from supplier, lineitem, orders, customer, nation n1, nation n2 where s_suppkey = l_suppkey and o_orderkey = l_orderkey and c_custkey = o_custkey and s_nationkey = n1.n_nationkey and c_nationkey = n2.n_nationkey and ( (n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY') or (n1.n_name = 'GERMANY' and n2.n_name = 'FRANCE') ) and l_shipdate between '1995-01-01' and '1996-12-31' ) as shipping group by supp_nation, cust_nation, l_year order by supp_nation, cust_nation, l_year Change-Id: I952638dc94119a4bc93126ea94cc6a3edf438956 Reviewed-on: http://gerrit.cloudera.org:8080/3034 Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 23:06:35 -07:00
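The uint32_t->int32_t->uint64_t chain described in the IMPALA-3495 entry above can be illustrated with a minimal sketch. The function names WidenViaSigned and WidenUnsigned are hypothetical and this is not the code from HashTable::Hash() or the Murmur hash implementation; it only shows why a detour through a signed type can change the widened value when the seed does not fit in int32_t.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: mimics the problematic chain uint32_t -> int32_t -> uint64_t.
// The first conversion is implementation-defined before C++20 when the value
// does not fit; on two's-complement targets it yields a negative number,
// which the second conversion then sign-extends.
inline std::uint64_t WidenViaSigned(std::uint32_t seed) {
  const std::int32_t as_signed = static_cast<std::int32_t>(seed);
  return static_cast<std::uint64_t>(as_signed);
}

// Hypothetical: keeps the value unsigned throughout, so the upper 32 bits of
// the result are always zero.
inline std::uint64_t WidenUnsigned(std::uint32_t seed) {
  return static_cast<std::uint64_t>(seed);
}

// Seeds that fit in int32_t survive both paths unchanged.
inline void CheckSmallSeedsAgree() {
  assert(WidenViaSigned(1u) == WidenUnsigned(1u));
  assert(WidenViaSigned(0x7FFFFFFFu) == WidenUnsigned(0x7FFFFFFFu));
}
```

For a seed with the top bit set, such as 0x9E3779B9, the two functions are expected to disagree on common compilers, so two code paths that widen the seed differently would compute different hashes for the same row.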
Henry Robinson	df1412c962	IMPALA-3480: Add query options for min/max filter sizes This patch adds two query options for runtime filters: RUNTIME_FILTER_MAX_SIZE RUNTIME_FILTER_MIN_SIZE These options define the minimum and maximum filter sizes for a filter, no matter what the estimates produced by the planner are. Filter sizes are rounded up to the nearest power of two. Change-Id: I5c13c200a0f1855f38a5da50ca34a737e741868b Reviewed-on: http://gerrit.cloudera.org:8080/2966 Tested-by: Internal Jenkins Reviewed-by: Henry Robinson <henry@cloudera.com>	2016-05-12 23:06:35 -07:00
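The clamping and rounding described in the IMPALA-3480 entry above can be sketched as below. The function names and the order of operations (clamp first, then round up) are assumptions; the commit message only states that the options bound the size regardless of the planner estimate and that sizes are rounded up to the nearest power of two.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: smallest power of two that is >= v (returns 1 for 0).
inline std::uint64_t RoundUpToPowerOfTwo(std::uint64_t v) {
  assert(v <= (std::uint64_t{1} << 63));  // larger inputs would overflow
  std::uint64_t p = 1;
  while (p < v) p <<= 1;
  return p;
}

// Hypothetical sketch: bounds a planner estimate (in bytes) by the values of
// RUNTIME_FILTER_MIN_SIZE and RUNTIME_FILTER_MAX_SIZE, then rounds up.
// Assumes both bounds are themselves powers of two; otherwise the rounded
// result could exceed max_size.
inline std::uint64_t ClampFilterSize(std::uint64_t estimate,
                                     std::uint64_t min_size,
                                     std::uint64_t max_size) {
  assert(min_size <= max_size);
  std::uint64_t clamped = estimate;
  if (clamped < min_size) clamped = min_size;
  if (clamped > max_size) clamped = max_size;
  return RoundUpToPowerOfTwo(clamped);
}
```

With bounds of 4096 and 1048576 bytes, for example, an estimate of 5000 would come out as 8192 under these assumptions.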
Alex Behm	14cdb0497c	IMPALA-2736: Optimized ReadValueBatch() for Parquet scalar column readers. This change builds on top of the recent move to column-wise materialization of scalar values in the Parquet scanner. The goal of this patch is to improve the scan efficiency, and show the future direction for all column readers. Major TODO: The current patch has minor code duplication/redundancy, and the new ReadValueBatch() departs from (but improves) the existing column reader control flow. To improve code reuse and readability we should overhaul all column readers to be more uniform. Summary of changes: - refactor ReadValueBatch() to simplify control flow - introduce caching of def/rep levels for faster level decoding, and for a tighter value materialization loop - new templated function for value materialization that takes the value encoding as a template argument Mini benchmark vs. cdh5-trunk I ran the following queries on a single impalad before and after my change using a synthetic 'huge_lineitem' table. I modified hdfs-scan-node.cc to set the number of rows of any row batch to 0 to focus the measurement on the scan time. Query options: set num_scanner_threads=1; set disable_codegen=true; set num_nodes=1; select * from huge_lineitem; Before: 22.39s After: 13.62s select * from huge_lineitem where l_linenumber < 0; Before: 25.11s After: 17.73s select * from huge_lineitem where l_linenumber % 2 = 0; Before: 26.32s After: 16.68s select l_linenumber from huge_lineitem; Before: 1.74s After: 0.92s Testing: I ran a private exhaustive build and all tests passed. Change-Id: I21fa9b050a45f2dd45cc0091ea5b008d3c0a3f30 Reviewed-on: http://gerrit.cloudera.org:8080/2843 Reviewed-by: Alex Behm <alex.behm@cloudera.com> Tested-by: Alex Behm <alex.behm@cloudera.com>	2016-05-12 14:18:05 -07:00
Lars Volker	df8bf3a965	IMPALA-3490: Add flag to reduce minidump size IMPALA-2686 added the breakpad library to all impala daemons, thus enabling them to write minidump files. This change introduces a flag 'minidump_size_limit_hint_kb', which causes breakpad to reduce the amount of thread stack memory it includes in a minidump, aiming to reduce the minidump size during crashes with a lot of threads. Once a minidump is expected to exceed the configured value, breakpad will include the full stack memory for the first 20 threads, and afterwards capture only 2KB of stack memory for each additional thread. Change-Id: I2f3aa0df51be9f0bf0755fb288702911cdb88052 Reviewed-on: http://gerrit.cloudera.org:8080/2990 Reviewed-by: Lars Volker <lv@cloudera.com> Tested-by: Internal Jenkins	2016-05-12 14:18:04 -07:00

2562 Commits