2250 Commits

Author SHA1 Message Date
Lenni Kuff
15327e8136 Migrate DataErrors tests to Python test framework, re-enable subset of tests
This re-enables a subset of the stable data errors tests and updates them to
work in our test framework. This includes support for updating results via --update_results.

This also lets us remove a lot of old code that was there only to support these disabled
tests.

Change-Id: I4c40c3976d00dfc710d59f3f96c99c1ed33e7e9b
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1952
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2277
2014-04-18 02:25:11 -07:00
Nong Li
ac230c7021 Fix active time reporting in runtime profiles.
- A few places didn't have total timer at the beginning.
- Async build thread for blocking join nodes really messed things up (sum of
  children was more than the time in the join node).

Change-Id: I9176ce37cf22f2bcebea21b117e45cce066dbc1d
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2276
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-04-18 02:24:28 -07:00
Henry Robinson
2a69019525 IMPALA-945: Fix column reordering with SELECT expressions
Previously, to produce the correct output expressions for the root plan
fragment before a table sink, InsertStmt would reorder the result
expressions for the query statement at the plan root. This had stopped
working for SelectStmts (and test coverage didn't catch that).

Now InsertStmt produces its own output expressions that can substitute
for the originals from the query statement, and the planner uses those
instead.

All query tests for column reordering have been duplicated to use SELECT
expressions.

Change-Id: Ib909fe35d27416b33ba2e5ac797aa931e1fe43f9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2204
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
(cherry picked from commit d526db7ac6274f35b6affcb7428327100026e14e)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2275
2014-04-18 00:12:12 -07:00
Nong Li
1cab95066d Add the return type as a column for SHOW FUNCTIONS.
Also includes some misc pattern matching cleanup.

Change-Id: I6c9ec78b094a73864b4d669afbd75a48c9bf9585
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2199
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2271
2014-04-17 17:58:13 -07:00
Nong Li
831c0bbdc1 IMPALA-949: Fix scan range initial queue capacity.
Change-Id: I289c61587da75b318ba5a543d31010920a9cffe9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2268
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-04-17 15:13:54 -07:00
Nong Li
85be9a5050 Update bin/make* -notests to include other artifacts for packages.
Change-Id: I95e95f0a2e2131875b95d6676620bec7117b7f8a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2250
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-04-16 00:37:00 -07:00
Nong Li
44fd279f18 Decimal: switch out the boost 128_t int with the c++ standard one.
The c++ standard int128_t is exactly what we want. It is 16 bytes, stored as 2's
complement little endian (the exact extension of the native int types). It
outperforms the boost library we were using (see benchmark) and looking at the assembly
for some of the operators, I doubt we can do better. This also seems like the kind
of thing hardware might be able to do natively in the future if we stuck with the
standard implementation.

This requires minimal changes to the rest of our code so the multi int library is
abstracted away.

The standard only added int128 and not 96 or any others. We still will need to use
the boost library for some cases but nothing in the hot path. We might want to revisit
implementing an int96 in the future that is of the same format to get some space
and efficiency savings but I think we can live with just int128 for a while.

Change-Id: I137ef7be812675036dd9b6e5b48dfc5c7aa9ab37
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2200
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2249
2014-04-15 23:24:46 -07:00
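A minimal sketch of the layout described in 44fd279f18: a 16-byte, two's complement, little-endian value that is the straightforward extension of a native 64-bit integer. The helper names are hypothetical and this uses Python integers purely to illustrate the byte layout; it is not the C++ code from the patch.

```python
import struct


def to_int128_bytes(value: int) -> bytes:
    """Encode value as 16 bytes, two's complement, little endian."""
    return value.to_bytes(16, "little", signed=True)


def from_int128_bytes(raw: bytes) -> int:
    """Decode 16 little-endian two's complement bytes back to an int."""
    return int.from_bytes(raw, "little", signed=True)


def sign_extend_int64(value: int) -> bytes:
    """Widen a native 64-bit value to 16 bytes by repeating the sign byte."""
    raw = struct.pack("<q", value)  # 8 bytes, little endian
    fill = b"\xff" if value < 0 else b"\x00"
    return raw + fill * 8


# The 128-bit encoding of a small value equals its sign-extended int64 form.
same_layout = to_int128_bytes(-2) == sign_extend_int64(-2)
```

Under these assumptions, widening a 64-bit value only appends sign bytes, which is one way to read "the exact extension of the native int types".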
Matthew Jacobs
d0c353a9b4 IMPALA-922: Return helpful errors with Yarn group rules
When the -fair_scheduler_allocation_path is configured with a policy that uses
the "primaryGroup" Yarn queue allocation rule, Yarn throws an error if the user
is not on the local OS. Currently the user will get an error message that says:
"java.io.IOException: No groups found for user <username>". We now return a more
helpful error message.

Change-Id: I014ac15ef607e473957752f23af94d0cc4efec0f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2078
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 3cf37dc4e91afe887ada988f256b7008983580d2)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2244
2014-04-15 15:32:05 -07:00
Nong Li
87295a4e06 Decimal implementation.
This patch implements decimal support for text based formats.

Change-Id: I8e2c9e512ed149fe965216a72cb21fffd4f18e75
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1669
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Nong Li <nong@cloudera.com>
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2238
Tested-by: jenkins
2014-04-14 21:07:32 -07:00
Srinath Shankar
b1f46c8029 IMPALA-913: Revisit the use of FNV Hash in exchange
FNV hash has the property that the least significant bit of the hashed value
is just the XOR of the LSBs of its input bytes. This results in poor
distribution of rows when the partition keys are duplicated -- for example,
if the partition key is (l_orderkey, l_orderkey). A recommended technique
to mitigate this is to generate a larger hash and use XOR-folding to reduce
it to the desired length.

In this patch FnvHash has been modified to generate a 64-bit hash and
fold the result down to 32-bits. It has been renamed FnvHash64to32 to make
this explicit.

Change-Id: Ie12ad3f863fca15092803d3e4d616a654cb8d244
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2220
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins
2014-04-14 12:03:53 -07:00
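The LSB property and the XOR-folding mitigation described in b1f46c8029 can be sketched as follows. This assumes the standard FNV-1a constants and update order and a little-endian 64-bit key encoding; the function names are hypothetical and the actual FnvHash64to32 in the patch may seed or order the steps differently.

```python
FNV32_OFFSET = 0x811C9DC5          # standard 32-bit FNV offset basis
FNV32_PRIME = 0x01000193           # standard 32-bit FNV prime
FNV64_OFFSET = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV64_PRIME = 0x100000001B3        # standard 64-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """Plain 32-bit FNV-1a."""
    h = FNV32_OFFSET
    for b in data:
        h = ((h ^ b) * FNV32_PRIME) & 0xFFFFFFFF
    return h


def fnv1a_64(data: bytes) -> int:
    """Plain 64-bit FNV-1a."""
    h = FNV64_OFFSET
    for b in data:
        h = ((h ^ b) * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h


def fnv_64_to_32(data: bytes) -> int:
    """Compute a 64-bit hash, then XOR-fold the two halves into 32 bits."""
    h = fnv1a_64(data)
    return (h >> 32) ^ (h & 0xFFFFFFFF)


def duplicated_key(orderkey: int) -> bytes:
    """Encode (l_orderkey, l_orderkey) as two little-endian 64-bit ints."""
    one = orderkey.to_bytes(8, "little")
    return one + one


# Because the FNV prime is odd, the LSB of the plain hash is the XOR of the
# seed's LSB and every input byte's LSB. Duplicated keys cancel each other
# out, so the plain hash has a constant LSB; the folded hash does not.
plain_lsbs = {fnv1a_32(duplicated_key(k)) & 1 for k in range(1000)}
folded_lsbs = {fnv_64_to_32(duplicated_key(k)) & 1 for k in range(1000)}
```

With a constant low bit, any partitioning that takes the hash modulo an even number of partitions would use only half of them, which is the poor row distribution the commit describes.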
Nong Li
3bbe002d19 [CDH5] Break up locking in DiskIoMgr::ScanRange.
Currently, the entire object is protected by one lock. Unfortunately this
lock is taken during calls into libhdfs. This means it is impossible for the
scan node to pull off ready buffers while the disk thread is reading from
this scan range. Whoops. The lock is taken while in libhdfs to facilitate cleanup
so it's very easy to split the big lock up.

Change-Id: Idbf34cdba0cf860a90f9cad016d1ec133f923d85
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2143
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2202
2014-04-13 19:50:25 -07:00
ishaan
5803e6883e Cleanup and re-enable some tests in TestPartitionMetadata
Partition metadata tests were marked as xfail because of IMPALA-624. Additionally, we had
to invoke hive to insert into two partitions pointing to the same location (this
limitation is now removed). This patch changes the test to use Impala exclusively,
removes the xfail tag and adds a teardown method to the test class.

Change-Id: I15fa97bef4f8714d0873a9c713627a198f3388ad
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2086
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2215
2014-04-13 17:55:43 -07:00
ishaan
0e0c480262 Re-enable some tests in test_describe_formatted
A few tests which dealt with running queries via hs2 and impala were marked as xfail as
hiveserver2 would occasionally not come up. Given that we now have a script that checks
whether hiveserver2 is up before continuing the build, it should be safe to remove the
xfail.

Change-Id: I2b5063e7259c01fc0ef8ffda86d85514c9cf959c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2082
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2214
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
2014-04-13 17:51:45 -07:00
ishaan
6f416dd2c2 Close all queries in test_cancellation
The queries in test_cancellation are currently cancelled but not closed, causing some test
queries to eventually time out because the admission controller limits are passed. This
patch ensures that all queries issued in test_cancellation are closed.

Change-Id: I65b26672155e31889bb6f43d3ac87be0f7b4eb72
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2187
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2213
2014-04-13 17:45:51 -07:00
Nong Li
f9dd32724c Cleanup build scripts.
Consolidated our build scripts and added the -notests option which skips
building the BE tests.

Change-Id: Ida6aa064b7fe47e535c142b9af92b7c158e83c32
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2043
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2201
2014-04-13 17:11:39 -07:00
Nong Li
1a3caca8c4 [CDH5] Update execution engine to take advantage of DN caching.
This finishes up the support to use HDFS caching. The scheduler will
prefer replicas that are cached and the scan node plumbs the metadata
to the io mgr.

This is a bit hard to test without a cluster and some perf benchmarking.
I've added a basic test to make sure the path is being exercised.

Change-Id: I8762ca9ef2f88c3637113d3c5ee82f4c0ea7f1be
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2212
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-04-13 17:11:21 -07:00
Nong Li
826a57d246 IMP-1339: Fix crash in rcfile scanner from mem-pool tracking bug.
In the case where the MemPool fails in FindChunk, we were not properly
updating the MemPool's state.

Change-Id: I3ed9bd7ee9505cfaf4c7812304c1da85ae06f72f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2160
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2203
2014-04-13 16:02:59 -07:00
Lenni Kuff
d101ef86e2 [CDH5] Bump version to 1.4.0-cdh5-INTERNAL
Change-Id: I0a0334084e444c948f1718133afb2d7246dde414
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2193
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-04-11 16:03:09 -07:00
Skye Wanderman-Milne
c85d88714f Fix buffer overflow bug in StringCompare()
Includes benchmark for comparing different StringCompare() implementations.

Change-Id: Ib4623b3ae6c99977af332ce5161da66af3cae9e5
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2190
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-04-11 11:16:50 -07:00
Skye Wanderman-Milne
e60bf29a96 IMPALA-13: Use SSE string functions that take an explicit length
This patch modifies DelimitedTextParser and StringValue to work with
data containing null characters by using SSE instructions that take a
length, rather than expecting null-terminated strings. It also adds
some other minor changes to correctly handle data with nulls and to
facilitate testing. I checked the execution time of a count(*) and a
select(*) limit 1 query locally, and saw no difference for either text
or sequence files.

Change-Id: Ia920b35bea7048aa286f39ec83e313c2a39251d1
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2110
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2181
2014-04-11 11:16:24 -07:00
Alex Behm
2fff51d9e9 IMP-1329,IMPALA-924: Make ExchangeNode::Open() block until rows are available.
The bug: Coordinator::Wait() is supposed to block until rows become available for
consumption by the client. We rely on Wait() to determine when to advance the query
status to a 'ready' state and signal to the client that rows can be fetched.
Long fetch times can trigger client timeouts at various levels (socket, app, etc.).
Coordinator::Wait() simply opens the coordinator fragment's plan tree.
For most plan nodes, Open() does work to prepare the plan tree, s.t., GetNext()
returns quickly. However, for ExchangeNodes, Open() did not wait
until rows were obtained from the underlying stream receiver.
The fix: Make ExchangeNode::Open() block until rows are available.

Change-Id: I7b197eea11d21fd732414d96c899a17b2d99631c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2128
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2185
2014-04-10 23:49:38 -07:00
Skye Wanderman-Milne
ba89e60a81 IMPALA-932: evaluate concat/concat_ws children once
Change-Id: Id22a6c1dfb57cf659a1e24af4de6e5a2336cafa4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2152
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit cf6960017d4f7d75c1c685cf362bd3d9cd9b63c7)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2183
2014-04-10 20:20:36 -07:00
Alex Behm
91db96d903 IMPALA-762: Add the query status to Beeswax::get_log() and pick it up in the Impala shell.
COMPUTE STATS is an async DDL command. When COMPUTE STATS fails it will set the
query status of the QueryExecState properly, but the original Beeswax::query() RPC
won't throw. The Impala shell sometimes did not pick up and display the
query status because no RPC actually threw. To fix this, I modified
Beeswax::get_log() to include the query status if it is not ok. The shell looks
for a special prefix to distinguish the query status from the runtime state error log.

Change-Id: I0d9dbf0801629a37de22ea4ebb6d2e5d53b836ef
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1899
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2063
2014-04-10 15:47:06 -07:00
Henry Robinson
37236845b1 Mark test_non_codegen_tinyint_grouping as execute_serially
The test contains an INSERT and some DDL, which is racy if performed in parallel.

Change-Id: I2b88533f45756fcf6372d6ee4eb7edd474087048
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2167
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
(cherry picked from commit 8b103c029cc341bacea4746c369bb58e6af5ed29)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2182
Tested-by: jenkins
2014-04-10 15:17:25 -07:00
Lenni Kuff
342ff28ae2 IMP-1332: Remove unused 'nss-pam-ldapd' openldap contrib module from /thirdparty
Change-Id: I478d9238864052981377a03cd90d37f60129c70e
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2081
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2180
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
2014-04-10 12:20:33 -07:00
Lenni Kuff
9e2dd7e049 Add support for SHOW PARTITIONS <table name>
This statement returns info on all partitions for the given table. It is implemented as
an alias for SHOW TABLE STATS, with some extended analysis checks (such as throwing if
the statement targets an unpartitioned table).

Change-Id: I19154a9d90314de18f86ba355aa5dbed808f147f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2145
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2179
Tested-by: jenkins
2014-04-10 12:15:39 -07:00
Lenni Kuff
f1f4e99c85 [CDH5] IMP-1326: Impala assumes BlockLocation#getCachedHosts returns IP addresses
Impala determines the location (network address) of all block replicas using
the HDFS API BlockLocation.getNames(), which returns results in IP:port format.
To find where cached replicas are located we call BlockLocation.getCachedHosts(),
which returns results as hostnames. This caused an issue where we would compare
an IP address to a hostname to determine if a replica was cached.

The fix is to resolve cached hosts by comparing against BlockLocation.getHosts(),
which returns the block replica locations by hostname. getHosts() will always return
results in the same order as getNames() and may contain duplicate
entries (multiple data nodes on the same host), which is what we want. This allows
the same array index to be used to convert between the two location formats.

Change-Id: I74fdc20b1dc5200d7e0e90856b8b2088f050e215
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2156
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-04-08 22:06:14 -07:00
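The index-based conversion described in f1f4e99c85 can be sketched as below. The three lists stand in for the results of BlockLocation.getNames(), getHosts() and getCachedHosts(); the function name and the sample addresses are hypothetical, and the sketch relies on the commit's statement that getNames() and getHosts() share the same ordering.

```python
def cached_replica_addresses(names, hosts, cached_hosts):
    """Return the IP:port entries whose hostname appears in cached_hosts.

    names        -- replica locations in IP:port form
    hosts        -- the same replicas by hostname, in the same order
    cached_hosts -- hostnames that hold a cached copy of the block
    """
    if len(names) != len(hosts):
        raise ValueError("names and hosts must describe the same replicas")
    cached = set(cached_hosts)
    # The shared array index converts between the two location formats.
    return [name for name, host in zip(names, hosts) if host in cached]


# Hypothetical data: two data nodes run on host-a, one on host-b.
example = cached_replica_addresses(
    names=["192.0.2.1:50010", "192.0.2.1:50011", "192.0.2.2:50010"],
    hosts=["host-a", "host-a", "host-b"],
    cached_hosts=["host-a"],
)
```

Hostnames are only ever compared with hostnames here, which avoids the IP-versus-hostname comparison that caused the bug.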
Alex Behm
0585dfb546 IMPALA-888: Materialize union slots referenced by constant predicates.
To keep the predicate assignment/propagation logic simple, we assign conjuncts
whose underlying base table exprs are constant in at least one union operand
to the evaluating MergeNode, and not in the operand(s) whose corresponding base
table exprs are constant.
The JIRA describes two different bugs:
The first bug was that the slots required for evaluating such predicates in the
MergeNode were not marked as materialized. The second bug was that predicates
'pushed' into union operands did not get re-analyzed after substituting the
predicate's exprs with the result exprs of that union operand. Missing casts
led to a crash. The new test covers both bugs.

Change-Id: I0f5b8a366b32f7d4b2587e13793b6103cdf7e8b3
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2162
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-04-07 18:32:29 -07:00
Henry Robinson
415540d789 IMPALA-901: Fix grouping with NULLs when codegen is disabled
The standard implementation of HashTable::Equals() did not correctly
check the NULL bit when the argument row did not evaluate to NULL for a
given probe expr. In the rare circumstance that this gave rise to a
false positive (more on that below), two rows with different grouping
values would be considered equal, and one would be excluded from the
final aggregation output.

HashTable::EvalRow() fills an expression value buffer with the values of
either probe or build exprs evaluated for the argument row. These cached
values are used to determine row equality in Equals(). In order to avoid
a lot of false collisions, an 'unlikely' value is written to that buffer
for NULL values, chosen to be HashUtil::FNV_SEED. So without correct
NULL-bit checking in Equals(), two single-slot rows are considered to be
equal if one of them has NULL for its slot, and the other has a value
equal to HashUtil::FNV_SEED truncated to the size of the slot.

For tinyint columns, this value is -59. As it happens, our random
generator happened to create a table with one tinyint column and which
contained NULL and -59 as values. In order to trigger this bug, the rows
must also have been written to disk in order such that the scanners
returned -59 *first*, and then NULL to the aggregation node; the bug is
not symmetric, and grouping works correctly in the opposite order.

Change-Id: I17d43eaeee62b2ac01b67dd599bc4346b012a074
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2130
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 6e8098254280a9d5ead0b607263ca6728a3222a7)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2161
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-04-07 17:30:52 -07:00
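A simplified model of the false positive described in 415540d789, for a single tinyint slot. It assumes HashUtil::FNV_SEED is the standard 32-bit FNV offset basis 0x811C9DC5, which is consistent with the -59 quoted in the commit; the function names are hypothetical and the real HashTable works on expression value buffers rather than Python values.

```python
FNV_SEED = 0x811C9DC5  # assumption: standard 32-bit FNV offset basis


def truncate_to_int8(value: int) -> int:
    """Keep the low byte and reinterpret it as a signed 8-bit integer."""
    low = value & 0xFF
    return low - 256 if low >= 128 else low


NULL_PLACEHOLDER = truncate_to_int8(FNV_SEED)  # the "unlikely" value


def cache_row(value):
    """Model of EvalRow(): return (cached_value, is_null) for one slot."""
    if value is None:
        return NULL_PLACEHOLDER, True
    return value, False


def equals_buggy(stored_value, cached):
    """Model of the old Equals(): the NULL bit is skipped for non-NULL rows."""
    cached_value, cached_is_null = cached
    if stored_value is None:
        return cached_is_null
    return stored_value == cached_value  # cached_is_null is never checked


def equals_fixed(stored_value, cached):
    """Model of the fix: a cached NULL never equals a non-NULL value."""
    cached_value, cached_is_null = cached
    if stored_value is None:
        return cached_is_null
    return (not cached_is_null) and stored_value == cached_value
```

In this model the false match appears only when -59 is already in the table and NULL arrives afterwards, matching the ordering requirement in the commit message.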
Alex Behm
8b319f8959 IMPALA-935: Make PlanFragment.getDestFragment() return null if no destination is set.
Change-Id: I269a7f552d7ff67ff4d65e86e8c6df9c41d0fca1
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2159
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-04-07 16:21:24 -07:00
Alex Behm
a85dacafe8 IMPALA-904: Make TupleIsNullPredicate work on non-nullable tuples.
We wrap certain exprs substituted from outer-joined inline view in an expr that
evaluates to NULL if the underlying tuple(s) are NULL. We do this for exprs that evaluate
to non-NULL values if their slots are NULL, i.e., we must then distinguish tuples that are
NULL from slots that are NULL (otherwise evaluating an expr against a tuple that is NULL
due to the outer join may incorrectly return a non-NULL value.)

The bug: Exprs referring to an outer-joined inline view may appear in various places
in the outer query block. For example, they could appear in an On-clause or be
placed into scans/aggregates due to predicate propagation. In such cases, the underlying
tuples may not be nullable yet because they only become nullable after the outer join.
We had a DCHECK in tuple-is-null-predicate.cc requiring the tuples to be nullable.
The fix: Remove the DCHECK. The fix is not elegant but practical. It would be rather
difficult to fix the inline view expr substitution such that a TupleIsNullPredicate
never references a non-nullable tuple, esp. due to predicate propagation.

Change-Id: I180f75f14173f356abfeec751e6b2d419378a9a7
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2157
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
2014-04-07 14:18:49 -07:00
Henry Robinson
99c37aac37 IMPALA-827: Add an option for directories created by INSERT to inherit
their parent's permissions

This patch adds --insert_inherit_permissions. If true, all
new partition directories created by INSERT will inherit their
permissions from their parent. When false, the directories are created
with the default permissions.

Change-Id: Ib2b4c251e51ea5048387169678e8dde34ecfe5f6
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1917
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-04-04 10:25:20 -07:00
Lenni Kuff
c798b23fd9 IMPALA-925: JDBC driver returns no results from getTables()/Columns() with null name pattern
Our HS2 Metadata Op implementation would not return any results if null was passed as the
table name or column name. Instead a null value should be treated the same as '%' (match
everything).

Change-Id: Ibad41e94724cd1f9c1caf40831e30a98132247d9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2137
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 7020c62545397872877c03a5e101e71edf8101bf)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2142
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
2014-04-03 17:12:25 -07:00
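The behavior described in c798b23fd9 (a null name pattern is treated like '%') can be sketched as below. The function names are hypothetical, and the pattern handling assumes the usual JDBC metadata wildcards ('%' for any sequence, '_' for a single character) without escape handling; it is not the HS2 Metadata Op code itself.

```python
import re


def pattern_to_regex(pattern):
    """Translate a metadata search pattern to a compiled regex.

    None is treated the same as '%', i.e. match everything.
    """
    if pattern is None:
        pattern = "%"
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any sequence of characters
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def match_names(names, pattern):
    """Return the names that match the whole pattern."""
    regex = pattern_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]
```

For example, `match_names(["alltypes", "nation"], None)` returns both names in this sketch, where the old behavior would have returned nothing.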
Matthew Jacobs
4d9aad8b9c Admission controller: Change default values for the "default pool"
The admission controller is configurable via Yarn fair scheduler allocation
and Llama configurations, but a "default pool" is used when these files are
not provided. When a pool is defined in a fair-scheduler.xml but no limits
are specified, the following Yarn/Llama default values are used: the max
number of concurrent requests is 20, the max queue size is 50, and the mem
limit is unlimited.

This changes the default values of the "default pool" limits so that the
limits are consistent with the defaults from Yarn/Llama.

Change-Id: Ic76ff550c18cc49353c72926591af46dcbe26ac7
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2006
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 1619d83e452e5b868d12e3934e9704fc5f16cac7)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2118
2014-03-31 15:53:26 -07:00
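The default limits quoted in 4d9aad8b9c (20 concurrent requests, a queue of 50, unlimited memory) can be written down as a small sketch. The class, field names and the admit/queue/reject decision are hypothetical simplifications for illustration, not the admission controller's actual logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PoolLimits:
    """Limits for one pool; defaults follow the Yarn/Llama values quoted above."""
    max_requests: int = 20                 # max concurrent requests
    max_queued: int = 50                   # max queue size
    mem_limit_bytes: Optional[int] = None  # None models "unlimited"


def admission_decision(limits: PoolLimits, running: int, queued: int) -> str:
    """Hypothetical decision rule: admit, then queue, then reject."""
    if running < limits.max_requests:
        return "admit"
    if queued < limits.max_queued:
        return "queue"
    return "reject"


default_pool = PoolLimits()
```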
Matthew Jacobs
cd2dc3e2bd Fix test_failpoints to close queries after cancel
Change-Id: I4f272ccec84030d8b4f85d0e1554a042ee26be30
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2092
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
(cherry picked from commit d42aad459a68991fc489caf1edbca10ea599d28a)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2116
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
2014-03-28 18:47:25 -07:00
Matthew Jacobs
f52662c739 Fix TestRequestPoolService, do not use URL.getPath()
URL.getPath() will return a valid URL, which means that some unsafe ASCII
characters are encoded for URLs. In some test environments we had repos
with '@' in the path and this was not handled properly by this test when
we try to load test resources.

Change-Id: I74314c40d7c70b908456cd9263263e83c79e0664
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2108
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
(cherry picked from commit a7118d509bb897086966ad98e95052176246801f)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2115
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
2014-03-28 18:42:49 -07:00
Nong Li
c27bd34075 Revert "Disable decimal in analysis."
This reverts commit 695017410adf6d4f8426c4117798c93f823a4b4b.

Change-Id: I919d965e8e711d588e6c56dcdbd3c8e0d9ec7a05
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2104
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-03-27 12:45:55 -07:00
Nong Li
d68c518042 IMPALA-911: Periodically release memory in tcmalloc.
Change-Id: Ifc3d544b7bdc4caa2f238ae13cab6a62cfc587f9
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2085
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-03-26 13:13:49 -07:00
ishaan
734e720297 Fix the tpcds count queries test.
Because of a missing section delimiter in a malformed .test file, TPCDS-COUNT-PROMOTION was
never run. This patch fixes the .test file by adding the delimiter.

Change-Id: Ifd0fa5db1c2bb84815fc66e981e6a989e6c217e4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2017
Reviewed-by: Ishaan Joshi <ishaan@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2080
2014-03-25 22:26:42 -07:00
Nong Li
8e7b6d52d3 IMPALA-909: Remove error stack in logs from parquet scanner if the error is expected.
Change-Id: Idb78e025f249424c324167bb44d32fe3d6c83259
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2077
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-03-25 18:39:38 -07:00
Nong Li
b0de4bbe40 IMPALA-812: Fix select node to properly transfer memory ownership.
Change-Id: I83b6d085362726aa080077845d3bef71b184621c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2076
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-03-25 18:38:55 -07:00
Henry Robinson
8e5848eaf8 RM fixes to get tests passing
* One last NotifyThreadUsageChange() mismatched pair
* Don't set resource in plan fragment params if there isn't a resource
  available. This fixes the problem where if no fragment with resources
  was assigned to the same node as the coordinator, the coordinator
  would have a dummy resource allocation which didn't work with
  expansion.
* Substitute #ID in all impalad arguments to start-impala-cluster.py
  with the 0-indexed ID of the impalad being started. This is required
  to have different Impala processes use different cgroups.

Change-Id: If8c8fd8bef0809bdaf16115a45a9695fc2bf3e1b
(cherry picked from commit c71ce45e97570b8c09900eb5ae2e26984d3306a4)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2060
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-03-24 15:07:45 -07:00
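The #ID substitution described in 8e5848eaf8 can be sketched as below. The function name and the example flag are hypothetical; only the placeholder "#ID" and the 0-indexed numbering come from the commit message.

```python
def expand_impalad_args(args_template: str, cluster_size: int):
    """Return one argument string per impalad, replacing #ID with its index."""
    return [args_template.replace("#ID", str(i)) for i in range(cluster_size)]


# Hypothetical flag name, used only to show the substitution.
expanded = expand_impalad_args("-example_cgroup=impalad_#ID", 3)
```

Each process ends up with a distinct value, which is what allows different Impala processes to use different cgroups.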
Henry Robinson
bf9741835f Correctly decrement thread usage counter after scanner thread exit
Change-Id: I9bb2a4766904dc73b8433cdffc53c23b32459280
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2033
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: Henry Robinson <henry@cloudera.com>
2014-03-20 18:49:35 -07:00
Matthew Jacobs
02fbeef5e6 IMPALA-908: Requests should fail gracefully if "user" is not set
We should more gracefully handle cases where a username is not
specified. Currently a Precondition will fail, but we should (a)
fall back to a default user and (b) have a way to enforce that
the username is specified and then return a more helpful error
when this occurs.

I've tested this with the jdbc driver with -require_user enabled
and disabled. When -require_user=true, we will return an error:

Using JDBC Driver Name: org.apache.hive.jdbc.HiveDriver
Connecting to: jdbc:hive2://localhost:21050/;auth=noSasl
Executing: select 1
java.sql.SQLException: User must be specified because -require_user is enabled.

Change-Id: I3312d13c0b9b269a4b0e789990995689137e9409
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2021
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 5919b3e45db88258b37ac18ffc98da536225e5df)
2014-03-20 17:34:39 -07:00
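The two behaviors described in 02fbeef5e6 can be sketched as below: fall back to a default user, or fail with a helpful error when -require_user is enabled. The function name and the default user value are hypothetical; the error text is the one shown in the commit message.

```python
DEFAULT_USER = "default_user"  # hypothetical fallback value


def resolve_user(user, require_user: bool) -> str:
    """Return the effective user for a request, or raise if one is required."""
    if user:
        return user
    if require_user:
        raise ValueError(
            "User must be specified because -require_user is enabled."
        )
    return DEFAULT_USER
```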
Lenni Kuff
3d82c9a5d6 Bump version from 1.3.0-INTERNAL to cdh5-1.3.0
Change-Id: Ib7a37b190091a3f9eb6d6f0f560dd40aed23e231
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2031
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Reviewed-by: Lenni Kuff <lskuff@cloudera.com>
Tested-by: Lenni Kuff <lskuff@cloudera.com>
2014-03-20 17:22:11 -07:00
Henry Robinson
30ee86c880 Fix setting queue in reservation requests
Change-Id: I4505fbb62e84cfd81c9d0afad4e0627be6d31166
(cherry picked from commit 76d9cf29b4b67803e7bbc83492a94a2eb123c588)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2027
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-03-20 16:30:03 -07:00
Nong Li
ee2b9ffb1f IMPALA-906: Fix bug in tracking of row batch auxiliary memory.
We were clearing the mem pool before updating the counter.

Change-Id: Ifd1cfa0ffc28234d403471cc5cb22ac2d4e41091
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2025
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-03-20 16:19:52 -07:00
Henry Robinson
af50a10614 Few RM changes
* Allow destructor to exit quickly if vcore acquisition thread is in RPC
* Allow for optional default resource estimates for queries (for RM only)
* Add Yarn-site suitable for local testing

Change-Id: I93e46477ad05bd3150adfa7d324a54e14a3e1bfe
(cherry picked from commit 2faaa247f8e40dbdc007d9a2c23cdf4060e0563a)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2012
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-03-19 23:24:06 -07:00
Skye Wanderman-Milne
ecdd41bc08 Log runtime state errors to the impalad log as well
These errors should be displayed in the shell, but this doesn't always
work and doesn't help non-shell clients.

Change-Id: Ib77d08cfa208a5e18bc77fa9567678a9cffe5d0d
(cherry picked from commit 21c90b8a6c053bcf52af4f67deff3fa73cf9bdb4)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2004
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-03-19 14:29:00 -07:00
Skye Wanderman-Milne
8e9776b824 Mark TestUdfs.test_mem_limits to run serially
This was causing other tests to fail with process mem limit exceeded.

Change-Id: I1407b0896052aece691c681827994961b09d8103
(cherry picked from commit 2bcc46117f504f50ded724fddf74f24bd829c6c6)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2003
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-03-19 14:18:11 -07:00