When the coordinator prints the 'backend number' of
fragments that are finished or result in an error, the
hostname associated with that backend is also printed.
Change-Id: I0b27549bd9155ab9b077933ab6f621f4f0887371
Reviewed-on: http://gerrit.cloudera.org:8080/912
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: Internal Jenkins
Collection-typed slots are expensive to copy, e.g., during data
exchanges or when writing into a buffered-tuple-stream. Even worse,
such slots could be duplicated many times after unnesting in a
subplan. To alleviate this problem, this patch implements a
poor man's projection where collection-typed slots are set to NULL
inside the SubplanNode that flattens them.
The FE guarantees that the contents of an array-typed slot are never
referenced outside of the single UnnestNode that accesses them, so when
returning eos in UnnestNode::GetNext() we also set the unnested array
slot to NULL to avoid those expensive copies in downstream exec nodes.
The FE provides that guarantee by creating a new slot in the parent
scan for every relative CollectionTableRef. For example, for a table
't' with a collection-typed column 'c' the following query would have
two separate slots in the tuple of 't', one for 'c1' and one for 'c2':
select * from t, t.c c1, t.c c2
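The idea of nulling the unnested slot at eos can be illustrated with a small Python sketch. This is a toy model only: the row is a plain dict, and `unnest` is a hypothetical stand-in for UnnestNode::GetNext(), not the actual backend code.

```python
def unnest(row, slot):
    """Yield the items of the collection in row[slot].

    Once the collection is exhausted (eos), the slot is set to None so that
    downstream consumers of `row` no longer copy the collection.
    """
    for item in row[slot] or []:
        yield item
    # eos: nothing outside this unnest may reference the collection any more.
    row[slot] = None
```

Because 'c1' and 'c2' live in separate slots, nulling one of them leaves the other intact for its own unnest.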
Change-Id: I90e5b86463019c9ed810c299945c831c744ff563
Reviewed-on: http://gerrit.cloudera.org:8080/763
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
This patch stores the fragment's TExecPlanFragmentParams, which is the
top-level thrift struct for the fragment, in the runtime state. We
store the fragment params so we can dump everything sent from the FE
to the BE when we detect a problem. This patch does so when a call to
HdfsTable::GetPartition() returns NULL, which we never expect to
happen. This will give us more information in the case of crashes
caused by IMPALA-1702 or similar bugs that cause bad partition IDs to
be sent to the BE.
Change-Id: Ibb2ff05810cfd5f7aa3e210555d9d69361e8272a
Reviewed-on: http://gerrit.cloudera.org:8080/456
Reviewed-by: Juan Yu <jyu@cloudera.com>
Tested-by: Internal Jenkins
The libhdfs hdfsListDirectory() API documentation is wrong. It says the call
returns NULL when there is an error, but it also returns NULL when the
directory is empty. Impala must check errno to determine whether an error
actually occurred. The HDFS issue is addressed by HDFS-8407.
Change-Id: I9574c321a56fe339d4ccc3bb5bea59bc41f48ac4
(cherry picked from commit 20da688af19ca41576c82fd7b7d49b4346dbae92)
Reviewed-on: http://gerrit.cloudera.org:8080/394
Reviewed-by: Juan Yu <jyu@cloudera.com>
Tested-by: Internal Jenkins
By doing so, we avoid unnecessarily calling the copy constructor for
Status OK objects and loading the value from memory (due to the old
Status::OK being a global). The impact of this patch was validated by
inspecting both optimized assembly code and generated IR code.
Applying this patch has some effect on the amount of generated code. The
new tool `get_code_size` will list the text, data, and bss sizes for all
archives that we produce in a release build. This patch reduces the code
size by ~20 kB.
       Text      Data    BSS
  Old  10578622  576864  40825
  New  10559367  576864  40809
The majority of the changes in this patch have been mechanically applied
using:
  find be/src -name "*.cc" -or -name "*.h" | xargs sed -i \
      's/Status::OK;/Status::OK\(\);/'
A new micro-benchmark was added to determine the overhead of using
Status in hot code sections.
Machine Info: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
status:
Function                          Rate (iters/ms)  Comparison
----------------------------------------------------------------------
Call Status::OK()                 9.555e+08        1X
Call static Status::Error         4.515e+07        0.04725X
Call Status(Code, 'string')       9.873e+06        0.01033X
Call w/ Assignment                5.422e+08        0.5674X
Call Cond Branch OK               5.941e+06        0.006218X
Call Cond Branch ERROR            7.047e+06        0.007375X
Call Cond Branch Bool (false)     1.914e+10        20.03X
Call Cond Branch Bool (true)      1.491e+11        156X
Call Cond Boost Optional (true)   3.935e+09        4.118X
Call Cond Boost Optional (false)  2.147e+10        22.47X
Change-Id: I1be6f4c52e2db8cba35b3938a236913faa321e9e
Reviewed-on: http://gerrit.cloudera.org:8080/351
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
Adds a static definition of the metric metadata used by Impala. The
metric names, descriptions, and other properties are defined in
common/thrift/metrics.json file, and the generate_metrics.py script
creates a thrift representation. The metric definitions are then
available in a constant map which is used at runtime to instantiate
metrics, looking them up in the map by the metric key.
New metrics should be defined by adding an entry to the list of metrics
in metrics.json with the following properties:
  key:         The unique string identifying the metric. If the metric can
               be templated, e.g. rpc call duration, it may be a format
               string (in the format used by strings::Substitute()).
  description: A text description of the metric. May also be a format
               string.
  label:       A brief title for the metric, not currently used by
               Impala but provided for external tools.
  units:       The unit of the metric. Must be a valid value of TUnit.
  kind:        The kind of metric, e.g. GAUGE or COUNTER. Must be a valid
               value of TMetricKind.
  contexts:    The context in which this metric may be instantiated.
               Usually "IMPALAD", "STATESTORED", "CATALOGD", but may be
               a different kind of 'entity'. Not currently used by
               Impala but provided for modeling purposes for external
               tools.
For example, adding the counter for the total number of queries run over
the lifetime of the impalad process might look like:
  {
    "key": "impala-server.num-queries",
    "description": "The total number of queries processed.",
    "label": "Queries",
    "units": "UNIT",
    "kind": "COUNTER",
    "contexts": [
      "IMPALAD"
    ]
  }
TODO: Incorporate 'label' into the metrics debug page.
TODO: Verify the context at runtime, e.g. with a DCHECK that 'contexts'
contains the context in which the metric is instantiated.
After the metric definition is added, the generate_metrics.py script
will generate the TMetricDefs.thrift that contains a TMetricDef for
the metric definition. At runtime, the metric can be instantiated
using the key defined in metrics.json. Gauges, Counters, and
Properties are instantiated using static methods on MetricGroup. Other
metric types are instantiated using static CreateAndRegister methods
on their associated classes.
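The lookup-by-key flow described above can be sketched in Python. This is only an illustration of the idea, not the generated thrift or the C++ MetricGroup code: `METRIC_DEFS`, `Counter`, and `add_counter` are hypothetical names, and the definition is inlined instead of being read from common/thrift/metrics.json.

```python
import json

# One definition in the metrics.json shape described in this message.
METRICS_JSON = """
[
  {
    "key": "impala-server.num-queries",
    "description": "The total number of queries processed.",
    "label": "Queries",
    "units": "UNIT",
    "kind": "COUNTER",
    "contexts": ["IMPALAD"]
  }
]
"""

# Hypothetical stand-in for the generated constant map: key -> definition.
METRIC_DEFS = {d["key"]: d for d in json.loads(METRICS_JSON)}


class Counter:
    """Hypothetical counter metric built from a static definition."""

    def __init__(self, definition):
        self.key = definition["key"]
        self.description = definition["description"]
        self.units = definition["units"]
        self.value = 0

    def increment(self, delta=1):
        self.value += delta


def add_counter(key):
    """Look up the definition by key and instantiate the metric."""
    definition = METRIC_DEFS.get(key)
    if definition is None:
        raise KeyError("no metric definition for key: " + key)
    if definition["kind"] != "COUNTER":
        raise ValueError(key + " is not defined as a COUNTER")
    return Counter(definition)
```

A missing definition fails loudly at instantiation time in this sketch, which is one possible answer to the "missing definitions" TODO, not necessarily the one Impala will adopt.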
TODO: Generate a thrift enum used to lookup metric defs.
TODO: Consolidate the instantiation of metrics that are created
outside of metrics.h (i.e. collection metrics, memory metrics).
TODO: Need a better way to verify if metric definitions are missing.
Change-Id: Iba7f94144d0c34f273c502ce6b9a2130ea8fedaa
Reviewed-on: http://gerrit.cloudera.org:8080/330
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
To be able to use our own spinlock implementation together with the std
/ boost lock_guards, it needs to satisfy the Lockable requirements. This
patch adds the three required methods: lock(), unlock() and try_lock().
Furthermore, the old ScopedSpinLock class is removed to avoid code
duplication.
Change-Id: Icb082b573e5ee71752f5da65a21c7753f40a4a4b
Reviewed-on: http://gerrit.cloudera.org:8080/304
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
We recently added ClientConnection::DoRpc() to wrap the tedious retry
logic required to get the underlying connection into the right
state. Doing so saves lots of code in the caller, so this patch moves
almost all calls to the new interface.
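For readers unfamiliar with the pattern, here is a minimal Python sketch of a retry wrapper of this kind. It assumes a single reopen-and-retry on a transport failure; the actual retry policy of ClientConnection::DoRpc() is not described in this message, and `TransportError`, `reopen()`, and `do_rpc()` are hypothetical names.

```python
class TransportError(Exception):
    """Hypothetical error raised when the connection is in a bad state."""


class ClientConnection:
    """Sketch of a connection wrapper; `client` must offer reopen()."""

    def __init__(self, client):
        self.client = client

    def do_rpc(self, method_name, request):
        """Call the RPC; on a transport error, reopen and retry once."""
        method = getattr(self.client, method_name)
        try:
            return method(request)
        except TransportError:
            # Assumed recovery step: re-establish the connection, then retry.
            # A second failure propagates to the caller as the one error path.
            self.client.reopen()
            return method(request)
```

The caller then handles exactly one failure mode per invocation instead of wrapping each RPC in its own try / catch block.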
The rewrite isn't completely mechanical - some call sites had very
conservative try {} catch {} blocks that I have removed in favour of
having just one error path per invocation.
The remaining call sites are in ResourceBroker::SendLlamaRpc() and
friends, where the handling is a bit unusual.
Change-Id: I972d7328a1ff5c7ace35dd3da43eee4981d867f4
Reviewed-on: http://gerrit.cloudera.org:8080/349
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
This patch removes all occurrences of "using namespace std" and "using
namespace boost(.*)" from the codebase. However, there are still cases
where namespace directives are used (e.g. for rapidjson, thrift,
gutil). These have to be tackled in subsequent patches.
To reduce the patch size, this patch introduces a new header file called
"names.h" that adds using declarations for many of our most frequently
used symbols, each one only if the corresponding header was already
included. For example, names.h brings in vector only if <vector> was
already included. This requires "common/names.h" to be the last include.
After the include of names.h, a new block contains a sorted list of using
declarations. (This patch does not fix using directives for namespaces
other than std / boost.)
Change-Id: Iebe4c054670d655bc355347e381dae90999cfddf
Reviewed-on: http://gerrit.cloudera.org:8080/338
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
This patch introduces the concept of error codes for errors that are
recorded in Impala and are going to be presented to the client. These
error codes are used to aggregate and group incoming error / warning
messages to reduce the spill on the shell and increase the usefulness of
the messages. By splitting the message string from the implementation,
it becomes possible to edit the string independently of the code and
pave the way for internationalization.
Error messages are defined as a combination of an enum value and a
string. Both are defined in the Error.thrift file that is automatically
generated using the script in common/thrift/generate_error_codes.py. The
goal of the script is to have a central understandable repository of
error messages. Adding new messages to this file will require rebuilding
the thrift part. The proxy class ErrorMessage is responsible for
representing an error and capturing the parameters that are used to
format the error message string.
When error messages are recorded they are recorded based on the
following algorithm:
- If an error message is of type GENERAL, do not aggregate it; simply add
  it to the total number of messages.
- If an error message is of a specific type, record the first message as
  a sample and increment the count for all other occurrences.
- The coordinator merges all error messages except those of type GENERAL
  and displays a count.
For example, in the case of the parquet file spanning multiple blocks
the output will look like:
Parquet files should not be split into multiple hdfs-blocks.
file=hdfs://localhost:20500/fid.parq (1 of 321 similar)
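The recording and merging rules above can be sketched in Python as follows. This is an illustrative model, not the backend implementation: `ErrorLog` and its methods are hypothetical names, and error codes are plain strings here instead of the thrift enum.

```python
GENERAL = "GENERAL"


class ErrorLog:
    """Toy model of per-backend error recording with aggregation by code."""

    def __init__(self):
        self.general = []   # every GENERAL message, kept unaggregated
        self.by_code = {}   # code -> [sample message, count]

    def record(self, code, message):
        if code == GENERAL:
            self.general.append(message)
            return
        # Keep the first message as the sample; later ones only bump the count.
        entry = self.by_code.setdefault(code, [message, 0])
        entry[1] += 1

    def merge(self, other):
        """Coordinator-side merge of another backend's log."""
        self.general.extend(other.general)
        for code, (sample, count) in other.by_code.items():
            entry = self.by_code.setdefault(code, [sample, 0])
            entry[1] += count

    def render(self):
        lines = list(self.general)
        for sample, count in self.by_code.values():
            if count > 1:
                lines.append("%s (1 of %d similar)" % (sample, count))
            else:
                lines.append(sample)
        return lines
```

Merging logs from several backends sums the counts per code, which is what keeps the output short on large clusters.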
All messages are always logged to VLOG. In the coordinator error
messages are merged across all backends to retain readability in the
case of large clusters.
The current version of this patch adds these new error codes to some of
the most important error messages as a reference implementation.
Change-Id: I1f1811631836d2dd6048035ad33f7194fb71d6b8
Reviewed-on: http://gerrit.cloudera.org:8080/39
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
This patch reworks a lot of the metrics subsystem, laying much of the
groundwork for unifying runtime profiles and metrics in the future, as
well as enabling better rendering of metric data in our webpages, and
richer integration with thirdparty monitoring tools like CM.
There are lots of changes. The most significant are below.
TODO (incomplete list):
* Add descriptions for all metrics
* Settle on a standard hierarchy for process-wide metric groups
* Add path-based resolution for searching for metrics (i.e. resolve
"group1.group2.metric_name")
* Add a histogram metric type
Improvements for all metrics:
** New 'description' field, which allows a human-readable description to
be provided for each metric.
** Metrics must serialise themselves to JSON via the RapidJson
library (all by-hand JSON serialisation has been removed).
** Metrics are contained in MetricGroups (replacing the old 'Metrics'
class), which are hierarchically arranged to make grouping metrics
into smaller subsystems more natural.
** Metrics are rendered via the new webserver templating engine,
replacing the old /metrics endpoint. The old /jsonmetrics endpoint is
retained for backwards compatibility.
Improvements for 'simple' metrics:
** SimpleMetric replaces the old PrimitiveMetric class (using much of
the same code). SimpleMetrics are metrics whose value does not itself
have relevant structure (as opposed to sets, lists, etc).
** SimpleMetrics have 'kinds' (counter, gauge, property etc)
** ... and units (from TCounterType), to make pretty-printing easier.
Change-Id: Ida1d125172d8572dfe9541b4271604eff95cfea6
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5722
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
Currently, the backend assumes file paths are on the default FS. Change
this so that the file path is used to infer the appropriate filesystem
to connect to.
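A rough Python sketch of inferring the filesystem from a path is shown below. It is an assumption-laden illustration: `filesystem_key`, `FsCache`, and the `connect` callback are hypothetical, the default FS value is only an example, and the real HdfsFsCache works against libhdfs rather than URL parsing in Python.

```python
from urllib.parse import urlparse


def filesystem_key(path, default_fs="hdfs://localhost:20500"):
    """Return 'scheme://authority' for a fully qualified path.

    Paths without a scheme and authority fall back to the default FS.
    """
    parsed = urlparse(path)
    if parsed.scheme and parsed.netloc:
        return "%s://%s" % (parsed.scheme, parsed.netloc)
    return default_fs


class FsCache:
    """Caches one connection per filesystem, keyed by scheme and authority."""

    def __init__(self, connect):
        self._connect = connect      # hypothetical: key -> connection object
        self._connections = {}

    def get(self, path):
        key = filesystem_key(path)
        if key not in self._connections:
            self._connections[key] = self._connect(key)
        return self._connections[key]
```

Two paths on the same namenode share a connection, while a path on another filesystem gets its own.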
Also moves the error checking inside of HdfsFsCache so that each
callsite doesn't need to construct the boilerplate error message
independently, and adds some missing error handling cases.
Change-Id: I24bc4fbbe8f95b7e5b99ad7e2952b41f1d4c4173
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5200
Tested-by: jenkins
Reviewed-by: Daniel Hecht <dhecht@cloudera.com>
(cherry picked from commit 9d3e2b619a80d1af595193e3cec47284b7b28eba)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5246
This change gets rid of RuntimeState::udf_mem_tracker_, and introduces
ExecNode::expr_mem_tracker to replace it. This way expr allocations
are still separate, but they fall under the correct exec node.
Change-Id: Iaf2cf610c2adf244ade94a9adb906cd68ad78838
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5000
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 700cd9cde07b6717c758b906c98a6785b34d6634)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5212
UDF/UDAs use FunctionContext::AllocateLocal() to allocate memory that
is owned by Impala and can be freed at any time. This is used to
return string data to Impala. However, we weren't freeing local
allocations before the end of the query, even if the returned data was
no longer needed. This meant that queries like "select
min(lower(string_col)) from tbl" would never free the memory allocated
by lower(), effectively accumulating the entire dataset in memory.
This patch adds calls to FunctionContext::FreeLocalAllocations() so we
don't accumulate unneeded memory. We don't want to call this too often
(i.e. for every row evaluated) for performance reasons, so the
allocations are freed in RuntimeState::QueryMaintenance() (renamed
from CheckQueryState()).
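The effect of freeing once per batch instead of once per query can be sketched in Python. This is a toy model under stated assumptions: the classes mirror the names in this message but are not the real API, and calling query_maintenance() once per batch is an assumption about the call frequency.

```python
class FunctionContext:
    """Toy model of a UDF context that owns its local allocations."""

    def __init__(self):
        self.local_allocations = []

    def allocate_local(self, size):
        buf = bytearray(size)
        self.local_allocations.append(buf)
        return buf

    def free_local_allocations(self):
        self.local_allocations.clear()


class RuntimeState:
    def __init__(self, contexts):
        self.contexts = contexts

    def query_maintenance(self):
        # Release all local allocations made since the last call.
        for ctx in self.contexts:
            ctx.free_local_allocations()


def min_lower(batches, ctx, state):
    """Compute min(lower(s)); return (result, peak number of live buffers)."""
    result = None
    peak_live = 0
    for batch in batches:
        for value in batch:
            lowered = value.lower()
            ctx.allocate_local(len(lowered))  # buffer a real UDF would fill
            if result is None or lowered < result:
                result = lowered              # copied out before the free
            peak_live = max(peak_live, len(ctx.local_allocations))
        state.query_maintenance()             # once per batch, not per row
    return result, peak_live
```

With two batches of two rows each, at most two buffers are live at any time in this model instead of four.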
I'm not sure how to test this, but ad-hoc testing confirms that this
does reduce peak memory usage.
Change-Id: I1b973a027151a86521057756f4fe9bf7954881bd
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4536
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
This commit adds a summary page for each completed query which displays
the query statement, the query state, the plan and the execution summary.
Change-Id: I9739f1a2ddd1d6465a69d59bb3f173b0101e6fe8
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3645
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 59ef2ed606d7d9c479f6695c8fc7801fcb0ab476)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3927
This patch changes the interface for evaluating expressions, in order
to allow for thread-safe expression evaluations and easier
codegen. Thread safety is achieved via the ExprContext class, a
light-weight container for expression tree evaluation state. Codegen
is easier because more expressions can be cross-compiled to IR.
See expr.h and expr-context.h for an overview of the API
changes. See sort-exec-exprs.cc for a simple example of the new
interface and hdfs-scanner.cc for a more complicated example.
This patch has not been completely code reviewed and may need further
cleanup/stylistic work, as well as additional perf work.
Change-Id: I3e3baf14ebffd2687533d0cc01a6fb8ac4def849
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3459
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
If RM and per-query memory limits were enabled at the same time, the
per-query limit would be ignored if RM wanted to expand the memory
allocation. This change adds an optional reservation limit to a
memtracker. The original limit goes back to being a hard limit -
i.e. any attempt to consume more than that amount results in
failure. The RM reservation limit is the RM-allocated memory limit. If
that is exceeded it triggers the ExpandRmReservation() method, which tries
to retrieve more memory as long as the hard limit is observed.
The net effect is that per-query memory limits have the intended,
hard-limit effect, while the RM limits coexist nicely and can expand
with more memory as required.
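The interaction of the two limits can be sketched in Python. This is a simplified model, not the MemTracker code: `try_consume` and the `expand_fn` callback (which returns the number of bytes the resource manager grants) are hypothetical.

```python
class MemTracker:
    """Toy tracker with a hard limit and an expandable reservation limit."""

    def __init__(self, hard_limit, reservation_limit, expand_fn):
        self.hard_limit = hard_limit
        self.reservation_limit = reservation_limit
        self.expand_fn = expand_fn   # hypothetical: bytes requested -> granted
        self.consumption = 0

    def try_consume(self, nbytes):
        new_total = self.consumption + nbytes
        if new_total > self.hard_limit:
            return False             # the hard limit is never exceeded
        if new_total > self.reservation_limit:
            if not self._expand_reservation(new_total):
                return False
        self.consumption = new_total
        return True

    def _expand_reservation(self, needed):
        # Ask the resource manager for the shortfall; stay under the hard limit.
        granted = self.expand_fn(needed - self.reservation_limit)
        self.reservation_limit = min(self.reservation_limit + granted,
                                     self.hard_limit)
        return self.reservation_limit >= needed
```

In this model an expansion is only attempted for requests that already fit under the hard limit, so the per-query limit always wins.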
At the same time, we change the precedence of various ways of suggesting
an initial reservation size so that the user can change the reservation
size via a query option (MEM_RESERVATION_SIZE).
Change-Id: I41bfa4eb1336810a8a5946f6be3472111a052144
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3134
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
Users can now type 'summary' in the Impala shell after a query executes
to get a breakdown of the work done by each part of the query plan.
Change-Id: Ia6a43429ffc7778f3c2c8fcbf45d83828263c2ab
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2963
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
(cherry picked from commit 9b98d42acb14d43a64832767528ee572eac4979b)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2995
The runtime profile as we present it is not very useful and I think the structure of
it makes it hard to consume. This patch adds a new client facing schemed set of
counters that are collected from the runtime profiles. For example, with this structure
it would be easy to have the shell get the stats of a running query and print a useful
progress report or to check the most relevant metrics for diagnosing issues.
Here's an example of the output for one of the tpch queries:
Operator             #Hosts  Avg Time   Max Time   #Rows    Est. #Rows  Peak Mem  Est. Peak Mem  Detail
------------------------------------------------------------------------------------------------------------------------
09:MERGING-EXCHANGE  1       79.738us   79.738us   5        5           0         -1.00 B        UNPARTITIONED
05:TOP-N             3       84.693us   88.810us   5        5           12.00 KB  120.00 B
04:AGGREGATE         3       5.263ms    6.432ms    5        5           44.00 KB  10.00 MB       MERGE FINALIZE
08:AGGREGATE         3       16.659ms   27.444ms   52.52K   600.12K     3.20 MB   15.11 MB       MERGE
07:EXCHANGE          3       2.644ms    5.1ms      52.52K   600.12K     0         0              HASH(o_orderpriority)
03:AGGREGATE         3       342.913ms  966.291ms  52.52K   600.12K     10.80 MB  15.11 MB
02:HASH JOIN         3       2s165ms    2s171ms    144.87K  600.12K     13.63 MB  941.01 KB      INNER JOIN, BROADCAST
|--06:EXCHANGE       3       8.296ms    8.692ms    57.22K   15.00K      0         0              BROADCAST
|  01:SCAN HDFS      2       1s412ms    1s978ms    57.22K   15.00K      24.21 MB  176.00 MB      tpch.orders o
00:SCAN HDFS         3       8s032ms    8s558ms    3.79M    600.12K     32.29 MB  264.00 MB      tpch.lineitem l
Change-Id: Iaad4b9dd577c375006313f19442bee6d3e27246a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2964
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Enable order-by without limit
Added BufferedBlockMgr to allocate buffers and spill to disk.
Added Sorter for the external sort implementation
Added new SortNode execution node that completely sorts its input
Changes to enable writing in IoMgr went in a separate patch.
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1539
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins
Conflicts:
testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
Change-Id: I3ece32affe5b006f53bbdfcc03ded01471e818ac
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2900
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins
If a partition had a location that did not exist in HDFS, Impala would
refuse to load its metadata. This meant a typo could render a table
unloadable. We fix this problem by removing the existence check from the
frontend, and by inheriting access from the first extant parent of the
partition directory.
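Finding the first existing ancestor of a missing partition directory can be sketched in Python. The local filesystem stands in for HDFS here, and `first_existing_ancestor` is a hypothetical helper, not the frontend code; note that it returns the path itself when that already exists.

```python
import os


def first_existing_ancestor(path):
    """Walk up from `path` until a directory entry that exists is found.

    Returns None only if even the filesystem root does not exist.
    """
    current = os.path.abspath(path)
    while not os.path.exists(current):
        parent = os.path.dirname(current)
        if parent == current:
            return None      # reached the root without finding anything
        current = parent
    return current
```

The access level of the returned directory would then be used for the partition whose own location does not exist yet.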
Fixing this exposed a second issue, where Impala wouldn't create
directories for partitions in the right place after an INSERT if the
partition location had been changed. To get this right we have to plumb
the partition ID through to Coordinator::FinalizeSuccessfulInsert(), so
that the coordinator can look up the partition's location from the
query-wide descriptor table. As a by-product, this patch rationalises
the per-partition, per-fragment statistics gathering a little bit by
putting almost all the per-partition stats into TInsertPartitionStatus.
Change-Id: I9ee0a1a1ef62cf28f55be3249e8142c362083163
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2851
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
The issue is that Impala crashes in ClearResultCache() with result caching on
for parallel inserts. The reason is that ClearResultCache() accesses the
coordinator RuntimeState to update the query mem tracker. However, there is
no coordinator fragment (or RuntimeState) for parallel inserts.
The fix is to initialize a query mem tracker to track memory usage in the coordinator
instance even if there is no coordinator fragment.
Change-Id: I3a2ef14860f683910c29ae19b931202ca6867b9f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2501
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins
We start up plan fragments in level order, calling Prepare() one level at a time.
Currently, OptimizeLlvmModule() is called at the end of Prepare(). OptimizeLlvmModule()
compiles the IR, which can be expensive (~1 second). Because Prepare() runs in level
order, the compile times are serialized, and for big plans this can add up.
The change is trivial: move the OptimizeLlvmModule() call to the Open() phase.
The snippet from the issue is (running on an empty table):
The codegen time shows up in 'Rows available' and 'First row fetched'.
Before:
Query Timeline:
Query Timeline: 4s182ms
- Start execution: 2.648ms (2.648ms)
- Planning finished: 321.5ms (318.357ms)
- Rows available: 4s161ms (3s840ms)
- First row fetched: 4s163ms (1.368ms)
- Unregister query: 4s167ms (3.786ms)
With this patch:
Query Timeline: 2s111ms
- Start execution: 2.623ms (2.623ms)
- Planning finished: 369.284ms (366.660ms)
- Rows available: 2s018ms (1s649ms)
- First row fetched: 2s093ms (74.962ms)
- Unregister query: 2s097ms (3.805ms)
Change-Id: Ia78782e4708313ed197877749e80a9a68eeec551
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2597
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
The bug: Coordinator::Wait() is supposed to block until rows become available for
consumption by the client. We rely on Wait() to determine when to advance the query
status to a 'ready' state and signal to the client that rows can be fetched.
Long fetch times can trigger client timeouts at various levels (socket, app, etc.).
Coordinator::Wait() simply opens the coordinator fragment's plan tree.
For most plan nodes, Open() does work to prepare the plan tree so that GetNext()
returns quickly. However, ExchangeNode::Open() did not wait until rows were
obtained from the underlying stream receiver.
The fix: Make ExchangeNode::Open() block until rows are available.
Change-Id: I7b197eea11d21fd732414d96c899a17b2d99631c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2128
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2185
their parent's permissions
This patch adds --insert_inherit_permissions. If true, all
new partition directories created by INSERT will inherit their
permissions from their parent. When false, the directories are created
with the default permissions.
Change-Id: Ib2b4c251e51ea5048387169678e8dde34ecfe5f6
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1917
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
* One last NotifyThreadUsageChange() mismatched pair
* Don't set resource in plan fragment params if there isn't a resource
available. This fixes the problem where if no fragment with resources
was assigned to the same node as the coordinator, the coordinator
would have a dummy resource allocation which didn't work with
expansion.
* Substitute #ID in all impalad arguments to start-impala-cluster.py
with the 0-indexed ID of the impalad being started. This is required
to have different Impala processes use different cgroups.
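The substitution amounts to a plain string replacement per process, as in the Python sketch below. The function names and the example flag are hypothetical, not taken from start-impala-cluster.py.

```python
def substitute_id(impalad_args, impalad_id):
    """Replace every '#ID' in the argument string with the impalad's ID."""
    return impalad_args.replace("#ID", str(impalad_id))


def per_process_args(impalad_args, cluster_size):
    """Build one argument string per impalad, using 0-indexed IDs."""
    return [substitute_id(impalad_args, i) for i in range(cluster_size)]
```

For example, an argument string such as "--example_cgroup=impala-#ID" (a made-up flag) expands to a distinct value for each of the impalads in the cluster.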
Change-Id: If8c8fd8bef0809bdaf16115a45a9695fc2bf3e1b
(cherry picked from commit c71ce45e97570b8c09900eb5ae2e26984d3306a4)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2060
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
This defect was caused by incorrectly aggregating intermediate
runtime profiles in the patch for IMPALA-540. Runtime profiles
were being summed instead of averaged.
In this patch, a new AveragedCounter is introduced to maintain a
fragment's average_profile. A fragment's average_profile will now
be the average of the current fragment instance profiles.
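The difference between summing and averaging can be seen in a small Python sketch. This class is a hypothetical simplification and does not reproduce the AveragedCounter interface added by the patch.

```python
class AveragedCounter:
    """Keeps the latest value per fragment instance and reports the mean."""

    def __init__(self):
        self._values = {}    # instance id -> most recent counter value

    def update(self, instance_id, value):
        # A newer intermediate profile replaces the older value for that
        # instance instead of being added on top of it.
        self._values[instance_id] = value

    def value(self):
        if not self._values:
            return 0.0
        return sum(self._values.values()) / len(self._values)
```

Repeated intermediate updates from the same instance therefore do not inflate the result, which is the failure mode described above.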
Change-Id: I2b808023e6206e2e36513f649a16f8c2157f6bb2
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1839
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins
* Each node has one QueryResourceMgr per query it is running fragments
for. A QueryResourceMgr handles creating expansion RPC requests, and
monitoring the thread:VCPU ratio for each query (and requesting more
VCPUs from YARN if oversubscribed).
* MemTrackers now have an ExpandLimit API which does nothing unless they
have a QueryResourceMgr. This method blocks for now, but when the IO
manager changes its API to use TryConsume(), we'll need to issue these
asynchronously to avoid keeping hold of a thread.
* ResourceBroker etc. got updated to support the Expansion API.
Change-Id: Ia3c4635497f0563cfc5cd0e330e5f1f586577200
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1800
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
This is some fairly mechanical refactoring that will be necessary for
the expr refactoring branch. It introduces the following changes:
- lookupSymbol() is moved from CreateFunctionStmtBase to Function
- HdfsFsCache and HdfsLibCache are removed from ExecEnv and made into
standalone singletons
- Various 'is_fe_tests' variables are consolidated into one variable
that is initialized early
This is paving the way for having the FE call into the BE on startup
to resolve builtins' symbols. The general idea is to simplify the
order various services are created, by having classes with fewer
dependencies initialized first.
Change-Id: I5d0e1325792a53680a9738e2e3e67fed6201299c
(cherry picked from commit cd08211dbf3d6275de70691b2387face1ba3f81a)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1690
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
Adds the ability to set per-pool memory limits. Each impalad tracks the memory
used by queries in each pool; a per-pool memory tracker is added between the
per-query trackers and the process memory tracker. The current memory usage
is disseminated via statestore heartbeats (along with the other per-pool stats)
and a cluster-wide estimate of the pool memory usage is updated when topic
updates are received. The admission controller will not admit incoming
requests if the total memory usage is already over the configured pool limit.
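The admission check can be sketched in Python as below. It assumes the cluster-wide estimate is the sum of the per-host values last received via statestore topic updates; the function names are hypothetical and the real admission controller considers other per-pool stats as well.

```python
def cluster_pool_mem_usage(per_host_usage):
    """Estimate pool memory usage from the latest per-host reports (bytes)."""
    return sum(per_host_usage.values())


def can_admit(per_host_usage, pool_mem_limit):
    """Reject new requests once the estimated usage is over the pool limit."""
    return cluster_pool_mem_usage(per_host_usage) <= pool_mem_limit
```

Since the estimate is only as fresh as the last topic update, this check is approximate by design.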
Change-Id: Ie9bc82d99643352ba77fb91b6c25b42938b1f745
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1508
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 64a137930a318e56a7090a317e6aa5df67ea72cd)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1623
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Matthew Jacobs <mj@cloudera.com>
resources across interactions with the Llama. The MiniLlama expects IP:port of
Hadoop DNs, whereas the regular Llama only deals with the hostnames of Yarn NMs
(stripping away the port of resource locations in reservation responses).
Change-Id: I5ebd431336cda4f06df93cfa3fea4a37d1102b63