Commit Graph

147 Commits

Author SHA1 Message Date
Matthew Jacobs
325916eefe IMPALA-2046: Print hostname along with backend number
When the coordinator prints the 'backend number' of
fragments that are finished or result in an error, the
hostname associated with that backend is also printed.

Change-Id: I0b27549bd9155ab9b077933ab6f621f4f0887371
Reviewed-on: http://gerrit.cloudera.org:8080/912
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: Internal Jenkins
2015-09-27 15:13:31 -07:00
Alex Behm
f46aa38161 IMPALA-2325: Skip NULL tuples in Coordinator::ValidateCollectionSlots().
Change-Id: I4b3e07f1ebe0d4244f7386f54d7b74b403b798e2
Reviewed-on: http://gerrit.cloudera.org:8080/817
Reviewed-by: Marcel Kornacker <marcel@cloudera.com>
Tested-by: Internal Jenkins
2015-09-14 13:43:01 -07:00
Alex Behm
deb9c6f8e6 Nested Types: Poor man's projection for collection-typed slots.
Collection-typed slots are expensive to copy, e.g., during data
exchanges or when writing into a buffered-tuple-stream. Even worse,
such slots could be duplicated many times after unnesting in a
subplan. To alleviate this problem, this patch implements a
poor man's projection where collection-typed slots are set to NULL
inside the SubplanNode that flattens them.

The FE guarantees that the contents of an array-typed slot are never
referenced outside of the single UnnestNode that accesses them, so when
returning eos in UnnestNode::GetNext() we also set the unnested array
slot to NULL to avoid those expensive copies in downstream exec nodes.

The FE provides that guarantee by creating a new slot in the parent
scan for every relative CollectionTableRef. For example, for a table
't' with a collection-typed column 'c' the following query would have
two separate slots in the tuple of 't', one for 'c1' and one for 'c2':

select * from t, t.c c1, t.c c2

Change-Id: I90e5b86463019c9ed810c299945c831c744ff563
Reviewed-on: http://gerrit.cloudera.org:8080/763
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2015-09-10 05:44:55 +00:00
Skye Wanderman-Milne
00438981ed IMPALA-1702: Dump fragment's thrift descriptor when a table's partition info is corrupt
This patch stores the fragment's TExecPlanFragmentParams, which is the
top-level thrift struct for the fragment, in the runtime state. We
store the fragment params so we can dump everything sent from the FE
to the BE when we detect a problem. This patch does so when a call to
HdfsTable::GetPartition() returns NULL, which we never expect to
happen. This will give us more information in the case of crashes
caused by IMPALA-1702 or similar bugs that cause bad partition IDs to
be sent to the BE.

Change-Id: Ibb2ff05810cfd5f7aa3e210555d9d69361e8272a
Reviewed-on: http://gerrit.cloudera.org:8080/456
Reviewed-by: Juan Yu <jyu@cloudera.com>
Tested-by: Internal Jenkins
2015-06-18 02:28:40 +00:00
Juan Yu
4810e51446 IMPALA-2008: Fix wrong warning when insert overwrite to empty table
The libhdfs hdfsListDirectory API documentation is wrong. It says the call
returns NULL when there is an error, but it also returns NULL when the
directory is empty. Impala needs to check errno to determine whether an error
happened.
The HDFS issue is addressed by HDFS-8407.

Change-Id: I9574c321a56fe339d4ccc3bb5bea59bc41f48ac4
(cherry picked from commit 20da688af19ca41576c82fd7b7d49b4346dbae92)
Reviewed-on: http://gerrit.cloudera.org:8080/394
Reviewed-by: Juan Yu <jyu@cloudera.com>
Tested-by: Internal Jenkins
2015-05-22 20:23:39 +00:00
Martin Grund
1afe72830a IMPALA-1916: Replace Status::OK by Status::OK()
By doing so, we avoid unnecessarily calling the copy constructor for
Status OK objects and loading the value from memory (due to the old
Status::OK being a global). The impact of this patch was validated by
inspecting both optimized assembly code and generated IR code.

Applying this patch has some effect on the amount of generated code. The
new tool `get_code_size` will list the text, data, and bss sizes for all
archives that we produce in a release build. This patch reduces the code
size by ~20 kB.

      Text      Data    BSS
Old   10578622  576864  40825
New   10559367  576864  40809
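
As a quick check of the figures above, the following sketch subtracts the two rows. It assumes the table reports sizes in bytes; the dictionary layout is only for illustration.

```python
# Section sizes copied from the table above (assumed to be bytes).
old = {"text": 10578622, "data": 576864, "bss": 40825}
new = {"text": 10559367, "data": 576864, "bss": 40809}

# Per-section savings and the overall total.
saved = {section: old[section] - new[section] for section in old}
total_saved = sum(saved.values())

print(saved)
print(total_saved)
```

Nearly all of the saving comes from the text section, which is consistent with the "~20 kB" quoted above.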

The majority of the changes in this patch have been mechanically applied
using:

   find be/src -name "*.cc" -or -name "*.h" | xargs sed -i \
     's/Status::OK;/Status::OK\(\);/'

A new micro-benchmark was added to determine the overhead of using
Status in hot code sections.

Machine Info: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
status:               Function     Rate (iters/ms)          Comparison
----------------------------------------------------------------------
             Call Status::OK()           9.555e+08                  1X
     Call static Status::Error           4.515e+07            0.04725X
   Call Status(Code, 'string')           9.873e+06            0.01033X
            Call w/ Assignment           5.422e+08             0.5674X
           Call Cond Branch OK           5.941e+06           0.006218X
        Call Cond Branch ERROR           7.047e+06           0.007375X
 Call Cond Branch Bool (false)           1.914e+10              20.03X
  Call Cond Branch Bool (true)           1.491e+11                156X
Call Cond Boost Optional (true)          3.935e+09              4.118X
Call Cond Boost Optional (false)         2.147e+10              22.47X

Change-Id: I1be6f4c52e2db8cba35b3938a236913faa321e9e
Reviewed-on: http://gerrit.cloudera.org:8080/351
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
2015-05-22 09:53:13 +00:00
Matthew Jacobs
fe87bb1563 Add MetricDefs, static definitions of metric metadata generated from json
Adds a static definition of the metric metadata used by Impala. The
metric names, descriptions, and other properties are defined in
common/thrift/metrics.json file, and the generate_metrics.py script
creates a thrift representation. The metric definitions are then
available in a constant map which is used at runtime to instantiate
metrics, looking them up in the map by the metric key.

New metrics should be defined by adding an entry to the list of metrics
in metrics.json with the following properties:

key:         The unique string identifying the metric. If the metric can
             be templated, e.g. rpc call duration, it may be a format
             string (in the format used by strings::Substitute()).
description: A text description of the metric. May also be a format
             string.
label:       A brief title for the metric, not currently used by
             Impala but provided for external tools.
units:       The unit of the metric. Must be a valid value of TUnit.
kind:        The kind of metric, e.g. GAUGE or COUNTER. Must be a valid
             value of TMetricKind.
contexts:    The context in which this metric may be instantiated.
             Usually "IMPALAD", "STATESTORED", "CATALOGD", but may be
             a different kind of 'entity'. Not currently used by
             Impala but provided for modeling purposes for external
             tools.

For example, adding the counter for the total number of queries run over
the lifetime of the impalad process might look like:

  {
    "key": "impala-server.num-queries",
    "description": "The total number of queries processed.",
    "label": "Queries",
    "units": "UNIT",
    "kind": "COUNTER",
    "contexts": [
      "IMPALAD"
    ]
  }

TODO: Incorporate 'label' into the metrics debug page.
TODO: Verify the context at runtime, e.g. verify 'contexts' contains,
      e.g. a DCHECK.

After the metric definition is added, the generate_metrics.py script
will generate the TMetricDefs.thrift that contains a TMetricDef for
the metric definition. At runtime, the metric can be instantiated
using the key defined in metrics.json. Gauges, Counters, and
Properties are instantiated using static methods on MetricGroup. Other
metric types are instantiated using static CreateAndRegister methods
on their associated classes.
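
The lookup-by-key flow described above can be sketched roughly as follows. This is an illustration in Python, not Impala's C++ implementation: `METRIC_DEFS`, `substitute`, `instantiate`, and the templated `rpc.$0.call-count` entry are hypothetical names invented for the example. The `$0` placeholder style follows strings::Substitute().

```python
import json

# Hypothetical stand-in for metrics.json: the entry from the example above
# plus one made-up templated entry that uses a $0 placeholder.
METRICS_JSON = """
[
  {"key": "impala-server.num-queries",
   "description": "The total number of queries processed.",
   "label": "Queries", "units": "UNIT", "kind": "COUNTER",
   "contexts": ["IMPALAD"]},
  {"key": "rpc.$0.call-count",
   "description": "Number of $0 calls.",
   "label": "RPC Calls", "units": "UNIT", "kind": "COUNTER",
   "contexts": ["IMPALAD"]}
]
"""

# The constant map: metric key -> metric definition.
METRIC_DEFS = {d["key"]: d for d in json.loads(METRICS_JSON)}


def substitute(template, *args):
    """Replace $0, $1, ... with positional arguments (simplified)."""
    # Replace higher indices first so "$1" does not clobber "$10".
    for i in reversed(range(len(args))):
        template = template.replace("$%d" % i, str(args[i]))
    return template


def instantiate(key, *args):
    """Look up a definition by key and return a concrete metric record."""
    if key not in METRIC_DEFS:
        raise KeyError("no metric definition for key: " + key)
    d = METRIC_DEFS[key]
    return {
        "key": substitute(d["key"], *args),
        "description": substitute(d["description"], *args),
        "kind": d["kind"],
        "units": d["units"],
        "value": 0,
    }
```

For instance, `instantiate("rpc.$0.call-count", "ExecPlanFragment")` would yield a record keyed `rpc.ExecPlanFragment.call-count`, while an unknown key fails loudly instead of silently creating an undefined metric.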

TODO: Generate a thrift enum used to lookup metric defs.
TODO: Consolidate the instantiation of metrics that are created
      outside of metrics.h (i.e. collection metrics, memory metrics).
TODO: Need a better way to verify if metric definitions are missing.

Change-Id: Iba7f94144d0c34f273c502ce6b9a2130ea8fedaa
Reviewed-on: http://gerrit.cloudera.org:8080/330
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Internal Jenkins
2015-05-14 21:27:28 +00:00
Martin Grund
706893c459 IMPALA-1647: Making Spinlocks std / boost lock compatible
To be able to use our own spinlock implementation together with the std
/ boost lock_guards, it needs to be lock compatible. This patch adds the
three required methods: lock(), unlock() and try_lock().

Furthermore, the old ScopedSpinLock class is removed to avoid code
duplication.

Change-Id: Icb082b573e5ee71752f5da65a21c7753f40a4a4b
Reviewed-on: http://gerrit.cloudera.org:8080/304
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
2015-04-22 20:29:39 +00:00
Henry Robinson
705f036829 Move almost all RPC invocations to DoRpc() interface
We recently added ClientConnection::DoRpc() to wrap the tedious retry
logic required to get the underlying connection into the right
state. Doing so saves lots of code in the caller, so this patch moves
almost all calls to the new interface.
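
The general shape of such a wrapper might look like the sketch below. It is a Python illustration under assumptions, not the actual ClientConnection::DoRpc() code: `FakeConnection`, `do_rpc`, and the "reopen and retry exactly once" policy are hypothetical.

```python
class FakeConnection:
    """Hypothetical connection that starts out stale until it is reopened."""

    def __init__(self):
        self.healthy = False
        self.reopen_count = 0

    def reopen(self):
        self.healthy = True
        self.reopen_count += 1

    def call(self, method, request):
        if not self.healthy:
            raise ConnectionError("stale connection")
        return "%s(%s) ok" % (method, request)


def do_rpc(conn, method, request):
    """Try the call once; on a connection failure, reopen and retry once.

    Returns (ok, result_or_error) so the caller has a single error path.
    """
    try:
        return True, conn.call(method, request)
    except ConnectionError:
        pass  # fall through to the single retry below
    try:
        conn.reopen()
        return True, conn.call(method, request)
    except ConnectionError as e:
        return False, str(e)
```

A caller then checks one result per invocation, for example `ok, result = do_rpc(conn, "Ping", "x")`, instead of repeating the retry logic at every call site.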

The rewrite isn't completely mechanical - some call sites had very
conservative try {} catch {} blocks that I have removed in favour of
having just one error path per invocation.

The remaining call sites are in ResourceBroker::SendLlamaRpc() and
friends, where the handling is a bit unusual.

Change-Id: I972d7328a1ff5c7ace35dd3da43eee4981d867f4
Reviewed-on: http://gerrit.cloudera.org:8080/349
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Internal Jenkins
2015-04-22 02:30:20 +00:00
Martin Grund
2eb12e9593 Deprecating namespace directive declarations (std, boost)
This patch removes all occurrences of "using namespace std" and "using
namespace boost(.*)" from the codebase. However, there are still cases
where namespace directives are used (e.g. for rapidjson, thrift,
gutil). These have to be tackled in subsequent patches.

To reduce the patch size, this patch introduces a new header file called
"names.h" that pulls in many of our most frequently used symbols iff the
corresponding include was already added. For example, the header pulls in
map / string / vector etc. only iff the matching header (e.g. vector) was
already included. This requires "common/names.h" to be the last include.
After including `names.h` a new block contains a sorted list of using
definitions (this patch does not fix namespace directive declarations for
namespaces other than std / boost.)

Change-Id: Iebe4c054670d655bc355347e381dae90999cfddf
Reviewed-on: http://gerrit.cloudera.org:8080/338
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
2015-04-18 01:26:47 +00:00
Martin Grund
b582cdc22b IMPALA-1598: Adding Error Codes to Log Messages
This patch introduces the concept of error codes for errors that are
recorded in Impala and are going to be presented to the client. These
error codes are used to aggregate and group incoming error / warning
messages to reduce the spill on the shell and increase the usefulness of
the messages. By splitting the message string from the implementation,
it becomes possible to edit the string independently of the code and
pave the way for internationalization.

Error messages are defined as a combination of an enum value and a
string. Both are defined in the Error.thrift file that is automatically
generated using the script in common/thrift/generate_error_codes.py. The
goal of the script is to have a central understandable repository of
error messages. Adding new messages to this file will require rebuilding
the thrift part. The proxy class ErrorMessage is responsible for
representing an error and capturing the parameters that are used to format
the error message string.

When error messages are recorded they are recorded based on the
following algorithm:

- If an error message is of type GENERAL, do not aggregate this message
  and simply add it to the total number of messages
- If an error message is of a specific type, record the first error
  message as a sample and for all other occurrences increment the count.
- The coordinator will merge all error messages except the ones of type
  GENERAL and display a count.
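
The recording rules above can be sketched as follows. This is a minimal Python illustration, not the Impala implementation: the `ErrorLog` class, its method names, and the error code string used in the usage note are hypothetical.

```python
from collections import OrderedDict

GENERAL = "GENERAL"


class ErrorLog:
    """Aggregates messages per error code, except for GENERAL ones."""

    def __init__(self):
        self.general = []             # every GENERAL message, kept verbatim
        self.by_code = OrderedDict()  # code -> [first message as sample, count]

    def record(self, code, message):
        if code == GENERAL:
            self.general.append(message)
        elif code in self.by_code:
            self.by_code[code][1] += 1          # only bump the count
        else:
            self.by_code[code] = [message, 1]   # first occurrence is the sample

    def merge(self, other):
        """Coordinator-side merge of another backend's log into this one."""
        self.general.extend(other.general)      # GENERAL messages stay unmerged
        for code, (sample, count) in other.by_code.items():
            if code in self.by_code:
                self.by_code[code][1] += count
            else:
                self.by_code[code] = [sample, count]

    def lines(self):
        out = list(self.general)
        for sample, count in self.by_code.values():
            out.append("%s (1 of %d similar)" % (sample, count))
        return out
```

Recording the same hypothetical code (say `"PARQUET_MULTIPLE_BLOCKS"`) on several backends and merging the logs produces one sample line with a combined count, while each GENERAL message is still listed individually.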

For example, in the case of the parquet file spanning multiple blocks
the output will look like:

    Parquet files should not be split into multiple hdfs-blocks.
    file=hdfs://localhost:20500/fid.parq (1 of 321 similar)

All messages are always logged to VLOG. In the coordinator error
messages are merged across all backends to retain readability in the
case of large clusters.

The current version of this patch adds these new error codes to some of
the most important error messages as a reference implementation.

Change-Id: I1f1811631836d2dd6048035ad33f7194fb71d6b8
Reviewed-on: http://gerrit.cloudera.org:8080/39
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
2015-03-01 03:37:32 +00:00
Henry Robinson
9c3946c57c Rename TCounterType to more general TUnit
Change-Id: I5a43b5d843d5d7ee625d265fc249df77a69395ed
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5755
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2015-01-14 17:01:49 -08:00
Henry Robinson
0c2a71576f Metric changes
This patch reworks a lot of the metrics subsystem, laying much of the
groundwork for unifying runtime profiles and metrics in the future, as
well as enabling better rendering of metric data in our webpages, and
richer integration with thirdparty monitoring tools like CM.

There are lots of changes. The most significant are below.

TODO (incomplete list):

* Add descriptions for all metrics
* Settle on a standard hierarchy for process-wide metric groups
* Add path-based resolution for searching for metrics (i.e. resolve
  "group1.group2.metric_name")
* Add a histogram metric type

Improvements for all metrics:

** New 'description' field, which allows a human-readable description to
   be provided for each metric.
** Metrics must serialise themselves to JSON via the RapidJson
   library (all by-hand JSON serialisation has been removed).
** Metrics are contained in MetricGroups (replacing the old 'Metrics'
   class), which are hierarchically arranged to make grouping metrics
   into smaller subsystems more natural.
** Metrics are rendered via the new webserver templating engine,
   replacing the old /metrics endpoint. The old /jsonmetrics endpoint is
   retained for backwards compatibility.

Improvements for 'simple' metrics:

** SimpleMetric replaces the old PrimitiveMetric class (using much of
   the same code), and are metrics whose value does not itself have
   relevant structure (as opposed to sets, lists, etc).
** SimpleMetrics have 'kinds' (counter, gauge, property etc)
** ... and units (from TCounterType), to make pretty-printing easier.

Change-Id: Ida1d125172d8572dfe9541b4271604eff95cfea6
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5722
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2015-01-11 21:05:00 -08:00
Dan Hecht
3c64890afe S3: Change backend libhdfs connections to use the appropriate scheme://authority
Currently, the backend assumes file paths are on the default FS.  Change
this so that the file path is used to infer the appropriate filesystem
to connect to.

Also moves the error checking inside of HdfsFsCache so that each
callsite doesn't need to handle the boilerplate error message
construction independently (and adds some missing error handling cases).

Change-Id: I24bc4fbbe8f95b7e5b99ad7e2952b41f1d4c4173
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5200
Tested-by: jenkins
Reviewed-by: Daniel Hecht <dhecht@cloudera.com>
(cherry picked from commit 9d3e2b619a80d1af595193e3cec47284b7b28eba)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5246
2014-11-13 15:52:09 -08:00
Skye Wanderman-Milne
7f411ecf9f Track expr memory allocations as part of parent node's usage
This change gets rid of RuntimeState::udf_mem_tracker_, and introduces
ExecNode::expr_mem_tracker to replace it. This way expr allocations
are still separate, but they fall under the correct exec node.

Change-Id: Iaf2cf610c2adf244ade94a9adb906cd68ad78838
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5000
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 700cd9cde07b6717c758b906c98a6785b34d6634)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5212
2014-11-11 20:08:14 -08:00
Nong Li
9bd8b5b8cd Make GetExecSummary() RPC callable while query is running.
Change-Id: I51fbade93b735dc007b1e84762194061a466947a
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4768
Tested-by: jenkins
Reviewed-by: Nong Li <nong@cloudera.com>
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/5091
2014-11-03 22:33:51 -08:00
Skye Wanderman-Milne
ceeb5ee17d Revert "Free local UDF/UDA allocations"
This reverts commit 4c85fafc53e55801c9608a89c25ce6a118e9da3d.

Change-Id: I4891922cb4279a235073dde79806cc27d4169ddc
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4641
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 53b73e5b4dd11888ab3c86d5d331a652f51f8092)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4644
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-10-06 15:11:48 -07:00
Henry Robinson
e5201dbd24 Add remote fragment start latency counter to profile
Change-Id: I4deb1bf489223d4128357cfde0269e3519e56c8f
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4613
Reviewed-by: Victor Bittorf <victor.bittorf@cloudera.com>
Tested-by: jenkins
2014-10-06 15:11:04 -07:00
Skye Wanderman-Milne
f3b33181d3 Free local UDF/UDA allocations
UDF/UDAs use FunctionContext::AllocateLocal() to allocate memory that
is owned by Impala and can be freed at any time. This is used to
return string data to Impala. However, we weren't freeing local
allocations before the end of the query, even if the returned data was
no longer needed. This meant that queries like "select
min(lower(string_col)) from tbl" would never free the memory allocated
by lower(), effectively accumulating the entire dataset in memory.

This patch adds calls to FunctionContext::FreeLocalAllocations() so we
don't accumulate unneeded memory. We don't want to call this too often
(i.e. for every row evaluated) for performance reasons, so the
allocations are freed in RuntimeState::QueryMaintenance() (renamed
from CheckQueryState()).

I'm not sure how to test this, but ad-hoc testing confirms that this
does reduce peak memory usage.

Change-Id: I1b973a027151a86521057756f4fe9bf7954881bd
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4536
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-10-06 15:08:50 -07:00
Henry Robinson
7274eba8eb CDH-21091: Add coordinator address to query profile
Change-Id: Ie520b467b3eabf1ef045dc43498442a3d0dbd3a7
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4090
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/4125
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-09-02 13:13:31 -07:00
Henry Robinson
00902c4330 Add a summary page to query table
This commit adds a summary page for each completed query which displays
the query statement, the query state, the plan and the execution summary.

Change-Id: I9739f1a2ddd1d6465a69d59bb3f173b0101e6fe8
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3645
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 59ef2ed606d7d9c479f6695c8fc7801fcb0ab476)
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3927
2014-08-19 19:30:15 -07:00
Skye Wanderman-Milne
559b83d3d0 Expr refactoring
This patch changes the interface for evaluating expressions, in order
to allow for thread-safe expression evaluations and easier
codegen. Thread safety is achieved via the ExprContext class, a
light-weight container for expression tree evaluation state. Codegen
is easier because more expressions can be cross-compiled to IR.

See expr.h and expr-context.h for an overview of the API
changes. See sort-exec-exprs.cc for a simple example of the new
interface and hdfs-scanner.cc for a more complicated example.

This patch has not been completely code reviewed and may need further
cleanup/stylistic work, as well as additional perf work.

Change-Id: I3e3baf14ebffd2687533d0cc01a6fb8ac4def849
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3459
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-08-17 12:44:44 -07:00
Nong Li
08233499d4 IMPALA-639: Fix per node peak mem usage reporting.
Change-Id: Ie130df4f9434ae8d9ff500afc41a17010927338a
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3729
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.sjc.cloudera.com:8080/3737
2014-08-03 15:15:55 -07:00
Nong Li
9a2f7d3bbe Add fragment start up query timeline.
Change-Id: Icf015904d91f8e3a043c39b50a6c9eb1e1576c20
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3519
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3573
2014-07-22 02:54:51 -07:00
Henry Robinson
dd4c1c32dc Add optional RM reservation limit to memtrackers
If RM and per-query memory limits were enabled at the same time, the
per-query limit would be ignored if RM wanted to expand the memory
allocation. This change adds an optional reservation limit to a
memtracker. The original limit goes back to being a hard limit -
i.e. any attempt to consume more than that amount results in
failure. The RM reservation limit is the RM-allocated memory limit. If
that is exceeded it triggers the ExpandRmReservation() method, which tries
to retrieve more memory as long as the hard limit is observed.
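
The interplay of the two limits can be sketched as below. This is a simplified Python model under assumptions, not the actual memtracker code: the `MemTracker` class here, `try_consume`, and the callback signature (bytes requested in, bytes granted out) are hypothetical, and the initial reservation is assumed not to exceed the hard limit.

```python
class MemTracker:
    """Tracks consumption against a hard limit and an expandable RM reservation."""

    def __init__(self, hard_limit, rm_reservation, expand_rm_reservation):
        self.hard_limit = hard_limit          # per-query limit, never exceeded
        self.rm_reservation = rm_reservation  # memory currently granted by RM
        # Callback that asks RM for more memory and returns the bytes granted.
        self.expand_rm_reservation = expand_rm_reservation
        self.consumed = 0

    def try_consume(self, nbytes):
        wanted = self.consumed + nbytes
        if wanted > self.hard_limit:
            return False  # the hard limit always wins
        if wanted > self.rm_reservation:
            # Reservation exceeded: try to expand, but never past the hard limit.
            granted = self.expand_rm_reservation(wanted - self.rm_reservation)
            self.rm_reservation = min(self.hard_limit,
                                      self.rm_reservation + granted)
            if wanted > self.rm_reservation:
                return False  # RM could not provide enough memory
        self.consumed = wanted
        return True
```

With a hard limit of 100 and an initial reservation of 40, consuming 60 in total triggers an expansion to 60, whereas an attempt to reach 110 fails without ever contacting RM.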

The net effect is that per-query memory limits have the intended,
hard-limit effect, while the RM limits coexist nicely and can expand
with more memory as required.

At the same time, we change the precedence of various ways of suggesting
an initial reservation size so that the user can change the reservation
size via a query option (MEM_RESERVATION_SIZE).

Change-Id: I41bfa4eb1336810a8a5946f6be3472111a052144
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3134
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-07-01 18:08:47 -07:00
Henry Robinson
df9c13dcbe Fix memtracker instantiation when using FETCH_FIRST
Change-Id: I47b614b3559880f428951b015291bee4f5af6c49
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3038
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-06-20 12:29:20 -07:00
Alex Behm
9dc883b140 IMPALA-1005: Print consistent plan fragment ids in explain plan and runtime profile.
Change-Id: I63b59a896dc9dc0c9ed1d5e889f7b5626ba61202
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3037
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3124
2014-06-18 15:44:43 -07:00
Skye Wanderman-Milne
ce22acc974 Add DCHECK to PrintExecSummary() to prevent indexing into empty array
Change-Id: Ic4d21310a5cfbb8284f58bc0244572906ae050e4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3017
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 575e6b5fc4035fb9ffb9ad703c64ed5b10010849)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3087
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
2014-06-17 22:55:35 -07:00
Alex Behm
0251e5215c Allow MergeNode with constant selects to run correctly on multiple fragment instances.
Change-Id: I0b1ff27f591366b960aa944fadabbb4b35f4b9b4
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2832
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/3002
2014-06-12 16:39:55 -07:00
Alex Behm
c8ba9568d7 Do not print exec summary if it has not been initialized.
Change-Id: I9cade74541193e7874d21540ad989aa82cf00506
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2986
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2996
2014-06-12 04:34:01 -07:00
Henry Robinson
9a7c6d286f Add 'summary' to shell
Users can now type 'summary' in the Impala shell after a query executes
to get a breakdown of the work done by each part of the query plan.

Change-Id: Ia6a43429ffc7778f3c2c8fcbf45d83828263c2ab
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2963
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
(cherry picked from commit 9b98d42acb14d43a64832767528ee572eac4979b)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2995
2014-06-12 02:59:58 -07:00
Nong Li
5d903efca3 ExecSummary
The runtime profile as we present it is not very useful and I think the structure of
it makes it hard to consume. This patch adds a new client facing schemed set of
counters that are collected from the runtime profiles. For example, with this structure
it would be easy to have the shell get the stats of a running query and print a useful
progress report or to check the most relevant metrics for diagnosing issues.

Here's an example of the output for one of the tpch queries:
Operator              #Hosts   Avg Time   Max Time    #Rows  Est. #Rows  Peak Mem  Est. Peak Mem  Detail
------------------------------------------------------------------------------------------------------------------------
09:MERGING-EXCHANGE        1   79.738us   79.738us        5           5         0        -1.00 B  UNPARTITIONED
05:TOP-N                   3   84.693us   88.810us        5           5  12.00 KB       120.00 B
04:AGGREGATE               3    5.263ms    6.432ms        5           5  44.00 KB       10.00 MB  MERGE FINALIZE
08:AGGREGATE               3   16.659ms   27.444ms   52.52K     600.12K   3.20 MB       15.11 MB  MERGE
07:EXCHANGE                3    2.644ms      5.1ms   52.52K     600.12K         0              0  HASH(o_orderpriority)
03:AGGREGATE               3  342.913ms  966.291ms   52.52K     600.12K  10.80 MB       15.11 MB
02:HASH JOIN               3    2s165ms    2s171ms  144.87K     600.12K  13.63 MB      941.01 KB  INNER JOIN, BROADCAST
|--06:EXCHANGE             3    8.296ms    8.692ms   57.22K      15.00K         0              0  BROADCAST
|  01:SCAN HDFS            2    1s412ms    1s978ms   57.22K      15.00K  24.21 MB      176.00 MB  tpch.orders o
00:SCAN HDFS               3    8s032ms    8s558ms    3.79M     600.12K  32.29 MB      264.00 MB  tpch.lineitem l

Change-Id: Iaad4b9dd577c375006313f19442bee6d3e27246a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2964
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-06-11 03:10:11 -07:00
Srinath Shankar
5755b0bdee Order by without limit for Impala
Enable order-by without limit
Added BufferedBlockMgr to allocate buffers and spill to disk.
Added Sorter for the external sort implementation
Added new SortNode execution node that completely sorts its input
Changes to enable writing in IoMgr went in a separate patch.

Reviewed-on: http://gerrit.ent.cloudera.com:8080/1539
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins

Conflicts:

	testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test

Change-Id: I3ece32affe5b006f53bbdfcc03ded01471e818ac
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2900
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins
2014-06-09 16:58:08 -07:00
Henry Robinson
b53ee1cafe IMPALA-827: Follow-on comments
Change-Id: Ibb5deb972517871191d1cc89b499b1eb4fd47f7b
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2158
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 6afc23b485fc26f5311dc92e36243c24ffec259c)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2905
2014-06-08 23:09:54 -07:00
Henry Robinson
60cbe1b0e1 IMPALA-741: Support partitions with non-existent HDFS locations
If a partition had a location that did not exist in HDFS, Impala would
refuse to load its metadata. This meant a typo could render a table
unloadable. We fix this problem by removing the existence check from the
frontend, and by inheriting access from the first extant parent of the
partition directory.
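
Finding the "first extant parent" amounts to walking up the path until a directory exists. The sketch below shows the idea in Python, using the local filesystem as a stand-in for HDFS; `first_extant_parent` is a hypothetical helper name, not a function from the patch.

```python
import os


def first_extant_parent(path):
    """Walk up from path until an existing directory is found."""
    current = os.path.abspath(path)
    while not os.path.isdir(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current
```

For a partition location such as `<table dir>/year=2014/month=6` where only the table directory exists, the helper returns the table directory, whose access would then be inherited.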

Fixing this exposed a second issue, where Impala wouldn't create
directories for partitions in the right place after an INSERT if the
partition location had been changed. To get this right we have to plumb
the partition ID through to Coordinator::FinalizeSuccessfulInsert(), so
that the coordinator can look up the partition's location from the
query-wide descriptor table. As a by-product, this patch rationalises
the per-partition, per-fragment statistics gathering a little bit by
putting almost all the per-partition stats into TInsertPartitionStatus.

Change-Id: I9ee0a1a1ef62cf28f55be3249e8142c362083163
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2851
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-06-08 18:44:45 -07:00
Srinath Shankar
d193a1e8a5 IMPALA-963: Impala crash in ClearResultCache()
The issue is that Impala crashes in ClearResultCache() with result caching on
for parallel inserts. The reason is that ClearResultCache() accesses the
coordinator RuntimeState to update the query mem tracker. However, there is
no coordinator fragment (or RuntimeState) for parallel inserts.
The fix is to initialize a query mem tracker to track memory usage in the coordinator
instance even if there is no coordinator fragment.

Change-Id: I3a2ef14860f683910c29ae19b931202ca6867b9f
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2501
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins
2014-05-19 12:40:12 -07:00
Nong Li
8c2dda5771 CDH-18785: Move optimize module from PlanExecFragment Prepare() to Open().
We start up plan fragments in level order, calling Prepare() one level at a time.
Currently, OptimizeLlvmModule is called at the end of Prepare(). OptimizeLlvmModule()
compiles the IR which can be expensive (~1 second). By doing this in level order, we
serialize the compile time and for big plans, this can add up.

This change is trivial: it moves the OptimizeLlvmModule() call to the Open() phase.

The snippet from the issue (running on an empty table) is below.
The codegen time shows up in 'Rows available' and 'First row fetched'.

Before:
Query Timeline:
 Query Timeline: 4s182ms
  - Start execution: 2.648ms (2.648ms)
  - Planning finished: 321.5ms (318.357ms)
  - Rows available: 4s161ms (3s840ms)
  - First row fetched: 4s163ms (1.368ms)
  - Unregister query: 4s167ms (3.786ms)

With this patch:
Query Timeline: 2s111ms
  - Start execution: 2.623ms (2.623ms)
  - Planning finished: 369.284ms (366.660ms)
  - Rows available: 2s018ms (1s649ms)
  - First row fetched: 2s093ms (74.962ms)
  - Unregister query: 2s097ms (3.805ms)

Change-Id: Ia78782e4708313ed197877749e80a9a68eeec551
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2597
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-05-17 16:36:39 -07:00
Alex Behm
2fff51d9e9 IMP-1329,IMPALA-924: Make ExchangeNode::Open() block until rows are available.
The bug: Coordinator::Wait() is supposed to block until rows become available for
consumption by the client. We rely on Wait() to determine when to advance the query
status to a 'ready' state and signal to the client that rows can be fetched.
Long fetch times can trigger client timeouts at various levels (socket, app, etc.).
Coordinator::Wait() simply opens the coordinator fragment's plan tree.
For most plan nodes, Open() does work to prepare the plan tree, s.t., GetNext()
returns quickly. However, for ExchangeNodes, Open() did not wait
until rows were obtained from the underlying stream receiver.
The fix: Make ExchangeNode::Open() block until rows are available.
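
The blocking behavior can be modeled roughly as follows. This Python sketch is an assumption-laden illustration, not the actual ExchangeNode code: the `ExchangeNode` class here, the use of a `queue.Queue` as the stream receiver, and buffering the first batch inside `open()` are all hypothetical choices.

```python
import queue
import threading


class ExchangeNode:
    """Toy exchange node whose open() waits for the first incoming batch."""

    def __init__(self, receiver):
        self.receiver = receiver  # queue.Queue filled by remote senders
        self.first_batch = None

    def open(self):
        # Block until the first batch arrives, so that a caller waiting on
        # open() only proceeds once rows are actually available.
        self.first_batch = self.receiver.get()

    def get_next(self):
        if self.first_batch is not None:
            batch, self.first_batch = self.first_batch, None
            return batch
        return self.receiver.get()


def demo():
    receiver = queue.Queue()
    # A sender delivers the first batch after a short delay.
    threading.Timer(0.05, receiver.put, args=(["row1"],)).start()
    node = ExchangeNode(receiver)
    node.open()             # returns only once a batch has arrived
    return node.get_next()  # served from the buffered batch, no waiting
```

In this model the waiting happens in `open()`, so the first `get_next()` returns immediately, which mirrors the goal of keeping fetch times short once the query is reported as ready.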

Change-Id: I7b197eea11d21fd732414d96c899a17b2d99631c
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2128
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2185
2014-04-10 23:49:38 -07:00
Henry Robinson
99c37aac37 IMPALA-827: Add an option for directories created by INSERT to inherit
their parent's permissions

This patch adds --insert_inherit_permissions. If true, all
new partition directories created by INSERT will inherit their
permissions from their parent. When false, the directories are created
with the default permissions.
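
As an analogy only, the same idea on a local POSIX filesystem might look like
the sketch below. The real flag applies to directories created by INSERT; the
function name create_partition_dir and the example path are assumptions made
for illustration.

```python
import os
import stat
import tempfile


def create_partition_dir(path, inherit_permissions):
    """Create `path`; optionally copy the permission bits of its parent directory."""
    os.mkdir(path)  # created with the default permissions (subject to the umask)
    if inherit_permissions:
        parent = os.path.dirname(os.path.abspath(path))
        parent_mode = stat.S_IMODE(os.stat(parent).st_mode)
        os.chmod(path, parent_mode)
    return stat.S_IMODE(os.stat(path).st_mode)


# Usage: the new partition directory ends up with the parent's mode.
with tempfile.TemporaryDirectory() as root:
    os.chmod(root, 0o750)
    inherited_mode = create_partition_dir(os.path.join(root, "year=2014"), True)
```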

Change-Id: Ib2b4c251e51ea5048387169678e8dde34ecfe5f6
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1917
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-04-04 10:25:20 -07:00
Henry Robinson
8e5848eaf8 RM fixes to get tests passing
* One last NotifyThreadUsageChange() mismatched pair
* Don't set resource in plan fragment params if there isn't a resource
  available. This fixes the problem where if no fragment with resources
  was assigned to the same node as the coordinator, the coordinator
  would have a dummy resource allocation which didn't work with
  expansion.
* Substitute #ID in all impalad arguments to start-impala-cluster.py
  with the 0-indexed ID of the impalad being started. This is required
  to have different Impala processes use different cgroups.
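
The #ID substitution can be sketched as below. This is not the code from
start-impala-cluster.py; the helper name and the flag in the usage line are
hypothetical.

```python
def build_impalad_args(arg_template, cluster_size):
    """Return one argument string per impalad, replacing #ID with its 0-indexed ID."""
    return [arg_template.replace("#ID", str(i)) for i in range(cluster_size)]


# Hypothetical flag, used only to show the substitution.
per_node_args = build_impalad_args("--example_cgroup=impala_#ID", 3)
print(per_node_args)
```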

Change-Id: If8c8fd8bef0809bdaf16115a45a9695fc2bf3e1b
(cherry picked from commit c71ce45e97570b8c09900eb5ae2e26984d3306a4)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/2060
Tested-by: jenkins
Reviewed-by: Henry Robinson <henry@cloudera.com>
2014-03-24 15:07:45 -07:00
Srinath Shankar
6038b7593f IMPALA-789: Runtime profile averages are not computed correctly.
This defect was caused by incorrectly aggregating intermediate
runtime profiles in the patch for IMPALA-540. Runtime profiles
were being summed instead of averaged.
In this patch, a new AveragedCounter is introduced to maintain a
fragment's average_profile. A fragment's average_profile will now
be the average of the current fragment instance profiles.
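
A minimal sketch of the averaging idea, assuming each fragment instance
periodically reports its current counter value. The class below borrows the
AveragedCounter name, but its interface is an assumption and not the actual
C++ API.

```python
class AveragedCounter:
    """Tracks the latest value reported by each fragment instance and exposes the mean."""

    def __init__(self):
        self._latest = {}  # instance id -> most recent counter value

    def update(self, instance_id, value):
        # Overwrite, do not accumulate: a new report replaces the instance's old value,
        # so repeated intermediate updates cannot inflate the result.
        self._latest[instance_id] = value

    def value(self):
        if not self._latest:
            return 0.0
        return sum(self._latest.values()) / len(self._latest)


counter = AveragedCounter()
counter.update("instance-a", 10)
counter.update("instance-b", 30)
counter.update("instance-a", 20)  # intermediate update from the same instance
print(counter.value())
```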

Change-Id: I2b808023e6206e2e36513f649a16f8c2157f6bb2
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1839
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
Tested-by: jenkins
2014-03-10 18:55:00 -07:00
Henry Robinson
da1c7d37ff Add memory and VCPU expansion to RM-enabled queries
* Each node has one QueryResourceMgr per query it is running fragments
  for. A QueryResourceMgr handles creating expansion RPC requests, and
  monitoring the thread:VCPU ratio for each query (and requesting more
  VCPUs from YARN if oversubscribed).
* MemTrackers now have an ExpandLimit API which does nothing unless they
  have a QueryResourceMgr. This method blocks for now, but when the IO
  manager changes its API to use TryConsume(), we'll need to issue these
  asynchronously to avoid keeping hold of a thread.
* ResourceBroker etc. got updated to support the Expansion API.

Change-Id: Ia3c4635497f0563cfc5cd0e330e5f1f586577200
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1800
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
2014-03-07 08:58:05 -08:00
Henry Robinson
f6a297568d IMPALA-844: Catch TException rather than TTransportException during RPCs
Change-Id: I9a0df4ed01047860504efbe20ff56d9d2dc191cb
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1732
Reviewed-by: Henry Robinson <henry@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 05a5bba34e05b8b87a13c2ebbdaa32cbd38c7441)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1754
Tested-by: Henry Robinson <henry@cloudera.com>
2014-03-04 22:26:55 -08:00
Skye Wanderman-Milne
a52b002757 Changes necessary for expr refactoring
This is some fairly mechanical refactoring that will be necessary for
the expr refactoring branch. It introduces the following changes:

- lookupSymbol() is moved from CreateFunctionStmtBase to Function
- HdfsFsCache and HdfsLibCache are removed from ExecEnv and made into
  standalone singletons
- Various 'is_fe_tests' variables are consolidated into one variable
  that is initialized early

This is paving the way for having the FE call into the BE on startup
to resolve builtins' symbols. The general idea is to simplify the
order in which various services are created, by having classes with fewer
dependencies initialized first.

Change-Id: I5d0e1325792a53680a9738e2e3e67fed6201299c
(cherry picked from commit cd08211dbf3d6275de70691b2387face1ba3f81a)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1690
Reviewed-by: Skye Wanderman-Milne <skye@cloudera.com>
Tested-by: jenkins
2014-02-27 20:38:26 -08:00
Matthew Jacobs
af84be67dd Admission controller: add memory limits in addition to number of requests
Adds the ability to set per-pool memory limits. Each impalad tracks the memory
used by queries in each pool; a per-pool memory tracker is added between the
per-query trackers and the process memory tracker. The current memory usage
is disseminated via statestore heartbeats (along with the other per-pool stats)
and a cluster-wide estimate of the pool memory usage is updated when topic
updates are received. The admission controller will not admit incoming
requests if the total memory usage is already over the configured pool limit.
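
The admission check described above can be sketched roughly as follows. The
class and method names are hypothetical, and treating "over the limit" as a
strict comparison is an assumption.

```python
class PoolMemEstimate:
    """Per-pool view kept by one impalad: local usage plus the last value heard per peer."""

    def __init__(self, mem_limit_bytes):
        self.mem_limit_bytes = mem_limit_bytes
        self.local_usage_bytes = 0
        self._remote_usage_bytes = {}  # host -> usage from the latest topic update

    def on_topic_update(self, host, usage_bytes):
        # A newer update from the same host replaces the previous value.
        self._remote_usage_bytes[host] = usage_bytes

    def cluster_usage_bytes(self):
        return self.local_usage_bytes + sum(self._remote_usage_bytes.values())

    def can_admit(self):
        # Reject only when the estimated usage is already over the pool limit.
        return self.cluster_usage_bytes() <= self.mem_limit_bytes


pool = PoolMemEstimate(mem_limit_bytes=100)
pool.local_usage_bytes = 40
pool.on_topic_update("host-b", 50)
admit_under_limit = pool.can_admit()
pool.on_topic_update("host-b", 70)
admit_over_limit = pool.can_admit()
```

Because the remote figures arrive with heartbeats, the cluster-wide number is
only an estimate and can lag behind the true usage.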

Change-Id: Ie9bc82d99643352ba77fb91b6c25b42938b1f745
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1508
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
(cherry picked from commit 64a137930a318e56a7090a317e6aa5df67ea72cd)
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1623
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Matthew Jacobs <mj@cloudera.com>
2014-02-20 14:19:34 -08:00
Nong Li
904ae86e82 IMPALA-626: Allow dropping functions while they are running.
Change-Id: Ia9d6fa1daadddbd05961696d13b9ff43fef2da61
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1621
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
2014-02-20 13:12:10 -08:00
Srinath Shankar
37fad61b96 IMPALA-540: Impala debug webpage should report intermediate runtime profile
The JIRA description is not completely accurate - in fact a runtime profile
is produced for running queries, but updated fragment information was missing.

Change-Id: Icdc79fbc9675d01f23e242e1163a639d52e2df2a
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1323
Reviewed-by: Nong Li <nong@cloudera.com>
Tested-by: jenkins
Reviewed-on: http://gerrit.ent.cloudera.com:8080/1395
Reviewed-by: Srinath Shankar <sshankar@cloudera.com>
2014-01-29 13:37:16 -08:00
Alex Behm
944cfc2e82 Translating hostports of impalads to network addresses suitable for identifying
resources across interactions with the Llama. The MiniLlama expects IP:port of
Hadoop DNs, whereas the regular Llama only deals with the hostnames of Yarn NMs
(stripping away the port of resource locations in reservation responses).

Change-Id: I5ebd431336cda4f06df93cfa3fea4a37d1102b63
2014-01-15 15:12:05 -08:00
Alex Behm
ab60e40aeb Resource requests now use IPs for the MiniLlama and hostnames for the Llama.
Change-Id: Ifaedabc5ae7cf513c9c2131e071295f093c5bd12
2014-01-15 15:12:05 -08:00
Alex Behm
0614774706 Fixed reservation from MiniLlama by translating hosts of resource requests from impalad hostports to Hadoop DN hostports.
Change-Id: I7a9a26ec4309710f0ad62a1bd18fb076fe6dd120
2014-01-15 15:12:04 -08:00