This commit tackles a few additions and improvements to test-with-docker. In general, I'm adding workloads (e.g., exhaustive, rat-check), tuning memory settings and parallelism, and trying to speed things up.

Bug fixes:

* Embarrassingly, I was still skipping thrift-server-test in the backend tests. This was a mistake in handling feedback from my last review.
* I made the timeline a little bit taller so it clips less.

Adding workloads:

* I added the RAT licensing check.
* I added exhaustive runs. This led me to model the suites a little bit more in Python, with a class representing a suite and a bunch of data about it. It's not perfect and still coupled with the entrypoint.sh shell script, but it feels workable. As part of adding exhaustive tests, I had to re-work the timeout handling, since different suites now meaningfully have different timeouts.

Speed-ups:

* To speed up test runs, I added a mechanism to split py.test suites into multiple shards with a py.test argument (see the sketch after this message). This involved a little bit of work in conftest.py and exposing $RUN_CUSTOM_CLUSTER_TESTS_ARGS in run-all-tests.sh. Furthermore, I moved a bit more of the logic for managing the list of suites into Python.
* I now do the full build with "-notests" and build the backend tests only in the relevant target that needs them. This speeds up "docker commit" significantly by removing about 20GB from the container. I had to indicate that expr-codegen-test depends on expr-codegen-test-ir, which was missing.
* I sped up copying the Kudu data: previously I did both a move and a copy; now I do two moves. One of the moves is cross-filesystem and therefore slow, but this halves the amount of copying.

Memory usage:

* I tweaked the memlimit_gb settings to have a higher default. I've been fighting empirically to get the tests to run well on c4.8xlarge and m4.10xlarge. The more memory a minicluster and test suite run use, the fewer parallel suites we can run. By observing the peak processes at the tail of a run (with a new "memory_usage" function that uses a ps/sort/awk trick) and by observing peak container total_rss, I found that several JVMs didn't have Xmx settings set. I added Xms/Xmx settings in a few places:
  * The non-first Impalad does very little JVM work, so having an Xmx keeps it small, even in the parallel tests.
  * Datanodes do work, but they were essentially never garbage collecting, because JVM defaults let them use up to 1/4 of the machine's memory. (I observed this based on RSS at the end of the run; nothing fancier.) Adding Xms/Xmx settings helped.
  * Similarly, I piped the settings through to HBase.

  A few daemons still run without resource limitations, but they don't seem to be a problem.

Change-Id: I43fe124f00340afa21ad1eeb6432d6d50151ca7c
Reviewed-on: http://gerrit.cloudera.org:8080/10123
Reviewed-by: Joe McDonnell <joemcdonnell@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
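The sharding bullet above describes splitting a py.test suite across containers via a py.test argument and some conftest.py work. Below is a minimal sketch of what such a conftest.py hook can look like; the --num-shards/--shard option names and the crc32-based assignment are assumptions for illustration, not necessarily how Impala's conftest.py does it.

    # conftest.py (illustrative sketch, not Impala's actual implementation)
    import zlib

    def pytest_addoption(parser):
        parser.addoption("--num-shards", type=int, default=1,
                         help="total number of shards to split the suite into")
        parser.addoption("--shard", type=int, default=0,
                         help="which shard (0-based) this invocation should run")

    def pytest_collection_modifyitems(config, items):
        num_shards = config.getoption("--num-shards")
        shard = config.getoption("--shard")
        if num_shards <= 1:
            return
        kept, deselected = [], []
        for item in items:
            # Hash the test's node id so shard assignment is stable across runs.
            if zlib.crc32(item.nodeid.encode()) % num_shards == shard:
                kept.append(item)
            else:
                deselected.append(item)
        items[:] = kept
        config.hook.pytest_deselected(items=deselected)

Each parallel container would then run the same suite with a different shard index (e.g. py.test --num-shards=4 --shard=1 ...), with the extra arguments passed through something like $RUN_CUSTOM_CLUSTER_TESTS_ARGS.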
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

<!--

Template/header for a timeline visualization of a multi-container build.
The timelines represent interesting log lines, with one row per container.
The charts represent CPU usage within those containers.

To use this, concatenate this with a '<script>' block defining
a global variable named data.

The expected format of data is exemplified by the following,
and is tightly coupled with the implementation generating
it in monitor.py. The intention of this unfriendly file format
is to do as much munging as plausible in Python.

To make the visualization relative to the start time (i.e., to say that all
builds start at 00:00), the timestamps are all seconds since the build began.
To make the visualization work with them, the timestamps are then converted
into local time, and get displayed reasonably. This is a workaround to the fact
that the visualization library for the timelines does not accept any data types
that represent duration, but we still want timestamp-style formatting.

var data = {
  // max timestamp seen, in seconds since the epoch
  "max_ts": 8153.0,
  // map of container name to an array of metrics
  "metrics": {
    "i-20180312-140548-ee-test-serial": [
      // a single metric point is an array of timestamp, user CPU, system CPU;
      // CPU is the percent of 1 CPU used since the previous timestamp.
      [
        4572.0,
        0.11,
        0.07
      ]
    ]
  },
  // Array of timelines
  "timeline": [
    // a timeline entry contains a name (for the entire row of the timeline),
    // the message (for a segment of the timeline), and start and end timestamps
    // for the segment.
    [
      "i-20180312-140548",
      "+ echo '>>> build' '4266 (begin)'",
      0.0,
      0.0
    ]
  ]
}
-->

<script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script>

<script type="text/javascript">
google.charts.load("current", {packages:["timeline", "corechart"]});
google.charts.setOnLoadCallback(drawChart);

function ts_to_hms(secs) {
  var s = secs % 60;
  var m = Math.floor(secs / 60) % 60;
  var h = Math.floor(secs / (60 * 60));
  return [h, m, s];
}

/* Returns a Date object corresponding to secs seconds since the epoch, in
 * localtime. Date(x) and Date(0, 0, 0, 0, 0, 0, 0, x) differ in that the
 * former returns UTC whereas the latter returns the browser local time.
 * For consistent handling within this visualization, we use localtime.
 *
 * Beware that local time can be discontinuous around time changes.
 */
function ts_to_date(secs) {
  // secs may be a float, so we use millis as a common denominator unit
  var millis = 1000 * secs;
  return new Date(1970 /* yr; beginning of unix epoch */, 0 /* mo */, 0 /* d */,
      0 /* hr */, 0 /* min */, 0 /* sec */, millis);
}

function drawChart() {
  var timelineContainer = document.getElementById('timelineContainer');
  var chart = new google.visualization.Timeline(timelineContainer);
  var dataTable = new google.visualization.DataTable();
  dataTable.addColumn({ type: 'string', id: 'Position' });
  dataTable.addColumn({ type: 'string', id: 'Name' });
  // timeofday isn't supported here
  dataTable.addColumn({ type: 'datetime', id: 'Start' });
  dataTable.addColumn({ type: 'datetime', id: 'End' });
  // Timeline
  for (i = 0; i < data.timeline.length; ++i) {
    var row = data.timeline[i];
    dataTable.addRow([ row[0], row[1], ts_to_date(row[2]), ts_to_date(row[3]) ]);
  }
  chart.draw(dataTable, { height: "400px" } );

  for (const k of Object.keys(data.metrics)) {
    var lineChart = document.createElement("div");
    lineChartContainer.appendChild(lineChart);

    var dataTable = new google.visualization.DataTable();
    dataTable.addColumn({ type: 'timeofday', id: 'Time' });
    dataTable.addColumn({ type: 'number', id: 'User' });
    dataTable.addColumn({ type: 'number', id: 'System' });

    for (const row of data.metrics[k]) {
      dataTable.addRow([ ts_to_hms(row[0]), row[1], row[2] ]);
    }
    var options = {
      title: 'CPU',
      legend: { position: 'bottom' },
      hAxis: {
        minValue: [0, 0, 0],
        maxValue: ts_to_hms(data.max_ts)
      }
    };

    var chart = new google.visualization.LineChart(lineChart);
    chart.draw(dataTable, options);
  }
}
</script>
<div id="timelineContainer" style="height: 400px;"></div>
<div id="lineChartContainer" style="height: 200px;"></div>
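A note on how this header is consumed: per the comment at the top of the file, the template is concatenated with a '<script>' block that defines the global `data` variable, which monitor.py generates. The Python sketch below illustrates that assembly step under assumed file names and a hypothetical helper; it is not the actual monitor.py code.

    # Illustrative sketch only: stitch the template together with collected data.
    import json

    def render_timeline(template_path, data, output_path):
        # `data` follows the documented shape:
        #   {"max_ts": ..., "metrics": {...}, "timeline": [...]}
        with open(template_path) as f:
            header = f.read()
        data_block = '<script type="text/javascript">\nvar data = %s;\n</script>\n' % json.dumps(data)
        with open(output_path, "w") as f:
            f.write(header)
            f.write(data_block)

    # Example, using values from the format documented above:
    # render_timeline("timeline.html.template",
    #                 {"max_ts": 8153.0,
    #                  "metrics": {"i-20180312-140548-ee-test-serial": [[4572.0, 0.11, 0.07]]},
    #                  "timeline": [["i-20180312-140548", "+ echo '>>> build' '4266 (begin)'", 0.0, 0.0]]},
    #                 "timeline.html")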