IMPALA-13106: Support larger imported query profile sizes through compression

Imported query profiles are currently stored in IndexedDB.
Although IndexedDB is not subject to the overall storage limits of
other browser storage APIs, the size of a single attribute / field
is still limited.

To support larger query profiles, v2.1.0 of the 'pako' compression
library has been added along with its associated license.

The query profile JSON is compressed with this library before it is
added to IndexedDB.
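
As a rough illustration of the compress-before-store step, the sketch
below round-trips a profile string through deflate and inflate. It is
not the code from this change: Node's built-in zlib stands in for pako
(both produce zlib-format deflate streams), and the sample profile
text and the deflateProfile() helper are hypothetical.

```javascript
// Sketch only: Node's built-in zlib is an assumed stand-in for pako.
const zlib = require("node:zlib");

// Hypothetical profile text; a real Impala profile is far larger.
const profileText = JSON.stringify({
  contents: {profile_name: "Query (id=abc123:def456)", child_profiles: []}
});

// Compress the raw JSON text before it is stored (level 3, as in the commit).
function deflateProfile(text) {
  return zlib.deflateSync(Buffer.from(text, "utf-8"), {level: 3});
}

// Decompress and parse the stored bytes when they are read back.
function inflateParseJSON(deflated) {
  return JSON.parse(zlib.inflateSync(deflated).toString("utf-8"));
}

const stored = deflateProfile(profileText);
const restored = inflateParseJSON(stored);
console.log(restored.contents.profile_name);
```

Storing the compressed bytes rather than the parsed object means every
reader has to inflate and parse again, which trades some CPU time on
page load for a smaller stored field.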

Because parsing and compressing a profile is a long-running process
that can block the main thread, this work has been delegated to
a worker script running in the background. The worker script returns
the parsed query attributes and the compressed text of the profile
sent to it.
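
The shape of the worker's reply can be sketched as a plain function,
which is easier to run outside a browser than a real Worker. This is
an assumption-laden sketch, not the worker script itself:
parseAndCompress() and extractQueryId() are hypothetical names, only
the query ID is extracted, and Node's zlib stands in for pako.

```javascript
// Sketch only: Node's built-in zlib is an assumed stand-in for pako.
const zlib = require("node:zlib");

// Extracts the ID from a profile name of the form "Query (id=<id>)".
function extractQueryId(profileName) {
  return profileName.substring(profileName.indexOf("=") + 1, profileName.length - 1);
}

// Hypothetical stand-in for the body of the worker's onmessage handler:
// returns parsed attributes plus the compressed text, or an error field.
function parseAndCompress(profileText) {
  const query = {};
  try {
    const profile = JSON.parse(profileText).contents;
    query.id = extractQueryId(profile.profile_name);
    query.profile = zlib.deflateSync(Buffer.from(profileText, "utf-8"), {level: 3});
  } catch (err) {
    // Report the failure in the reply instead of throwing.
    query.error = err;
  }
  return query;
}

const ok = parseAndCompress('{"contents": {"profile_name": "Query (id=a1:b2)"}}');
const bad = parseAndCompress("not json");
console.log(ok.id, ok.profile.length, Boolean(bad.error));
```

Returning the error as a field of the reply lets the page decide how
to report a bad profile without a separate error channel.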

Compression takes time; hence, an alert message is displayed on the
queries page warning the user not to close or reload the page while
the import is running. On completion, the raw total size, compressed
total size, and total processing time are logged to the browser console.
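
The bookkeeping behind those console messages might look roughly like
the sketch below. The profile texts are made up, formatBytes() is a
hypothetical substitute for the page's getReadableSize() helper (whose
implementation is not part of this change), and Node's zlib again
stands in for pako.

```javascript
// Sketch only: Node's built-in zlib is an assumed stand-in for pako.
const zlib = require("node:zlib");

// Hypothetical formatter, not the getReadableSize() used by the page.
function formatBytes(bytes, precision) {
  const units = ["B", "KB", "MB", "GB"];
  let unitIndex = 0;
  while (bytes >= 1024 && unitIndex < units.length - 1) {
    bytes /= 1024;
    unitIndex++;
  }
  return bytes.toFixed(precision) + " " + units[unitIndex];
}

// Hypothetical profile texts with repetitive padding so they compress well.
const profileTexts = [
  JSON.stringify({contents: {profile_name: "Query (id=a1:b2)", pad: "x".repeat(4096)}}),
  JSON.stringify({contents: {profile_name: "Query (id=c3:d4)", pad: "y".repeat(8192)}})
];

const startTime = Date.now();
let rawTotalSize = 0;
let compressedTotalSize = 0;
for (const text of profileTexts) {
  const raw = Buffer.from(text, "utf-8");
  rawTotalSize += raw.length;
  compressedTotalSize += zlib.deflateSync(raw, {level: 3}).length;
}

console.log("Raw total size : " + formatBytes(rawTotalSize, 2));
console.log("Compressed total size : " + formatBytes(compressedTotalSize, 2));
console.log("Processing time : " + (Date.now() - startTime) + " ms");
```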

When multiple profiles are chosen, they are processed one at a time:
the next profile is not started until compression and insertion of
the current one have finished.
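
One way to picture this one-at-a-time behaviour is a callback chain in
which the next item is started only from the completion callback of
the previous one. The sketch below is a simplified model under that
assumption; processSequentially() and fakeCompressAndInsert() are
hypothetical, and the fake step completes synchronously, whereas the
real page waits on the worker and on IndexedDB.

```javascript
// Processes items one at a time; next() is only ever called from the
// completion callback of the previous item.
function processSequentially(items, processOne, onAllDone) {
  let index = 0;
  function next() {
    if (index >= items.length) {
      onAllDone(index);
      return;
    }
    processOne(items[index], () => {
      index++;
      next();
    });
  }
  next();
}

const events = [];

// Hypothetical stand-in for "compress the profile, then insert it".
function fakeCompressAndInsert(fileName, done) {
  events.push("start " + fileName);
  events.push("finish " + fileName);
  done();
}

let processedCount = -1;
processSequentially(["a.json", "b.json"], fakeCompressAndInsert, (count) => {
  processedCount = count;
});
console.log(events.join(", "));
```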

The stored query profile field is decompressed before it is parsed on
the query plan, query profile, query statement, and query timeline pages.
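
A detail page's lookup can be sketched as follows: inflate the stored
field, parse it, then search the first child profile's info strings.
The stored record and the getInfoString() helper are hypothetical, and
inflateParseJSON() here is backed by Node's zlib rather than pako, so
treat it as an approximation of the helper added in this change.

```javascript
// Sketch only: Node's built-in zlib is an assumed stand-in for pako.
const zlib = require("node:zlib");

// Approximates the helper in compression_util.js using zlib.
function inflateParseJSON(deflated) {
  return JSON.parse(zlib.inflateSync(deflated).toString("utf-8"));
}

// Hypothetical stored record; "profile" holds the compressed JSON text.
const record = {
  id: "a1:b2",
  profile: zlib.deflateSync(Buffer.from(JSON.stringify({
    contents: {child_profiles: [{info_strings: [
      {key: "Sql Statement", value: "select 1"},
      {key: "Plan", value: "00:UNION"}
    ]}]}
  }), "utf-8"), {level: 3})
};

// Hypothetical helper: returns the value for a key, or undefined if absent.
function getInfoString(storedRecord, wantedKey) {
  const entry = inflateParseJSON(storedRecord.profile).contents
      .child_profiles[0].info_strings.find(({key}) => key === wantedKey);
  return entry ? entry.value : undefined;
}

console.log(getInfoString(record, "Plan"));
```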

Added tests for the compression library methods utilized by
the worker script.

Manual testing has been done on Firefox 126.0.1 and Chrome 126.0.6478.

Change-Id: I8c4f31beb9cac89051460bf764b6d50c3933bd03
Reviewed-on: http://gerrit.cloudera.org:8080/21463
Reviewed-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenkins@cloudera.com>
Surya Hebbar
2024-05-23 18:22:46 +05:30
committed by Impala Public Jenkins
parent fef4bf430d
commit 42e5ea7ea3
16 changed files with 231 additions and 81 deletions

1
.gitattributes vendored

@@ -31,3 +31,4 @@ www/datatables-1.13.2.min.js binary
www/datatables-1.13.2.min.css binary
www/highlight/highlight.pack.js binary
www/jquery/jquery-3.5.1.min.js binary
www/pako.min.js binary


@@ -1217,3 +1217,29 @@ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
--------------------------------------------------------------------------------
www/pako.min.js: MIT license
(The MIT License)
Copyright (C) 2014-2017 by Vitaly Puzrin and Andrei Tuputcyn
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
--------------------------------------------------------------------------------


@@ -38,14 +38,19 @@ be/src/thirdparty/murmurhash/*
be/src/thirdparty/mpfit/*
be/src/thirdparty/fast_double_parser/*
be/src/kudu/gutil
www/highlight/*
www/DataTables*/*
www/datatables-*.*
www/bootstrap/css/bootstrap*
www/bootstrap/js/bootstrap*
www/favicon.ico
www/c3/*
www/Chart*
www/d3.v3.min.js
www/d3.v5.min.js
www/DataTables*/*
www/datatables-*.*
www/favicon.ico
www/highlight/*
www/icons/*
www/jquery/jquery-3.5.1.min.js
www/pako.min.js
tests/comparison/leopard/static/css/bootstrap*
tests/comparison/leopard/static/fonts/glyphicons-halflings*
tests/comparison/leopard/static/js/bootstrap*
@@ -57,10 +62,6 @@ shell/ext-py/six-1.14.0/*
shell/ext-py/sqlparse-0.3.1/*
shell/ext-py/thrift-0.16.0/*
shell/ext-py/thrift_sasl-0.4.3/*
www/c3/*
www/d3.v3.min.js
www/d3.v5.min.js
www/jquery/jquery-3.5.1.min.js
tests/comparison/leopard/static/css/hljs.css
tests/comparison/leopard/static/js/highlight.pack.js
common/protobuf/kudu


@@ -23,7 +23,7 @@ common-footer.tmpl) }}
<head><title>Apache Impala</title>
<script src='{{ __common__.host-url }}/www/jquery/jquery-3.5.1.min.js'></script>
<script src='{{ __common__.host-url }}/www/bootstrap/js/bootstrap-4.3.1.min.js'></script>
<script src='{{ __common__.host-url }}/www/scripts/util.js'></script>
<script src='{{ __common__.host-url }}/www/scripts/common_util.js'></script>
<link rel="stylesheet" type="text/css" href="{{ __common__.host-url }}/www/datatables-1.13.2.min.css"/>
<script type="text/javascript" src="{{ __common__.host-url }}/www/datatables-1.13.2.min.js"></script>
<link href='{{ __common__.host-url }}/www/bootstrap/css/bootstrap-4.3.1.min.css' rel='stylesheet' media='screen'>

2
www/pako.min.js vendored Normal file

File diff suppressed because one or more lines are too long


@@ -234,11 +234,12 @@ command line parameter.</p>
<h3 id="imported_queries_header">
0 Imported Query Profiles
<sup><a href='#' data-toggle="tooltip" title="These are locally stored queries parsed from JSON query profiles.">[?]</a></sup>
<input id="json_profile_chooser" type="file" accept=".json" onchange="uploadProfile();" multiple/>
<input id="json_profile_chooser" type="file" accept=".json" onchange="startProfilesUpload();" multiple/>
<label for="json_profile_chooser">
<span class="btn btn-primary">Import JSON Profile</span>
</label>
<input id="clear_profiles_button" type="button" class="btn btn-primary" title="Clear All" onclick="clearProfiles();" value="X"/>
<span id="error_message" class="alert-sm alert-danger" style="display: none;"></span>
</h3>
<table id="imported_queries_table" class='table table-hover table-border'>
@@ -282,6 +283,10 @@ command line parameter.</p>
var dbOpenReq = indexedDB.open("imported_queries");
var db;
const profileParseWorker = new Worker("{{ __common__.host-url }}"
+ "/www/scripts/queries/profileParseWorker.js");
var query_processor_start_time, raw_total_size, compressed_total_size, upload_count;
function insertRowVal(row, val) {
row.insertCell().innerHTML = val;
}
@@ -302,67 +307,59 @@ command line parameter.</p>
function clearProfiles() {
db.transaction("profiles", "readwrite").objectStore("profiles").clear().onsuccess = () => {
setScrollReload();
};
}
function showImportedQueriesStatusMessage(status_message) {
error_message.style.display = "unset";
error_message.textContent = status_message;
}
function startProfilesUpload() {
raw_total_size = 0;
compressed_total_size = 0;
upload_count = 0;
query_processor_start_time = Date.now();
showImportedQueriesStatusMessage("Query profiles import in progress."
+ " Do not close/refresh the page.");
uploadProfile();
}
function uploadProfile() {
var uploadCount = 0;
for (var i = 0; i < json_profile_chooser.files.length; i++) {
json_profile_chooser.disabled = true;
var fileReader = new FileReader();
fileReader.readAsText(json_profile_chooser.files[i]);
fileReader.readAsText(json_profile_chooser.files[upload_count]);
fileReader.onload = (e) => {
try {
var profile = JSON.parse(e.target.result).contents;
var val = profile.profile_name;
var query = {};
query.id = val.substring(val.indexOf("=") + 1, val.length - 1);
query.user = profile.child_profiles[0].info_strings
.find(({key}) => key === "User").value;
query.default_db = profile.child_profiles[0].info_strings
.find(({key}) => key === "Default Db").value;
query.type = profile.child_profiles[0].info_strings
.find(({key}) => key === "Query Type").value;
query.start_time = profile.child_profiles[0].info_strings
.find(({key}) => key === "Start Time").value;
query.end_time = profile.child_profiles[0].info_strings
.find(({key}) => key === "End Time").value;
query.bytes_read = profile.child_profiles[2].counters
.find(({counter_name}) => counter_name === "TotalBytesRead").value;
query.bytes_read = getReadableSize(query.bytes_read, 2);
query.bytes_sent = profile.child_profiles[2].counters
.find(({counter_name}) => counter_name === "TotalBytesSent").value;
query.bytes_sent = getReadableSize(query.bytes_sent, 2);
query.state = profile.child_profiles[0].info_strings
.find(({key}) => key === "Query State").value;
query.rows_fetched = profile.child_profiles[1].counters
.find(({counter_name}) => counter_name === "NumRowsFetched").value;
query.resource_pool = profile.child_profiles[0].info_strings
.find(({key}) => key === "Request Pool").value;
query.statement = profile.child_profiles[0].info_strings
.find(({key}) => key === "Sql Statement").value;
if (query.statement.length > 250) {
query.statement = query.statement.substring(0, 250) + "...";
profileParseWorker.postMessage(e.target.result);
};
}
query.profile = profile;
var profileStore = db.transaction("profiles", "readwrite").objectStore("profiles");
profileStore.put(query).onsuccess = () => {
uploadCount++;
if (uploadCount == json_profile_chooser.files.length) {
setTimeout(setScrollReload, 1000);
}
};
} catch (err) {
var alertMessage = document.createElement("span");
alertMessage.className = "alert-sm alert-danger";
alertMessage.textContent = "Error parsing some JSON profiles";
profileParseWorker.onmessage = (e) => {
if (e.data.error) {
showImportedQueriesStatusMessage("Error parsing some JSON profiles");
setTimeout(setScrollReload, 1500);
imported_queries_header.appendChild(alertMessage);
console.log(err);
console.log(e.data.error);
return;
}
var profileStore = db.transaction("profiles", "readwrite").objectStore("profiles");
profileStore.put(e.data).onsuccess = () => {
raw_total_size += json_profile_chooser.files[upload_count].size;
compressed_total_size += e.data.profile.length;
upload_count++;
if (upload_count >= json_profile_chooser.files.length) {
console.log("Raw total size : " + getReadableSize(raw_total_size, 2));
console.log("Compressed total size : "
+ getReadableSize(compressed_total_size, 2));
console.log("Query Profile(s) Processing time : " + getReadableTimeMS(Date.now()
- query_processor_start_time));
setTimeout(setScrollReload, 2000);
} else {
// Recursively call uploadProfile() until all selected JSON profiles
// are parsed and stored
uploadProfile();
}
};
}
}
};
dbOpenReq.onupgradeneeded = (e) => {
db = e.target.result;


@@ -23,9 +23,10 @@ under the License.
<pre id="query_plan">{{plan}}</pre>
<script>
<script type="module">
$("#plan-text-tab").addClass("active");
import {inflateParseJSON} from "./www/scripts/compression_util.js";
var dbOpenReq = indexedDB.open("imported_queries");
var db;
@@ -56,8 +57,8 @@ if (window.location.search.includes("imported")) {
}
var profileStore = db.transaction("profiles", "readonly").objectStore("profiles");
profileStore.get(getQueryID()).onsuccess = (e) => {
query_plan.textContent = e.target.result.profile.child_profiles[0]
.info_strings.find(({key}) => key === "Plan").value;
query_plan.textContent = inflateParseJSON(e.target.result.profile).contents
.child_profiles[0].info_strings.find(({key}) => key === "Plan").value;
};
};
}


@@ -24,6 +24,8 @@ under the License.
{{> www/query_detail_tabs.tmpl }}
<script src="{{ __common__.host-url }}/www/scripts/compression_util.js" type="module"></script>
<br/>
<div id="profile_download_section">
<h4>Download Profile (Available Formats):
@@ -41,9 +43,11 @@ under the License.
<pre id="plain_text_profile_field">{{profile}}</pre>
<script>
<script type="module">
$("#profile-tab").addClass("active");
import {inflateParseJSON} from "./www/scripts/compression_util.js";
var dbOpenReq = indexedDB.open("imported_queries");
var db;
@@ -112,7 +116,8 @@ if (window.location.search.includes("imported")) {
}
var profileStore = db.transaction("profiles", "readonly").objectStore("profiles");
profileStore.get(getQueryID()).onsuccess = (e) => {
plain_text_profile_field.textContent = profileToString(e.target.result.profile);
plain_text_profile_field.textContent = profileToString(
inflateParseJSON(e.target.result.profile).contents);
};
};
}


@@ -28,13 +28,16 @@ under the License.
statements -->
<link rel="stylesheet" href="{{ __common__.host-url }}/www/highlight/styles/default.css">
<script src="{{ __common__.host-url }}/www/highlight/highlight.pack.js"></script>
<script src="{{ __common__.host-url }}/www/scripts/compression_util.js" type="module"></script>
<script>hljs.initHighlightingOnLoad();</script>
<span id="tab_body">{{?stmt}}<pre class="code"><code>{{stmt}}</code></pre>{{/stmt}}</span>
<script>
<script type="module">
$("#stmt-tab").addClass("active");
import {inflateParseJSON} from "./www/scripts/compression_util.js";
var dbOpenReq = indexedDB.open("imported_queries");
var db;
@@ -64,8 +67,8 @@ if (window.location.search.includes("imported")) {
}
var profileStore = db.transaction("profiles", "readonly").objectStore("profiles");
profileStore.get(getQueryID()).onsuccess = (e) => {
var sql_query = e.target.result.profile.child_profiles[0].info_strings
.find(({key}) => key === "Sql Statement").value;
var sql_query = inflateParseJSON(e.target.result.profile).contents
.child_profiles[0].info_strings.find(({key}) => key === "Sql Statement").value;
var sql_stmt_body = document.createElement("pre");
sql_stmt_body.className = "code";
var sql_code = sql_stmt_body.appendChild(document.createElement("code"))


@@ -23,6 +23,7 @@ under the License.
<script src="{{ __common__.host-url }}/www/d3.v5.min.js"></script>
<script src="{{ __common__.host-url }}/www/c3/c3.v7.min.js"></script>
<script src="{{ __common__.host-url }}/www/scripts/compression_util.js" type="module"></script>
<link href="{{ __common__.host-url }}/www/c3/c3.v7.min.css" rel="stylesheet">
<div class="container">
@@ -166,12 +167,13 @@ import {renderTimingDiagram, collectFragmentEventsFromProfile, ntics, set_ntics}
"./www/scripts/query_timeline/fragment_diagram.js";
import {collectUtilizationFromProfile, toogleUtilizationVisibility,
destroyUtilizationChart}
from "/www/scripts/query_timeline/host_utilization_diagram.js";
from "./www/scripts/query_timeline/host_utilization_diagram.js";
import {collectFragmentMetricsFromProfile, closeFragmentMetricsChart} from
"/www/scripts/query_timeline/fragment_metrics_diagram.js";
"./www/scripts/query_timeline/fragment_metrics_diagram.js";
import {profile, set_profile, maxts, set_maxts, diagram_width, set_diagram_width,
border_stroke_width, resizeHorizontalAll}
from "/www/scripts/query_timeline/global_members.js";
from "./www/scripts/query_timeline/global_members.js";
import {inflateParseJSON} from "./www/scripts/compression_util.js";
var chart_export_style;
var last_maxts;
@@ -217,7 +219,7 @@ if (window.location.search.includes("imported")) {
}
var profileStore = db.transaction("profiles", "readonly").objectStore("profiles");
profileStore.get(getQueryID()).onsuccess = (e) => {
set_profile(e.target.result.profile);
set_profile(inflateParseJSON(e.target.result.profile).contents);
refreshView();
};
};


@@ -0,0 +1,22 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
import "../pako.min.js";
export function inflateParseJSON(deflated_string) {
return JSON.parse(pako.inflate(deflated_string, {to : "string"}));
}


@@ -0,0 +1,57 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
importScripts("../../pako.min.js");
importScripts("../common_util.js");
self.onmessage = (e) => {
var query = {};
try {
var profile = JSON.parse(e.data).contents;
var val = profile.profile_name;
query.id = val.substring(val.indexOf("=") + 1, val.length - 1);
query.user = profile.child_profiles[0].info_strings
.find(({key}) => key === "User").value;
query.default_db = profile.child_profiles[0].info_strings
.find(({key}) => key === "Default Db").value;
query.type = profile.child_profiles[0].info_strings
.find(({key}) => key === "Query Type").value;
query.start_time = profile.child_profiles[0].info_strings
.find(({key}) => key === "Start Time").value;
query.end_time = profile.child_profiles[0].info_strings
.find(({key}) => key === "End Time").value;
query.bytes_read = profile.child_profiles[2].counters
.find(({counter_name}) => counter_name === "TotalBytesRead").value;
query.bytes_read = getReadableSize(query.bytes_read, 2);
query.bytes_sent = profile.child_profiles[2].counters
.find(({counter_name}) => counter_name === "TotalBytesSent").value;
query.bytes_sent = getReadableSize(query.bytes_sent, 2);
query.state = profile.child_profiles[0].info_strings
.find(({key}) => key === "Query State").value;
query.rows_fetched = profile.child_profiles[1].counters
.find(({counter_name}) => counter_name === "NumRowsFetched").value;
query.resource_pool = profile.child_profiles[0].info_strings
.find(({key}) => key === "Request Pool").value;
query.statement = profile.child_profiles[0].info_strings
.find(({key}) => key === "Sql Statement").value;
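// Truncate only statements longer than 250 characters, so that short
// statements do not get a spurious "..." appended
if (query.statement.length > 250) {
query.statement = query.statement.substring(0, 250) + "...";
}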
query.profile = pako.deflate(e.data, {level : 3});
} catch (err) {
query.error = err;
}
self.postMessage(query);
}


@@ -16,7 +16,6 @@
// under the License.
import {maxts, set_maxts, clearDOMChildren} from "./global_members.js";
import {name_width} from "./fragment_diagram.js";
export var exportedForTest;
// In the data array provided for c3's line chart,


@@ -16,7 +16,7 @@
// under the License.
import {profile, set_maxts, maxts, decimals, set_decimals, diagram_width,
set_diagram_width, diagram_controls_height, diagram_min_height,
set_diagram_width, diagram_min_height,
margin_header_footer, border_stroke_width, margin_chart_end, clearDOMChildren,
resizeHorizontalAll} from "./global_members.js";
import {host_utilization_chart, getUtilizationWrapperHeight}


@@ -0,0 +1,34 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
import {describe, test, expect} from '@jest/globals';
import {readFileSync} from 'node:fs';
// Jest does not support workers, so "profileParseWorker.js" cannot be tested directly
describe("Test Compression Library", () => {
// Test whether the compression library imported by the worker script
// properly utilizes the pako library's compression methods
test("Basic Test", async () => {
var exampleJSONProfileText = readFileSync("../../../testdata/impala-profiles/impala_profile"
+ "_log_tpcds_compute_stats.expected.pretty_extended.json", {encoding : "utf-8"});
// Await the dynamic import so that the assertion runs before the test completes
var pako = (await import("../../../pako.min.js")).default;
expect(pako.inflate(pako.deflate(exampleJSONProfileText, {level : 3}), {to : "string"}))
.toBe(exampleJSONProfileText);
});
});