mirror of
https://github.com/apache/impala.git
synced 2026-01-07 09:02:19 -05:00
Reusing the same advice under "Known Issues", scalability considerations, and in the Impala + Kerberos section. Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301 Reviewed-on: http://gerrit.cloudera.org:8080/7349 Reviewed-by: Mostafa Mokhtar <mmokhtar@cloudera.com> Tested-by: Impala Public Jenkins
1969 lines
65 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<!--
|
|
Licensed to the Apache Software Foundation (ASF) under one
|
|
or more contributor license agreements. See the NOTICE file
|
|
distributed with this work for additional information
|
|
regarding copyright ownership. The ASF licenses this file
|
|
to you under the Apache License, Version 2.0 (the
|
|
"License"); you may not use this file except in compliance
|
|
with the License. You may obtain a copy of the License at
|
|
|
|
http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
Unless required by applicable law or agreed to in writing,
|
|
software distributed under the License is distributed on an
|
|
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
|
KIND, either express or implied. See the License for the
|
|
specific language governing permissions and limitations
|
|
under the License.
|
|
-->
|
|
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
|
|
<concept rev="ver" id="known_issues">
|
|
|
|
<title><ph audience="standalone">Known Issues and Workarounds in Impala</ph><ph audience="integrated">Apache Impala (incubating) Known Issues</ph></title>
|
|
|
|
<prolog>
|
|
<metadata>
|
|
<data name="Category" value="Impala"/>
|
|
<data name="Category" value="Release Notes"/>
|
|
<data name="Category" value="Known Issues"/>
|
|
<data name="Category" value="Troubleshooting"/>
|
|
<data name="Category" value="Upgrading"/>
|
|
<data name="Category" value="Administrators"/>
|
|
<data name="Category" value="Developers"/>
|
|
<data name="Category" value="Data Analysts"/>
|
|
</metadata>
|
|
</prolog>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
The following sections describe known issues and workarounds in Impala, as of the current production release. This page summarizes the
most serious or frequently encountered issues in the current release, to help you make planning decisions about installing and
upgrading. Any known workarounds are listed with each issue. The bug links take you to the Impala issues site, where you can see the
diagnosis and whether a fix is in the pipeline.
|
|
</p>
|
|
|
|
<note>
|
|
The online issue tracking system for Impala contains comprehensive information and is updated in real time. To verify whether an issue
|
|
you are experiencing has already been reported, or which release an issue is fixed in, search on the
|
|
<xref href="https://issues.apache.org/jira/" scope="external" format="html">issues.apache.org JIRA tracker</xref>.
|
|
</note>
|
|
|
|
<p outputclass="toc inpage"/>
|
|
|
|
<p>
|
|
For issues fixed in various Impala releases, see <xref href="impala_fixed_issues.xml#fixed_issues"/>.
|
|
</p>
|
|
|
|
<!-- Use as a template for new issues.
|
|
<concept id="">
|
|
<title></title>
|
|
<conbody>
|
|
<p>
|
|
</p>
|
|
<p><b>Bug:</b> <xref keyref=""></xref></p>
|
|
<p><b>Severity:</b> High</p>
|
|
<p><b>Resolution:</b> </p>
|
|
<p><b>Workaround:</b> </p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
-->
|
|
|
|
</conbody>
|
|
|
|
<!-- New known issues for Impala 2.3.
|
|
|
|
Title: Server-to-server SSL and Kerberos do not work together
|
|
Description: If server<->server SSL is enabled (with ssl_client_ca_certificate), and Kerberos auth is used between servers, the cluster will fail to start.
|
|
Upstream & Internal JIRAs: https://issues.apache.org/jira/browse/IMPALA-2598
|
|
Severity: Medium. Server-to-server SSL is practically unusable but this is a new feature.
|
|
Workaround: No known workaround.
|
|
|
|
Title: Queries may hang on server-to-server exchange errors
|
|
Description: The DataStreamSender::Channel::CloseInternal() does not close the channel on an error. This will cause the node on the other side of the channel to wait indefinitely causing a hang.
|
|
Upstream & Internal JIRAs: https://issues.apache.org/jira/browse/IMPALA-2592
|
|
Severity: Low. This does not occur frequently.
|
|
Workaround: No known workaround.
|
|
|
|
Title: Catalogd may crash when loading metadata for tables with many partitions, many columns and with incremental stats
|
|
Description: Incremental stats use up about 400 bytes per partition X column. So for a table with 20K partitions and 100 columns this is about 800 MB. When serialized this goes past the 2 GB Java array size limit and leads to a catalog crash.
|
|
Upstream & Internal JIRAs: https://issues.apache.org/jira/browse/IMPALA-2648, IMPALA-2647, IMPALA-2649.
|
|
Severity: Low. This does not occur frequently.
|
|
Workaround: Reduce the number of partitions.
|
|
|
|
More from the JIRA report of blocker/critical issues:
|
|
|
|
IMPALA-2093
|
|
Wrong plan of NOT IN aggregate subquery when a constant is used in subquery predicate
|
|
IMPALA-1652
|
|
Incorrect results with basic predicate on CHAR typed column.
|
|
IMPALA-1459
|
|
Incorrect assignment of predicates through an outer join in an inline view.
|
|
IMPALA-2665
|
|
Incorrect assignment of On-clause predicate inside inline view with an outer join.
|
|
IMPALA-2603
|
|
Crash: impala::Coordinator::ValidateCollectionSlots
|
|
IMPALA-2375
|
|
Fix issues with the legacy join and agg nodes using enable_partitioned_hash_join=false and enable_partitioned_aggregation=false
|
|
IMPALA-1862
|
|
Invalid bool value not reported as a scanner error
|
|
IMPALA-1792
|
|
ImpalaODBC: Can not get the value in the SQLGetData(m-x th column) after the SQLBindCol(m th column)
|
|
IMPALA-1578
|
|
Impala incorrectly handles text data when the new line character \n\r is split between different HDFS block
|
|
IMPALA-2643
|
|
Duplicated column in inline view causes dropping null slots during scan
|
|
IMPALA-2005
|
|
A failed CTAS does not drop the table if the insert fails.
|
|
IMPALA-1821
|
|
Casting scenarios with invalid/inconsistent results
|
|
|
|
Another list from Alex, of correctness problems with predicates; might overlap with ones I already have:
|
|
|
|
https://issues.apache.org/jira/browse/IMPALA-2665 - Already have
|
|
https://issues.apache.org/jira/browse/IMPALA-2643 - Already have
|
|
https://issues.apache.org/jira/browse/IMPALA-1459 - Already have
|
|
https://issues.apache.org/jira/browse/IMPALA-2144 - Don't have
|
|
|
|
-->
|
|
|
|
<concept id="known_issues_crash">
|
|
|
|
<title>Impala Known Issues: Crashes and Hangs</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
These issues can cause Impala to quit or become unresponsive.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
<concept id="IMPALA-4828">
|
|
<title>Altering Kudu table schema outside of Impala may result in crash on read</title>
|
|
<conbody>
|
|
<p>
|
|
Creating a table in Impala, changing the column schema outside of Impala,
and then reading the table again in Impala may result in a crash. Neither Impala nor
the Kudu client validates the schema immediately before reading, so Impala may attempt to
dereference pointers that do not exist. This happens if a string column is dropped
and then a new, non-string column is added with the old string column's name.
|
|
</p>
|
|
<p><b>Bug:</b> <xref keyref="IMPALA-4828" scope="external" format="html">IMPALA-4828</xref></p>
|
|
<p><b>Severity:</b> High</p>
|
|
<p><b>Workaround:</b> Run the statement <codeph>REFRESH <varname>table_name</varname></codeph>
whenever the table structure, such as the number, names, or data types
of columns, is modified outside of Impala using the Kudu API.
|
|
</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="IMPALA-1972" rev="IMPALA-1972">
|
|
|
|
<title>Queries that take a long time to plan can cause webserver to block other queries</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Requesting the details of a query through the debug web page
while the query is still being planned blocks new queries that had not
started when the web page was requested. The web UI becomes
unresponsive until the planning phase is finished.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1972">IMPALA-1972</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Severity:</b> High
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
<concept id="IMPALA-4595">
|
|
<title>Linking IR UDF module to main module crashes Impala</title>
|
|
<conbody>
|
|
<p>
|
|
A UDF compiled as an LLVM module (<codeph>.ll</codeph>) could cause a crash
|
|
when executed.
|
|
</p>
|
|
<p><b>Bug:</b> <xref keyref="IMPALA-4595">IMPALA-4595</xref></p>
|
|
<p><b>Severity:</b> High</p>
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala28_full"/> and higher.</p>
|
|
<p><b>Workaround:</b> Compile the external UDFs to a <codeph>.so</codeph> library instead of a
|
|
<codeph>.ll</codeph> IR module.</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="IMPALA-3069" rev="IMPALA-3069">
|
|
|
|
<title>Setting BATCH_SIZE query option too large can cause a crash</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Using a value in the millions for the <codeph>BATCH_SIZE</codeph> query option, together with wide rows or large string values in
|
|
columns, could cause a memory allocation of more than 2 GB resulting in a crash.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-3069">IMPALA-3069</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Severity:</b> High
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala270"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-3441" rev="IMPALA-3441">
|
|
|
|
<title>Impala should not crash for invalid avro serialized data</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Malformed Avro data, such as out-of-bounds integers or values in the wrong format, could cause a crash when queried.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-3441">IMPALA-3441</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Severity:</b> High
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala270"/> and <keyword keyref="impala262"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-2592" rev="IMPALA-2592">
|
|
|
|
<title>Queries may hang on server-to-server exchange errors</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
The <codeph>DataStreamSender::Channel::CloseInternal()</codeph> function does not close the channel on an error. As a result, the node on
the other side of the channel waits indefinitely and the query hangs.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-2592">IMPALA-2592</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Resolution:</b> Fixed in <keyword keyref="impala250"/>.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-2365" rev="IMPALA-2365">
|
|
|
|
<title>Impalad is crashing if udf jar is not available in hdfs location for first time</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
If the JAR file corresponding to a Java UDF is removed from HDFS after the Impala <codeph>CREATE FUNCTION</codeph> statement is
|
|
issued, the <cmdname>impalad</cmdname> daemon crashes.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-2365">IMPALA-2365</xref>
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala250"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
</concept>
|
|
|
|
<concept id="known_issues_performance">
|
|
|
|
<title id="ki_performance">Impala Known Issues: Performance</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
These issues involve the performance of operations such as queries or DDL statements.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
<concept id="IMPALA-1480" rev="IMPALA-1480">
|
|
|
|
<!-- Not part of Alex's spreadsheet. Spreadsheet has IMPALA-1423 which mentions it's similar to this one but not a duplicate. -->
|
|
|
|
<title>Slow DDL statements for tables with large number of partitions</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
DDL statements for tables with a large number of partitions might be slow.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1480">IMPALA-1480</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Run the DDL statement in Hive if the slowness is an issue.
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala250"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
</concept>
|
|
|
|
<concept id="known_issues_usability">
|
|
|
|
<title id="ki_usability">Impala Known Issues: Usability</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
These issues affect the convenience of interacting directly with Impala, typically through the Impala shell or Hue.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
<concept id="IMPALA-4570">
|
|
<title>Impala shell tarball is not usable on systems with setuptools versions where '0.7' is a substring of the full version string</title>
|
|
<conbody>
|
|
<p>
|
|
For example, this issue could occur on a system using setuptools version 20.7.0.
|
|
</p>
|
|
<p><b>Bug:</b> <xref keyref="IMPALA-4570">IMPALA-4570</xref></p>
|
|
<p><b>Severity:</b> High</p>
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala28_full"/> and higher.</p>
|
|
<p><b>Workaround:</b> Change to a setuptools version that does not have <codeph>0.7</codeph> as
|
|
a substring.
|
|
</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="IMPALA-3133" rev="IMPALA-3133">
|
|
|
|
<title>Unexpected privileges in show output</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Due to a timing condition in updating cached policy data from Sentry, the <codeph>SHOW</codeph> statements for Sentry roles could
|
|
sometimes display out-of-date role settings. Because Impala rechecks authorization for each SQL statement, this discrepancy does
|
|
not represent a security issue for other statements.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-3133">IMPALA-3133</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Severity:</b> High
|
|
</p>
|
|
|
|
<p>
<b>Resolution:</b> Fixed in <keyword keyref="impala260"/> and <keyword keyref="impala251"/>. Fixes have been issued for some but not all
Impala releases. Check the JIRA for details of fix releases.
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-1776" rev="IMPALA-1776">
|
|
|
|
<title>Less than 100% progress on completed simple SELECT queries</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Simple <codeph>SELECT</codeph> queries show less than 100% progress even though they are already completed.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1776">IMPALA-1776</xref>
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="concept_lmx_dk5_lx">
|
|
|
|
<title>Unexpected column overflow behavior with INT datatypes</title>
|
|
|
|
<conbody>
|
|
|
|
<p conref="../shared/impala_common.xml#common/int_overflow_behavior" />
|
|
|
|
<p>
|
|
<b>Bug:</b>
|
|
<xref keyref="IMPALA-3123">IMPALA-3123</xref>
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
</concept>
|
|
|
|
<concept id="known_issues_drivers">
|
|
|
|
<title id="ki_drivers">Impala Known Issues: JDBC and ODBC Drivers</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
These issues affect applications that use the JDBC or ODBC APIs, such as business intelligence tools or custom-written applications
|
|
in languages such as Java or C++.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
<concept id="IMPALA-1792" rev="IMPALA-1792">
|
|
|
|
<!-- Not part of Alex's spreadsheet -->
|
|
|
|
<title>ImpalaODBC: Can not get the value in the SQLGetData(m-x th column) after the SQLBindCol(m th column)</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
If the ODBC <codeph>SQLGetData</codeph> function is called on a series of columns, the calls must follow the same order as the
columns. For example, if data is fetched from column 2 and then column 1, the <codeph>SQLGetData</codeph> call for column 1 returns
<codeph>NULL</codeph>.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1792">IMPALA-1792</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Fetch columns in the same order they are defined in the table.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
</concept>
|
|
|
|
<concept id="known_issues_security">
|
|
|
|
<title id="ki_security">Impala Known Issues: Security</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
These issues relate to security features, such as Kerberos authentication, Sentry authorization, encryption, auditing, and
|
|
redaction.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
<concept id="renewable_kerberos_tickets">
|
|
|
|
<!-- Not part of Alex's spreadsheet. Not associated with a JIRA number AFAIK. -->
|
|
|
|
<title>Kerberos tickets must be renewable</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
In a Kerberos environment, the <cmdname>impalad</cmdname> daemon might not start if Kerberos tickets are not renewable.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Configure your KDC to allow tickets to be renewed, and configure <filepath>krb5.conf</filepath> to request
|
|
renewable tickets.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<!-- To do: Fixed in 2.5.0, 2.3.2. Commenting out until I see how it can fix into "known issues now fixed" convention.
|
|
That set of fix releases looks incomplete so probably have to do some detective work with the JIRA.
|
|
https://issues.apache.org/jira/browse/IMPALA-2598
|
|
<concept id="IMPALA-2598" rev="IMPALA-2598">
|
|
|
|
<title>Server-to-server SSL and Kerberos do not work together</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
If SSL is enabled between internal Impala components (with <codeph>ssl_client_ca_certificate</codeph>), and Kerberos
|
|
authentication is used between servers, the cluster fails to start.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-2598">IMPALA-2598</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Do not use the new <codeph>ssl_client_ca_certificate</codeph> setting on Kerberos-enabled clusters until this
|
|
issue is resolved.
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala250"/> and <keyword keyref="impala232"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
-->
|
|
|
|
</concept>
|
|
|
|
<!--
|
|
<concept id="known_issues_supportability">
|
|
|
|
<title id="ki_supportability">Impala Known Issues: Supportability</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
These issues affect the ability to debug and troubleshoot Impala, such as incorrect output in query profiles or the query state
|
|
shown in monitoring applications.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
-->
|
|
|
|
<concept id="known_issues_resources">
|
|
|
|
<title id="ki_resources">Impala Known Issues: Resources</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
These issues involve memory or disk usage, including out-of-memory conditions, the spill-to-disk feature, and resource management
|
|
features.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
<concept id="IMPALA-5605">
|
|
<title>Configuration to prevent crashes caused by thread resource limits</title>
|
|
<conbody>
|
|
<p>
|
|
Impala could encounter a serious error due to resource usage under very high concurrency.
|
|
The error message is similar to:
|
|
</p>
|
|
<codeblock><![CDATA[
F0629 08:20:02.956413 29088 llvm-codegen.cc:111] LLVM hit fatal error: Unable to allocate section memory!
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::thread_resource_error> >'
]]>
</codeblock>
|
|
<p><b>Bug:</b> <xref keyref="IMPALA-5605">IMPALA-5605</xref></p>
|
|
<p><b>Severity:</b> High</p>
|
|
<p><b>Workaround:</b>
|
|
To prevent such errors, configure each host running an <cmdname>impalad</cmdname>
|
|
daemon with the following settings:
|
|
</p>
|
|
<codeblock>
echo 2000000 > /proc/sys/kernel/threads-max
echo 2000000 > /proc/sys/kernel/pid_max
echo 8000000 > /proc/sys/vm/max_map_count
</codeblock>
|
|
<p>
|
|
Add the following lines in <filepath>/etc/security/limits.conf</filepath>:
|
|
</p>
|
|
<codeblock>
impala soft nproc 262144
impala hard nproc 262144
</codeblock>
|
|
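<p>
The kernel values in this workaround can be compared with the live settings on a host before and after the change.
The following shell sketch shows one way to do that. The function names <codeph>meets_minimum</codeph> and
<codeph>check_kernel_limits</codeph> are hypothetical helpers introduced for illustration and are not part of Impala.
The sketch assumes a Linux host where the three <filepath>/proc/sys</filepath> files are readable.
</p>
<codeblock><![CDATA[
```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not shipped with Impala.

# meets_minimum NAME CURRENT REQUIRED
# Prints one status line and returns 0 when CURRENT is at least REQUIRED.
meets_minimum() {
    if [ "$2" -ge "$3" ]; then
        echo "OK $1=$2 (minimum $3)"
    else
        echo "LOW $1=$2 (minimum $3)"
        return 1
    fi
}

# check_kernel_limits
# Compares the live kernel values with the values from the workaround.
# Returns non-zero if any value is below the recommended minimum.
check_kernel_limits() {
    status=0
    meets_minimum threads-max "$(cat /proc/sys/kernel/threads-max)" 2000000 || status=1
    meets_minimum pid_max "$(cat /proc/sys/kernel/pid_max)" 2000000 || status=1
    meets_minimum max_map_count "$(cat /proc/sys/vm/max_map_count)" 8000000 || status=1
    return "$status"
}

# On an impalad host, run: check_kernel_limits
```
]]></codeblock>
<p>
Values written to <filepath>/proc/sys</filepath> with <cmdname>echo</cmdname> take effect immediately but do not survive a reboot.
To keep them, the corresponding <codeph>kernel.threads-max</codeph>, <codeph>kernel.pid_max</codeph>, and
<codeph>vm.max_map_count</codeph> entries are typically also added to <filepath>/etc/sysctl.conf</filepath>.
</p>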
</conbody>
|
|
</concept>
|
|
|
|
<concept id="flatbuffers_mem_usage">
|
|
<title>Memory usage when compact_catalog_topic flag enabled</title>
|
|
<conbody>
|
|
<p>
|
|
The efficiency improvement from <xref keyref="IMPALA-4029">IMPALA-4029</xref>
|
|
can cause an increase in size of the updates to Impala catalog metadata
|
|
that are broadcast to the <cmdname>impalad</cmdname> daemons
|
|
by the <cmdname>statestored</cmdname> daemon.
|
|
The increase in catalog update topic size results in higher CPU and network
|
|
utilization. By default, the increase in topic size is about 5-7%. If the
|
|
<codeph>compact_catalog_topic</codeph> flag is used, the
|
|
size increase is more substantial, with a topic size approximately twice as
|
|
large as in previous versions.
|
|
</p>
|
|
<p><b>Bug:</b> <xref keyref="IMPALA-5500">IMPALA-5500</xref></p>
|
|
<p><b>Severity:</b> Medium</p>
|
|
<p>
|
|
<b>Workaround:</b> Consider leaving the <codeph>compact_catalog_topic</codeph>
|
|
configuration setting at its default value of <codeph>false</codeph> until
|
|
this issue is resolved.
|
|
</p>
|
|
<p><b>Resolution:</b> A fix is in the pipeline. Check the status of
|
|
<xref keyref="IMPALA-5500">IMPALA-5500</xref> for the release where the fix is available.</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="IMPALA-2294">
|
|
<title>Kerberos initialization errors due to high memory usage</title>
|
|
<conbody>
|
|
<p conref="../shared/impala_common.xml#common/vm_overcommit_memory_intro"/>
|
|
<p><b>Bug:</b> <xref keyref="IMPALA-2294">IMPALA-2294</xref></p>
|
|
<p><b>Severity:</b> High</p>
|
|
<p><b>Workaround:</b></p>
|
|
<p conref="../shared/impala_common.xml#common/vm_overcommit_memory_start" conrefend="vm_overcommit_memory_end"/>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="drop_table_purge_s3a">
|
|
<title>DROP TABLE PURGE on S3A table may not delete externally written files</title>
|
|
<conbody>
|
|
<p>
|
|
A <codeph>DROP TABLE PURGE</codeph> statement against an S3 table could leave the data files
|
|
behind, if the table directory and the data files were created with a combination of
|
|
<cmdname>hadoop fs</cmdname> and <cmdname>aws s3</cmdname> commands.
|
|
</p>
|
|
<p><b>Bug:</b> <xref keyref="IMPALA-3558">IMPALA-3558</xref></p>
|
|
<p><b>Severity:</b> High</p>
|
|
<p><b>Resolution:</b> The underlying issue with the S3A connector depends on the resolution of <xref href="https://issues.apache.org/jira/browse/HADOOP-13230" format="html" scope="external">HADOOP-13230</xref>.</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="catalogd_heap">
|
|
|
|
<title>Impala catalogd heap issues when upgrading to <keyword keyref="impala25"/></title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
The default heap size for Impala <cmdname>catalogd</cmdname> has changed in <keyword keyref="impala25_full"/> and higher:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>
|
|
<p>
|
|
Previously, by default <cmdname>catalogd</cmdname> was using the JVM's default heap size, which is the smaller of 1/4th of the
|
|
physical memory or 32 GB.
|
|
</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>
|
|
Starting with <keyword keyref="impala250"/>, the default <cmdname>catalogd</cmdname> heap size is 4 GB.
|
|
</p>
|
|
</li>
|
|
</ul>
|
|
|
|
<p>
|
|
For example, on a host with 128 GB of physical memory, the <cmdname>catalogd</cmdname> heap decreases from 32 GB to 4 GB. This can result
in out-of-memory errors in <cmdname>catalogd</cmdname>, leading to query failures.
|
|
</p>
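<p>
To see which hosts are affected, the old default can be worked out from the rule stated above (the smaller of one quarter of
physical memory or 32 GB) and compared with the new 4 GB default. The following shell sketch does this arithmetic in whole
gigabytes. The function name <codeph>old_default_heap_gb</codeph> is a hypothetical helper, and the host sizes are example inputs only.
</p>
<codeblock><![CDATA[
```shell
#!/bin/sh
# Hypothetical helper for illustration only.
# old_default_heap_gb MEM_GB
# Prints the previous default catalogd heap in GB, using the rule from the text:
# the smaller of 1/4 of physical memory or 32 GB (integer arithmetic).
old_default_heap_gb() {
    quarter=$(( $1 / 4 ))
    if [ "$quarter" -gt 32 ]; then
        echo 32
    else
        echo "$quarter"
    fi
}

# New default heap size, as stated in the text above.
NEW_DEFAULT_HEAP_GB=4

# Example host sizes (assumed values, in GB of physical memory).
for mem in 16 64 128 256; do
    echo "${mem} GB host: old default $(old_default_heap_gb "$mem") GB, new default ${NEW_DEFAULT_HEAP_GB} GB"
done
```
]]></codeblock>
<p>
Under this simplified model, any host with more than 16 GB of physical memory ends up with a smaller default
<cmdname>catalogd</cmdname> heap than before.
</p>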
|
|
|
|
<p>
|
|
<b>Severity:</b> High
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Increase the <cmdname>catalogd</cmdname> memory limit as follows.
|
|
<!-- See <xref href="impala_scalability.xml#scalability_catalog"/> for the procedure. -->
|
|
<!-- Including full details here via conref, for benefit of PDF readers or anyone else
|
|
who might have trouble seeing or following the link. -->
|
|
</p>
|
|
|
|
<p conref="../shared/impala_common.xml#common/increase_catalogd_heap_size"/>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-3509" rev="IMPALA-3509">
|
|
|
|
<title>Breakpad minidumps can be very large when the thread count is high</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
The size of the breakpad minidump files grows linearly with the number of threads. By default, each thread adds 8 KB to the
|
|
minidump size. Minidump files could consume significant disk space when the daemons have a high number of threads.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-3509">IMPALA-3509</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Severity:</b> High
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Add <codeph>--minidump_size_limit_hint_kb=<varname>size</varname></codeph> to set a soft upper limit on the
|
|
size of each minidump file. If the minidump file would exceed that limit, Impala reduces the amount of information for each thread
|
|
from 8 KB to 2 KB. (Full thread information is captured for the first 20 threads, then 2 KB per thread after that.) The minidump
|
|
file can still grow larger than the <q>hinted</q> size. For example, if you have 10,000 threads, the minidump file can be more
|
|
than 20 MB.
|
|
</p>
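<p>
The figures above can be turned into a rough size estimate. The following shell sketch uses only the numbers stated in this
section: 8 KB per thread by default, and 2 KB per thread after the first 20 threads once the size hint is exceeded. The function
names are hypothetical, and the estimate ignores any minidump content other than per-thread data.
</p>
<codeblock><![CDATA[
```shell
#!/bin/sh
# Hypothetical helpers for illustration only; figures come from the text above.
FULL_KB=8          # per-thread size at full detail
REDUCED_KB=2       # per-thread size once the size hint is exceeded
FULL_THREADS=20    # threads that keep full detail in the reduced case

# minidump_full_kb THREADS
# Estimated per-thread data in KB when every thread is captured at full detail.
minidump_full_kb() {
    echo $(( $1 * FULL_KB ))
}

# minidump_reduced_kb THREADS
# Estimated per-thread data in KB when only the first FULL_THREADS threads
# keep full detail and the remaining threads are reduced.
minidump_reduced_kb() {
    threads=$1
    if [ "$threads" -le "$FULL_THREADS" ]; then
        echo $(( threads * FULL_KB ))
    else
        echo $(( FULL_THREADS * FULL_KB + (threads - FULL_THREADS) * REDUCED_KB ))
    fi
}

echo "10000 threads, full detail: $(minidump_full_kb 10000) KB"
echo "10000 threads, reduced detail: $(minidump_reduced_kb 10000) KB"
```
]]></codeblock>
<p>
For 10,000 threads this gives 80000 KB at full detail and 20120 KB with reduced detail, which is consistent with the
<q>more than 20 MB</q> figure above.
</p>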
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-3662" rev="IMPALA-3662">
|
|
|
|
<title>Parquet scanner memory increase after IMPALA-2736</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
The initial release of <keyword keyref="impala26_full"/> sometimes has a higher peak memory usage than in previous releases while reading
|
|
Parquet files.
|
|
</p>
|
|
|
|
<p>
|
|
<keyword keyref="impala26_full"/> addresses the issue IMPALA-2736, which improves the efficiency of Parquet scans by up to 2x. The faster scans
|
|
may result in a higher peak memory consumption compared to earlier versions of Impala due to the new column-wise row
|
|
materialization strategy. You are likely to experience higher memory consumption in any of the following scenarios:
|
|
<ul>
|
|
<li>
|
|
<p>
|
|
Very wide rows due to projecting many columns in a scan.
|
|
</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>
|
|
Very large rows due to big column values, for example, long strings or nested collections with many items.
|
|
</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>
|
|
Producer/consumer speed imbalances, leading to more rows being buffered between a scan (producer) and downstream (consumer)
|
|
plan nodes.
|
|
</p>
|
|
</li>
|
|
</ul>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-3662">IMPALA-3662</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Severity:</b> High
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> The following query options might help to reduce memory consumption in the Parquet scanner:
|
|
<ul>
|
|
<li>
|
|
Reduce the number of scanner threads, for example: <codeph>set num_scanner_threads=30</codeph>
|
|
</li>
|
|
|
|
<li>
|
|
Reduce the batch size, for example: <codeph>set batch_size=512</codeph>
|
|
</li>
|
|
|
|
<li>
|
|
Increase the memory limit, for example: <codeph>set mem_limit=64g</codeph>
|
|
</li>
|
|
</ul>
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-691" rev="IMPALA-691">
|
|
|
|
<title>Process mem limit does not account for the JVM's memory usage</title>
|
|
|
|
<!-- Supposed to be resolved for Impala 2.3.0. -->
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Some memory allocated by the JVM used internally by Impala is not counted against the memory limit for the
|
|
<cmdname>impalad</cmdname> daemon.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-691">IMPALA-691</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> To monitor overall memory usage, use the <cmdname>top</cmdname> command, or add the memory figures in the
|
|
Impala web UI <uicontrol>/memz</uicontrol> tab to JVM memory usage shown on the <uicontrol>/metrics</uicontrol> tab.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-2375" rev="IMPALA-2375">
|
|
|
|
<!-- Not part of Alex's spreadsheet -->
|
|
|
|
<title>Fix issues with the legacy join and agg nodes using --enable_partitioned_hash_join=false and --enable_partitioned_aggregation=false</title>
|
|
|
|
<conbody>
|
|
|
|
<p></p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-2375">IMPALA-2375</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Transition away from the <q>old-style</q> join and aggregation mechanism if practical.
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala250"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
</concept>
|
|
|
|
<concept id="known_issues_correctness">
|
|
|
|
<title id="ki_correctness">Impala Known Issues: Correctness</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
These issues can cause incorrect or unexpected results from queries. They typically only arise in very specific circumstances.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
<concept id="IMPALA-4513">
|
|
<title>ABS(n) where n is the lowest bound for the int types returns negative values</title>
|
|
<conbody>
|
|
<p>
|
|
If the <codeph>abs()</codeph> function evaluates a number that is right at the lower bound for
|
|
an integer data type, the positive result cannot be represented in the same type, and the
|
|
result is returned as a negative number. For example, <codeph>abs(-128)</codeph> returns -128
|
|
because the argument is interpreted as a <codeph>TINYINT</codeph> and the return value is also
|
|
a <codeph>TINYINT</codeph>.
|
|
</p>
|
|
<p><b>Bug:</b> <xref keyref="IMPALA-4513">IMPALA-4513</xref></p>
|
|
<p><b>Severity:</b> High</p>
|
|
<p><b>Workaround:</b> Cast the integer value to a larger type. For example, rewrite
|
|
<codeph>abs(<varname>tinyint_col</varname>)</codeph> as <codeph>abs(cast(<varname>tinyint_col</varname> as smallint))</codeph>.</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="IMPALA-4266">
|
|
<title>Java udf expression returning string in group by can give incorrect results.</title>
|
|
<conbody>
|
|
<p>
|
|
If the <codeph>GROUP BY</codeph> clause included a call to a Java UDF that returned a string value,
|
|
the UDF could return an incorrect result.
|
|
</p>
|
|
<p><b>Bug:</b> <xref keyref="IMPALA-4266">IMPALA-4266</xref></p>
|
|
<p><b>Severity:</b> High</p>
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala28_full"/> and higher.</p>
|
|
<p><b>Workaround:</b> Rewrite the expression to concatenate the result of the Java UDF with an
empty string. For example, rewrite <codeph>my_hive_udf()</codeph> as
<codeph>concat(my_hive_udf(), '')</codeph>.
|
|
</p>
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="IMPALA-3084" rev="IMPALA-3084">
|
|
|
|
<title>Incorrect assignment of NULL checking predicate through an outer join of a nested collection.</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
A query could return wrong results (too many or too few <codeph>NULL</codeph> values) if it referenced an outer-joined nested
collection and also contained a null-checking predicate (<codeph>IS NULL</codeph>, <codeph>IS NOT NULL</codeph>, or the
<codeph>&lt;=&gt;</codeph> operator) in the <codeph>WHERE</codeph> clause.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-3084">IMPALA-3084</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Severity:</b> High
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala270"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-3094" rev="IMPALA-3094">
|
|
|
|
<title>Incorrect result due to constant evaluation in query with outer join</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
An <codeph>OUTER JOIN</codeph> query could omit some expected result rows due to a constant such as <codeph>FALSE</codeph> in
|
|
another join clause. For example:
|
|
</p>
|
|
|
|
<codeblock><![CDATA[
explain SELECT 1 FROM alltypestiny a1
INNER JOIN alltypesagg a2 ON a1.smallint_col = a2.year AND false
RIGHT JOIN alltypes a3 ON a1.year = a1.bigint_col;
+---------------------------------------------------------+
| Explain String                                          |
+---------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=1.00KB VCores=1 |
|                                                         |
| 00:EMPTYSET                                             |
+---------------------------------------------------------+
]]>
</codeblock>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-3094">IMPALA-3094</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Severity:</b> High
|
|
</p>
|
|
|
|
<p>
|
|
<b>Resolution:</b>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b>
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-3126" rev="IMPALA-3126">
|
|
|
|
<title>Incorrect assignment of an inner join On-clause predicate through an outer join.</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Impala may return incorrect results for queries that have the following properties:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>
|
|
<p>
|
|
There is an INNER JOIN following a series of OUTER JOINs.
|
|
</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>
|
|
The INNER JOIN has an On-clause with a predicate that references at least two tables that are on the nullable side of the
|
|
preceding OUTER JOINs.
|
|
</p>
|
|
</li>
|
|
</ul>
|
|
|
|
<p>
|
|
The following query demonstrates the issue:
|
|
</p>
|
|
|
|
<codeblock>
|
|
select 1 from functional.alltypes a left outer join
|
|
functional.alltypes b on a.id = b.id left outer join
|
|
functional.alltypes c on b.id = c.id right outer join
|
|
functional.alltypes d on c.id = d.id inner join functional.alltypes e
|
|
on b.int_col = c.int_col;
|
|
</codeblock>
|
|
|
|
<p>
|
|
The following listing shows the incorrect <codeph>EXPLAIN</codeph> plan:
|
|
</p>
|
|
|
|
<codeblock><![CDATA[
|
|
+-----------------------------------------------------------+
|
|
| Explain String |
|
|
+-----------------------------------------------------------+
|
|
| Estimated Per-Host Requirements: Memory=480.04MB VCores=4 |
|
|
| |
|
|
| 14:EXCHANGE [UNPARTITIONED] |
|
|
| | |
|
|
| 08:NESTED LOOP JOIN [CROSS JOIN, BROADCAST] |
|
|
| | |
|
|
| |--13:EXCHANGE [BROADCAST] |
|
|
| | | |
|
|
| | 04:SCAN HDFS [functional.alltypes e] |
|
|
| | partitions=24/24 files=24 size=478.45KB |
|
|
| | |
|
|
| 07:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED] |
|
|
| | hash predicates: c.id = d.id |
|
|
| | runtime filters: RF000 <- d.id |
|
|
| | |
|
|
| |--12:EXCHANGE [HASH(d.id)] |
|
|
| | | |
|
|
| | 03:SCAN HDFS [functional.alltypes d] |
|
|
| | partitions=24/24 files=24 size=478.45KB |
|
|
| | |
|
|
| 06:HASH JOIN [LEFT OUTER JOIN, PARTITIONED] |
|
|
| | hash predicates: b.id = c.id |
|
|
| | other predicates: b.int_col = c.int_col <--- incorrect placement; should be at node 07 or 08
|
|
| | runtime filters: RF001 <- c.int_col |
|
|
| | |
|
|
| |--11:EXCHANGE [HASH(c.id)] |
|
|
| | | |
|
|
| | 02:SCAN HDFS [functional.alltypes c] |
|
|
| | partitions=24/24 files=24 size=478.45KB |
|
|
| | runtime filters: RF000 -> c.id |
|
|
| | |
|
|
| 05:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED] |
|
|
| | hash predicates: b.id = a.id |
|
|
| | runtime filters: RF002 <- a.id |
|
|
| | |
|
|
| |--10:EXCHANGE [HASH(a.id)] |
|
|
| | | |
|
|
| | 00:SCAN HDFS [functional.alltypes a] |
|
|
| | partitions=24/24 files=24 size=478.45KB |
|
|
| | |
|
|
| 09:EXCHANGE [HASH(b.id)] |
|
|
| | |
|
|
| 01:SCAN HDFS [functional.alltypes b] |
|
|
| partitions=24/24 files=24 size=478.45KB |
|
|
| runtime filters: RF001 -> b.int_col, RF002 -> b.id |
|
|
+-----------------------------------------------------------+
|
|
]]>
|
|
</codeblock>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-3126">IMPALA-3126</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Severity:</b> High
|
|
</p>
|
|
|
|
<p>
<b>Workaround:</b> For some queries, this problem can be worked around by placing the problematic <codeph>ON</codeph> clause predicate in the
<codeph>WHERE</codeph> clause instead, or by changing the preceding <codeph>OUTER JOIN</codeph>s to <codeph>INNER JOIN</codeph>s (if
the <codeph>ON</codeph> clause predicate would discard <codeph>NULL</codeph>s). For example, to fix the problematic query above:
</p>
|
|
|
|
<codeblock><![CDATA[
|
|
select 1 from functional.alltypes a
|
|
left outer join functional.alltypes b
|
|
on a.id = b.id
|
|
left outer join functional.alltypes c
|
|
on b.id = c.id
|
|
right outer join functional.alltypes d
|
|
on c.id = d.id
|
|
inner join functional.alltypes e
|
|
where b.int_col = c.int_col
|
|
|
|
+-----------------------------------------------------------+
|
|
| Explain String |
|
|
+-----------------------------------------------------------+
|
|
| Estimated Per-Host Requirements: Memory=480.04MB VCores=4 |
|
|
| |
|
|
| 14:EXCHANGE [UNPARTITIONED] |
|
|
| | |
|
|
| 08:NESTED LOOP JOIN [CROSS JOIN, BROADCAST] |
|
|
| | |
|
|
| |--13:EXCHANGE [BROADCAST] |
|
|
| | | |
|
|
| | 04:SCAN HDFS [functional.alltypes e] |
|
|
| | partitions=24/24 files=24 size=478.45KB |
|
|
| | |
|
|
| 07:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED] |
|
|
| | hash predicates: c.id = d.id |
|
|
| | other predicates: b.int_col = c.int_col <-- correct assignment
|
|
| | runtime filters: RF000 <- d.id |
|
|
| | |
|
|
| |--12:EXCHANGE [HASH(d.id)] |
|
|
| | | |
|
|
| | 03:SCAN HDFS [functional.alltypes d] |
|
|
| | partitions=24/24 files=24 size=478.45KB |
|
|
| | |
|
|
| 06:HASH JOIN [LEFT OUTER JOIN, PARTITIONED] |
|
|
| | hash predicates: b.id = c.id |
|
|
| | |
|
|
| |--11:EXCHANGE [HASH(c.id)] |
|
|
| | | |
|
|
| | 02:SCAN HDFS [functional.alltypes c] |
|
|
| | partitions=24/24 files=24 size=478.45KB |
|
|
| | runtime filters: RF000 -> c.id |
|
|
| | |
|
|
| 05:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED] |
|
|
| | hash predicates: b.id = a.id |
|
|
| | runtime filters: RF001 <- a.id |
|
|
| | |
|
|
| |--10:EXCHANGE [HASH(a.id)] |
|
|
| | | |
|
|
| | 00:SCAN HDFS [functional.alltypes a] |
|
|
| | partitions=24/24 files=24 size=478.45KB |
|
|
| | |
|
|
| 09:EXCHANGE [HASH(b.id)] |
|
|
| | |
|
|
| 01:SCAN HDFS [functional.alltypes b] |
|
|
| partitions=24/24 files=24 size=478.45KB |
|
|
| runtime filters: RF001 -> b.id |
|
|
+-----------------------------------------------------------+
|
|
]]>
|
|
</codeblock>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-3006" rev="IMPALA-3006">
|
|
|
|
<title>Impala may use incorrect bit order with BIT_PACKED encoding</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Parquet <codeph>BIT_PACKED</codeph> encoding as implemented by Impala is LSB first. The Parquet specification says it is MSB first.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-3006">IMPALA-3006</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Severity:</b> High, but rare in practice because BIT_PACKED is infrequently used, is not written by Impala, and is deprecated
|
|
in Parquet 2.0.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-3082" rev="IMPALA-3082">
|
|
|
|
<title>BST between 1972 and 1995</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
The calculation of start and end times for the BST (British Summer Time) time zone could be incorrect between 1972 and 1995.
|
|
Between 1972 and 1995, BST began and ended at 02:00 GMT on the third Sunday in March (or second Sunday when Easter fell on the
|
|
third) and fourth Sunday in October. For example, both function calls should return 13, but actually return 12, in a query such
|
|
as:
|
|
</p>
|
|
|
|
<codeblock>
|
|
select
|
|
extract(from_utc_timestamp(cast('1970-01-01 12:00:00' as timestamp), 'Europe/London'), "hour") summer70start,
|
|
extract(from_utc_timestamp(cast('1970-12-31 12:00:00' as timestamp), 'Europe/London'), "hour") summer70end;
|
|
</codeblock>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-3082">IMPALA-3082</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Severity:</b> High
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-1170" rev="IMPALA-1170">
|
|
|
|
<title>parse_url() returns incorrect result if @ character in URL</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
If a URL contains an <codeph>@</codeph> character, the <codeph>parse_url()</codeph> function could return an incorrect value for
|
|
the hostname field.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1170">IMPALA-1170</xref>
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala250"/> and <keyword keyref="impala234"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-2422" rev="IMPALA-2422">
|
|
|
|
<title>% escaping does not work correctly when it occurs at the end of a LIKE clause</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
If the final character in the RHS argument of a <codeph>LIKE</codeph> operator is an escaped <codeph>\%</codeph> character, it
|
|
does not match a <codeph>%</codeph> final character of the LHS argument.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-2422">IMPALA-2422</xref>
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-397" rev="IMPALA-397">
|
|
|
|
<title>ORDER BY rand() does not work.</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Because the value for <codeph>rand()</codeph> is computed early in a query, using an <codeph>ORDER BY</codeph> expression
|
|
involving a call to <codeph>rand()</codeph> does not actually randomize the results.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-397">IMPALA-397</xref>
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-2643" rev="IMPALA-2643">
|
|
|
|
<title>Duplicated column in inline view causes dropping null slots during scan</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
If the same column is queried twice within a view, <codeph>NULL</codeph> values for that column are omitted. For example, the
|
|
result of <codeph>COUNT(*)</codeph> on the view could be less than expected.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-2643">IMPALA-2643</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Avoid selecting the same column twice within an inline view.
|
|
</p>
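<p>
One possible way to apply this workaround is sketched below. The names <codeph>t1</codeph>, <codeph>c1</codeph>,
<codeph>a</codeph>, and <codeph>b</codeph> are hypothetical, and whether this rewrite suits a particular query is an
assumption to verify against your own data:
</p>

<codeblock>
-- Affected form: column c1 is selected twice inside the inline view.
--   SELECT v.a, v.b FROM (SELECT c1 AS a, c1 AS b FROM t1) v;
-- Possible rewrite: select c1 once inside the view, then
-- reference it twice in the enclosing query block.
SELECT v.a, v.a AS b FROM (SELECT c1 AS a FROM t1) v;
</codeblock>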
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala250"/>, <keyword keyref="impala232"/>, and <keyword keyref="impala2210"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-1459" rev="IMPALA-1459">
|
|
|
|
<!-- Not part of Alex's spreadsheet -->
|
|
|
|
<title>Incorrect assignment of predicates through an outer join in an inline view.</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
A query involving an <codeph>OUTER JOIN</codeph> clause where one of the table references is an inline view might apply predicates
|
|
from the <codeph>ON</codeph> clause incorrectly.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1459">IMPALA-1459</xref>
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala250"/>, <keyword keyref="impala232"/>, and <keyword keyref="impala229"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-2603" rev="IMPALA-2603">
|
|
|
|
<title>Crash: impala::Coordinator::ValidateCollectionSlots</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
A query could encounter a serious error if it includes multiple nested levels of <codeph>INNER JOIN</codeph> clauses involving
|
|
subqueries.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-2603">IMPALA-2603</xref>
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-2665" rev="IMPALA-2665">
|
|
|
|
<title>Incorrect assignment of On-clause predicate inside inline view with an outer join.</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
A query might return incorrect results due to wrong predicate assignment in the following scenario:
|
|
</p>
|
|
|
|
<ol>
|
|
<li>
|
|
There is an inline view that contains an outer join
|
|
</li>
|
|
|
|
<li>
|
|
That inline view is joined with another table in the enclosing query block
|
|
</li>
|
|
|
|
<li>
|
|
That join has an On-clause containing a predicate that only references columns originating from the outer-joined tables inside
|
|
the inline view
|
|
</li>
|
|
</ol>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-2665">IMPALA-2665</xref>
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala250"/>, <keyword keyref="impala232"/>, and <keyword keyref="impala229"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-2144" rev="IMPALA-2144">
|
|
|
|
<title>Wrong assignment of having clause predicate across outer join</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
In an <codeph>OUTER JOIN</codeph> query with a <codeph>HAVING</codeph> clause, the comparison from the <codeph>HAVING</codeph>
|
|
clause might be applied at the wrong stage of query processing, leading to incorrect results.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-2144">IMPALA-2144</xref>
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala250"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-2093" rev="IMPALA-2093">
|
|
|
|
<title>Wrong plan of NOT IN aggregate subquery when a constant is used in subquery predicate</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
A <codeph>NOT IN</codeph> operator with a subquery that calls an aggregate function, such as <codeph>NOT IN (SELECT
|
|
SUM(...))</codeph>, could return incorrect results.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-2093">IMPALA-2093</xref>
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala250"/> and <keyword keyref="impala234"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
</concept>
|
|
|
|
<concept id="known_issues_metadata">
|
|
|
|
<title id="ki_metadata">Impala Known Issues: Metadata</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
These issues affect how Impala interacts with metadata. They cover areas such as the metastore database, the <codeph>COMPUTE
|
|
STATS</codeph> statement, and the Impala <cmdname>catalogd</cmdname> daemon.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
<concept id="IMPALA-2648" rev="IMPALA-2648">
|
|
|
|
<title>Catalogd may crash when loading metadata for tables with many partitions, many columns and with incremental stats</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Incremental stats use up about 400 bytes per partition for each column. For example, for a table with 20K partitions and 100
|
|
columns, the memory overhead from incremental statistics is about 800 MB. When serialized for transmission across the network,
|
|
this metadata exceeds the 2 GB Java array size limit and leads to a <codeph>catalogd</codeph> crash.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bugs:</b> <xref keyref="IMPALA-2647">IMPALA-2647</xref>,
|
|
<xref keyref="IMPALA-2648">IMPALA-2648</xref>,
|
|
<xref keyref="IMPALA-2649">IMPALA-2649</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> If feasible, compute full stats periodically and avoid computing incremental stats for that table. The
|
|
scalability of incremental stats computation is a continuing work item.
|
|
</p>
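<p>
The estimate and the workaround can be sketched as follows. The table name <codeph>sales_data</codeph> is hypothetical;
<codeph>COMPUTE STATS</codeph> without the <codeph>INCREMENTAL</codeph> keyword gathers full statistics for the table:
</p>

<codeblock>
-- Approximate overhead from incremental stats for the example above:
--   400 bytes * 20,000 partitions * 100 columns = 800,000,000 bytes (about 800 MB)

-- sales_data is a hypothetical partitioned table.
-- Compute full stats periodically instead of incremental stats:
COMPUTE STATS sales_data;
</codeblock>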
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-1420" rev="IMPALA-1420 2.0.0">
|
|
|
|
<!-- Not part of Alex's spreadsheet -->
|
|
|
|
<title>Can't update stats manually via alter table after upgrading to <keyword keyref="impala20"/></title>
|
|
|
|
<conbody>
|
|
|
|
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1420">IMPALA-1420</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> On <keyword keyref="impala20"/>, when adjusting table statistics manually by setting the <codeph>numRows</codeph> property, you must also
|
|
enable the Boolean property <codeph>STATS_GENERATED_VIA_STATS_TASK</codeph>. For example, use a statement like the following to
|
|
set both properties with a single <codeph>ALTER TABLE</codeph> statement:
|
|
</p>
|
|
|
|
<codeblock>ALTER TABLE <varname>table_name</varname> SET TBLPROPERTIES('numRows'='<varname>new_value</varname>', 'STATS_GENERATED_VIA_STATS_TASK' = 'true');</codeblock>
|
|
|
|
<p>
|
|
<b>Resolution:</b> The underlying cause is the issue
|
|
<xref href="https://issues.apache.org/jira/browse/HIVE-8648" scope="external" format="html">HIVE-8648</xref> that affects the
|
|
metastore in Hive 0.13. The workaround is only needed until the fix for this issue is incorporated into a release of <keyword keyref="distro"/>.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
</concept>
|
|
|
|
<concept id="known_issues_interop">
|
|
|
|
<title id="ki_interop">Impala Known Issues: Interoperability</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
These issues affect the ability to interchange data between Impala and other database systems. They cover areas such as data types
|
|
and file formats.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
<!-- Opened based on internal JIRA. Not part of Alex's spreadsheet AFAIK. -->
|
|
|
|
<concept id="describe_formatted_avro">
|
|
|
|
<title>DESCRIBE FORMATTED gives error on Avro table</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
This issue can occur either on old Avro tables (created prior to Hive 1.1) or when changing the Avro schema file by
|
|
adding or removing columns. Columns added to the schema file will not show up in the output of the <codeph>DESCRIBE
|
|
FORMATTED</codeph> command. Removing columns from the schema file will trigger a <codeph>NullPointerException</codeph>.
|
|
</p>
|
|
|
|
<p>
|
|
As a workaround, you can use the output of <codeph>SHOW CREATE TABLE</codeph> to drop and recreate the table. This will populate
|
|
the Hive metastore database with the correct column definitions.
|
|
</p>
|
|
|
|
<note type="warning">
|
|
Use this technique only for external tables; otherwise, Impala removes the data files when the table is dropped. For an internal table, set it to external first:
|
|
<codeblock>
|
|
ALTER TABLE table_name SET TBLPROPERTIES('EXTERNAL'='TRUE');
|
|
</codeblock>
|
|
(The part in parentheses is case sensitive.) Make sure to pick the right choice between internal and external when recreating the
|
|
table. See <xref href="impala_tables.xml#tables"/> for the differences between internal and external tables.
|
|
</note>
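<p>
The order of operations for this workaround might look like the following sketch. The table name
<codeph>avro_tbl</codeph> is hypothetical, and the final <codeph>CREATE TABLE</codeph> statement is whatever
<codeph>SHOW CREATE TABLE</codeph> printed for your table, adjusted if needed:
</p>

<codeblock>
-- avro_tbl is a hypothetical table name.
-- 1. Save the output of this statement:
SHOW CREATE TABLE avro_tbl;
-- 2. Only if the table is internal, make it external so that
--    dropping it leaves the data files in place:
ALTER TABLE avro_tbl SET TBLPROPERTIES('EXTERNAL'='TRUE');
-- 3. Drop the table definition:
DROP TABLE avro_tbl;
-- 4. Re-run the saved CREATE TABLE statement from step 1,
--    choosing internal or external as appropriate.
</codeblock>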
|
|
|
|
<p>
|
|
<b>Severity:</b> High
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMP-469">
|
|
|
|
<!-- Not part of Alex's spreadsheet. Perhaps it really is a permanent limitation and nobody is tracking it? -->
|
|
|
|
<title>Deviation from Hive behavior: Impala does not do implicit casts between string and numeric and boolean types.</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
<b>Anticipated Resolution:</b> None
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Use explicit casts.
|
|
</p>
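<p>
For example, the following sketch converts between string and numeric types explicitly. The names <codeph>t1</codeph>,
<codeph>string_col</codeph>, and <codeph>int_col</codeph> are hypothetical:
</p>

<codeblock>
-- t1, string_col, and int_col are hypothetical names.
-- STRING to numeric, before doing arithmetic:
SELECT cast(string_col AS INT) + 1 FROM t1;
-- Numeric to STRING, before calling a string function:
SELECT concat('id-', cast(int_col AS STRING)) FROM t1;
</codeblock>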
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMP-175">
|
|
|
|
<!-- Not part of Alex's spreadsheet. Perhaps it really is a permanent limitation and nobody is tracking it? -->
|
|
|
|
<title>Deviation from Hive behavior: Out-of-range float/double values are returned as the maximum allowed value of the type (Hive returns NULL)</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Impala behavior differs from Hive with respect to out-of-range float/double values. Impala returns an out-of-range value as the maximum
allowed value of the type, whereas Hive returns <codeph>NULL</codeph>.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> None
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="flume_writeformat_text">
|
|
|
|
<!-- Not part of Alex's spreadsheet. From a non-public JIRA. -->
|
|
|
|
<title>Configuration needed for Flume to be compatible with Impala</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
For compatibility with Impala, the value for the Flume HDFS Sink <codeph>hdfs.writeFormat</codeph> must be set to
|
|
<codeph>Text</codeph>, rather than its default value of <codeph>Writable</codeph>. The <codeph>hdfs.writeFormat</codeph> setting
|
|
must be changed to <codeph>Text</codeph> before creating data files with Flume; otherwise, those files cannot be read by either
|
|
Impala or Hive.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Resolution:</b> A request has been made to add this information to the upstream Flume documentation.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-635" rev="IMPALA-635">
|
|
|
|
<!-- Not part of Alex's spreadsheet -->
|
|
|
|
<title>Avro Scanner fails to parse some schemas</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Querying certain Avro tables could cause a crash or return no rows, even though Impala could <codeph>DESCRIBE</codeph> the table.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-635">IMPALA-635</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Swap the order of the fields in the schema specification. For example, <codeph>["null", "string"]</codeph>
|
|
instead of <codeph>["string", "null"]</codeph>.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Resolution:</b> Not allowing this syntax agrees with the Avro specification, so it may still cause an error even when the
|
|
crashing issue is resolved.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-1024" rev="IMPALA-1024">
|
|
|
|
<!-- Not part of Alex's spreadsheet -->
|
|
|
|
<title>Impala BE cannot parse Avro schema that contains a trailing semicolon</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
If an Avro table has a schema definition with a trailing semicolon, Impala encounters an error when the table is queried.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1024">IMPALA-1024</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Remove the trailing semicolon from the Avro schema.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-2154" rev="IMPALA-2154">
|
|
|
|
<!-- Not part of Alex's spreadsheet -->
|
|
|
|
<title>Fix decompressor to allow parsing gzips with multiple streams</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Currently, Impala can only read gzipped files containing a single stream. If a gzipped file contains multiple concatenated
|
|
streams, the Impala query only processes the data from the first stream.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-2154">IMPALA-2154</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Use a different gzip tool to compress the file into a single-stream file.
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala250"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-1578" rev="IMPALA-1578">
|
|
|
|
<!-- Not part of Alex's spreadsheet -->
|
|
|
|
<title>Impala incorrectly handles text data when the new line character \n\r is split between different HDFS blocks</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
If a carriage return / newline pair of characters in a text table is split between HDFS data blocks, Impala incorrectly processes
|
|
the row following the <codeph>\n\r</codeph> pair twice.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1578">IMPALA-1578</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Use the Parquet format for large volumes of data where practical.
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala260"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-1862" rev="IMPALA-1862">
|
|
|
|
<!-- Not part of Alex's spreadsheet -->
|
|
|
|
<title>Invalid bool value not reported as a scanner error</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
In some cases, an invalid <codeph>BOOLEAN</codeph> value read from a table does not produce a warning message about the bad value.
|
|
The result is still <codeph>NULL</codeph> as expected. Therefore, this is not a query correctness issue, but it could lead to
|
|
overlooking the presence of invalid data.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1862">IMPALA-1862</xref>
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-1652" rev="IMPALA-1652">
|
|
|
|
<!-- To do: Isn't this more a correctness issue? -->
|
|
|
|
<title>Incorrect results with basic predicate on CHAR typed column.</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
When comparing a <codeph>CHAR</codeph> column value to a string literal, the literal value is not blank-padded and so the
|
|
comparison might fail when it should match.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1652">IMPALA-1652</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Use the <codeph>RPAD()</codeph> function to blank-pad literals compared with <codeph>CHAR</codeph> columns to
|
|
the expected length.
|
|
</p>
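<p>
A minimal sketch of the padded comparison, assuming a hypothetical table <codeph>t1</codeph> whose column
<codeph>c</codeph> is declared as <codeph>CHAR(10)</codeph>:
</p>

<codeblock>
-- t1 and c are hypothetical; c is assumed to be CHAR(10).
-- Pad the literal with spaces to the declared length of the column:
SELECT * FROM t1 WHERE c = rpad('abc', 10, ' ');
</codeblock>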
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
</concept>
|
|
|
|
<concept id="known_issues_limitations">
|
|
|
|
<title>Impala Known Issues: Limitations</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
These issues are current limitations of Impala that require evaluation as you plan how to integrate Impala into your data management
|
|
workflow.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
<concept id="IMPALA-4551">
|
|
<title>Set limits on size of expression trees</title>
|
|
<conbody>
|
|
<p>
|
|
Very deeply nested expressions within queries can exceed internal Impala limits,
|
|
leading to excessive memory usage.
|
|
</p>
|
|
<p><b>Bug:</b> <xref keyref="IMPALA-4551">IMPALA-4551</xref></p>
|
|
<p><b>Severity:</b> High</p>
|
|
<p><b>Resolution:</b> </p>
|
|
<p><b>Workaround:</b> Avoid queries with extremely large expression trees. Setting the query option
|
|
<codeph>disable_codegen=true</codeph> may reduce the impact, at a cost of longer query runtime.</p>
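<p>
In <cmdname>impala-shell</cmdname>, the query option can be set for the current session before running the affected
query, as in this sketch. The comment line stands in for your own query:
</p>

<codeblock>
-- Turn off code generation for the session:
SET disable_codegen=true;
-- ... run the query with the very large expression tree here ...
-- Restore the default afterward so that other queries keep the benefit of codegen:
SET disable_codegen=false;
</codeblock>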
|
|
</conbody>
|
|
</concept>
|
|
|
|
<concept id="IMPALA-77" rev="IMPALA-77">
|
|
|
|
<!-- Not part of Alex's spreadsheet. Perhaps it really is a permanent limitation and nobody is tracking it? -->
|
|
|
|
<title>Impala does not support running on clusters with federated namespaces</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Impala does not support running on clusters with federated namespaces. The <codeph>impalad</codeph> process will not start on a
|
|
node running such a filesystem based on the <codeph>org.apache.hadoop.fs.viewfs.ViewFs</codeph> class.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-77">IMPALA-77</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Anticipated Resolution:</b> Limitation
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Use standard HDFS on all Impala nodes.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
</concept>
|
|
|
|
<concept id="known_issues_misc">
|
|
|
|
<title>Impala Known Issues: Miscellaneous / Older Issues</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
These issues do not fall into one of the above categories or have not been categorized yet.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
<concept id="IMPALA-2005" rev="IMPALA-2005">
|
|
|
|
<!-- Not part of Alex's spreadsheet -->
|
|
|
|
<title>A failed CTAS does not drop the table if the insert fails.</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
If a <codeph>CREATE TABLE AS SELECT</codeph> operation successfully creates the target table but an error occurs while querying
|
|
the source table or copying the data, the new table is left behind rather than being dropped.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-2005">IMPALA-2005</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Drop the new table manually after a failed <codeph>CREATE TABLE AS SELECT</codeph>.
|
|
</p>
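<p>
For example, with the hypothetical table names <codeph>new_table</codeph> and <codeph>source_table</codeph>:
</p>

<codeblock>
-- new_table and source_table are hypothetical names.
CREATE TABLE new_table AS SELECT * FROM source_table;
-- If the statement above fails after the table is created,
-- remove the leftover table before retrying:
DROP TABLE IF EXISTS new_table;
</codeblock>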
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-1821" rev="IMPALA-1821">
|
|
|
|
<!-- Not part of Alex's spreadsheet -->
|
|
|
|
<title>Casting scenarios with invalid/inconsistent results</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
Using a <codeph>CAST()</codeph> function to convert large literal values to smaller types, or to convert special values such as
|
|
<codeph>NaN</codeph> or <codeph>Inf</codeph>, produces values not consistent with other database systems. This could lead to
|
|
unexpected results from queries.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1821">IMPALA-1821</xref>
|
|
</p>
|
|
|
|
<!-- <p><b>Workaround:</b> Doublecheck that <codeph>CAST()</codeph> operations work as expect. The issue applies to expressions involving literals, not values read from table columns.</p> -->
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-1619" rev="IMPALA-1619">
|
|
|
|
<!-- Not part of Alex's spreadsheet -->
|
|
|
|
<title>Support individual memory allocations larger than 1 GB</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
The largest single block of memory that Impala can allocate during a query is 1 GiB. Therefore, a query could fail or Impala could
|
|
crash if a compressed text file resulted in more than 1 GiB of data in uncompressed form, or if a string function such as
|
|
<codeph>group_concat()</codeph> returned a value greater than 1 GiB.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-1619">IMPALA-1619</xref>
|
|
</p>
|
|
|
|
<p><b>Resolution:</b> Fixed in <keyword keyref="impala270"/> and <keyword keyref="impala263"/>.</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-941" rev="IMPALA-941">
|
|
|
|
<!-- Not part of Alex's spreadsheet. Maybe this is interop? -->
|
|
|
|
<title>Impala Parser issue when using fully qualified table names that start with a number.</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
A fully qualified table name starting with a number could cause a parsing error. In a name such as <codeph>db.571_market</codeph>,
|
|
the decimal point followed by digits is interpreted as a floating-point number.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-941">IMPALA-941</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Surround each part of the fully qualified name with backticks (<codeph>``</codeph>).
|
|
</p>
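<p>
For example, using the table name from the description above:
</p>

<codeblock>
-- Each part of the qualified name is quoted separately.
SELECT count(*) FROM `db`.`571_market`;
</codeblock>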
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMPALA-532" rev="IMPALA-532">
|
|
|
|
<!-- Not part of Alex's spreadsheet. Perhaps it really is a permanent limitation and nobody is tracking it? -->
|
|
|
|
<title>Impala should tolerate bad locale settings</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
If the <codeph>LC_*</codeph> environment variables specify an unsupported locale, Impala does not start.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Bug:</b> <xref keyref="IMPALA-532">IMPALA-532</xref>
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Add <codeph>LC_ALL="C"</codeph> to the environment settings for both the Impala daemon and the Statestore
|
|
daemon. See <xref href="impala_config_options.xml#config_options"/> for details about modifying these environment settings.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Resolution:</b> Fixing this issue would require an upgrade to Boost 1.47 in the Impala distribution.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
<concept id="IMP-1203">
|
|
|
|
<!-- Not part of Alex's spreadsheet. Perhaps it really is a permanent limitation and nobody is tracking it? -->
|
|
|
|
<title>Log Level 3 Not Recommended for Impala</title>
|
|
|
|
<conbody>
|
|
|
|
<p>
|
|
The extensive logging produced by log level 3 can cause serious performance overhead and capacity issues.
|
|
</p>
|
|
|
|
<p>
|
|
<b>Workaround:</b> Reduce the log level to its default value of 1, that is, <codeph>GLOG_v=1</codeph>. See
|
|
<xref href="impala_logging.xml#log_levels"/> for details about the effects of setting different logging levels.
|
|
</p>
|
|
|
|
</conbody>
|
|
|
|
</concept>
|
|
|
|
</concept>
|
|
|
|
</concept>
|
|
</concept>
|