The following sections describe the major issues fixed in each Impala release.
For known issues that are currently unresolved, see
For the full list of issues closed in this release, including bug fixes,
see the
For the full list of Impala fixed issues in
For the full list of Impala fixed issues in Impala 2.7.0, see
The following list contains the most critical fixed issues
(
A crash could occur, with stack trace pointing to
Bug:
Severity: High
A crash could occur because of contention between multiple calls to Java UDFs.
Bug:
Severity: High
A crash could occur because of contention between multiple concurrent statements writing to HBase.
Bug:
Severity: High
A crash or wrong results could occur if the spill-to-disk mechanism encountered a zero-length string at the very end of a data block.
Bug:
Severity: High
If a query plan contains an aggregation node producing string values anywhere within a subplan (that is, if the aggregate function appears within an inline view over a collection column in the SQL statement), the results of the aggregation may be incorrect.
Bug:
Severity: High
A
Bug:
Severity: High
Impala incorrectly allowed
Bug:
Severity: High
A crash could occur while querying tables with very large rows, for example wide tables with many columns or very large string values. This problem was identified in Impala 2.3, but had low reproducibility in subsequent releases. The fix ensures the memory allocation size is correct.
Bug:
Severity: High
A very large memory allocation within the
Bug:
Severity: High
If a partitioned table used a file format other than Avro, and the file format of an individual partition was changed to Avro, subsequent queries could encounter a crash.
Bug:
Severity: High
A timing problem during runtime filter processing could cause queries against Avro or SequenceFile tables to hang.
Bug:
Severity: High
The following list contains the most critical issues (
Bug:
The stress test was running a build with the TPC-H, TPC-DS, and TPC-H nested queries with scale factor 3.
Bug:
If a UDF JAR was not available in the HDFS location specified in the
Bug:
A join query could fail with an out-of-memory error despite the apparent presence of sufficient memory.
The cause was the internal ordering of operations that could cause a later phase of the query to
allocate memory required by an earlier phase of the query. The workaround was to either increase
or decrease the
Bug:
Referring to the same column twice in a view definition could cause the view to omit
rows where that column contained a
Bug:
Some combinations of
Bug:
Bug:
Parquet dictionary decoders can accumulate throughout query execution, leading to excessive memory usage. One decoder is created per column per split.
Bug:
Bug:
Previously, the MemPool always doubled the size of the last allocation. This could lead to bad behavior if the MemPool transferred ownership of all its data except the last chunk: the next allocated chunk would then be double the size of that large remaining chunk, which is undesirable.
Bug:
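The growth pattern described for the MemPool can be illustrated with a small sketch. The function names and the capped variant below are hypothetical, chosen only for illustration. They are not Impala's C++ implementation, and the cap is not claimed to be the actual fix.

```python
MB = 1024 * 1024


def next_chunk_size_doubling(last_chunk_size: int, requested: int) -> int:
    """Always double the last chunk, as in the behavior described above."""
    return max(requested, 2 * last_chunk_size)


def next_chunk_size_capped(last_chunk_size: int, requested: int, cap: int) -> int:
    """Hypothetical alternative: double, but stop growing at `cap`
    unless the request itself needs more than that."""
    return max(requested, min(2 * last_chunk_size, cap))


# A pool that kept only one 512 MB chunk and then receives a 16-byte request.
doubled = next_chunk_size_doubling(512 * MB, 16)
capped = next_chunk_size_capped(512 * MB, 16, cap=MB)
print(doubled // MB, capped // MB)
```

With pure doubling, a 16-byte request after a single retained 512 MB chunk leads to a 1024 MB chunk. Any bound on the growth step avoids that jump.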
The
Bug:
A query with a
Bug:
Bug:
An aggregation query could fail with an out-of-memory error, despite sufficient memory being reported as available.
Bug:
Some queries did not close an internal communication channel on an error.
This caused the node on the other side of the channel to wait indefinitely, causing the query to hang.
For example, this issue could happen on a Kerberos-enabled system if the credential cache was outdated.
Although the affected query hangs, the
Bug:
Querying for the min or max value of a timestamp cast from a bigint via
Workaround: Disable native code generation with:
Bug:
Impala returns wrong result for function
Workaround:
Cast the value to string and use
The set of fixes for Impala in
This section lists the most serious or frequently encountered customer
issues fixed in
A query involving an analytic function could encounter a serious error. This issue was encountered infrequently, depending upon specific combinations of queries and data.
Bug:
An outer join query could fail unexpectedly with an out-of-memory error
when the spill to disk
mechanism was turned off.
Bug:
A join query could encounter a serious error due to an internal failure to allocate memory, which
resulted in dereferencing a
Bug:
Referring to the same column twice in a view definition could cause the view to omit
rows where that column contained a
Bug:
A
Bug:
The
Bug:
Bug:
Impala could fail to access Parquet data files with page headers larger than 8 MB, which could
occur, for example, if the minimum or maximum values for a column were long strings. The
fix adds a configuration setting
Bug:
Queries on Parquet tables could consume excessive memory (potentially multiple gigabytes) due to producing
large intermediate data values while evaluating groups of rows. The workaround was to reduce the size of
the
Bug:
A query that included a
Bug:
A query that included
Bug:
Queries involving HBase tables used substantially more memory than in earlier Impala versions. The problem occurred starting in Impala 2.2.8, as a result of the changes for IMPALA-2284. The fix for this issue involves removing a separate memory work area for HBase queries and reusing other memory that was already allocated.
Bug:
Some combinations of
Bug:
A debug build of Impala could encounter a serious error after encountering some kinds of I/O errors for Parquet files. This issue only occurred in debug builds, not release builds.
Bug:
A join query could fail with an out-of-memory error despite the apparent presence of sufficient memory.
The cause was the internal ordering of operations that could cause a later phase of the query to
allocate memory required by an earlier phase of the query. The workaround was to either increase
or decrease the
Bug:
A query could fail with an internal error while calculating the memory limit. This was an infrequent condition uncovered during stress testing.
Bug:
A query could fail with an internal error while calculating the memory limit. This was an infrequent condition uncovered during stress testing.
Bug:
Bug:
These fixes lift the restriction on using SSL encryption and Kerberos authentication together for internal communication between Impala components.
Bug:
This section lists the most serious or frequently encountered customer
issues fixed in
A number of issues were resolved that could result in serious errors when encountered. The most critical or commonly encountered are listed here.
Bugs:
A number of issues were resolved that could result in wrong results when encountered. The most critical or commonly encountered are listed here.
Bugs:
This section lists the most frequently encountered customer issues fixed in
If an inline view in a
Bug:
Queries involving HBase tables used substantially more memory than in earlier Impala versions. The problem occurred starting in Impala 2.2.8, as a result of the changes for IMPALA-2284. The fix for this issue involves removing a separate memory work area for HBase queries and reusing other memory that was already allocated.
Bug:
Some combinations of
Bug:
The join predicate for an
Bug:
The
Bug:
Adding or subtracting a large
Bug:
An
Bug:
Impala could fail to access Parquet data files with page headers larger than 8 MB, which
could occur, for example, if the minimum or maximum values for a column were long strings.
The fix adds a configuration setting
Bug:
A query that activated the spill-to-disk mechanism could fail if it contained a sort expression involving certain combinations of fixed-length or variable-length types.
Bug:
Some queries that activated the spill-to-disk mechanism could produce a serious error if there was insufficient memory to set up internal work areas. Now those queries produce normal out-of-memory errors instead.
Bug:
A serious error could occur under rare circumstances, due to a race condition while freeing memory during heavily concurrent workloads.
Bug:
A call to
Bug:
An
Bug:
This section lists the most frequently encountered customer issues fixed in
Impala could not read Avro tables created in Hive with the
Bug:
If a Parquet file in HDFS was overwritten by a smaller file, Impala could encounter a serious error.
Issuing a
Bug:
Impala could encounter a serious error when reading compressed text files larger than 1 GB. The fix causes Impala to issue an error message instead in this case.
Bug:
A query using the
Bug:
An edge case in the algorithm used to distribute data among nodes could result in uneven distribution of work for some queries, with all data sent to the same node.
Bug:
A communication error could occur between Impala and the Hive metastore database, causing Impala operations that update table metadata to fail.
Bug:
Certain queries could encounter a serious error if the spill-to-disk mechanism was activated.
Bug:
Impala could generate a suboptimal query plan for some queries involving small tables.
Bug:
This section lists the most frequently encountered customer issues fixed in
Impala warns if it detects a discrepancy in table statistics: a table considered to have zero rows even though there are data files present. In this case, Impala also skips query optimizations that are normally applied to very small tables.
Bug:
A query could encounter a serious error if it included a particular combination of aggregate functions and inline views.
Bug:
A query could encounter a serious error if it included an inline view whose subquery had no
Bug:
A
Bug:
A query could return incorrect results if it contained a
Bug:
A query containing an
Bug:
A
Bug:
A query could encounter a serious error if it included column aliases with the same names as table columns, and used
ordinal numbers in an
Bug:
A query could return incorrect results if it included an outer join clause, inline views, and calls to functions such as
Bug:
A query could return incorrect results if the table contained multiple
Bug:
An
Bug:
This section lists the most frequently encountered customer issues fixed in
When the Impala
Bug:
A query could encounter a serious error if it contained a spill to disk
mechanism was activated.
Bug:
Declaring a partition key column as a
Bug:
A query that referred to a view whose query referred to another view containing a join could return incorrect results.
Bug:
The
Bug:
Resolution: Rather than change the behavior of the
Query performance was improved substantially for Parquet files containing
Bug:
A join query could encounter a serious error if the query
approached the memory limit on a host so that the spill to disk
mechanism was activated,
and data volume in the join was large enough that an internal memory buffer exceeded 1 GB in size on a particular host.
(Exceeding this limit would only happen for huge join queries, because Impala could split this intermediate data
into 16 parts during the join query, and the buffer only contains compact bookkeeping data rather than the actual
join column data.)
Bug:
This section lists the most frequently encountered customer issues fixed in
Enabling Impala to work with the Isilon filesystem involves a number of
fixes to performance and flexibility for dealing with I/O using remote reads.
See
Bug:
The set of timezones recognized by Impala was expanded.
You can always find the latest list of supported timezones in the
Impala source code, in the file
Bug:
Impala can now process Zulu
time, a synonym for UTC.
Bug:
An
Bug:
Bug:
This section lists the most frequently encountered customer issues fixed in
This section lists the most frequently encountered customer issues fixed in
For the full list of fixed issues in
When the type of a column was changed in either Hive or Impala through
Bug:
Resolution: Resolved by incorporating the fix for
Workaround: On systems without the corresponding Hive fix, change the column back to its original type. The stats reappear and you can recompute or drop them.
If a file was truncated in HDFS without a corresponding
Bug:
Impala could issue messages stating the block locality metadata was stale,
when the metadata was actually fine.
The internal remote bytes read
counter was not being reset properly.
This issue did not cause an actual slowdown in query execution,
but the spurious error could result in unnecessary debugging work
and unnecessary use of the
Bug:
When a table was moved from one database to another, the column statistics were not pointed to the new database. This could result in lower performance for queries due to unavailable statistics, and also an inability to drop the table.
Bug:
Bug:
The
Bug:
Some queries did not recognize the final line of a text data file if the line did not end with a newline character.
This could lead to inconsistent results, such as a different number of rows for
Bug:
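The missing-final-newline case above is a common edge case in line splitting. As a minimal sketch (the helper name `split_rows` is hypothetical and this is not Impala's text scanner), a splitter can treat the delimiter as a row terminator while still keeping a last row that has no terminator:

```python
def split_rows(data: bytes, delimiter: bytes = b"\n") -> list:
    """Split a text data file into rows.

    A trailing delimiter does not create an extra empty row, and a final
    row without a trailing delimiter is still returned.
    """
    if not data:
        return []
    rows = data.split(delimiter)
    # split() leaves an empty last element only when the data ends with
    # the delimiter; drop it so both layouts give the same rows.
    if rows[-1] == b"":
        rows.pop()
    return rows


print(len(split_rows(b"1,a\n2,b\n")), len(split_rows(b"1,a\n2,b")))
```

Both inputs yield the same two rows, so a row count does not depend on whether the file ends with a newline.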
If the HDFS user ID associated with the
Bug:
Truncating a file in HDFS, after Impala had cached the file metadata, could produce a hang when Impala queried a table containing that file.
Bug:
Impala could sometimes fail to
Bug:
This fix relaxes the CPU requirement for Impala. Now only the SSSE3 instruction set is required. Formerly, SSE4.1 instructions were generated, making Impala refuse to start on some older CPUs.
Bug:
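On Linux, the instruction sets a CPU advertises appear on the `flags` line of `/proc/cpuinfo`, where SSSE3 and SSE4.1 are listed as `ssse3` and `sse4_1`. The sketch below parses such text. The helper name `cpu_flags` is hypothetical, and this is only a way to inspect a host by hand, not the check Impala performs at startup.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the CPU feature flags from /proc/cpuinfo-style text.

    Only the first "flags" line is used; an empty set is returned when
    no such line is present.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Example text shaped like /proc/cpuinfo on an older CPU (illustrative only).
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3\n"
flags = cpu_flags(sample)
print("ssse3" in flags, "sse4_1" in flags)
```

On a real host, the same function can be applied to the contents of `/proc/cpuinfo` read as text.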
This section lists the most significant Impala issues fixed in
If an inline view in a
Bug:
A value of type loss of precision
error.
Bug:
An invalid constant expression in a
Bug:
A call to
Bug:
This section lists the most significant Impala issues fixed in Impala 2.1.6.
Certain queries could encounter a serious error if the spill-to-disk mechanism was activated.
Bug:
Certain queries could encounter a serious error if the spill-to-disk mechanism was activated.
Bug:
Impala could generate a suboptimal query plan for some queries involving small tables.
Bug:
Queries using the
Bug:
Queries against HBase tables could return incomplete results if the
Bug:
A query could encounter a serious error if it contained a spill to disk
mechanism was activated.
Bug:
This section lists the most significant Impala issues fixed in Impala 2.1.5.
Queries including
Bug:
This section lists the most significant Impala issues fixed in Impala 2.1.4.
When expressions that tested for
Bug:
Bug:
An
Bug:
If the
Bug:
A query using the
Bug:
A query referencing a
Bug:
A query using an analytic function
could encounter an error if the
evaluation of an analytic
Bug:
An analytic function containing only an
Bug:
This section lists the most significant issues fixed in Impala 2.1.3.
When Hive writes
Bug:
Converting a floating-point value to a
Bug:
Certain calls to aggregate functions with
Bug:
If the HDFS user ID associated with the
Bug:
Truncating a file in HDFS, after Impala had cached the file metadata, could produce a hang when Impala queried a table containing that file.
Bug:
Successive calls to the data source API could result in excessive memory consumption, with memory allocated but never freed.
Bug:
Impala could issue messages stating the block locality metadata was stale,
when the metadata was actually fine.
The internal remote bytes read
counter was not being reset properly.
This issue did not cause an actual slowdown in query execution,
but the spurious error could result in unnecessary debugging work
and unnecessary use of the
Bug:
This section lists the most significant issues fixed in Impala 2.1.2.
For the full list of fixed issues in Impala 2.1.2, see
When a floating-point value was read from a text file and interpreted as a
Bug:
The
Bug:
A query against a partitioned table could return incorrect results if the
Bug:
The performance of the
Bug:
This section lists the most significant issues fixed in Impala 2.1.1.
For the full list of fixed issues in Impala 2.1.1, see
Bug:
Bug:
This section lists the most significant issues fixed in Impala 2.1.0.
For the full list of fixed issues in Impala 2.1.0, see
Transferring large result sets back to the client application on Kerberos
Bug:
Queries on gzipped text files required holding the entire data file and its uncompressed representation
in memory at the same time.
Bug:
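The notes do not describe how this was fixed. As a general illustration of the alternative to holding both the compressed file and its full uncompressed form in memory, the sketch below decompresses gzip data in a streaming fashion with Python's standard `zlib` module. The function name `iter_gunzip` and the chunk size are assumptions made for the example.

```python
import gzip
import io
import zlib


def iter_gunzip(stream, chunk_size=64 * 1024):
    """Yield decompressed pieces of a gzip stream, reading chunk by chunk.

    Only one compressed chunk and its decompressed output are held at a
    time, rather than the whole file in both forms.
    """
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        data = decompressor.decompress(chunk)
        if data:
            yield data
    tail = decompressor.flush()
    if tail:
        yield tail


payload = b"row of text data\n" * 10000
compressed = gzip.compress(payload)
total = sum(len(piece) for piece in iter_gunzip(io.BytesIO(compressed), 1024))
print(total == len(payload))
```

A consumer can split each yielded piece into rows as it arrives, carrying any partial last row over to the next piece.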
Impala might not be able to access HBase tables, depending on the associated levels of Impala and HBase on the system.
Bug:
Improved code coverage in Impala testing uncovered a number of potentially serious errors that could occur with specific query syntax. These errors are resolved in Impala 2.1.
Bug:
For the full list of fixed issues in Impala 2.0.5, see
This section lists the most significant issues fixed in Impala 2.0.4.
For the full list of fixed issues in Impala 2.0.4, see
When Hive writes
Bug:
If a table data file was replaced by a shorter file outside of Impala,
such as with
Bug:
This section lists the most significant issues fixed in Impala 2.0.3.
For the full list of fixed issues in Impala 2.0.3, see
An anti-join query (or a
Bug:
A query against a partitioned table could return incorrect results if the
Bug:
The performance of the
Bug:
This section lists the most significant issues fixed in Impala 2.0.2.
For the full list of fixed issues in Impala 2.0.2, see
Some operations in queries submitted through Hue or other HiveServer2 clients could produce inconsistent results.
Bug:
Impala could encounter an error from running out of file descriptors. The fix reduces the amount of time file descriptors are kept open, and avoids leaking file descriptors when read operations encounter errors.
The
Bug:
To avoid putting too heavy a load on any one node, Impala now randomizes which scan node processes each HDFS data block rather than choosing the first cached block replica.
Bug:
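The effect of randomizing the replica choice can be seen in a small simulation. This is a sketch under simplifying assumptions: every block lists its cached replicas in the same order, and the host names and function names are hypothetical. It is not Impala's scheduler.

```python
import random
from collections import Counter


def assign_first_replica(blocks):
    """Count blocks per host when the first listed replica is always used."""
    return Counter(replicas[0] for replicas in blocks)


def assign_random_replica(blocks, rng):
    """Count blocks per host when a replica is picked at random per block."""
    return Counter(rng.choice(replicas) for replicas in blocks)


# 300 blocks, each cached on the same three hosts in the same order.
blocks = [["host-a", "host-b", "host-c"] for _ in range(300)]
first = assign_first_replica(blocks)
spread = assign_random_replica(blocks, random.Random(0))
print(first["host-a"], max(spread.values()))
```

Under the first-replica policy one host receives all 300 blocks, while the randomized policy spreads the same blocks across the three hosts.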
In clusters secured by Kerberos or LDAP, a discrepancy in internal transmission of user names could cause a communication error with Llama.
Bug:
The
Bug:
This section lists the most significant issues fixed in Impala 2.0.1.
For the full list of fixed issues in Impala 2.0.1, see
After running the
Bug:
Workaround: Upgrading to a level of
This section lists the most significant issues fixed in Impala 2.0.0.
For the full list of fixed issues in Impala 2.0.0, see
Hints specified within a view query did not take effect when the view was queried, leading to slow performance. As part of this fix, Impala now supports hints embedded within comments.
Bug:
Potential wrong results for some types of queries.
Bug:
Potential wrong results for some types of queries.
Bug:
Potential wrong results for some types of queries.
Bug:
Potential wrong results for some types of queries.
Bug:
Potential wrong results for some types of queries.
Bug:
Potential wrong results for some types of queries.
Bug:
Serious error for certain combinations of function calls and data types.
Bug:
Serious error for certain combinations of function calls and data types.
Bug:
Bug:
Hive-created Avro tables with columns specified by a JSON file or literal could produce errors when
queried in Impala, and could not be used with the
Bug:
The Impala debug web UI did not properly encode all output.
Bug:
Certain queries could run without obeying the limits imposed by resource management.
Bug:
Certain
Bug:
In a Kerberos environment, the principal name was not mapped to lowercase, causing issues when a user logged in with an uppercase principal name and Sentry authorization was enabled.
Bug:
Impala 1.4.3 includes fixes to address what is known as the POODLE vulnerability in SSLv3. SSLv3 access is disabled in the Impala debug web UI.
This section lists the most significant issues fixed in Impala 1.4.2.
For the full list of fixed issues in Impala 1.4.2, see
This section lists the most significant issues fixed in Impala 1.4.1.
For the full list of fixed issues in Impala 1.4.1, see
Occasionally, a non-trivial query run through Llama could encounter a serious error. The detailed error in the log was:
Severity: High
Impala log files could contain internal error messages due to a problem formatting certain strings. The messages consisted of a Java call stack starting with:
A downlevel version of the HiveServer2 API could cause difficulty retrieving the precision and scale of a
Bug:
The error in the title could occur following a DDL statement. This issue was discovered during internal testing and has not been reported in customer environments.
Bug:
The time for some network operations was not counted in the report of total time for a query, making it difficult to diagnose network-related performance issues.
Bug:
Certain Avro fields for byte data could cause Impala to be unable to read an Avro data file, even if the
field was not part of the Impala table definition. With this fix, Impala can now read these Avro data
files, although Impala queries cannot refer to the bytes
fields.
Bug:
The
Bug:
This section lists the most significant issues fixed in Impala 1.4.0.
For the full list of fixed issues in Impala 1.4.0, see
The serious error in the title could occur, with the supplemental message:
The issue was due to the use of HDFS caching with data files accessed by Impala. Support for HDFS caching
in Impala was introduced in
Bug:
Resolution: This issue is fixed in Impala 1.3.2. The addition of HDFS caching support in Impala 1.4 means that this issue does not apply to any new level of Impala.
The
Bug:
When a view was accessed while inside a different database, references to tables were not resolved unless the names were fully qualified when the view was created.
Bug:
If an
Bug:
The
Bug:
Operations on tables with many partitions could be slow due to the time to evaluate which partitions were affected. The partition pruning code was sped up substantially.
Bug:
The performance of the
Bug:
After a
Bug:
Impala could encounter a serious error after a query was cancelled.
Bug:
A deadlock condition could make all
Bug:
Impala 1.3.3 includes fixes to address what is known as the POODLE vulnerability in SSLv3. SSLv3 access is disabled in the Impala debug web UI.
This backported bug fix is the only change between Impala 1.3.1 and Impala 1.3.2.
The serious error in the title could occur, with the supplemental message:
The issue was due to the use of HDFS caching with data files accessed by Impala. Support for HDFS caching
in Impala was introduced in
Bug:
Resolution: This issue is fixed in Impala 1.3.2. The addition of HDFS caching support in Impala 1.4 means that this issue does not apply to any new level of Impala.
This section lists the most significant issues fixed in Impala 1.3.1.
For the full list of fixed issues in Impala 1.3.1, see
Impala could encounter a severe error in a query combining a left outer join with an inline view
containing a
Bug:
If the result of a
Bug:
When a UDF is dropped through the
Bug:
Workaround: Restart the
If a Query
aborted
with no further detail. Common reasons why a
Bug:
After an
Bug:
A
Bug:
Workaround: Impala adds support for ASCII 0 characters as delimiters through the clause
Impala could allocate more memory than necessary during certain operations.
Bug:
Workaround: Before issuing a
When new subdirectories are created underneath a partitioned table by an
Bug:
Resolution: In Impala 1.3.1 and higher, you can specify the
Impala could encounter a severe error in a query where the
Bug:
The ability to specify a subset of columns in an
Bug:
This section lists the most significant issues fixed in Impala 1.3.0, primarily issues that could cause
wrong results, or cause problems running the
For the full list of fixed issues, see
The automatic join reordering optimization could incorrectly reorder queries with an outer join or semi join followed by an inner join, producing incorrect results.
Bug:
Workaround: Including the
A query with a
Bug:
A query could return incorrect results if it combined an aggregate function call, a
Bug:
An aggregation query or a query with
Bug:
If a
Bug:
Referencing the same columns in both a
Bug:
Workaround: Setting the query option
A
Bug:
Impala could return incorrect string results when reading uncompressed Parquet data files containing multiple row groups. This issue only affected Parquet data files produced by MapReduce jobs.
Bug:
Using a column or table name that conflicted with Impala keywords could prevent running the
Bug:
The
Bug:
The
Bug:
If the columns for an Avro table were all defined in the
Bug:
Workaround: Re-create the Avro table with columns defined in SQL style, using the output of
This section lists the most significant issues fixed in Impala 1.2.4. For the full list of fixed issues,
see
A large number of concurrent
Bug:
Workaround: Restart the
A large number of tables and partitions could result in unnecessary CPU overhead during Impala idle time and background operations.
Bug:
Resolution: Catalog server processing was optimized in several ways.
A query against a
Bug:
Workaround: Set the query option
Impala nodes could produce repeated error messages after recovering from a communication error with the statestore service.
Bug:
A join query could produce wrong results if multiple equality comparisons between the same tables referred to the same column.
Bug:
Certain outer join queries could return wrong results. If one of the tables involved in the join was an
inline view, some tests from the
An HBase cell could contain a value larger than 32 KB, leading to a serious error when Impala queried that table. The error could occur even if the applicable row was not part of the result set.
Bug:
Workaround: Use smaller values in the HBase table, or exclude the column containing the large value from the result set.
A query involving a
Bug:
Workaround: Set the query option
If a table had more than 32,767 partitions, Impala would not recognize the partitions above the 32K limit and query results could be incomplete.
Bug:
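The figure 32,767 is the largest value a signed 16-bit integer can hold. The notes do not state the cause of the limit, so the following is only an assumption: if a partition count or identifier were stored in a 16-bit signed field, values past that point would wrap around, as this sketch shows.

```python
import ctypes


def as_int16(n: int) -> int:
    """Return n as it would be stored in a signed 16-bit field
    (assumed storage width, for illustration only)."""
    return ctypes.c_int16(n).value


print(as_int16(32767), as_int16(32768))
```

The value 32,767 is stored unchanged, while 32,768 wraps to -32,768.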
Queries against HBase tables could fail with an error if the row key was compared to a function return
value rather than a string constant. Also, queries against HBase tables could fail if the
Resolution: Queries now return appropriate results when function calls are used in the row key
comparison. For queries involving non-existent row keys, such as
This release is a fix release that supersedes Impala 1.2.2, with the same features and fixes as 1.2.2 plus one additional fix for compatibility with Parquet files generated outside of Impala by components such as Hive, Pig, or MapReduce.
An early version of the Column chunk should not contain two dictionary pages
.
This issue does not occur for Parquet files produced by Impala
Bug:
This section lists the most significant issues fixed in Impala 1.2.2. For the full list of fixed issues,
see
Impala does not currently optimize the join order of queries; instead, it joins tables in the order in which they are listed in the FROM clause. Queries that contain one or more large tables on the right-hand side of joins (either an explicit join expressed as a JOIN statement or a join implicit in the list of table references in the FROM clause) may run slowly or crash Impala due to out-of-memory errors. For example:
Anticipated Resolution: Fixed in Impala 1.2.2.
Workaround: In Impala 1.2.2 and higher, use the
should be modified to:
Some Parquet files could be generated by other components that Impala could not read.
Bug:
Resolution: The underlying issue is being addressed by a fix in the Parquet libraries. Impala 1.2.2 works around the problem and reads the existing data files.
The statestore service could experience an internal error leading to a hang.
Bug:
A
Bug:
A serious error could occur when doing an
Bug:
If the JAR file for a Java-based Hive UDF was not in the
Bug:
This section lists the most significant issues fixed in Impala 1.2.1. For the full list of fixed issues,
see
While querying a table with long column values, Impala could over-allocate memory leading to an out-of-memory error. This problem was observed most frequently with tables using uncompressed RCFile or text data files.
Bug:
Resolution: Fixed in 1.2.1
A join query could allocate a temporary work area that was larger than needed, leading to an out-of-memory error. The fix makes Impala return unused memory to the system when the memory limit is reached, avoiding unnecessary memory errors.
Bug:
Resolution: Fixed in 1.2.1
Impala could encounter an out-of-memory condition setting up work areas for Parquet tables with many columns. The fix reduces the size of the allocated memory when not actually needed to hold table data.
Bug:
Resolution: Fixed in 1.2.1
This section lists the most significant issues fixed in Impala 1.2 (beta). For the full list of fixed
issues, see
This section lists the most significant issues fixed in Impala 1.1.1. For the full list of fixed issues,
see
Certain queries involving
Bug:
Queries could fail with a block size is too big
error, due to
Bug:
Queries could fail if an Impala RCFile table was defined with more columns than in the corresponding RCFile data files.
Bug:
Certain combinations of clauses in a view definition for a partitioned table could result in inefficient performance and incorrect results.
Bug:
The SerDes class string written into Parquet data files created by Impala was updated for compatibility
with Parquet support in Hive. See
Bug:
A query returning a small result set from a large table could tie up memory unnecessarily for the duration of the query.
Bug:
Queries against Avro tables could fail depending on whether the Avro schema URL was specified in the
Bug:
Queries could allocate substantially more memory than specified in the
Bug:
This section lists the most significant issues fixed in Impala 1.1. For the full list of fixed issues, see
This issue is due to a performance tradeoff between systems running many queries concurrently, and systems running a single query. Systems running only a single query could experience lower performance than in early beta releases. Systems running many queries simultaneously should experience higher performance than in the beta releases.
A query could fail if it involved 3 or more tables and the last join table was specified as a subquery.
Bug:
Bug:
The
Bug:
The Impala web UI would sometimes display a query as if it were still running, after the query was cancelled.
Bug:
The
For the
Bug:
This section lists the most significant issues fixed in Impala 1.0.1. For the full list of fixed issues,
see
Impala might issue an erroneous error message when processing a Parquet data file produced by a non-Impala Hadoop component.
Bug:
Resolution: Fixed
If an RCFile table definition had fewer columns than the fields actually in the data files, queries would fail.
Bug:
Resolution: Fixed
The
Bug:
Resolution: Fixed
A query for an HBase table could omit data from the last region.
Bug:
Resolution: Fixed
After a region in an HBase table was split or moved, an Impala query might return incomplete or out-of-date results.
Bug:
Resolution: Fixed
After a successful
Bug:
Resolution: Fixed
Operations involving calls to the Java JNI subsystem (for example, queries on HBase tables) could allocate memory but not release it.
Bug:
Resolution: Fixed
Impala returns 0 for bad time values in UNIX_TIMESTAMP, whereas Hive returns NULL.
Impala:
Hive:
Bug:
Anticipated Resolution: Fixed
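Returning 0 for an unparseable time is ambiguous because 0 is also the valid result for the epoch itself. The Python sketch below shows the NULL-style behavior, using `None` in place of SQL NULL. The function name and the format string are assumptions for the example and do not reproduce the UNIX_TIMESTAMP implementation in either Impala or Hive.

```python
from datetime import datetime, timezone


def unix_timestamp_or_none(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Return seconds since the epoch (UTC) for `text`, or None when
    the text does not match the expected format."""
    try:
        parsed = datetime.strptime(text, fmt)
    except ValueError:
        return None
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


print(unix_timestamp_or_none("1970-01-01 00:00:00"))
print(unix_timestamp_or_none("not a time"))
```

With this behavior a caller can tell a bad input (`None`) apart from a genuine timestamp of 0.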
INSERT INTO TABLE SELECT <constant> will not insert any data and may return an error.
Anticipated Resolution: Fixed
Here are the major user-visible issues fixed in Impala 1.0. For a full list of fixed issues, see
A query containing both
Bug:
Resolution: Fixed
An
Bug:
Resolution: Fixed
In the Impala web user interface, the profile page for an
Bug:
Resolution: Fixed
Queries involving an HBase table could be slower than expected, due to excessive memory usage on the Impala nodes.
Bug:
Resolution: Fixed
No validation was done to check that the
Bug:
Resolution: Fixed
Workaround: Always upgrade the
Bug:
Resolution: Fixed
Pressing Ctrl-C in the
Bug:
Resolution: Fixed
Specifying an empty string or
Bug:
Resolution: Fixed. The behavior for empty partition keys was made more compatible with the corresponding Hive behavior.
The
Bug:
Resolution: Fixed
Casting from a string literal back to the same type would cause an invalid type cast
error rather
than leaving the original value unchanged.
Bug:
Resolution: Fixed
Some queries that returned very few rows experienced unnecessary memory usage.
Bug:
Resolution: Fixed
A serious error could occur for relatively small and inexpensive queries.
Bug:
Resolution: Fixed
Certain aggregation queries against Parquet tables were inefficient due to lower than required thread utilization.
Bug:
Resolution: Fixed
The Impala
Bug:
Resolution: Fixed. The metadata was made more Hive-compatible.
The
Bug:
Resolution: Fixed
A subquery would fail if the
Bug:
Resolution: Fixed
The result set from a right outer join query could include erroneous rows containing
Bug:
Resolution: Fixed
The Parquet scanner non-deterministically hangs when executing some queries.
Bug:
Resolution: Fixed
When attempting to load metadata from an unsupported Hive table type (INDEX and VIEW tables), Impala fails with an unclear error message.
Bug:
Resolution: Fixed in 0.7
Resolution: Fixed in 0.7
Resolution: Fixed in 0.7
Workaround: None
It is currently not possible to limit the memory consumption of a single query. All tables on the right-hand side of JOIN statements must fit in memory. If they do not, Impala may crash due to out-of-memory errors.
Resolution: Fixed in 0.7
An aggregate over a subquery result set returns wrong results if the subquery contains a 'limit' clause and the data is distributed across multiple nodes. From the query plan, it looks like the results from each worker node are simply being summed.
Bug:
Resolution: Fixed in 0.7
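The following is a minimal Python sketch of one reading of the issue above, not Impala's actual plan or code. It assumes a COUNT over a subquery with a limit, models each worker node as a list of rows, and uses hypothetical function names. If every node applies the limit locally and the coordinator sums the per-node counts, the total exceeds the limit.

```python
from typing import List


def buggy_count_with_limit(node_rows: List[List[int]], limit: int) -> int:
    # Assumed faulty behavior: each node applies the limit and counts locally,
    # then the coordinator simply sums the per-node counts.
    return sum(min(len(rows), limit) for rows in node_rows)


def correct_count_with_limit(node_rows: List[List[int]], limit: int) -> int:
    # Expected behavior: apply the limit once over the merged rows,
    # then aggregate the limited result.
    merged = [row for rows in node_rows for row in rows]
    return len(merged[:limit])


# Three hypothetical worker nodes, each holding 100 rows.
nodes = [list(range(100)), list(range(100)), list(range(100))]
print(buggy_count_with_limit(nodes, 10))    # 30
print(correct_count_with_limit(nodes, 10))  # 10
```

With three nodes and a limit of 10, the summed result is three times too large, which matches the symptom described in the issue.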
We currently cannot use a predicate like "country_code in ('DE', 'FR', 'US')" to do partition pruning, because pruning requires an equality predicate or a binary comparison.
We should create a superclass of planner.ValueRange, named ValueSet, that can be constructed with an arbitrary predicate, and whose isInRange(analyzer, valueExpr) constructs a literal predicate by substituting the valueExpr into the predicate.
Bug:
Resolution: Fixed in 0.7
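The ValueSet idea described in this issue can be illustrated with a short Python sketch. This is not Impala's planner code: the class and method below are Python stand-ins for the names in the issue text, the analyzer argument is omitted, and the partition layout and file names are hypothetical.

```python
from typing import Callable, Dict, List


class ValueSet:
    """Stand-in for the proposed ValueSet: wraps an arbitrary predicate."""

    def __init__(self, predicate: Callable[[str], bool]) -> None:
        self._predicate = predicate

    def is_in_range(self, value: str) -> bool:
        # Substitute the partition key value into the predicate and evaluate it.
        return bool(self._predicate(value))


def prune_partitions(partitions: Dict[str, List[str]],
                     value_set: ValueSet) -> Dict[str, List[str]]:
    # Keep only the partitions whose key value satisfies the predicate.
    return {key: files for key, files in partitions.items()
            if value_set.is_in_range(key)}


# Hypothetical table partitioned by country_code.
partitions = {
    "DE": ["de-0.dat"],
    "FR": ["fr-0.dat"],
    "JP": ["jp-0.dat"],
    "US": ["us-0.dat"],
}
in_list = ValueSet(lambda code: code in ("DE", "FR", "US"))
print(sorted(prune_partitions(partitions, in_list)))  # ['DE', 'FR', 'US']
```

Because the predicate is opaque to the pruning loop, the same mechanism covers equality, binary comparisons, and IN lists.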
Impala reads the NameNode address and port as command line parameters rather than reading them from
Severity: Low
Resolution: Fixed in 0.6 - Impala reads the namenode location and port from the Hadoop
configuration files, though setting
Queries may fail on secure environment due to
Bug:
Resolution: Fixed in 0.6
Concurrent queries may fail when Impala is using Thrift to communicate with part of the Hive Metastore
such as the Hive Metastore Service. In such a case, the error
Bug:
Resolution: Fixed in 0.6
Impala fails to start if it is unable to establish a connection with the Hive Metastore. This behavior was fixed, allowing Impala to start, even when no Metastore is available.
Bug:
Resolution: Fixed in 0.6
In some queries (including "USE database" statements), database names are treated as case-sensitive. This may lead queries to fail with an IllegalStateException.
Bug:
Resolution: Fixed in 0.6
Impala does not ignore hidden HDFS files, meaning those files prefixed with a period '.' or underscore '_'. This diverges from Hive/MapReduce, which skips these files.
Bug:
Resolution: Fixed in 0.6
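As an illustration of the hidden-file rule described in this issue, here is a minimal Python sketch. It is not Impala's implementation; the function names and the example paths are hypothetical, and it only encodes the rule stated above: skip files whose name starts with a period or an underscore.

```python
from posixpath import basename
from typing import Iterable, List


def is_hidden(path: str) -> bool:
    # A file is treated as hidden if its name (not its directory)
    # starts with '.' or '_'.
    return basename(path).startswith((".", "_"))


def visible_files(paths: Iterable[str]) -> List[str]:
    # Keep only the files that a scan should read.
    return [p for p in paths if not is_hidden(p)]


# Hypothetical contents of a table directory.
paths = [
    "/warehouse/t1/part-00000",
    "/warehouse/t1/_SUCCESS",
    "/warehouse/t1/.part-00001.crc",
    "/warehouse/t1/data_file.txt",
]
print(visible_files(paths))
```

Note that only a leading underscore counts: data_file.txt contains an underscore but is not hidden.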
Impala may have reduced performance on tables that contain a large number of partitions. This is due to extra overhead reading/parsing the partition metadata.
Resolution: Fixed in 0.5
Backend impalads do not cache connections to the coordinator. On a secure cluster, this introduces a latency proportional to the number of backend clients involved in query execution, as the cost of establishing a secure connection is much higher than in the non-secure case.
Bug:
Resolution: Fixed in 0.5
Concurrent queries may fail with error:
Bug:
Resolution: Fixed in 0.5
The Impala UNIX_TIMESTAMP(val, format) operation compares the length of format and val and returns NULL if they do not match. Hive instead effectively truncates val to the length of the format parameter.
Bug:
Resolution: Fixed in 0.5
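The difference between the two behaviors described in this issue can be sketched in Python. This is an illustrative model only, not the code of either system: the function names are hypothetical, only a handful of fixed-width pattern tokens are supported, and values are interpreted as UTC for simplicity.

```python
import calendar
import re
from datetime import datetime
from typing import Optional

# Assumed subset of fixed-width pattern tokens, mapped to strptime directives.
_TOKENS = {"yyyy": "%Y", "MM": "%m", "dd": "%d",
           "HH": "%H", "mm": "%M", "ss": "%S"}


def _to_strptime(fmt: str) -> str:
    # Translate the pattern in a single pass so tokens cannot interfere.
    return re.sub("yyyy|MM|dd|HH|mm|ss", lambda m: _TOKENS[m.group()], fmt)


def _parse_utc(val: str, fmt: str) -> Optional[int]:
    try:
        parsed = datetime.strptime(val, _to_strptime(fmt))
    except ValueError:
        return None  # unparseable input
    return calendar.timegm(parsed.timetuple())


def unix_timestamp_strict(val: str, fmt: str) -> Optional[int]:
    # Behavior described for Impala before the fix:
    # a length mismatch between val and format yields NULL.
    if len(val) != len(fmt):
        return None
    return _parse_utc(val, fmt)


def unix_timestamp_truncating(val: str, fmt: str) -> Optional[int]:
    # Behavior described for Hive: val is effectively cut to the format length.
    return _parse_utc(val[:len(fmt)], fmt)


print(unix_timestamp_strict("2013-04-01 10:30:00", "yyyy-MM-dd"))      # None
print(unix_timestamp_truncating("2013-04-01 10:30:00", "yyyy-MM-dd"))
```

When the value and the format have the same length, both functions return the same result; they differ only for longer values.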
Impala is impacted by Hive bug
Anticipated Resolution: To be fixed in a future release
Workaround: Restart the
The lpad/rpad builtin functions generate the wrong results.
Resolution: Fixed in 0.4
Compressed files with extensions incorrectly generate an exception.
Bug:
Resolution: Fixed in 0.4
Some queries with large limits were hanging.
Resolution: Fixed in 0.4
Resolution: Fixed in 0.4
If Impala is unable to load the metadata for a table for any reason, a subsequent query referring to that
table will return an
Resolution: Fixed in 0.3
After failing to load metadata for a table, Impala removes that table from the list of known tables
returned in
Resolution: Fixed in 0.3
Attempting to select from these tables fails.
Resolution: Fixed in 0.3
Queries that contain OUTER JOINs may not return the correct results if there are predicates referencing any of the joined tables in the WHERE clause.
Resolution: Fixed in 0.3
Subqueries that contain an aggregate cannot be joined with another table or Impala may crash. For example:
Resolution: Fixed in 0.2
For example:
Resolution: Fixed in 0.2
For example:
Resolution: Fixed in 0.2
Attempting to read such files does not generate a diagnostic.
Resolution: Fixed in 0.2
When querying an HBase table whose row-key is string type, the Impala server may raise a null pointer exception.
Resolution: Fixed in 0.2